As indicated in the guiding precepts articulated in Chapter 1, the committee recognizes the importance of both individual counts and population-level estimates in assessing the impact of disasters. This chapter provides a review of major classes of techniques used to construct population-level estimates of disaster-related mortality and morbidity. This review is intended to be illustrative rather than a comprehensive evaluation of the whole of demographic and epidemiological research on disasters.
The committee’s review of the methods encompassed by the population estimate approach focuses on three general classes of estimation techniques and methods, spanning a wide variety of analytical options, that are particularly salient to measuring disaster effects. This chapter begins by discussing the issues associated with using conventional household or individual-person survey interviewing for collecting data in disaster studies, whether to provide data on rates that can be scaled up to the level of the population or data to facilitate linkage of exposures to morbidity and mortality outcomes. Next, the chapter turns to the common practice of modeling excess mortality and significant morbidity effects—differences relative to baseline-level data and trends that may be attributed to a disaster. Finally, the chapter discusses some advanced survey and sampling methods for estimating the size of hard-to-count or hidden populations, such as deaths, illnesses, or injuries that are caused by a disaster but are not reported as such in the usual data sources. These methods require different kinds of information inputs and underlying assumptions, and each has strengths and weaknesses, but all can be made more precise through attention to study design, comparison groups, available sample sizes, quality of supporting data sources, and statistical techniques for analysis. These methods and techniques can also be made more accurate with continuing research on the methods themselves.
Put in context, these classes of estimation methods related to disaster effects parallel those that have been applied to study mortality and morbidity in other settings, notably war. Jewell et al. (2018) provide an instructive framework for methods for measuring civilian war casualties, using the term accounting to refer to the general problem and differentiating between two major branches of inquiry. The first, counting, is exactly as the name implies—tallying fatalities one-by-one—while the second, estimation, “will usually mean using a statistical procedure, such as a sample survey, to extrapolate a total number from a subset of deaths that have been observed” (Jewell et al., 2018, p. 380). The Jewell et al. (2018, p. 390) framework lists six broad categories of measurement techniques:
- Documentation, listing deaths person-by-person along with other information about the circumstances;
- Derivation of excess mortality, which is “the use of census and other population demographic information to estimate mortality potentially attributable to both direct and indirect losses;”
- The use of personal survey interviewing methods (through “epidemiologic or demographic household surveys”) to estimate a count of deaths during and after an incident or incidents;
- Indirect estimation of death counts based on “assumed relationships” between incident-related deaths and information in found data (“data that happen to be available”);
- Crowdsourced death reports, or a more free-form kind of self-report or observer-report data than a usual household survey; and
- The use of multiple systems estimation, using “distinct and separate listings” to arrive at an overall, statistically adjusted estimate.
These measurement techniques have clear analogues in the study of disaster-related morbidity and mortality. In particular, documentation corresponds directly to the counting-type measurement challenge described in Chapter 3, attempting to ensure that electronic death and health records include the information necessary to generate a complete, one-to-one itemization of deaths and harms arising from a disaster. The remaining classes of measurement techniques lie in the province of estimation and inference. Although this includes a broad range of methods, each with different strengths, weaknesses, and appropriate uses, the discussion of common issues and the attendant conclusions and recommendations at the end of this chapter demonstrate that they do have much in common.
In countries where vital event recording systems are limited or nonexistent, techniques for estimating disaster-associated mortality and morbidity
are essential because there is no viable alternative. In these settings, survey interviewing approaches are used to generate basic information on mortality levels. Even in countries like the United States, with a good system of collecting and recording vital statistics in place (though needing improvement, as detailed in Chapter 3), estimation methods based on statistical inference provide an important check on the completeness and accuracy of data produced solely by counting. Estimation methods are also important for quantifying the morbidity and mortality impacts of disasters over time and are therefore complements to (and not redundant with) counting techniques. Complex phenomena such as population health, crime, and the economy all require a variety of measurements to understand. Estimation techniques are well suited for capturing indirect mortality due to disasters. They may be the best reflection of deaths and morbidities that go unreported or misreported in records systems, and they may have greater utility in coordinating response to and recovery from subsequent disasters because, when carefully implemented, they characterize well-defined populations known to be exposed to risk of the event. As with other approaches, these methods can be misleading when done improperly or too hastily. The advantages and disadvantages of all of these methods are discussed later in this chapter.
Collecting information from a sample of households or individuals and making inferences about the population for which that sample is designed to be representative is well established in public health and disaster research. Survey information can be relatively quick and inexpensive to generate, and survey data can play important roles in all stages of a disaster, from estimating potential impacts beforehand to forecasting long-term impacts well afterward. Surveys can be highly variable in their mode of administration (face-to-face interviewing, phone, Internet, mail) and their scope and structure, ranging from carefully tailored one-shot interviews to multiyear longitudinal interviews to banks of questions on the national household surveys that are a cornerstone of economic and social statistical indicators. It is also true and well known that survey data collection is facing some existential crises with an increasingly reluctant-to-participate public, and that much recent attention has been trained on harnessing data from administrative records or other sources. That said, in understanding disaster impacts as in other areas of study, surveys can be an indispensable resource.
As its form of a “rapid needs assessment” in the immediate wake of a disaster, the Centers for Disease Control and Prevention (CDC, 2014) advises completion of at least one Community Assessment for Public Health
Emergency Response (CASPER), a household survey meant to provide quick situational awareness of the disaster-affected population. The CDC guidance suggests two-stage cluster sampling as the best balance of feasibility and accuracy; in this approach the geographic assessment area of interest is partitioned into non-overlapping clusters, with a sample of clusters drawn in a structured, representative way and then a sample of households drawn within each cluster. The CDC CASPER toolkit echoes the World Health Organization’s (WHO’s) longstanding Expanded Programme on Immunization (EPI) approach of sampling 30 clusters in the first stage and conducting interviews with 7 households in each of those clusters, numbers chosen to provide reasonable precision—where reasonable means 10 percent in absolute difference from a true underlying population percentage (of immunization coverage, in its original application) with 95 percent confidence (CDC, 2014; Henderson and Sundaresan, 1982). However, the most recent CASPER guidance usefully follows Malilay et al. (1996) and reacts to other criticism of the EPI approach (Marker, 2008; Turner et al., 1996) in emphasizing the importance of proper probability sampling in the second stage. The standard EPI approach is to interview consecutively within a cluster until a quota of seven households is reached. Instead, the CASPER preferred approach is to draw 30 clusters (typically census blocks in the U.S. context, selected with probability proportional to the number of housing units in the cluster) and then 7 housing units within each cluster (either by quickly generating an address list through field observation and sampling from it or by systematic sampling).
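The two-stage 30 × 7 design can be sketched in a few lines of code. The block below is a minimal illustration on an invented frame of census blocks (all block names and housing-unit counts are hypothetical): stage 1 draws 30 clusters with probability proportional to housing-unit counts (with replacement, for simplicity), and stage 2 draws 7 housing units per cluster by systematic sampling. The final lines show why 210 interviews yield roughly the 10-percentage-point precision cited in the EPI guidance, under an assumed design effect of 2.

```python
import math
import random

random.seed(2014)

# Hypothetical sampling frame: census blocks and their housing-unit counts.
blocks = {f"block_{i:03d}": random.randint(10, 400) for i in range(250)}

N_CLUSTERS, N_UNITS = 30, 7  # the classic EPI/CASPER 30 x 7 design

# Stage 1: draw 30 blocks with probability proportional to size (PPS),
# here with replacement for simplicity.
names = list(blocks)
weights = [blocks[b] for b in names]
sampled_blocks = random.choices(names, weights=weights, k=N_CLUSTERS)

# Stage 2: within each sampled block, draw 7 housing units by systematic
# sampling from a (field-generated) address list, represented as indices.
def systematic_sample(list_size, k):
    interval = list_size / k
    start = random.uniform(0, interval)
    return [int(start + i * interval) for i in range(k)]

sample = {b: systematic_sample(blocks[b], N_UNITS) for b in set(sampled_blocks)}
total_interviews = N_CLUSTERS * N_UNITS  # 210 households targeted

# Precision check: for a proportion near 0.5 and an assumed design effect
# of 2, the 95% margin of error is about 10 percentage points.
deff, p = 2.0, 0.5
moe = 1.96 * math.sqrt(deff * p * (1 - p) / total_interviews)  # ~0.096
```

The design effect of 2 is an assumption standing in for the clustering penalty; in practice it depends on how similar households within a cluster are on the measure of interest.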
Malilay (2000) reviewed eight such initial survey-based assessments in the wake of disasters that were published in the Morbidity and Mortality Weekly Report from 1980 to 1999 (one each following an earthquake and an ice storm, and the others describing the impact of hurricanes). This review and the capsule description of the suggested CASPER process motivate a discussion of the key advantages and disadvantages of survey methods for studying disaster effects. On the advantages side of the ledger, the number and frequency of the survey studies reinforce the fact that surveys can be a relatively quick, inexpensive, and feasible mode of data collection—and, indeed, that rapid needs assessment has become a standard and accepted part of the disaster response playbook. Malilay (2000) found that the initial assessments were typically completed within 3–10 days of the disaster (reflecting the priority on obtaining information quickly). A second major advantage of survey methods is their flexibility in content relative to available death or health records. To wit, surveys provide the capacity to elicit information on personal and family exposures to the disaster in process, on their preparation for and personal response to the disaster, on experiences with utilities and medical services in the wake of the disaster, and on differential impacts within key subpopulations such as the ill or elderly. A
particular strength of survey methods in the disaster context is the ability to study significant morbidity (relatively uncovered in records-based data sources) and mental health impacts, both in the short term and (with later administrations of the survey) in the longer term. A final advantage of surveys is that the data they generate may be vitally important as a grounding or starting point for more sophisticated estimation techniques such as those discussed later in the chapter.
Survey interviewing is best suited to the collection of information on personal experience and on things that are well known to the specific respondent, such as whether a family member, friend, or acquaintance was killed, sickened, or maimed by a disaster event. The mechanics and mathematics of familiar forms of sampling such as simple random sampling, as well as more complex designs such as stratified sampling (which permit, for instance, better accounting for heterogeneity in disaster exposure by varying sampling rates by demographic group or geographic proximity to the incident center), require an accurate, detailed, and up-to-date sampling frame, or a listing of eligible sample units. The precision of sample survey estimates depends in large part on the sample size that can be achieved, as well as on adherence to the sampling protocols.
Taken together, these basic facts are essential to understanding some key disadvantages of survey methods in disaster impact analysis:
- If survey content focuses on topics about which respondents’ knowledge is limited or error-prone, the resulting statistics will not be accurate—and, logically, survey measures of mortality or causes of death are necessarily proxy information provided by surviving respondents. Members of a household may know quite precisely whether other members perished in a disaster, particularly if they saw them die or identified the bodies afterward, but their knowledge of the circumstances may be incomplete. “Verbal autopsy” survey methods (see, e.g., Thomas et al., 2018) are used in some international settings where death registration systems are limited, and survey methods may reveal individuals who are missing or unaccounted for (and so may be disaster-related deaths). Of the disaster impact surveys reviewed by Malilay (2000), only one asked about mortality within the household while nearly all asked about morbidity.
- The sample sizes and strategies that make survey work feasible in disaster-affected areas are typically meant to provide reasonable precision for percentages but not necessarily for point estimates of counts. Moreover, the tragic reality is that whole households (including single-person households) may perish in major disasters, and it may be difficult or impossible to collect
information on them (or, indeed, on whole geographic areas) in the wake of a disaster. What this means is that surveys after disasters are less well suited to generating counts (and death counts in particular) than to collecting information about the characteristics of those affected. Again, as borne out in disaster-specific assessments, Malilay (2000) observed that all of the survey-based studies estimated the proportion of needs or impacts in the affected areas, but none of them extrapolated to the magnitude (number) of impacts.
- The basic cluster sampling approach suggested for CASPER (CDC, 2014) works around the need for a detailed address list by hewing to statistical units (census blocks) and values (housing unit counts within those blocks) in the first stage of sampling—but even these may have accuracy and completeness issues.
- We return to this theme at the end of the chapter but, for now, it should be noted that targeted, survey-based assessments such as CASPER can only be useful if they are actually conducted. Absent resources and support (and a clearer statement of the informational benefits), and amidst the immediate logistical turmoil of the early disaster response stage, it can be a daunting task for state and local officials to field a rigorous and well-executed survey. If these officials are the only ones enlisted in such efforts, then surveys may simply not be done.
In short, survey methods have both advantages and disadvantages, and the trade-offs can be weighed when other data are available; in some instances, however, surveys may be the only viable option. In other instances, they can serve as an independent reading or as a basic check on counting, records-based approaches; indeed, this is why Kishore et al. (2018), one of the major studies of excess mortality in Puerto Rico after Hurricane Maria, based their analysis on a sample survey. The point here is merely to sound a note of caution in interpreting the results of sample surveys. The direct estimation of mortality through surveys is returned to later in this section, as a lead-in to discussion of other statistical methods.
Other disadvantages of survey methods focused on disaster-related mortality and morbidity are subtle and not entirely avoidable, but they can be mitigated by taking care in the construction of instruments and interpretation of results. Careful attention to the rules of scientific sampling yields results that can be extrapolated, with known error bounds, to a larger population. However, it may be impossible to follow those rules to the letter in conducting surveys in the immediate wake of a disaster, and this will provide grounds to challenge the true representativeness of the sample. Among these challenges is survivor bias: the people found in a CASPER or other
post-disaster survey survived the disaster and are being contacted in close proximity (spatially and temporally) to the disaster, so there is inevitably concern as to how well they represent the subpopulations that were killed by the disaster or were displaced (voluntarily or not) by it. It is also well known in survey research that the specific wording of questions or instructions, their exact ordering and appearance or the way in which they are asked verbally, and the interviewer’s skill (or capacity) for building rapport with a respondent can have major effects on the results. Throughout this report, several terms are encountered that can mean different things to different people, such as “disaster-related” (“direct” or otherwise) or “significant injury” (morbidity). Translating abstract concepts into short, clear questions requires significant effort. Compounding this difficulty is that in a quick response to a major disaster, survey interviewing may need to be put into relatively unpracticed hands. Survey respondents may be unwilling to engage under the best circumstances, and this can be even more so at a time when their lives and property are under direct threat. Survey interviewers and the agencies fielding the surveys must also take care to ensure that the selection probabilities underlying the survey are maintained. This is not always possible, however, which means making difficult decisions on how to handle sample households or even whole clusters that are physically inaccessible after a disaster (e.g., whether to track selected survey participants to new locations or to select alternative participants). Along those lines, a survey has great flexibility in terms of its possible content, and there may be great value in longitudinal, repeated interviews after the initial humanitarian concerns have been addressed following a disaster, but the trade-offs in time and respondent burden must be weighed carefully.
It is instructive to return to the analogous setting of measuring casualties and injuries in war, because injury and death tolls after disasters are, like those in war, sensitive topics subject to political misinterpretation—underscoring the importance of transparency in constructing surveys. Hence, it should be noted that survey-based estimates of civilian casualties in war are often disputed and highly controversial. In May–July 2006, 3 years after the March 2003 invasion of Iraq, Burnham et al. (2006) conducted a cluster sample survey to estimate the number of civilian casualties. They selected 50 clusters within Iraq’s governorates through two stages of probability-proportional-to-population sampling, sampled a residential street within each cluster through another staged process, and interviewed adjacent households from a randomly selected starting unit until 40 households were surveyed. The survey asked questions about deaths (and births and migration) between January 2002 and June 2006 (thus permitting before/during/after invasion estimates) and about the cause of death. Between February 2007 and March 2008, the British polling firm Opinion Research Business (ORB) conducted three surveys of adults in Iraq, asking
about deaths since the 2003 invasion; Heald et al. (2010) observe that ORB provided little methodological description, other than specifying that the second survey (the focus of most attention) used multistage random sampling to arrive at a sample of 1,720 adults in Iraq. In contrast to the more nuanced studies of excess mortality described in the next section, the two war death surveys used relatively simple pre- and post-comparisons and yielded estimates much larger than other sources: Burnham et al. (2006) claimed 654,965 “excess Iraqi deaths as a consequence of the war,” and ORB suggested “1.2 million murders” of civilians resulting from the invasion, later revised downward to 1 million (Heald et al., 2010). Both surveys received heavy criticism and, in the case of Burnham et al. (2006), eventual censure of the study’s principal investigator by the American Association for Public Opinion Research. Marker (2008) and Laaksonen (2008) both summarize many of the critiques of Burnham et al. (2006) over lack of transparency in describing the study’s methodology, while the journal Survey Research Methods featured a detailed methodological critique of ORB’s study by Heald et al. (2010) and a colloquy between those authors and ORB (Heald et al., 2010). On one hand, the two Iraq War casualty surveys are commendable for what they were able to accomplish, conducting field interviews in extremely volatile and difficult conditions. But, with an eye toward studying disaster effects, the critiques suggest that potential coverage and measurement errors should give serious pause to those considering self-report measures of household mortality, and they make clear the need for a more nuanced analysis than simple before-and-after comparison of estimated fatality rates.
Similar themes are invoked by the Working Group for Mortality Estimation in Emergencies (2007), which called for further research on survey-based mortality measures; the authors argued that survey interviewers should be better equipped and trained to elicit accurate temporal recall of deaths and household-member moves. Importantly, the Working Group also contended that the EPI-type two-stage cluster survey method is too frequently used as the cheapest, simplest method without considering alternatives, such as more careful stratified sampling. Doocy et al. (2013) provide a very useful example to consider in their study of the January 12, 2010, earthquake in Haiti. One year after the earthquake, they fielded a survey in metropolitan Port-au-Prince intended to study the effects of such risk factors as basic demography, age and composition of housing, and degree of crowding on mortality in the earthquake. The study was designed to permit comparison between the neighborhood population (still in their homes) and the population residing in displacement camps. Remote sensing had been used to estimate the extent of building damage in communes/districts of Port-au-Prince, and the resulting estimates of the population in heavily damaged areas were used to proportionally allocate sampling
clusters across communes. Ultimately, Doocy et al. (2013) suggested that the camp and neighborhood populations did not experience significantly different odds of death in the earthquake, but extreme crowding and residence in multilevel structures did significantly increase the risk of death.
The studies that have arguably made the best use of survey methods to estimate disaster-related effects share some important common features: they play to the strengths of survey techniques and focus less on short-term mortality impacts and more on the longer-term impacts of disasters. Importantly, they have been able to build from or extend existing data collection efforts and thus have not needed to be built fully from scratch. For example, Phifer et al. (1988) report that in the early 1980s researchers in Kentucky already had a statewide panel survey in the field, interviewing a set of older adults (55 years of age and older) at 6-month intervals between early 1981 and early 1983 regarding their health and levels of stress. Southeastern Kentucky experienced major flooding in a 10-county area in June 1981 (halfway between waves 1 and 2 of the panel survey) and again, more severely, in May 1984. The researchers were able to conduct a sixth wave, re-contacting those survey subjects who had lived in one of the 10 counties affected by the 1981 floods (plus 5 adjacent counties) and adding questions about exposure to the floods and on subjects’ assessment of their personal loss. They resolved a tricky conceptual problem—how best to elicit accurate recall of the longer-ago 1981 flood relative to the fresher-in-memory 1984 flood—by developing a narrative from 1981 newspaper accounts that was read to respondents. The unusual circumstances of the Kentucky study were such that they permitted examination of possible seasonal or time-decaying effects on health measures possibly attributable to the floods, but the effect of the flood/disaster intensity could also be studied (the 1984 flooding having been more damaging than the one in 1981). Similarly, Aida et al. 
(2017) report that the March 11, 2011, Great East Japan Earthquake and Tsunami occurred 7 months into data collection of the major Japan Gerontological Evaluation Study (JAGES), a nationwide cohort study focused on the health and social connectedness of adults aged 65 and older. The city of Iwanuma, one of the JAGES field sites, was particularly hard hit by the earthquake and because its Tamaura district was completely inundated by the tsunami, Tamaura residents became the focus of the disaster impact study. The Aida et al. (2017) example is different from the others examined here in that the survey information provided the necessary baseline and risk factor/predictor information, but the outcome variable being studied—all-cause mortality—was derived from Japan’s national long-term care insurance database, to which local physicians are required to report via their municipal governments. Among the findings from the research is that elevated mortality was associated with depressive symptoms, which might be useful to consider in conducting disaster evacuations. Another example of disaster-related survey
research building from extant data collections is the set of studies that have been done on data from the World Trade Center Health Registry, itself a voluntary enrollment registry open to people who were directly exposed to the World Trade Center disaster on September 11, 2001. Enrollment in the registry required the completion of a wave 1 general health survey in 2003–2004; thereafter, registry participants were administered follow-up surveys targeted to different age groups in wave 2 (2007–2008), wave 3 (2011), and wave 4 (2015, including a separate survey related to asthma). Registry members were also able to indicate their willingness to participate in other studies and surveys, and the registry has developed into a sampling frame for researchers examining the health after-effects of September 11. Studies that used the registry as a sampling frame include the Jacobson et al. (2018) analysis of posttraumatic stress disorder and depression 14–15 years after the disaster; the registry has also facilitated studies such as Jordan et al. (2018) on the risks of mortality from various causes by degree of September 11–related exposure (e.g., firefighters relative to commuters and passers-by).
An excellent example of a survey-based disaster impact study that vitally benefited from building from a solid baseline in the form of another survey data collection is the Study of the Tsunami Aftermath and Recovery (STAR) (Frankenberg et al., 2011). On December 26, 2004, a 9.3 magnitude earthquake struck off the coast of Sumatra, Indonesia, displacing 1 trillion tons of water and spawning massive tsunami waves affecting Indian Ocean coastline areas. The combined earthquake and tsunami disaster killed on the order of 170,000 Indonesians and displaced about 750,000, most within the province of Aceh, whose western coast parallels the 1,200-kilometer undersea rupture that generated the waves.
This longitudinal survey came to fruition because the national statistics office, Statistics Indonesia, allowed a sub-sample of its 2004 National Socioeconomic Survey (known by the acronym SUSENAS) to serve as a pre-tsunami baseline. The 2004 SUSENAS, an extensive, district-representative national survey, was fielded in February/March 2004, which was 9–10 months prior to the disaster. To facilitate the collection of unique longitudinal data from respondents from both affected and unaffected communities, Statistics Indonesia provided access to its sample to a project team from U.S. universities and SurveyMETER, an Indonesian survey research institution. The STAR team drew on household rosters for all households included in SUSENAS 2004 in 11 districts of Aceh with coastlines susceptible to tsunami inundation. The STAR project conducted five follow-up interviews on a roughly annual basis beginning in 2005, a 10-year follow-up in 2015, and interviews with a sub-sample in 2017–2018. STAR researchers made extensive efforts to track all (or obtain mortality information on all) of the eligible 2004 SUSENAS respondents, wherever they had relocated to
after the disaster. STAR ascertained survival status for 98 percent of the original household members and has interviewed approximately 95 percent of survivors at least once. Detailed questionnaires were administered to each individual in the household. The STAR data collection included both community-level measures of disaster exposure (based on geographic proximity to coastline, elevation, observed tsunami wave height nearest that community’s shore, and satellite imagery) and individual-level measures (e.g., whether individuals had lost family members in the disaster, had helped search for survivors, or saw people struggle in the tsunami waves). Because STAR covers communities devastated by the tsunami and communities where effects were limited, the data support analyses of the evolution of morbidity and mortality outcomes over time, differentiated by exposure to the disaster.
Papers written based on the STAR data have revealed a number of key findings regarding mortality at the time of the disaster and the evolution of health in its aftermath. In the most heavily damaged communities in the sample, more than 70 percent of the population perished in the tsunami (in comparison, in unaffected communities mortality between the 2004 and 2005 survey was 1.85 percent). Young children, older adults, and prime-age women were much more likely to die than prime-age men, but mortality risks were closely related to the demographic composition of the household, whereas higher levels of socioeconomic status were not protective (Frankenberg et al., 2011). The STAR data also shed light on how mortality risks evolved in the decade after the disaster among those who survived the tsunami. The evidence points to lower risks of subsequent mortality among those from communities in which tsunami mortality was high, but also to evidence of scarring for older men who experienced high levels of posttraumatic stress and for older women who lost a spouse in the disaster (Frankenberg et al., 2020). The evidence of elevated mortality due to psychological scarring did not emerge in analyses at the 5-year mark (Ho et al., 2017). With respect to psychosocial health, exposures to traumas at the time of the tsunami were clearly linked to the emergence of posttraumatic stress reactivity (Frankenberg et al., 2008). In addition, results for C-reactive protein and waist circumference, both aspects of health known to be associated with stroke and cardiovascular disease, indicate that stressful exposures at the time of the tsunami were associated with poorer health outcomes many years later (Thomas et al., 2018).
For coronavirus disease 2019 (COVID-19), efforts are under way to use survey methods to estimate the total impact of the virus. Understanding whether surveys, or counts of only the test-positive cases among those presenting for care, more accurately describe whether the outbreak in a city or state is getting better or worse can inform a number of important policy questions. As described in Appendix C, seroprevalence surveys identify and test randomly
chosen individuals to determine the percentage of people in a community recently infected with COVID-19. In early April 2020, CDC announced that such studies were under way in some of the nation’s COVID-19 hot spots and would be extended to the rest of the country in the summer (Branswell, 2020). For example, Rosenberg et al. (2020) analyzed a statewide convenience sample of New York grocery store customers and estimated that the cumulative incidence of COVID-19 through March 29, 2020, was 14 percent. This rate varied substantially by geographic area (reaching 24 percent in New York City) as well as by race and ethnicity. They also estimated that only 8.9 percent of individuals infected during this period were diagnosed, and that this fraction varied from 6.1 percent of individuals aged 18–34 years to 11.3 percent of those 55 years or older. A number of seroprevalence studies are now under way (Branswell, 2020; CDC, 2020a,b), and WHO is coordinating seroprevalence studies in at least six countries (Vogel, 2020).
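The arithmetic behind diagnosed-fraction estimates of this kind is simple to sketch. The figures below are invented round numbers for illustration (they are not the actual inputs of Rosenberg et al., 2020): an assumed population, a survey-based seroprevalence, and an assumed cumulative count of confirmed diagnoses.

```python
# Illustrative inputs only; all three values are assumptions, not study data.
population = 19_450_000       # assumed state population
seroprevalence = 0.14         # survey-estimated fraction ever infected
confirmed_cases = 250_000     # assumed cumulative lab-confirmed diagnoses

# Scale the sample-based rate up to the population, then compare with the
# confirmed-case count to estimate the share of infections ever diagnosed.
estimated_infections = population * seroprevalence          # ~2.72 million
diagnosed_fraction = confirmed_cases / estimated_infections  # ~0.092, i.e., ~9%
```

The gap between the two quantities is the point of the exercise: the seroprevalence survey captures infections that never entered the case-count data.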
The basic idea behind an excess mortality study is to treat some focal, precipitating event—in this circumstance, the major disaster itself—as a statistical change-point. Mortality data series are obtained that cover a time window before and at least some period after the change-point/disaster. The pre-disaster, or baseline, data are modeled in some way in order to extrapolate or project an estimate of the expected number of deaths that would have occurred absent the change-point/disaster over some forecast time period. Excess deaths are then simply the difference between the observed mortality data after the disaster and the expected values, and the argument is that these excess deaths are attributable to the disaster.
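In its simplest form, the calculation just described can be sketched as follows; all death counts are hypothetical, and real analyses use far richer baseline models than a simple monthly mean.

```python
# Minimal sketch of an excess mortality calculation: project an expected
# (baseline) level from pre-disaster data, then subtract it from observed
# post-disaster counts. All death counts here are hypothetical.
from statistics import mean

# Deaths in the same calendar months across several pre-disaster years.
baseline = {"Sep": [2350, 2410, 2390, 2370, 2430],
            "Oct": [2460, 2440, 2500, 2480, 2470]}

# Observed deaths in the months following the disaster.
observed = {"Sep": 2900, "Oct": 2700}

def excess_deaths(baseline, observed):
    """Excess = observed deaths minus expected deaths (the baseline mean)."""
    expected = {m: mean(v) for m, v in baseline.items()}
    return {m: observed[m] - expected[m] for m in observed}

per_month = excess_deaths(baseline, observed)
total_excess = sum(per_month.values())
```

The argument that this total is attributable to the disaster rests entirely on the credibility of the baseline, which is why so much of the methodological debate reviewed below concerns how the expected values are modeled.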
These “excess” studies are a broad class that can be approached from many methodological directions, ranging from crude tests for differences in mean levels before and after the disaster to intricate time-series or regression structures in the pre- and post-disaster models. Case-control or propensity-matching studies, such as those discussed by Quast (2020), are another way of addressing the same basic issue, comparing outcomes among disaster-affected cases with those among demographically similar but unaffected baseline controls. So too are difference-in-difference analyses from the econometrics literature and general statistical models that permit change-points in parameter values. In short, there is no set, standard, universally appropriate method for completing the various steps in these analyses, and different approaches and sets of assumptions can yield different conclusions. Indeed, it was the resulting variety of estimates arising from excess mortality studies conducted in Puerto Rico in the wake of Hurricane Maria in September 2017 that was a major impetus for establishing this committee.
Excess mortality studies are tailored to estimate the volume of all deaths that are potentially attributable to the disaster, including those (in the committee’s schema) that are indirectly or partially attributable and that may not be explicitly linked with the disaster in death records; that is, these studies are expressly intended to measure that which is not directly measurable. Moreover, the methodology depends critically on a loosely implied but unproven causal claim that the detected excess deaths resulted from the disaster. In light of these major intangible factors, what keeps excess mortality studies from lapsing into post hoc ergo propter hoc fallacy is care in the specification and documentation of assumptions. The credibility of excess mortality studies hinges on, among other factors:
- The quality, completeness, and sourcing of the pre- and post-disaster mortality data series;
- The care with which the baseline, pre-disaster mortality is measured and the resulting confidence that may be had in projecting expected death rates (absent the disaster) into the future;
- The degree to which the derivation of the baseline accounts for potential confounding factors and thus bolsters the claim that an excess is due to the disaster alone; confounding factors can include gross seasonal trends in the data, the differential presence of particularly vulnerable sub-populations in the study area (e.g., by age, sex, or proximity to the disaster), and the migration/displacement of the population;
- The length of the time window over which post-disaster events are studied, balancing the necessary tension between having too short a time window (some time is required for the events to manifest in the data) and too long a window (in which case the validity of projecting the expected, baseline levels would decay, straining the argument that all excess is truly disaster-related)—while recognizing that virtually any choice in window length will miss some events (such as injuries or disaster-exacerbated chronic conditions that ultimately result in death); and
- The extent to which the work is rigorously validated by assessing alternative counterfactuals/baselines or considering whether effects apply differently to different demographic groups.
In the balance of this section, some major exemplars of these excess mortality studies in the disaster arena are described, focusing on their handling of these design features. It should also be noted that the review in this section speaks exclusively about “excess mortality” because mortality has featured more prominently in the related literature to date than morbidity effects. Some studies have examined relevant cause-specific mortality (e.g.,
myocardial infarction or pneumonia) to permit some glimpse of possible disaster-related morbidity-leading-to-mortality effects, and some studies have examined exposure-related morbidity effects in longer-term disasters (e.g., exposure to smoke and particulates in the wake of wildfires). That said, the classic pre- and post-disaster modeling, and derivation of estimates of excess death, could certainly be extended to examine the incidence of disaster-related significant morbidity in its own right.
This report has already described the high variation in estimated mortality counts in Puerto Rico following Hurricane Maria, which made landfall on the island on September 20, 2017. The following brief review highlights the choices made by various researchers in modeling excess deaths attributable to the storm. First as a preprint and soon revised as a note in the Journal of the American Medical Association, Santos-Lozada and Howard (2017, 2018) compiled vital statistics data on all deaths by month in Puerto Rico for 2010–2017, calculating a mean and 95 percent confidence interval for the observations for each calendar month (i.e., for all of the January counts, all of the February counts and so forth). Santos-Lozada and Howard (2018) recognized that major migration from Puerto Rico before the storm would have reduced the underlying population denominators, but they did not expressly account for that mobility in their analysis. Instead, they compensated in part by being conservative in calculating the excess, subtracting the upper 95 percent confidence bound from the observed all-cause death counts for September–November 2017 rather than the historical mean for the months. Santos-Lozada and Howard (2018) estimated 1,139 excess deaths for September–December 2017 that they argued were hurricane-related, substantially higher than the then-official death toll of 64 stipulated by the Puerto Rico government.
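The conservative style of calculation used by Santos-Lozada and Howard can be sketched as follows, assuming for illustration a normal approximation for the 95 percent bound on the monthly mean; the interval construction and all counts here are hypothetical stand-ins, not the authors' exact data or method.

```python
# Hedged sketch of the conservative variant described above: compute each
# month's historical mean and a 95 percent bound, and subtract the upper
# bound (not the mean) from the observed count. The normal approximation
# and all counts are illustrative assumptions, not the authors' data.
from math import sqrt
from statistics import mean, stdev

historical_sept = [2250, 2310, 2280, 2330, 2290, 2300, 2270]  # hypothetical 2010-2016
observed_sept_2017 = 2900                                     # hypothetical

def conservative_excess(history, observed, z=1.96):
    m, s, n = mean(history), stdev(history), len(history)
    upper = m + z * s / sqrt(n)   # upper 95% bound on the monthly mean
    return observed - upper       # smaller than subtracting the mean itself

excess = conservative_excess(historical_sept, observed_sept_2017)
```

Subtracting the upper confidence bound rather than the mean deliberately shrinks the estimated excess, which is one way to compensate for an unmodeled downward shift in the population denominator.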
A report by Kishore et al. (2018), published in the New England Journal of Medicine and colloquially called the Harvard study, opted to construct a mortality estimate independent of Puerto Rico’s death registration records, pending review of the quality of construction of those records following the hurricane. In January–February 2018, the researchers fielded a survey with a final sample size of 3,299 households—a survey involving some distinctive design and operational choices relative to conventional household surveys or those mortality/morbidity surveys described in the previous section. Puerto Rico’s 900 barrios (administrative units used as clusters) were stratified into 8 categories based on values of a “remoteness index,” measuring the road travel time to the nearest large city (50,000 population or more). Within each remoteness stratum, 13 barrios were chosen at random (adding 1 sampled barrio from each of the island municipalities of Vieques and Culebra); then, within each sampled barrio, 35 buildings or structures reported on available OpenStreetMap layers were selected. Whenever an interviewer visited a sampled structure location and
could not complete a household interview—whether the structure was not a residence, the household declined to be interviewed, or no person was present at the time of the visit—interviewers were directed to pick at random “from all surrounding visible houses” and try to interview there. The survey questionnaire also differed from some conventional norms, including the rostering of household members. Many household surveys develop a household roster and then use that roster as reference or guide through the other survey questions; the Harvard survey asked about “each household member, including all persons who had moved in, moved out, been born, or died in 2017” but simultaneously did not record “any personal identifiers,” leaving it unclear whether all pertinent questions were asked about each identified household member in turn or whether the interview looped through deaths and moves separately. (This is not to suggest any fault in the method but merely to mention one of the small contextual choices that are critical in survey practice.) The survey also asked about the number of deaths in the neighborhood (Kishore et al., 2018). Mortality rates for September 20–December 31, 2017, were estimated from the survey, with excess rates calculated relative to the official vital statistics death rate for the same date range in 2016; the researchers justified that choice by observing that the vital statistics mortality rates “showed seasonal but stable trends from 2010 through 2016” (Kishore et al., 2018). Ultimately, the Harvard research team would estimate 4,645 excess deaths attributable to Maria through the end of 2017—albeit with a 95 percent confidence interval of 793–8,498. 
Though the study would draw criticism for its high estimate of excess mortality and its wide error bands, the authors conjectured that their estimate was likely an underestimate, due to survivor bias and the conceptual difficulties, noted in the previous section, of estimating mortality in a survey. Logically, households in which there were no survivors (including single-person households) could not be included in the survey; hence, adjusting their rates using the 2016 vital statistics mortality rate in single-person households, Kishore et al. (2018) suggested that the true excess-death count could be 5,740 (95 percent confidence interval, 1,506–9,889).
The government of Puerto Rico commissioned the Milken Institute School of Public Health at The George Washington University (GWU) to conduct three studies related to deaths in Hurricane Maria, one component of which was an independent review of excess mortality attributed to the hurricane. The excess mortality portion of the broader GWU study is detailed in both the project’s standalone report (GWU Milken Institute School of Public Health, 2018) and an article in The Lancet Planetary Health (Santos-Burgoa et al., 2018) and, importantly, has gone on to be accepted by Puerto Rico authorities as the official death toll for the commonwealth. The GWU study used all-cause mortality data by age, sex, and
residential municipality in monthly time series from July 2010 through February 2018 as its source of mortality data. A subtle innovation in the GWU study was setting the end point at the end of February 2018, a longer study period than in other studies, over which estimated excess deaths are posited to be due to Hurricane Maria. But the hallmark of the GWU study is the care and rigor with which the baseline, pre-disaster models were calculated. The data were grouped/stratified by age (three categories), sex, and a three-category breakout of a Municipal Socioeconomic Development Index used by the Puerto Rico Planning Board. The study team fit over-dispersed log-linear regression models to the July 2010–August 2017 data and ultimately assessed four different specifications of model interaction terms for each of two different methods for modeling seasonal and long-term trends before selecting a final form. Moreover, when projecting forward trends for September 2017–February 2018 against which to calculate excess deaths, Santos-Burgoa et al. (2018) compared a so-called “census counterfactual” (using U.S. Census Bureau population estimates for Puerto Rico) to a “displacement counterfactual,” which decremented the population based on air travel data from the U.S. Bureau of Transportation Statistics and the Puerto Rico Planning Board to account for mass migration off the island. Ultimately, the researchers concluded that the mobility effects were too substantial to ignore and accepted the displacement counterfactual. For comparison with the other researchers, Santos-Burgoa et al. (2018) estimated 2,098 excess deaths for September–December 2017, but the 2,975 excess deaths estimated through February 2018 is what has become accepted as the official toll in Puerto Rico. Shortly after the publication of Santos-Burgoa et al. 
(2018), Howard and Santos-Lozada (2019) would replicate their basic approach described above using the GWU team’s compiled data, arguing that the displacement approach (subtracting out-mobility) overstates the mortality rate too much and makes the 2,975 estimate too high; in a rejoinder, Santos-Burgoa et al. (2019) reiterated their conclusion that failing to account for massive outmigration is itself a major bias.
In a more recent addition to the Hurricane Maria excess mortality literature, Cruz-Cano and Mead (2019) tightened the forecasting range, estimating excess mortality levels for only the months of September and October 2017. They fit a standard autoregressive integrated moving average (ARIMA) time-series model to monthly mortality rates from January 2008 to August 2017, themselves obtained from official vital statistics for Puerto Rico, thus exploiting the capacity of ARIMA models to capture broad time trends and seasonal effects. In addition to the time-series methodology, though, the Cruz-Cano and Mead (2019) analysis is distinctive for replicating the work for specific causes of death in an attempt to identify causes of death that may have been particularly exacerbated by the hurricane. Their
overall estimate of 1,205 excess deaths for September–October 2017 is consistent with the GWU study’s estimate of 1,271. The causes of death for which particularly strong excesses were detected included heart disease (253 deaths) and diabetes (195 deaths), along with a catch-all “other” causes category (204 deaths); the prominence of chronic conditions suggests that these may be deaths to which the hurricane contributed indirectly by hindering access to regular treatment.
One particularly notable excess mortality study not focusing on Hurricane Maria and Puerto Rico was Morita et al. (2017), who investigated the after-effects of the triple disaster that struck Fukushima Prefecture, Japan, in 2011. On March 11, 2011, the Great East Japan Earthquake spawned a major tsunami in the Fukushima area; the earthquake had already caused internal electrical failure at the Fukushima Daiichi Nuclear Power Plant, and the tsunami waters inundated the plant and disabled the emergency generators, leading to multiple nuclear meltdowns and radioactive contamination between March 12 and 15, 2011, and forcing a wide evacuation zone around the plant. Mortality data for two cities in the prefecture for 2006–2015 (allowing for a very long 4-year post-disaster reaction period) were obtained from the nation’s vital registration system, and Poisson regression models were fitted to the data (separately for each month from January through December in turn, as a check on seasonal effects). But the remarkable aspect of this triple-disaster study was its embrace of the methodology’s basic purpose of studying indirect disaster-related deaths—to the extent that Morita et al. (2017) explicitly subtracted out direct disaster-related deaths before calculating the relative risks. Specifically, the researchers subtracted from the March 2011 data numerous external-cause death types that were likely due to the direct physical forces of the disaster, categories such as head injuries, burns and corrosions of multiple body regions, and any death that may have been coded as being “due to exposure to forces of nature”; deaths listed as being caused by drowning on March 11, presumably due to the tsunami, were also excluded (Morita et al., 2017). Though the lengthy time series permitted examination of both short- and long-term indirect health impacts, Morita et al. 
(2017) found that the estimated excess in indirect disaster-related deaths was concentrated in the first month following the disaster.
Not all studies that use the “excess deaths” nomenclature follow this pattern exactly, but some adopt unique approaches that are noteworthy nonetheless. An example of reconceptualizing the problem is the analysis by Kim et al. (2017) of excess mortality associated with Hurricane Sandy in New Jersey. The hurricane having made landfall in New Jersey on October 29, 2012, Kim et al. (2017) defined two post-disaster study periods of interest, the “Sandy month” (October 28, 2012–November 27, 2012) and the “Sandy quarter” (October 28, 2012–January 27, 2013). Obtaining
mortality data from the state’s electronic death registration system, the researchers aggregated deaths into monthly counts (months beginning on the 28th day of each calendar month). Negative binomial regression models were used to estimate death rates. The unique approach of this study was to estimate relative risks: the ratio of, for example, mortality in the “Sandy month” to that in the corresponding month (October 28–November 27) across the years 2008–2011. The estimated relative risks were adjusted for seasonality and time trends based on the modeled rates for the pre-study period (the 11 months preceding “Sandy month”) versus the pre-comparison period (the same 11 months across the years 2008–2011). Kim et al. (2017) performed separate analyses for all-cause mortality as well as for some specific cause-of-death groups, and for the elderly separately from other ages, and they found consistently elevated risks in the Sandy-affected periods (though all-cause mortality excesses were statistically significant only for the “Sandy quarter,” not the “Sandy month”).
Another example of an atypical excess mortality–type study in the disaster context arose from work following Hurricane Katrina in August 2005. Noting the then-considerable delays in death reporting in greater New Orleans in Katrina’s wake, Stephens et al. (2007) asked whether counts of death notices in the region’s main newspaper (the Times-Picayune) might plausibly serve as a surveillance indicator for changing mortality trends. However, the study was mainly about whether death notice counts roughly correlated with historic death record totals and focused principally on death notice counts between January and June 2006, which themselves were considerably time-lagged relative to the disaster, rather than trying to estimate the extent of elevated mortality potentially attributable to Katrina. Still, the Stephens et al. (2007) study is distinctive for its attempt to use and corroborate a novel data resource—and for its lack of hubris in portraying its findings, emphasizing the preliminary and limited nature of the work.
A slightly different group of excess mortality–type studies is somewhat atypical in nature because the underlying natural disasters—heat waves and extreme temperature exposure—can develop and have intense effects over a more gradual time period than other disaster types. The exploratory data analysis of Henschel et al. (1969) on a July 1966 heat wave in St. Louis, Missouri, was an early entrant in the field, not positing any formal models but more bluntly examining spatial differences in deaths and recorded temperatures in parts of the region. Since 1969, studies of mortality in heat waves have grown more sophisticated in the ways that they try to control for demographic confounding effects (e.g., at-home deaths among the elderly) or for features of the built environment that may have major effects (e.g., housing conditions, locations of residences on upper floors of buildings). Further examples include Eisenman et al. (2016), Harduar Morano et al. (2016), Joe et al. (2016), Petitti et al. (2013), Vandentorren et
al. (2006), and Zhang et al. (2017). The extreme heat waves that affected France in August 2003 (as well as in 2006 and 2009) have inspired a variety of studies of their own. Pascal et al. (2018) modeled the relationship between mortality and temperature in 18 French cities, with observations spanning 2000–2010, using a quasi-Poisson distribution and building other traditional confounding factors (e.g., population density and the percentage of the population age 75 and older) into their model; accordingly, it stands as an intensive analysis of important baseline trends in mortality over time. Pascal et al. (2012) was a retrospective study examining whether daily changes in a set of candidate indicators (total mortality being one, but also including emergency room visits), had they been available at the time of the less severe 2006 and 2009 heat waves in France, would have shown enough excess activity (relative to values experienced in previous years, within some time window around the particular date) to trigger a “statistical alarm,” and how the timing of such alarms would have coincided with warnings of extreme heat from weather forecasts. Ultimately, Pascal et al. (2012) found that none of the chosen health indicators could have performed this early warning function well, even if they had been available in real time, though the authors still noted important trends in the potential indicators to analyze in broader scope following the heat waves.
As described in more detail in Appendix C, excess mortality analyses have played a prominent role in the COVID-19 pandemic. One method applied is an extension of CDC’s standard method for determining the annual death toll for influenza. CDC regularly tracks the number of deaths from either pneumonia or influenza as a proportion of all deaths recorded each week, and these data are compared with typical seasonal patterns, with departures above the expected pattern indicating excess mortality. CDC added COVID-19 deaths to this analysis and found that almost 25 percent of all deaths occurring during the week ending April 11, 2020, were due to pneumonia, influenza, or COVID-19. This is far above the traditional epidemic threshold of 7.0 percent, with sharp weekly increases from the end of February through mid-April (CDC, 2020b).
Efforts to use excess mortality methods to estimate the total number of deaths due to COVID-19 are under way. In an analysis originally published in The Washington Post,1 Weinberger et al. (2020) conducted a similar analysis for the entire United States from March 1 to May 30, 2020. They estimated that there were 122,300 more deaths than would typically be expected at that time of year, 28 percent higher than the official tally of COVID-19-reported deaths during that period. Woolf et al. (2020) analyzed mortality between March 1 and April 25, 2020, and estimated 87,001 excess deaths nationally, of which 65 percent were attributed to COVID-19.
1 See https://www.washingtonpost.com/investigations/2020/04/27/covid-19-death-toll-undercounted/?arc404=true (accessed September 1, 2020).
The authors also identified substantial increases in mortality from heart disease, diabetes, and other causes, but few from pneumonia or influenza as underlying causes (Woolf et al., 2020).
One of the challenges in analyzing mortality data in real time is that deaths are reported weeks after the decedent was infected. In their COVID-19 excess mortality study, Weinberger et al. (2020) accounted for this by inflating reported deaths using weekly estimates of reporting delays. Reporting delays also are a problem for measures such as the case fatality rate that are derived from mortality data: at any given time, cases are more completely reported than deaths, so a simple ratio of deaths to cases will be biased downward. A not-yet-peer-reviewed analysis by members of a Centre for the Mathematical Modelling of Infectious Diseases working group on COVID-192 describes the adjustment of Russell et al. (2020) to account for this phenomenon and illustrates its impact.
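The logic of such a delay adjustment can be sketched as follows; the linear delay ramp and all counts are illustrative assumptions, not the actual adjustment or delay distribution used by Russell et al. (2020).

```python
# Hedged sketch of a delay adjustment to the case fatality rate (CFR), in the
# spirit of the correction discussed above: the denominator counts only the
# cases whose outcome would be known by now, using an assumed onset-to-death
# delay distribution. The delay ramp and all counts are illustrative.

daily_cases = [100, 150, 200, 250, 300, 200, 100]  # hypothetical new cases per day
total_deaths = 40                                  # hypothetical cumulative deaths

def delay_cdf(d):
    # Toy assumption: an eventual death is observed within d days of case
    # report with probability d/14 (capped at 1); a real analysis would use
    # an estimated lognormal or gamma delay distribution.
    return min(1.0, d / 14)

def adjusted_cfr(cases, deaths, cdf):
    t = len(cases)  # "today"
    cases_with_known_outcome = sum(c * cdf(t - 1 - i) for i, c in enumerate(cases))
    return deaths / cases_with_known_outcome

naive_cfr = total_deaths / sum(daily_cases)  # biased downward during growth
corrected_cfr = adjusted_cfr(daily_cases, total_deaths, delay_cdf)
```

Because recent cases have had little time to resolve, the adjusted denominator is smaller than the raw case count, and the corrected ratio is correspondingly higher than the naive one.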
The full set of deaths or significant morbidities related to a disaster includes those incidences that are only indirectly related or partially attributable to the disaster, as well as those that may be incorrectly noted as disaster-related in official records; these can be conceptualized as a “hidden finite set,” or a hidden or hard-to-count population. That is, borrowing nomenclature from the review of the literature by Cheng et al. (2020, p. 1), it is a set whose “elements are not directly enumerable or [its] size cannot be ascertained via a deterministic query.” Over the years, a wide array of techniques for estimating the size of such hidden sets has emerged and been widely used in a number of disciplines, including public health and epidemiology. Not all of these techniques, discussed briefly in this section, have been applied in the specific context of disaster impact and recovery, but to the extent that groups like “people who died as an indirect result of a disaster” or “people who have various morbidities and co-morbidities” are hard-to-count populations, these techniques might be brought to bear in future disaster studies. Cheng et al. (2020) discuss these and other methodologies for estimating hidden populations, adopting common notation and focusing attention on the asymptotic properties of related estimators, in addition to citing their use in a wide array of substantive settings.
We particularly note three broad methods, following a hierarchy presented by McCormick (2020). Given interest in a hidden population (say,
2 See https://cmmid.github.io/topics/covid19/global_cfr_estimates.html (accessed September 1, 2020).
disaster-affected individuals) for which there is no complete sampling frame (by definition), the next question is whether it is feasible or necessary to access the affected individuals directly. If not, and if indirect access would suffice, then the network scale-up method (NSUM) would be appropriate. On the other hand, if it is possible to access the affected individuals directly, then capture–recapture methods or respondent-driven sampling might be useful choices.
NSUM, which was introduced by Bernard et al. (1991) in early form, advanced in fuller form by Killworth et al. (1998), and summarized in the Bernard et al. (2010) overview paper, is a notable exception to the other techniques listed here in that its first major application was actually in the context of assessing disaster impact—in that case, estimating the true number of deaths in the September 19, 1985, earthquake that struck Mexico City. NSUM is one of a class of methods that uses some implicit structuring within the hidden set to arrive at an adjusted estimate of the hidden population size. As the name implies, the particular structuring assumed by NSUM is based on a survey respondent’s personal/social network size. For instance, survey respondents in the original Bernard et al. (1991) analysis were asked how many members of five sub-populations (doctors, mailmen, bus drivers, television repairmen, and priests) they knew—and how many victims of the earthquake they knew. The collected information can be used to estimate personal network size whether the true sub-population sizes are known (in the first Mexico City example, the numbers of workers in each of the five occupations were known) or unknown, in which case the queries turn to better-known but ideally mutually exclusive relationship categories (e.g., family, coworkers). The estimated network sizes for the designated sub-populations are reconciled with the network information on the phenomenon of interest (here, earthquake victims) to estimate a prevalence rate for, and in turn the size of, the hidden population. NSUM is applicable—and has been used—in many substantive settings, even though it does suffer some important shortcomings, as summarized by Bernard et al. (2010). 
Among these shortcomings are the assumptions that survey respondents have accurate knowledge of the characteristics of people in their personal network circles (e.g., whether or not a contact has HIV/AIDS, if that is the hidden population trait of interest) and that they are willing to report that information accurately even if the target characteristic is particularly sensitive or stigmatizing. Still, NSUM does have some distinct advantages, not least of which is that, being an indirect estimator, it gathers information from a basic probability sample of the general population and does not require resource-intensive searches for specific individuals or full sampling frames for (by definition) hard-to-enumerate groups. The survey questions necessary to perform NSUM
analysis are simultaneously a benefit and a challenge: they can be rendered in such a way as to be relatively quick and easy to administer (and so may be easily slotted into a new or ongoing survey data collection), but they also assume that different respondents will interpret and assess what it means to “know” someone in the same way. Feehan and Salganik (2016) suggest a generalized scale-up estimator that is intended to curb some issues with normal NSUM (such as the imperfect knowledge of contacts’ membership in the hidden population), but with a major departure: the generalized scale-up estimator requires samples from both the general/frame population and also the hidden population itself.
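The basic scale-up estimator described above can be sketched as follows; all survey responses and population sizes are hypothetical illustrations.

```python
# Hedged sketch of the basic network scale-up estimator: infer each
# respondent's personal network size from contacts reported in subpopulations
# of known size, then scale up reported contacts in the hidden population.
# All responses and population sizes below are hypothetical.

N = 1_000_000                          # total population (assumed known)
known_sizes = [5_000, 20_000, 15_000]  # sizes of the "probe" subpopulations

# One row per respondent: contacts reported in each probe subpopulation.
known_contacts = [[2, 6, 4], [1, 8, 3], [0, 4, 2]]
# Contacts each respondent reports in the hidden population (e.g., victims).
hidden_contacts = [3, 2, 1]

def nsum_estimate(N, known_sizes, known_contacts, hidden_contacts):
    # Estimated personal network size (degree) for each respondent.
    degrees = [N * sum(row) / sum(known_sizes) for row in known_contacts]
    # Ratio estimator: scale reported hidden contacts by total estimated degree.
    return N * sum(hidden_contacts) / sum(degrees)

hidden_size = nsum_estimate(N, known_sizes, known_contacts, hidden_contacts)
```

The estimator treats each respondent's network as a small probability sample of the whole population, which is why the quality of the degree estimates drives the quality of the final size estimate.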
A longstanding methodology for estimating the size of hard-to-count sub-populations is capture–recapture, which takes its name from its original application area in wildlife studies. To estimate the number of fish in a pond, for instance, a sample of fish would be caught, tagged, and released back into the pond. On a return visit—sufficiently close in time that the composition within the pond would not be expected to change greatly—another sample of fish is collected and the number of tagged and untagged fish is recorded. Because the composition of the second sample should be proportional to the composition of the population as a whole, inference can be made about the total population size based on the two samples. The resulting dual-systems estimation methodology has become one of the principal ways in which the coverage of the U.S. decennial census is evaluated; an independently administered, carefully stratified follow-up survey is fielded and the results matched to a similarly carefully designed extract of census returns for the same sample areas, and the match rates between the two independent samples permit estimates of the undercount (or overcount) in the census. The capture–recapture or multiple systems estimation approach could be used in relatively pure form to estimate the true size of disaster-affected populations, though it stands to reason that (as discussed earlier in the chapter) if collecting one good sample survey in the context of a major disaster is difficult, collecting two or more (and properly accounting for mobility or shifts in population) would be even more so.
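The two-sample version of the calculation can be sketched as follows; the counts are hypothetical, and the Chapman bias correction shown is a standard refinement rather than something described in the text above.

```python
# Hedged sketch of two-sample capture-recapture estimation (the classic
# Lincoln-Petersen estimator, plus the Chapman small-sample correction).
# The counts are hypothetical.

def lincoln_petersen(n1, n2, m):
    """n1 captured and tagged; n2 in the second sample; m tagged recaptures."""
    return n1 * n2 / m

def chapman(n1, n2, m):
    """Bias-corrected variant, preferred when samples or the overlap are small."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# E.g., 200 records in list A, 150 in list B, 30 matched between the two.
n_hat = lincoln_petersen(200, 150, 30)
```

The same arithmetic applies when the two "captures" are independent lists of disaster deaths (say, vital records and a survey), with record matching playing the role of tagging.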
Finally, respondent-driven sampling (RDS) was inspired by the desire for methods to sample directly—and exclusively—from some hidden population of interest. Particularly for hidden populations characterized by stigmatized or illegal behavior, studies have often had to resort to snowball or chain-referral sampling: on finding and completing a survey with a single member of the target, hidden population (the “seed”), the researchers then ask that person to name others in the population who might be interviewed. Each person who subsequently completes the survey is asked to name others, and so on. Snowball sampling lacks grounding in effective probability sampling; RDS, introduced by Heckathorn (1997, 2002), solves that problem.
Although the National Science Foundation (NSF) approves research that employs snowball sampling (NSF, 2020), some people recommended by the initial seed in a snowball sample may not want to be identified as a member of that particular hard-to-find or hard-to-count population. RDS solves that problem, too. RDS begins by finding a small number of subjects (members of the target hidden population) who are recruited to serve as “seeds.” Upon completing the interview, the seeds are offered an incentive to recruit their peers (other members of the hidden population) into the survey. The new recruits, on arriving at the interview, are offered the chance to become new seeds and are thus doubly incentivized, both to complete the interview and to recruit new people from their personal networks. Sampling continues in this way until the target sample size is reached or the recruit population is exhausted. The innovation made by Heckathorn (1997, p. 176) was in proving that the samples resulting from this process “are independent of the initial subjects from which sampling begins”—so that ultimately, it does not matter that the initial selection of seeds was a convenience sample. Implemented as instructed, the recruitment process can be treated as a Markov process, yielding the perhaps surprising result of independence from the initial sample. In building the sample, it is critical to the calculations that key data items be recorded along with the survey: the coupon number returned by the recruit, which permits determination of who recruited whom, and the degree of each respondent (the number of members of the target population the respondent reports knowing), which makes it possible to adjust for the fact that well-connected members are more likely to be recruited. Baraff et al. (2016), Green et al. (2020), and Raftery (2020) discuss a novel approach (a version of the bootstrap that preserves the tree structure of the seed/recruit process) for computing the variance of estimators from RDS data.
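One standard estimator built on the recorded degrees, the Volz-Heckathorn (RDS-II) estimator (a common choice in the RDS literature, though not one named in the text above), weights each respondent by the inverse of their reported degree; a minimal sketch with hypothetical data:

```python
# Hedged sketch of inverse-degree weighting over RDS data, in the style of
# the Volz-Heckathorn (RDS-II) estimator (a standard estimator, though not
# one named in the text above). Degrees and trait indicators are hypothetical.

degrees = [10, 5, 20, 4, 8]   # each respondent's reported network size
has_trait = [1, 0, 1, 1, 0]   # 1 if the respondent has the trait of interest

def rds_ii_prevalence(degrees, has_trait):
    # Well-connected members are more likely to be recruited, so each
    # respondent is down-weighted by 1/degree before averaging.
    weights = [1 / d for d in degrees]
    return sum(w * y for w, y in zip(weights, has_trait)) / sum(weights)

prevalence = rds_ii_prevalence(degrees, has_trait)
```

The inverse-degree weights correct the oversampling of well-connected members that referral chains produce, which is the practical reason the degree of every respondent must be recorded.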
As another general approach along these lines, Bao et al. (2015)—and Raftery (2020) in his presentation to the committee—advocated a Bayesian hierarchical model as a potentially ideal form for estimating the size of hard-to-count populations, illustrating their proposal with a model intended to estimate the count of intravenous drug users in Bangladesh in 2004. The hierarchical model yielded both local and national estimates, and permitted the integration of “mapping data, surveys, interventions, capture-recapture data, estimates or guesstimates from organizations, and expert opinion” in an integrated framework (Bao et al., 2015, p. 125).
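As a deliberately simplified illustration of the data-combination logic (not the Bao et al. model itself, which is far richer), the sketch below pools several hypothetical estimates of a hidden population's size (say, from mapping data, a survey, and expert opinion) by precision weighting on the log scale under a normal model with a flat prior; all names and numbers are invented.

```python
import math

# Hypothetical independent estimates of a hidden population's size,
# each with uncertainty expressed as a standard error on the log scale.
sources = [
    {"name": "mapping data",   "estimate": 20000, "log_se": 0.30},
    {"name": "survey",         "estimate": 15000, "log_se": 0.20},
    {"name": "expert opinion", "estimate": 30000, "log_se": 0.50},
]

def pooled_size(sources):
    """Precision-weighted pooling on the log scale: with normal
    likelihoods and a flat prior, the posterior mean is the
    precision-weighted average of the source estimates."""
    precisions = [1.0 / s["log_se"] ** 2 for s in sources]
    log_mean = sum(p * math.log(s["estimate"])
                   for p, s in zip(precisions, sources)) / sum(precisions)
    post_se = math.sqrt(1.0 / sum(precisions))
    return math.exp(log_mean), post_se

size, se = pooled_size(sources)
print(f"pooled estimate ~ {size:,.0f} (log-scale posterior SE {se:.2f})")
```

Note how the pooled estimate is pulled toward the most precise source (the survey), while the posterior standard error is smaller than that of any single source, the essential payoff of integrating multiple data streams.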
Johnston et al. (2015), for example, nested the same seed/recruit information and structure in a Bayesian framework, thus incorporating prior knowledge and approximations of the population size to yield a method called successive sampling–population size estimation. They applied the method to estimate sensitive and hard-to-count populations in six cities in Morocco, among them intravenous drug users and migrants from sub-Saharan Africa; Handcock et al. (2014) validated that approach by using it
to estimate the size of known networked populations in the data files of the National Longitudinal Study of Adolescent to Adult Health, and Johnston et al. (2017) later applied the technique to estimate the number of women with sexual violence-related pregnancies in a province in the Democratic Republic of the Congo. Crawford et al. (2018) borrowed from the theoretical underpinnings of the network scale-up method and analyzed the tree/network structure generated by respondent-driven sampling using graph-theoretic principles, offering an alternative approach to estimating the number of intravenous drug users in St. Petersburg, Russia.
The field of estimation techniques for hidden and hard-to-count populations is developing quickly and will continue to produce increasingly accurate estimates as systematic research refines the methodology. Feehan and Salganik’s (2016) generalized network scale-up method (NSUM) offered conceptual advantages relative to the original. Other research has examined technical improvements to the base NSUM approach, including Habecker et al. (2015) and Maghsoudi et al. (2014). More recently, Verdery et al. (2019) combined elements of RDS (link tracing within respondent networks) and Feehan and Salganik’s (2016) generalized NSUM to estimate the size of some sub-populations at risk of HIV/AIDS, all in the context of venue-based sampling (or time–space sampling), in which it is assumed that the target hard-to-count population tends to cluster in identifiable locations that can be sampled in meaningful ways. Much work remains to be done on venue-based sampling, but, just as with the other methods described in this section, systematic research on it promises improvements or efficiencies relative to currently used cluster sampling estimates.
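The basic NSUM estimator that this line of work builds on can be stated in a few lines: each survey respondent reports how many people they know in the hidden population and how many people they know overall, and the ratio of these tallies is scaled up to the total population. The sketch below uses invented numbers purely for illustration.

```python
# Hypothetical network scale-up data from a general-population survey.
N_TOTAL = 1_000_000          # size of the total (frame) population
reports = [
    # (connections to the hidden population, total personal network size)
    (2, 300),
    (0, 150),
    (1, 250),
    (3, 400),
    (0, 100),
]

def nsum_basic(reports, n_total):
    """Basic NSUM ratio estimator: the hidden population's share of all
    reported ties scales up to its share of the total population."""
    ties_to_hidden = sum(y for y, _ in reports)
    total_ties = sum(d for _, d in reports)
    return n_total * ties_to_hidden / total_ties

print(f"{nsum_basic(reports, N_TOTAL):,.0f}")  # prints 5,000
```

The refinements cited above (e.g., the generalized NSUM) address known weaknesses of this ratio, such as transmission error (respondents not knowing that a contact belongs to the hidden population) and barrier effects (social segregation between respondents and the hidden population).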
In its review of statistical estimation techniques for disaster-related mortality and significant morbidity, the committee examined the literature and, to supplement and round out that review, hosted two public webinars on February 11 and 18, 2020, to gather input related to several of the studies and techniques (see Chapter 1). In particular, it invited several participants in the major research projects to summarize their work and offer their own thoughts on best practices (Ho, 2020; Irizarry, 2020; McCormick, 2020; Quast, 2020; Raftery, 2020; Zeger, 2020).
The main message derived from this review is that, in the measurement of disaster-related effects, as is true of virtually any area of scientific endeavor, there is not, nor can there be, a single universally correct or standard method for generating mortality or morbidity estimates. Instead, given the variation in ways of attributing the cause of any death and morbidity, there can be more than one appropriate approach to answering the question: “How many deaths and severe morbidities were caused by this disaster?” Just as with counting approaches, all of the estimation approaches described in this chapter make assumptions and are subject to bias. Accordingly, the best practices that can be specified for estimating disaster impacts are the same best practices that apply to research in general: (1) clarity in the specification of study objectives and definition of terms, (2) transparency in the statement of assumptions and the sourcing of data used in the study, (3) continued testing and improvement of the accuracy of measures, and (4) caution in advancing any particular measure or method as the single perfect solution. Any statistical indicator reflects a specific time period, geographic area, set of death or morbidity causes, and so forth, and the research should clearly specify these limits. Similar conclusions are drawn by Quarantelli (2001) in discussing high-level conceptual problems in disaster studies.
The preliminary needs assessment from CASPER-type surveys aside, the methods described in this chapter are generally retrospective and take time to implement well. Indeed, excess mortality studies only become possible and meaningful after some time has elapsed and post-disaster mortality data are available. Accordingly, these estimation techniques are not meant to be predictive indicators and are unlikely to be able to provide direct insight in the early disaster response phase, given how tightly focused natural disasters tend to be in space and time. But, with time to develop and care in specification, the estimation techniques are very useful in assessing the total impact of disasters and in planning for future disasters. In addition, the estimation methods described in this chapter can also provide more detail than case counts in terms of demographic and other disparities, types of illness and injuries experienced, and specific causes of death.
While the adoption of a standardized, universally applicable method for estimating the mortality and significant morbidity effects of major disasters is not recommended, there is value in some degree of standardization so that, as much as possible, observed differences reflect substantive differences rather than arbitrary methodological choices. There are thus important avenues for information sharing and cultivation of best practices among researchers and state, local, tribal, and territorial (SLTT) disaster and public health officials for developing estimates and communicating them.
The field would benefit from a research program that begins with a discussion of advantages and disadvantages and a documentation of researchers’ and policy makers’ experience with choices that have worked particularly well (or not) in the past. Many of the estimates in the literature result from one-off efforts that do not build on or seek comparability with previous disasters. The first step in this research agenda is therefore a careful comparison of different estimates from the same emergency to gain an understanding of how methodological choices and assumptions affect the estimates (see Recommendation 4-1).
This research program should consider such factors as:
- Spatiotemporal boundaries of the study—including the start and end of the study/disaster period, balancing the capacity to measure immediate/short-term/longer-term effects with respondents’ or data providers’ ability for accurate recall, as well as the precise geographic area being studied;
- Specification of a comparison period or comparison population, or the handling of confounding or seasonal structure in the data, for setting an appropriate counterfactual and bolstering the ability to argue that the estimated effects are attributable to the disaster;
- Determination of an accurate sampling frame or an appropriate baseline or careful elaboration of data collection protocols that ensure that the probabilities of selection and the assumptions underpinning accurate inference are adequately met;
- Development of appropriate standard survey questionnaires (and associated training materials) to promote consistency and comparability of results; and
- Crafting appropriate statistical models and documenting the results.
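The counterfactual logic in the second item above can be illustrated numerically: with hypothetical monthly death counts, a baseline built from the same calendar months in pre-disaster years yields a simple excess-death estimate for the study period. Real analyses would additionally adjust for population change, secular trend, and displacement; this sketch shows only the core subtraction.

```python
# Hypothetical monthly death counts for a jurisdiction; the disaster
# strikes in September of the final year.
baseline_years = {          # same calendar months in three pre-disaster years
    2015: [210, 205, 198, 230],   # Sep-Dec
    2016: [215, 200, 205, 225],
    2017: [208, 210, 202, 228],
}
observed_2018 = [310, 280, 240, 235]   # Sep-Dec of the disaster year

def excess_deaths(baseline_years, observed):
    """Excess deaths = observed minus the month-specific mean of the
    baseline years, summed over the study period."""
    n_months = len(observed)
    monthly_baseline = [
        sum(year[m] for year in baseline_years.values()) / len(baseline_years)
        for m in range(n_months)
    ]
    return sum(o - b for o, b in zip(observed, monthly_baseline))

print(round(excess_deaths(baseline_years, observed_2018), 1))  # prints 219.7
```

Even in this toy form, the sensitivity of the answer to the spatiotemporal boundaries in the first item is evident: truncating the study period at October would roughly capture the immediate spike while missing the smaller but persistent excess in later months.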
It has come up several times already but bears repeating: it is essential to develop effective means of characterizing migration and population displacement before, during, and in the immediate wake of the disaster, not only for estimation purposes but also to accurately calculate meaningful population rates.
In general, all of these estimation techniques rely on accurate and appropriate baseline, contextual data. Many of the techniques rely on vital records or vital statistics data—for example, the counting-based methods described in Chapter 3—so it is certainly true that improving the counting-type mortality and morbidity data is important to improving the quality of the estimates that use them. But an effective baseline for estimation techniques is broader than just these incident data. It includes data from the U.S. decennial census and the American Community Survey (ACS; the successor, since the 2010 census, to the census “long form” sample covering more detailed socioeconomic questions), other major federal surveys (such as the Current Population Survey and the National Health Interview Survey), and the growing array of administrative records-type data being compiled at the federal and state levels. Solid baseline data are essential in all of these estimation methods for a variety of purposes—to set denominators so that rates might be compared across time and geography,
to reconstruct population as it was at the moment of disaster impact, to understand the sociodemographic characteristics of the disaster-affected population, to identify sampling frames for contact with survivors, and so forth. In this chapter, some of the more successful survey-based measures of disaster impacts documented had a strong operational baseline in that they were able to build from existing data resources, most notably the STAR tracking study that developed from (and inherited its representativeness from) a major Indonesian socioeconomic survey.
Accordingly, going forward, it will be useful to consider ways to nurture developments along several fronts related to the provision of both baseline and analytical data. First, the utility for disaster research of custom tabulation/estimation tools such as the Census Bureau’s OnTheMap for Emergency Management,3 which generates results from the ACS and other federal data sources for Federal Emergency Management Agency (FEMA)-designated disaster areas, and the CDC Wide-ranging ONline Data for Epidemiologic Research (WONDER) data portal should be examined. Second, it would be feasible (though very difficult at present) to replicate the STAR/Indonesia experience in the U.S. system, drawing on a federal household survey sample to implement a longitudinal survey of affected and comparison populations. Still, options for the better use of federal and state data resources should be examined; akin to the question modules and “pulse surveys” now being fielded by the Census Bureau in response to the COVID-19 pandemic, the capacity to add a module to an ongoing survey (to spark post-disaster data collection) or provisions for oversampling in disaster-affected areas (to be able to provide more area-specific survey estimates) on a short-term basis should be explored. The development of effective baseline data may also include exploring opportunities to use alternative and emerging data sources, such as cell phone location records and other administrative data, in ways that derive benefit from the new data resources while managing privacy and confidentiality concerns.
Developing an effective data and information structure for studying disaster impacts is not a basic research activity: it has immediate application value. It is and should be a cornerstone of the nation’s operational disaster response function. It requires participation from actors in all levels of government as well as from outside government. To wit, some summary points:
- Research on analytical methods, based on the experience of past disasters, is essential to support good analytical choices in future disaster research. Research along these lines could be brokered by, among others, the National Institutes of Health (NIH) (as through the Biostatistical Methods and Research Design study section4) and the National Science Foundation (NSF) (as through its Methodology, Measurement, and Statistics program5).
- The degree of analytical sophistication and the requirements for detailed data analysis and high-quality fieldwork are generally beyond the capabilities and time availability of most SLTT health departments—particularly in the immediate wake of a major disaster. Accordingly, it would be useful for CDC (perhaps through its Epidemic Intelligence Service), the Assistant Secretary for Preparedness and Response, and FEMA to pursue “jump teams” that might be brought in early in the disaster response cycle to supplement SLTT resources, helping to gather data and (importantly) begin the detailed analyses. Such practiced jump teams would help manage data collection without getting in the way of first responders. Personnel in the National Disaster Medical System or Medical Reserve Corps could be part of these teams; the work could also be regionalized and partially accomplished with standing memoranda of understanding with epidemiology institutes and academic departments.
- Critical operational support is needed from federal agencies, including the identification of appropriate mortality and morbidity datasets that might be brought to bear and the pre-negotiation of data-sharing agreements to ensure access to these data when needed. As mentioned earlier, part of this operational support involves finding ways to use existing federal survey and data collection infrastructure, including identifying ongoing data collection programs to which disaster questionnaire modules could be piggy-backed (with appropriate adaptation).
- Some of the survey procedures and data analyses suggested, particularly if building on data previously gathered for other purposes, may appear to conflict with consent procedures under the Common Rule, which guides human subject research, respondent burden issues under the Paperwork Reduction Act, which governs clearance of federal information collections, and the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, which protects individual health information. While the committee believes that the public health benefits of accurate estimates of the mortality and morbidity effects of a disaster outweigh these concerns, it is useful to address these issues in advance and to ensure alternative arrangements (such as data disclosure rules in place at SLTT health departments) to protect privacy and confidentiality.
4 See https://public.csr.nih.gov/StudySections/DABP/PSE/BMRD (accessed June 10, 2020).
5 See https://www.nsf.gov/funding/pgm_summ.jsp?pims_id=5421 (accessed June 10, 2020).
Cultivating regional centers of excellence—possibly virtual in format, and possibly borrowing from CDC’s previous experience in funding Preparedness and Emergency Response Learning Centers that paired academic institutions with SLTT officials—could help facilitate many of these efforts. A previous National Academies of Sciences, Engineering, and Medicine committee, tasked with the problem of deriving best practices for measuring community resilience to disasters, offered analogous guidance (NASEM, 2019). Concluding that there is no single, best measurement applicable to all communities and all elements of “resilience,” that committee urged the development of a “resilience learning collaborative” in the Gulf of Mexico region, drawing together a variety of government, industry, academic, and nonprofit actors to coordinate measurement efforts and implement ongoing research (NASEM, 2019). This type of model may be useful to consider relative to the whole disaster cycle, not just resilience.
Disasters are complex, so their health consequences are multifactorial, and no single number can sufficiently describe their impact. Because there can be more than one appropriate approach to answering the question “How many deaths and significant morbidities were caused by this disaster?” there cannot be a universally correct or standard method for generating mortality or morbidity estimates. While “estimation” might sound less precise than “counting,” which is described in Chapters 2 and 3, methods used for counting provide imprecise results, which predictably undercount the total impact of disasters, especially with regard to specific sub-populations. While statistical estimation methods cannot determine whether any given dead or ill person died or became ill as a direct or indirect result of the disaster, those methods can generate a more complete picture of the total impact of the disaster on health outcomes.
In addition, excess mortality studies only become possible and meaningful after some time has elapsed and post-disaster mortality data are available. Accordingly, the estimation techniques discussed in this chapter are unlikely to be able to provide direct insight in the early disaster response phase. However, with time to gather data and develop proper specifications, the estimation techniques are useful in assessing the total impact of disasters and in planning for future disasters.
While there is no standardized, universally applicable method for estimating the mortality and morbidity effects of major disasters, there are best practices that can be specified. As for research in general, these include clarity in the specification of study objectives and definition of terms, transparency in the statement of assumptions and the sourcing of data used in the study, and caution in advancing any particular measure or method as
a single solution. Thus, the field would benefit from a national research program that begins with a discussion of advantages and disadvantages and a documentation of researchers’ and policy makers’ experience with choices that have worked particularly well (or not) in the past (see Box 4-1 for selected research priorities for a national research program).
Conclusion 4-1: Given the variation in ways for attributing the cause of any death and morbidity, there can be more than one appropriate approach to answering the question: “How many deaths and severe morbidities were caused by this disaster?” Nevertheless, methodological best practices can be specified, and a national research program is urgently needed to identify, further develop, and validate these practices. As in all areas of research, these best practices are characterized by (1) clarity in the specification of study objectives and definition of terms, (2) transparency in the statement of assumptions and the sourcing of data used in the study, (3) continued testing and improvement of the accuracy of measures, and (4) caution in advancing any particular measure or method as the single perfect solution.
The counting-based methods described in Chapter 3 rely on accurate baseline and contextual data, including vital statistics data. Improving
these methods is thus essential for improving the quality of the estimates that use them—for setting denominators so that rates might be compared across time and geography, for reconstructing populations as they were at the moment of disaster impact, for understanding the sociodemographic characteristics of the disaster-affected populations, for identifying sampling frames for contact with survivors, and so forth. But an effective baseline for estimation techniques is broader than just these incident data and includes data from the U.S. decennial census, the ACS, other major federal surveys, and the growing array of administrative records–type data being compiled at the federal and state levels.
Developing an effective data and information structure for studying disaster impacts should be a cornerstone of the nation’s operational disaster response function. It requires participation from actors in all levels of government as well as outside government. The research required for the development of the information structure could be brokered by, among others, NIH and NSF (as through the Methodology, Measurement, and Statistics Program).
The degree of analytical sophistication and the requirements for detailed data analysis and high-quality fieldwork are generally beyond the capabilities and time availability of most SLTT health departments—particularly in the immediate wake of a major disaster. Accordingly, it would be useful for CDC and FEMA to pursue jump teams that might be brought in early in the disaster response cycle to supplement SLTT resources, helping to gather data and (importantly) begin the detailed analyses.
Conclusion 4-2: Developing an effective data and information structure for studying disaster impacts on mortality and morbidity should be a cornerstone of the nation’s operational disaster response function. Because the necessary analytical sophistication and high-quality fieldwork are generally beyond the capabilities and time availability of most SLTT health departments, it is essential that federal partners work to build and sustain the capacity of the nation’s existing research and survey infrastructure to support the collection of survey data on the health effects of disasters.
Critical operational support is needed from federal agencies, including the identification of appropriate mortality and morbidity datasets that might be brought to bear and the pre-negotiation of data-sharing agreements to ensure access to these data when needed. Some of the survey procedures and data analyses that are suggested, particularly if building on data previously gathered for other purposes, may appear to conflict with consent procedures under the Common Rule, which guides human subject research, respondent burden issues under the Paperwork Reduction Act,
which governs clearance of federal information collections, and the HIPAA Privacy Rule, which protects individual health information. While the committee believes that the public health benefits of accurate estimates of the mortality and morbidity effects of a disaster outweigh these concerns, it is useful to address these issues in advance and to ensure alternative arrangements (such as data disclosure rules in place at SLTT health departments) to protect privacy and confidentiality.
Finally, academic departments and institutes can be more flexible in initiating and conducting studies, but care would be needed in specifying their work as part of the operational response (surveillance and evaluation function) of the National Incident Management System rather than pure or basic research. The involvement of academic, nongovernment units heightens the importance of being able to execute timely contractual agreements.
Aida, J., H. Hikichi, Y. Matsuyama, Y. Sato, T. Tsuboya, T. Tabuchi, S. Koyama, S. V. Subramanian, K. Kondo, K. Osaka, and I. Kawachi. 2017. Risk of mortality during and after the 2011 Great East Japan earthquake and tsunami among older coastal residents. Science Reports 7(1):16591.
Bao, L., A. E. Raftery, and A. Reddy. 2015. Estimating the sizes of populations at risk of HIV infection from multiple data sources using a Bayesian hierarchical model. Statistics and Its Interface 8(2):125–136.
Baraff, A. J., T. H. McCormick, and A. E. Raftery. 2016. Estimating uncertainty in respondent-driven sampling using a tree bootstrap method. Proceedings of the National Academy of Sciences 113(51):14668–14673.
Bernard, H. R., E. C. Johnsen, P. D. Killworth, and S. Robinson. 1991. Estimating the size of an average personal network and of an event subpopulation: Some empirical results. Social Science Research 20(2):109–121.
Bernard, H. R., T. Hallett, A. Iovita, E. C. Johnsen, R. Lyerla, C. McCarty, M. Mahy, M. J. Salganik, T. Saliuk, O. Scutelniciuc, G. A. Shelley, P. Sirinirund, S. Weir, and D. F. Stroup. 2010. Counting hard-to-count populations: The network scale-up method for public health. Sexually Transmitted Infections 86(Suppl 2):ii11–ii15.
Branswell, H. 2020. CDC launches studies to get more precise count of undetected COVID-19 cases. STAT, April 4. https://www.statnews.com/2020/04/04/cdc-launches-studies-to-get-more-precise-count-of-undetected-covid-19-cases (accessed July 12, 2020).
Burnham, G., R. Lafta, S. Doocy, and L. Roberts. 2006. Mortality after the 2003 invasion of Iraq: A cross-sectional cluster sample survey. The Lancet 368(9545):1421–1428.
CDC (Centers for Disease Control and Prevention). 2014. Disaster preparedness and response: Complete course. Facilitator guide, 1st ed. Atlanta, GA: Centers for Disease Control and Prevention.
CDC. 2020a. COVID-19 serology surveillance strategy. https://www.cdc.gov/coronavirus/2019ncov/covid-data/serology-surveillance/index.html (accessed June 2, 2020).
CDC. 2020b. COVIDView—A weekly surveillance summary of U.S. COVID-19 activity. https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html (accessed July 13, 2020).
Cheng, S., D. Eck, and F. Crawford. 2020. Estimating the size of a hidden finite set: Large-sample behavior of estimators. Statistics Surveys 14:1–31.
Crawford, F. W., J. Wu, and R. Heimer. 2018. Hidden population size estimation from respondent driven sampling: A network approach. Journal of the American Statistical Association 113(522):755–766.
Cruz-Cano, R., and E. L. Mead. 2019. Causes of excess deaths in Puerto Rico after Hurricane Maria: A time-series estimation. American Journal of Public Health 109(7):1050–1052.
Doocy, S., M. Cherewick, and T. Kirsch. 2013. Mortality following the Haitian earthquake of 2010: A stratified cluster survey. Population Health Metrics 11(1):5.
Eisenman, D. P., H. Wilhalme, C. H. Tseng, M. Chester, P. English, S. Pincetl, A. Fraser, S. Vangala, and S. K. Dhaliwal. 2016. Heat death associations with the built environment, social vulnerability, and their interactions with rising temperature. Health & Place 41:89–99.
Feehan, D. M., and M. J. Salganik. 2016. Generalizing the network scale-up method: A new estimator for the size of hidden populations. Sociological Methodology 46(1):153–186.
Frankenberg, E., J. Friedman, T. Gillespie, N. Ingwersen, R. Pynoos, I. U. Rifai, B. Sikoki, A. Steinberg, C. Sumantri, W. Suriastini, and D. Thomas. 2008. Mental health in Sumatra after the tsunami. American Journal of Public Health 98(9):1671–1677.
Frankenberg, E., T. Gillespie, S. Preston, B. Sikoki, and D. Thomas. 2011. Mortality, the family, and the Indian Ocean tsunami. Economic Journal 121(554):F162–F182.
Frankenberg, E., C. Sumantri, and D. Thomas. 2020. Effects of a natural disaster on mortality risks over the longer term. Nature Sustainability 3(8):1–6. https://doi.org/10.1038/s41893-020-0536-3.
Green, A. K. B., T. H. McCormick, and A. E. Raftery. 2020. Consistency for the tree bootstrap in respondent-driven sampling. Biometrika 107(2):497–504.
GWU (George Washington University) Milken Institute School of Public Health. 2018. Ascertainment of the estimated excess mortality from Hurricane Maria in Puerto Rico. https://publichealth.gwu.edu/sites/default/files/downloads/projects/PRstudy/Acertainment%20of%20the%20Estimated%20Excess%20Mortality%20from%20Hurricane%20Maria%20in%20Puerto%20Rico.pdf (accessed July 12, 2020).
Habecker, P., K. Dombrowski, and B. Khan. 2015. Improving the network scale-up estimator: Incorporating means of sums, recursive back estimation, and sampling weights. PLOS ONE 10(12):e0143406.
Handcock, M. S., K. J. Gile, and C. M. Mar. 2014. Estimating hidden population size using respondent-driven sampling data. Electronic Journal of Statistics 8(1):1491–1521.
Harduar Morano, L. H., S. Watkins, and K. Kintziger. 2016. A comprehensive evaluation of the burden of heat-related illness and death within the Florida population. International Journal of Environmental Research and Public Health 13(6):551.
Heald, J., M. Spagat, and J. Dougherty. 2010. Discussion of “Conflict deaths in Iraq: A methodological critique of the ORB survey estimate.” Survey Research Methods 4(1):17–19.
Heckathorn, D. D. 1997. Respondent-driven sampling: A new approach to the study of hidden populations. Social Problems 44(2):174–199.
Heckathorn, D. D. 2002. Respondent-driven sampling II: Deriving valid population estimates from chain-referral samples of hidden populations. Social Problems 49(1):11–34.
Henderson, R. H., and T. Sundaresan. 1982. Cluster sampling to assess immunization coverage: A review of experience with a simplified sampling method. Bulletin of the World Health Organization 60(2):253–260.
Henschel, A., L. L. Burton, L. Margolies, and J. E. Smith. 1969. An analysis of the heat deaths in St. Louis during July 1966. American Journal of Public Health 59(12):2232–2242.
Ho, J. 2020. Methodological challenges and approaches for assessing the mortality impacts of disasters. Presentation at the February 11, 2020, public meeting of the National Academies of Sciences, Engineering, and Medicine’s Committee on Best Practices for Assessing Mortality and Significant Morbidity Following Large-Scale Disasters, conducted in webinar style. https://www.nationalacademies.org/event/02-11-2020/best-practices-in-assessing-mortality-and-significant-morbidity-following-large-scale-disasters-webinar-methodological-considerations-of-the-estimation-of-disaster-related-morbidity-and-mortality-at-a-population-level (accessed September 1, 2020).
Ho, J. Y., E. Frankenberg, C. Sumantri, and D. Thomas. 2017. Adult mortality five years after a natural disaster. Population and Development Review 43(3):467–490.
Howard, J. T., and A. R. Santos-Lozada. 2019. Mortality risk within counterfactual models: Puerto Rico and Hurricane Maria. The Lancet Planetary Health 3(5):e207–e208.
Irizarry, R. A. 2020. Assessing mortality and significant morbidity following Hurricane Maria in Puerto Rico. Presentation at the February 11, 2020, public meeting of the National Academies of Sciences, Engineering, and Medicine’s Committee on Best Practices for Assessing Mortality and Significant Morbidity Following Large-Scale Disasters, conducted in webinar style. https://www.nationalacademies.org/event/02-11-2020/best-practices-in-assessing-mortality-and-significant-morbidity-following-large-scale-disasters-webinar-methodological-considerations-of-the-estimation-of-disaster-related-morbidity-and-mortality-at-a-population-level (accessed September 1, 2020).
Jacobson, M. H., C. Norman, A. Nguyen, and R. M. Brackbill. 2018. Longitudinal determinants of depression among World Trade Center health registry enrollees, 14–15 years after the 9/11 attacks. Journal of Affective Disorders 229:483–490.
Jewell, N. P., M. Spagat, and B. L. Jewell. 2018. Accounting for civilian casualties: From the past to the future. Social Science History 42(3):379–410.
Joe, L., S. Hoshiko, D. Dobraca, R. Jackson, S. Smorodinsky, D. Smith, and M. Harnly. 2016. Mortality during a large-scale heat wave by place, demographic group, internal and external causes of death, and building climate zone. International Journal of Environmental Research and Public Health 13(3):299.
Johnston, L. G., K. R. McLaughlin, H. El Rhilani, A. Latifi, A. Toufik, A. Bennani, K. Alami, B. Elomari, and M. S. Handcock. 2015. Estimating the size of hidden populations using respondent-driven sampling data: Case examples from Morocco. Epidemiology 26(6):846–852.
Johnston, L. G., K. R. McLaughlin, S. A. Rouhani, and S. A. Bartels. 2017. Measuring a hidden population: A novel technique to estimate the population size of women with sexual violence-related pregnancies in South Kivu Province, Democratic Republic of Congo. Journal of Epidemiology and Global Health 7(1):45–53.
Jordan, H. T., C. R. Stein, J. Li, J. E. Cone, L. Stayner, J. L. Hadler, R. M. Brackbill, and M. R. Farfel. 2018. Mortality among rescue and recovery workers and community members exposed to the September 11, 2001 World Trade Center terrorist attacks, 2003–2014. Environmental Research 163:270–279.
Killworth, P. D., C. McCarty, H. R. Bernard, G. A. Shelley, and E. C. Johnsen. 1998. Estimation of seroprevalence, rape, and homelessness in the United States using a social network approach. Evaluation Review 22(2):289–308.
Kim, S., P. A. Kulkarni, M. Rajan, P. Thomas, S. Tsai, C. Tan, and A. Davidow. 2017. Hurricane Sandy (New Jersey): Mortality rates in the following month and quarter. American Journal of Public Health 107(8):1304–1307.
Kishore, N., D. Marqués, A. Mahmud, M. V. Kiang, I. Rodriguez, A. Fuller, P. Ebner, C. Sorensen, F. Racy, J. Lemery, L. Maas, J. Leaning, R. A. Irizarry, S. Balsari, and C. O. Buckee. 2018. Mortality in Puerto Rico after Hurricane María. New England Journal of Medicine 379(2):162–170.
Laaksonen, S. 2008. Retrospective two-stage cluster sampling for mortality in Iraq. International Journal of Market Research 50(3):403–417.
Maghsoudi, A., M. R. Baneshi, M. Neydavoodi, and A. Haghdoost. 2014. Network scale-up correction factors for population size estimation of people who inject drugs and female sex workers in Iran. PLOS ONE 9(11):e110917.
Malilay, J. 2000. Public health assessments in disaster settings: Recommendations for a multidisciplinary approach. Prehospital and Disaster Medicine 15(4):167–172.
Malilay, J., W. D. Flanders, and D. Brogan. 1996. A modified cluster-sampling method for post-disaster rapid assessment of needs. Bulletin of the World Health Organization 74(4):399–405.
Marker, D. A. 2008. Review: Methodological review of “Mortality after the 2003 invasion of Iraq: A cross-sectional cluster sample survey.” Public Opinion Quarterly 72(2):345–363.
McCormick, T. H. 2020. Network scale-up and indirect methods for group size estimation. Presentation at the February 11, 2020, public meeting of the National Academies of Sciences, Engineering, and Medicine’s Committee on Best Practices for Assessing Mortality and Significant Morbidity Following Large-Scale Disasters, conducted in webinar style. https://www.nationalacademies.org/event/02-11-2020/best-practices-in-assessing-mortality-and-significant-morbidity-following-large-scale-disasters-webinar-methodological-considerations-of-the-estimation-of-disaster-related-morbidity-and-mortality-at-a-population-level (accessed September 1, 2020).
Morita, T., S. Nomura, M. Tsubokura, C. Leppold, S. Gilmour, S. Ochi, A. Ozaki, Y. Shimada, K. Yamamoto, M. Inoue, S. Kato, K. Shibuya, and M. Kami. 2017. Excess mortality due to indirect health effects of the 2011 triple disaster in Fukushima, Japan: A retrospective observational study. Journal of Epidemiology and Community Health 71(10):974–980.
NASEM (National Academies of Sciences, Engineering, and Medicine). 2019. Building and measuring community resilience: Actions for communities and the Gulf Research Program. Washington, DC: The National Academies Press.
NSF (National Science Foundation). 2020. Frequently asked questions and vignettes. https://www.nsf.gov/bfa/dias/policy/hsfaqs.jsp#snow (accessed June 10, 2020).
Pascal, M., K. Laaidi, V. Wagner, A. B. Ung, S. Smaili, A. Fouillet, C. Caserio-Schönemann, and P. Beaudeau. 2012. How to use near real-time health indicators to support decision-making during a heat wave: The example of the French heat wave warning system. PLOS Currents 4:e4f83ebf72317d.
Pascal, M., V. Wagner, M. Corso, K. Laaidi, A. Ung, and P. Beaudeau. 2018. Heat and cold related-mortality in 18 French cities. Environment International 121:189–198.
Petitti, D. B., S. L. Harlan, G. Chowell-Puente, and D. Ruddell. 2013. Occupation and environmental heat-associated deaths in Maricopa County, Arizona: A case–control study. PLOS ONE 8(5):e62596.
Phifer, J. F., K. Z. Kaniasty, and F. H. Norris. 1988. The impact of natural disaster on the health of older adults: A multiwave prospective study. Journal of Health and Social Behavior 29(1):65–78.
Quarantelli, E. L. 2001. Statistical and conceptual problems in the study of disasters. Disaster Prevention and Management 10(5):325–338.
Quast, T. 2020. Excess mortality of older individuals with diabetes following Hurricanes Katrina and Rita. Presentation at the February 18, 2020, public meeting of the National Academies of Sciences, Engineering, and Medicine’s Committee on Best Practices for Assessing Mortality and Significant Morbidity Following Large-Scale Disasters, conducted in webinar style. https://www.nationalacademies.org/event/02-18-2020/best-practices-in-assessing-mortality-and-significant-morbidity-following-large-scale-disasters-webinar-methodological-considerations-for-estimating-excess-mortality-and-morbidity (accessed September 1, 2020).
Raftery, A. E. 2020. Methods for estimating hard to count populations. Presentation at the February 11, 2020, public meeting of the National Academies of Sciences, Engineering, and Medicine’s Committee on Best Practices for Assessing Mortality and Significant Morbidity Following Large-Scale Disasters, conducted in webinar style. https://www.nationalacademies.org/event/02-11-2020/best-practices-in-assessing-mortality-and-significant-morbidity-following-large-scale-disasters-webinar-methodological-considerations-of-the-estimation-of-disaster-related-morbidity-and-mortality-at-a-population-level (accessed September 1, 2020).
Rosenberg, E. S., J. M. Tesoriero, E. M. Rosenthal, R. Chung, M. A. Barranco, L. M. Styer, M. M. Parker, S.-Y. J. Leung, J. E. Morne, D. Greene, R. Holtgrave, D. Hoefer, J. Kumar, T. Udo, B. Hutton, and H. A. Zucker. 2020. Cumulative incidence and diagnosis of SARS-CoV-2 infection in New York. Annals of Epidemiology 48:23–29. doi: 10.1016/j.annepidem.2020.06.004.
Russell, T. W., N. Golding, J. Hellewell, S. Abbott, L. Wright, C. A. B. Pearson, K. van Zandvoort, C. I. Jarvis, H. Gibbs, Y. Liu, R. M. Eggo, J. W. Edmunds, and A. J. Kucharski. 2020. Reconstructing the early global dynamics of under-ascertained COVID-19 cases and infections. medRxiv 2020.07.07.20148460. doi: https://doi.org/10.1101/2020.07.07.20148460 [preprint].
Santos-Burgoa, C., J. Sandberg, E. Suárez, A. Goldman-Hawes, S. Zeger, A. Garcia-Meza, C. M. Pérez, N. Estrada-Merly, U. Colón-Ramos, C. M. Nazario, E. Andrade, A. Roess, and L. Goldman. 2018. Differential and persistent risk of excess mortality from Hurricane Maria in Puerto Rico: A time-series analysis. The Lancet Planetary Health 2(11):e478–e488.
Santos-Burgoa, C., J. Sandberg, E. Suárez, C. M. Pérez, and L. Goldman. 2019. Mortality risk within counterfactual models: Puerto Rico and Hurricane Maria—Authors’ reply. The Lancet Planetary Health 3(5):e209.
Santos-Lozada, A. R., and J. T. Howard. 2017. Estimates of excess deaths in Puerto Rico following Hurricane Maria. SocArXiv. doi: 10.31235/osf.io/s7dmu.
Santos-Lozada, A. R., and J. T. Howard. 2018. Use of death counts from vital statistics to calculate excess deaths in Puerto Rico following Hurricane Maria. JAMA 320(14):1491–1493.
Santos-Lozada, A. R., and J. T. Howard. 2019. Excess deaths after Hurricane Maria in Puerto Rico—Reply. JAMA 321(10):1005–1006.
Stephens, K. U., Sr., D. Grew, K. Chin, P. Kadetz, P. G. Greenough, F. M. Burkle, Jr., S. L. Robinson, and E. R. Franklin. 2007. Excess mortality in the aftermath of Hurricane Katrina: A preliminary report. Disaster Medicine and Public Health Preparedness 1(1):15–20.
Thomas, L. M., L. D’Ambruoso, and D. Balabanova. 2018. Verbal autopsy in health policy and systems: A literature review. BMJ Global Health 3(2):e000639.
Turner, A. G., R. J. Magnani, and M. Shuaib. 1996. A not quite as quick but much cleaner alternative to the Expanded Programme on Immunization (EPI) cluster survey design. International Journal of Epidemiology 25(1):198–203.
Vandentorren, S., P. Bretin, A. Zeghnoun, L. Mandereau-Bruno, A. Croisier, C. Cochet, J. Ribéron, I. Siberan, B. Declercq, and M. Ledrans. 2006. August 2003 heat wave in France: Risk factors for death of elderly people living at home. European Journal of Public Health 16(6):583–591.
Verdery, A. M., S. Weir, Z. Reynolds, G. Mulholland, and J. K. Edwards. 2019. Estimating hidden population sizes with venue-based sampling: Extensions of the generalized network scale up estimator. Epidemiology 30(6):901–910.
Vogel, G. 2020. “These are answers we need.” WHO plans global study to discover true extent of coronavirus infections. Science Magazine, April 2. https://www.sciencemag.org/news/2020/04/these-are-answers-we-need-who-plans-global-study-discover-true-extent-coronavirus (accessed July 12, 2020).
Weinberger, D. M., J. Chen, T. Cohen, F. W. Crawford, F. Mostashari, D. Olson, V. E. Pitzer, N. G. Reich, M. Russi, L. Simonsen, A. Watkins, and C. Viboud. 2020. Estimation of excess deaths associated with the COVID-19 pandemic in the United States, March to May 2020. JAMA Internal Medicine 180(10):1336–1344.
Woolf, S. H., D. A. Chapman, R. T. Sabo, D. M. Weinberger, and L. Hill. 2020. Excess deaths from COVID-19 and other causes, March–April 2020. JAMA 324(5):510–513.
Working Group for Mortality Estimation in Emergencies. 2007. Wanted: Studies on mortality estimation methods for humanitarian emergencies, suggestions for future research. Emerging Themes in Epidemiology 4(1):9.
Zeger, S. L. 2020. Comments on prediction of causal effects of disasters on morbidity and mortality. Presentation at the February 18, 2020, public meeting of the National Academies of Sciences, Engineering, and Medicine’s Committee on Best Practices for Assessing Mortality and Significant Morbidity Following Large-Scale Disasters, conducted in webinar style. https://www.nationalacademies.org/event/02-18-2020/best-practices-in-assessing-mortality-and-significant-morbidity-following-large-scale-disasters-webinar-methodological-considerations-for-estimating-excess-mortality-and-morbidity (accessed September 1, 2020).
Zhang, Y., M. Nitschke, A. Krackowizer, K. Dear, D. Pisaniello, P. Weinstein, G. Tucker, S. Shakib, and P. Bi. 2017. Risk factors for deaths during the 2009 heat wave in Adelaide, Australia: A matched case–control study. International Journal of Biometeorology 61(1):35–47.