This chapter reviews the sampling-related elements of the proposed study design for the National Children’s Study (NCS). After summarizing the proposed study design and its background, the chapter delineates established principles for evaluating the design of studies such as the NCS and then evaluates the proposed design against these principles. It ends by recommending next steps. Taken together, the chapter addresses statistical issues surrounding the following key items the panel was asked to consider: the national probability sample’s overall sample size and design; the use of hospitals and birthing centers as the primary sampling unit; the use of health care providers to sample and recruit prospective participants; the relative size of the prenatal and birth strata in the probability sample; and the optimal use of sibling births.
As described in Chapter 1, the decision of the NCS to use probability sampling methods and the perceived importance of collecting preconception and early prenatal data led to an initial design of a national equal probability of selection sample of 100,000 births drawn from a stratified area sample design using geographic locations (mostly counties) as the primary sampling units (PSUs) and households within clusters of census blocks in the selected locations as the secondary sampling units (SSUs). However, the experience of the Vanguard Study indicated that a household-based sampling design was likely to be too inefficient regardless of the recruitment methods, so the NCS developed several alternative designs. Currently, the proposed design is a national equal probability of selection sample1 of 90,000 births using a stratified list sample design with hospitals and prenatal care providers as the sampling units and places of recruitment. The target population consists of all births in the United States (excluding U.S. territories) during a 4-year reference period. In practice, the PSUs may have different, overlapping 4-year periods (the “birth window”) during which births are enrolled in the study, because the start of the sample enrollment will be rolled out over time.2 While the broad outline of this current sample design was provided to the panel, many crucial details of the sample design had not yet been resolved as this report was being written.3
The probability sample would use birthing hospitals and birthing centers as the PSUs and prenatal care providers whose patients deliver at the selected hospitals and birthing centers as the SSUs.4 The list of providers associated with selected hospitals (the SSU frame) would be split into two strata; the details of this split were not provided to the panel. The first SSU stratum, the prenatal stratum, would consist of the prenatal care providers (practice locations) from which a sample of providers would be drawn for prenatal recruitment. This stratum would include births to women who were sampled and recruited at a selected provider at their first prenatal visit. The second SSU stratum (the birth stratum) would include women who had their first prenatal visit at a provider listed in the second provider stratum’s frame. They would be eligible for sampling and recruitment into the study at the selected hospital shortly after delivery (NICHD, 2013d, pp. 13-14), as would women who had their first prenatal visit at a provider not listed in either stratum’s frame (which could occur if a provider had too few patients to be listed, or a new provider location was established during the enrollment period) or who did not receive any prenatal care. Although providers are listed in the second SSU stratum’s sampling frame, the providers in this stratum are not used as sampling units for the hospital-based sampling of births, though they are used to establish the eligibility of those births.
Although it would be possible to recruit all of the potential NCS participants who receive prenatal care during the women’s initial prenatal visits, the
1The term “equal probability” is used somewhat loosely here because, as explained below, plans for the sibling component of the national probability sample will essentially double the selection probabilities of sampled siblings relative to other births. In addition, differential attrition and other, more technical, aspects of any sample design produce variation in the achieved selection probabilities in a planned equal probability sample.
2Children born to mothers who have previously agreed to participate in the study are enrolled at the time of their births. For women recruited into the study prenatally, the recruitment period will begin 6 to 9 months before the start of the birth window and will end 6 to 9 months before the end of the birth window.
3The proposed sample design also includes supplemental (convenience) samples, which are discussed in Chapter 2.
4The provider SSU is operationalized as practice locations, rather than practice associations. For example, if ABC practice has two locations, and patients from both locations deliver at the selected hospital, then ABC practice would be listed in the provider frame twice, once for each location.
Program Office stated that it would be too costly to enroll and collect data during the prenatal period for the full Main Study cohort. The current proposal is that 50 percent of births in the probability sample will come from the prenatal stratum and 50 percent from the birth stratum. The first- and second-stage sample design for the PSUs and SSUs, respectively, would be a stratified probability-proportional-to-size design using births in recent preceding years to develop the measure of size.
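As a rough illustration of how a probability-proportional-to-size selection works, the sketch below draws a systematic PPS sample from a list of hospitals, using prior-year birth counts as the measure of size. The hospital labels, birth counts, sample size, and function name are hypothetical and are not taken from NCS documents; the sketch also ignores stratification and assumes no unit is large enough to be selected with certainty.

```python
import random

def pps_systematic_sample(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping a unit label to its measure of size
           (here, hypothetical prior-year birth counts).
    Returns a list of (unit, selection_probability) pairs.
    """
    total = sum(sizes.values())
    interval = total / n
    if max(sizes.values()) > interval:
        raise ValueError("a unit exceeds the sampling interval; "
                         "handle it as a certainty selection instead")
    # Random start in [0, interval), then equally spaced selection points.
    start = rng.random() * interval
    targets = [start + k * interval for k in range(n)]
    selected = []
    cumulative = 0.0
    idx = 0
    for unit, size in sizes.items():
        cumulative += size
        # A unit is selected when a selection point falls in its size range.
        while idx < n and targets[idx] < cumulative:
            selected.append((unit, n * size / total))
            idx += 1
    return selected

# Hypothetical frame: hospital label -> births in a recent year.
frame = {"H1": 500, "H2": 1500, "H3": 1000, "H4": 2000, "H5": 1000}
sample = pps_systematic_sample(frame, 3, random.Random(0))
for unit, prob in sample:
    print(unit, prob)
```

Under this scheme a hospital's chance of selection is proportional to its births, so sampling a fixed number of births within each selected hospital yields approximately equal overall selection probabilities for births.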
Because the PSUs will have more annual births than the target number needed for the NCS, individual women will be sampled. For the prenatal stratum, there will be sampling of eligible women at their “first prenatal visit” to a sampled provider. For the birth stratum, there will be sampling of eligible women who have a live birth at the selected hospitals and birthing centers. For both strata, the penultimate sampling stage will use randomly selected days and time periods at the sampled provider office, hospital, or birthing center. The ultimate sampling unit by which the NCS cohort is defined is technically the live birth, selected from pregnant women who will give birth or women who just gave birth during the birth window.
In the plan initially provided to the panel (NICHD, 2013b, 2013d), the sampling frame for hospitals was to be developed from the list of hospitals maintained by the American Hospital Association, augmented with a list of birthing centers from the American Association of Birthing Centers, and further augmented with other information such as natality data, the State Inpatient Databases, and other federal and commercial databases. However, at the meeting of the panel in October 2013, the NCS proposed that the list frame be developed from the 2010 State Inpatient Databases of the Healthcare Cost and Utilization Project, a data repository compiled by the Agency for Healthcare Research and Quality of the U.S. Department of Health and Human Services.
Two months later, in December 2013, the panel received a summary from a preliminary report written by NCS expert sampling consultants that documented progress on design of the first-stage sample. Based on an initial analysis of data from 27 states and with no stratification, a sample size of between 200 and 300 PSUs (individual hospitals or hospital clusters) would “permit generation of reasonably precise national estimates of birth outcomes as well as allow for a nominal level of precision for analysis of relatively large subgroups (sex, larger race/ethnic groups, income quartiles)” (NICHD, 2013i, p. 2). The consultants’ analysis used a size cutoff that omits hospitals with fewer than 50 births per year (birthing centers were not mentioned) while still covering 99.9 percent of national in-hospital births.
The sample frames of providers would be constructed by working with each hospital selected in the first stage to identify a list of referring practice locations. Additional sources of information about prenatal providers would include state licensing records, insurance lists, medical society listings, birth data from official state and county sources, and professional association membership lists (NICHD, 2013h, p. 1). The Program Office wrote that it may also administer provider questionnaires to use in preparing the sampling frame for providers (NICHD, 2013d, p. 71).
The NCS proposes to implement the study in phases by initiating the recruitment activities in different subgroups of the PSUs at different times. A 4-year sample rollout period,5 when combined with the expected 4-year birth window within sampled PSUs, means that any given round of data collection (e.g., information gathered when a child is 6 months old) will last 7 years.6
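The 7-year figure can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that PSUs are launched in four annual waves at the start of years 0 through 3 and that every PSU has a full 4-year birth window; the actual rollout plan was still under development, and the function name is hypothetical.

```python
def round_duration_years(wave_starts, birth_window_years, child_age_years):
    """Calendar span over which one data collection round is in the field.

    wave_starts: start times (in years) of the birth windows, one per wave.
    The round opens when the first-born child in the earliest wave reaches
    the target age and closes when the last-born child in the latest wave does.
    """
    first = min(wave_starts) + child_age_years
    last = max(wave_starts) + birth_window_years + child_age_years
    return last - first

# Assumed rollout: four annual waves, 4-year windows, visit at age 6 months.
print(round_duration_years([0, 1, 2, 3], 4, 0.5))  # 7.0
```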
Another novel aspect of the sample design (discussed in NICHD, 2013d) is to include in the sample with certainty all siblings born during the 4-year birth window to mothers with a child already in the sample (which we refer to as the “target” child or birth). This plan will allow for the collection of preconception and early pregnancy exposure biologic data for subsequent siblings that will not be available for the originally sampled (target) child, for whom data collection began later in pregnancy.
The NCS also plans to monitor recruitment by category and increase enrollment efforts to achieve the desired representation (discussed in NICHD, 2013d, p. 80; illustrated with Vanguard Study data, pp. 6-8). The NCS also plans to follow movers (discussed in NICHD, 2013e, pp. 4-5) and to target retention efforts at subpopulations at greater risk of attrition (discussed in NICHD, 2013b, pp. 26-28).
The panel evaluated the elements of the sampling plan described above against the kinds of sampling plans that are often developed for national surveys. Typical sampling plans describe a study’s objectives and constraints and the steps proposed to operationalize and realize those objectives. Examples of surveys with readily available sampling plans include the National Survey on Drug Use and Health,7 the National Survey of Child and Adolescent Well-Being,8 the Current Population Survey,9 the National Longitudinal Survey of Youth 1997,10 and the National Postsecondary Student Aid Study.11 These reports provide more detail than the NCS Program Office can be expected to
5NCS wrote that the “specific rollout plan is still under development” (NICHD, 2013j, p. 2). To be consistent with the cost model referenced elsewhere in the report, we assume a 4-year rollout period.
6Prenatal recruitment, of course, precedes the start of any birth window.
7For a description, see the appendixes at http://www.samhsa.gov/data/NSDUH/2012SummNatFindDetTables/NationalFindings/NSDUHresults2012.pdf [March 2014].
8For details, see http://www.ndacan.cornell.edu/datasets/pdfs_user_guides/092_Intro_to_NSCAW_Wave_1.pdf [April 2014].
9For a description, see http://www.census.gov/cps/methodology/techdocs.html [April 2014].
10For details, see https://www.nlsinfo.org/sites/nlsinfo.org/files/attachments/121221/TechnicalSamplingReport.pdf [April 2014].
BOX 3-1 Key Elements of an NCS Sampling Plan: An Illustration
A well-specified sampling plan for the National Children’s Study would contain the following elements:
- Clear statement of the study objectives
- Clear statement of the target population that is to be sampled
- Target sample sizes at the beginning, middle, and end of the study
- For each stage of sampling in the prenatal and birth sample:
  - Definition of the population and sampling unit
  - Description of the sampling frame and its quality
  - Plans for stratification and allocation based on fixed and variable costs and variance
  - Target number of units to be sampled
  - Sampling protocol (equal or unequal selection probabilities within stratum)
- The sampling protocol for subsequent siblings
- A rigorous determination of the overall inclusion probabilities
- Expected completion rates at each stage
- Estimated design effects
It would also contain an explanation of how the sampling plan attempts to minimize both sampling and nonsampling errors.
specify at the design stage. Still, they suggest the type of information that is needed, and much of it can be specified before the first interview is conducted. All of those surveys’ materials include a detailed description of the scientific justification for the various design decisions. For an illustration of how a comparable sampling plan would look for the NCS, see Box 3-1.
The NCS study objectives and main hypotheses need to be stated clearly and include important domains and key outcomes. The target population has to be fully described, and the sampling plan should delineate the target sample size at recruitment, birth, and key data collection milestones through age 21. For each sampling stage, the plan needs to provide a precise definition of the recruitment, sampling, data collection, and analytical units and to specify how and when the sampling will be implemented. The sampling protocol should be provided. In the case of the NCS design, the sampling protocol for subsequent siblings should be provided in detail. The plan should include a rigorous determination of the overall inclusion probabilities, expected yield rates, response rates, and retention rates at each stage (hospital, provider, birth) and cumulatively. Methods to adjust the target sample sizes for sample ineligibility and other sampling uncertainties should also be included in the plan.
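To make the bookkeeping concrete, the following sketch shows one simple way the overall inclusion probability and the cumulative yield could be computed for a multistage design. All stage probabilities and rates are hypothetical placeholders, not NCS figures, and the function names are illustrative; a real plan would derive these quantities from the actual frames and protocols.

```python
from math import prod

def overall_inclusion_probability(stage_probs):
    """Overall probability that a birth is selected: the product of the
    conditional selection probabilities at each sampling stage."""
    return prod(stage_probs)

def required_initial_sample(target_n, stage_rates):
    """Number of units to select so that, after losses at each stage
    (ineligibility, nonresponse, attrition), about target_n remain."""
    return target_n / prod(stage_rates)

# Hypothetical conditional selection probabilities:
# hospital, provider within hospital, woman within provider.
pi = overall_inclusion_probability([0.10, 0.25, 0.08])
print(f"overall inclusion probability: {pi:.4f}")

# Hypothetical eligibility, consent, and retention rates.
n0 = required_initial_sample(90_000, [0.90, 0.60, 0.80])
print(f"initial selections needed: {n0:,.0f}")
```

The inverse of the overall inclusion probability is the base sampling weight, which is why the plan needs these probabilities at every stage and not only in aggregate.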
The NCS sampling plan needs to be based on scientifically valid methods that attempt to minimize both sampling and nonsampling errors,12 balancing the often competing goals of minimizing both variance and bias. Making the correct design decisions requires simultaneously considering many quality and cost factors, choosing the combination of design features and parameters that minimizes variances and biases while satisfying the specified cost and precision constraints.
The panel was asked to consider “the overall sample size and design” proposed for the NCS. The rest of this chapter provides our assessment of design elements when possible and indicates for what elements the panel has been unable to evaluate the design because the NCS Program Office did not provide enough information to do so. As described below and in Appendix A, the various NCS documents provided to the panel included needed details on some of these elements but not others. Relatively clear descriptions were provided of the target population, initial sample size, and plans for the inclusion of subsequent siblings. The incomplete nature of the hospital-based sampling plan generates many uncertainties about its quality and feasibility.
The target population for the proposed design is all live births in the United States during a 4-year time period, referred to as the inference period in this document. Although they are part of the NCS target population, two types of births are excluded from the NCS sample frame: births to women who do not deliver in birthing hospitals or birthing centers and births to women at hospitals in which there are too few births to be included in the sampling frame.
The decision to use hospitals and birthing centers as the PSUs rather than geographic PSUs necessitates eliminating from the sampling frame the estimated 1 percent of births that do not occur in hospitals or birthing centers.13 The interim proposed design (NICHD, 2013i) also provides a cost and operational justification for excluding hospitals with a very small number of annual births: namely, that it would be inefficient to establish and maintain
12This approach is referred to as minimizing total survey error (see Weisberg, 2005).
13The NICHD (2013e, p. 1) reports: “According to the 2010 and 2011 U.S. natality data, only 0.8% of women give birth outside of hospitals or birthing centers, mostly in homes…. These women are more likely to be US born, older, non-Hispanic white, married, and to have either less than 9 years or more than 16 years of schooling.”
field operations for recruitment in such PSUs, and the interim design specifies a size threshold of 50 births annually for hospitals. Thus, any hospital with fewer births would not have a chance to be sampled. Planned future analyses will consider the feasibility of increasing the size threshold, which would increase the undercoverage.
All documents received from the NCS before the interim proposed design document state that the sample is to be drawn from a list of hospitals and birthing centers. The design document does not state whether the same size threshold criterion would also be used for birthing centers, or even whether birthing centers are included in the database. If birthing centers are included on the same list frame with hospitals and a single size threshold is used, they are unlikely to be adequately represented in a probability-proportional-to-size sample because they tend to have many fewer births than birthing hospitals. If birthing centers are to be included in the sample, a separate list frame with a lower threshold might be considered; however, a separate sample of birthing centers may not be logistically feasible. Considering that birthing centers account for only 0.4 percent of births, with substantial variation by geography and race and ethnicity (MacDorman et al., 2014), it may be that these births should be excluded from the frame. Taken together and in light of these logistical considerations, the panel views these exclusions as reasonable, although the NCS plan for the inclusion or exclusion of birthing centers needs clarification.
Due to the possibility of seasonal patterns in births, exposures, and outcomes, the panel judges that the inference period should consist of full calendar years. Also, to maximize the utility of size measures that will be needed to sample with probability proportional to size at the PSU and SSU stages, the number of births to be used for those size measures should be based on whole years as well. However, the NCS plan is that the birth windows will be rolled out over time for the PSUs in the study. It appears that blocks of PSUs would begin enrollment in three or four approximately annual waves and there would be a phased activation of PSUs within waves over a period of several weeks to a few months, starting with the smaller PSUs. Whether this rollout happens in blocks of PSUs being launched the first of each year or on a rolling basis over a few years, the different start dates for the birth windows will require reconciliation in the estimation process. Although this staggering of launch dates is necessary for a study of this size, these differences in start dates for the birth windows not only affect how the target population is described but may also increase the bias and variance of estimates, because data from the birth windows will have to be adjusted to represent the inference period.
Deviations from a full 4-year birth window for any PSUs or sample components could also be problematic from an estimation standpoint. The NCS Program Office commented (NICHD, 2013i, p. 2) that “some of the larger providers may have a slightly shorter than 4-year recruitment period.” The panel’s understanding is that larger providers would be fielded later due to their smaller sampling fractions and the flexibility that brings to the enrollment process, but it does not follow that their birth window would be shorter. Similarly, the NCS Program Office’s comment that enrolling women for the prenatal sample component would take place only through the first 3 years (NICHD, 2014a) is another possible example of such a planned deviation from a 4-year birth window. By definition, because the births must occur during a PSU’s 4-year window to be eligible for the study, prenatal recruitment of births must start 6 to 9 months prior to the start of year 1 of the birth window and end 6 to 9 months prior to the end of year 4 of the birth window. We expect that women recruited during these sampling times who end up delivering earlier or later than expected, and who therefore do not deliver during the birth window, would later be considered ineligible for the NCS.
CONCLUSION 3-1: The panel endorses the proposed target population of all births in the United States during a specified time period consisting of 4 full calendar years, as well as the proposed sample exclusions from this target population.
An overall size of 100,000 first appears in NCS documents in 2002 and has been assumed or endorsed many times since then.14 In the current proposal, the size of the probability sample has been reduced to 90,000. The NICHD (2013d, p. 20) states:15
No matter what design NCS may use, there will be limits to detect associations between exposures and outcomes. The proposed design with a national probability sample size of 90,000 is limited to detect associations
14The second meeting of the NCS Advisory Committee in 2002 discussed a sample size of 100,000. For a record of that meeting, see http://www.nationalchildrensstudy.gov/about/organization/advisorycommittee/2002Jun/Pages/SAC_062002_minutes.aspx [March 2014]. In sample design documents presented to the NCS Advisory Committee in 2004, the sample size of 100,000 was a “given.” See http://www.nationalchildrensstudy.gov/about/organization/advisorycommittee/2004Jun/Pages/other_work_062004.aspx [March 2014]. The design reviewed in the earlier study also specified a probability sample size of 100,000 (National Research Council and Institute of Medicine, 2008, p. 2).
15For the 30 hypotheses included in the 2007 design, NICHD (2013b, Appendix 2, pp. 40-41) illustrates minimum detectable odds ratios for various sample sizes (70,000, 80,000, and 90,000), for selected exposure percentages and outcome percentages under the previous (geographic-based) sample design. The table also includes the design effect (“DEFF”) associated with the previous county-based sample design. A design effect is used to indicate the extent to which a sample design that deviates from simple random sampling increases the variance of estimates. In that table DEFF typically ranges between 3 and 4, but the maximum is about 20.
between exposures with a prevalence of about 3% and outcomes with a prevalence of about 2%.
In addition, a table in NICHD (2013g, p. 4) shows the sample sizes needed to detect effect sizes of various magnitudes with 80 percent power and 5 percent significance level16 for different levels of exposure and different prevalence levels for the outcome. These calculations are based on an assumption of simple random sampling, although the document notes that, because a complex sample design will be needed, the sample sizes in the table will need to be “multiplied by design effects for the particular estimate under study.” Because the hospital-based sample design has not yet been completed and actual selection probabilities for births have not been derived, these design effects are not yet known. The uncertainty surrounding the scope of the design effects makes it difficult for the panel to judge the adequacy of the proposed 90,000-birth sample size.
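The role of the design effect can be shown with a short calculation. The sketch below uses design effects of 3 and 4, the typical range reported for the previous county-based design; the design effects for the hospital-based design are not yet known, so these values are assumptions for illustration only, as are the function names.

```python
def effective_sample_size(n, deff):
    """Simple-random-sample size that gives the same precision as a
    complex sample of size n with design effect deff."""
    return n / deff

def inflated_sample_size(n_srs, deff):
    """Complex-sample size needed to match the precision of a simple
    random sample of size n_srs."""
    return n_srs * deff

# Assumed design effects, borrowed from the earlier county-based design.
for deff in (3, 4):
    print(deff, effective_sample_size(90_000, deff))
# 3 30000.0
# 4 22500.0
```

Under these assumed values, a 90,000-birth sample would carry roughly the information of 22,500 to 30,000 births drawn by simple random sampling, which is why the adequacy of the sample size cannot be judged until the design effects are estimated.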
Another difficulty is the NCS Program Office’s strategy to deemphasize the use of explicit hypotheses to guide the study design so the NCS can serve as a platform for future research. This change makes it difficult to evaluate the proposed sample size by consideration of explicit study objectives. In particular, if lack of resources were to result in a smaller probability sample, analysis of specific objectives might become necessary to reevaluate the effects of the reduction on the study design and, more broadly, to assess the feasibility of the study itself. The panel does endorse the notion of the NCS being a study platform, but this decision means the proposed design is likely to be inefficient and perhaps insufficient for certain types of analyses to answer specific questions that arise later.
CONCLUSION 3-2: Because of the lack of explicit hypotheses in the study design, it is not possible for the panel to judge whether the proposed sample size is justified on the basis of the study’s objectives.
Equal Probability Sample
The stated goal of the NCS is to have an equal probability of selection sample to the extent possible.17 The intent is to select each newborn, either
16In this case, the stated goal is to identify an odds ratio of 2 or greater with a power of 0.8 and a two-tailed significance level of 0.05.
17As stated by the NICHD (2013d, p. 11), a guiding principle of the NCS sample design is “simple sampling weights at the outset to increase utility of the data later.” In addition, NICHD (2013e, p. 3) states: “The NCS is proposing an equal probability design …”; and (p. 4) “the only circumstances when an unequal probability of selection may arise is in the sibling cohort.” In another document, the NICHD (2013g, p. 5) states: “We are currently planning on an equal probability sample of births, that is, not oversampling for any special group.”
prenatally or at birth, with the same probability of being included in the sample, exclusive of the sibling sample. Despite the complexity and variations of the designs that have been proposed or investigated over the last decade, all have attempted to provide an equal probability of selection to all births occurring during the birth window. As discussed in Chapter 2, one rationale provided by the NCS is that “equal probability” sampling is a logical approach if the study is to serve as a study platform that would be able to address many current and possibly unanticipated domains of future scientific inquiry.
The previous review of the NCS (National Research Council and Institute of Medicine, 2008) judged that the lack of any oversampling of population subgroups is justified because the planned sample size of 100,000 would provide sample sizes for major demographic subgroups that are large enough to provide adequate statistical power across a number of subgroups of interest for research on health disparities.
The current panel reconsidered this issue, as well as whether oversampling might be needed to adjust for analytically interesting population subgroups, such as low-birth-weight babies born to families of low socioeconomic status, who may have higher expected nonresponse and attrition rates. As noted above, the 2000 Children’s Health Act directed that the NCS be designed to “consider health disparities among children.” To address the issue of sample size for population subgroups important for research on health disparities, the panel calculated the fraction of births in 2011 for combinations of race, ethnicity, and maternal education level (factors related to low socioeconomic status). As shown in Table 3-1, one of the smallest percentages of births across the combination of these characteristics is 1.9 percent for non-Hispanic blacks with a bachelor’s degree or higher. The NCS’s initial sample size of 100,000 would be expected to yield about 1,900 such births without any oversampling.18 The forecasted 20 percent attrition for the study would reduce this figure to about 1,500, which is still likely to be large enough to support many important estimates for this group.
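The expected yield for this subgroup follows directly from the population share, the total sample size, and the attrition rate. The sketch below reproduces the figures in the text and, as an added illustration, applies the same shares to the currently proposed 90,000-birth sample; the function name is hypothetical, and the calculation assumes that attrition in the subgroup matches the overall forecast.

```python
def expected_subgroup_size(total_sample, subgroup_share, attrition_rate=0.0):
    """Expected number of sampled births in a subgroup under equal
    probability sampling, after a uniform attrition rate is applied."""
    return total_sample * subgroup_share * (1 - attrition_rate)

share = 0.019  # non-Hispanic black mothers with a bachelor's degree or higher

print(round(expected_subgroup_size(100_000, share)))        # 1900
print(round(expected_subgroup_size(100_000, share, 0.20)))  # 1520
print(round(expected_subgroup_size(90_000, share)))         # 1710
print(round(expected_subgroup_size(90_000, share, 0.20)))   # 1368
```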
Of course, the categories of “non-Hispanic Black” and “Hispanic” are broad and do not include other potentially important but smaller racial and ethnic subgroups, such as American Indians and Alaska Natives, Asians (overall and of different origins), or specific Hispanic national origins (e.g., Mexicans or Puerto Ricans). The sample design should take into consideration whether the benefits of including adequate representation of births in such smaller subgroups justifies stratification or oversampling, relative to the statistical cost of reducing the efficiency of the sample for making estimates of the overall population of births. More generally, the sample design needs to be explicit about the subgroups for which there may be adequate numbers and the subgroups for which the numbers may be inadequate. The lack of core hypotheses and
18Stratification could be used to help ensure this yield.
TABLE 3-1 Percentage of Births in Various Combinations of Race and Ethnicity and Maternal Educational Attainment, 2011
| Race and Ethnicity | Less Than High School | High School Graduates | Some College | Bachelor’s Degree or Higher | Percentage |
NOTE: This table is based on the 80.0 percent of births for which both race and ethnicity and maternal education were known: N = 3,293,891.
SOURCE: Data from the 2011 Natality Public Use File, Centers for Disease Control and Prevention. See http://www.cdc.gov/nchs/data_access/vitalstatsonline.htm [May 2014].
research priorities to guide the design makes weighing such sample allocation decisions virtually impossible to resolve by standard scientific approaches.
Differential attrition is also an important consideration in deciding whether to use any oversampling in the design. While the stated intention of the NCS (NICHD, 2013b, pp. 26-28) is to monitor attrition by population subgroups and develop improved retention approaches when necessary, the panel is still concerned about disproportionate attrition among disadvantaged groups that have relatively more health and developmental problems. Most national longitudinal studies, such as the Early Childhood Longitudinal Study—Birth Cohort Study and the Fragile Families and Child Well-being Study,19 have suffered disproportionate attrition among socially disadvantaged groups.
A strategy to counterbalance the analytic impacts of disproportionate attrition is to oversample disadvantaged groups at the beginning of a study relative to their likely attrition patterns through the middle or end of the follow-up period. This approach will increase statistical precision by both reaching and maintaining the targeted sample size for high attrition groups and by achieving final weights that are more homogeneous, thus reducing the effects of weight variation on the estimates.
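The effect on weight variation can be illustrated with a two-group example. In the sketch below, the group shares, retention rates, and sample counts are hypothetical, and Kish's approximation (one plus the squared coefficient of variation of the weights) is used as a rough measure of the variance inflation due to unequal weights; it is not a substitute for a full design effect calculation.

```python
def retained_groups(initial_counts, base_weights, retention):
    """Return (final_weight, retained_count) for each group, where the
    final weight is the base weight adjusted for attrition."""
    return [(base_weights[g] / retention[g], initial_counts[g] * retention[g])
            for g in initial_counts]

def kish_design_effect(groups):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2 over retained cases."""
    n = sum(count for _, count in groups)
    sum_w = sum(w * count for w, count in groups)
    sum_w2 = sum(w * w * count for w, count in groups)
    return n * sum_w2 / sum_w ** 2

# Hypothetical retention: 90 percent in group A, 60 percent in group B.
retention = {"A": 0.9, "B": 0.6}

# Equal probability design: same base weight for everyone.
equal = retained_groups({"A": 800, "B": 200}, {"A": 1.0, "B": 1.0}, retention)

# Group B oversampled by 0.9 / 0.6 = 1.5, so its base weight is 1 / 1.5.
over = retained_groups({"A": 800, "B": 300}, {"A": 1.0, "B": 1 / 1.5}, retention)

print(kish_design_effect(equal))  # slightly above 1
print(kish_design_effect(over))   # approximately 1: final weights are equal
```

In the oversampled design the final weights of the two groups coincide, and group B retains 180 cases instead of 120, which is the sense in which oversampling addresses both sample size and weight variation.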
During the panel’s October 2013 public meeting, NCS staff expressed the belief that the advent of social media would likely produce different and less predictable attrition patterns in the NCS, making it unwise to alter the study design with disproportionate representation of disadvantaged subgroups. The panel believes that this judgment needs to be based on a careful review of the recent experience of U.S. studies with designs and survey contractors similar to those employed by the NCS.
CONCLUSION 3-3: By adopting an equal probability of selection design for the National Children’s Study, it is likely that the sample sizes for a number of subgroups of interest will be inadequate for some important types of analysis. These subgroups are likely to include minorities in the U.S. population who are known to be on the negative side of health disparities and to have higher attrition in longitudinal studies. However, the absence of explicit study hypotheses and objectives makes it difficult to identify these important population subgroups and their associated sample size requirements.
Stratification is a key element of most sample designs and has been mentioned but not detailed in documents provided to the panel. Stratification can help to ensure proportional representation of key subgroups of the target population and that the sample includes regions with varying levels of demographic characteristics, exposures, and other variables of analytic interest. It can also be used to oversample certain subgroups. In this section, we discuss the use of stratification to ensure variation of key attributes of subgroups to address the NCS’s goals regarding health disparities and environmental influences.
Describing and understanding health disparities by race and ethnicity requires separating the independent contributions to health of socioeconomic status, race and ethnicity, and geographic location (see LaVeist, 2005; Williams and Sternthal, 2010; Yang et al., 2004). Prior studies have had difficulty separating the effects of these different factors because of strong confounding of socioeconomic status and geography with race and ethnicity. The presence of sufficient variability in socioeconomic status and geography within race and ethnic groups is fundamental to answering key questions about health disparities.
The brief PSU design summary the panel received (NICHD, 2013i) indicated that the current PSU design is likely to be composed of 200 to 300 hospital PSUs. The summary, which did not include stratification, stated that the contractors will continue to refine their analyses by adjusting some assumptions and evaluating potential stratification variables, including area-level income, birthweight, infant death, race and ethnicity, premature birth, respiratory distress syndrome, and health insurance type; however, this work had not been completed as of February 2014.
To strengthen its ability to study health disparities, the NCS Main Study
sample could be stratified by characteristics currently thought to be strongly associated with these disparities, including the ones listed in the PSU design summary, as well as by geographic region, exposures, and urbanicity. The panel believes it is essential to stratify in such a way as to include in the sample a sufficient number of births into families with both high and low socioeconomic status within racial and ethnic minority groups, as well as assuring variation in exposures and geography.
Stratification can be done at more than one sampling stage, depending on what characteristics are available in the sampling frames at each level. In fact, the information needed to stratify may drive the NCS to one particular sampling plan over another. For example, if needed stratification variables are available at the county level but not for hospitals, that may be an argument for keeping counties as the PSUs, rather than using hospitals as the PSUs. Thus, those refining the NCS sample design should investigate the extent to which any or all of these proposed stratification variables (or variables highly correlated with them) are available for various sampling units, whether counties, hospitals, or provider locations.
The currently proposed plan to split the provider sample frame to enable half of the sample of births to enter the sample from prenatal providers and half from hospitals is an atypical form of stratification. While the panel recommends (Chapter 2) that the sample enrollment not be split into prenatal and birth strata, if this were to be implemented as planned, the Program Office would need to specify how the SSU provider frames would be split, including any stratification and probability methods, and the strategy for making the two subframes comparable.20 If the split is nonrandom, then each half of the split would not be representative of the target population on its own and so an unbiased inference could proceed only when both subsamples are analyzed in combination.21 This approach would be quite restrictive for some types of analysis—in particular, estimates using data collected prenatally.
20The NICHD (2013d, p. 14) states: “In a balanced sample allocation model between birth and prenatal recruitment, the prenatal location strata frame will have a cumulative measure of the number of annual births needed equal to 50% of the PSU annual births.”
21For example, if the birth stratum providers are not selected randomly, then any model that uses prenatal information as a predictor can only use half of the data, and that half is not a random sample. The degree of bias this will impart in an analysis depends on what nonrandom system was used for stratification. If, for example, mostly smaller practices were selected for the birth stratum and if richer women go to large practices, then the estimated relationships between prenatal factors and birth outcomes would be biased.
RECOMMENDATION 3-1: The National Children’s Study Main Study sample should be stratified by characteristics that will achieve variability in socioeconomic status within important population groups to support analysis of health disparities, as well as achieving variability in environmental exposures and geography to support analysis of relationships between exposures and health outcomes.
Hospitals as Primary Sampling Units
The panel was asked to evaluate the proposed hospital-based sampling design, that is, the use of a list of hospitals and birthing centers as the PSUs. This approach differs from those tested in the Vanguard Study, which used geographically based PSUs.22 It also differs from the approach proposed at the workshop on the design of the NCS Main Study in January 2013, which used geographically based PSUs with hospitals as SSUs (see National Research Council and Institute of Medicine, 2013).
In the locations involved in the Vanguard Study initial and alternative recruitment pilots, all hospitals at which women selected into the sample delivered were asked to participate in the study by providing birth specimens for women who had already consented to be in the sample. Hospitals were not used as sampling units. Hospital-based recruitment was tested in the Vanguard Study by targeting three hospitals selected in each of three geographic PSUs, asking them to collect birth specimens for all women in order to have the information for women recruited after birth. Currently, the Program Office has proposed using a list frame of hospitals as the PSU frame for the study. The stated rationale is that by selecting hospitals first, and then providers within hospitals, the study would minimize the number of hospitals to enlist to participate in the birth specimen collections.
The documentation provided by the NCS concerning the proposed sample design is incomplete because the statistical work to develop the first-stage sampling plan for hospitals had not been completed by February 2014. It may be that a list of hospitals could provide a useful frame for the NCS first-stage sample. However, the panel has very little information concerning the quality of the proposed frame as a basis for the proposed PSU design.
While the State Information Databases currently proposed as a frame appear to be a feasible approach, the panel is concerned about potential undercoverage because the analysis currently under way is based on data from only 27 states.23 Because the objective of the NCS is to draw inferences that are representative of the U.S. population, all hospitals in the United States must be listed and have a non-zero chance of selection. If the state databases are incomplete, especially with regard to only some states, the Program Office would have to consider a hybrid approach in which states that are fully covered in the State Information Databases hospital frame use the hospital-based sampling design and other states make use of a geographically based design.
22The previous set of 110 PSUs was selected by the National Center for Health Statistics in 2005, using the number of live births for 1999-2002 as the measure of size.
23According to the NICHD (2013i), data from 27 states are available without getting separate permissions from the states. The data from the remaining states may or may not be available to the NCS because the Program Office would have to obtain permission from each of the other 23 states to use their data.
Furthermore, according to the documentation for the State Information Databases, the availability of variables varies by state, including the hospital identifier. There is also some need to verify that the distribution of births on which the sample will be drawn is consistent with expected births during the recruitment period. This factor is important because the health care environment is highly volatile, with patient populations shifting quickly among hospitals. If based on data from 2010, at least 4 years will have passed between the date of the sample and the start of the study. Hospitals could have joined different networks, changed affiliated provider groups, or experienced increases or decreases in the number of births of various types. Access to vital statistics data at the state level for the prior year would indicate if such shifts have occurred.
CONCLUSION 3-4: The panel has not been provided with sufficient detail on the planned hospital-based sample design and recruitment strategy to judge their merits and scientific validity or determine potential coverage bias and the availability of appropriate stratification variables.
The NCS Program Office has asserted that the proposed hospital-based sampling approach would make it easier to make substitutions for hospitals that refuse to participate or to replace those that are found to be ineligible for the NCS (for example, if they have closed the obstetrics service) relative to the county-based approach. However, no evidence was provided to the panel to support this claim.
In the Vanguard Study, as in many community-based studies, geographic sampling even at the county level, and especially at the segment level, could function as a method to achieve diversity in socioeconomic status, exposures, and race and ethnicity because these characteristics are clustered by geography. It is less clear whether and how a hospital sampling frame could identify a set of key variables that would allow a similar type of efficient stratification while still maintaining strict “all births” probability sampling.
Assessment of the proposed sample design, when completed, should include comparisons with the previous designs and variations to those designs. As noted above, a previous design, which was tested in three Vanguard Study locations, used geography to define the first-stage sample, with prenatal care providers selected from the sampled geographic areas, followed by recruitment of women from the providers. One of the challenges with this plan was the number of hospitals that would need to be enlisted to collect birth specimens for women already enrolled in the study. A variation, previously proposed by NCS (see National Research Council and Institute of Medicine, 2013), was to use the same sampled geographic locations, with the second stage to be a
sample of hospitals in the selected area, and the third stage to be the providers associated with selected hospitals. A comparison could also be made with a hybrid approach: using hospitals as PSUs in states for which a complete list is available and using geographic PSUs in other states. Any comparison needs to include a cost-effectiveness analysis of the options and an assessment of the ability of each option to ensure coverage and to control for such characteristics as race and ethnicity, socioeconomic status, age, and marital status so that the sample will support evaluation of health disparities.
If the NCS reconsiders a variation on the previous county-based PSU design, the PSUs would have to be redrawn to reflect more current data. However, maximizing the overlap between the old PSUs (at least the locations that had experience in the Vanguard Study) and newly drawn PSUs (see, e.g., Ernst, 1999) might add efficiencies because the birthing hospitals are already familiar with the NCS, and NCS contractors continue to follow the Vanguard Study cohort participants in these locations.
CONCLUSION 3-5: The panel has not been provided with sufficient justification for moving to hospital-based primary sampling units from the sampling approach previously proposed by the National Children’s Study, discussed at the 2013 NCS Workshop (see National Research Council and Institute of Medicine, 2013), and based on Vanguard Study pilot testing: namely, county-based primary sampling units with hospitals as secondary sampling units and providers as third-stage sampling units.
Because the current plan calls for hospitals to be selected with probability proportional to size, it is important that a good measure of size be available for each hospital on the frame. Inaccurate size measures in a multistage sample can lead to a less efficient design. The best that can be achieved in practice is that such measures would be at least 1 year old at the time of sampling.
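To make the role of the size measure concrete, the following is a minimal sketch of systematic selection with probability proportional to size. It is not the NCS contractors' procedure, which was not documented for the panel; the hospital identifiers, birth counts, and function names are hypothetical, and the sketch assumes that no hospital is larger than the sampling interval (such units are normally taken with certainty).

```python
import random

def pps_systematic(sizes, n, rng=None):
    """Select n units by systematic sampling with probability proportional to size.

    sizes: dict mapping unit id -> measure of size (e.g., annual births).
    Assumption: no unit exceeds the sampling interval, so each unit can be
    selected at most once.
    """
    rng = rng or random.Random()
    total = sum(sizes.values())
    interval = total / n
    if max(sizes.values()) > interval:
        raise ValueError("a unit exceeds the interval; treat it as a certainty unit")
    start = rng.random() * interval          # random start in [0, interval)
    points = [start + k * interval for k in range(n)]
    selected, cumulative, i = [], 0.0, 0
    for unit, size in sizes.items():
        cumulative += size
        # A unit is selected when a selection point falls in its size range.
        while i < n and points[i] < cumulative:
            selected.append(unit)
            i += 1
    return selected

def inclusion_probability(sizes, unit, n):
    """First-stage inclusion probability under PPS: n * size / total size."""
    return n * sizes[unit] / sum(sizes.values())

# Hypothetical frame: hospital id -> annual births (illustrative values only).
hospitals = {"H1": 1200, "H2": 800, "H3": 2500, "H4": 400,
             "H5": 1500, "H6": 600, "H7": 2000}
sample = pps_systematic(hospitals, 3, random.Random(1))
```

If the size measure on the frame is out of date, the inclusion probabilities computed this way no longer track the actual number of births, which is the source of the efficiency loss noted above.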
Better information concerning initial hospital cooperation rates and recruitment of women at hospitals after birth is needed for efficient protocol development, planning, and cost analysis. A very small sample of hospitals was used in the hospital sampling component of the Vanguard Study, and those samples were drawn within three geographic PSUs. The method of selection was not described to the panel, however, and in one county the “Study Center staff had an existing relationship with selected hospitals” (NICHD, 2013f, p. 2). A detailed plan for replacement of hospitals that decline to participate needs to be better delineated, as well as how such replacements will be dealt with in weighting and response rate calculations, especially if similar replacements do not exist or are otherwise not available.
NCS contractors will need to receive institutional review board (IRB)
approval from the NICHD IRB and possibly other IRBs24 of record in order to receive individually identifiable information needed for the study from hospitals and providers. Based on Vanguard experience, most providers do not have separate IRBs. However, hospitals may have their own IRBs and require that the NCS use their IRBs for in-hospital data collection. This may lead to challenges in gaining cooperation, and may increase costs. (The panel’s cost analysis did not include costs for IRBs.)
Providers as Secondary Sampling Units
The panel was also asked to evaluate the use of health care providers to sample and recruit prospective participants. The use of providers to recruit prospective participants was tested in the provider-based recruitment arm of the Vanguard Study (10 locations), and the use of providers to sample and recruit prospective participants was tested in the provider-based sampling part of the Vanguard Study pilot (3 locations). In both of them, providers were selected within the previous plan’s geographic PSUs, and women were sampled within these providers. Even though many procedures that are being proposed for the Main Study have been tested in the Vanguard Study, the panel was not provided with detailed information about all approaches that were tested and how well they worked. Some detail and discussion of the NCS provider-based sampling experience has been published by former Vanguard Study principal investigators (see, e.g., Belanger et al., 2013).
Several aspects of the design are of particular concern to the panel: the lack of information concerning the feasibility of developing a complete sampling frame of providers (specifically, practice locations) for selected hospitals; definition of the appropriate measures of size for sampling; approaches for dealing with providers that use more than one hospital; and how women are proposed to be recruited at their “first prenatal” visit, including the precise definition of such a visit. Like the hospital frame, the provider frame would have to be complete, recent, and of high quality and contain, at a minimum, a size measure that can be used for a sampling probability proportional to size of providers. The plan for this stage of sampling is to stratify the provider frames “by such provider location characteristics as are available for that list. Stratification factors could include geographical location, size, race/ethnicity mix, percentage of women on Medicaid, depending on the data available” (NICHD, 2013h, p. 2).
24The NICHD (2014a, pp. 6-7) states “there will be no need for IRB approval at each hospital as the HHS Office for Human Research Protection (OHRP) has determined that the engagement of facilities in the NCS is not human subject research. Actual data collection will be performed by NCS contractors. NCS will make use of the central IRB at NICHD in accordance to the HHS Office of Human Research Protections. If a hospital opts not to conform to the OHRP determination and wishes to use its own IRB, the NCS does not plan to pay the IRB administrative costs but will provide a standard submission package and annual reports.”
As with hospitals, a detailed plan for replacement of providers at the time of recruitment needs to be better delineated, as well as how such replacements will be dealt with in weighting and response rate calculations. In response to expected attrition, the current plan is to select a replacement entity from the same stratum on the frame. After 2 years in the field, NCS plans to check the stability of the provider list and, if needed, draw replacement providers (from an updated frame) that are similar to those that withdrew from the study. The Program Office (NICHD, 2013g, p. 6) stated that, in the Vanguard Study experience, attrition of providers over the 1st year or 2 has been small: 1 of 49.
Births as Last-Stage Sampling Units
With their provider-based recruitment and provider-based sampling experiments, the NCS may have sufficient pilot testing experience with the sampling of women at prenatal provider offices. But little information was provided to the panel about the pilot testing of the sampling of women just after birth at hospitals. (Chapter 4 discusses concerns about recruitment and data collection of sampled women at hospitals.)
Some births will be sampled at hospitals whether hospital sampling is used for a full birth stratum, as currently proposed by NCS, or only for women who have had no prenatal care or who received their first prenatal visit from a provider not included in the provider frame (as recommended in Chapter 2). While some of the procedures for identifying women eligible for hospital recruitment have been tested in the Vanguard Study, the panel was not provided with information on how these hospitals were selected, only that some may have been positively disposed to the NCS investigators.25
For hospital-based sampling, it is important that the NCS demonstrate it is able to: (1) identify all women eligible for selection in that setting; (2) sample them with known probabilities that can be adequately controlled during the recruitment process; and (3) recruit them with high success rates before these women leave the hospital. It is not clear to the panel that all hospitals will be willing to allow women to be recruited shortly after delivery or to assist in the identification of women in labor who are eligible for sampling in a consistent and scientifically valid way. It is likely that the opportunity to sample and recruit some eligible women will be missed. To calculate selection probabilities and response rates, it is crucial that all deliveries that are eligible or potentially eligible for inclusion in the study be accounted for, even if the recruitment opportunity is missed.
The Vanguard Study conducted a limited hospital-based sampling approach (proposed for the birth stratum). Attempts were made to recruit three hospitals
25In one location, researchers had existing relationships with hospitals and were able to make use of the hospital’s electronic records to prescreen women for eligibility (NICHD, 2013f).
in each of three locations, and seven of the nine participated: thus, the provider-based sampling recruitment rate for hospitals was 77.8 percent.26 However, the Program Office pointed out that the two that did not participate were not refusals but that the project ran out of time, so it is not possible to predict what the ultimate refusal rate would have been. Based on information from all hospitals approached during the Vanguard Study, the Program Office has stated that “they expect no more than 15%, and perhaps no more than 10% of hospitals to decline to participate” (NICHD, 2013d, p. 56).27 However, this may be optimistic because most of the hospitals in the Vanguard Study were asked only to collect specimens for women who had already consented to participate in the study, a much lower level of effort than what is currently proposed for the Main Study. If the study design includes a substantial hospital-based sampling stratum, a pilot study would be needed to evaluate these issues.
The Possibility of Nonresponse Bias
Based on the Vanguard Study, nonparticipation rates at each stage of sampling (hospital, provider, pregnant women, and births) and the associated cumulative nonparticipation rates appear to be high.28 Aside from nonparticipation of hospitals and providers,29 which will be addressed using sample substitution and perhaps not included in the denominator of the response rate, it appears that provider-based sampling of women in the Vanguard Study (NICHD, 2013b, p. 14) had the following response rates:
- contacted for screening among sampled: 74 percent
- completed screening among contacted: 70 percent
- recruited among screened and eligible: 68 percent30
26The panel calculated this rate from NICHD (2013f, p. 4): 7/9 = 77.8 percent. This is rounded in NICHD (2013d, p. 72), which states: “In PBS [provider-based sampling] experience the hospital recruitment rate was 80%.” The panel’s cost analysis assumed that 80 percent of hospitals would agree to participate.
27There is also considerable local variation. In one Vanguard Study location that implemented provider-based recruitment, Kerver et al. (2013) found that the participation rate among hospitals was 71 percent, even though hospitals were only engaged to obtain biospecimens for previously consented women.
28The NICHD (2013g, p. 7) agreed with this statement but clarified that “the response/participation rates at each stage of the Vanguard Study have been acceptable and comparable to or higher than other surveys of this nature.”
29Results from the provider-based recruitment and provider-based sampling in the Vanguard Study indicate expected provider recruitment success rates of 64 to 68 percent.
30The product of rates in the above bullets for “completed screening among contacted” and “recruited among screened and eligible” is 47.6 percent, an estimate for the percentage of women contacted who were eligible and recruited. The panel’s cost analysis assumed that this rate is 50 percent.
The response rate at recruitment (i.e., the fraction of women recruited relative to the total number of women sampled and estimated to be eligible) is 35 percent—the product of these three response rates.
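The arithmetic behind the 35 percent figure can be reproduced with a short calculation. The sketch below simply multiplies the three stage-level rates listed above; the dictionary keys and function name are illustrative and do not come from NCS documentation.

```python
from math import prod

# Stage-level rates reported for provider-based sampling of women
# in the Vanguard Study (NICHD, 2013b, p. 14).
stage_rates = {
    "contacted_among_sampled": 0.74,
    "screened_among_contacted": 0.70,
    "recruited_among_screened_eligible": 0.68,
}

def cumulative_rate(rates):
    """Multiply conditional stage-level rates to obtain the overall rate."""
    return prod(rates)

# Overall response rate at recruitment: about 0.352, reported as 35 percent.
recruitment_rate = cumulative_rate(stage_rates.values())
```

Because the stage rates multiply, a modest shortfall at any one stage lowers the overall rate proportionally, which is why each stage matters for the final figure.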
This low response rate is problematic in part because if it differs across groups, the resulting estimates based on the NCS Main Study sample may be biased, even with well-designed weighting adjustments. Findings from the initial Vanguard Study recruitment pilot showed variation in cooperation rates at each stage across PSUs and by race and ethnicity, with significantly lower consent for women eligible for recruitment among Asians (Baker et al., 2014). The current design also may introduce differential enrollment between minors and adults due to the requirement for consent of the legal guardian for unemancipated minor pregnant women, because the legal definition of an emancipated minor varies among jurisdictions.
Some of the children enrolled in the study will be lost to attrition, which is likely to occur differentially across groups.31 According to NICHD (2013b, p. 27): “[The] NCS Vanguard Study data shows for preconception and prenatal women attrition between enrollment and birth ranges from 10 to 20 percent.”32 The Vanguard Study experiences were similar to those reported by other studies: attrition is greater during the initial one or two data collection visits after enrollment and then tends to decrease. In the proposed birth stratum, if this pattern occurs, there likely will be more substantial attrition during the first six months after birth than will be observed during the six months after birth in the prenatal stratum, as the latter already experienced their initial attrition pattern during their prenatal period. However, data to estimate attrition during the first six months after birth in the birth stratum are not yet available from the Vanguard Study (see NICHD, 2013f, p. 2). Attrition due to mobility33 is greater in urban areas and among populations of low socioeconomic status who live in rental housing. In addition, NCS estimates that the annual retention after birth will be between 97 and 99 percent based on the experience of other longitudinal studies (NICHD, 2013b, p. 26).34
31By attrition, we mean individuals who drop out of the study after recruitment. In a study as complex and long-running as the NCS, there are also challenges because sample members may provide incomplete information on survey or other data collection instruments. These kinds of missing data issues are not addressed here.
32Some of this sample loss is due to ineligibility (e.g., miscarriage, moving out of the PSU) rather than attrition.
33Though the NCS plans to follow movers post-birth, many of them are likely to be difficult to follow.
34The panel’s cost analysis assumed 10 percent attrition in the first year after recruitment of the mother, 3 percent in the second year, 2 percent in the third year, and then 1 percent annually after that.
It is difficult to estimate the cumulative fractions of women and children likely to be lost over the course of the study. The assumed 10 percent to 20 percent attrition rates between prenatal recruitment and birth reduce the 35 percent response rate at recruitment to a cumulative response rate at birth of between 28 percent and 32 percent.35 Even an annual 98 percent retention rate between birth and age 12 reduces this estimated range of cumulative response rates to between 22 percent and 25 percent. Maintaining a 98 percent annual retention rate through age 21 produces an estimated cumulative response rate of between 18 percent and 21 percent.
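These ranges follow from compounding the recruitment rate, the prenatal attrition, and the annual retention rate. The sketch below reproduces that calculation under the assumptions stated in the text (35 percent recruitment, 10 to 20 percent prenatal attrition, 98 percent annual retention); the function and variable names are illustrative only.

```python
def cumulative_response(recruit_rate, prenatal_attrition, annual_retention, years):
    """Cumulative response rate `years` after birth.

    Recruitment rate, times the share retained from prenatal recruitment to
    birth, times annual retention compounded over the years since birth.
    """
    return recruit_rate * (1 - prenatal_attrition) * annual_retention ** years

RECRUIT_RATE = 0.35       # response rate at recruitment
ANNUAL_RETENTION = 0.98   # assumed annual retention after birth

# For each age, (low, high) uses 20 percent and 10 percent prenatal attrition.
ranges = {
    age: (
        cumulative_response(RECRUIT_RATE, 0.20, ANNUAL_RETENTION, age),
        cumulative_response(RECRUIT_RATE, 0.10, ANNUAL_RETENTION, age),
    )
    for age in (0, 12, 21)
}
```

The upper bound at birth is 0.35 × 0.90 = 0.315, which the text reports as 32 percent.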
CONCLUSION 3-6: Assuming that participation in the National Children’s Study Main Study follows patterns in the Vanguard Study, the cumulative response rate to birth for the prenatal stratum would be between 28 and 32 percent, and the rate to age 12 would be 22 to 25 percent—very small fractions of the eligible sample. The cumulative response rate to age 21 would be 18 to 21 percent. A thorough analysis of nonresponse bias is clearly indicated, and in any case will be required by the U.S. Office of Management and Budget.
CONCLUSION 3-7: The high rates of cumulative nonresponse expected in the National Children’s Study pose a severe risk for nonresponse bias that may not be mitigated by weighting adjustments, potentially making some study results invalid.
Optimal Use of Sibling Births
The current plan includes enrollment of all siblings from multiple births and all siblings born subsequently to the originally sampled “target” children but within the 4-year birth window. Assuming 20 percent cumulative attrition, there should be approximately 2,000 of the former and 8,000 of the latter.36 The panel endorses this proposal (as discussed in Chapter 2), but we note some issues with the plan.
Enrolling subsequent siblings in this way offers some important analytical advantages, but it also entails costs and loss of precision. Foremost among the advantages is that the sibling data will provide information on the preconception and very early prenatal maternal environments of mothers of the study’s target children who have a subsequent birth during the enrollment period. Such measures are not available for the target births since women are recruited into the study, at the earliest, at their first prenatal visit.
35Because the 10 to 20 percent attrition between prenatal recruitment and birth includes loss due to both nonparticipation and ineligibility, this cumulative rate is technically no longer a “response rate” as defined by industry standards. We use the term “cumulative response rate” here to indicate the rate of continued participation in the study among the population assumed to be eligible at the time of prenatal recruitment.
36These figures were estimated independently by the panel and the NCS Program Office.
Another analytic advantage of collecting data on siblings (including multiple births) is that sibling data can be used to estimate so-called sibling fixed-effect models that relate sibling differences in outcomes of interest to sibling differences in early environments (see Wooldridge, 2012). These models will allow researchers to control for characteristics of early environments that siblings share and that do not change over time (e.g., elements of maternal background), reducing bias in the estimated effects of variables of interest.
One analytic disadvantage of including siblings in the sample is that siblings are more alike than children chosen at random from the population, which reduces the precision of statistical estimates.37 Another drawback is that the preconception and early pregnancy data collected on these subsequent siblings cannot be generalized to all births because, while originally sampled births are a mixture of first and higher-order births, none of the subsequent siblings is a first birth.
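The size of the precision loss from sibling clustering can be gauged with a standard design-effect approximation. The sketch below is illustrative only: it assumes, purely for the sake of the example, that the roughly 10,000 sibling enrollments fall in 10,000 families with exactly two enrolled children each, that the remaining 70,000 families contribute one child, and that a single within-family correlation applies. The function names are hypothetical.

```python
def clustering_design_effect(family_counts, rho):
    """Approximate design effect for a mean under family clustering.

    family_counts: dict mapping family size m -> number of families of that size.
    rho: within-family (sibling) correlation of the outcome.
    Uses deff = 1 + (m* - 1) * rho, with m* = sum(m^2) / sum(m).
    """
    n_children = sum(m * count for m, count in family_counts.items())
    m_star = sum(m * m * count for m, count in family_counts.items()) / n_children
    return 1 + (m_star - 1) * rho

# ASSUMPTION (illustrative): 70,000 one-child families, 10,000 two-child families.
families = {1: 70_000, 2: 10_000}
n_children = sum(m * c for m, c in families.items())   # 90,000 children

# Correlation of about 0.50 is the vocabulary figure cited from Duncan et al. (2001).
deff = clustering_design_effect(families, 0.50)
effective_n = n_children / deff
```

Under these assumptions, an outcome with a sibling correlation of 0.50 yields an effective sample size of roughly 81,000 rather than 90,000; outcomes with lower sibling correlations lose less.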
CONCLUSION 3-8: Enrolling siblings as members of the National Children’s Study sample provides many analytic advantages, most prominently the gathering of preconception exposure information for second- and higher-order births. The panel endorses current plans to recruit siblings born after the initially recruited child—but only within the 4-year recruitment interval associated with the original primary sampling unit for the target birth—and to continue to follow these children until age 21.
Because subsequent siblings are to be included in the NCS with certainty, their probability of selection is quantifiable and they can, therefore, be considered part of the probability sample. For a probability sample of 90,000 births, this means that about 80,000 would be target births (recruited at providers or hospitals), about 2,000 would be multiple births (e.g., twins), and about 8,000 would be subsequent sibling births.
The NCS has proposed at least two options for dealing with the fact that the subsequent siblings have more than one chance of selection in the sample: using multiplicity weighting adjustments to the sampling weights and screening a woman at the time of recruitment to determine whether she had a prior birth during the enrollment period for that PSU. In the second option, if a screened mother indicates a prior birth during this period (using one of the providers or hospitals listed on the frame), the child associated with this mother’s current pregnancy already had a chance of selection as a subsequent sibling and would,
37For example, based on data from the National Longitudinal Study of Adolescent Health, Duncan et al. (2001) found that the correlation between same-sex full siblings in adolescent receptive vocabulary is about 0.50; in height it is about 0.46; and in an index of delinquent behavior it is 0.25.
therefore, be screened out of the sample, because its older sibling had a chance of selection into the study as a target child (whether actually selected or not).38
The weighting adjustments (option 1) could affect the precision of estimates for the total sample (90,000), in the same way that oversampling demographic groups could introduce weight variation, making them less precise. The consequences of these multiplicity adjustments are unknown but could lead to reductions in statistical precision and power. The panel has not been provided information that the screening methodology (option 2) has been pilot tested.
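As a rough illustration of how a multiplicity adjustment can reduce precision, the sketch below divides a child's base weight by the number of ways the child could have entered the sample and then applies Kish's approximation for the design effect due to unequal weights. This is a simplified textbook treatment, not the NCS's documented method; the base weight, the toy sample, and the function names are hypothetical.

```python
def multiplicity_weight(base_weight, chances):
    """Adjust a base weight for a child with several chances of selection.

    ASSUMPTION: each chance carries the same small selection probability, so
    the overall probability is approximately `chances` times the base one.
    """
    if chances < 1:
        raise ValueError("a sampled child has at least one chance of selection")
    return base_weight / chances

def kish_weighting_deff(weights):
    """Kish's approximate design effect from unequal weights: n * sum(w^2) / (sum w)^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)

# Hypothetical toy sample: nine children with one chance of selection and one
# subsequent sibling with two chances; base weight of 45.0 is illustrative.
BASE_WEIGHT = 45.0
chances = [1] * 9 + [2]
weights = [multiplicity_weight(BASE_WEIGHT, c) for c in chances]
deff_weights = kish_weighting_deff(weights)
```

A value above 1 indicates a loss of precision relative to an equally weighted sample of the same size; the actual magnitude for the NCS would depend on how many children have multiple chances of selection, which the panel could not assess.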
In either case, detailed information about all births to a woman within the enrollment period (date and place of birth, date and place of prenatal visit) has to be routinely collected at the time of sampling and recruitment. Option 1 requires additional effort to compute selection probabilities for prior births, and option 2 could substantially increase the recruitment effort and time needed to obtain the targeted number of women in the Main Study sample. Pilot testing could be used to develop procedures and questionnaires for both options, determine how well they work, and estimate how many women would be screened out under option 2.
CONCLUSION 3-9: Weight adjustment and screening are viable options for accounting for the fact that subsequent siblings have more than one way to enter the sample. The panel was not provided sufficient information to recommend one over the other. In either case, detailed information on prior births to the mother will need to be collected.
In terms of analytical issues, the NCS needs to consider how the subsequent siblings will be combined with the target children to make national estimates. One possibility is for the estimates to be stratified by birth order. Design-based estimates of preconception and early pregnancy exposures can be made from the subsequent siblings alone, but as mentioned above, these preconception and early pregnancy findings cannot be generalized to all births, only to second and higher-order births.
38The likelihood of more than one child from the same mother actually being selected as part of normal sampling procedures at selected providers or hospitals would be quite small.
Overall, the documents given to the panel did not provide sufficient details for an evaluation of whether the proposed sample would meet the minimal standards of a carefully specified, scientifically based sample design required for large national data collections. Many of the NCS design changes since 2008 appear to have been reactive and piecemeal, in response to issues that have arisen during the field testing or from prior reviewers, rather than from a planned and comprehensive approach to the design. One key missing document is a hospital-based PSU design with an assessment of the quality of the frame, discussion of stratification, selection of PSUs necessary to achieve design targets, methodology for replacing hospitals that decline to participate, details concerning calculation of selection probabilities, and proposed weight adjustments for nonresponse and attrition. While the NCS does have substantial pilot experience with the county-based PSU design in the Vanguard Study, little equivalent experience exists for the hospital-based PSU design.
CONCLUSION 3-10: As of February 2014, the currently proposed hospital-based sample design for the National Children’s Study had not been sufficiently developed or documented to support an evaluation.
CONCLUSION 3-11: The identification, sampling, and recruitment of women at the time of birth have not been sufficiently pilot tested, using a representative set of hospitals, to support any conclusion about this feature of the design of the National Children’s Study.
A detailed sampling plan and recruitment strategy for the NCS need to be fully developed and documented by sampling and survey experts who have extensive experience in conducting large longitudinal national surveys. This group should be external to the Program Office but would work in collaboration with it on all aspects of the design. The sampling plan needs to include a more detailed justification for moving away from the geographically based, provider-based sampling tested in the Vanguard Study to the currently proposed hospital-based design. The plan also needs to address the need for oversampling various subgroups that are expected to have a higher attrition rate over the life of the study. The stratification plan may, in turn, dictate whether a geographic- or hospital-based PSU is used for the NCS, as stratification variables may only be available for one of these two PSU types.
A plan to mitigate nonresponse bias also needs to be developed before the study moves forward, and it needs to include identifying and collecting auxiliary variables or covariates that are thought to predict response and retention propensity. Outside survey experts can help determine: what metrics to use to monitor the extent of bias in survey results; what reporting strategies to use to monitor those metrics during data collection; and how the covariates will be used to adjust analysis weights to mitigate nonresponse bias.
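One common way to use such covariates is a weighting-class adjustment: sampled cases are grouped into cells defined by auxiliary variables, and respondents' weights in each cell are inflated so that they also represent the cell's nonrespondents. The sketch below is a minimal illustration under stated assumptions, not a procedure proposed by the NCS or the panel. The record layout, the cell labels "A" and "B", and the weights are all hypothetical.

```python
from collections import defaultdict


def weighting_class_adjust(records):
    """Return respondents with nonresponse-adjusted weights.

    Each record is a dict with hypothetical keys:
      'cell'      - adjustment cell built from auxiliary covariates
      'weight'    - base sampling weight
      'responded' - True if the case responded
    Within each cell, respondent weights are multiplied by
    (weighted total of all sampled cases) / (weighted total of respondents).
    """
    total = defaultdict(float)
    responding = defaultdict(float)
    for r in records:
        total[r["cell"]] += r["weight"]
        if r["responded"]:
            responding[r["cell"]] += r["weight"]

    adjusted = []
    for r in records:
        if r["responded"]:
            factor = total[r["cell"]] / responding[r["cell"]]
            adjusted.append({**r, "adj_weight": r["weight"] * factor})
    return adjusted


# Hypothetical data: cell A has 2 of 3 respondents, cell B has 1 of 2.
sample = [
    {"cell": "A", "weight": 10.0, "responded": True},
    {"cell": "A", "weight": 10.0, "responded": True},
    {"cell": "A", "weight": 10.0, "responded": False},
    {"cell": "B", "weight": 20.0, "responded": True},
    {"cell": "B", "weight": 20.0, "responded": False},
]
respondents = weighting_class_adjust(sample)
print([r["adj_weight"] for r in respondents])
```

The adjusted respondent weights sum to the weighted total of the full sample only if every cell contains at least one respondent. A cell with no respondents drops out silently in this sketch and would have to be collapsed with a similar cell in practice. The adjustment reduces bias only to the extent that the cell-defining covariates predict both response propensity and the outcomes of interest.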
RECOMMENDATION 3-2: A detailed plan for sampling, recruitment, and minimizing attrition bias for the National Children’s Study’s (NCS) Main Study should be fully developed and evaluated by sampling and survey experts independent from the NCS and approved by the proposed independent oversight committee before the study moves forward.
The NCS needs to evaluate and document what has been learned from the Vanguard Study. The independent sampling and survey experts, in collaboration with the NCS, should determine what further pilot testing may be required. The NCS will need to conduct any needed pilot tests39 or otherwise demonstrate the ability to carry out each stage of sampling, recruitment, and initial data collection. The sample in the pilot test needs to be large enough and essentially representative so that estimates of coverage, cooperation, and other key rates that affect costs and sampling validity can be made. To the extent that some aspects of this design have already been pilot tested, the results should be analyzed to identify the gaps that still require pilot testing. Results from the Vanguard Study and other pilot testing can be used by the independent survey experts in determining the final study design. We also recommend an oversight committee (see Chapter 6) that would approve the final design before the Main Study is implemented.
39The NCS Program Office has indicated that it is committed to full pilot testing of the alternatives considered for the Main Study.