This chapter reviews the conceptual framework for designing the study content and evaluates the proposed study visit schedule, the proposed collections of data and samples, and the approach to define and characterize health disparities. It addresses two of the key items in the panel’s charge: the proposed study visit schedule with its emphasis on more frequent data collection in pregnancy and early childhood and the proposed approach to define and characterize health disparities. The chapter also comments on the approach to data dissemination after the study is under way because an important objective of the National Children’s Study (NCS) is to make the data publicly available.
The panel did not receive specific study protocols, information on specific data collection methods, or study instruments. Consequently, this review cannot address the scientific merit or quality of these aspects of the NCS data collection.
As noted in Chapter 1, several NCS features in addition to the sampling frame have changed since the previous review (National Research Council and Institute of Medicine, 2008). The current proposal conceptualizes the study less as a vehicle for testing current hypotheses and more as a platform for future researchers to formulate and investigate hypotheses that could be tested using data from the study instruments and previously collected biological and environmental specimens. Consequently, the proposal no longer relies on specific pre-specified hypotheses to define the content of the study. Instead, “the proposed plan was developed using several exemplar hypothesis so it is hypothesis informed but not hypothesis limited” (NICHD, 2013b, p. 34).
The NCS is also now conceptualized as a platform for researchers covering a broad range of health domains. It will not focus on classifying participants into predetermined disease categories, but instead will collect a set of primary observations and events to enable researchers to apply their own health criteria and form “cases” (p. 31). According to the NCS Program Office, this framework can be used to develop more flexible phenotypes. A theoretical framework of health as multidimensional and dynamic will guide the selection of assessment methods and instruments.
Study Platform and Exemplar Hypotheses
The panel recognizes that not all cohort studies, particularly of this size, are designed around specific hypotheses. Some, like the Norwegian Mother and Child Cohort Study, appear to use the “platform” approach of the NCS or follow specific precedents, such as the Millennium Cohort Study and earlier British birth cohort studies. However, more commonly, studies are organized around key questions or assessments of specific exposures and provide details on specific subquestions and how they inform the data collection. While there are a variety of approaches to delineating the design and content of a study, most studies appear more focused than the NCS, for example:
- The Generation R study focuses on five specific areas: see Hofman (2004). Each area has one or more “aims” with more specific questions linked to that aim. Those questions drive the data collection for that aim.
- The Fragile Families and Child Wellbeing Study relies on four overarching questions.1
- The Norwegian Mother and Child Cohort Study seems to take the platform approach of the NCS: see Magnus et al. (2006).
- The Millennium Cohort Study follows the precedent set by prior British birth cohort studies; specific questions are not listed.2
- The French Elfe Child Cohort study has a list of seven research questions to be answered by the study: see Charles et al. (2011).
2For a description, see http://www.cls.ioe.ac.uk/page.aspx?&sitesectionid=880&sitesectiontitle=Survey+Design [April 2014].
A summary of several of these studies and others indicates that the studies usually address broad, but focused questions that are not termed “hypotheses.”3
While the strategy to develop exemplar hypotheses or to state domains of interest rather than specific aims seems to be consistent with the range of approaches used in other large birth cohort studies, the panel determined that it was important to evaluate the specific strategy to use exemplar hypotheses proposed by the NCS. The panel reviewed material provided by the Program Office and asked for additional information about the exemplar hypotheses in order to understand how the hypotheses will be used to guide the study design. NICHD (2013b) mentions study hypotheses only twice, stating that the “proposed plan is hypothesis informed because it was developed using modeling of several exemplar hypotheses, but the plan is not hypothesis limited” (p. 5) and that the proposed plan was developed using several exemplar hypotheses (p. 34). The document included an appendix related to sample size that listed the hypotheses from the 2007 research plan, but it did not list the current exemplar hypotheses.
At the request of the panel, the Program Office later provided a description of the exemplar hypotheses. The document (NICHD, 2013d, p. 45) stated that the development of hypotheses “included a matrix approach utilizing exposures at various prevalence levels and outcomes at various prevalence levels as well as individual exemplar hypotheses.” The document listed five examples of exemplar hypotheses based on four “exemplar exposures” and four “exemplar outcomes.” None of the exemplar hypotheses specified time periods for the relevant exposures (e.g., first trimester, puberty), although one mentioned cord blood as a biological sample. None of the hypotheses mentioned assessment of confounding, effect modification, or gene-environment interactions. Other than mentioning that hypotheses were used as a general guide to estimating sample size, the NCS documents did not clarify how the exemplar hypotheses informed key decisions regarding the study design.
The sample size discussion in the document focused only on main effects, although the ability to assess effect modification or interactions is an important justification for a sample size in the range proposed for the NCS. Nor do the hypotheses quantitatively address the implications of study power for assessing transient and nonpersistent risk factors (e.g., transient chemical, social, or maternal uterine conditions). In general, the sample size calculations were based on outcomes with prevalences of 2 percent and exposures with prevalences of 3 percent: it is not clear how less prevalent conditions (e.g., autism, malformations) would be analyzed even though they are included in the document (pp. 44-45). The Program Office’s response to a query on this point (p. 44) was a general disclaimer that the NCS could not address all important children’s health issues.
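The concern about less prevalent conditions can be illustrated with a rough calculation. The sketch below is not the Program Office's method: it applies a standard normal-approximation power formula for comparing two proportions, and the cohort size of 100,000, the relative risk of 1.5, and the 0.5 percent "rare outcome" prevalence are illustrative assumptions that do not come from the NCS documents reviewed here. Only the 2 percent outcome and 3 percent exposure prevalences are taken from the text above.

```python
from math import sqrt
from statistics import NormalDist


def exposed_case_count(n, p_exposure, p_outcome, relative_risk=1.0):
    """Expected number of exposed children who have the outcome."""
    # Risk among the unexposed, chosen so overall prevalence stays at p_outcome.
    risk_unexposed = p_outcome / (1 - p_exposure + p_exposure * relative_risk)
    return n * p_exposure * risk_unexposed * relative_risk


def main_effect_power(n, p_exposure, p_outcome, relative_risk, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test (main effect only)."""
    n_exposed = n * p_exposure
    n_unexposed = n - n_exposed
    p0 = p_outcome / (1 - p_exposure + p_exposure * relative_risk)
    p1 = p0 * relative_risk
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standard error under the null uses the pooled prevalence (p_outcome).
    se_null = sqrt(p_outcome * (1 - p_outcome) * (1 / n_exposed + 1 / n_unexposed))
    se_alt = sqrt(p1 * (1 - p1) / n_exposed + p0 * (1 - p0) / n_unexposed)
    return z.cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


if __name__ == "__main__":
    N = 100_000        # ASSUMPTION: illustrative cohort size, not stated in this chapter
    P_EXPOSURE = 0.03  # exposure prevalence used in the NCS sample size calculations
    RR = 1.5           # ASSUMPTION: illustrative effect size
    for p_outcome in (0.02, 0.005):  # 0.005 is an ASSUMED rarer outcome
        cases = exposed_case_count(N, P_EXPOSURE, p_outcome, RR)
        power = main_effect_power(N, P_EXPOSURE, p_outcome, RR)
        print(f"prevalence={p_outcome:.3f} exposed cases={cases:.0f} power={power:.2f}")
```

Under these assumptions, power falls noticeably as the outcome becomes rarer, and an analysis of effect modification would divide the exposed cases further across strata. That is why a calculation limited to main effects at 2 and 3 percent prevalence leaves the questions raised above unanswered.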
An important criterion in developing exemplar hypotheses should be whether the hypotheses anticipate the possible trajectory of future scientific inquiry in children’s health and development. At the same time, it is also important to recognize that a unique strength of the NCS should be the capability to resolve inconsistent findings reported by smaller or more focused birth cohort studies. There have been a large number of smaller birth cohort studies, but the findings have not always been consistent, and important scientific issues are not yet resolved because of the limitations of the smaller studies. Addressing important unresolved scientific issues identified by recent epidemiologic research on children’s health and development would be an important criterion to be considered in developing exemplar hypotheses for the NCS.
While exemplar hypotheses can serve as meaningful archetypes of important scientific issues that the NCS should be able to address, the few exemplar hypotheses provided to the panel are not sufficient to serve as the primary basis for planning a long-term birth cohort study. The Program Office described additional planning strategies in materials provided to the NCS Federal Advisory Committee (NICHD, 2014b, p. 7), including the concept of anticipating the developmental trajectory of a healthy 21-year-old person and then identifying “potential antecedent factors that could be measured earlier in assessing exposure that contribute to later outcomes.” Another strategy is to identify multiple use cases, which would be scenarios of “sequences and interactions related to a particular outcome. Multiple use cases can be used to frame specifications for a system such as a research study” (p. 3). However, information on these additional strategies was not given to the panel, so the panel cannot evaluate the overall effectiveness of the study planning. Even if multiple study planning tools will be used, the NCS plan still must delineate specific scientifically robust exemplar hypotheses that can be used by the NCS and the scientific community to formally evaluate sample size and design issues, as well as the NCS proposals for the nature and timing of data collection.
CONCLUSION 4-1: A strategy of using a few exemplar hypotheses rather than stating a large list of hypotheses requires that the planners of the National Children’s Study ensure that the exemplar hypotheses are important and scientifically robust to guide the study design and data collection.
RECOMMENDATION 4-1: Prior to proceeding with the Main Study, the National Children’s Study (NCS) should develop scientifically well-grounded exemplar hypotheses that should be used to guide and evaluate decisions regarding the NCS design and data collection schedule and domains.
RECOMMENDATION 4-2: Because hypotheses will change over time, the National Children’s Study should implement a strong and public process to revise and develop new exemplar hypotheses to guide future study implementation, engaging with the extramural and intramural research communities.
Health Phenotype Concept
The NCS proposes to use a health phenotype and profile to describe each participant (NICHD, 2013b, p. 30):
The term phenotype is used for the observable characteristics including morphology, physiology, developmental stage, behavior and products of behavior…. The term, profile, is used for the larger concept of phenotype plus environmental context. A profile includes observable characteristics about the participant plus information about the environment such as air particle measures, noise level, family structure and dynamics, access to health care, etc.
Thus, at each study visit a participant will be assessed using a health phenotype framework and will be the subject of collection of environmental data and biospecimens. Documents provided to the panel state that the rationale for using a health phenotype concept is that it would (NICHD, 2013b, pp. 29-30)
- Use a conceptual framework grounded in health that applies to all Study participants.
- Capture a broad scope of outcomes and not limit observations to particular conditions or diseases.
- Establish consistency in reporting exposures and outcomes across different research fields that may have different paradigms and methods.
The document also states that using the health phenotype concept would achieve another objective (NICHD, 2013a, pp. 21, 23):
… maintain flexibility as new opportunities and assessment innovations arise as they can be integrated into the conceptual framework…. Using reactive airway disease as an example, the NCS will emphasize accurately capturing medical history, participant experiences, and respiratory symptoms, coupled with biospecimens, genetic analyses, and environmental samples. Researchers can then use these data in conjunction with the case definitions and classifications they deem optimal for their analyses.
In the information provided to the panel, the health phenotype concept and the conceptual framework for health development (which are discussed below) are said to guide “the development of assessments and the structure of data collection to ensure that essential relevant information to understand health and development are included” (NICHD, 2013b, p. 31). The NCS documents provide examples of possible study visit content, but they do not explain how the health phenotype concept and conceptual framework will guide specific decisions about the content (see, e.g., NICHD, 2013b, App. 3).
Although the health phenotype concept is consistent with the overall NCS strategy to serve as a platform for future research, the panel cannot evaluate the scientific merit or ability of the NCS to operationalize the concept on the basis of the limited information provided to it. A specific concern is that the documents do not adequately explain the criteria and procedures that will be used to prioritize data collection, considering that the volume of detailed data collection needed to implement the concept could result in substantial respondent burden.
Consider the example provided by the Program Office for reactive airways disease: it could require a very large number of questions, including about symptoms, activities, and functional status; medication use and health care utilization (e.g., emergency room visits); and family knowledge about asthma management. These questions and related data collection, such as lung function tests, would be expected to be collected in a study focused on childhood asthma, but it is not clear how the NCS would be able to collect such detailed information on all domains of child health, development, and disease. The NCS documents mention the need to identify priorities for data collection, but the discussion does not adequately address how the NCS will overcome what is likely to be a major impediment in operationalizing the health phenotype concept, since respondent burden and limitations in collecting biological and environmental specimens will be critical considerations in designing the study content.
CONCLUSION 4-2: While using a dynamic health phenotype concept to plan the content of the National Children’s Study appears to be a promising strategy, the panel lacked sufficient information to judge whether the implementation of such an approach would be feasible given constraints imposed by respondent burden and overall study costs.
Conceptual Framework for Measuring Health and Development
Although the use of domains and primary observations, rather than predetermined disease categories, does not necessarily require a shift from the original disease-oriented focus of the NCS, the current proposal tends to deemphasize a focus on disease outcomes and gives greater emphasis to positive health and development domains. The prior study (National Research Council and Institute of Medicine, 2008) expressed concern that there was no apparent overarching conceptual framework for health and development to tie the study together. In response, the current NCS proposal describes a detailed conceptualization of health and development that consists of seven domains (demographics, physical health, psychosocial, neurodevelopmental, health behaviors, social environment, physical environment), each with subdomains (NICHD, 2013b, pp. 29-31). The NCS proposal indicates that these domains have been prioritized on the basis of gaps identified through literature review, public health significance, and the need for NCS platform and suitable instrumentation.
This approach has used the work of NICHD’s existing Health Measurements Network, which views health as multidimensional with each dimension being assessed from very low to very high levels.4 Each dimension includes several domains, and these can be assessed through multiple measurement modalities. Health is the product of a complex and dynamic interaction between the child and its environment (e.g., NICHD, 2013d, p. 27). These domains and subdomains can be considered along two axes, health and development. Dimensions for the health axis include adaptability, experience, function, and potential. For the development axis, the dimensions are plasticity, experience, and complexity. Although NCS proposes to look at a variety of dimensions of health, many concepts, such as functional status and severity of illness, remain unclear.
The panel judges that the conceptualization of health and development represents a substantial advance from the one reviewed in the previous evaluation (National Research Council and Institute of Medicine, 2008). The breadth of the conceptualization would encompass most of the issues affecting child health and development and provide many dimensions that could be linked to environmental exposures. The delineation of domains and subdomains is detailed enough to suggest quite specific measures that would need to be obtained.
However, as with the health phenotype concept, the panel did not receive needed details on the operationalization and effectiveness of the new conceptualization. Since no information on the specific measures for domains and subdomains was provided, the level of detail to be obtained is uncertain. While there may be time to develop measurement strategies for later years, the proposed 2015 start date for the NCS requires that the data collection rationale and strategy be more fully developed for at least pregnancy and the first year of life.
CONCLUSION 4-3: Many of the principles and concepts guiding development of the study design and the concept of having processes for developing future hypotheses and study content are consistent with the study platform framework for the National Children’s Study. However, it is not clear whether and how those principles and concepts can be effectively used to design the study content.
4For a description, see: http://www.nationalchildrensstudy.gov/research/workshops/Pages/Forrest-Metadata-workshop-jan-2012.pdf [May 2014].
Study Visit Schedule and Mode
The panel was asked to comment on the more intense schedule early in the study compared with later years. Tables 4-1 and 4-2 show the proposed study visit schedule and selected content. For purposes of comparison, the tables also show the same materials proposed for the previous review (National Research Council and Institute of Medicine, 2008). Table 4-1 displays the study visit schedule with mode and notes the planned collection of biospecimens, environmental measures, questionnaire content, and examinations from pregnancy through age 3. Table 4-2 shows the study visit schedule and collection mode planned for ages 3.5 through 21. The concentration of data collection in the early years is apparent in both the 2008 and 2013 schedules, with the current schedule including two prenatal interviews (rather than three as in the previous plan) and four data collections between 3 and 12 months of age.
Although there is reasonable scientific justification to conduct more frequent data collection during the prenatal period and early years, the documents provided to the panel do not explain adequately the scientific basis for the specific schedule of visits proposed for the NCS. For example, in view of concerns about study cost, it is not clear why the specific 3-, 6-, 9-, and 12-month schedule is needed in the first year.5
The panel requested that the NCS provide a rationale for the proposed study schedule. The first response (NICHD, 2013d, p. 75) was “Early and frequent data collection will help build health profiles as well as collect data during periods of rapid development. Operationally the time periods must be standard and easy to apply to a large cohort.” A second response (NICHD, 2013h, p. 2) was
The rationale for the proposed data collection intervals is based on the need for frequent data collection during periods of rapid change. The proposed visit schedule is intended to capture information about critical developmental events with the greatest precision. The data collection framework is based on a life course research model following extensive consultation with multiple stakeholders over a two-year period. Alternative schedules with less frequent visits were considered but rejected to avoid gaps in data collection opportunities during critical developmental periods. In addition, empiric experience supports frequent visits to increase retention and improve participant tracing.
5The cost model described in Chapter 5 shows that the incremental cost of a single in-person interview in the child’s first year adds $90 million to the cost of fielding the study. The incremental field costs of a telephone interview total $45 million.
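To make the cost question concrete, the arithmetic behind the footnote can be sketched as follows. This is an illustration only: it uses the two per-contact figures quoted in the footnote and the first-year contact modes listed in Table 4-1 for the current plan, and it assumes (the NCS documents do not say this) that each "telephone/remote" contact costs the same as a telephone interview and that the figures apply equally to every first-year contact. The function and variable names are hypothetical.

```python
# Incremental field cost per first-year contact, in millions of dollars,
# as quoted in the footnote on the Chapter 5 cost model.
COST_PER_CONTACT = {"in_person": 90, "telephone": 45}

# First-year contacts in the current plan, with modes taken from Table 4-1.
# ASSUMPTION: "telephone/remote" contacts are costed as telephone interviews.
CURRENT_FIRST_YEAR = {
    "3 months": "telephone",
    "6 months": "in_person",
    "9 months": "telephone",
    "1 year": "in_person",
}


def schedule_cost(schedule):
    """Total incremental field cost (millions of dollars) of a visit schedule."""
    return sum(COST_PER_CONTACT[mode] for mode in schedule.values())


def savings_if_dropped(schedule, dropped_visits):
    """Cost avoided (millions of dollars) by removing the named visits."""
    reduced = {age: mode for age, mode in schedule.items() if age not in dropped_visits}
    return schedule_cost(schedule) - schedule_cost(reduced)


if __name__ == "__main__":
    print("First-year schedule:", schedule_cost(CURRENT_FIRST_YEAR), "million dollars")
    print(
        "Dropping the 3- and 9-month contacts saves:",
        savings_if_dropped(CURRENT_FIRST_YEAR, {"3 months", "9 months"}),
        "million dollars",
    )
```

A tabulation of this kind shows only what each contact costs; it does not show what scientific information would be lost by removing it, which is the justification the panel sought.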
TABLE 4-1 Proposed NCS Study Visit Schedule: Comparison of Time, Mode, and Types of Measures Between the 2008 and Current Plans: Pregnancy to Age 3
|Time of Measurement||Mode||Type of Measure: 2008 Plana||Type of Measure: Current Planb|
|Prepregnancy||Home visit||Biosamples: blood [M], urine [M], saliva [M], vaginal swabs [M], hair [M]|
|Environmental: indoor air, house dust|
|Questionnaire/diary: demographics, household composition, medication use, health behaviors, housing characteristics, chemical exposures, product use, occupational exposures, diet|
|Examinations: anthropometrics [M], blood pressure [M]|
|Prepregnancy||Telephone||Questionnaire/diary: diet, chemical exposures|
|First trimester||Home visit||Biosamples: blood [M, P], urine [M, P], saliva [M], vaginal swabs [M], hair [M, P]||Biosamples: blood [M, P], urine [M, P], saliva [M, P]|
|Environmental: indoor air, house dust, drinking water, soil||Environmental: Air, dust, water, visual assessment of house and neighborhood|
|Questionnaire/diary: demographics [M, P], household composition [M, P], medication use, health behaviors, housing characteristics, chemical exposures, product use, occupational exposures, diet, medical history [M, P], stress and social support [M], depression [M], tobacco use [P], cognition [P] (all updates if second interview)||Questionnaire/diary: [M, P]: demographics, medical conditions, disease exposure history, psychosocial, household occupational or hobbies exposure to chemicals, health behaviors, family medical history. [M]: medications, consumer product use, treatments and medical events, pain or other complications, dietary assessment|
|Examinations: anthropometrics [M, P], blood pressure [M, P], fetal ultrasound (from med report or clinic visit)||Examinations: anthropometrics [M, P], blood pressure [M, P], fetal ultrasound (from med report or study administered)|
|Second trimester||Telephone||Questionnaire/diary: major life events [M], and updates [M] on mental health, medical, chemical exposures, and housing|
|Third trimester||Clinic visit||Biosamples: blood [M], urine [M], saliva [M], vaginal swabs, hair [M]||Biosamples: blood [M], urine [M], saliva [M], vaginal swab|
|Environmental: indoor air, house dust (self-collected and mailed in)||Environmental: house dust, visual assessment of house and neighborhood|
|Questionnaire/diary: updates from [M] on demographics, household composition, medication use, health behaviors, housing characteristics, chemical exposures, product use, occupational exposures, diet, medical history, stress and social support, prenatal life events, depression.||Questionnaire/diary: psychosocial occupational and hobby exposures, health behaviors, medications, treatments and medical events, consumer product use, dietary assessment|
|Examinations: anthropometrics [M], blood pressure [M], fetal ultrasound||Examinations: anthropometrics [M], blood pressure [M], 2-D fetal ultrasound (subsample)|
|Birth||Hospital||Biosamples: blood [M], urine [M], cord blood, placenta and cord samples, heel stick [C]||Biosamples: blood [M], cord blood, placental weight, photo and tissue, meconium, infant skin, stool, and oral swab, blood spot [C], breast milk (1 month mail or pick-up)|
|Environmental: If enrolled at hospital (air, dust, water to be self-collected and mailed in)|
|Questionnaire/diary: health behaviors [M], diet [M], chemical exposures [M], plans for infant feeding, sleeping, etc.||Questionnaire/diary: [M]: medications, consumer product use, treatments and medical events, pain or other complications, recent medical social and environmental history, planned health behaviors, medical record review including abstraction for hearing screen and neonatal exam|
|Examinations: anthropometrics [C], dysmorphology and neurologic exam [C], digital photographs of face and anomalies [C], chart abstraction [M, C]||Examinations: neonatal anthropometry|
|3 Months||Telephone/ remote||Biosamples: breast milk mailed in||Biosamples: breast milk mail or pick-up|
|Questionnaire/diary: child care, medical update [C]||Questionnaire/diary: age-specific and other modules, dietary assessment, neurodevelopmental|
|6 Months||Home visit||Biosamples: urine [C], hair [C], saliva [M, P], breast milk||Biosamples: urine [M, C], blood [M], skin, stool and oral swab [C]|
|Environmental: indoor air, house dust, drinking water, soil, visual assessment of house and neighborhood||Environmental: air, dust, water, visual assessment of house and neighborhood|
|Questionnaire/diary: stress and social support, family process and parenting practices [M, P], health behaviors [M], depression and cognition [M], diet [C], medical update [C], medication use [C], media exposure [C], child care, chemical exposures, temperament [C], tobacco use [P], cognition [P]||Questionnaire/diary: [M]: core questionnaire, age-specific and other modules, dietary assessment, neurodevelopmental|
|Examinations: anthropometrics [C], dysmorphology exam and photos [C], dermatologic exam [C], social development observation [M, C]||Examinations: infant anthropometry|
|9 Months||Telephone/ remote||Questionnaire/diary: child care, medical update [C], housing update, chemical and occupational exposures [M, C]||Biosamples: breast milk mail or pick-up|
|Questionnaire/diary: age-specific and other modules, dietary assessment, neurodevelopmental|
|1 Year||Home visit||Biosamples: blood [C], urine [C], hair [C], saliva [C], breast milk||Biosamples: urine [A, C], blood [A, C], skin, stool and oral swab [C]|
|Environmental: indoor air, house dust, drinking water, soil, visual assessment, noise survey||Environmental: air, dust, water, visual assessment of house and neighborhood|
|Questionnaire/diary: household composition update, family process and parenting practices [M, P], health behaviors [M], diet [C], medical update [C], medication use [C], media exposure [C], child care, housing update, chemical and occupational exposures [M, C], language acquisition and social interaction [C], tobacco use, cognition (if not assessed at first trimester)||Questionnaire/diary: core questionnaire, age specific and other modules, dietary assessment, neurodevelopmental|
|Examinations: anthropometrics [C], blood pressure [C], dermatologic exam [C], cognitive exam [C], motor and language assessments [C], social development observation [P, C]||Examinations: infant anthropometry|
|18 Months||Telephone/remote||Questionnaire/diary: child care, medical update [C], diet [C], housing update, chemical and occupational exposures [M, C]||Questionnaire/diary: core questionnaire, age specific and other modules, dietary assessment, neurodevelopmental|
|2 Years||Telephone||Environmental: indoor air and house dust self-collected and mailed in|
|Questionnaire/diary: child care, medical update [C], housing update, chemical and occupational exposures [M, C], life events [M]|
|Home or other visit||Biosamples: urine [M, C], blood [M, C], skin, stool and oral swab [C]|
|Environmental: air, dust, water, visual assessment of house and neighborhood|
|Questionnaire/diary: core questionnaire, age specific and other modules, dietary assessment, neurodevelopmental|
|2.5 years||Remotec||Examinations: child anthropometry|
|Home or other visit||Biosamples: urine [A, C], blood [A, C], saliva [A, C]|
|Environmental: noise, dust, visual assessment of house and neighborhood.|
|Questionnaire/diary: core questionnaire, noise exposures, risk and safety behaviors, social activities, physical activity, sun exposure, toilet training, occupational/hobby exposures, reported height and weight, Ages and Stages, SWAN, NIH Toolbox Early Childhood Cognition Battery, Neuro-Psycholosocial Direct Observation Data Collector Instrument, WAST, Major Life Events, alcohol tobacco and substance use.|
|Examinations: height or length, weight, circumferences, upper arm length, blood pressure, physical activity (subsample), vision screening.|
NOTES: The entries in boldface type are survey characteristics that are different between the earlier and current plans. C, child; M, maternal; P, paternal; A, adult living with child, preferably first-degree relative. Mode: remote, telephone, internet, or mail.
aInformation from National Research Council and Institute of Medicine (2008, pp. 26-31).
bInformation from NICHD (2013b, pp. 42-47).
cThere is no information on what data would be collected.
TABLE 4-2 Proposed NCS Study Visit Schedule: Comparison of Time and Mode for the 2008 and Current Plans, Ages 3.5 to 21
|Time of Measurement||Mode: 2008 Plana||Mode: Current Planb|
|4 Years|| ||Home or other visit|
|5 Years||In-home or clinic||Home or other visit|
|7 Years||In-home or clinic||Home or other visit|
|9 Years||In-home or clinic||Home or other visit|
|11 Years|| ||Home or other visit|
|12 Years||In-home or clinic|| |
|13 Years|| ||Home or other visit|
|15 Years|| ||Home or other visit|
|16 Years||In-home or clinic|| |
|17 Years|| ||Home or other visit|
|19 Years|| ||Home or other visit|
|20 Years||In-home or clinic|| |
|21 Years|| ||Home or other visit|
NOTES: Neither the 2008 nor current plan specifies what data would be collected at the visits.
aInformation from National Research Council and Institute of Medicine (2008, pp. 26-31). The report (p. 32) states the schedule after age 5 is provisional, and there may be phone calls at more frequent intervals.
bInformation from NICHD (2013b, pp. 42-47).
These statements provide a rationale for having a frequent visit schedule based on operational considerations; however, they do not describe the scientific basis for choosing the precise study visit schedule and content.
The NCS documents suggest that it is not possible to provide justification for the particular data collection schedule because “the specific times of vulnerability [to environmental exposures that influence child growth and development] remain largely unknown” (NICHD, 2013d, p. 31). Although specific etiological windows for many exposures may be unclear, much is known of the human developmental process that could also guide the selection of critical periods for data collection. Furthermore, other aspects of health and development could be used to guide the assessment schedule, such as documentation of key developmental tasks; emergence of specific health or developmental problems; characterization of interventions that might improve outcomes; or even methodological issues, such as cohort maintenance. Besides issues of persistent and transient environmental exposures, there is a robust research literature on factors influencing the duration of recall that might also guide the selection of assessment points. Thus, the response provided to the panel that some windows of vulnerability are unknown is an insufficient justification for the specific proposed study visit schedule.
CONCLUSION 4-4: The panel agrees that more intensive data collection in the early years of the National Children’s Study is important, but the panel did not receive sufficient scientific justification to assess the merits of the specific data collection schedule.
Preconception Data Collection
As discussed in Chapter 2, a key goal of the earlier NCS research plan was to enroll nonpregnant females through the household-based sampling frame in order to collect data during the preconception period for some of the participants. This goal is retained in the current proposal, although the NCS now proposes to collect preconception data on subsequent siblings of enrolled participants by collecting data on the families and enrolling the subsequent siblings with certainty. In addition, it would supplement this group by enrolling a convenience sample of mostly nulliparous nonpregnant women for preconception data collection. As stated in Chapter 2, the panel acknowledges the potential scientific value of gathering preconception exposure data, but it recommends that a supplemental convenience sample not be included due to the very high recruitment and data collection costs associated with such a sample. (See Chapter 2 for additional discussion of a convenience sample.)
The current design would obtain preconception data through enrollment of subsequent siblings of enrolled participants. A potential benefit of this strategy is that it would minimize data collection cost because much of the data collection, such as collection of the mother’s biological specimens or a household dust specimen, is common to almost all of the in-person visits and would be relevant in assessing the preconception environment of a subsequent sibling. Nevertheless, it is not clear whether or how the planned data collection would be modified during the postnatal period for enrolled families to be able to address scientific hypotheses related to the preconception environment of the not-yet-born subsequent sibling. In addition, the current proposed data collection schedule for families when the target child is between 1 and 5 years of age, when the vast majority of subsequent siblings would be conceived, includes only one in-person and one remote data collection event per year. The intensity of the data collection visits may not be sufficient to assess the preconception environment, considering the need to adjust for varying times to conception.
Prenatal Data Collection
The scientific rationale for the NCS to enroll pregnant women and collect data during the prenatal period is detailed in Chapters 1 and 2. A substantial body of scientific research suggests that multiple social, biological, and physical factors during the prenatal period could affect child health and development and that the effects of these factors could vary during the pregnancy due to the developmental process. Although the prenatal period data collection protocol for the Main Study has not been finalized, the NCS documents list the most important domains and subdomains the NCS will try to measure.
The importance of collecting prenatal data derives in part from the fact that many factors to be measured, such as diet or medication use, may not be reliably recalled in postnatal questionnaires. Also, as indicated in the NCS documents (NICHD, 2013a, App. 4; 2013b, pp. 50-51), many environmental factors that could be measured in biological or environmental specimens are not persistent and might not be measured in specimens collected during or after the birth visit. Thus, prenatal data collection is essential to ensure concurrent collection of key social, biological, and environmental data.
RECOMMENDATION 4-3: The National Children’s Study Main Study should collect data during the prenatal period at multiple times for as many of the study participants as the budget will allow.
Birth Enrollment and Data Collection
The current NCS proposal would enroll half of the probability sample (aside from subsequent siblings) at the time of birth in hospitals and birthing centers. Following the discussions in Chapter 2 on the scientific merit of this proposed strategy and in Chapter 3 on issues related to the sampling strategy, this section discusses issues related to the feasibility and quality of the data collection for women and children enrolled during a hospital admission at the time of the child’s birth.
Enrolling and collecting high-quality data on women and babies in a hospital at the time of delivery presents several logistical challenges and concerns about the informed consent process. One NCS document (NICHD, 2013b, p. 23) indicates that on scheduled enrollment days, all women who are admitted to the hospital for possible deliveries would have to be screened for eligibility either by study staff or hospital personnel. Another document (NICHD, 2013g, p. 2) states that women could be approached for enrollment only after a minimum of 12 hours after delivery. The U.S. Office of Management and Budget, which must approve the protocols for all federal data collections, might require an even longer time interval.
Yet even the 12-hour minimum time interval could result in many potential participants being discharged before a recruitment visit and data collection.
Also, because of the importance of collecting biological specimens, such as cord blood, as well as samples of the cord and placenta, the birth stratum enrollment protocol would necessitate making arrangements for these specimens to be collected on all potentially eligible women and babies prior to obtaining formal consent. In addition, the quality and validity of responses to a full baseline questionnaire administered postpartum to women in the hospital prior to their discharge may be less than optimal. It is also likely that the postpartum attrition of these participants would be greater than that of the women enrolled during the prenatal period who had already participated in an in-person data collection and agreed to continue with the study. In summary, there are many unclear and unresolved logistical issues related to the plan for a time-of-birth study enrollment and data collection. The NCS Vanguard Study has conducted only a very small pilot test of this enrollment and data collection strategy, which involved only two or three hospitals in each of three locations. The findings are still preliminary.
CONCLUSION 4-5: The strategy of the National Children’s Study (NCS) to enroll a substantial proportion of participants at the time of the child’s birth poses substantial logistical and operational challenges that have not been adequately tested in the NCS Vanguard Study.
RECOMMENDATION 4-4: Although the panel does not endorse the current proposal for a substantial birth enrollment stratum, if the National Children’s Study (NCS) Main Study retains such a stratum, the NCS should conduct a full pilot test of recruitment and data collection during the birth visit before the Main Study is implemented.
Data Collection Content
The Program Office did not provide a detailed protocol or proposal for data collection, although the material it provided to the panel included examples of data and specimens that are being considered for inclusion in the protocol (see Table 4-1, above). The panel also did not receive draft questionnaires. The apparent lack of a draft final protocol and the limited descriptions of possible data collection elements raise questions about the status of the NCS protocol. After several years of Vanguard Study pilot testing and based on the description of an elaborate process to develop study content, the panel expects that the NCS should be able to provide well-justified, near-final data collection protocols and study instruments, at least for the initial periods of the Main Study through the children’s first year.
Study Visit Format
The NCS documents (e.g., NICHD, 2013b, p. 33) describe a strategy for the study visits to manage participant burden that consists of administering a core questionnaire to all participants at each visit and supplementing this questionnaire with modules on individual topics. The topics of the modules could address such factors as those related to the child’s age at the time of the visit, a particular exposure, a new diagnosis, or a change in household composition. The documents indicate that the modules could be administered on the basis of contextual triggers or through random assignment (e.g., for validation or to collect information on a control group for which there was no contextual trigger). In addition to a questionnaire, the modules could include additional modalities for data capture, such as images or environmental specimens.
The proposed study format strategy seems to be well conceived and necessary in order to achieve reasonable respondent burden. The documents contained a draft list of domains to be addressed in a core questionnaire, but did not include a questionnaire. The documents also did not detail the process of selecting and prioritizing modules in real time prior to or during a data collection visit. It seems possible, for example, that participants who live in socially and environmentally disadvantaged homes and neighborhoods with poor access to schools, social services, and medical care and who have multiple health and development conditions could have a very large number of contextual triggers for the additional modules. Due to the lack of detailed information on how the NCS would implement the strategy of a core plus modules and the actual measures to be used, the panel cannot assess whether the proposed strategy would be able to contain respondent burden while collecting the data needed to characterize outcomes, identify key issues for health disparities, and operationalize the health phenotype concept.
As noted in Chapter 1, the Children’s Health Act of 2000 mandated that the NCS should be planned to be a “longitudinal observational birth cohort study to evaluate the effects of chronic and intermittent exposures on child health and human development in U.S. children” (P.L. 106-310). Environmental assessment—in which “environment” is broadly conceived to encompass social, biological, physical, chemical and other factors—is a critical component of the NCS study content. Unfortunately, high-quality environmental assessment can be expensive if it involves collecting media (e.g., dust, air, water) on multiple occasions and then processing, archiving, and analyzing the media for many possible agents. The NCS has to strike a balance between study cost and the imperative to collect a sufficient amount of high-quality data needed for environmental assessment.
The NCS documents (e.g., NICHD, 2013b, App. 4) describe the general approach to environmental assessment and provide tables that list examples of biological or environmental specimens that could be used to measure different potential exposures. The information provided in these documents is not specific, and the list of specimens to be collected for study visits is considered to be preliminary. As previously mentioned, although the congressional mandate for this panel’s study called for “a comprehensive review and issue a report regarding proposed methodologies for the NCS Main Study,” the NICHD did not ask the panel to review the environmental assessments, and the panel did not receive sufficient information to evaluate the scientific merit of the draft environmental assessment protocols.
When discussing the environmental assessment, the NCS documents (e.g., NICHD, 2013b, p. 48) refer to National Research Council and Institute of Medicine (2013), a summary of a workshop on the design of the NCS, and an earlier workshop convened jointly by the NCS, the Environmental Protection Agency, and the National Institute of Environmental Health Sciences in 2010 to review the specific exposure matrices for the NCS. The NCS documents quote statements from the workshop speakers to indicate there was a consensus that the NCS strategy and plans for environmental assessments were reasonable. However, the panel’s review of the workshops’ reports (U.S. Environmental Protection Agency, 2010; National Research Council and Institute of Medicine, 2013) found that important caveats and critical measurement issues discussed in the two workshops were not sufficiently acknowledged by the NCS in the documents provided to the panel.
The exposure assessment protocols reviewed and discussed by the workshop participants were based on earlier protocols that included collection of biological specimens and “air, dust, water” during multiple study visits. The documents provided to the panel did not have the same extensiveness of biological or environmental data collection as was being considered in 2010. The findings of the workshops are not necessarily applicable to the environmental assessment currently being considered. Furthermore, the workshop participants expressed concerns about the validity of using questionnaires to assess many types of environmental factors and especially to rely on retrospective recall of exposures. For example, the primary exposure source of many non-persistent hormonally active agents, such as phthalates and bisphenol A, is consumer product use. The report of the earlier workshop (U.S. Environmental Protection Agency, 2010, p. 6) notes: “However, most adult participants cannot provide sufficiently accurate information for classifying exposures based on product use or activities.” The workshop participants noted that these agents or their metabolites can be measured in biological specimens, but it would be important to collect specimens during multiple data collection visits because of the short half-life of the metabolites in urine. A similar comment about using questionnaires to assess chemical exposures was made in the second workshop (National Research Council and Institute of Medicine, 2013, p. 22):
“For example, you can’t ask people if they have PCBs in their home or if they have polybrominated diphenyl ether flame retardants in their TVs or couches.”
The exposure assessment experts at the two workshops emphasized the importance of collecting biological and environmental specimens during multiple data collection visits during the critical time periods of development. The summary of the 2010 workshop stated (U.S. Environmental Protection Agency, 2010, p. 2):
All workgroups agreed that in utero and through early childhood (up to ages 3 to 5 years) were the time periods when children were most susceptible and when exposure monitoring should be conducted. At a minimum, all groups preferred to conduct monitoring during three visits, one each during the first trimester, the third trimester, and the first year.
Retrospective exposure assessment based on bulk dust samples may be used to assess average exposures to persistent metals and chemicals over a period of several months, but it is not a viable strategy to assess transient exposures to nonpersistent agents. Furthermore, the two workshops did not endorse the current NCS proposal to conduct a retrospective exposure assessment for families enrolled at the time of the child’s birth by providing collection kits to the families for self-collection of environmental samples. The panel also judges that the proposed methods to assess environmental exposures by relying on maternal collection of in-home environmental samples have not been adequately pilot tested.
CONCLUSION 4-6: Exposure assessment, including collection of biological and environmental specimens during multiple study visits beginning during the prenatal period, is a critical component of the National Children’s Study in addressing the mandate of the Children’s Health Act of 2000 and fulfilling the study’s goal to serve as a platform for future scientific inquiry.
Process for Selection of Measures
Specific measures for the measurement domains and subdomains have not yet been specified. Although the NCS provided a detailed description of the advisory and consultative process (NICHD, 2013b, pp. 34-35; 2013g, pp. 7-8) to inform the decision making for measurement methods and instruments, it is not clear how the advisory and consultative process actually informs decision making. The process is extensive, but seemingly unwieldy for timely development of protocols and study instruments. The panel received no documentation that the process for developing measurement methods and instruments has been formally evaluated or compared with other large national and international longitudinal cohort studies. Nor did it receive specific documentation or evidence of the process in action with any domains, subdomains, or instruments and how it resulted in instruments or testing in the Vanguard Study.
Considering that Vanguard Study field work started in 2009 with more than 4,000 families enrolled and followed at least through the birth visit and that the advisory and consultative process has been in place since at least that time, it would seem that the NCS ought to be able to provide documentation of nearly final data collection methods for the prenatal period and child visits through 6 months of age based on this experience. The comments by the Program Office that findings and data from the Vanguard Study pilot studies are still being evaluated, and reports from the NCS Federal Advisory Committee that the Committee also has not yet been provided draft data collection instruments and methods for review, reinforce this concern.
CONCLUSION 4-7: The processes for developing content for the National Children’s Study are complicated, and insufficient documentation has been provided to demonstrate that the processes will be effective.
RECOMMENDATION 4-5: The National Children’s Study Program Office should document and provide justification for development of the data collection schedule, content, and methods now and going forward. The documentation should be sufficient to guide use of the study data by future researchers.
RECOMMENDATION 4-6: The National Children’s Study Program Office should finalize the study visit data collection protocols that it intends to use for the Main Study (including questionnaires and other measurements), at least through age 1, and then pilot test the protocols before implementing the Main Study. The protocols and findings of the pilot tests should be peer reviewed and approved by the proposed independent oversight committee prior to initiating the Main Study.
(See Chapter 6 for discussion and recommendations regarding an oversight committee.)
As discussed in Chapter 2, the NCS proposes to address health disparities during data collection by ensuring that information about demographic and other characteristics that define these populations is gathered in the core questionnaire and measuring exposures that may be important for understanding health disparities.
Based on the responses to panel questions, the NCS clarified that the major domains of interest for health disparities are race and ethnicity, socioeconomic status, geography, and immigration status, stating that the NCS will follow the Data Collection Standards of the U.S. Department of Health and Human Services⁶ for collecting information on race, ethnicity, sex, primary language, and disability status. Some questions will be included to assess immigration status, and in addition, information on health insurance status, other health care access characteristics, and education will be collected. The document also noted (NICHD, 2013d, p. 79) that geography can be used to investigate urban and rural differences, and also to identify specific industrial exposures common in some areas. Although the domains identified by the NCS documents are standard and reasonable, there was no indication that the NCS has developed or adopted a conceptual framework for health disparities (e.g., similar to the framework the NCS has developed to guide assessment of child development) or a strategy to identify additional domains and measures relevant to health disparities, such as psychosocial factors or features of social or physical environments that may be of special relevance to understanding health disparities.
CONCLUSION 4-8: Based on the information provided, the panel concludes that the National Children’s Study plan has paid insufficient attention to how health disparities should be taken into account in the development of the visit schedule and content of the Main Study.
RECOMMENDATION 4-7: The relevance to health disparities should be an explicit criterion for selecting the constructs that will be assessed as part of the National Children’s Study Main Study, the measures that will be used to assess them, and the timing of the assessments. The NCS should obtain input from experts on health disparities in childhood as part of the documented process through which the measures for inclusion are selected, and the measures should be approved by the proposed oversight committee.
(See Chapter 6 for discussion and recommendations regarding an oversight committee.)
As noted in the previous review (National Research Council and Institute of Medicine, 2008, p. 199): “Past experience with virtually all national data sets is that the research value of the data is maximized when as many skilled analysts as possible are able to access the data for original and replication analyses, and when the peer-review process judges the quality of the analyses performed.” It recommended: “[T]he NCS should begin planning for the rapid dissemination of the core study data, subject to respondent protection, to the general research community.”
⁶ Available at: http://aspe.hhs.gov/datacncl/standards/aca/4302/index.pdf [April 2014].
Guttmacher et al. (2013, pp. 1873-1874) describe a reassuringly open data release policy for the study:
The NCS is committed to broad, rapid sharing of all data and samples, while respecting participants’ privacy and confidentiality. No individuals or institutions that gather data and samples will have prioritized claims to them. Electronic data will be available to all qualified researchers through controlled access mechanisms, in keeping with current National Institutes of Health practices. Because biologic and environmental samples are exhaustible, there will be an application process for obtaining them. To maximize their use, the NCS will share promptly with the entire research community the results of all analyses performed.
Additional details about the NCS study data release policies were provided in documents made available to the panel (NICHD, 2013a, 2013d) and are based on review of the data release policies of a number of federal government and university-based surveys. The NCS expects a 2-year lag between the end of data collection and data release to the research community. It plans to release three types of analytic files. One would be a public-use file that would be “disseminated into the public domain without restrictions on access or use”⁷ (NICHD, 2013d, p. 83). In order to protect confidentiality, in the public-use data, “individual level data would be coded, aggregated, or otherwise altered to mask individually identified information” (p. 83). Use of the second type of data would be restricted through controlled access and use “through a licensing process whereby each data request is individually evaluated and, if approved, the data user enters into a formal data sharing agreement … [and] the approved environment for access … could include a ‘Census-Bureau-type’ data center” (p. 83). The third type of data would be controlled-use materials, such as environmental samples, biospecimens, images, and audio files. Access to these data or specimens would be even further restricted and would have to be approved because of the limited amount of specimens or because such data as images cannot be de-identified.
In order to develop and implement plans for data sharing, the NCS established the NCS Data Access Committee in 2009, which defined governing principles for data access and confidentiality. It also hosted a data use workshop in February 2013 with invitees from federal agencies, contract research organizations, study centers, and other stakeholders (see NICHD, 2013a). Following the workshop, the NCS published a document on data access and confidentiality concept of operations (NICHD, 2013k), which provides more detailed information on the NCS data dissemination strategies.
⁷ Public-use files are available for international and commercial use.
In its consideration of the NCS plans, the panel investigated the data release practices of a number of the studies referenced by NICHD (2013d). Perhaps the closest models for the NCS are the surveys conducted by the National Center for Health Statistics (NCHS) in the Centers for Disease Control and Prevention, which face the same general set of federal government constraints on data release as the National Children’s Study.
The NCHS study that shares the most features with the NCS is the National Health and Nutrition Examination Survey (NHANES). Although each cohort of NHANES is much smaller than the NCS (5,000 persons of all ages are interviewed each year) and has a repeated cross-sectional rather than longitudinal design,⁸ it does involve personal interviews and collects data from physical examinations and laboratory tests, and it maintains a DNA repository on its samples. It also monitors environmental exposures and children’s growth and development.
The general principles guiding data release for the NHANES are to distribute the data as widely as practicable, as soon as possible after data collection, and in as much detail as possible while maintaining survey participant confidentiality.⁹ The NHANES data release performance matches these goals well. Almost all of the person-level survey, laboratory, and environmental data collected in NHANES are available to the public on the study’s website. The data are processed in 2-year cycles with the first data releases available within 9 months of the end of a given cycle’s data collection period. Files for NHANES components that require longer to process are released as the datasets become available.
Confidential data, including DNA, imaging data, and geographic location, are made available to researchers under restricted data agreements. These data are made available through the NCHS Research Data Center (both on site and remotely) and through the Census Bureau’s national network of Remote Data Centers.¹⁰
Given the similarities between NHANES and the National Children’s Study, the panel views the general structure of the NHANES data release policy, and the performance of NHANES in carrying it out, as a model for the NCS. Confidentiality concerns arising from the longitudinal nature of the NCS may somewhat affect the balance of data released publicly and confidentially, but the panel would expect these kinds of changes to be relatively minor.
⁸ The NHANES provides for the possibility of longitudinal follow-ups for its sample but does not routinely conduct such follow-ups.
⁹ For details, see http://www.cdc.gov/nchs/data/nhanes/nhanes_release_policy.pdf [March 2014].
CONCLUSION 4-9: The panel endorses the general structure of the data distribution plans for the National Children’s Study (NCS), although it fails to understand the need for a 2-year lag between the availability of analytic data and their release to the research community. Subject to confidentiality concerns, timely and complete data access is vital to maximize the scientific value of the NCS and has been achieved by other federal government surveys, which ought to serve as models for the NCS.
Finally, the panel considered another challenging issue related to data release. Given the nature of the recruitment cycle and the rollout of the survey at different times in different primary sampling units (PSUs), with the 4-year rollout period assumed in the panel’s cost analysis, it will be 7 years before any given data item (e.g., from the questionnaire administered during the 1-year visit) has been collected for all children enrolled in the study. The reason, as explained in Chapter 3, is that the PSUs will be divided into groups, called waves, and the field work will begin in one wave in each of 4 successive years, with the birth window being 4 years in each location. Therefore, it will take 7 years from the first data collection in the first wave of PSUs to the last data collection in the last wave of PSUs. Given this lengthy interval, it is imperative to develop the data processing procedures and the documentation associated with data release using data gathered in the first few years, so that minimal effort will be needed to release a given wave’s data after its data have been collected.
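The 7-year figure follows from simple arithmetic on the rollout schedule. As a minimal sketch only (the function and parameter names are illustrative and do not come from the NCS documents), the calculation can be written as follows, assuming that waves start exactly one year apart and that every location has the same birth window:

```python
def years_to_complete_item(n_waves: int, birth_window_years: int) -> int:
    """Years between the first collection of a given data item in the first
    wave and the last collection of that same item in the last wave.

    Assumptions (illustrative, not taken from the NCS documents):
    - waves begin field work one year apart;
    - every location has the same birth window;
    - the item is collected at a fixed child age, so that age cancels out.
    """
    # The last wave starts (n_waves - 1) years after the first wave.
    last_wave_start_offset = n_waves - 1
    # Births in the last wave continue for the full birth window after that.
    return last_wave_start_offset + birth_window_years


# Four waves in successive years, each with a 4-year birth window.
print(years_to_complete_item(n_waves=4, birth_window_years=4))  # 7
```

Under these assumptions, the child's age at the visit does not change the length of the interval; it only shifts the calendar dates on which the interval begins and ends.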
Beyond a rapid end-of-wave data release, the 7-year data cycle argues for consideration of an “early release” data policy 2 or 3 years into the cycle to encourage data quality exploration. Given the complications of the sampling design, these preliminary data could not be used to generate national or local estimates. But if experienced analysts were provided access to these data through the proposed network of “restricted access” data centers, a great deal could be learned about properties and quality of these data, in particular, newly developed interview and observational data. This approach would improve the quality and timely release of complete-wave data and their documentation, and it would likely inform the design of recurrent question and observation sequences in future waves. It will be important for the NCS to clearly state to prospective analysts that such data are incomplete and not representative. An “early release” policy would increase processing costs somewhat, so the value of the policy would need to be judged against its costs.
RECOMMENDATION 4-8: The panel recommends that the National Children’s Study should consider producing an “early release” version of the data from the Main Study that includes data collected in the early years of each wave’s data collection cycle and makes those data available to analysts under the terms of restricted access data centers.