3
Current Data Collection
Methods and Sources
Summary of Key Findings
• There is a lack of comparable, standardized data (due in part
to a lack of consistent definitions) in the measurement of health
status and quality of health care for children and adolescents.
• Many health conditions and health care processes that are im-
portant to children appear in rates/numbers that are too small
to be adequately represented in survey data sets.
• Improving linkages among administrative record systems and
between those systems and population-based survey data sets
would facilitate comprehensive assessment of child and adoles-
cent health and health care quality.
• The use and interoperability of electronic health records are
expected to increase dramatically over the next 5 years, creat-
ing a robust source of data that can be readily analyzed and
acted upon.
Imagine that you are driving a complex piece of machinery. You want
to know the direction in which you are headed, your rate of speed, how
much fuel you have, the engine temperature (and possibly the external
temperature as well), and whether the engine is performing as it should. If
68 CHILD AND ADOLESCENT HEALTH
you are flying a plane, you want to know more details, such as your alti-
tude and the wind speed. If you are under water, you want to know other
things. The display that signals whether you are on track is derived from
hundreds of intricate gauges, sensors, computer chips, and monitoring de-
vices. Each mechanism is designed to collect certain types of performance
data; these data are then compared against standard specifications, and the
results are analyzed to determine whether the data are signaling a problem
that requires the operator’s attention. Some gauges are large and dominate
the operator’s routine field of vision; others are more peripheral and show
alerts only when significant problems arise.
The above analogy is useful in considering the monitoring systems that
are used in determining the quality of child and adolescent health and health
care services. The clinician examines an individual child and collects data
from numerous sources—temperature, heart rhythm, height, weight, sleep-
ing and eating habits, and so forth—before concluding whether the child is
“healthy” or requires attention for some specific reason. In much the same
way, health professionals and policy makers examine data from a variety of
population surveys and administrative data sets in making judgments about
the health and health care of children and adolescents. Yet the data system
used to measure the quality of child and adolescent health and health care
services is not as finely developed as the instrumentation in the above anal-
ogy or the collection of clinical data. Indeed, it may be inappropriate even
to refer to the existing data sets on child health and health care services as
a “system,” since these data sets consist of multiple, independent efforts
that are largely uncoordinated and unrelated to each other. In many cases,
data sets were designed for specific objectives without regard to how they
fit within the larger landscape of child health measures. Furthermore, child
and adolescent health data sets are not harmonized or coordinated with
efforts that collect data about other aspects of development, education, or
family and social contexts. The result is a tremendous wealth of data about
many different specific dimensions of child and adolescent health and well-
being, significant gaps with respect to important areas of health and selected
populations, and the absence of an analytic framework that can provide
routine guidance for general or even specific areas of concern.
The remainder of this chapter begins with a brief review of current
methods used to collect data on health and health care. It then describes
existing sources of these data for children and adolescents. Next, the chap-
ter examines the limitations of these data sources. The final section argues
for the need for a coordinated approach to integrate measures of child and
adolescent health and health care quality.
CURRENT DATA COLLECTION METHODS AND SOURCES
DATA COLLECTION METHODS
Methods used to collect data on health and health care can be charac-
terized by the following features:
• Sample versus census—Some data are collected for the entire popu-
lation to which they apply; such data are sometimes referred to as
census data. One example is the actual decennial census, which
aims to obtain counts by geographic location and basic demo-
graphic characteristics for the entire resident population of the
United States. However, the term census may be used to refer to
any data collection aimed at collecting data for every unit in the
population of interest (i.e., a subset of a larger population of em-
phasis). Conversely, many data cannot be collected for the entire
population without excessive cost and/or a burden on respondents.
Instead, the data are collected from a subset of the population, or
a sample, that is selected (usually by randomization) in a way that
makes it representative of the entire population; thus, estimates can
be calculated from the sample that approximate those for the entire
population.
• Based on administrative records versus respondents—Some data
are extracted from records that already exist because they are
necessary for the administration of a program or intervention. Ex-
amples are government records (tax files, social security and Med-
icaid enrollment, school enrollment, accident reports), commercial
records (health plan enrollment files, medical claims), and medical
records (from physicians’ offices, hospitals, and other providers of
health care). Other data are collected directly from respondents, for
example, by interviewing individuals about their experiences. The
line between the two may not be entirely distinct; for example, a
physician might be asked to provide data derived from the medical
records she uses in her practice; thus the data collection is respon-
dent based, but the data are ultimately derived from administrative
records. In the case of children, most respondent-based data are
collected from proxy respondents (e.g., parents and caregivers). A
third category to consider is that pertaining to clinical data, such
as observational studies.
• Population- versus service-based—Some data collection efforts focus
on a general population defined only by broad demographic
characteristics, such as all children under age 6 or all adolescent
girls. (Note that population-based in this sense could encompass
data collection using sampling, and thus is unrelated to census
data collection from an entire population.) Other data collection
efforts in health and health care operate only through specific sites
or administrators of services, such as health plans or clinics; such
service-based data collection can cover only subpopulations defined
by their attachment to the service providers.

TABLE 3-1 Data Collection Methods

Basis             Source                  Census                         Sample
----------------  ----------------------  -----------------------------  ----------------------------------------
Population-based  Administrative records  Vital statistics               Some components of Medical Expenditure
                                                                         Panel Survey (MEPS) cost data; national
                                                                         samples of discharge abstracts, etc.
Population-based  Respondents             Decennial census               Most national surveys (e.g., Behavioral
                                                                         Risk Factor Surveillance System [BRFSS],
                                                                         MEPS, National Health Interview Survey
                                                                         [NHIS], National Immunization Survey
                                                                         [NIS], National Survey of Family Growth
                                                                         [NSFG], Pregnancy Risk Assessment
                                                                         Monitoring System [PRAMS])
Service-based     Administrative records  Some Healthcare Effectiveness  Some HEDIS measures (those requiring
                                          Data and Information Set       medical record review)
                                          (HEDIS) measures (those
                                          available in plan billing
                                          records)
Service-based     Respondents             Health plan collection of      Consumer Assessment of Healthcare
                                          race/ethnicity data            Providers and Systems (CAHPS) measures

SOURCE: Committee on Pediatric Health and Health Care Quality Measures.
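The sample-versus-census distinction described above can be illustrated with a short sketch. It assumes a hypothetical population of children, each flagged 0 or 1 for some health condition, and compares the rate computed from every record (a census) with an estimate from a simple random sample. The function names, the simulated prevalence, and the sample size are illustrative assumptions, not part of any survey named in this chapter.

```python
import math
import random


def simulate_population(n, prevalence, seed=0):
    """Build a hypothetical population of n children as 0/1 condition flags."""
    rng = random.Random(seed)
    return [1 if rng.random() < prevalence else 0 for _ in range(n)]


def census_rate(population):
    """Rate computed from every unit in the population of interest."""
    return sum(population) / len(population)


def sample_rate(population, sample_size, seed=1):
    """Estimate the rate from a simple random sample drawn without replacement."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return sum(sample) / sample_size


def standard_error(rate, sample_size):
    """Approximate standard error of a proportion from a simple random sample."""
    return math.sqrt(rate * (1 - rate) / sample_size)
```

With a sample of a few thousand records, the estimate typically falls within a few standard errors of the census rate. The same formula also shows why rare conditions are hard to measure in surveys: as the rate approaches zero, the standard error becomes large relative to the rate itself.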
While the above three features (summarized in Table 3-1) are not unre-
lated in practice, they are nonetheless conceptually and practically distinct.
Two examples follow:
• Census and administrative records—Given the costs and burden of
respondent-based data collection, census (100 percent) data collec-
tion for a specific population is almost always limited to adminis-
trative records that can be accessed inexpensively and efficiently.
However, not every data collection from administrative records is
a census; cost, access, or confidentiality issues may necessitate use
of a sample of records.
• Respondent-based and population-based—For some data needs,
the relevant administrative records are service based. To obtain
general population coverage, either records must be consolidated
across providers or a respondent-based collection must be con-
ducted. However, many respondent-based data collections are
aimed only at coverage of a set of service providers, not a general
population.
It should also be noted that none of these distinctions bears a perfect
relationship to the distinction between health and health care data. Com-
pared with health care data, health data tend more often to be population
based (at least in objective) and respondent based; however, many examples
of health care data are population or respondent based, while many ex-
amples of health data are based on administrative records or service based.
Furthermore, the same data on health might be regarded as a population
measure or as a measure of quality (through sentinel care processes) for a
health care provider, depending on how they are collected and reported. For
example, immunization rates are both a population measure and a measure
of system performance.
Assessment of child and adolescent health and health care quality
relies on data collected through a variety of the methods discussed above
and from a variety of sources. Sources may include primary or second-
ary sources, surveys or registries, and voluntary or required reports. They
may include parents or health care providers, as well as older children and
adolescents who self-report their own data. Surveys may be conducted by
telephone or through interviews with children and their families in health
care or other service settings. Some surveys may involve a review of health
records in providers’ offices or claims records submitted to public or pri-
vate health plans. Surveys may be conducted at one point in time, or they
may recur annually or over other time periods. The reporting source may
change over different time periods, or the same population may be surveyed
or interviewed on multiple occasions. Data may be retrospective, based on
respondents’ recall of certain events or conditions, or prospective, which
involves collecting data at multiple intervals over time to monitor changes
in health characteristics. Surveys may be administered to a universal or
randomized sample of children on a national, state, or local basis; or they
may focus on selected populations, such as underserved children, children
with special health care needs, or children with specific demographic char-
acteristics. Registries are another common source for data on health and
health care, especially when a specific procedure (such as immunization)
can be recorded electronically in a central data collection site.
The consistency and rigor of the measurement method are directly
associated with the quality of the data collected. In examining child and
adolescent health and health care, therefore, it is important to know details
about the sampling strategy, data collection method, and reporting source
associated with surveys or reports.
EXISTING DATA SOURCES
The federal government supports numerous surveys and information
systems that collect data about selected aspects of child and adolescent
health and health services. Prior studies have reviewed many of these data
sets, often with detailed analyses of their sampling strategy, periodicity, and
specific data components (IOM and NRC, 2004; NRC, 1998, 2010; NRC
and IOM, 1995).
Federal Population Health Data Sets
The committee developed Appendix F, a table briefly describing the
major population health data sets that include information about child
and adolescent health and health care services. In developing this table, the
committee examined the following sources:
• Children’s Health, the Nation’s Wealth: Assessing and Improving
Child Health (IOM and NRC, 2004), which identifies 30 federal
data sets used for measuring children’s health and relevant influ-
ences and includes a gap analysis of specific measures for 12 of
these data sets;
• data sets reviewed by the Federal Interagency Forum on Child and
Family Statistics, which produces the annual America’s Children
reports (FIFCFS, 2010a);
• the Directory of Health and Human Services Data Resources, prepared
by the Department of Health and Human Services’ (HHS’) Data Council
(HHS, 2003);
• a list of federal data sets and repositories available on the research
portal of the National Information Center on Health Services Research
and Health Care Technology (NICHSR) at the National Institutes of
Health (NIH, 2010a);
• three research papers examining selected federal data sets for
children, youth, and families (Hogan and Msall, 2008; NRC and IOM,
1995; Stagner and Zweigl, 2007);
• a review of longitudinal data sets compiled during the planning for
the National Children’s Study (The Lewin Group, 2000); and
• a list compiled by the Agency for Healthcare Research and Quality’s
(AHRQ’s) Data and Surveys web site (AHRQ, 2010a).
This inventory includes surveys of health and health care services ad-
ministered for children and adolescents (aged 0−18) within the past 20
years (beginning in 1990). Data sources for these surveys include informa-
tion provided by children, adolescents, parents, caregivers, and health care
providers. Some surveys involve reviewing health records. Only surveys
administered within the United States to sample sizes greater than 1,000
are included in the above list.
The largest number of population health surveys, registries, and studies
are administered by HHS. Other federal agencies collect child health data as
part of their administration of information systems for other purposes, such
as environmental quality (Environmental Protection Agency), education
(U.S. Department of Education), or occupational injuries (U.S. Department
of Labor). In addition, some federal agencies collect data on health influ-
ences, such as poverty (Census Bureau), housing and homelessness (U.S.
Department of Housing and Urban Development), and motor vehicle safety
(U.S. Department of Transportation).
Longitudinal Studies of Children and Youth
In addition to data systems administered directly by federal agencies (or
their contractors), federal funds have supported hundreds of longitudinal
studies examining selected aspects of child health, frequently focusing on
small populations that are followed intensely over several years or even de-
cades. No central source exists that can catalogue the information gleaned
from these longitudinal studies, although many of these studies have been
described in earlier reports (NRC, 1998).
One example of a longitudinal study is the National Children’s Study
(NCS), launched in January 2009. The NCS is the largest long-term study
of environmental and genetic effects on children’s health conducted in the
United States. A nationally representative probability sample of 100,000
births will be followed from before birth to age 21. Data will be collected
on multiple exposures and multiple outcomes using repeated measures over
time (NIH, 2010c).
Other longitudinal studies include the National Longitudinal Study of
Adolescent Health (Add Health) and the Great Smoky Mountains Study
(GSMS). Add Health, which began in 1994, examines how social contexts
(such as families, friends, peers, schools, neighborhoods, and communi-
ties) influence adolescents’ health and risk behaviors (NICHD, 2007). The
GSMS, a population-based community survey of children and adolescents
in North Carolina, estimates the number of youth with emotional and
behavioral disorders, the persistence of those disorders over time, the need
for and use of services for those disorders, and the possible risk factors
for developing them (Costello et al., 1996) (see Appendix F for additional
information on selected longitudinal studies of children and adolescents).
Administrative Data Sources
In addition to the population health and longitudinal studies described
above, data on child health and health care services can be derived from
service-based records. These data sets include those prepared for adminis-
trative purposes, such as vital statistics (birth and death records), medical
records, health plan payments, and quality measures. They also include
surveys of populations from selected service settings, such as children or
youth who are enrolled in specific health plans (e.g., Medicaid or CHIP),
children who are hospitalized, or children who are identified in cases of
abuse and neglect.
The committee identified and catalogued these service-based data sets
by reviewing the sources on population health described above and draw-
ing on a commissioned background paper (MacTaggart, 2010). Appendix
F provides a listing of the individual data sets derived from service-based
studies, which include, for example, Healthcare Effectiveness Data and
Information Set (HEDIS) measures, National Committee for Quality As-
surance (NCQA) measures, and hospital administrative data.
LIMITATIONS OF EXISTING DATA SOURCES
Estimates of the scope and severity of certain health conditions are
sometimes derived from service-based information sources rather than gen-
eral population surveys. Existing data sources have a number of limitations
related to standardization, data collection, the ability to capture disparities,
case mix adjustment, and data aggregation methods.
Standardization
There is no lack of standards; rather, there are multiple competing and
conflicting standards. The same is true of existing quality performance
measures. A range of such measures exists for children and
adolescents, and the administrative requirements for their collection vary
with respect to which measures are collected, the sources of the data (based
on administrative records or respondents or a mix of the two), validation of
the data sources, and the reporting period. The lack of comparable, stan-
dardized data has limited the ability to develop benchmarks from national
or state sources.
Interstate issues are significant as a result of variations in state reporting
requirements, state information technology (IT) infrastructure capacity and
specifications, state collection methods, cross-state access to data, and the
way various parameters are defined. For instance, the definition of “fully”
immunized and the components of a newborn screening can vary by state;
therefore, the data elements that are collected and tracked may vary and not
be comparable (Ferris et al., 2001). Data are more likely to be equivalent if
claims data are used as the source and the services are provided in the same
setting; however, the conversion from the ninth to the tenth edition of the
International Classification of Diseases (ICD-9 to ICD-10) in the coming
years will require additional scrutiny to ensure continued comparability.
One of the greatest challenges is standardizing the definition of chil-
dren. For Medicaid early and periodic screening, diagnosis, and treatment
(EPSDT), a child is defined as up to age 21. For the Children’s Health Insur-
ance Program (CHIP), a child is defined as up to age 19. For the Consumer
Assessment of Healthcare Providers and Systems (CAHPS) (Berdahl et al.,
2010), a child is defined as age 17 or younger. And the Federal Interagency
Forum on Child and Family Statistics (FIFCFS) of the National Center for
Health Statistics defines teens as those aged 12−17 (FIFCFS, 2010a). Family
structure likewise is not standardized across funding mechanisms and time.
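The practical effect of these differing definitions can be sketched in a few lines. The example below encodes the age cutoffs stated above and reports which programs would count a person of a given age as a child. Reading "up to age 21" and "up to age 19" as "younger than 21" and "younger than 19" is an assumption of this sketch, as are the dictionary and function names.

```python
# Exclusive upper age bounds, taken from the definitions in the text.
# Interpreting "up to age N" as "younger than N" is an assumption of this sketch.
CHILD_AGE_LIMITS = {
    "EPSDT": 21,  # Medicaid EPSDT: child defined as up to age 21
    "CHIP": 19,   # CHIP: child defined as up to age 19
    "CAHPS": 18,  # CAHPS: child defined as age 17 or younger
}


def counts_as_child(program, age):
    """Return True if a person of this age is a child under the program's definition."""
    return 0 <= age < CHILD_AGE_LIMITS[program]


def programs_counting_as_child(age):
    """List, in alphabetical order, the programs that treat this age as a child."""
    return sorted(p for p in CHILD_AGE_LIMITS if counts_as_child(p, age))
```

Under these assumptions an 18-year-old is a child for CHIP and EPSDT but not for CAHPS, so a "child" rate drawn from one source does not describe the same population as a "child" rate drawn from another.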
Other problems occur in attempting to compare similar health issues
across data sets. These problems illustrate both the advantages and difficul-
ties of attempting to standardize definitions and data collection methods.
For example, Bethell and colleagues’ (2002) characterization of good health
raises concern about how the information is obtained. Many national sur-
veys have converged on using a single question on how the individual rates
his/her own health or parents rate their child’s health along a spectrum of
excellent, very good, good, fair, or poor (Anderson et al., 2001; Andresen
et al., 2003; Hennessey et al., 1994; NCHS, 1973; Roghmann and Pless,
1993). Such convergence allows for comparison over time and across age
groups. However, little variation in the responses is seen, and the measure
is insensitive to fairly major differences in health. A more nuanced measure
that captures more dimensions of perceived health status would be useful,
but its use might sacrifice the value of comparability. Addressing such is-
sues would require ongoing methodological work on assessing and refining
measures and establishing comparability over time, as is done with changes
in the ICD (Anderson et al., 2001).
Likewise, the Maternal and Child Health Bureau has developed a short
screener to identify children with special health care needs (Bethell et al.,
2002). While ensuring comparable ascertainment across populations, the
use of this instrument hinders comparisons with data sets that rely on di-
agnoses. Standardized measures of child health and the quality of relevant
health care are also important for all child health problems, but especially
for those children with preventable, ongoing, or serious health conditions
(Kuhlthau et al., 2002). Child health problems include a large number of
relatively rare conditions (see Chapter 4). Moreover, the implications of
the existence of a health condition may vary with child development (IOM
and NRC, 2004). Thus, an early sign of a health problem may be slower
rates of physical growth, but later implications may include poorer school
achievement, perhaps due to repeated absences (Byrd and Weitzman, 1994;
Weitzman et al., 1982), and may be associated with behavioral issues that
may further impede school success (Gortmaker et al., 1990). In addition,
conditions may vary in severity across different children and over time and
have implications for adult health.
Criteria for the design of health measures are identified in Children’s
Health, the Nation’s Wealth (IOM and NRC, 2004, p. 43):
• importance to current and future health,
• reliability and validity,
• meaning in terms of the special aspects of child health and
development,
• cultural appropriateness,
• sensitivity to change, and
• feasibility of collection.
Inherent in these criteria is the challenge of a measurement system that
speaks to the various parties engaged in improving the health of children.
Diagnoses (ICD codes), for example, may be meaningful to health care
providers but less so to parents, who, in turn, may be concerned about
functional implications, including management strategies. Both types of
information may be critical to the development of an education plan for
special education students.
Data Collection
The use of administrative data to assess child health and health care
quality is limited to some extent to certain dimensions of quality, such as
access and some process measures. The combining of medical records and
claims data through the development and operation of electronic health
record (EHR) systems and electronic health information exchange (e-HIE)
will appreciably reduce this limitation. The evolution to ICD-10 coding will
also expand the value of claims data. Data linkages resulting from Medicaid
Transformation Grant initiatives, Children’s Health Insurance Program
Reauthorization Act (CHIPRA) provisions, and American Recovery and
Reinvestment Act (ARRA) funding are providing critical data elements.
For example, the opportunity to collect some measures more efficiently is
enhanced through the linkage of Medicaid with vital statistics, state labo-
ratories, and registries. In addition, the availability of web-based interfaces
expands options for the collection and transmission of data.
Because public and private purchasers and providers bear the cost of
quality oversight and performance measurement reporting, the fiscal and
efficiency benefits of using standardized, formatted data through an
ongoing infrastructure are considerable. However, the realization of these
benefits assumes that the data are collected and documented at the site of
care, which is not always the case. Also assumed is that the individual is
identifiable. A current issue is that Medicaid requires coverage of newborns
under their mother’s identification until their own eligibility can be estab-
lished, which may take up to a year. Data coded to a mother’s identification
may or may not be tracked back to the newborn when the child becomes
individually enrolled.
Another factor that can potentially affect the data collected is a change
in payment methods. For example, while there is significant interest in
episode-of-care payment methods, there is a risk that some of the previous
detailed claims data may be lost. A lesson learned from the transition from
individual to bundled payments for prenatal visits and delivery was that
the requirement to collect and track the number of prenatal visits through
administrative data no longer existed.
Identification and Monitoring of Disparities
As discussed in Chapter 2, it is crucial to identify and monitor health
and health care equity issues among children and adolescents. Racial/ethnic
and linguistic disparities in children’s health and health care cannot be
identified, tracked, addressed, or eliminated without consistent collection
of race/ethnicity and language data on all patients (Flores, 2009). Yet, one-
third of all health plan enrollees (28.7 million individuals) are covered by
plans that collect no race/ethnicity data (AHIP and RWJF, 2006). A national
survey of 272 hospitals found that only 39 percent collected data on pa-
tients’ primary language (Hasnain-Wynia et al., 2004), and no information
is available on what proportions of hospitals or health plans collect data on
English proficiency. Parental limited English proficiency (defined by the U.S.
Census Bureau [Shin and Kominski, 2010] as the self-rated ability to speak
English less than “very well”) has been shown to be superior to primary
language spoken at home as a measure of the impact of language barriers
on children’s health and health care (Flores et al., 2005a).
Although the Office of Management and Budget (OMB) requires highly
[...]
• Restriction to homogeneous populations—Some measures can be
made comparable by restriction to a homogeneous population.
For example, childhood immunizations typically run on strict
age-based schedules and are appropriate for essentially all children
in the age window; hence the measure can be calculated from a
specific age group, and no age adjustment is needed. One can then
compare immunization rates in different states at that single age.
• Stratified reporting—There might be several groups of interest for
a measure, each of which is homogeneous. For example, one might
be interested in immunization rates across a range of ages, but
recognize that younger children are more likely than older ones to
have immunizations complete. A simple comparison of childhood
immunization rates across states could be confounded if one state
has a higher proportion of young children. Instead, one might
stratify reporting by age, that is, prepare a separate measure for
each of several nearly homogeneous age groups. Unconfounded
comparisons could then be made for each stratum.
• Direct standardization—Stratified reporting might be impractical
for any of at least three reasons: (1) there might be insufficient
data with which to calculate measures for each of the relevant
strata with adequate precision for stratified reporting; (2) stratified
reports might provide more detail than is desired (for example,
comparing 51 states in 10 age strata involves cognitively processing
510 measures, obscuring overall state differences); and (3) when
a control variable has many levels or several control variables
must be considered at once, the number of strata can become very
large, exacerbating both of the previous problems. A set of stratified
measures can be consolidated into a simpler single measure by
combining measures across strata with fixed weights corresponding
to some reference population. To develop a single immunization
measure for comparison of states, for example, one might combine
immunization rates by year of age with weights based on the national
age distribution. Then no state would receive a higher score
simply because it had a larger proportion of young children.
• Model-based standardization—Direct standardization may fail
when the number of observations per cell is small or zero.
Model-based (regression) standardization is a generalization that can be
more robust against such problems (Little, 1982). Regression
standardization can accommodate simultaneous adjustment for multiple
variables. A variety of models are appropriate for use with
different kinds of data.
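The stratified reporting and direct standardization approaches listed above can be made concrete with a small sketch. The two states, their counts, and the equal reference weights below are hypothetical values chosen for illustration. Both states have identical age-specific immunization rates but different age mixes, so their crude rates differ while their directly standardized rates agree.

```python
def crude_rate(strata):
    """Overall rate from {age_group: (n_children, n_immunized)}, ignoring age mix."""
    total = sum(n for n, _ in strata.values())
    immunized = sum(k for _, k in strata.values())
    return immunized / total


def stratified_rates(strata):
    """Stratified reporting: one rate per age group."""
    return {group: k / n for group, (n, k) in strata.items()}


def directly_standardized_rate(strata, reference_weights):
    """Combine stratum rates using fixed weights from a reference population."""
    rates = stratified_rates(strata)
    total_weight = sum(reference_weights.values())
    weighted = sum(reference_weights[g] * rates[g] for g in reference_weights)
    return weighted / total_weight


# Hypothetical data: same age-specific rates (0.9 and 0.7), different age mix.
state_a = {"age 2": (800, 720), "age 5": (200, 140)}  # mostly younger children
state_b = {"age 2": (200, 180), "age 5": (800, 560)}  # mostly older children

# Hypothetical reference age distribution shared by both states.
reference = {"age 2": 0.5, "age 5": 0.5}
```

Here the crude rates are 860/1000 for the first state and 740/1000 for the second, which would suggest a difference in performance. Applying the common reference weights gives the same standardized rate for both, because the only difference between them is the proportion of young children.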
Given the existence of technical methods for implementing case mix
adjustment in a variety of settings, the key scientific or policy question
is which variables to adjust for in reporting any particular comparison.
Since case mix adjustment is a method of removing extraneous composi-
tional effects from a comparison, the key is to figure out which effects are
extraneous for a given purpose and which are of interest. For example, it
is common to adjust for severity of illness and comorbidities when using
outcome measures to evaluate the quality of care provided by hospitals.
Without such adjustment, hospitals that treat more severely ill patients
might be rated as worse than those of similar quality that treat mildly ill
patients. Similarly, when evaluation is based on a measure of process, it is
appropriate to adjust for patient variables associated with either the degree
of appropriateness of the process or the difficulty of applying it.
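One common way to carry out the severity adjustment described above is an observed-to-expected comparison, often called indirect standardization. This is a different technique from the direct standardization discussed earlier, and the text does not say which method is intended, so the sketch below is only one possible reading. The severity categories, reference rates, and hospital counts are hypothetical.

```python
def expected_events(case_mix, reference_rates):
    """Events expected if reference rates applied to this hospital's case mix.

    case_mix: {severity: n_patients}; reference_rates: {severity: event rate}.
    """
    return sum(n * reference_rates[s] for s, n in case_mix.items())


def observed_to_expected(observed, case_mix, reference_rates):
    """Ratio above 1 means more adverse outcomes than the case mix predicts."""
    return observed / expected_events(case_mix, reference_rates)


# Hypothetical reference adverse-outcome rates by severity of illness.
reference_rates = {"mild": 0.02, "severe": 0.20}

# Hypothetical hospitals of equal quality with very different case mixes.
hospital_x = {"mild": 900, "severe": 100}  # observed adverse outcomes: 38
hospital_y = {"mild": 200, "severe": 800}  # observed adverse outcomes: 164
```

The unadjusted outcome rates are 38/1000 and 164/1000, so the hospital treating more severely ill patients looks far worse. Relative to what each case mix predicts (38 and 164 expected events, respectively), both hospitals have a ratio of 1, which reflects the similar quality assumed in this example.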
To consider a slightly more complex example, one might be interested
in unadjusted rates of severe emotional distress (SED) if one simply wanted
to determine how to distribute funds for mental health services across
schools. If one wanted to compare schools on their psychological climates,
one might want to adjust for age distributions (if age is a predictor of a
determination of SED). If one wanted to evaluate schools on how well they
(and their associated support systems) help children cope with stressors that
tend to engender SED, one might further adjust for known stressors such
as family poverty or instability.
While adjusting for age is rarely controversial, adjusting for socioeco-
nomic or race/ethnicity variables raises more subtle issues. Suppose, for
example, that low-income patients with a certain condition at each hospital
are less likely than upper-income patients at the same hospital to obtain a
service equally needed by both. Without adjustment, of two hospitals that
perform identically within each income group on a measure of this service,
the one with a greater proportion of low-income patients would receive a
worse quality score. By
the logic of the previous examples, adjustment for patient composition by
income group might be considered. It has been argued that such adjustment
obscures and excuses inferior performance for disadvantaged (low-income,
in this case) patients (Romano, 2000). On the other hand, by hypothesis in
this example and perhaps empirically in many cases, inferior performance
for low-income patients is a systemwide failure, not just a failure of the
hospitals that see many such patients. Such a systemwide failure might
arise, for example, from a lack of insurance coverage for needed medica-
tions, a lack of resources required to enable less educated patients to master
complex treatment regimens, or unconscious discrimination against such
patients. Indeed, such a pattern of inferior treatment within each hospi-
tal is not discernible in unadjusted hospital-level reports, which combine
income groups. (If some hospitals serving many low-income patients have
generally inferior performance—that is, for each income group—this could
be observed in either adjusted or unadjusted reports.) Reports stratified by
income for each hospital would reveal the pattern, albeit only after further
analysis, and would be subject to the disadvantages discussed above. In fact,
the pattern would be revealed most explicitly in the coefficients of the case
mix regression model, which summarize the within-hospital differences in
a single number (Zaslavsky, 2001). The point here is that hospital (or other
unit-specific) reports are good for some purposes but are best examined in
conjunction with analysis of more general patterns.
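As a rough illustration of the argument above, the sketch below uses invented counts for two hypothetical hospitals that treat each income group identically but serve different patient mixes. It assumes a simple linear model with one indicator per hospital, which is only one of many possible case mix models. Under that assumption, the income coefficient equals a weighted average of the within-hospital rate differences, with weights n_low * n_high / (n_low + n_high).

```python
# Hypothetical counts per hospital and income group:
# (number of patients, number who received the needed service)
hospitals = {
    "Hospital A": {"low": (800, 480), "high": (200, 160)},
    "Hospital B": {"low": (200, 120), "high": (800, 640)},
}

def rate(cell):
    """Share of patients in a cell who received the service."""
    n, received = cell
    return received / n

def unadjusted_rate(groups):
    """Hospital-level rate that combines income groups."""
    n = sum(cell[0] for cell in groups.values())
    received = sum(cell[1] for cell in groups.values())
    return received / n

def pooled_within_hospital_gap(data):
    """Weighted average of within-hospital (high minus low) rate differences.

    The weights are those implied by a linear regression of the outcome on
    an income indicator plus one indicator per hospital (an assumed model).
    """
    numerator = 0.0
    denominator = 0.0
    for groups in data.values():
        n_low, n_high = groups["low"][0], groups["high"][0]
        weight = n_low * n_high / (n_low + n_high)
        numerator += weight * (rate(groups["high"]) - rate(groups["low"]))
        denominator += weight
    return numerator / denominator

unadjusted = {name: unadjusted_rate(groups) for name, groups in hospitals.items()}
gap = pooled_within_hospital_gap(hospitals)

for name, value in unadjusted.items():
    print(f"{name}: unadjusted rate = {value:.2f}")
print(f"pooled within-hospital income gap = {gap:.2f}")
```

With these invented counts, the unadjusted rates differ between the hospitals even though their income-specific rates are the same, while the single pooled gap exposes the within-hospital difference that the hospital-level reports hide.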
Another controversy concerns the applicability of case mix adjustment
in assessment of racial/ethnic health and health care disparities. It is logical
to age- and sex-adjust intergroup comparisons of health, and similarly to
adjust comparisons of health care for clinical characteristics affecting need
and outcome. However, the IOM report Unequal Treatment: Confronting
Racial and Ethnic Disparities in Health Care (2003a) argues that it is not
appropriate to adjust for socioeconomic measures (that is, remove their
effects) in such comparisons since worse socioeconomic status is one of
the aspects of disadvantage imposed on disadvantaged racial/ethnic groups
and a mediator of effects on health, treatment, and outcomes. Others have
argued for adjustment for socioeconomic variables, thus more or less
explicitly taking a much narrower view of what counts as a disparity, one
that excludes effects mediated through socioeconomic differences between
groups and is at variance with the IOM-endorsed definitions (Satel and Klick, 2006). This
controversy illustrates how important scientific and normative principles
may arise in case mix adjustment.
Data Aggregation Methods
Any analysis of data used to measure health or health care quality
requires aggregation of the data. These data may be collected with the
primary goal of measurement, using any combination of tools and design
approaches as described previously; in this case, the time-consuming and
expensive process of data collection for measurement must be balanced
against the rigor with which these data can be collected. In many cases, data
originally collected for clinical, billing, research, or other purposes
may be reused as secondary data to assess health or health care quality.
These data are often less well validated and may contain errors or formats
that compromise data analysis; for some data types in some populations,
however, secondary data are the only accessible source of the needed infor-
mation. In either case, IT often plays an important role. Databases, medical
data registries, and clinical health information technology (HIT) are three
common approaches to data aggregation and reuse.
Databases, defined as a structured collection of organized, retrievable,
and (typically) machine-readable information (Frawley et al., 1992), are a
common tool for assembling data before conducting analyses. Database
software is specifically designed to support the storage, manipulation, and
retrieval of data, and is a critical tool for the biostatistician dealing with
large data sets. One of the key features of databases is the ability to define
relationships among data elements. For example, databases allow billing
system data that include provider identifiers and sites of care to be com-
bined with survey data that may include a provider name. These two col-
lections of data can be combined because the provider name and date of
visit may match the provider name and date of completion in the survey.
This relationship allows the site of care to be linked to the survey, thereby
supporting a variety of analyses that compare some measure across sites
of care.
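The linkage described above can be sketched with Python's built-in sqlite3 module. The table names, column names, and records below are hypothetical and are not drawn from any actual billing or survey system. The join matches provider name and visit date in the billing data to provider name and completion date in the survey, then compares a survey measure across sites of care.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE billing (
    provider_id   TEXT,
    provider_name TEXT,
    site_of_care  TEXT,
    visit_date    TEXT
);
CREATE TABLE survey (
    provider_name   TEXT,
    completion_date TEXT,
    satisfaction    INTEGER
);
""")

# Invented billing records: provider identifier, name, site, and visit date.
conn.executemany(
    "INSERT INTO billing VALUES (?, ?, ?, ?)",
    [
        ("P01", "Dr. Adams", "Clinic North", "2010-03-01"),
        ("P01", "Dr. Adams", "Clinic North", "2010-03-02"),
        ("P02", "Dr. Baker", "Clinic South", "2010-03-01"),
    ],
)

# Invented survey records: provider name, completion date, and a score.
conn.executemany(
    "INSERT INTO survey VALUES (?, ?, ?)",
    [
        ("Dr. Adams", "2010-03-01", 4),
        ("Dr. Adams", "2010-03-02", 5),
        ("Dr. Baker", "2010-03-01", 3),
        ("Dr. Chen", "2010-03-05", 2),  # no matching billing record
    ],
)

# Link each survey to a site of care, then compare the measure across sites.
rows = conn.execute("""
    SELECT b.site_of_care, AVG(s.satisfaction) AS mean_score, COUNT(*) AS n
    FROM survey AS s
    JOIN billing AS b
      ON b.provider_name = s.provider_name
     AND b.visit_date = s.completion_date
    GROUP BY b.site_of_care
    ORDER BY b.site_of_care
""").fetchall()

for site, mean_score, n in rows:
    print(site, mean_score, n)
```

Matching on name and date is only approximate. A provider who sees several patients on the same day would produce duplicate matches, and unmatched surveys (the last record here) drop out of the comparison, so a real linkage would need a more specific key.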
Medical data registries are a specialized type of database designed to
contain data collected in the course of caring for a specific patient popula-
tion (Drolet and Johnson, 2008). Because the goal of medical data registries
is often to support secondary data analysis, they feature well-characterized
data collection methods and carefully constructed data fields that rely on
controlled terminologies to support the aggregation of data in ways not
always defined a priori. Medical data registries also characteristically sup-
port longitudinal data collection (i.e., the collection of data on a particular
patient over time), as well as cross-sectional data collection (e.g., survey
results on functional status after hip replacement in clinics across the
country). Finally, the use of a medical data registry implies attention not
only to the quality of the data, but also to the rigorous policies of human
subjects assurance, the Health Insurance Portability and Accountability
Act (HIPAA), and internationally sanctioned approaches to privacy and
security.
Clinical HIT has received significant attention because of its potential
impact on quality and safety (IOM, 1999). EHR and, more recently, per-
sonal health record (PHR) systems are primary data sources that provide
a rich source of information about health and health care quality. These
systems promote the collection of comprehensive, patient-specific data on
active medications, allergies, medical diagnoses, encounter summaries, re-
ferrals, and laboratory tests, as well as other longitudinal data. As utiliza-
tion of EHRs and PHRs continues to grow, they will provide an important
opportunity to integrate data across specialty care, such as care for mental
health and substance use disorders.
In addition to the above three approaches, the adoption of controlled
terminologies, such as the Systematized Nomenclature of Medicine
(SNOMED) or the ICD, together with relatively structured formats for
encounter summaries or document types, makes it possible to aggregate
data across patients, sites of care, and even entire regions, as demonstrated
by numerous health information exchange demonstration projects around
the United States (Denny et al., 2009; Doan et al., 2010). These systems
may catalyze the formulation of new health and health care quality mea-
sures and may radically lower the implementation cost of measurement.
Moreover, through the use of algorithmic approaches to data analysis,
researchers are beginning to demonstrate near-real-time feedback of quality
measures to providers at the point of care (Roberts et al., 2009; Starmer
and Giuse, 2008; Starmer and Waitman, 2006; Zaydfudim et al., 2009).
Unfortunately, as of 2008, fewer than 20 percent of providers were
using a comprehensive EHR in their practice (DesRoches et al., 2008).
Similarly, demonstration projects of e-HIE have achieved usage for under
20 percent of encounters (Johnson et al., 2008; Vest, 2009), although with
recent federal incentives, the adoption of both EHRs and e-HIE is expected
to increase dramatically over the next 5 years.
The promise of these technologies suggests that measurement research-
ers should modify validated measures to support them and investigate how
best to integrate efforts to collect valid and reliable data with available pop-
ulationwide data samples that may be of lower quality. Furthermore, issues
surrounding privacy and access to state-based Medicaid data continue to
underscore challenges in EHR and e-HIE implementation. While the issues
of privacy and confidentiality are of critical concern, detailed discussion of
these issues is beyond the scope of this report. (For a more comprehensive
discussion of privacy and confidentiality issues, see Engaging Privacy and
Information Technology in a Digital Age [NRC, 2007] and Beyond the
HIPAA Privacy Rule: Enhancing Privacy, Improving Health Through Re-
search [IOM, 2009b].) HIPAA and the regulations that followed protect
personal health information held by third parties and give patients an ar-
ray of rights. They also established a range of administrative, physical, and
technical safeguards to ensure the confidentiality, integrity, and availability
of electronic health information.
HIPAA was followed by the Patient Safety and Quality Improvement
Act of 2005 (PSQIA), which established a voluntary reporting system to
resolve patient safety and health care quality issues: “To encourage the
reporting and analysis of medical errors, PSQIA provides Federal privilege
and confidentiality protections for patient safety information called patient
safety work product. Patient safety work product includes information
collected and created during the reporting and analysis of patient safety
events” (HHS, 2011a).
Both of these pieces of legislation represent the policy consensus and
technical capabilities at the time they were enacted. It is unlikely that new
legislation will be enacted in the near future to refine and update this
policy consensus and incorporate technical advances. In the meantime, well-
designed systems that produce robust data with strong privacy protection
will be able not only to meet the needs and protections encompassed by these
two pieces of legislation, but also to adapt to the needs and challenges of
the future.
At present, privacy protections can conflict with attempts at data ag-
gregation. The adolescent population poses special data collection issues,
particularly with regard to privacy and security concerns, as confidentiality
is known to be a significant and necessary component when interviewing
adolescents. Conflicts also exist at the state and local levels with respect
to accessing Medicaid and vital statistics data; there is marked variation in
the way states have interpreted recent guidance from the Centers for Medi-
care and Medicaid Services (CMS) regarding access to and the availability
of Medicaid data. Successful future efforts to conduct cross-state quality
measurement will require specific guidance from CMS to the states regard-
ing the priority associated with these efforts. Although necessary safeguards
for patient confidentiality are essential, they need not preclude the ability
to develop and utilize analytic methods to conduct both cross-sectional and
longitudinal comparisons among states. Failure by CMS to make states
comfortable with providing limited yet essential access to Medicaid data
would restrict the ability to perform quality measurement across the nation
for this important patient population.
Illustrative Examples
This section presents two illustrative examples of the challenges dis-
cussed above: an assessment of a state-based demonstration program and
measurement of health insurance coverage.
Hypothetical State-Based Demonstration Program
The first example is a hypothetical state-based demonstration program
designed to examine the effect of changes in insurance coverage strategies
aimed at reducing preventable hospitalizations and hospital costs among
low-income children. To conduct such an assessment would require data
on the details of insurance coverage; on the details of hospitalizations; and
on personal characteristics of each child’s family, notably income, by state.
The Medical Expenditure Panel Survey (MEPS) is carried out by interviewing
parents of a nationally representative sample of children about their
children’s health and health care use (AHRQ, 2010b), the parents’ employ-
ers about insurance benefits, and health care providers about the children’s
use of services and charges. Thus, this data set would appear to contain
all the necessary data. In 2006, however, the sample included only 12,609
individuals younger than 24, slightly fewer than half of whom were from
low-income families. Moreover, hospitalization is a relatively infrequent
event for children: only 6.5 percent of children younger than 5 and 1.5 per-
cent of those aged 5−17 have any hospital expenditures. With such small
samples, further winnowing by specific diagnoses (e.g., those considered preventable),
by subgroups of interest (e.g., by race/ethnicity or type of insurance cover-
age), and by state would preclude stable or meaningful estimates.
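A back-of-the-envelope calculation suggests how thin such a sample becomes. The sketch below uses the figures quoted above together with several simplifying assumptions that are not from the source: the low-income share is set to 0.5 as an upper bound on "slightly fewer than half," the two reported hospitalization rates are applied as low and high bounds to the whole sample (although they were reported for children younger than 5 and aged 5-17, and the sample extends to age 23), and the sample is spread evenly across 50 states.

```python
sample_under_24 = 12_609          # from the text (MEPS, 2006)
low_income_share = 0.5            # assumption: upper bound on "slightly fewer than half"
hospitalized_share_low = 0.015    # rate reported for ages 5-17, used as a lower bound
hospitalized_share_high = 0.065   # rate reported for children under 5, used as an upper bound
n_states = 50                     # assumption: sample spread evenly across 50 states

low_income = sample_under_24 * low_income_share
national_low = low_income * hospitalized_share_low
national_high = low_income * hospitalized_share_high
per_state_low = national_low / n_states
per_state_high = national_high / n_states

print(f"low-income sample members: about {low_income:.0f}")
print(f"with any hospital expenditure: about {national_low:.0f} to {national_high:.0f}")
print(f"per state: about {per_state_low:.1f} to {per_state_high:.1f}")
```

Under these assumptions the calculation yields roughly 2 to 8 hospitalized low-income sample members per state, before any further division by diagnosis, race/ethnicity, or type of coverage.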
Two state-based data systems might prove more useful. The Kids’ In-
patient Database (KID) contains data on all admissions for those younger
than 20 from 38 states in the most recent compilation (HCUP, 2006). Data
elements include primary and secondary diagnoses and procedures, admis-
sion and discharge status, demographic information such as age and gender,
hospital characteristics, length of stay and charges, and expected source of
payment on 2−3 million discharges per year. While providing a substantial
window on hospital use by children, however, this data set has significant
limitations. Among these is the characterization of socioeconomic status, as
the income data reflect the median income of the zip code of the hospital,
not the income of the child’s family, and the insurance data (expected source
of payment) may not be for the final payer. In addition, the data set does
not permit linkage of multiple hospitalizations for the same child, nor does
it provide much information on the events before and after hospitalization.
Even with substantial numbers of events, quality indicators designed to
parallel those used for adults may not occur in sufficient numbers to yield
information on safety (Scanlon et al., 2008) or to support stratification by
important covariates such as race/ethnicity, income, or insurance status
(Berdahl et al., 2010).
Other state-based assessments of child health can be obtained from
the series of surveys funded by the Maternal and Child Health Bureau on
general child health (NCHS, 2009c) and the health experience of children
with special health care needs (NCHS, 2009b) based on the State and Lo-
cal Area Integrated Telephone Survey (NCHS, 2009a). These surveys are
designed to provide robust samples for analysis at the state level and a
wealth of data on health conditions and functional status, insurance cover-
age, use of medical care and other services, and individual family health
behaviors for children generally and for the more vulnerable subgroup of
those with special needs. As with the MEPS, however, the data come from
parent reports and may be limited on any one issue because of the breadth
of the topics covered. Unlike the MEPS, moreover, these surveys include
no longitudinal component, so that assessing changes in health status or
use of care is not possible. For the purposes of assessment of a hypotheti-
cal state-based demonstration program, virtually no data on costs of care
are available except for out-of-pocket costs for families with children with
special needs. Thus, each of these data sets might provide some insight, but
none would be sufficient to support a comprehensive assessment.
Measurement of Health Insurance Coverage
Another example of the limitations imposed by the fragmentation of
current data collection systems is measurement of health insurance cover-
age. Currently, there is no agreement on the number of children who are
uninsured (CBO, 2003; Kenney et al., 2006; SHADAC and RWJF, 2009).
Confusion as to the number of uninsured children arises in part because
a range of different insurance concepts are relevant, in part because there
is no proven method for collecting health insurance information, and in
part because multiple surveys produce coverage estimates for children on
an annual basis.
A number of different insurance coverage concepts exist—for example,
the number of children who are uninsured at a particular point in time, the
number of children who have been insured for a year or longer, the number
of children who experienced short periods (less than 12 months) without
coverage in a 12-month period, and the average number of children who
are uninsured over a particular period of time. A priori, one would expect
the number of uninsured children to depend on the particular concept:
the number of children who are uninsured for a full year is expected to be
smaller than the number of children who are uninsured at a particular point
in time, which in turn is expected to be smaller than the number of children
who experienced any period without coverage in a given year. Indeed, ac-
cording to one source, which includes measures of two different insurance
concepts, the number of children who are uninsured at a particular point
in time is 1.6 times larger than the number of those who are uninsured for
a full year (Davern et al., 2009; Klerman et al., 2009).
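To make the distinction among these concepts concrete, the sketch below applies each of them to the same set of invented monthly coverage histories. The children, the histories, the interview month, and the function names are all hypothetical. The ordering described above (full-year count, then point-in-time count, then any-period count) follows directly from the definitions.

```python
# Hypothetical monthly coverage histories: True = insured in that month.
coverage = {
    "child_1": [True] * 12,                            # insured all year
    "child_2": [False] * 12,                           # uninsured all year
    "child_3": [True] * 9 + [False] * 3,               # loses coverage in month 10
    "child_4": [False] * 2 + [True] * 10,              # short gap early in the year
    "child_5": [True] * 5 + [False] * 2 + [True] * 5,  # short gap midyear
}

def uninsured_full_year(history):
    """Uninsured in every month of the year."""
    return not any(history)

def uninsured_at_point(history, month_index):
    """Uninsured in the month of the interview."""
    return not history[month_index]

def uninsured_any_time(history):
    """Uninsured in at least one month of the year."""
    return not all(history)

survey_month = 11  # zero-based index for December (assumed interview month)

full_year = sum(uninsured_full_year(h) for h in coverage.values())
point_in_time = sum(uninsured_at_point(h, survey_month) for h in coverage.values())
any_time = sum(uninsured_any_time(h) for h in coverage.values())
# Average number of children uninsured per month over the year.
average_monthly = sum(h.count(False) for h in coverage.values()) / 12

print(full_year, point_in_time, any_time, round(average_monthly, 2))
```

The same five histories yield four different counts of "uninsured children," which is why estimates built on different concepts cannot be compared directly.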
Each of the different insurance concepts provides valuable information
about the nature of the coverage problem facing children. In particular, esti-
mates of the number of children who are uninsured at a particular point in
time are useful for budgeting purposes (Orszag, 2007). For example, when
Medicaid and CHIP programs assess how eligibility expansions could affect
program enrollment and spending, they rely on estimates of how many chil-
dren are uninsured in the targeted income group. Similarly, knowing how
many children are uninsured for a full year or longer provides important
information on the extent to which uninsurance is a chronic problem for
children, whereas knowing how many children experience short bouts of
uninsurance could provide key insights about program operations related
to churning (how individuals move back and forth between having and not
having insurance) and retention (Tang et al., 2003).
Since there is no proven method for accurately measuring a given insurance
concept, moreover, each survey’s approach to measuring the uninsured
differs along a number of dimensions that likely affect the estimated
number of uninsured children. In particular, surveys differ in the wording
of the insurance questions they include, the names used to designate dif-
ferent Medicaid and CHIP programs, the order of the questions, whether
the insurance questions pertain to a specific child or to multiple individuals
in the family, who is providing information on the insurance coverage of
a particular child, what survey mode is used to collect the data (e.g., mail,
telephone, in person), whether the survey is cross-sectional or longitudinal
(which likely affects duration-dependent concepts such as the number of
children who have lacked insurance coverage for a full year), how missing
data on coverage are handled, how a response that requires some interpre-
tation is coded (e.g., when respondents reply that they have both private
coverage and Medicaid), and whether an explicit attempt is made to adjust
for what appears to be a systematic underreporting of Medicaid and CHIP
coverage in household surveys (Kenney et al., 2006; SHADAC and RWJF,
2009). The factors listed here shape the coverage estimates that emerge
from a particular survey.
Four federal surveys—the CPS, the American Community Survey
(ACS), the MEPS, and the National Health Interview Survey (NHIS)—
currently provide annual estimates of the number of children who are
uninsured. The ACS, MEPS, and NHIS all ask explicitly about coverage
at the time of the survey, which corresponds to the point-in-time concept.
The MEPS and NHIS also include measures of full-year uninsurance, with
the MEPS tracking coverage over the course of a year through multiple
interviews at 3- to 4-month intervals and the NHIS collecting information
on current and prior coverage from a single interview. In principle, the
CPS provides an estimate of the number of children who were uninsured
for a full year. However, the survey’s long recall period (14−16 months)
may lead to inaccurate responses, especially among individuals who were
enrolled in Medicaid for a brief period in the previous calendar year or at
the beginning of the previous calendar year (DeNavas-Walt et al., 2009;
Klerman et al., 2009).
For 2008, the most recent year for which official estimates are available
from each of these surveys, the number of uninsured children aged 0−17
at a particular point in time ranges from 6.6 million on the NHIS to 10.7
million on the MEPS (the CPS [unadjusted] and ACS estimates are both
7.3 million). Not only is there disagreement about how many children lack
health insurance coverage at a particular point in time nationally, but state-
level estimates vary across surveys as well (Blewett and Davern, 2006; Call
et al., 2007).
THE NEED FOR A COORDINATED APPROACH TO
INTEGRATE MEASURES OF CHILD AND ADOLESCENT
HEALTH AND HEALTH CARE QUALITY
Much progress has been made in developing and expanding the scope
of measures of child and adolescent health and health care quality. How-
ever, a comprehensive set of ideal measures does not yet exist for children
and adolescents that can support the types of analyses needed in both of
these areas. What is available instead is a patchwork of measures of health
and health care quality drawn from different population surveys, admin-
istrative data sets, and longitudinal studies of children and adolescents,
each of which was designed for different specific purposes, as reviewed
above. In the absence of a framework that can prioritize selected mea-
sures of health outcomes, health services, or care processes, it is difficult
to achieve an appropriate balance between population-based measures of
health and service-based measures of health care quality. Separate efforts
to strengthen both systems of measurement are currently under way at the
federal, state, and local levels, as well as in private-sector initiatives (see,
for example, How et al., 2011; IOM, 2011a; NQF, 2011). But the nation
lacks a coherent strategy and process for coordinating these efforts and for
establishing national priorities to guide emerging health informatics efforts
at the federal, state, and local levels. One example of the latter activity is
the new Health Indicators Warehouse, part of the Community Health Data
Initiative (Bilheimer, 2010), which is aimed at improving data transparency
and timeliness and access to federal health and health care data sets.
The committee believes a coordinated approach is needed to link these
data sets and recommended measures to accomplish several objectives:
• prioritize the health domains that should inform the next generation
of quality improvement efforts;
• suggest strategies by which child health indicators could be developed
from existing child and adolescent data sources; and
• identify gaps that should be addressed through future research on
health measures or enhanced data collection efforts.
Any effort to create such an integrated approach is challenged by mul-
tiple factors:
• a lack of consensus on the fundamental areas of health that are
important to monitor both for the general population of children
and adolescents and for vulnerable groups;
• the absence of high-quality state-level data that make it possible to
monitor the health status of children and adolescents over time;
• a growing realization that children’s and adolescents’ health status
and levels of functioning are frequently influenced by social and
economic factors;
• methodological challenges in establishing relationships among children’s
and adolescents’ health status, insurance status, use of health
care services and their quality, care processes, and health outcomes;
• the recognition that access to and utilization of high-quality health
care services may be insufficient to compensate for adverse social
and economic conditions within families and communities; and
• the persistent inability within various data sets to link measures of
children’s and adolescents’ health status with measures of social
and economic status and family conditions.
A coordinated approach is a necessary step toward building consensus
on the definition of health and the types of health indicators that are impor-
tant to monitor in assessing the health status of children and adolescents,
especially those from disadvantaged and underserved communities.
SUMMARY
This chapter has provided an overview of current methods used to col-
lect data and demonstrated how the consistency and rigor of measurement
methods are directly associated with the quality of the data collected. In
examining the measurement of child and adolescent health and health care,
the committee identified several key findings that highlight areas in which
current measurement efforts fall short. In particular, the evidence reveals a
need for greater consistency, standardization, and interoperability of data.
From its examination of the evidence, the committee determined that
consistent standards for data elements, based on common definitions of key
concepts, are necessary to facilitate the integration of data across health care
systems and geographic areas. In particular, greater consistency is needed in
measuring such characteristics as insurance coverage. Improving linkages
among administrative record systems and between population-based survey
data sets and administrative records would enhance the comprehensive as-
sessment of child and adolescent health and the quality of their health care.
Finally, the emergence of EHRs and personal health records (PHRs) has the
potential to provide an important and novel source of primary data for as-
sessing health and health care quality. The committee believes that the use
and interoperability of EHRs and PHRs will create a robust source of data
that can be readily analyzed and acted upon.