Suggested Citation:"5 Existing Surveillance Data Sources and Systems." Institute of Medicine. 2011. A Nationwide Framework for Surveillance of Cardiovascular and Chronic Lung Diseases. Washington, DC: The National Academies Press. doi: 10.17226/13145.

5

Existing Surveillance Data Sources and Systems

INTRODUCTION

The Centers for Disease Control and Prevention (CDC) defines public health surveillance as “ongoing, systematic collection, analysis, interpretation, and dissemination of data regarding a health-related event for use in public health action to reduce morbidity and mortality and to improve health” (CDC, 2001). This definition is particularly appropriate for acute health issues, such as infectious diseases and injuries, in which an exposure, a diagnosis, or an event is a data point for tracking incidence. Surveillance approaches vary in terms of scope, methods, and objectives: some are established to track particular diseases such as specific cancer types or communicable infections; others track behaviors, health conditions, or events such as smoking, obesity or childhood window falls, or occupational hazards such as on-site injuries.

Surveillance data can be used to estimate the magnitude of specific problems, determine the distribution of illness, portray the natural history of a disease, generate hypotheses, stimulate research, evaluate control measures, monitor changes, and facilitate planning. Data sources and methods for surveillance systems include notifiable diseases, laboratory specimens, vital records, sentinel surveillance, registries, surveys, and administrative data systems.

Surveillance can be either passive or active. With passive surveillance, reports are received from physicians, hospitals, laboratories, or other individuals or institutions. Examples of passive surveillance systems include the Food and Drug Administration’s (FDA’s) Adverse Events Reporting System (AERS), which is focused on patient safety, and the Vaccine Adverse Events Reporting System (VAERS), which is operated by the CDC in conjunction with the FDA and is concerned with the negative effects of licensed vaccines. Passive surveillance is a relatively inexpensive strategy, but its reliance on people and institutions to initiate reporting reduces completeness and data quality. With active surveillance, the surveillance program regularly contacts reporting sources to obtain information. Active surveillance is generally considered more complete, but it is more costly than a passive system (Groseclose et al., 2000).

While there is no single nationwide surveillance system for cardiovascular and chronic lung diseases, a number of surveys, registries, cohort studies, and vital statistics are used by different stakeholders to gather different kinds of information about these diseases. To fulfill its task to develop a nationwide framework for surveillance, the committee sent 49 requests for information to different institutions engaged in some form of relevant data collection.¹ Each request asked for information about the main purpose of the data collection effort; sample characteristics; data collection methods, sources, and frequency; the kind of information obtained (i.e., incidence, prevalence, risk factors, functional health outcomes, clinical care information, and demographic characteristics); costs and source(s) of funding for the system; and data dissemination (i.e., online availability of data, online query, and who can obtain access).

____________

¹ While every attempt was made to include as many systems as possible, systems about which the committee was unaware are likely to exist.

Of the 49 requests, 35 responses were received. Information on eight additional data collection approaches was obtained through published literature and online queries (see Appendix A). The following discussion reviews the strengths and limitations of various types of data collection efforts, including surveys, registries, cohort studies, administrative and health services data, vital statistics, and data regarding hospital performance.

DATA COLLECTION EFFORTS

Surveys

Routine surveys are particularly valuable surveillance tools for chronic diseases and health-related behaviors. In general, surveys are most useful for disease surveillance when they ask people about information for which they may be the most valid and reliable source (e.g., their own private behaviors, attitudes, or mental health status), or for which they can report with reasonable reliability, even if they are not the only or most valid source of information (e.g., whether they went to the doctor in the past month). In some cases, surveys link such self-reported data to data collected from other sources. The following sections of this chapter discuss major surveys at the national level as well as examples of state and local surveys. Each description covers the purpose of the survey, its methods, the extent to which data are collected on topics relevant to cardiovascular and chronic pulmonary diseases, and how data are disseminated, and it closes with a brief discussion of strengths and limitations.

National Population-Based Surveys

The Behavioral Risk Factor Surveillance System (BRFSS) The BRFSS, nationally coordinated by the CDC and conducted by state health departments in all 50 states and the District of Columbia, is a state-based system of cross-sectional health surveys of adults. It collects information on health risk behaviors, preventive health practices, and healthcare access, primarily related to chronic disease and injuries. The BRFSS, which began in 1984, has been the primary source of state-level population health estimates from surveys and is now available in all states. States may request from the CDC samples of telephone numbers with substate or local strata, an option taken by 41 states. The core questionnaire is required of all states. Data collection is funded by several sources, including state and federal agencies and private organizations. The CDC supports a portion of the data collection efforts, and the states provide their own funding for optional modules and state-added questions. Private partners also support collection of data in the different states. BRFSS data are widely used for policy development and advocacy at both the national and state levels.

The BRFSS questionnaire is administered on a continuous basis by telephone using random-digit dial sampling methods. The design consists of a probability sample of all households with telephones in the state. Survey respondents are between the ages of 18 and 99, and only one adult per household is interviewed. As part of the core survey questionnaire developed by the CDC, self-reported information is routinely collected on diagnosed health conditions, including stroke, congestive heart failure (CHF), coronary heart disease (CHD), diabetes, and asthma, but not chronic obstructive pulmonary disease (COPD). The CDC provides an optional module on COPD that states may include at their discretion (and expense). The core questionnaire also collects information on diagnosis of cardiovascular risk factors, including hypertension, diabetes, and high cholesterol. Questions are also asked on tobacco use, alcohol consumption, physical activity, nutrition (including consumption of fruits and vegetables), and weight status. Limited data are collected on access to, and use of, healthcare services, including preventive services.

Sociodemographic data collected include age, sex, race/ethnicity, marital status, education, employment, and household income. Most states and localities with BRFSS surveys have the ability to examine prevalence of health conditions and risk factors by major race/ethnic and income groups. Race/ethnicity is collected as Hispanic, white, black or African American, Asian, Native Hawaiian or Other Pacific Islander, and American Indian or Alaskan Native. Only some states collect explicit data on nativity. Geographically, in addition to state-level estimates, the CDC currently aggregates BRFSS data to produce a limited set of annual estimates for 177 metropolitan and micropolitan statistical areas and 166 counties (which vary from year to year due to sampling variations).

The BRFSS provides annual findings and data files via the website http://www.cdc.gov/brfss/ and on CD-ROM, with additional information on survey instruments, other documentation materials, sets of trend analysis tables for the states and the nation, and sets of demographic-specific tables for estimates of risks and conditions, including bar charts for comparison of areas or survey years. Results are also easily accessible via interactive tools, although the “Web Enabled Analysis Tool” is available for limited survey years. At the time this report was in press, data were available from 1984 through 2009. More than 1,500 peer-reviewed journal articles have been published using BRFSS data.

Strengths and limitations The BRFSS has numerous strengths for use in surveillance. The CDC’s strong control over survey questions to be used ensures that data collected by each state’s BRFSS are reasonably comparable to data collected by other states. As an ongoing survey, it enables tracking of trends. The BRFSS collects information on prevalence of self-reported asthma/adult asthma history, cardiovascular disease (heart attack/stroke), diabetes, and health risk factors that include cholesterol and hypertension awareness (CDC, 2009b). It is adaptable for local use at the expense of each jurisdiction that wishes to use it.

The prevention of CVD and chronic lung disease is a long-term effort that must address risk factors throughout the life course, and the absence of significant information collected about children and adolescents means that the BRFSS does not provide local surveillance of obesity, diet, and physical activity in these age groups. Although other surveys do collect such information on children and adolescents, not being able to link that information to parents’ information is a handicap for prevention efforts. In addition, the BRFSS’s thin measurement of health insurance coverage and access to care limits its value for assessing factors that affect the receipt of clinical preventive and disease monitoring services.

Because it typically does not collect locally representative survey samples, the BRFSS has limited use for local-level analyses and research. Such research is necessary to support efforts to address geographic and social disparities. The CDC recognized the need for local data and used aggregated BRFSS data to produce a limited set of annual estimates for local geographic areas, but these vary from year to year due to sampling variations. It is doubtful that these can meet needs for in-depth data for research and analysis of local variations in chronic diseases and their risk factors. Nearly a third of states have expanded state BRFSS samples at their own expense to generate representative data sets for local substate strata. Such efforts are described in the section below on state surveys.

The BRFSS also relies on self-reported information. It does not collect blood specimens, and it contains no information on incidence of disease and health outcomes or data on chronic bronchitis or emphysema (IOM, 2009). The required core and optional module questionnaires examine disease history and signs and symptoms of disease (e.g., shortness of breath), but the BRFSS core does not collect national data about chronic lung disease, with the exception of asthma. Furthermore, response rates to the BRFSS are lower than ideal and declining, a limitation that it shares with all telephone surveys, and as a telephone survey, it does not include people without telephones.

Youth Risk Behavior Surveillance System (YRBSS) The YRBSS is focused on monitoring priority health risk behavior, including physical inactivity, dietary behaviors, the prevalence of obesity, and asthma among students in grades 9–12 (CDC: http://www.cdc.gov/HealthyYouth/yrbs/index.htm). The survey is conducted by the CDC and by state, territorial, and local education and health agencies and tribal governments. The purpose of this survey is to provide critical behavioral information on adolescents nationwide. At the state level, information is used for school- and community-based program evaluation and policy development as well as for national research and surveillance of health behavior and health risk disparities.

Data are collected every other year, usually during the spring semester. Information is collected from a nationally representative sample of public and private high school students (grades 9–12) in each participating jurisdiction as well as a representative sample of students enrolled in middle schools and alternative schools. The survey is administered in 10 to 15 sites per cycle. A class is randomly selected to participate, and all students in that class are asked to take part in the survey. The survey is a self-administered written questionnaire conducted in school classrooms.
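The class-level selection described above is a form of cluster sampling: the class is the sampled unit, and every student in it is invited to respond. The following Python fragment is a minimal sketch of that single step only. The roster, the class names, and the function name are hypothetical, and the sketch does not represent the school-selection stages or any weighting used by the actual survey.

```python
import random


def sample_class_cluster(school_classes, rng):
    """Pick one class at random and return all of its students (one cluster)."""
    # Sort the class names so the draw depends only on the seed, not dict order.
    class_name = rng.choice(sorted(school_classes))
    return class_name, list(school_classes[class_name])


# Hypothetical roster for one school: class name -> student identifiers.
school = {
    "English 9A": ["s01", "s02", "s03"],
    "Biology 10B": ["s04", "s05"],
    "History 11C": ["s06", "s07", "s08", "s09"],
}

rng = random.Random(2011)  # fixed seed so the sketch is reproducible
chosen_class, respondents = sample_class_cluster(school, rng)
print(chosen_class, respondents)
```

Because whole classes are selected, students within a sampled class tend to resemble one another, which is one reason analyses of clustered samples generally need design-based variance estimates.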

The YRBSS monitors six categories of priority health risk behaviors among youth and young adults, three of which pertain to CVD risk factors. These include behaviors that contribute to unintentional injuries and violence; sexual behaviors that contribute to unintended pregnancy and sexually transmitted diseases, including HIV infection; tobacco use; alcohol and other drug use; unhealthy dietary behaviors; and physical inactivity. In addition, the YRBSS monitors the prevalence of obesity, diagnosed asthma, and prevalence of asthma attacks.

Black and Hispanic students are oversampled in the YRBSS to examine race- and ethnic-specific estimates, but the 2009 sample size from other racial and ethnic groups is “too small to permit meaningful analysis” at the national level (http://www.cdc.gov/HealthyYouth/yrbs/pdf/press_release_yrbs.pdf). However, some states and localities have sufficiently diverse samples to examine other race and ethnic subgroups. No information is collected regarding household or neighborhood income or nativity.

As with the BRFSS, the CDC provides findings and data files for each survey cycle via the website www.cdc.gov/yrbs,² with additional information on survey instruments and other documentation materials. Results are also easily accessible to non-researchers via interactive tools and summary tables that can be queried.

Strengths and limitations The YRBSS shares many of the same strengths as the BRFSS for surveillance, despite the different methodologic design. Like the BRFSS, the CDC’s control over core survey questions to be used in the YRBSS ensures that data collected by each state are comparable to data collected by other states, and results are summarized in an annual Morbidity and Mortality Weekly Report (MMWR). As an ongoing cross-sectional survey, it enables tracking of trends in prevalence. Unlike the BRFSS, the reported response rate of YRBSS surveys is typically 70 percent or greater. Finally, the CDC allows states and localities to add a small subset of questions of local import, thus making it somewhat flexible for local adaptation. The information collected enables surveillance of the prevalence of self-reported asthma and health risk factors (CDC, 2010).

Although the YRBSS has several strengths, its main shortcomings include its limited representativeness and lack of detailed questions on risk factors for CVD and other chronic diseases. In most states and localities, the YRBSS is conducted using sampling frames of public high schools only, and thus it is not generalizable to private, parochial, or some vocational high school students, nor does it include adolescents who have dropped out of high school. In terms of risk factors, the survey does not collect detailed information on factors such as family medical history, food consumption or physical activity patterns, or access to clinical and preventive services. In addition, the lack of information on household or neighborhood socioeconomic status, nativity, or ancestry limits the ability to examine disparities in risk factors. The YRBSS does not collect information that could link adolescents’ responses to information on adults, precluding analyses of risk factors within families and households. Because it does not collect locally representative survey samples, the YRBSS has limited use for local-level analyses and research. Finally, like the BRFSS, the YRBSS relies on self-reported information; it does not collect blood specimens, nor does it contain information on incidence of disease and health outcomes, chronic bronchitis, or emphysema (IOM, 2009).

National Health Interview Survey (NHIS) The NHIS has monitored the health of the nation since 1957. It is a federally funded survey conducted by the National Center for Health Statistics, which provides data that are used widely to monitor trends in illness and disability, to track progress toward achieving national health objectives, to determine barriers to accessing and using appropriate health care, and to evaluate federal health programs. The data also are used for public health research and policy development nationwide and regionally.

The NHIS is a cross-sectional household interview survey of men and women between the ages of 1 and 99. It is conducted in English and Spanish by interviewers employed and trained by the U.S. Census Bureau. The sampling plan follows a multistage area probability design that permits the representative sampling of households and noninstitutional group quarters (e.g., college dormitories), and the plan is redesigned after every decennial census. All states and the District of Columbia are included in the sample. Sampling takes into account multiple geographic levels (e.g., local, state, national), but the sampling design is primarily aimed at making national and regional estimates.

For the Family Core component, all adult members of the household aged 17 and older who are at home at the time of the interview are invited to participate and to respond for themselves. Information about children and adults not at home during the interview can be provided by a responsible adult family member who is 18 or older and who resides in the household. For the Sample Adult questionnaire, one civilian adult per family is randomly selected and responds for her- or himself. Data are collected annually and continuously, with a different, large cross-sectional sample of approximately 35,000 households each year, with a response rate of nearly 90 percent of eligible households (http://www.cdc.gov/nchs/nhis/about_nhis.htm).

____________

² ICF Macro is a research and technology consulting firm.

The NHIS questionnaire uses a computer-assisted personal interviewing (CAPI) model. The revised NHIS questionnaire, implemented since 1997, has core questions and supplements. The core contains four major components: Household, Family, Sample Adult, and Sample Child. The household component collects limited demographic information on all of the individuals living in a particular house. The family component verifies and collects additional demographic information on each member from each family in the house and collects data on topics such as health status and limitations, injuries, healthcare access and use, health insurance, and income and assets. The supplements are used to respond to new public health data needs as they arise, particularly those for which other federal agencies provide funding. The most recently published NHIS core questionnaire includes 5 questions on diabetes, 13 questions about CHD and stroke, 5 on asthma, 1 on emphysema, and 1 on bronchitis.

The current NHIS sample design oversamples blacks, Hispanics, and Asians and persons over age 65. National and regional prevalence estimates on conditions in these race/ethnicity and age groups, as well as by household income group and nativity, are robust.
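Oversampling improves precision for smaller groups, but it also means that respondents no longer represent equal shares of the population, so estimates for the whole population must be weighted. The following Python fragment is a minimal sketch of that idea under simplifying assumptions: the two groups, their population counts, and their sample counts are invented for illustration, and each respondent's weight is simply the group population divided by the group sample size. Actual NHIS weights also adjust for nonresponse and other design features.

```python
def weighted_prevalence(groups):
    """Combine group results using weights of population size / sample size."""
    weighted_cases = 0.0
    weighted_total = 0.0
    for g in groups:
        # Each respondent stands in for this many people in the population.
        weight = g["population"] / g["sampled"]
        weighted_cases += weight * g["cases"]
        weighted_total += weight * g["sampled"]
    return weighted_cases / weighted_total


# Hypothetical figures: group B is 10 percent of the population
# but half of the sample (that is, it is oversampled).
groups = [
    {"name": "group A", "population": 900_000, "sampled": 1_000, "cases": 100},
    {"name": "group B", "population": 100_000, "sampled": 1_000, "cases": 200},
]

unweighted = sum(g["cases"] for g in groups) / sum(g["sampled"] for g in groups)
weighted = weighted_prevalence(groups)
print(f"unweighted: {unweighted:.3f}  weighted: {weighted:.3f}")
```

In this invented example the unweighted figure overstates population prevalence because the oversampled group has the higher rate; weighting restores each group to its population share.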

Data files are released to the public through the NHIS website. The results of different studies using NHIS data are published in several types of reports released through the Internet or in journal articles. Information is also available at http://www.cdc.gov/nchs/nhis.htm.

Strengths and limitations The NHIS serves as the nation’s benchmark health survey. The main strengths of the NHIS are its representativeness, large sample size, adequate sampling of minorities, good response rates, and data on CVD and chronic lung conditions and risk factors. Incidence of self-reported diabetes and CVD can also be roughly estimated, and it is possible to link the survey to national mortality statistics.

The major limitations of NHIS are the lack of physical examinations or directly measured risk factors and disease, and the inability to generate local estimates. Larger states (now approximately 20) have sufficient sample sizes so that reliable state estimates can be made, although that is not the case for the remaining states.

National Health and Nutrition Examination Survey (NHANES) A federally funded survey that, like the NHIS, is conducted by the National Center for Health Statistics, NHANES has been in the field since the early 1960s³ and is the largest and longest running national source of objectively measured health and nutrition data. Data are collected on a broad range of health topics through personal household interviews, physical examinations, and laboratory testing. NHANES data provide objective assessments of prevalence of major chronic and infectious diseases nationally, and they generate key indicators of disease management for benchmarking purposes. They are used for surveillance and policy development by a range of federal agencies, and in etiologic research by a wide range of government, academic, and other institutions.

Historically, NHANES was conducted periodically, but since 1999 it has been in the field continuously. NHANES is designed to assess the health and nutritional status of a statistically representative sample of the civilian, noninstitutionalized population of the continental United States. NHANES conducts a cross-sectional, household-based survey of nearly 10,000 adults and children aged 2 months and older. The sampling plan follows a multistage area probability design that permits the representative sampling of households. Health measurements are performed in specially designed and equipped mobile centers, which travel to locations nationwide. The study team consists of a physician, medical and health technicians, and dietary and health interviewers. Many of the study staff are bilingual in English and Spanish. A series of computer-assisted questionnaires is administered both in the home and in a mobile examination center, followed by a physical examination, and finally, biological specimens are collected as part of a laboratory component.

Detailed information on chronic conditions, including cardiovascular disease, diabetes, and respiratory health and disease, is collected by questionnaire, and participants undergo comprehensive dietary interviews and body measurements. The physical examination includes several measures relevant to CVD and respiratory diseases, including blood pressure and spirometry, as well as cardiovascular fitness, body mass index, and body composition. Relevant biomarkers include cholesterol and triglyceride measures, C-reactive protein, and fasting plasma glucose. NHANES uses collected data to produce estimates of medically defined prevalence of CVD and its clinical risk factors, diabetes, and lung diseases (asthma, chronic bronchitis, emphysema) in the United States.

____________

³ NHANES evolved from the Health Examination Survey, which was launched in 1959 (IOM, 1996).

The current NHANES sample design oversamples blacks and Hispanics, and a new feature of the current sample design is that Asian persons are also oversampled. Detailed information is also collected on household income, nativity, education, and occupation, allowing for fairly sophisticated analysis of health disparities.

The continuous NHANES survey data are released on public-use data files in 2-year increments. Information about NHANES, downloadable public-use data sets, and published reports are made available through the Internet (http://www.cdc.gov/nchs/nhanes.htm) and on easy-to-use CD-ROMs. More than 10,000 peer-reviewed journal articles have been published using NHANES data; a bibliography is available on the survey homepage.

Strengths and limitations NHANES is also a benchmark national health survey. It is one of the few population-based surveys that include validated examination measures, biological specimen collection, and limited measures of health status. Rigorous training in recruitment and data collection ensures high response rates, national representativeness, and high-quality data collection. The sample size is large enough for fairly precise prevalence measures at the national level. The national serologic repository allows for trend estimation of newly emerging biomarkers, such as C-reactive protein.

Under the continuous NHANES design, any single 2-year data release may be limited in sample size. Analysts should consider statistical power when determining whether the sample is sufficient for a particular analysis or whether additional years of the survey need to be combined to produce statistically reliable results. Interview (questionnaire) data are based on self-reports and are therefore subject to recall problems, misunderstanding of the question, and a variety of other factors. Despite high standards for data collection, examination data and laboratory data are also subject to measurement variation and possible examiner effects. The survey does not collect data on incidence of acute CVD events or exacerbations of chronic lung disease. Finally, the sample is not large enough to generate state or local prevalence estimates.
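The trade-off between a single 2-year release and pooled releases can be illustrated with a rough precision calculation. The following Python fragment is a minimal sketch, assuming simple random sampling and the normal approximation for a proportion; the subgroup size per cycle, the prevalence, and the target margin of error are hypothetical. NHANES has a complex multistage design, so real analyses require the survey weights and design-based variance estimation, and true margins of error would generally be wider than those shown here.

```python
import math


def margin_of_error(prevalence, n, z=1.96):
    """Approximate 95 percent margin of error for a proportion (SRS assumption)."""
    return z * math.sqrt(prevalence * (1.0 - prevalence) / n)


def cycles_needed(prevalence, n_per_cycle, target_moe, max_cycles=10):
    """Smallest number of pooled 2-year cycles that reaches the target margin."""
    for cycles in range(1, max_cycles + 1):
        if margin_of_error(prevalence, cycles * n_per_cycle) <= target_moe:
            return cycles
    return None  # target not reached within max_cycles


# Hypothetical subgroup: 400 respondents per 2-year cycle, 10 percent prevalence.
single_cycle_moe = margin_of_error(0.10, 400)
pooled = cycles_needed(0.10, 400, target_moe=0.02)
print(f"one cycle: +/-{single_cycle_moe:.4f}; cycles needed for +/-0.02: {pooled}")
```

Because the margin of error shrinks with the square root of the sample size, doubling the number of pooled cycles narrows the interval by only about 30 percent.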

State Surveys

Nearly a dozen states have established separate surveys to meet their needs for local and state population health data. The growth of state and local health surveys is a positive development, demonstrating that policy makers at those levels recognize and are responding to the need for population health data. Although these surveys differ in the topics covered, measures used, and sample designs, many adopt designs and questions from the national surveys described above, and they have considerable value for tracking change and disparities in CVD and chronic lung disease within their target geographic areas. Their value for a national surveillance system is limited, however, when it comes to measuring differences across geographic domains, for which consistency of measurement is critical (Gold et al., 2008). A small number of states are experimenting with health examination surveys modeled after NHANES, including the Survey of the Health of Wisconsin (SHOW) (http://www.show.wisc.edu/) and the Arkansas Cardiovascular Health Survey (http://www.healthy.arkansas.gov/programsServices/chronicDisease/Initiatives/Documents/ARCHES/ARCHESQuestionnaire.pdf). The committee selected three examples of ongoing state telephone surveys to illustrate these developments.

California Health Interview Survey (CHIS) One of the nation’s largest ongoing health surveys, the CHIS is the state’s primary source of data for public health surveillance and tracking of changes in health insurance coverage as well as eligibility for public healthcare coverage programs. The CHIS covers a broad range of health issues, including health conditions and behaviors, mental health, health insurance, healthcare use and access, and special modules on the health of women, children, and persons over age 65. CHIS data are used for policy development and advocacy within California at both the state and county levels. They are also used for national research and surveillance of racial, ethnic, and other social disparities in health and health care. The CHIS is funded by multiple public agencies and private organizations at the federal, state, and local levels.

Over any 2-year period, the CHIS conducts telephone interviews with about 50,000 households, selected by random-digit dialing (RDD), throughout the state. CHIS develops samples for each of 44 geographic strata, including 41 single-county strata and 3 multiple-county strata; two large counties also include several subcounty strata. Data files include samples for each geographic stratum. Households are selected for participation through random-digit dialing sampling of landline phones and cell phones. In each household, one adult (aged 18 or over) is randomly sampled for interview. In addition, in households with children, one child (through age 11) is randomly sampled and the most knowledgeable parent is interviewed, and one adolescent (aged 12–17) is sampled and directly interviewed (after obtaining parental permission).
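The within-household selection rules described above can be sketched in a few lines. The following Python fragment is illustrative only: the household roster, the field names, and the function name are hypothetical, and the sketch omits the proxy interview with the most knowledgeable parent and the parental permission step.

```python
import random


def select_respondents(household, rng):
    """Randomly pick at most one adult, one child, and one adolescent."""
    adults = [p for p in household if p["age"] >= 18]
    children = [p for p in household if p["age"] <= 11]
    adolescents = [p for p in household if 12 <= p["age"] <= 17]
    return {
        "adult": rng.choice(adults) if adults else None,
        "child": rng.choice(children) if children else None,
        "adolescent": rng.choice(adolescents) if adolescents else None,
    }


# Hypothetical household roster.
household = [
    {"id": "p1", "age": 44},
    {"id": "p2", "age": 41},
    {"id": "p3", "age": 15},
    {"id": "p4", "age": 9},
    {"id": "p5", "age": 6},
]

rng = random.Random(71)  # fixed seed so the sketch is reproducible
selected = select_respondents(household, rng)
print({role: person["id"] for role, person in selected.items() if person})
```

A household with no members in a given age band simply yields no respondent for that role, which is why each entry may be empty.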

The CHIS collects information on asthma (diagnosis, asthma symptoms, emergency room visits, and control and management of asthma), diabetes (pre-diabetes or borderline diabetes, diagnosis, and management of diabetes), and heart disease (heart attacks, heart failures, congestive heart failure, and control and management). Information is also collected on conditions and behaviors associated with these diseases, such as diet, physical activity, and smoking. Information is collected on access to and use of health care, including health insurance coverage, usual source of care, doctor visits, delays in getting care, medical home, communication problems with doctor, and long-term care. CHIS questions are typically drawn or adapted from the NHIS, BRFSS, and other national surveys.

The survey also collects detailed sociodemographic information, including age, sex, detailed race/ethnicity, marital status, education, employment, household income, veteran status, sexual orientation, citizenship and immigration status, languages spoken at home, and English-language proficiency. Questionnaires are translated and administered in English, Spanish, Mandarin, Cantonese, Korean, and Vietnamese. The sample is designed to collect adequate samples of key racial/ethnic populations and to reflect the geographic and other social diversity of California.

The CHIS is conducted by the University of California–Los Angeles (UCLA) Center for Health Policy Research in collaboration with several government agencies and private foundations that fund it. The center uses multiple approaches to disseminate CHIS data and findings. AskCHIS, a free easy-to-use online data query tool, enables users to tailor detailed descriptive analyses for any CHIS health topic by detailed demographics and geographic locations (http://www.chis.ucla.edu). Public-use data files for all years can be downloaded from the CHIS website in SAS, SPSS, and Stata data formats. Confidential CHIS data can be accessed by researchers through the secure CHIS Data Access Center (DAC). Nearly 200 peer-reviewed journal articles have been published using CHIS data. Workshops on data access and use are conducted for community organizations and agencies and for researchers. Further information about the survey is available at http://www.chis.ucla.edu.

Ohio Family Health Survey (OFHS) This survey, conducted in 1998, 2004, 2008, and 2009, provides state policy makers with information about the health status, healthcare use, health insurance coverage, and healthcare access of Ohioans at the state and county levels. Special attention is paid to those on Medicaid and the uninsured. OFHS data are used for health policy development within Ohio, and by local jurisdictions in their health planning and policy development. This survey is supported by various government and health agencies in Ohio.

OFHS interviews about 50,000 adults, aged 18 years or older, by telephone and obtains proxy responses for more than 13,000 children, one from each household. Households are randomly selected by random-digit dialing (RDD) of landlines and cell phones. The sample includes 88 county strata, and one adult respondent is randomly selected within each household. Questionnaires are translated and administered in English and Spanish.

The questionnaires include three questions related to heart conditions (heart attacks, coronary heart disease, strokes, and congestive heart failure), three questions on asthma, and five questions on diabetes. Information is obtained about three risk factors: smoking, weight, and height. Additional information is collected on health insurance coverage, coverage for supplemental services (vision, dental, prescriptions, mental health care), healthcare use, access to care, and unmet needs for care. OFHS questions are typically drawn or adapted from the NHIS, BRFSS, and other national surveys.

The survey collects information about demographics (marital status, gender, and education), employment characteristics, and income. Minority groups, such as African Americans and individuals with an Asian or a Latino surname, are oversampled to ensure that minority groups are covered in each county.
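When some groups are oversampled, raw sample proportions no longer mirror the population, so estimates are generally computed with survey weights. The following Python sketch shows a weighted prevalence estimate under stated assumptions: the respondents, the weights, and the function name are hypothetical, and the calculation is a generic weighted proportion rather than the OFHS weighting method.

```python
def weighted_prevalence(respondents):
    """Weighted share of respondents who report the condition.

    Each respondent carries a 'weight' (assumed here to be the number of
    people in the population that the respondent represents).
    """
    total = sum(r["weight"] for r in respondents)
    if total <= 0:
        raise ValueError("total weight must be positive")
    cases = sum(r["weight"] for r in respondents if r["has_condition"])
    return cases / total

# Hypothetical sample: the first two respondents come from an oversampled
# group, so each represents fewer people (smaller weight).
sample = [
    {"has_condition": True, "weight": 50.0},
    {"has_condition": False, "weight": 50.0},
    {"has_condition": False, "weight": 200.0},
    {"has_condition": True, "weight": 200.0},
    {"has_condition": False, "weight": 200.0},
]
estimate = weighted_prevalence(sample)
unweighted = sum(r["has_condition"] for r in sample) / len(sample)
```

In this made-up sample the unweighted proportion is 2 of 5 respondents, while the weighted estimate is 250 of 700 weighted persons, which shows how ignoring weights can shift an estimate.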

Data from the OFHS, which is conducted by the Ohio State University with funding from multiple government agencies, are accessible through public-use data files and confidential research data sets for restricted use. Researchers must contact the Ohio Colleges of Medicine Government Resource Center to obtain permission to use the confidential data sets. Further information about the survey can be found at http://grc.osu.edu/ofhs.

Hawaii Health Survey (HHS) The Hawaii Health Survey aims to provide the Hawaii Department of Health, other agencies, and the public with data on health services, programs, and health issues. This survey was initiated in 1968 and modeled after the NHIS. Interviews were conducted in person until 1996, when it became a telephone survey. Hawaii Health Survey data are used in public health policy analysis and development within Hawaii, and by local jurisdictions for which samples are available.

Surveys are conducted annually (since 1968); information is collected from approximately 6,769 adult respondents, aged 18 and older, on behalf of about 20,000 individual household members. Respondents are not randomly selected; an adult who is identified as the most knowledgeable about his or her household is selected for an interview in English about all household members. The sample is adjusted and weighted for subareas of Honolulu (city and county), Hawaii, Kauai, and Maui.

Questions related to CVD, COPD, asthma, and/or diabetes ask whether the person has been diagnosed as having arthritis, asthma, diabetes, high blood cholesterol, hypertension, or cancer (questionnaires are not publicly available). Other Hawaii Health Survey questions cover behaviors and risk factors (overweight and obesity), health insurance coverage, child care, access and use of health care, other chronic conditions, mental health, and food insecurity. The survey includes detailed information on age, gender, race/ethnicity, household income, education, and household size. Selected data tables are available online. Public-use data files are not available, although publications of researchers using these data are. Further information about the survey is available at http://www.hawaii.gov/health/statistics/hhs/index.html.

Local Surveys

Some counties and cities have established their own periodic health surveys. Los Angeles County has conducted periodic surveys of its population, and New York City has gone farther than any other local jurisdiction by developing surveys of adults from all five boroughs as well as a one-time local Health and Nutrition Examination Survey (http://www.nyc.gov/html/doh/html/hanes/hanes.shtml). These surveys are designed to meet state and local needs for population health data to guide efforts to address chronic disease and other domains of health disparities.

New York City (NYC) Community Health Survey (CHS) The NYC CHS is a local health survey that collects information on health risk behaviors, health conditions, preventive health practices, and healthcare access, primarily related to chronic disease and injuries. This survey was initiated in 2002 and is conducted annually. NYC CHS data are used for policy development, program evaluation, and advocacy within NYC and at the neighborhood level. They are also used for research and surveillance of racial, ethnic, and other social disparities in health. The survey is funded by the NYC Department of Health and Mental Hygiene. There are no federal funds to support this survey.

The study sample consists of a stratified quota probability sample of households with telephones in the city (approximately 10,000 participants per year). This design uses random-digit dialing to enroll sufficient quotas of participants from different ZIP codes. One adult, age 18–99, per family is randomly selected to participate. Interviews are conducted 10 months of the year. Information is collected on self-reported prevalence of hypertension, high cholesterol, diabetes, and asthma, and on aspirin use. Information is also collected on physical activity; nutrition and weight control, including consumption of fruits and vegetables; tobacco use and alcohol consumption; and access to, and use of, healthcare services. Self-reported sociodemographic data are collected, including age, sex, race/ethnicity, nativity, marital status, education, employment, and household income. The large survey size and diverse urban population allow for the ability to examine and describe social disparities in health and health care.

NYC CHS provides annual public-use data files through its website, http://www.nyc.gov/doh/mycommunityhealth/, as well as survey instruments and other documentation materials, sets of trend analysis tables for the states and the nation, and sets of demographic-specific tables for estimates of risks and conditions, including bar charts for comparison of areas or survey years. More than 40 peer-reviewed journal articles have been published using NYC CHS data.

Strengths, Limitations, and Opportunities of Population Health Surveys

Health surveys of the general population provide valuable information about the prevalence and distribution of chronic diseases as well as about associated risk factors that may contribute to them and their consequences. Major strengths are the breadth of information they offer and the ability to achieve representativeness through careful sampling. Such information may be helpful in tracking distributions, changes in rates, and comparisons among subgroups. Population surveys are especially valuable because they are based on nonclinical samples, including people who may not have been included in disease reporting systems or registries. Furthermore, population surveys provide valuable data for analyses of disparities in health and health care related to the social characteristics they measure (e.g., race, ethnicity, income, and geographic area of residence). Comprehensive surveys enable researchers to include in their analyses other issues that may be relevant to chronic diseases, including mental health status, health behaviors, and other health and social factors.

Most population health surveys that collect data on chronic conditions use samples drawn from a general population, but they do not include residents of nursing homes or other institutions, many of whom may have the condition of interest. In-person surveys are widely considered to be most inclusive of the population because they select people based on where they are rather than whether they have a telephone or respond to mail surveys, and because they often have high response rates.

Challenges exist in conducting population health surveys. The high cost of conducting in-person surveys has motivated the use of telephone surveys, which can reach a larger and geographically more dispersed sample at far lower cost per completed interview. As the field of telecommunications has changed in the past two decades, telephone surveys have begun sampling both persons with landlines and those who rely on cell phones. Nonetheless, with call screening technologies widely available and with increasing demands on people’s time, telephone surveys have seen steep declines in response rates, eroding public confidence that they include a truly representative sample of the population. Survey methodologists are struggling to develop modes of survey data collection that cover all relevant sectors of the population, including those more responsive to web-based communication than telephone, as well as persons from all relevant races, ethnicities, income, and education levels.

Good chronic disease surveillance requires valid and reliable measurement of the condition. Many population surveys rely exclusively on respondent self-report to questionnaire items, which is perhaps most valid for measuring many health behaviors, mental health conditions, perceived barriers to accessing health services, and reporting of symptoms. However, surveillance of chronic disease also requires reliable examination and laboratory data, which are expensive to collect within the context of a population survey. Examples of population health surveys that rely on respondent self-report include the NHIS, the YRBSS, the BRFSS, and many comprehensive state and local health surveys, such as the CHIS, the OFHS, and NYC CHS. Examples of population health surveys that employ both in-person clinical and laboratory examinations as well as respondent self-reports are the NHANES, SHOW, and NYC HANES.⁴

The CDC’s BRFSS and YRBSS are two examples of surveys that have advanced chronic disease surveillance capacity at the state level through the efficient leverage of federal resources, and in some cases they include local sampling. Likewise, dedicated state surveys such as CHIS and the OFHS demonstrate that state and private funds can be harnessed for expanded data collection that is highly responsive to a wide range of local and regional stakeholder needs. Similar synergies are needed to (1) link state and local BRFSS data to data sources that provide neighborhood environmental information; (2) promote coordination of state and local surveys with federal surveys to enhance the comparability of measures and resulting data; (3) support state and local efforts to collect examination and laboratory data as part of population surveys; and (4) increase timeliness of national and state survey data releases. Researchers generally make good use of surveillance survey data when data files are available from the surveys. However, most surveys could usefully expand their dissemination strategies and resources to facilitate and encourage the use of surveillance survey data for policy development and advocacy, particularly at the state and local levels.

____________

⁴ See http://www.nyc.gov/html/doh/html/hanes/howto.shtml (accessed August 2, 2011).

Registries

One of the most powerful tools for recording chronic diseases is the register, a place in which discrete facts are precisely recorded. The use of a register as a tool was first described nearly a millennium ago in England in the Domesday Book, used to ascertain royal land holdings and revenues (Weddell, 1973). The passing of the Census Act in Great Britain followed in 1800 (Weddell, 1973), enabling the creation of a means to collect complete basic demographic data about a population.

A registry, as it pertains to health care, is defined as “a file of data concerning all cases of a particular disease or other health-relevant condition in a defined population such that the cases can be related to a population base” (Last, 2001). The Agency for Healthcare Research and Quality (AHRQ) (2010) has defined a patient registry as “an organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition, or exposure and that serves predetermined scientific, clinical or policy purpose(s).”

Types of Registries

There are several distinct types of health-related registries that compile unique information and context. Examples of patient registries defined by AHRQ include product registries (device or pharmaceutical), health service registries (relating outcomes to exposure to a healthcare service), disease or condition registries (in which the presence of the disease or condition becomes the inclusion attribute), or combinations of the above. Weddell (1973) classified registries along a somewhat different taxonomy, including as major categorizations specific information registries, disease-based registries, treatment registries, aftercare registries, at-risk registries, and resource registries. Although both of these classification systems are equally appropriate, the committee has chosen to use the Weddell categories in the following description of registries.

Specific information registries collect and record information pertaining to specific and defined conditions, enabling calculation of incidence and prevalence of the condition. Examples of specific information registries might include a registry of a specific medical condition, such as congenital malformations, or one that monitors health practice in response to new legislation. Disease-based registries are, as the category implies, related to a specific disease condition, with case definitions clear enough to be recorded and catalogued. Examples of disease-specific registries might include conditions such as ischemic heart disease, COPD, specific types of cancer, schizophrenia, blindness, etc. (WHO, 1969). Such registries can serve as a powerful means of observing and recording the natural history of a disease, the response over time, and the effectiveness of various treatments. They can also accrue information pertaining to the safety or harm of various treatments, the care provided, care patterns, quality of care, disparities in care provision or outcomes, and other information (Gliklich and Dreyer, 2007).

Treatment registries require an ongoing list of all individuals who have received a particular treatment, along with follow-up information. These types of registries can be procedure based, for instance, applying to those who have had certain types of surgical procedures such as carotid endarterectomy. They can be based on medical therapy, such as use of a new inhalational agent, or related to use of specific devices. An important modern-day example of the latter includes registries based on implantation of cardioverter-defibrillators. Participation in such a registry is a requirement for reimbursement for these expensive and potentially life-saving devices.

Aftercare registries record information pertaining to care regimens, such as institutionalizations or hospitalizations. At-risk registries consolidate information on individuals with known or perceived risk factors for a disease, such as those who smoke (creating risk for chronic pulmonary disease, cardiovascular disease, and cancer) or those who have elevated levels of blood cholesterol, creating a risk for cardiovascular disease. Individuals exposed to occupational health risks or medical hazards can also be tracked via this type of registry. A resources registry consolidates information related to a specific resource of interest, such as blood or tissue banking resources. Genetic repositories (actual DNA banks or virtual sequence repositories) could also be considered to fall into this category of resource registries.

Many registries are relevant to cardiovascular and pulmonary diseases. Specific examples of registries available to collect information on cardiovascular disease include the Cardiac Arrest Registry to Enhance Survival (CARES),⁵ the Cardiovascular Research Network (CVRN),⁶ the National Cardiovascular Data Registry (NCDR),⁷ the International Registry of Aortic Dissection (IRAD), the Global Registry for Acute Cardiac Events (GRACE),⁸ various third-party, payer-based cardiovascular disease registries such as BMC² (sponsored by Blue Cross), and the Paul Coverdell National Acute Stroke Registry. Though acute lung disease registries and tissue banking (e.g., Acute Respiratory Distress Syndrome Clinical Network, or ARDS Net⁹) have become important tools to understand and combat acute pulmonary disease, registries are more limited in the area of chronic lung disease. The COPD Foundation, in conjunction with the National Jewish Medical & Research Center in Denver, has established a registry of individuals diagnosed with COPD and their families who have indicated a willingness to participate in COPD research. Additionally, there are local registries, such as the Ohio State University COPD Registry, that seek to identify factors that contribute to the development of COPD.

Strengths and Limitations

Disease-specific registries are useful tools for capturing patient-specific data for individuals who have selected conditions. Registries have significant advantages. The most important is prospectively collecting the exact surveillance data needed in the exact format required. At the most fundamental level, registries allow calculation of incidence rates. If the cases are regularly followed up, a registry can also provide information on remission, exacerbation, prevalence, and survival. Registries are often used in chronic disease control, thereby enabling data collection on risk factors and prevention programs, diagnosis, treatment approaches, and mortality.
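The rate calculations mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and all of the counts are hypothetical, and a real registry analysis would also need a carefully defined population denominator and case definition.

```python
def incidence_rate(new_cases, person_years, per=100_000):
    """New cases per `per` person-years of observation."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return new_cases * per / person_years

def point_prevalence(existing_cases, population, per=100_000):
    """Existing cases per `per` persons at a single point in time."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases * per / population

# Hypothetical registry counts for a defined population.
rate = incidence_rate(new_cases=150, person_years=50_000)
prevalence = point_prevalence(existing_cases=1_200, population=80_000)
```

With these made-up numbers, 150 new cases over 50,000 person-years correspond to 300 per 100,000 person-years, and 1,200 existing cases among 80,000 people correspond to 1,500 per 100,000 persons.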

A most interesting potential use of registries is in the translation of information into gains in understanding and treating diseases. Although clinical trials are immensely useful in defining utility (or futility) of a given treatment in a highly defined population, registries provide a more real-world application and data source, accounting for wide variation in human beings, conditions, practice settings and patterns, environmental exposures (both known and unknown), and hidden biases that may creep into clinical trials when enrolled subjects do not fully represent a population at risk for or affected by a disease. Registries can therefore be the basis for “observational” studies, providing important inferential data regarding disease causality or treatment efficacy, futility, or toxicity. This can provide pivotal information leading to the improved design of a subsequent clinical trial.

The distinction between surveillance- or registry-based information and clinical trials can be marked. Clinical trials entail a population of patients who meet entry (and fail to meet exclusion) criteria. Surveillance or registry data, on the other hand, are more reflective of community or population settings. In fact, creation of community-based registries can aid in the diffusion of therapeutic advances into clinical practice; one example is the use of beta blockers for the treatment of a chronic cardiovascular disease such as heart failure (Franciosa, 2004).

As for disease surveillance, national registries can be used to improve the quality of health care. Registry information on a national level can be gleaned from administrative data sets, such as those used by the Centers for Medicare & Medicaid Services (CMS) or from large third-party payers. Such information can come in the form of hospital or practitioner report cards, or other health reporting measures that lead to changed practice and improved outcomes. For example, a number of European countries have developed national disease registries. In Portugal, such registries include those for acute coronary syndromes, percutaneous coronary interventions, and stroke; these registries contain both clinical and administrative data (Sousa et al., 2006). Sweden has more than 50 voluntary disease-based registries, developed by consensus of a given medical specialty. The registries are used to make comparisons over time so that performance indicators can be established, and hospitals may benchmark against a national database (Sousa et al., 2006). In the United Kingdom, the National Health Service has developed registries to provide open benchmarking of clinical outcomes and performance of specific institutions against a national comparator (Sousa et al., 2006). Not unlike the current patchwork of chronic disease surveillance systems in the United States, European registries are often diffuse, lack interconnectivity, and lack certain usefulness that might be better achieved through nationwide harmonization.

____________

⁵ See https://mycares.net/ (accessed August 2, 2011).

⁶ See http://www.cvrn.org/ (accessed August 2, 2011).

⁷ See http://www.ncdr.com/webncdr/common/ (accessed August 2, 2011).

⁸ See http://www.outcomes-umassmed.org/grace/ (accessed August 2, 2011).

⁹ See http://clinicaltrials.gov/ct2/show/NCT00000579 (accessed August 2, 2011).

Despite the advantages of using registries for surveillance, there are some inherent limitations. One of these limitations pertains to bias, which may creep unrecognized into data sets, and which may result in misleading conclusions. Bias in registry composition and analysis can take several forms, including that related to patient selection into the registry, unmeasured confounders, and misclassification of patients entered into a registry. It is important to understand how even registries that apparently take into account all known cases of a disease or procedures may still be confounded by bias of the sorts listed above when used for surveillance.

Aside from biases that may affect case definition, inclusion, and other material information for registries, another limitation is that subsequent data on registry patients may be missed; for example, data will be missed when registry patients visit healthcare providers not participating in the registry. A further limitation is the restricted ability to investigate secondary questions. Questions that arise after a given registry is established might prove difficult to investigate if needed data were not prospectively collected. This would be especially true when healthcare providers begin using new tests and treatments, or adopt new terminologies.

Another potential pitfall in using registries for surveillance relates to the fact that collecting registry data is not central to healthcare delivery. Collecting and entering data into the proper forms and format requires time and effort beyond the usual healthcare delivery processes. Because resources devoted to the registry often do not immediately benefit the practice or its patients, clinicians may be reluctant to register patients or collect and record data on busy days, and busier clinicians may be less inclined to participate in registries altogether. The mandatory nature of some registries (e.g., those for implantation of implantable cardioverter-defibrillators) tied to reimbursement is one approach to mitigate this potential pitfall.

Summary

A number of types of registries are related to health care. They collect and record information about specific conditions, treatments, outcomes, or populations. Registry data are more prone than survey data to various types of bias. On the other hand, registry data can provide more specific insights into disease-specific treatment effectiveness (or futility), and they help in the area of evidence-based medicine by promulgating diffusion of knowledge, treatments, and technology into community practice. Registry data fall somewhere in the evidence-based spectrum between clinical trials and surveillance. Clinical trials define prescriptive and proscriptive entry criteria. In surveillance efforts, data accrue from all (or unselected representative members) of a defined population. The three approaches (surveillance data, registry data, and clinical trials) are complementary, each providing different insights, degrees of bias, and applicability to a given disease or treatment or to a particular setting. There is inherent value in each approach, with each type of data providing input into the hierarchy of evidence needed to improve healthcare outcomes.

Cohort Studies

National surveillance in the United States is largely cross-sectional and includes household surveys such as the NHIS and the NHANES (both conducted by the NCHS) and the BRFSS (conducted by the CDC). These studies provide rapid information about national or regional populations within the United States and allow inferences about changes in disease rates or changes in prevalence in subsequent surveys. Another approach to surveillance is the cohort study. The cohort design can be either prospective or retrospective. Retrospective cohort studies are less costly, shorter in duration, and useful for examining prior exposures; however, the resulting information is less complete and accurate than through the prospective approach. Familiar examples of the cohort approach include the Framingham Heart Study, Atherosclerosis Risk in Communities Study (ARIC), Cardiovascular Health Study (CHS), Coronary Artery Risk Development in Young Adults Study (CARDIA), Rancho Bernardo Study (RBS), and Strong Heart Study (SHS). A more comprehensive list of cohort studies is provided in Appendix A.

Framingham Heart Study

The Framingham Heart Study began in 1948 to secure epidemiological data on arteriosclerotic and hypertensive cardiovascular disease. The initial cohort included 5,209 persons aged 30–62. Data were collected using interviews and measurements from in-person examinations, biomarker collection, health history updates by mailed questionnaire or telephone interview, and follow-up medical records from healthcare providers. Framingham Heart Study data are available through research proposals submitted online and approved by relevant review committees. Variables are posted on the study website (http://www.framinghamheartstudy.org/risk/coronary.html).

Survivors from the original cohort continue to be followed, as do their children and grandchildren. An assessment by D’Agostino and colleagues (2008) that included 8,491 participants from the original Framingham Heart Study and the Framingham Offspring Study demonstrated that a sex-specific multivariable risk factor algorithm could be easily used in primary care to quantify general CVD risk and specific CVD risk (coronary, cerebrovascular, and peripheral arterial disease and heart failure).

The original study design for Framingham could not be used to yield prevalence rates, but it was well suited for the estimation of incidence rates. The findings could not be reliably generalized to other ethnic groups, as the cohort was primarily composed of white individuals. Investigators have also speculated that participation in periodic examinations may have motivated Framingham subjects to modify risk factors (Lloyd-Jones et al., 2002). The major contribution of the study has come in detailing incidence rates, in particular relating them to risk factors, and with its careful longitudinal follow-up it has near complete data on the development of diseases. Furthermore, many of these data were collected when medical interventions such as antihypertensive therapies were not actively used. This has provided a “natural history” of the risk factors and the diseases that followed them. All this is ideal for the estimation and study of incidence rates. Few other studies have been in such optimal positions (http://www.framinghamheartstudy.org/about/background.html).

The Atherosclerosis Risk in Communities Study (ARIC)

ARIC is a prospective study conducted in four U.S. communities (Forsyth County, North Carolina; Jackson, Mississippi; Minneapolis suburbs, Minnesota; and Washington County, Maryland) to investigate the etiology and natural history of atherosclerosis in middle-aged adults. It also measures variation in cardiovascular risk factors, medical care, and disease with respect to race, sex, place, and time. ARIC includes a cohort component composed of 15,792 persons aged 45–64, and a community surveillance component. The cohort component serves to validate incidence rates, while community surveillance enhances the generalizability of cohort findings. A data request must be submitted to the National Heart, Lung, and Blood Institute (NHLBI) to use ARIC data for research and data analysis. ARIC includes a series of quality assurance and quality control protocols that include steps such as repeated measurements.

White and colleagues (1996) reported strengths and weaknesses of the ARIC study design with regard to CHD. In comparison to community surveillance, they observed that the cohort design “permits the more complete and standardized characterization of a broader range of CHD endpoints, including angina and, via repeated ECGs [electrocardiograms] obtained during repeat clinic visits, clinically unrecognized myocardial infarction.” Other advantages are a more accurate classification of incident versus recurrent CHD events, intensive measurement of risk factors every 3 years, and increased understanding of morbidity and mortality trends by observing changes in risk factors over time. The weaknesses include insufficient size to precisely characterize CHD rates and trends, and volunteer bias that may limit generalizability to the reference communities.

Cardiovascular Health Study (CHS)

CHS is a prospective population-based cohort study of risk factors for CHD and stroke in adults aged 65 and older; 5,201 participants were recruited from four field centers (Forsyth County, North Carolina; Sacramento County, California; Washington County, Maryland; and Pittsburgh, Pennsylvania) in 1990, with an additional 687 predominantly African American participants recruited in 1992. The baseline examinations included a home interview and a clinical examination that assessed traditional CVD risk factors as well as measures of subclinical disease, including carotid ultrasound, echocardiography, electrocardiography, and pulmonary function (http://www.chs-nhlbi.org/CHSDesc.htm). These examinations permitted evaluation of CVD risk factors in older adults, particularly in groups previously underrepresented in epidemiologic studies, such as women and the very old (Fried et al., 1991). CHS data are available with proposals approved by the CHS Publications and Presentations Committee and the Steering Committee; they are also available as a limited-access data set with NHLBI.

Coronary Artery Risk Development in Young Adults (CARDIA)

The CARDIA Study examines the development and determinants of clinical and subclinical cardiovascular disease and its risk factors in young adults. It began in the mid-1980s with a cohort of 5,115 black and white men and women aged 18–30 residing in four cities: Birmingham, Alabama; Chicago, Illinois; Minneapolis, Minnesota; and Oakland, California. The data that have been collected include blood pressure, cholesterol and other lipids, glucose, physical measurements, lifestyle, dietary and exercise patterns, behavioral and psychological variables, medical and family history, and other chemistries. Subclinical atherosclerosis was measured via echocardiography during years 5 and 10, computed tomography during years 15 and 20, and carotid ultrasound during year 20 (http://www.cardia.dopm.uab.edu/overview.htm). Use of CARDIA data requires approval from the Publications and Presentations Committee and affiliation with a CARDIA-approved investigator. A data repository data set (formerly known as Limited Access Dataset) can be requested directly from NHLBI.

Pereira and colleagues (2002) used CARDIA to examine the associations between dairy intake and incidence of insulin resistance syndrome. The authors noted that the main limitations of their study were related to its observational nature and the potential for residual confounding. The strengths included the longitudinal design and, with regard to the diet history method, its comprehensiveness, the interviewer-administered format, the suitable time frame for capturing habitual diet without exacerbating recall error, and the applicability to populations differing in social and cultural characteristics.

The Rancho Bernardo Study

The Rancho Bernardo Study began in 1972 as one of 12 North American Lipid Research Clinic (LRC) Prevalence Studies designed to describe the prevalence of hyperlipidemia in different populations. An initial goal was to study gender and diabetes as risk factors for cardiovascular disease. The LRC was funded by the NHI (now the NHLBI) through an 8-year follow-up and is now in its 39th year of receiving support from the National Institute of Diabetes and Digestive and Kidney Diseases and the National Institute on Aging.

The LRC site was located in Rancho Bernardo, an almost entirely white suburb of San Diego. A survey was used to identify residents aged 30 and older; 82 percent (2,500 men and 2,900 women) enrolled. Survivors are invited to be seen in the research clinic every 3–5 years and are followed every year by mail or phone for vital status. The RBS added the classic CVD risk factors, including diabetes, to the baseline visit; subsequently it broadened its scope to include many other common exposures and chronic disease outcomes. Most risk factors, including psychosocial variables, are measured at every visit. Multiple novel risk factors, pulmonary function using spirometry, coronary artery calcium, carotid ultrasound, and peripheral arterial disease were measured at least once. Data from the RBS are available to approved investigators and have been used in more than 400 publications.

Most subjects were white and had at least a high school education, so results may not be generalizable to other groups. Multiple evaluations, with ethically mandated reports of identified risk factors or health problems, may lead to interventions, improve prognosis, or reverse causality. The strengths of the study include excellent baseline prevalence data in the era preceding widespread use of effective blood pressure or lipid-lowering medications, and > 95 percent follow-up to 2008 for clinical and fatal CVD and multiple comorbidities.

The Strong Heart Study

The Strong Heart Study (SHS), which began in 1988, is a longitudinal population-based study of CVD and its risk factors in American Indians from three field centers in Arizona, Oklahoma, and South and North Dakota. There are two cohorts in the SHS: an initial sample of 4,549 American Indian men and women aged 45–74 years, and a set of 3,838 extended family members aged ≥ 15 years in 94 families, including 574 from the initial cohort. The initial cohort (62 percent of total population aged 45–74) was first examined in 1989–1992. The survivors were reexamined in 1993–1995 and 1998–1999, and the family cohort was first examined in 2001–2004 and reexamined in 2006–2009. Every examination included a personal interview and a thorough physical examination. Questions and procedures that related to CVD, chronic pulmonary disease, asthma, and diabetes included medical history, personal health habits, EKG, echocardiogram, carotid ultrasound, pulmonary function testing, and laboratory tests of lipids, glucose, insulin, albuminuria, and others (http://www.strongheart.ouhsc.edu). A total of 3,798 family members were included in genome-wide linkage scans. In addition, an annual CVD mortality and morbidity surveillance using medical records and hospital discharges has been ongoing since the beginning of the study. The longitudinal data can be used to estimate prevalence and incidence of CVD and its risk factors in the American Indian population. SHS data are available with proposals approved by the SHS Publications and Presentations Committee and the Steering Committee.

Strengths and Limitations

In general, the prospective cohort design offers several advantages, including the ability to provide incidence rates, determine a temporal sequence of events (exposure precedes disease), and examine multiple outcomes from the same exposure simultaneously. Additional advantages of the cohort design are the emphasis on systematic data collection and uniformly conducted measurements. A major weakness is the potential for differences between study volunteers and the general population (Shlipak and Stehman-Breen, 2005). Additional disadvantages include subject attrition, inability to produce prevalence data, and relative expense.
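The incidence rates that a cohort design supports are conventionally expressed as new cases per unit of person-time at risk. The following is a minimal sketch of that calculation; the participant records, field names, and the `incidence_rate_per_1000` helper are hypothetical illustrations and are not drawn from any of the studies described above.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # Follow-up time until disease onset, loss to follow-up, or study end.
    years_at_risk: float
    developed_disease: bool


def incidence_rate_per_1000(cohort):
    """Return new cases per 1,000 person-years of follow-up."""
    cases = sum(p.developed_disease for p in cohort)
    person_years = sum(p.years_at_risk for p in cohort)
    if person_years <= 0:
        raise ValueError("cohort has no follow-up time")
    return 1000.0 * cases / person_years


# Hypothetical five-person cohort followed for up to 10 years.
cohort = [
    Participant(10.0, False),
    Participant(4.5, True),    # event at 4.5 years
    Participant(7.0, False),   # lost to follow-up at 7 years (attrition)
    Participant(2.5, True),    # event at 2.5 years
    Participant(10.0, False),
]

# 2 cases over 34 person-years, or about 58.8 per 1,000 person-years.
rate = incidence_rate_per_1000(cohort)
print(f"{rate:.1f} cases per 1,000 person-years")
```

Because participants contribute person-time only while under observation, attrition shrinks the denominator rather than being silently ignored, which is one reason subject attrition is listed among the disadvantages of the design.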

Health Services Data

Data drawn from health services encounters or medical records can be used to understand healthcare access; identify services that people with chronic conditions receive, including patient visits, examinations, and laboratory and imaging studies; and examine healthcare quality and costs. These data are valuable in chronic disease surveillance when they are based on systematic recording of information by trained professionals; they are less valuable when the recording of data is less uniform and is based more on subjective professional judgments regarding what to record about the person’s condition. Two types of health services data are claims data and medical record data obtained from manual chart abstraction or emerging electronic health records (EHRs).

Claims data (including medical, dental, and pharmacy claims) can be used to enumerate each encounter or service used by a person. They can be collected for hospitalizations, outpatient visits, public program coverage, or private health insurance. Claims data may include information that is sufficiently detailed to analyze the incidence rate of a chronic condition, the types of services patients receive, and the social characteristics of people who receive services for the condition. Claims data may also include geographic identifiers for persons or service providers and may be used to map geographic patterns of the incidence of hospitalizations, other services provided, and healthcare costs, which can be used in analyses of healthcare disparities.

Data abstracted from medical records and EHRs can provide a detailed record of the process of health services for persons with chronic conditions. (For a more detailed discussion of the use of electronic medical records in surveillance, see Chapter 6.) Such data can be used to assess quality of care provided to persons with chronic conditions and, if they include characteristics of the individual patients, the data can be used to assess disparities in care received. These data can be abstracted for use in registries (as discussed earlier in this chapter), for combination into data sets such as the Healthcare Cost and Utilization Project, or for surveys such as the National Hospital Discharge Survey, the National Ambulatory Medical Care Survey, and the National Hospital Ambulatory Medical Care Survey, all of which are discussed below.

Claims Data

Claims data can be used to collect information on hospital utilization. Such data are largely collected and reported by state and federal agencies. States use standardized methods developed by AHRQ to use these data to report on hospitalization rates and mortality.

Medicare Part A claims data, also known as MedPAR or inpatient standard analytic files, are one of the most readily available and widely used sources of data on hospitalizations in the United States. All U.S. adults aged 65 and older who have paid Social Security payroll taxes for at least 10 years or who were the spouse of such a worker, as well as those who are permanently disabled or have end-stage renal disease (ESRD) or amyotrophic lateral sclerosis (ALS), are eligible for Medicare Part A. Medicare claims data are particularly useful because they are nationally representative and longitudinal for all enrollees in the traditional Medicare fee-for-service program, representing about 35 million beneficiaries.

Clinical data on Medicare hospital claims are limited to 10 diagnoses and 6 procedure codes, as defined by the International Classification of Diseases, 9th Revision-Clinical Modification (ICD-9-CM). The first or “principal” diagnosis is the condition determined at discharge to be the main reason for a patient’s admission to the hospital. Medicare Part A claims also include patients’ Medicare identification number, a hospital identifier, and basic demographic data including age, sex, and race. Strengths of Medicare Part A data for chronic disease surveillance include the ability to link hospitalizations longitudinally for individual patients and to link to Medicare Part B data to assess physicians’ services and ambulatory care before and after hospitalizations. Limitations of Medicare data for monitoring cardiovascular and pulmonary hospitalizations include the very limited data on patients under age 65 (i.e., only those with permanent disabilities, ESRD, or ALS) and the lack of data on patients enrolled in private health plans through the Medicare managed-care program known as Medicare Advantage. With the growing need for data to evaluate health system performance and public health policy, a number of states are developing all-payer claims databases (Love et al., 2010).

Although administrative claims data are useful at the macro level to describe patterns of use and mortality, a number of inherent limitations need to be considered in interpreting and using these data. These limitations include coding errors, limited clinical information, and diagnostic misclassification, which includes the underdiagnosis, overdiagnosis, and misdiagnosis common for cardiovascular and chronic lung diseases. Although the specificity of diagnostic algorithms shows promise for selected applications (Mapel et al., 2006; Yarger et al., 2008), their sensitivity and positive predictive value may be low (Rector et al., 2004; Singh, 2009). Moreover, variations in patterns of diagnostic practices may further bias claims data (Song et al., 2010).
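To illustrate how a claims-based diagnostic algorithm can have high specificity yet low sensitivity and positive predictive value, the sketch below computes the three measures from a two-by-two validation table. The counts and the `diagnostic_accuracy` helper are hypothetical and are not taken from the cited validation studies; they assume a condition with 5 percent true prevalence among 10,000 enrollees.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Compute accuracy measures from a 2x2 validation table.

    tp/fn: people with the condition who are flagged / missed by the algorithm.
    fp/tn: people without the condition who are flagged / correctly not flagged.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of true cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),           # share of flagged people who are true cases
    }


# Hypothetical validation sample: 500 true cases, 9,500 non-cases.
metrics = diagnostic_accuracy(tp=300, fp=475, fn=200, tn=9025)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this assumed example the specificity is 0.95, but because non-cases greatly outnumber cases, the 475 false positives exceed the 300 true positives, and fewer than 4 in 10 flagged enrollees actually have the condition.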

Healthcare Cost and Utilization Project

Another widely used source of data on hospitalizations is the federal Healthcare Cost and Utilization Project (HCUP) maintained by the Agency for Healthcare Research and Quality (http://www.ahrq.gov/data/hcup). The HCUP family of data sets includes the State Inpatient Databases (SID) and Nationwide Inpatient Sample (NIS). The SID includes data from 42 state data agencies that submit hospital discharge abstracts from all hospitals in their respective states in a standardized format. The SID includes approximately 26 million discharges per year, representing about 90 percent of all acute-care discharges in the United States annually. A closely related database is the NIS, which includes 8 million discharges per year from a sample of over 1,000 hospitals in the SID, representing about 20 percent of all U.S. hospitals. Data from the SID and NIS, respectively, can be used to estimate hospitalization rates in selected states and nationally for cardiovascular disease, chronic lung disease, and other major conditions.

Strengths of these data include information on patients of all ages covered by all payers (including the uninsured). Limitations of the SID and NIS include the inability to link hospitalizations for individual patients because of the lack of unique patient identifiers that are consistent across hospitals, thereby precluding the calculation of true population-based rates of hospitalizations for specific conditions.

National Hospital Discharge Survey

The Centers for Disease Control and Prevention has been conducting the National Hospital Discharge Survey since 1965 (Hall et al., 2010). This is a voluntary survey of a national sample of hospitals, which included 422 in 2007, that describes patient and hospital characteristics, hospital discharge diagnoses, and procedures. Moreover, the longitudinal design allows monitoring of trends in use.

State Cardiac Procedure Databases

Surveillance of hospitalizations related to coronary artery bypass graft (CABG) or percutaneous coronary intervention (PCI) procedures can also be conducted with specialized databases mandated by selected states, including California, Massachusetts, New Jersey, New York, and Pennsylvania, to monitor risk-adjusted outcomes in all nonfederal hospitals that perform cardiac surgery procedures; only Massachusetts and New York monitor outcomes after PCI. Strengths include abstraction of key clinical variables; rigorous data adjudication; inclusion of all adults, regardless of payer and insurance status; linkages to billing data; and, for some states, the ability to monitor outcomes over longer periods. Limitations include a focus on inpatient hospitalizations and lack of patient-reported outcomes such as functioning.

Administrative Claims and Clinical Data

Multiple sources of data are available on the quality and safety of hospital care in the United States, including self-reports by patients and physicians (Davis et al., 2010; Jha et al., 2008), administrative claims data, and clinical data (Chassin et al., 2010; McCarthy et al., 2009). The focus of this overview is on the use of administrative claims and clinical data.

Since 2002, hospitals nationwide have been required to collect and report administrative and clinical data that are used in accreditation, CMS reimbursement, pay-for-performance, and public reporting of performance (Chassin et al., 2010; Lindenauer et al., 2007). These data provide information on clinical indicators of the processes and outcomes of healthcare delivery (Chassin et al., 2010). Currently hospitals provide data to the Joint Commission on 57 inpatient measures, which include metrics on processes of care for acute myocardial infarction and congestive heart failure but not for chronic lung diseases. Of these inpatient measures, 31 are publicly reported. In addition, the CMS collects data on patient satisfaction (Jha et al., 2008) and clinical outcomes (e.g., readmissions and death) (CMS, 2009).

Overall, the results of the reporting and feedback on performance of process indicators have been positive, with substantial improvements in hospital performance since 2002 (Chassin et al., 2010; Jha et al., 2005; Williams et al., 2005). Such improvements in hospital performance provide evidence on the feasibility and potential effectiveness of a larger chronic disease surveillance system; however, there are limitations to the current hospital surveillance activities (Chassin et al., 2010; Joint Commission, 2008; Pronovost and Goeschel, 2010). These limitations and experiences from the Joint Commission and CMS hospital surveillance provide a rich resource to guide further improvement of the existing system and development of a nationwide chronic disease surveillance system. The Joint Commission report, Health Care at the Crossroads: Development of a National Performance Measurement Data Strategy (2008), summarizes the current state of affairs. Many stakeholders are conducting performance measurement initiatives (e.g., National Quality Forum, Joint Commission, National Committee on Quality Assurance, American Medical Association–Physician Consortium for Performance Improvement, the AQA Alliance, the CMS, Hospital Quality Alliance, AHRQ, and the CDC), yet, as this quote from the Joint Commission demonstrates, these initiatives have limitations:

Most performance measurement efforts operate in isolation from one another to meet the specific needs of their sponsors.… Since data are collected and used in fragmented ways, they rarely provide a picture of the overall quality of performance for a specific clinician or organization, or how well patients fare, or the state of the public’s health at-large.

Insufficient attention has been paid to the data infrastructure that needs to be in place to support performance improvement activities. The framework for designing such a data infrastructure must address consumer expectations for data privacy, support a data highway that allows for data sharing and linkages, and operate under an agreed-upon set of rules and governance structure. These issues must be addressed expeditiously. (Joint Commission, 2008)

A number of other limitations of the system need to be considered, including burden of data collection, inconsistent effectiveness of some indicators, insufficient standardization and accuracy, and gaps in measurement for some diseases and components of healthcare delivery (e.g., post-discharge and outpatient care) (Chassin et al., 2010; Joint Commission, 2008; Pronovost and Goeschel, 2010; Pronovost et al., 2007). Currently, for most hospitals the process is labor intensive, requiring review of and data abstraction from medical records. For example, an estimated 22 minutes are needed to abstract the record of a patient with congestive heart failure (Joint Commission website), which translates to more than 400,000 person-hours each year for U.S. hospitals (Fonarow and Peterson, 2009). The link between measures of process performance and health outcomes (e.g., rehospitalization, mortality) has been inconsistent (Fonarow and Peterson, 2009; Jha et al., 2007; Mansi et al., 2010; Werner and Bradlow, 2006).
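As a rough check on the scale of the abstraction burden, the two figures quoted above can be related by simple arithmetic. The sketch below back-calculates the number of records implied by 22 minutes per record and 400,000 person-hours per year. The text does not state that record count, so the result rests on the assumption that the two figures are directly related; the function names are illustrative only.

```python
MINUTES_PER_RECORD = 22            # abstraction time per heart failure record (from the text)
PERSON_HOURS_PER_YEAR = 400_000    # lower bound quoted in the text ("more than 400,000")


def implied_records(person_hours, minutes_per_record):
    """Number of records that could be abstracted in the given person-hours."""
    return person_hours * 60 / minutes_per_record


def abstraction_hours(n_records, minutes_per_record):
    """Person-hours needed to abstract n_records at a fixed time per record."""
    return n_records * minutes_per_record / 60


# Assumed relationship: total hours = records x minutes per record / 60.
records = implied_records(PERSON_HOURS_PER_YEAR, MINUTES_PER_RECORD)
print(f"Implied records per year: about {records:,.0f}")
```

Under that assumption, the quoted burden corresponds to roughly 1.1 million abstracted records per year.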

A major gap in the current CMS/Joint Commission hospital reporting and feedback is the lack of measures for some chronic lung diseases. Moreover, the COPD-specific performance measures currently used by various organizations are relatively few, and evidence for their effectiveness is scant (Heffner et al., 2010).

National Ambulatory Medical Care Survey (NAMCS)

The NAMCS is a national survey that collects data on the provision and use of ambulatory medical care services in the United States. Data from the NAMCS are used in health services planning. Data are collected from patient visits to non-federally employed, office-based physicians. The survey is a systematic random sample of patient visits based on records and does not include anesthesiologists, pathologists, or radiologists. Information is collected in multiple stages. First, the survey samples primary sampling units, physician practices, and patient visits. Second, NAMCS selects practicing physicians from a master file from the American Medical Association and the American Osteopathic Association, stratified by specialty. Finally, physician samples are divided equally, and every subsample is randomly assigned to a week of reporting over the course of a year. Interviewers visit physicians in person prior to survey participation and show them how to fill out the forms. Data are collected on diagnosis of ischemic heart disease (IHD), heart failure, hypertension, hyperlipidemia, diabetes, COPD, and asthma, and risk factor data and laboratory measures are extracted. Patient demographic variables in NAMCS include age, ZIP code, sex, ethnicity, and race.

NAMCS allows approximate estimation of the prevalence of diagnosed CVD, diabetes, and COPD and assessment of resource use patterns. It is also useful for monitoring trends in ambulatory care for these conditions. However, data are restricted to those who seek care in participating office-based physician practices and are thus not representative of the general population. Even for those settings, the survey does not include telephone contacts, house calls, visits made in institutional settings, or visits for administrative purposes only. This limits the use of this data source to generate reliable prevalence estimates or characterize health disparities.

National Hospital Ambulatory Medical Care Survey (NHAMCS)

Since 2001, the NHAMCS has collected data annually on ambulatory care services provided in noninstitutional, short-stay, and general hospital emergency rooms and outpatient departments as well as in ambulatory surgery centers. The survey does not include federal, military, and Veterans Administration hospitals. The survey covers all 50 states and the District of Columbia. A four-stage probability sample is used. Geographically defined areas are sampled, then hospitals within those areas are selected for survey. Clinics within outpatient departments are selected for the third stage; this includes all emergency service areas and ambulatory surgery locations. Finally, the survey samples patient visits from these locations (CDC, 2009a).

Specially trained interviewers visit the facilities before the survey to explain procedures, verify participation eligibility, develop a sampling plan, and instruct staff about how to gather information. Staff then complete the survey forms on patient visits using a systematic random sample within a randomly assigned 4-week reporting period. Survey instruments in NHAMCS are the patient record forms from three settings of care: the emergency department, the outpatient department, and the ambulatory surgery facilities. Data collected include diagnoses, diagnostic and screening services, procedures, medication therapy, types of providers seen, and disposition.
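The systematic random sampling of patient visits described above can be sketched as choosing a random starting point and then taking every k-th visit from the facility's visit log. The sampling interval, the visit identifiers, and the `systematic_sample` helper below are assumptions made for illustration; the actual NHAMCS sampling intervals and field procedures are not specified in this chapter.

```python
import random


def systematic_sample(items, interval, rng=None):
    """Take every `interval`-th item after a random start in [0, interval)."""
    if interval < 1:
        raise ValueError("interval must be a positive integer")
    rng = rng or random.Random()
    start = rng.randrange(interval)   # random start keeps selection unbiased
    return items[start::interval]


# Hypothetical log of 100 patient visits during a facility's reporting period.
visits = [f"visit-{i:03d}" for i in range(1, 101)]

# Sample roughly 1 in 10 visits; a seeded generator makes the draw repeatable.
sample = systematic_sample(visits, interval=10, rng=random.Random(0))
print(len(sample), "visits selected")
```

With 100 visits and an interval of 10, every possible starting point yields exactly 10 selected visits, spread evenly across the reporting period.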

NHAMCS includes detailed questions on chronic diseases, including a checklist of chronic conditions, participation in disease management programs, and diagnostic and screening services. The survey also collects information on each medication prescribed, as well as information on health education and non-medication treatment.

The NHAMCS provides the most current nationally representative data on outpatient care in the United States. It is the longest continuously running, nationally representative survey of hospital ED and outpatient use, which are major sources of ambulatory preventive care for lower income and Medicaid patients, and of specialty care for people with other types of insurance.

Like the NAMCS, the NHAMCS is not representative of the general population; rather, it represents a population of active outpatients, and the sampling frame of NHAMCS is not well defined. In addition, the NHAMCS surveys are designed primarily to provide national estimates. Although estimates by geographic region (Northeast, Midwest, South, and West) and metropolitan statistical area status are available, meaningful estimates cannot be made on a state-level basis.

Vital Statistics

In the United States, all deaths are legally required to be reported to health departments in the state where they occur, and data from these reports serve as a critical source of information on mortality trends and patterns. While monitoring of all-cause mortality is complete and reliable, the quality and utility of surveillance for cause-specific mortality patterns varies depending on the cause of interest, in part due to the reporting and coding processes. In nearly all states, physicians are required to record underlying and contributing causes of death on the death certificate, which are then coded by trained nosologists at health departments using a standardized international classification known as the International Statistical Classification of Diseases and Related Health Problems (currently in its 10th revision) (WHO, 2010). These data are compiled at the state and local levels, then shared voluntarily with the National Center for Health Statistics.

Studies have shown CHD listed on death certificates to have relatively low sensitivity and specificity compared with medical chart review or autopsy findings, and the majority of studies suggest that death certificates overreport CHD mortality (Agarwal et al., 2010; Coady et al., 2001; Lloyd-Jones et al., 1998; Sington and Cottrell, 2002). For example, the ARIC study found that death certificates overestimated CHD deaths by 20 percent compared with a physician review panel (Coady et al., 2001). A study of Framingham Heart Study participants found that death certificates attributed 24 percent more deaths to CHD than a physician panel reviewing medical records (Lloyd-Jones et al., 1998). Differences between in-hospital versus out-of-hospital deaths also have been identified. A study in New York City identified 50 percent overreporting of CHD deaths on death certificates among in-hospital deaths in persons aged 35 to 74 (Agarwal et al., 2010). In contrast, studies in Olmsted County have identified 5 percent underreporting of out-of-hospital CHD deaths (and 10 percent overreporting of sudden cardiac deaths) (Goraya et al., 2000).

Cause of death tracking also does not accurately reflect burden of disease for COPD-related mortality. A number of studies have shown that patients with severe COPD may not have COPD listed on their death certificate, despite respiratory involvement noted in their charts (Camilli et al., 1991; Mitchell et al., 1971). Using the U.S. National Center for Health Statistics data, Mannino and colleagues (1997) found that obstructive lung disease was underestimated in studies looking only at the underlying cause of death.

While errors in coding do occur, most misclassification of cause of death occurs when the cascade of health events leading to the death is improperly or incompletely reported by the physician and administrators completing the initial death certificate. Few physicians are adequately trained in identifying underlying and contributing causes of death for certifying fact of death (Lakkireddy et al., 2004). Also, in most hospitals, the providers who know most about the patient do not always complete the death certificate. Instead, available clinical and administrative staff members who may not know a decedent’s medical history are filling out the cause of death information (Messite and Stellman, 1996). In addition, providers may not have all relevant patient information available at the time. Lack of information is a particular challenge with out-of-hospital, dead-on-arrival, outpatient, and emergency department decedents. Similarly, deaths that occur among older adults with multiple comorbidities are at particularly high risk for being misclassified.

Despite these misclassification problems, surveillance of CVD and chronic lung disease mortality is important to monitor reductions in the burden and impact of chronic diseases on population health and for assessing improvements in treatment and management. To increase the utility of mortality data, cause of death recording needs to be improved. Training of medical residents, coupled with periodic retraining of practicing physicians on death certificate completion, can potentially improve the validity of cause of death recorded. Many states are now also adopting electronic death certification, a process that may introduce ongoing and potentially interactive training opportunities (Koppaka, 2010). Eventually the process may lead to automated updates in death certificates from medical records, potentially reducing physician misclassification. This has already undergone limited piloting in Iowa (Nangle, 2010). Electronic reporting of death certificates may also improve timeliness (Goff et al., 2007).

INTERNATIONAL CHRONIC DISEASE SURVEILLANCE

International data on trends in CHD mortality have been reported from the late 1960s through the mid-2000s (PAHO, 2002; Wei et al., 2010; Zhang et al., 2003). During this period, mortality rates from CHD decreased in some countries (e.g., Denmark, Australia, and Canada) but remained stable or increased in others (e.g., Hungary, Romania, and Korea). Few countries in Europe, Asia, Africa, or South America have local or regional CHD surveillance systems in place to systematically disentangle changes in CHD incidence from changes in mortality rates, or to assess the contribution of primary and secondary prevention efforts to changes in CHD mortality over time. A similar lack of data exists for monitoring trends in the population magnitude and impact of other chronic diseases, including diabetes, heart failure, pulmonary disease, and stroke. Moreover, contemporary data on the incidence and death rates of cardiovascular disease that could be used to systematically compare the changing magnitude and impact of these conditions among countries are essentially nonexistent. (See Appendix B for a list of international data collection efforts.)

Due to the lack of standardized collection of CHD mortality and incidence data, and limited availability of comparative information from a multinational perspective, the World Health Organization initiated the Multinational Monitoring of Trends and Determinants in Cardiovascular Disease (MONICA) project more than two decades ago. This ambitious project, which has provided extensive international insights into the descriptive epidemiology of CHD, examined changes over time in the incidence rates of fatal and nonfatal acute coronary events and in the primary risk factors for CHD. Forty-one MONICA centers in 21 countries, with only one U.S. center, collaborated in these monitoring efforts. However, the last point for data collection efforts of this observational study was in the mid-1990s. The need remains for the contemporary tracking of CHD and other chronic diseases and their risk factors in representative community samples.

A limited number of community-based investigations have been carried out during the past 25 years in the United States, Europe, Scandinavia, Australia, and New Zealand. They examined changes over time in the incidence and death rates from acute myocardial infarction and out-of-hospital deaths attributed to CHD. Several of these studies are either quite dated or are no longer collecting data. Each of the major population-based studies in this area has shown a net decline in the incidence rate of acute coronary events over the varying periods examined, with estimated declines of approximately 2 to 3 percent.

A number of European countries have developed national disease registries. In Portugal, such registries include those for acute coronary syndromes, percutaneous coronary interventions, and stroke; these registries contain both clinical and administrative data (Sousa et al., 2006). Sweden has more than 50 voluntary disease-based registries, developed by consensus of a given medical specialty. The registries are used to make comparisons over time so that performance indicators can be established, and hospitals may benchmark against a national database (Sousa et al., 2006). In the United Kingdom, the National Health Service has developed registries to provide open benchmarking of clinical outcomes and performance of specific institutions against a national comparator (Sousa et al., 2006). Like the current patchwork of chronic disease surveillance systems in the United States, European registries are often diffuse, lack interconnectivity, and fall short of the usefulness that might be achieved through nationwide harmonization.

An example of nationwide harmonization comes from The Health Improvement Network (THIN), a primary care database composed of electronic medical records from 446 general practices throughout the United Kingdom. It contains nearly 7 million patient records (http://www.thin-uk.com/). This database has been used for the surveillance of CVD and chronic respiratory diseases (Donaldson et al., 2010; Feary et al., 2010).

For COPD and chronic bronchitis, several large-scale international efforts have primarily assessed the prevalence of chronic pulmonary disease and its predisposing factors in population-based samples of adults residing in urban and rural settings; these studies have been carried out in a number of developed and developing countries of varying population size and characteristics. Many of these studies, however, were carried out in the distant past. Most have not provided information on the incidence rates of COPD and bronchitis and the major risk factors for these chronic conditions, and few have examined long-term trends in morbidity, mortality, functional status, or use of different treatment regimens in persons with chronic pulmonary disease.

Lessons Learned from International Studies of CVD and COPD

A considerable amount of useful clinical, epidemiologic, and policy-related information has been obtained from the design and conduct of CVD, COPD, and risk factor registries as well as surveillance systems and observational studies that have been carried out in developed and developing countries over the past several decades. The growing burden of chronic diseases worldwide has resulted in a renewed emphasis on surveillance of chronic diseases in developing countries (Alwan et al., 2010). Despite the extensive amount of data collected and disseminated from these investigations, sustaining these projects has been difficult because of funding constraints and the need for continued interest on the part of the investigators and funding agencies.

On the other hand, a number of national disease registries and chronic disease surveillance programs have been conducted in a more cost-efficient manner through the use of a unique personal identifier. A personal health identifier allows different computerized databases and files to be linked in order to bring together patient demographic, medical history, clinical, treatment, and outcomes data, which has greatly facilitated the design and conduct of population-based surveillance studies. Indeed, several developing countries are also either using or considering using a unique patient health identifier.
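The linkage described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each data source is a list of records sharing a field named person_id; that field name, the function link_records, and every sample record below are invented and do not describe any actual registry.

```python
from collections import defaultdict


def link_records(*sources):
    """Merge per-person records from several sources on a shared identifier.

    Each source is a list of dicts. Every dict must carry a 'person_id' key,
    a hypothetical stand-in for a unique personal health identifier.
    """
    linked = defaultdict(dict)
    for source in sources:
        for record in source:
            pid = record["person_id"]
            # On a field-name collision, the later source overwrites the earlier one.
            linked[pid].update(record)
    return dict(linked)


# Invented sample records, for illustration only.
demographics = [
    {"person_id": "A001", "birth_year": 1948, "sex": "F"},
    {"person_id": "A002", "birth_year": 1955, "sex": "M"},
]
hospital_discharges = [
    {"person_id": "A001", "discharge_diagnosis": "acute myocardial infarction"},
]
death_records = [
    {"person_id": "A001", "underlying_cause": "coronary heart disease"},
]

linked = link_records(demographics, hospital_discharges, death_records)
for pid in sorted(linked):
    print(pid, linked[pid])
```

Without a shared identifier, the same linkage would typically depend on matching names, birth dates, and addresses, which is slower and more prone to error.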

Surveillance studies of CVD and COPD in both the United States and abroad have shown the feasibility and utility of these surveillance systems. Data from these investigations have provided insights into the descriptive epidemiology of these chronic conditions and their predisposing factors; hospital and long-term outcomes and factors associated with a good or unfavorable long-term outlook; and use of different management approaches that could be linked to different outcomes in future comparative effectiveness studies. These studies have shown the type of data that can be realistically collected in the context of surveillance. Furthermore, they have provided insights into the minimum information that should be collected and the “wish list” type of information that might be collected either from direct personal interviews or computerized health databases.

Given ever-present economic uncertainties throughout the world, and the costs associated with developing and field testing surveillance systems, collecting information, and analyzing and disseminating surveillance-related findings, serious consideration needs to be given to streamlining the collection of pertinent data in future surveillance studies. Strong consideration also needs to be given to the scientific and cost efficiencies associated with the use of a unique personal health identifier and of standardized case definitions and data collection elements, so that results can be compared within and across different countries, regions, and locales. More specialized surveys can be developed at the local, community, or state level to address more narrowly defined geographic and socioeconomic disparities in CVD and COPD, providing more detailed insights into high-risk groups and areas in need of enhanced surveillance and/or intervention.

CONCLUSION

As this chapter has discussed, there are several different types of data sources collecting surveillance information for specific conditions, treatments, outcomes, or populations. National and local surveys provide powerful tools to inform on self-reported chronic conditions and on health behaviors that may help prevent chronic conditions or, alternatively, increase the risk of developing them. They also provide valuable information on disease management. There are several limitations with these surveys, however, including the inability of national surveys to calculate reliable local estimates unless sampling is designed to generate local data sets, the inability to link anonymous surveys to other information (or from children to parents), the limited amount of information on risk factors and outcomes for chronic lung disease, and little information on receipt of clinical services.

Registries can provide detailed clinical, demographic, and treatment-specific information on disease or procedure-specific populations or insured populations for monitoring the quality and quantity of care provided. But many registries focus on specific clinical populations, and this may result in incomplete information if the patient or beneficiary accesses care outside of the healthcare system. Hospital-based registries also include detailed information about a hospitalization but often no follow-up data. Finally, the decision to include a patient in a registry can be subjective as this decision is typically made by a clinician.

Prospective cohort studies can provide critical information for surveillance through the collection of incidence rates of temporal events (including exposures) and of clinical and patient-reported outcomes. However, prevalence information cannot be measured beyond baseline. Moreover, because longitudinal follow-up (a major strength of cohort studies) is resource intensive, the number of subjects participating will be limited, preventing the study of rare events or the study of smaller subgroups such as counties.

Critical surveillance indicators are also available in health services research data. These include hospitalization and readmission rates at the national, state, and local levels. Healthcare surveys have the ability to provide prevalence rates for cardiovascular and chronic lung disease risk factors and some outcomes. While there is as yet no universally accepted interoperable data platform for electronic health record data, incidence and prevalence information would be available from such records. However, like registry data, health services data exclude information extraneous to the healthcare delivery system.

The strengths of the current data systems for cardiovascular disease and chronic lung disease surveillance relate to the multiple and diverse informants used to monitor care: population-based surveys, patient-based surveys, provider-based surveys, and health services data. The weaknesses relate to the lack of integration of surveillance information obtained from the multiple informants and to the absence of focus on the life span of subjects. Another issue of concern is the lack of inclusion of incarcerated populations in current data collection efforts. According to Wang and Wildeman (2011), “mass incarceration affects not only disease surveillance but also studies of risk factors for the development of cardiovascular disease or tests of interventions to reduce disease in minority populations.”

Despite the limitations of existing data collection systems, they are powerful tools for the collection of surveillance information. There are also emerging approaches to data collection that can enhance existing efforts. The following chapter explores some of these emerging approaches.

REFERENCES

Agarwal, R., J. M. Norton, K. Konty, R. Zimmerman, M. Glover, A. Lekiachvili, H. McGruder, A. Malarcher, M. Casper, G. A. Mensah, and L. Thorpe. 2010. Overreporting of deaths from coronary heart disease in New York City hospitals, 2003. Preventing Chronic Disease 7(3):A47.

AHRQ (Agency for Healthcare Research and Quality). 2010. Registries for evaluating patient outcomes: A user’s guide. 2nd ed. Rockville, MD: Agency for Healthcare Research and Quality.

Alwan, A., D. R. MacLean, L. M. Riley, E. T. d’Espaignet, C. D. Mathers, G. A. Stevens, and D. Bettcher. 2010. Monitoring and surveillance of chronic non-communicable diseases: Progress and capacity in high-burden countries. The Lancet 376(9755):1861-1868.

Camilli, A. E., D. R. Robbins, and M. D. Lebowitz. 1991. Death certificate reporting of confirmed airways obstructive disease. American Journal of Epidemiology 133(8):795-800.

CDC (Centers for Disease Control and Prevention). 2001. Updated guidelines for evaluating public health surveillance systems: Recommendations from the guidelines working group. MMWR Recommendations and Reports 50(RR-13):1-35.

CDC. 2009a. Ambulatory health care data. http://www.cdc.gov/nchs/ahcd.htm (accessed March 8, 2010).

CDC. 2009b. 2010 Behavioral Risk Factor Surveillance System questionnaire. www.cdc.gov/brfss/questionnaires/pdf-ques/2010brfss.pdf (accessed May 27, 2011).

CDC. 2010. Data and statistics. Youth Risk Behavior Surveillance System. http://www.cdc.gov/HealthyYouth/yrbs/index.htm (accessed March 7, 2010).

Chassin, M., J. Kosecoff, D. Solomon, and R. Brook. 2004. How coronary angiography is used: Clinical determinants of appropriateness. RAND Corporation.

Chassin, M. R., J. M. Loeb, S. P. Schmaltz, and R. M. Wachter. 2010. Accountability measures—using measurement to promote quality improvement. New England Journal of Medicine 363(7):683-688.

CMS (Centers for Medicare & Medicaid Services). 2009. Roadmap for quality measurement in the traditional Medicare fee-for-service program. Baltimore, MD: Centers for Medicare & Medicaid Services.

Coady, S. A., P. D. Sorlie, L. S. Cooper, A. R. Folsom, W. D. Rosamond, and D. E. Conwill. 2001. Validation of death certificate diagnosis for coronary heart disease: The Atherosclerosis Risk in Communities (ARIC) study. Journal of Clinical Epidemiology 54(1):40-50.

D’Agostino, R. B., Sr., R. S. Vasan, M. J. Pencina, P. A. Wolf, M. Cobain, J. M. Massaro, and W. B. Kannel. 2008. General cardiovascular risk profile for use in primary care: The Framingham heart study. Circulation 117(6):743-753.

Davis, K., C. Schoen, and K. Stremikis. 2010. Mirror, mirror, mirror on the wall: How the performance of the U.S. health care system compares internationally. 2010 update. Washington, DC: The Commonwealth Fund.

Donaldson, G. C., J. R. Hurst, C. J. Smith, R. B. Hubbard, and J. A. Wedzicha. 2010. Increased risk of myocardial infarction and stroke following exacerbation of COPD. Chest 137(5):1091-1097.

Feary, J. R., L. C. Rodrigues, C. J. Smith, R. B. Hubbard, and J. E. Gibson. 2010. Prevalence of major comorbidities in subjects with COPD and incidence of myocardial infarction and stroke: A comprehensive analysis using data from primary care. Thorax.

Fonarow, G. C., and E. D. Peterson. 2009. Heart failure performance measures and outcomes. JAMA: The Journal of the American Medical Association 302(7):792-794.

Franciosa, J. A. 2004. The potential role of community-based registries to complement the limited applicability of clinical trial results to the community setting: Heart failure as an example. The American Journal of Managed Care 10(7 Pt 2):487-492.

Fried, L. P., N. O. Borhani, P. Enright, C. D. Furberg, J. M. Gardin, R. A. Kronmal, L. H. Kuller, T. A. Manolio, M. B. Mittelmark, and A. Newman. 1991. The cardiovascular health study: Design and rationale. Annals of Epidemiology 1(3):263-276.

Gliklich, R. E., and N. A. Dreyer. 2007. Registries for evaluating patient outcomes: A user’s guide (prepared by Outcome DEcIDE Center). Rockville, MD: Agency for Healthcare Research and Quality.

Goff, D. C., Jr., L. Brass, L. T. Braun, J. B. Croft, J. D. Flesch, F. G. R. Fowkes, Y. Hong, V. Howard, S. Huston, S. F. Jencks, R. Luepker, T. Manolio, C. O’Donnell, R. Marie Robertson, W. Rosamond, J. Rumsfeld, S. Sidney, and Z. J. Zheng. 2007. Essential features of a surveillance system to support the prevention and management of heart disease and stroke: A scientific statement from the American Heart Association councils on epidemiology and prevention, stroke, and cardiovascular nursing and the interdisciplinary working groups on quality of care and outcomes research and atherosclerotic peripheral vascular disease. Circulation 115(1):127-155.

Gold, M., A. H. Dodd, and M. Neuman. 2008. Availability of data to measure disparities in leading health indicators at the state and local levels. Journal of Public Health Management and Practice 14 (Suppl):S36-S44.

Goraya, T. Y., S. J. Jacobsen, P. G. Belau, S. A. Weston, T. E. Kottke, and V. L. Roger. 2000. Validation of death certificate diagnosis of out-of-hospital coronary heart disease deaths in Olmsted County, Minnesota. Mayo Clinic Proceedings 75(7):681-687.

Groseclose, S. L., K. M. Sullivan, N. P. Gibbs, and C. M. Knowles. 2000. Management of the surveillance information system and quality control data. In Principles and practice of public health surveillance. Vol. 2, edited by S. M. Teutsch and R. E. Churchill. New York: Oxford University Press. Pp. 95-111.

Hall, M. J., C. J. DeFrances, S. N. Williams, A. Golosinskiy, and A. Schwartzman. 2010. National Hospital Discharge Survey: 2007 summary. National Health Statistics Report (29):1-20, 24.

Heffner, J. E., R. A. Mularski, and P. M. Calverley. 2010. COPD performance measures: Missing opportunities for improving care. Chest 137(5):1181-1189.

IOM (Institute of Medicine). 1996. Primary care: America’s health in a new era. Washington, DC: National Academy Press.

IOM. 2009. State of the USA health indicators: Letter report. Washington, DC: The National Academies Press.

Jha, A. K., Z. Li, E. J. Orav, and A. M. Epstein. 2005. Care in U.S. hospitals—the hospital quality alliance program. New England Journal of Medicine 353(3):265-274.

Jha, A. K., E. J. Orav, Z. Li, and A. M. Epstein. 2007. The inverse relationship between mortality rates and performance in the hospital quality alliance measures. Health Affairs 26(4):1104-1110.

Jha, A. K., E. J. Orav, J. Zheng, and A. M. Epstein. 2008. Patients’ perception of hospital care in the United States. New England Journal of Medicine 359(18):1921-1931.

Joint Commission. 2008. Healthcare at the crossroads: Development of a national performance measurement data strategy. Oakbrook Terrace, IL: The Joint Commission.

Koppaka, V. 2010. Death registration in the 21st century: Challenges and opportunities. Presented at the 2010 national conference on health statistics. http://www.cdc.gov/nchs/ppt/nchs2010/26_Nangle.pdf (accessed December 13, 2010).

Lakkireddy, D. R., M. S. Gowda, C. W. Murray, K. R. Basarakodu, and J. L. Vacek. 2004. Death certificate completion: How well are physicians trained and are cardiovascular causes overstated? American Journal of Medicine 117(7):492-498.

Last, J. M. 2001. A dictionary of epidemiology. 4th ed. New York: Oxford University Press.

Lindenauer, P. K., D. Remus, S. Roman, M. B. Rothberg, E. M. Benjamin, A. Ma, and D. W. Bratzler. 2007. Public reporting and pay for performance in hospital quality improvement. New England Journal of Medicine 356(5):486-496.

Lloyd-Jones, D. M., D. O. Martin, M. G. Larson, and D. Levy. 1998. Accuracy of death certificates for coding coronary heart disease as the cause of death. Annals of Internal Medicine 129(12):1020-1026.

Lloyd-Jones, D. M., M. G. Larson, E. P. Leip, A. Beiser, R. B. D’Agostino, W. B. Kannel, J. M. Murabito, R. S. Vasan, E. J. Benjamin, and D. Levy. 2002. Lifetime risk for developing congestive heart failure: The Framingham heart study. Circulation 106(24):3068-3072.

Love, D., W. Custer, and P. Miller. 2010. All-payer claims databases: State initiatives to improve health care transparency. Issue Brief (Commonwealth Fund) 99:1-14.

Mannino, D. M., C. P. Brown, and G. A. Giovino. 1997. Obstructive lung disease deaths in the United States from 1979 through 1993. An analysis using multiple-cause mortality data. American Journal of Respiratory and Critical Care Medicine 156(3):814-818.

Mansi, I. A., R. Shi, M. Khan, J. Huang, and D. Carden. 2010. Effect of compliance with quality performance measures for heart failure on clinical outcomes in high-risk patients. Journal of the National Medical Association 102(10):898-905.

Mapel, D. W., F. J. Frost, J. S. Hurley, H. Petersen, M. Roberts, J. P. Marton, and H. Shah. 2006. An algorithm for the identification of undiagnosed COPD cases using administrative claims data. Journal of Managed Care Pharmacy 12(6):457-465.

McCarthy, D., S. K. H. How, C. Schoen, J. C. Cantor, and D. Belloff. 2009. Aiming higher: Results from a state scorecard on health system performance, 2009. New York: The Commonwealth Fund.

Messite, J., and S. D. Stellman. 1996. Accuracy of death certificate completion. JAMA: The Journal of the American Medical Association 275(10):794-796.

Mitchell, R., J. Maisel, G. Dart, and G. Silvers. 1971. The accuracy of the death certificate in reporting cause of death in adults. With special reference to chronic bronchitis and emphysema. American Review of Respiratory Disease (104):844-850.

Nangle, B. 2010. Interoperating death registration and electronic health record systems for more timely mortality surveillance. Paper presented at National Conference on Health Statistics, Washington, DC.

PAHO (Pan American Health Organization). 2002. The Central America diabetes initiative (CAMDI): Costa Rica, El Salvador, Guatemala, Honduras and Nicaragua. http://www.paho.org/Spanish/HCP/HCN/IPM/camdi-1.htm (accessed September 2004).

Pereira, M. A., D. R. Jacobs, L. Van Horn, M. L. Slattery, A. I. Kartashov, and D. S. Ludwig. 2002. Dairy consumption, obesity, and the insulin resistance syndrome in young adults. JAMA: The Journal of the American Medical Association 287(16):2081-2089.

Pronovost, P. J., and C. A. Goeschel. 2010. Viewing health care delivery as science: Challenges, benefits, and policy implications. Health Services Research 45(5p2):1508-1522.

Pronovost, P. J., M. Miller, and R. M. Wachter. 2007. The GAAP in quality measurement and reporting. JAMA: The Journal of the American Medical Association 298(15):1800-1802.

Rector, T. S., S. L. Wickstrom, M. Shah, N. Thomas Greenlee, P. Rheault, J. Rogowski, V. Freedman, J. Adams, and J. J. Escarce. 2004. Specificity and sensitivity of claims-based algorithms for identifying members of Medicare+Choice health plans that have chronic medical conditions. Health Services Research 39(6 Pt 1):1839-1857.

Shlipak, M., and C. Stehman-Breen. 2005. Observational research databases in renal disease. Journal of the American Society of Nephrology 16(12):3477-3484.

Singh, J. A. 2009. Accuracy of Veterans Affairs databases for diagnoses of chronic diseases. Preventing Chronic Disease 6(4):A126.

Sington, J. D., and B. J. Cottrell. 2002. Analysis of the sensitivity of death certificates in 440 hospital deaths: A comparison with necropsy findings. Journal of Clinical Pathology 55(7):499-502.

Song, Y., J. Skinner, J. Bynum, J. Sutherland, J. E. Wennberg, and E. S. Fisher. 2010. Regional variations in diagnostic practices. New England Journal of Medicine 363(1):45-53.

Sousa, P., M. Bazeley, S. Johansson, and H. Wijk. 2006. The use of national registries data in three European countries in order to improve health care quality. International Journal of Health Care Quality Assurance Incorporating Leadership in Health Services 19(6-7):551-560.

Wang, E. A., and C. Wildeman. 2011. Studying health disparities by including incarcerated and formerly incarcerated individuals. JAMA: The Journal of the American Medical Association 305(16):1708-1709.

Weddell, J. M. 1973. Registers and registries: A review. International Journal of Epidemiology 2(3):221-228.

Wei, J. W., J.-G. Wang, Y. Huang, M. Liu, Y. Wu, L. K. S. Wong, Y. Cheng, E. Xu, Q. Yang, H. Arima, E. L. Heeley, and C. S. Anderson, for the ChinaQUEST Investigators. 2010. Secondary prevention of ischemic stroke in urban China. Stroke 41(5):967-974.

Werner, R. M., and E. T. Bradlow. 2006. Relationship between Medicare’s hospital compare performance measures and mortality rates. JAMA: The Journal of the American Medical Association 296(22):2694-2702.

White, A. D., A. R. Folsom, L. E. Chambless, A. R. Sharret, K. Yang, D. Conwill, M. Higgins, O. D. Williams, and H. A. Tyroler. 1996. Community surveillance of coronary heart disease in the Atherosclerosis Risk in Communities (ARIC) study: Methods and initial two years’ experience. Journal of Clinical Epidemiology 49(2):223-233.

WHO (World Health Organization). 1969. Ischaemic heart disease registers. Copenhagen: World Health Organization, Regional Office for Europe.

WHO. 2010. International classification of diseases and related health problems (ICD-10). Geneva.

Williams, S. C., S. P. Schmaltz, D. J. Morton, R. G. Koss, and J. M. Loeb. 2005. Quality of care in U.S. hospitals as reflected by standardized measures, 2002-2004. New England Journal of Medicine 353(3):255-264.

Yarger, S., K. Rascati, K. Lawson, J. Barner, and R. Leslie. 2008. Analysis of predictive value of four risk models in Medicaid recipients with chronic obstructive pulmonary disease in Texas. Clinical Therapeutics 30:1051-1057.

Zhang, L.-F., J. Yang, Z. Hong, G.-G. Yuan, B.-F. Zhou, L.-C. Zhao, Y.-N. Huang, J. Chen, and Y.-F. Wu. 2003. Proportion of different subtypes of stroke in China. Stroke 34(9):2091-2096.
