E
Additional Examples of Surveys by Federal Agencies

This appendix provides a brief overview of the practices and experiences of a few illustrative federal surveys. It is intended to show that (1) the surveys have tackled and adequately dealt with survey methodological and operations management issues that were faced by NAOMS; and (2) often, repeated efforts over time, with sufficient resources, are needed before the survey can provide good-quality results. In fact, a first-rate federal survey that still uses its original design and data-collection procedures is rare. It is far more common that, even though initial efforts may be quite competent, excellence is attained over time. At different points during their evolution, these surveys gradually improve, but also sometimes—for example, in the case of the National Assessment of Educational Progress (NAEP) discussed below and the NCVS discussed in Chapter 3—undergo major design changes. Such changes have to be preceded by careful development and testing before implementation.

Another feature of successful government surveys is that they typically have a research component to support the investigation of issues or problems of particular import. Each of them also has a core staff dedicated to the survey’s ongoing improvement and adaptation to change. Put another way, these major government survey programs develop a professional organizational culture that fosters the approaches noted and also produces professional staff in both technical and administrative areas that are very knowledgeable about the particular survey’s issues and history.

NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS

The NAEP is an educational assessment that is administered to a probability sample of schoolchildren across the United States. It is conducted by contractors under the direction of the National Center for Education Statistics. It was initiated in 1969 to measure the performance of U.S. schoolchildren in mathematics, reading, and other subjects. The NAEP was established even though other student assessments were already ongoing at the district and state levels. The existence of these other data sources caused some to question the value of the NAEP and raised concerns that NAEP results might disagree with other data. In the end, it was determined that accurate evaluations of the effects of educational policies, and of change over time, could be made only through an assessment of the entire educational system, such as the NAEP provides, and not through the self-selected samples that the available testing programs offered.

The students are tested in each subject using sophisticated test conceptualization, test construction, administration, and scoring. The NAEP requires the combined and coordinated expertise of educators, statisticians,



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.



psychometricians, survey sampling statisticians, and methodologists. While the design and measurement processes are complex, the results must be reported in a manner that is useful to policy makers and researchers, yet comprehensible at some level to the general population of parents and teachers.

This complex system of multiple ongoing surveys, with its demand for high levels of both accuracy and timeliness, did not emerge full-blown on the first attempt. Over the four decades of its existence, almost every technical aspect of design and implementation has undergone revision while still, with some exceptions, maintaining trend measures of performance changes over time.

The NAEP provides several lessons for NAOMS. The nature of the concepts that the NAEP measures has changed over time. For example, the mathematics topics and skills assessed in 1970 are vastly different from those assessed in 2000. The availability of alternative performance measures based on other tests has also continued over this time, with the introduction of more state testing programs. The NAEP has evolved over time to accommodate these changes, while still regularly producing reliable data that allow trend estimation.

BEHAVIORAL RISK FACTOR SURVEILLANCE SYSTEM

The Behavioral Risk Factor Surveillance System (BRFSS), started in 1984, is managed by the Centers for Disease Control and Prevention and is another example of a survey designed to identify trends over time. It is designed to collect information about health behaviors (known to be related to particular health risks) of the general population of the United States. Unlike the NAEP, which is entirely a centralized enterprise, the BRFSS is designed and managed by the individual states, with federal guidelines. This approach came about for historical and political reasons, as well as for technical and resource considerations.
The BRFSS is a telephone survey of the general population that is repeated annually (through the aggregation of quarterly data), using a core instrument that is supplemented with state-selected questions. From a sample design perspective, the BRFSS has to be concerned with issues of frame coverage. The sample design omits cellphone-only households and, by definition, nontelephone households. These frame shortcomings are not stationary “targets,” but change over time. The cellphone-only households are increasing and are known to have generally younger residents. The number of nontelephone households varies somewhat with the economy, and the group is known to be disproportionately likely to contain young children. The estimation procedures try to extend results, in a defensible manner, to the nontelephone household population.

The survey relies on respondent self-reports of behaviors within a specified reference period. Some of the behaviors (for example, substance abuse) are socially undesirable and thus potentially subject to under-reporting. All of the behaviors are also subject to errors of memory and frequency. These various challenges have necessitated a program of methodological research. This research has included consideration of alternative estimators for extending results to the nontelephone population and of questionnaire design to find out how best to encourage the reporting of socially undesirable behaviors.

2001 NATIONAL HOUSEHOLD TRAVEL SURVEY

The National Household Travel Survey (NHTS), conducted for the Bureau of Transportation Statistics (BTS), measures both the daily and the long-distance travel of the U.S. household population. The NHTS is an example of a survey designed to update and build on information from prior survey series. The earlier surveys focused on either daily travel (the Nationwide Personal Transportation Survey) or long-distance trips (the American Travel Survey).
The NHTS was intended to cover both, as well as walking and biking trips. The survey was conducted from March 2001 through May 2002. Households were recruited by telephone. The recruited households were each sent a survey form and asked to report all travel by household members on a randomly assigned “travel day.” Telephone interviewers conducted a follow-up interview that asked respondents about their travel on the travel day and the preceding 27 days.

This survey has a number of design issues relevant to NAOMS. It required self-reports of travel, some requiring very detailed information, over a previous reference period. The survey also illustrates one way to deal with a major public event that occurs during the data-collection period and is clearly related to the survey measurements. The occurrence of such an event obviously presents challenges to satisfying the survey’s original goals. Such an event also adds a goal—to measure the impact of the unexpected event—that the survey was not intended to accommodate. In the case of the NHTS, the pre- and post-September 11 data were reweighted to make each group a nationally representative sample. Rather than using the exact event day (9/11), the groups were defined by a September 25, 2001, cutoff (though, clearly, some long-distance trips could not logically be divided by that exact date).

In addition to providing a case study of one way to deal with a major, unanticipated event, the mere possibility of such an event may have implications for the design of a survey such as NAOMS, perhaps supporting the use of relatively small sample replicates that can be more easily combined for estimates preceding and following a particular date than can a single annual sample. The altering of the sample design to accommodate an event that in most instances will not occur has to be balanced against the effect of the design change on other survey goals as well as on the cost of operations.

The NHTS illustrates a sort of hybrid design, in which the survey continues a previous survey series, in the sense of estimation objectives, but is not a replication of the previous surveys. The NHTS provides an example of how a survey series that proceeds intermittently might take advantage of what has been learned about the design challenges, extend the data series, and also add new measures.

SURVEY OF RESPIRATOR USE AND PRACTICES

The Survey of Respirator Use and Practices was an establishment mail survey conducted by a contractor for the Bureau of Labor Statistics (BLS) in 2001.
The objective was to obtain data that would permit an assessment of compliance with regulations and safe practices, as well as an estimation of the types of equipment in use under particular conditions and by different employee groups. The sample of establishments, stratified by type and size of firm, was selected from a BLS database. The identification of establishments eligible for the survey was based on information that these establishments had provided to BLS in an earlier, unrelated survey. It was unknown exactly which companies used respirators, but this prior survey information was considered a predictor of respirator use at a company. One effect of this consideration was to carry over nonresponses from the previous survey into the current one. Determining how fully the actual target population was covered was also problematic.

The questionnaire was designed by survey sponsors and pretested with cognitive interviews and a field test. Data collection was by mail addressed to a “knowledgeable person” at each establishment. This essentially first-time federal survey followed standard practices but still encountered a number of problems, some of them very similar to those identified in the NAOMS survey.

The following excerpts of findings from an NRC report summarizing a review of the survey¹ illustrate that first-time surveys often encounter similar problems, which, while not “fatal,” need to be corrected and can be corrected if the opportunity for subsequent waves of data collection is available:

• The survey was an important first step in collecting respiratory protection data from a probability sample.
• There was insufficient documentation and detail . . . to reconstruct key aspects of the methodology.
• The field test paid little attention to exploring validation procedures that might have provided information on the quality of data.
• There were several material weaknesses in the procedures for instrument testing.
• NIOSH [the National Institute for Occupational Safety and Health] did not set specific precision guidelines for key estimates.

These findings were followed by a series of recommendations to address the listed problems and other flaws should a follow-up survey be conducted at some future date.

¹National Research Council, Measuring Respirator Use in the Workplace, William D. Kalsbeek, Thomas J. Plewes, and Ericka McGowan, eds., The National Academies Press, Washington, D.C., 2007.
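The NHTS-style reweighting discussed above can be sketched in a few lines: the sample is split at a cutoff date, and each group’s base weights are rescaled so that each group independently sums to a national control total. The sketch below is a minimal illustration of that idea, not the actual NHTS procedure; the control total, weights, and interview dates are invented for the example (only the September 25, 2001, cutoff comes from the text).

```python
from datetime import date

# Minimal sketch (not the actual NHTS procedure): split a sample at a cutoff
# date and rescale each group's base weights so that the pre-cutoff and
# post-cutoff groups each sum to the same national control total.
# All numeric values below are invented for illustration.

CUTOFF = date(2001, 9, 25)   # cutoff date reported for the NHTS
NATIONAL_TOTAL = 1000.0      # assumed control total (e.g., households, in thousands)

def reweight_by_cutoff(records, cutoff=CUTOFF, total=NATIONAL_TOTAL):
    """records: list of (travel_day, base_weight) pairs.

    Returns adjusted weights, in the same order, such that the pre-cutoff
    and post-cutoff groups each sum to `total`.
    """
    pre_sum = sum(w for day, w in records if day < cutoff)
    post_sum = sum(w for day, w in records if day >= cutoff)
    return [
        w * total / (pre_sum if day < cutoff else post_sum)
        for day, w in records
    ]

sample = [
    (date(2001, 6, 1), 200.0),   # pre-cutoff interviews
    (date(2001, 8, 15), 300.0),
    (date(2001, 10, 2), 250.0),  # post-cutoff interviews
    (date(2002, 1, 20), 250.0),
]
weights = reweight_by_cutoff(sample)
pre_total = sum(w for (day, _), w in zip(sample, weights) if day < CUTOFF)
post_total = sum(w for (day, _), w in zip(sample, weights) if day >= CUTOFF)
# pre_total and post_total now each equal NATIONAL_TOTAL
```

Because each group is separately calibrated to the same national total, each can be analyzed as a nationally representative sample on its own, which is what makes pre- and post-event comparisons defensible.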
