





OCR for page 195
10 Putting Surveys, Studies, and Datasets Together: Linking NCES Surveys to One Another and to Datasets from Other Sources

George Terhanian and Robert Boruch

"Relations stop nowhere....The exquisite problem is eternally but to draw the circle within which they happily appear to do so."
Henry James, Roderick Hudson, 1876

This paper examines ideas about combining different datasets so as to inform science and society. It was prepared at the invitation of the National Research Council's (NRC) Board on Testing and Assessment so as to inform the board's deliberations about policy on education surveys in the United States. The surveys of paramount interest are those sponsored by the National Center for Education Statistics (NCES). The research reviewed here and the implications that are educed from it are directed first to the NRC. They are dedicated in the second place to the interests of the NCES. The third target is the social sciences community more generally.

Examples given here are drawn from a variety of sciences inasmuch as data linkage issues transcend academic disciplines. They are taken from different institutional jurisdictions because the issues cross geopolitical boundaries. Two studies are used to provoke discussion and to frame some issues: Hilton's (1992) Using National Data-bases in Educational Research and Hedges and Nowell's (1995) paper on national surveys of the mathematics and science abilities of boys and girls. We also depend heavily on other materials generated by NCES, the NRC, and others. This includes work, for example, on teacher supply, demand, and quality (National Research Council, 1992) and on integrating federal statistics on children (National Research Council, 1995). The minutes of the

NCES Advisory Council on Education Statistics reflect periodic interest in the way NCES surveys can be linked to one another or to data generated by other federal agencies (Griffith, 1992), and we exploit these also.

In what follows we begin with the two illustrations that help frame discussion. The pedigree of linkage is considered briefly, and the ubiquity of linkages in contemporary surveys is then discussed. Inasmuch as words such as linkage, merging, and so on are used differently in the research literature, the next section covers ways to clarify the language. Distinctions are further drawn between statistical policy for making surveys connectable and de facto policy in which post facto connections are difficult. Evaluating the products of any variety of linkages is important, and this topic is covered also, based on suggestions about mapping and registering linkage studies. In the next to last section of the paper we suggest exploring some new kinds of linkage. The paper concludes with a summary of the implications of this work.

TWO INTRODUCTORY ILLUSTRATIONS

The origin of Hilton's (1992) book was in a project undertaken by the Educational Testing Service (ETS) to understand whether different sources of statistical information, each based perhaps on a national sample, could be combined to produce a "comprehensive unified database" of science indicators for the United States. Sponsored by the National Science Foundation, the project's general goal was to improve the way we capitalize on data that bear on educating scientists, mathematicians, and engineers. The book's implications, inadvertent and otherwise, are important for designing NCES surveys, among others. Twenty-four education databases were reviewed by the project, including the Survey of Doctoral Recipients, national teacher examinations, and at least four massive longitudinal databases.
Only 8 of the 24 were deemed worthy of deeper examination. That is, the eight could be "linked" in some sense with others, given the resources available. They included the National Longitudinal Study of the Class of 1972 (NLS:72) and the National Education Longitudinal Study of 1988 (NELS:88), the Equality of Opportunity Surveys (1960s), cross-sectional systems such as the Scholastic Assessment Test (SAT), and the NCES National Assessment of Educational Progress (NAEP).

As Hilton made plain in the preface to his book, the project was "not feasible." Put more bluntly, the ETS effort to combine datasets was a flop despite competent and thoughtful efforts. The databases chosen for examination could not be used for the purpose considered (i.e., to produce a comprehensive science database). It was, nonetheless, a project noble in aspiration and diligent in its execution.

The questions posed in the Hilton project about the available databases, which are relevant to linking any datasets, seem important for designing new NCES surveys. Put in modified terms, the questions are as follows:

What variables are common to various databases?
What ways of measuring each variable, ways of sampling, and administration are common, making comparison (or linkage) among datasets easy?
What differences in ways of measuring, administering, and sampling make comparison (or linkages) dubious or difficult?
What can be done to fix different datasets so they are "comparable" (or linkable) in some way and therefore make it sensible to put them together?

The Hilton book contained no detailed catalog of why the databases failed to meet one or more of the criteria implied by these questions.

Hedges and Nowell (1995) attacked a different but related topic: understanding gender differences in mental abilities of various kinds based on disparate surveys. These authors chose to depend only on studies based on samples of roughly the same target populations and that purportedly measured the same abilities (e.g., reading). That is, they selected only studies that approached the first three questions above in similar ways. Their final selections included NCES-sponsored work, notably NELS:88, NLS:72, High School and Beyond (HS&B), NAEP (trend data only), Project Talent, and the National Longitudinal Youth Survey sponsored by the U.S. Department of Labor, among others. These are summarized in Table 10-1. We rely periodically on its contents in what follows.

There was sufficient commonality in what was measured on whom in the target populations in the Hedges-Nowell (1995) ambit to produce an informative analysis. It is a fine illustration of combining different datasets so as to learn whether males and females really differ on mental abilities and how they might differ.
For instance, the authors' dependence on well-defined national probability samples avoided the inferential problems encountered in earlier studies, notably depending on self-selected samples (as in SAT/ACT testing), idiosyncratic samples (e.g., in test norming), and distributional assumptions (to get at characteristics of extreme scores). A main product of Hedges and Nowell's work is learning that males are more variable than females in their tested intellectual achievement. This finding helps to elevate substantially the scientific conversation about the purported differences in the mean levels of math and science abilities of boys and girls. It helps to show how more variability among boys may produce specious claims about their ability relative to girls.

THE PEDIGREE OF EFFORTS TO PUT DIFFERENT DATASETS TOGETHER

The idea underlying any linkage effort undertaken by NCES or by others is that combining data from different sources can help us learn something new. More to the point, the combination permits us to learn something that cannot be learned from individual sources. The idea has fine origins. Alexander Graham

[Table 10-1, summarizing the surveys analyzed by Hedges and Nowell (1995), appeared here; its contents are not recoverable from this text.]

Bell, for instance, exploited the notion in his study of genetic transmission of deafness. In the late 1880s he depended on completed Census Bureau interview forms found strewn in a government building basement and linked these to genealogical records from other sources (Bruce, 1973). One can also trace the theme to John Graunt's effort in the seventeenth century to learn how to use records in the Crown's interest. Graunt exhorted the King to understand his empire through a lens consisting of compilations of records in statistical form: the counts of soldiers at arms, for instance, from one source and the numbers of births, deaths, and so on from other sources. Scheuren (1995), similarly thoughtful and exhortative, has reviewed and refreshed our thinking about how to augment administrative records and understand them better through surveys.

The pedigree of linkage studies is also reflected in contemporary efforts to evaluate social programs. In studies of manpower training and employment, for example, it has become common to link the employment records on specified individuals to their program records and to link these data in turn to research records on individuals (Rosen, 1974). In agriculture, health, and taxation, there have been fine studies of why and how one ought to couple data from different sources in a variety of ways (Kilss and Alvey, 1985).

From papers by Scheuren (1995) and others we learn about the contemporary history of record linkage algorithms (developed by Tepping and Fellegi-Sunter, among others), the construction of matching rules and the information exploited in matches, the idea of linkage documentation, and various approaches to adjusting for mismatches. We can learn about the role of privacy issues and statistical analysis implications from a related body of work (e.g., Cox and Boruch, 1988).
We learn about appraising the benefits and costs of linkage of administrative records, or the difficulty of doing so on account of sloppy practice, from aggressive investigatory agencies such as the U.S. General Accounting Office (1986a, 1986b).

The title of Hilton's book, Using National Data-bases in Educational Research, may suggest to some readers that they can learn something about whether, why, and how massive studies are combined and used. In fact, recent work on how to enhance the usefulness of statistical data is pertinent. Some of it has been economically oriented: for instance, Spencer's work (1980) on cost-benefit analysis to allocate resources to various data collection efforts and the follow-up papers by Moses, Spencer, and others. Scholarly papers on why and how social research data, including educational and health research data, are used are also relevant. Kruskal's volume (1982) is a gem on this account.

The analyses in Hilton's book were not burdened by the history of linkage. That is, the authors failed to put the ETS linkage studies into the larger context of such studies or the still larger context of design and exploitation of databases and surveys. We learn about attempts to link the Armed Services Vocational Aptitude Battery to tests given in the longitudinal HS&B survey and to SATs, but we are

not told about how this would enhance science indicators or inform decisions or, more importantly, improve the design of surveys. Similarly, the Hedges and Nowell (1995) paper does not consider implications of the work for the design of better surveys that can be linked in any respect, despite the fact that the authors are sensitive to the implications of their work on other accounts.

THE UBIQUITY OF PUTTING DIFFERENT DATASETS TOGETHER AND FUNCTIONAL CATEGORIES

Some varieties of linkage are common, even pedestrian. So frequently do they occur that they are taken for granted. Other varieties of linkage are not encountered often. They may be undertaken for reasons that seem obviously important or, to the lay public, obscure or trivial. This section provides illustrations of linkages, pedestrian and otherwise. The examples are put into categories that have meaning for scientists and an informed public: national probability sample surveys, longitudinal studies, studies of the quality of data, intersurvey consistency, and hierarchical studies.

National Probability Sample Surveys

Virtually all national probability sample surveys in this country and elsewhere are exercises in combining information from different systems. Telephone surveys often draw on a population listing of telephone numbers. A population census may draw on an address list for dwellings. The NCES Schools and Staffing Survey, for instance, depends on lists of schools identified as administrative units or locations. List information is used to construct the sample. Listed information is often combined in the same microrecord with the information provided by the respondent.

Longitudinal Studies: Tracking Change

Any longitudinal survey involves linkage at a basic level. Microrecords obtained on individuals or institutions at one point in time are linked to those obtained subsequently, as in NELS:88, NLS:72, and HS&B.
The organization responsible for each wave of the survey may vary, of course, as when NCES used different contractors. Target populations, variables, and their measurement may also differ somewhat between waves.

Studies of the Quality of the Data

Any postenumeration survey of a national census and most post facto studies of the quality of a large survey employ linkage. Microrecords in the main initial survey, for instance, are compared to those generated in a more intensive, smaller,

and presumably more accurate study of a subsample of the original target population. Efforts to estimate reliability of achievement tests focus on stability of individual scores over time; individual scores must be linked across time. Finally, many if not most studies of the validity of respondent reports in surveys rely on two or more sources of information on the trait or characteristics of interest. Enrollment records in colleges may be compared to self-reported enrollment information in a sample of students receiving subsidized loans.

In the federal statistics arena, most studies of response quality or measurement error require linkage and are described regularly in the professional literature. Scholarly reports usually appear, for instance, in the annual Proceedings of the Section on Survey Methods of the American Statistical Association and in reports issued by the federal agency that sponsored the work. It is disconcerting to see little representation of municipal statisticians in these Proceedings and reports. It is not clear why their contribution is sparse, and the matter deserves a bit of researchers' attention.

Intersurvey Consistency

The NCES has conducted a Private School Survey (PSS) independent of a special supplement to the Schools and Staffing Survey (SASS). SASS has depended on the PSS for a sampling frame of schools, using a basic form of linkage. More generally, both the supplemented SASS and PSS have provided estimates of the numbers of schools, teachers, and students in the private sector. The two surveys are normally run at different times and measure some of the same variables. On at least one occasion both were run in the same year (Holt et al., 1994). The results of the two surveys may or may not agree, differences in time frame being one possible reason for discordance. The occasion of a PSS and a SASS supplement in the same year permitted NCES to investigate the consistency between them.
At times, then, NCES depends on applying algorithms to SASS that reweight subgroups' totals of schools, teachers, and students in various categories so as to produce overall group totals that are consistent with PSS group totals. A "group" here might be a type of private school (e.g., Catholic). "Linkages" here are of two kinds. First, the PSS is used as the sampling frame for SASS. Second, the memberships of schools in subgroups are supposed to be identical in PSS and SASS, and a linkage between the two is required for estimating new sampling weights.

Consider next the problem of assuring that a school's locale is properly identified as a large city or as midsized, as urban or suburban, and so on. Each year NCES attempts to record every school and its locale through the annual Common Core of Data (CCD) survey. Census Bureau data are used in the CCD to identify locales according to seven well-defined locational categories employed by the bureau. Every two to five years SASS is run, targeting a sample of schools. In

this effort SASS also elicits information on locales using a simplified question involving eight categories or responses. A challenge lies in reconciling the two sources of information about school locale (Johnson, 1993; Bushery et al., 1992). Reconciliation of the SASS and CCD files then involves linkage. Such studies reveal, for instance, that roughly 70 percent of SASS reports on locales are correct, that 87 percent of Census classifications are accurate, and that the most common discordance lies in the suburban categories. More important, note that both data sources are imperfect in different ways. This makes linkage-based reconciliation studies essential to assuring the quality and interpretability of the survey results.

Reconciliation studies that illuminate the discrepancies that might be found between two or more independent surveys are important. It would be discomfiting to find a 10 percent difference in the number of teachers in the United States based on one NCES survey, for example, in contrast to another NCES survey undertaken independently and within a year or two of the first survey. The differences between results of two independent surveys may be a matter of sampling error. Or they may be substantial and attributable to differences in questionnaire wording, definitions, and sampling frame. Being able to link records so as to understand the discrepancies is essential.

Linkages may be at the entity level, such as a school, school district, or state. Or they may be at the individual level, as when teachers respond to a questionnaire about their career in teaching. Consider the following examples based on Kasprzyk et al. (1994) and Jenkins and Wetzel (1994).

Discrepancies between independent surveys of institutions, such as "schools," occur for a variety of reasons. For instance, some commercial firms define schools in terms of their physical locations.
The CCD defines schools in terms of administrative units, two or more of which may be lodged in the same location. These differences are relevant to sampling frames and to results of surveys, of course. Careful analyses are done to assure that discrepancies and their implications are understood.

Furthermore, estimates of the number of teachers in each state may be based on SASS or on state-generated counts for CCD. The estimates may and do differ at times for some states. For instance, overestimates of 15 percent in nine states appeared in the 1990 to 1991 SASS for a variety of reasons. One such reason was the questionnaire wording used in each survey. A respondent in the CCD would report on a unit involving grades kindergarten through 6; the SASS respondent might report on kindergarten through 6 and on grades 7 and 8. Postprocessing edits helped reduce discrepancies.

Hierarchical Studies

Once said, it is obvious that any survey of schools, teachers in schools, and students assigned to particular teachers must involve a basic linkage of microrecords to be useful as a hierarchical study. That is, one must be able to link each child to his or her teacher and each teacher to the school that the teacher serves. Research on the problem of doing such work in the context of SASS has been conducted since at least the early 1990s (King and Kaufman, 1994). Partly because such work often involves ex-ante design, rather than ex-post facto record linkage, difficulties in linkage appear to be ordinary. Rather, estimation issues appear to be difficult.

Of course, many more levels of linkage are possible. The Third International Mathematics and Science Study (TIMSS) is an obvious example. It involves no temporal linkage of the kind that longitudinal studies require. It does involve sampling test items in each child, sampling classrooms in schools, sampling schools in each nation, and a nonprobability sample of nations. Thousands of instances of linkage of diverse kinds are entailed in such a study.

WHAT DOES "LINKAGE" MEAN?

Vernacular in the sciences is not as uniform as one might expect. Recall, for instance, debates over what constitutes a gene or genome in the Human Genome Project. Discussions about integrating or linking data in the social sciences also are affected by dialect differences. We discuss illustrations below and then dimensionalize the idea of linkage. The focus is on units whose records are to be linked, the populations from which units are sampled, and the variables that are measured on these units and other matters. All in what follows depends on learning from others about what linkage has meant in the context of work sponsored by NCES and others.

Vernacular and Definitions in Education Statistics

The Hilton (1992) book's vernacular is sufficiently different from technical parlance in related areas to confuse some readers. For instance, there are repeated references to "linking" and "merging" of different databases, but these terms are undefined.
Further, the book's use of these words is at times not the same as is customary in contemporary statistical work. For instance, linkage is defined, in effect and occasionally, as combining microrecords based on a common identifier for the same person or entity. At times the book's use of the word link is to imply an intention to "put together." At other times the word link means to stratify the units in each database in the same way (e.g., high ability, Hispanic, and so on) in order to look at how frequencies in these strata change over time on a dimension such as persistence in studying science. The word merge is also used to describe putting different records together, records that may or may not have a common source.

The phrase "pooling data" was used by Hilton (1992) and has been used by others in the sense of doing a side-by-side comparison of statistical results from each of several different datasets. This phrase is not used in a way that some

readers would expect. For some analysts "pooling data" means combining the data from two or more samples of the same population into one that can be analyzed as a complete sample. For others it means combining the results from samples of different populations.

Finally, consider another more recent example. Bohrnstedt (1997) uses the words link, integrate, and connect in a thoughtful essay entitled "Connecting NAEP Outcomes to a Broader Context of Educational Information." His use of these terms, at first glance, is instructive. The conscientious reader might observe, for example, that Bohrnstedt makes a careful distinction between link and integrate. He refers, for example, to the "integration" of CCD information with NAEP data, and he discusses the possible "linkage" of NELS:88 and NAEP data. The reader who also possesses some knowledge of what these datasets contain might then conclude that two datasets can be linked, at least in the context of education, if both involve the assessment of achievement or performance. This reader would be mistaken, though. As Bohrnstedt concludes, he uses the term link when referring to CCD/NAEP integration: that is, he substitutes link for integrate. The word connect does not reappear in the paper's prose. What are the implications of this example? Especially in creative efforts such as Bohrnstedt's, the precise meanings of such words as link, integrate, and connect ought to be made plain.

Vernacular in Other Sciences

Work on genes and genomes engenders problems of differences in labeling the object of attention in context. For instance, a gene for one species may be called something different from the same gene in another species. Given the remarkable growth in genetic research, including the number and size of genome sequence databases, this is not a trivial matter (Williams, 1997).
Similarly, scientists have begun to build a World Wide Web-oriented database on gene mutations as a part of the Human Genome Project effort. A feature of the design problem is to agree on what to call a mutation. "The nomenclature is nearly agreed on . . . (with) the systematic name . . . based on the nucleic acid change and . . . the common name based on the amino acid change" (Cotton et al., 1998:9). The Internet will be used to further explicate and debate.

The vernacular problem is not confined to the life sciences. It extends to mathematics. "Computation," for instance, was heralded in a recent Science piece on bridging databases. In fact, basic statistical analyses, rather than computations, were the main topic: understanding how to estimate relationships when there are many errors attributable to sampling and measurement (Nadis, 1996). The lead on an interesting letter to Science was entitled "Bioinformatics: Mathematical Challenges" (Grace, 1997). Yet the letter concerns what is now regarded as a conventional statistical analysis approach to understanding the structure underlying data (i.e., analysis of variance), developed by two scholars who admired and exploited mathematics, R.A. Fisher and O. Kempthorne.

Science has also carried excellent articles with headings such as "Digital Libraries" (e.g., Nadis, 1996), "Letters" (Cotton et al., 1998), and "Bioinformatics" (Williams, 1997). They all deal with the names of things. But such papers are not easily found in any Web- or library-based search based on a single keyword. One of us had to review the articles published over a five-year period to get the connection.

Implication: Understanding and Standardizing Nomenclature

One of the implications of this vernacular problem for NCES is that discussion, analysis, and agreement on terminology are in order. Because there has been little standardization in educational statistics produced at the state level, in recent years NCES has played a leadership role in getting state education agencies to agree to common definitions in statistical reporting. Witness the rough consensus on using two or three definitions of "dropout," for example. Witness also the NCES surveys of how public schools ask about students' race and ethnicity and the stupefying variety in measurement that then impedes better thinking.

NCES can play a related role here and refresh the roles taken at times by the Internal Revenue Service's Statistics of Income division, the Census Bureau's methods division, and others. That is, NCES can help make plain what we mean by "combining" datasets or surveys; "connecting" them; "linking" microrecords, datasets, or surveys; "pooling" datasets or surveys; "integrating" surveys or statistical systems; "unified databases"; and "merging" files. In other words, putting things together. Absent explicit definitions of what these words mean, reaching mutual understandings in the statistical and political communities will be difficult or impossible. Most importantly, designing surveys so that they can be linked, compared, merged, and so on will be impossible. NCES can be a leading agency in this effort.
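To make two of these terms concrete, the sketch below contrasts "linking" in the sense of combining microrecords on a common identifier with "pooling" in the sense of stacking samples of the same population into one larger sample. It is only an illustration of those two readings of the words as discussed above, not a definition endorsed by NCES or by any of the authors cited. The record layouts, identifiers, and function names are hypothetical.

```python
# Hypothetical microrecords; identifiers and field names are invented for illustration.
transcripts = [
    {"student_id": "S1", "gpa": 3.4},
    {"student_id": "S2", "gpa": 2.9},
]
questionnaires = [
    {"student_id": "S1", "plans_college": True},
    {"student_id": "S3", "plans_college": False},
]


def link_microrecords(left, right, key):
    """Link: combine records that share a common identifier for the same unit."""
    right_by_key = {rec[key]: rec for rec in right}
    linked = []
    for rec in left:
        match = right_by_key.get(rec[key])
        if match is not None:
            # Only units present in both sources yield a linked record.
            linked.append({**rec, **match})
    return linked


def pool_samples(*samples):
    """Pool: stack samples of the same population into one larger sample."""
    pooled = []
    for sample in samples:
        pooled.extend(sample)
    return pooled


linked = link_microrecords(transcripts, questionnaires, "student_id")
pooled = pool_samples(transcripts, [{"student_id": "S9", "gpa": 3.1}])
print(linked)
print(len(pooled))
```

Linking narrows the file to units found in both sources and widens each record; pooling lengthens the file and leaves each record as it was. Keeping the two operations in separate, named functions is one way to force the distinction that the vernacular tends to blur.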
Dimensions of Linkage

One way of arranging the way we think about linkage is to depend on the elements used in designing conventional statistical surveys. Consider then the ideas of units of sampling, populations, and variables in this context and extensions of the ideas.

Units: Individuals, Entities, or Both

Records on an individual may be linked, as when a child's school transcript is linked to the child's responses to a survey questionnaire, as in High School and Beyond. Or responses on one wave of the HS&B may be linked to responses on subsequent waves, as in any longitudinal study. Similarly, a child or parent's

[...] counts are a stereotypical device for characterizing value, but other approaches can be exploited. For instance, the Proceedings of the American Statistical Association is not viewed by some as a scholarly journal. Nonetheless, the work products published therein are fundamental to our understanding of what goes on at NCES and other statistical agencies. NCES's planned journal, and other peer-reviewed journals, may publish works that appeared earlier in the Proceedings. But it would be as foolish to rely on the journals alone as it would be to ignore the Proceedings.

SOME LINKAGE OPTIONS IN EDUCATION STATISTICS

There is no formal, well-articulated "linkage policy" at NCES or any other statistical or research agency in the United States. We are aware of no such policy in Sweden, Israel, France, the United Kingdom, Japan, or Germany. Absent formal policy, identifying viable and interesting examples of what is desirable is a dubious objective. In what follows we suppress our ambivalence and discuss what might be desirable. Each suggestion for the future ought to be considered in light of our earlier suggestions in this paper about evaluation and vernacular.

Linking NCES Surveys

Several of the NCES datasets mentioned earlier, including NAEP, SASS, NELS:88, and CCD, contribute in distinct ways to the research and policy-making communities' understanding of a variety of important educational issues. NAEP, for example, generates national and subnational estimates of achievement in core subject areas on a regular basis. SASS, on a somewhat less regular basis, produces a wealth of information concerning teacher supply, demand, quality, and, more generally, conditions in schools. NELS:88 allows researchers to test myriad hypotheses bearing on how, and how well, students learn over time. And the CCD provides general information on the nation's universe of school districts and schools on an annual basis.
Are These Datasets "Puzzle Pieces" that Fit Together Neatly?

Despite their unique contributions, these NCES surveys are not pieces of an education puzzle that fit together neatly. On the contrary, certain pieces seem broken, several duplicate pieces exist, some pieces are inexplicably missing, and a few new pieces are produced so slowly that they appear to be altogether lost. Examples are given in what follows.

Broken Pieces: Example 1

Terhanian (1997) analyzed 1994 NAEP data in the interest of developing a deeper understanding of the relationship between school expenditures and student reading proficiency. To obtain school expenditure information for his analysis, Terhanian linked CCD district information (which he then converted to per-pupil values) with NAEP district, school, teacher, and student information. The task of linking CCD and NAEP data was by no means straightforward or seamless, however, because the NAEP dataset did not include the CCD unique identification code for participating school districts or schools. Yet, as Terhanian discovered inadvertently, the NAEP dataset did include the two "broken" pieces (i.e., separate variables) of the unique district code. By simply concatenating the two, Terhanian was able to create the one variable that was necessary to augment the NAEP data with CCD data.

A Peculiar Irony

NCES does not provide researchers with instructions on how to "fix" the "broken" pieces in the NAEP user's manual. Nor do NCES representatives actively publicize the presence of these pieces. It is perhaps for these reasons that scholars who focus on NAEP's improvement often recommend linkage with the CCD. They simply do not realize that the two datasets are already linkable, albeit with difficulty.

Duplicate Pieces: Example 2

Several NCES datasets, including NELS:88, SASS, and NAEP, include questions about school quality, teacher experience, and other common areas that concern policy makers and researchers. In some cases the exact same questions, or very similar ones, appear on different surveys. In other cases, however, questions about the same topic are phrased so differently across surveys that it is impossible to compare responses. Understanding NCES's rationale here is not as complicated as it seems.
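As an aside on Example 1 above, the repair described there (concatenating two separately stored pieces of a district code and then using the result to attach CCD information to NAEP records) can be sketched as follows. This is a minimal illustration under stated assumptions, not Terhanian's actual procedure: the source does not name the two NAEP variables or describe the layout of the CCD code, so every field name, code value, and dollar figure below is hypothetical.

```python
# All field names and values are hypothetical; the source does not name the
# two NAEP variables or give the layout of the CCD district code.
naep_schools = [
    {"school": "A", "code_part1": "42", "code_part2": "01230"},
    {"school": "B", "code_part1": "06", "code_part2": "00450"},
]
ccd_districts = {
    "4201230": {"expenditure": 9_000_000, "enrollment": 1500},
}


def district_code(record):
    """Concatenate the two stored pieces into one district identifier."""
    # Keep the pieces as strings so that leading zeros are not lost.
    return record["code_part1"] + record["code_part2"]


def augment_with_ccd(naep_records, ccd_by_code):
    """Attach per-pupil district expenditure to each record that finds a match."""
    augmented, unmatched = [], []
    for rec in naep_records:
        code = district_code(rec)
        district = ccd_by_code.get(code)
        if district is None:
            # Report failed links instead of dropping them silently.
            unmatched.append(rec["school"])
            continue
        per_pupil = district["expenditure"] / district["enrollment"]
        augmented.append({**rec, "district_code": code, "per_pupil": per_pupil})
    return augmented, unmatched


augmented, unmatched = augment_with_ccd(naep_schools, ccd_districts)
print(augmented[0]["district_code"], augmented[0]["per_pupil"])
print(unmatched)
```

Returning the unmatched records alongside the augmented ones is a deliberate choice in this sketch: a linkage that reports its failures makes the kind of match-rate problem described later in this section visible at once.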
No one at NCES is charged with the responsibility of coordinating the various surveys, many of which run during the same year, at the microlevel. That is, no one really knows which questions are on which surveys, much less how they got there. We believe there is a better way.

Missing Pieces: Example 3

Linkage efforts are less successful than planned at times because puzzle pieces are missing. In the 1992 NAEP eighth-grade national math assessment, for instance, only about 60 percent of 8,300 math teachers could be linked correctly to their students. Data were completely missing for 35 percent of the total sample of teachers and partly missing for another 5 percent. Attempts by researchers to shed light on the relationship between teacher characteristics and student achievement, then, could only flop. NCES and its contractors seem to have corrected the within-school linkage problem: the teacher/student match rate improved appreciably for the 1994 and 1996 NAEP assessments. The ability of NCES and its contractors to learn from such failures certainly bodes well for the future.

Lost Pieces: Example 4

NCES datasets are not always produced expeditiously. Instead, some datasets, notably the CCD, are produced so slowly that they appear to be altogether lost. This not only diminishes the usefulness of the CCD to researchers and others but also adds to their frustration. Consider, as an example, the case of the JASON Foundation for Education. In 1997 the foundation developed a promising method to deliver science instruction via the Internet to middle school students. At the same time, it developed a simple registration process for potential participants that exploited the interactive nature of the Internet and relied on information from the 1992 to 1993 CCD. In order to register for the pilot program, participants had to first identify their school district from a menu of districts and then their school from a menu of all schools in their district. After they did so, additional information about the district and the school populated several data fields on the registration page. JASON then asked potential participants to complete the registration form by confirming or editing the CCD information that populated the data fields. From start to finish, the entire process should have taken less than five minutes.

The registration process turned out to be flawed, however, because a nontrivial percentage of CCD information was either obsolete or missing (i.e., it seemed "lost").
For this reason about 10 percent of the first several hundred JASON registrants could not find their school districts or schools listed among those on the registration Web site menu. Others who were able to find their school districts or schools often felt obligated to correct dated information (e.g., number of students in the school). The registration process turned out to be a burden for respondents despite the good intentions of the folks at JASON.

What does this example of a "lost piece" imply for NCES? If researchers and others are to rely on the CCD, NCES must ensure that data are collected and compiled more expeditiously. Comparing the pace of the current collection and compilation process to that of the movement of a glacier, regardless of the cause (e.g., state officials possess no obvious incentive to provide NCES with information in a timely manner), seems fair.
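The repair described in Example 1 (concatenating two separate variables to rebuild the district code, then augmenting NAEP records with CCD values) and the match-rate problem in Example 3 can be sketched roughly as follows. This is a minimal illustration, not the procedure Terhanian actually used: the field names `id_part_1` and `id_part_2`, the code values, and the dollar and enrollment figures are all hypothetical and do not correspond to real NAEP or CCD variables.

```python
# Hypothetical NAEP-style records. The two ID fields stand in for the
# "broken pieces" of the district code; names and values are invented.
naep_schools = [
    {"school": "A", "id_part_1": "42", "id_part_2": "01234"},
    {"school": "B", "id_part_1": "36", "id_part_2": "00567"},
    {"school": "C", "id_part_1": "42", "id_part_2": "99999"},  # absent from CCD
]

# Hypothetical CCD-style district table keyed by the full district code.
ccd_districts = {
    "4201234": {"expenditure": 52_000_000, "enrollment": 8_000},
    "3600567": {"expenditure": 30_000_000, "enrollment": 4_000},
}


def build_district_id(record):
    """Concatenate the two pieces as strings so leading zeros survive."""
    return record["id_part_1"] + record["id_part_2"]


def link(naep, ccd):
    """Attach per-pupil expenditure to each record; report unmatched schools."""
    linked, unmatched = [], []
    for rec in naep:
        key = build_district_id(rec)
        district = ccd.get(key)
        if district is None:
            unmatched.append(rec["school"])
            continue
        per_pupil = district["expenditure"] / district["enrollment"]
        linked.append({**rec, "district_id": key, "per_pupil": per_pupil})
    return linked, unmatched


linked, unmatched = link(naep_schools, ccd_districts)
match_rate = len(linked) / len(naep_schools)
print(f"matched {len(linked)} of {len(naep_schools)} schools; unmatched: {unmatched}")
```

Keeping the pieces as strings rather than integers is a deliberate choice in this sketch, since numeric conversion would silently drop leading zeros and break the key. Reporting the unmatched records alongside the match rate makes a "missing pieces" problem visible before any analysis is run.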

What Combination of NCES Data Is Available and at What Linkage Level?

For any randomly chosen public school in the United States, the CCD is likely to be the only NCES information source available to researchers and policy makers. Absent a change in how NCES designs its surveys, there is little reason to expect some nontrivial combination of CCD, SASS, NAEP, and NELS:88 data to be collected during the same year for a meaningfully representative sample of schools. This is despite the fact that some combination of these data would, in our opinion, better serve the research and policy-making communities.

Table 10-2 displays crudely the current linkages among and between the NCES datasets mentioned here. It also describes the level at which these datasets are currently linkable. What are the current research implications of these potential linkages on analysis? It is possible to link some combination of CCD (e.g., core per-pupil expenditures of the Amarillo Independent School District), SASS, NAEP, and NELS:88 information at the district level in a given year. See Terhanian (1997), Wenglinsky (1997), and Taylor (1997) for recent examples of analyses that have exploited some combination of these linkage opportunities. It is also possible, in some cases, to link CCD, SASS, and NELS:88 at the school level in a given year. About 23 percent of the schools in which the sample of NELS:88 students were enrolled in both 1990 and 1992, for instance, also participated in the 1990 to 1991 wave of SASS. CCD information, then, is also available for these schools during these years.

The value of linkage may seem trivial to researchers who wish to carry out analyses of student or school samples that are representative of the nation or states. The implications for the design of future surveys, however, are perhaps less trivial.
Just as we recommended that NCES or some other thoughtful federal agency develop a map or maps of variables across surveys, we also suggest that they consider doing so for the actual surveys they sponsor. The object of mapping is to better understand how the education puzzle pieces fit together, what pieces are missing, and what pieces are needed to better complete the puzzle.

TABLE 10-2 Linkages Between and Among NCES Datasets

Level       Data Source
District    SASS, NELS:88, CCD, NAEP
School      SASS, NELS:88, CCD

Linkage and Augmentation of NCES Data and Non-NCES Data

At times, states, other federal agencies, and government contractors produce information that can be linked to NCES datasets, including NAEP. For instance, the Pennsylvania Educational Policy Studies Project, which is affiliated with the University of Pittsburgh, maintains a database that provides general descriptive data on the universe of Pennsylvania's school districts. These data include valuable information that is not available through other sources such as the CCD, notably each school district's Equalized Subsidy for Basic Education (ESBE) revenue (which is the largest source of state aid to school districts) and the ratio the state uses to determine ESBE revenue.

States such as Pennsylvania, then, are in a position to exploit linkage opportunities. For instance, the Pennsylvania state department of education might compare NAEP results with results from its own state assessment. Or Pennsylvania might undertake a large-scale satisfaction survey of the sample of schools participating in NAEP or SASS in the interest of understanding the effect of school quality, measured more broadly than it is currently measured, on school and perhaps even student achievement. Instances of states capitalizing on NCES's efforts are hard to find, however.

An example of a government agency capitalizing on and augmenting NCES's work is not so hard to find. The General Accounting Office (GAO) used the SASS sample in its recent work to investigate the quality of school facilities across the United States. GAO did not, however, return an augmented dataset to NCES for analysis because no arrangement had been made with NCES in advance. To us this seems quite shortsighted on the part of either NCES, GAO, or perhaps both.

The American Institutes for Research, a government contractor, has produced a Teacher Cost Index (TCI) to which NAEP or other NCES datasets might be linked. The TCI is a district-level index that accounts for factors that underlie differences in the cost of living among school districts (Chambers, 1995).
Developed in part on the basis of an analysis of the 1993 to 1994 SASS, the TCI provides researchers with an arguably important tool for adjusting expenditure data to make expenditure effectiveness comparisons more fair. It enables researchers to estimate, for instance, the annual salary that school districts across a state would have to pay a similarly qualified teacher.

Private Organizations

At a high level of analysis, private organizations often link their efforts to a dataset generated by public agencies. Louis Harris and Associates, for instance, periodically surveys nationally representative samples of teachers, students, and parents. The sampling frames on which the organization relies include the CCD. Harris's efforts do not usually engender individual privacy issues because data are reported only in the aggregate. Moreover, the issues that concern Harris are not necessarily those that NCES and other federal agencies are able to focus on. Rather, Harris consciously seeks to fill information gaps and therefore focuses on certain important issues in far greater depth than NCES. These issues include parental involvement; safety and violence in schools, neighborhoods, and cities; and gender equity in schools.

There is no great reason why Louis Harris and Associates or other private organizations could not cooperate with NCES (or other statistical agencies) to enhance understanding of the value of sample augmentation linkage of the sort described earlier. Harris could have used the NCES Schools and Staffing Survey or any of the recent NAEP samples, for example, to inform or improve the design of the 1997 Metropolitan Life surveys that investigated gender equity and parental involvement in schools from the perspectives of students, teachers, and parents. And the organization might have provided NCES with resultant datasets as well as suggestions for improving future surveys and/or linkage.

Organizations such as Louis Harris and Associates are sensitive to the idea that linkages of various kinds can advance the company's mission in the public interest. They also recognize that linkage of datasets may be useless and that linkage engenders both naive and subtle privacy issues. More important, such organizations can be encouraged to develop more creative and innocuous approaches to policy on putting datasets together. This effort could be made for national samples of schools, local education agencies, sampling frames, and so forth. The information that comes about as a result ought to become a part of the knowledge base for NCES and other statistical agencies.

SUMMARY

Implication: Electronic Mapping

NCES, and perhaps other statistical agencies, can invent a Web-based system for mapping the variables measured in each survey sponsored by the agency (and other studies), the questions that address the variables, and the question response categories, exploiting hypertext to facilitate the acquisition of deeper information and wider searches.
This would make it easier to understand what is common and what is unique to diverse surveys in education and perhaps other areas. Such a system is a natural extension of NCES's work on data warehousing and electronic code books and can adopt software that meets open database connectivity standards.

Implications: Nomenclature

NCES can play a leadership role in clarifying and standardizing the semantics of linkage. This would help make words such as merging, pooling, and connecting datasets plainer and more uniform, and it would foster sensitivity to their definitions in statistical policy, activity, and publications. NCES has been vigorous in related respects in the past, to judge from the agency's work with state education agencies on, for example, determining what dropout means and how a dropout is counted.

Implications: Dimensionalizing Linkage

NCES can explore ways to make plainer the functions of linking surveys, in effect dimensionalizing linkage activity. This might be done, as suggested earlier, by hinging dimensionalization on the ideas of augmenting a primary survey with two or more secondary ones, focusing on what is augmented: samples, populations, variables, modes of measurement, replication, and so on. The rationale is that we need to learn how to better arrange our thinking about very complex linkage efforts.

Implication: Linkage Policy

NCES can explore at least two approaches to linkage policy. Ex-ante policy stresses the idea that all surveys can be planned so as to be more connectable in specific senses. Ex-post facto policy recognizes that not all linkage can be planned and that unplanned linkage must be planned for. Further, institutional vehicles for developing policy can be identified and explored, such as interagency councils and statistical agency task forces. In the continued absence of coherent policy, we are unlikely to make much progress in productively exploiting diverse surveys or in better understanding the benefits and costs of linked studies.

Implication: Registries, Displays, and Evaluation

Developing a registry of each study that depends on linkage and developing new ways of displaying linkable or linked studies is possible. These are essential to understanding the linkage landscape and, moreover, to evaluating the value of linkages of various kinds. No such registries exist. Partly for this reason, perhaps, few formal and comprehensive evaluations of linkage efforts have been published.

Implication: Broken Pieces, Missing Pieces

NCES can consider approaching linkage issues productively by using a "broken pieces, missing pieces" theme.
That is, one tries to understand how a study could have been more informative had the possibility of linkage been actualized through better planning. This perspective is kin to the idea underlying good postmortems in medicine and good crash investigations in the aviation and nuclear sciences, engineering, and other disciplines. It can be exploited by statistical agencies in the linkage context, as it already is, in effect, in individual survey efforts, and it can be formalized.
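The electronic mapping implication above, together with the "duplicate pieces" problem of Example 2, can be illustrated with a small variable map that records which surveys ask about which concept and whether the question wording matches. This is only a sketch under stated assumptions: the catalog entries, concept labels, and question wordings below are invented for illustration and are not taken from the actual SASS, NAEP, or NELS:88 instruments.

```python
from collections import defaultdict

# Hypothetical catalog entries: (survey, concept, question wording).
# The wordings are made up; real instruments would have to be transcribed.
catalog = [
    ("SASS", "teacher_experience", "How many years have you taught?"),
    ("NAEP", "teacher_experience", "How many years have you taught?"),
    ("NELS:88", "teacher_experience", "Years of teaching experience, including this year"),
    ("SASS", "school_quality", "Rate the overall quality of this school."),
]


def build_variable_map(entries):
    """Index the catalog as concept -> {survey: question wording}."""
    variable_map = defaultdict(dict)
    for survey, concept, wording in entries:
        variable_map[concept][survey] = wording
    return variable_map


def identical_wording(variable_map, concept):
    """Group surveys whose wording matches after normalizing case and spacing."""
    groups = defaultdict(list)
    for survey, wording in variable_map[concept].items():
        groups[" ".join(wording.lower().split())].append(survey)
    return {text: sorted(names) for text, names in groups.items() if len(names) > 1}


variable_map = build_variable_map(catalog)
shared = identical_wording(variable_map, "teacher_experience")
unique_concepts = [c for c, surveys in variable_map.items() if len(surveys) == 1]
print(shared)
print(unique_concepts)
```

Exact-match comparison after light normalization is the simplest possible rule and would miss near-duplicates; a working system would likely also need response categories and human review, as the text suggests.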

Implication: Cross-Agency and Cross-Institution Initiatives

NCES can play a leadership role in understanding whether, how, and how productively certain kinds of linkage studies that cross institutional and geopolitical jurisdiction lines have been, and could be, done. In principle, for example, some surveys sponsored by the public might easily be linked in one or more dimensions with privately sponsored surveys. In principle a survey mounted by a federal statistical agency such as NCES can be designed so as to permit easy connection to a study designed by a federal agency with another mission, such as program evaluation. What is possible in principle is not always possible in practice, but unless we explore the former, we will not improve the latter.

To return to the general topic of this essay, recall the quotation from Henry James at the start of this paper. It says, in other words, that everything is related to everything else. To make this manageable, NCES and the statistical and social sciences community have to draw circles around the more connectable things. In this respect the work reviewed in this paper and the implications educed here can help NCES and the research community do better in the future. This requires resources, of course, not the least among which is the political and scientific will to make data work harder to serve the public interest.

ACKNOWLEDGMENTS

Research for this paper was sponsored by the National Center for Education Statistics, the National Science Foundation, and the U.S. Department of Education. We are grateful to colleagues at the Planning and Evaluation Service of the U.S. Department of Education, the U.S. General Accounting Office, and the Education Statistical Services Institute for conversations that helped clarify our thinking on the topic.

REFERENCES

Blasius, J., and M. Greenacre
  1998 Visualization of Categorical Data. New York: Academic Press.
Boruch, R.F., and G. Terhanian
  1996 So what? The implications of new analytic methods for designing NCES surveys. Pp. 4.1-4.118 in From Data to Information: New Directions for the National Center for Education Statistics, G. Hoachlander, J. Griffith, and J.H. Ralph, eds. Washington, D.C.: U.S. Department of Education.
  1998 Controlled experiments and survey-based studies on educational productivity: Cross-design synthesis. Pp. 59-85 in Advances in Educational Productivity, Volume 7, A. Reynolds and H. Walberg, eds. Greenwich, Conn.: JAI Press.
Bohrnstedt, G.W.
  1997 Connecting NAEP Outcomes to a Broader Context of Educational Information. Paper presented at the annual meeting of the American Educational Research Association, Chicago.

Braun, M., and W. Miller
  1997 Measurement of education in comparative research. Comparative Social Research 16:163-201.
Brooks-Gunn, J., B. Brown, G.J. Duncan, and K.A. Moore
  1995 Child development in the context of community resources: An agenda for national data collection. Pp. 27-97 in Integrating Federal Statistics on Children: Report of a Workshop. Board on Children and Families and Committee on National Statistics, National Research Council. Washington, D.C.: National Academy Press.
Bruce, R.V.
  1973 Bell: Alexander Graham Bell and the Conquest of Solitude. New York: Little Brown.
Bushery, J., D. Royce, and D. Kasprzyk
  1992 The Schools and Staffing Survey: How re-interview measures data quality. In 1992 Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American Statistical Association.
Citro, C.F.
  1997 Editor's postscript. Chance 10(4):31.
Chambers, J.
  1995 Public School Teacher Cost Differences Across the United States. Washington, D.C.: National Center for Education Statistics.
Cotton, R.G.H., V. McKusick, and C.R. Scriver
  1998 The HUGO Mutation Database Initiative. Science 279:10-11.
Cox, L.H., and R.F. Boruch
  1988 Emerging policy issues in record linkage and privacy. Journal of Official Statistics 4(1):3-16.
Evinger, S.
  1997 Recognizing diversity: Recommendations to OMB on standards for data on race and ethnicity. Chance 10(4):26-31.
Grace, J.B.
  1997 Letter. Science 275:1862-1863.
Griffith, J.
  1992 Presentation to the National Advisory Council on Education Statistics (March 12-13, 1992): Draft Paper on a Proposal for an Integrated Longitudinal Studies Program. Washington, D.C.: National Center for Education Statistics.
Harkness, J., and P. Mohler
  1998 Towards a Manual of European Background Variable: Part I, Appendix II: Report on Background Variables in a Comparative Perspective. Mannheim, Germany: Zentrum für Umfragen, Methoden und Analysen.
Hedges, L.V., and A. Nowell
  1995 Sex differences in mental test scores, variability, and numbers of high scoring individuals. Science 269:41-45.
Hilton, T., ed.
  1992 Using National Data-bases in Educational Research. Hillsdale, N.J.: Lawrence Erlbaum Associates.
Hofferth, S.L.
  1995 Children's transition to school. Pp. 98-123 in Integrating Federal Statistics on Children: Report of a Workshop. Board on Children and Families and Committee on National Statistics, National Research Council. Washington, D.C.: National Academy Press.
Holland, P.W., and D.B. Rubin, eds.
  1982 Test Equating. New York: Academic Press.

Holt, A., S. Kaufman, F. Scheuren, and W. Smith
  1994 Intersurvey consistency in school surveys. Pp. 105-110 in Volume II: 1994 Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American Statistical Association.
Jenkins, C.R., and A. Wetzel
  1994 The 1991-92 teacher follow-up survey reinterviewed and extensive reconciliation. Pp. 821-826 in Volume II: 1994 Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American Statistical Association.
Johnson, F.
  1993 Comparisons of school locale settings: Self-reported vs. assigned. Pp. 689-691 in 1993 Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American Statistical Association.
Kasprzyk, D., K. Gruber, S. Salvucci, M. Saba, F. Zhang, and S. Fink
  1994 Some data issues in school-based surveys. Pp. 815-820 in Volume II: 1994 Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American Statistical Association.
Kilss, W., and W. Alvey, eds.
  1985 Record Linkage Techniques: Proceedings of the Workshop on Exact Matching Methodologies. Washington, D.C.: U.S. Department of the Treasury.
King, K.E., and S. Kaufman
  1994 Estimation issues related to the student component of SASS. Pp. 1111-1115 in 1994 Proceedings of the Section on Survey Research Methods. Alexandria, Va.: American Statistical Association.
Kruskal, W.H., ed.
  1982 The Social Sciences: Their Nature and Use. Chicago: University of Chicago Press.
Ligon, G.
  1998 Success Finder Mapper. Available at: www.evalusoft.com.
McCabe, B., and J. Harkness
  1998 Towards a Manual of European Background Variable: Part I, Appendix II: Report on Background Variables in a Comparative Perspective. Mannheim, Germany: Zentrum für Umfragen, Methoden und Analysen.
Nadis, S.
  1996 Computation cracks semantic barriers between data-bases. Science 272:1419.
National Research Council
  1992 Teacher Supply, Demand, and Quality: Policy Issues, Models, and Data-bases, E.E. Boe and D.M. Gilford, eds. Committee on National Statistics. Washington, D.C.: National Academy Press.
  1995 Integrating Federal Statistics on Children. Board on Children and Families and Committee on National Statistics. Washington, D.C.: National Academy Press.
  1999 Grading the Nation's Report Card: Evaluating NAEP and Transforming the Assessment of Educational Progress, J.W. Pellegrino, L.R. Jones, and K.J. Mitchell, eds. Committee on the Evaluation of National and State Assessments of Educational Progress, Board on Testing and Assessment. Washington, D.C.: National Academy Press.
Pallas, A.
  1995 Federal data on educational attainment and the transition to work. Pp. 122-155 in Integrating Federal Statistics on Children: Report of a Workshop. Board on Children and Families and Committee on National Statistics, National Research Council. Washington, D.C.: National Academy Press.

Rosen, S., ed.
  1974 Final Report of the Panel on Manpower Training Evaluation: The Use of Social Security Earnings Data for Assessing the Impact of Manpower Training Programs. Washington, D.C.: National Academy of Sciences.
Scheuren, F.
  1995 Administrative Record Opportunities in Educational Survey Research. Report prepared for the National Center on Educational Statistics. Washington, D.C.: George Washington University.
Spencer, B.D.
  1980 Conducting benefit cost analysis. Pp. 38-59 in R.W. Pearson and R.F. Boruch, eds. Lecture Notes in Statistics: Survey Research Designs. New York: Springer-Verlag.
Taylor, C.
  1997 The Effect of School Expenditures on the Achievement of High School Students: Evidence from NELS and the CCD. Paper presented at the American Educational Research Association annual meeting, Chicago.
Terhanian, G.
  1997 School Policies and Practices, Student Proficiency, and Racial Differences in Proficiency: Evidence from a Multilevel Analysis of the Reading Proficiency of 4th Graders from Pennsylvania and New York. Paper presented at the Summer Data Conference of the National Center for Education Statistics, Washington, D.C. Homepage. Available at: http://dolphin.upenn.edu/~terhania.
Tufte, E.R.
  1990 Envisioning Information. Cheshire, Conn.: Graphics Press.
U.S. General Accounting Office
  1986a Computer Matching: Assessing Its Costs and Benefits. Washington, D.C.: U.S. General Accounting Office.
  1986b Computer Matching: Factors Influencing the Agency Decision Making Process. Washington, D.C.: U.S. General Accounting Office.
Vogel, G.
  1997 Publishing sensitive data: Who calls the shots? Science 276:523-526.
Wenglinsky, H.A.
  1997 When Money Matters: How Educational Expenditures Improve Student Performance and When They Don't. Princeton, N.J.: Policy Information Center, Educational Testing Service.
Williams, N.
  1997 How to get databases talking to one another. Science 275:301-330.