1
Introduction

In August 1974, President Gerald R. Ford signed the Family Educational Rights and Privacy Act (FERPA) into law. One of two privacy laws Congress approved that year in response to the breach of public trust created by the Watergate scandal, FERPA was designed to protect the privacy of individual student test scores, grades, and other education records (U.S. Code, Title 20, Chapter 31, Section 1232g).1

Much has changed since that time. Education policies now emphasize education standards and testing to measure progress toward those standards, as well as rigorous education research. At the same time, private firms and public agencies, including schools, have replaced most paper records with electronic data systems. Reflecting the movement toward electronic data, many social science researchers have changed their methods; today, they may conduct fewer original surveys to gather research data and turn more often to administrative data maintained in government databases.

These trends have converged to greatly increase the supply of data on student performance in public schools. With funding from the U.S. Department of Education, the states are compiling student records from local schools and districts into statewide databases, with unique student identifiers that can be used to track students’ performance as they move through grade levels and schools. Although these databases represent a rich source of longitudinal data, researchers’ access to the individually identifiable data they contain, as well as to student record data maintained at the local level by individual schools and school districts, is limited by the privacy protections of FERPA. Researchers’ limited access to individual student data slows research not only in education but also in related fields, such as child welfare and health.

1 Text of the act is available at http://www4.law.cornell.edu/uscode/html/uscode20/usc_sec_20_00001232---g000-.html.

To explore possibilities for data access and confidentiality in compliance with FERPA and with the Common Rule for the Protection of Human Subjects, the National Academies and the American Educational Research Association convened the Workshop on Protecting Student Records and Facilitating Education Research in April 2008 (see Appendix A for the workshop agenda). The workshop was supported by the Ewing Marion Kauffman Foundation, the Spencer Foundation, and the William T. Grant Foundation.

To carry out the workshop, the National Academies’ Committee on National Statistics and Center for Education appointed an expert planning committee chaired by Felice J. Levine, researcher and executive director of the American Educational Research Association. The planning committee was charged to

Plan for a workshop at the National Academies on providing research access to administrative records (including test scores) pertaining to elementary, secondary, and higher education students and their schools while protecting the privacy and confidentiality of the information. The planning committee will be charged with commissioning papers for presentation, and convening and serving as moderators for the workshop.

WORKSHOP GOALS AND FRAMEWORK

Felice Levine opened the workshop by welcoming all participants and providing an overview of the key issues to be discussed.
Over the past five years, researchers have become increasingly interested in accessing the state education databases that compile student records, particularly because the No Child Left Behind Act of 2001 requires that “scientifically based [education] research” drive state and local use of federal education funds. These concerns informed the central question of the workshop: how to reconcile FERPA protections with current educational needs and goals. Levine explained that the workshop would address this central question in a broader context, examining approaches to reconciling privacy protections with research access not only in education, but also in other fields, such as health care.

Levine observed that the workshop was timely, because the Department of Education was seeking comments on proposed changes to its FERPA regulations. The proposed new rules address not only when schools and colleges can release student information for the purpose of protecting health and safety (following the April 2007 massacre at Virginia Polytechnic Institute and State University), but also when student information may be released for research purposes (U.S. Department of Education, 2008a). Levine invited thoughtful discussions that would inform useful comments on the proposed new rules. She mentioned that comments on the rules by the American Educational Research Association2 would reflect an online survey of its members about FERPA, which drew a large response from over 250 education researchers. In closing, she predicted that the workshop would be valuable, observing that the National Academies’ Committee on National Statistics has a long history of successfully addressing issues of research access and privacy protection (e.g., National Research Council, 1993).

ACCESS AND PRIVACY IN CONTEXT

Miron Straf (National Research Council) described the larger context of data privacy and research access issues surrounding the workshop. His remarks reflected a series of reports issued by the Committee on National Statistics over the past three decades. Straf began by defining the following key terms (Bradburn and Straf, 2003):

Information: knowledge, facts, or representations of them.
Personal information: information that is or can be linked directly or indirectly to some person. Identifiable information is personal information.
Data: information that is collected, compiled, captured, created, or received for one or more purposes.
Confidential data: data with personal information.
Statistical data: data without any personal information.
Disclosure: the release of personal information.
Discovery: to become aware of personal information from statistical data and other knowledge.
Privacy: an individual’s control over who has access to information about him or her. The concept of privacy is relevant to what personal information becomes data.
2 Joint comments on the proposed rule, submitted by the American Educational Research Association, the American Statistical Association, and the Consortium of Social Science Associations, were published following the workshop (American Educational Research Association, 2008).

Confidentiality: protection against the release of personal information. An important distinction is that privacy pertains to individuals; confidentiality to their information.

Straf then distinguished between (1) confidential data, with personal information, and (2) statistical data, without any personal information. He said that privacy pertains to the boundary between personal information and data and to the release of confidential data to others. For example, an individual may be willing to provide personal information to a health provider but opposed to having that same personal information provided to his or her employer.

On the basis of this analysis, Straf argued that, although people have the right to control their personal information, they do not have the right to control statistical data derived from that information. For example, the fact that a parent has a child enrolled in the eighth grade of a particular school district is personal information, but the number of eighth graders reported by the school district is an example of statistical data. An individual parent has no right to exclude his or her child from that count. Expanding on this analysis, Straf argued that it is not a violation of confidentiality to produce statistical data from one’s personal information and, more broadly, it is not a violation of privacy or confidentiality to use statistical data for a purpose different from the one for collecting the information from which the statistical data were derived.

However, two key problems remain, according to Straf. The first is disclosure of personal information in confidential data, and the second is discovery of personal information from statistical data when those data are combined with other knowledge. He outlined two approaches to protect confidential data against both problems:

1. Altering the data in one of several ways, such as removing personal identifiers, collapsing individual data categories, adding random errors (statistical noise) to the data, or creating replicated (synthetic) data.
2. Restricting access. One approach is to license researchers, who are then subject to penalties for disclosure or discovery. Straf noted that the National Center for Education Statistics has been a leader in using this approach (see Chapter 4). Another approach is to provide access at highly protected sites (data enclaves, research data centers), where analyses and other outputs are screened before they are released to any researcher.

Straf explained that new variants of restricted access have emerged. In one, at the request of a researcher, agency staff members analyze confidential data and then screen the results before releasing them to the researcher online. In another, Cornell University economist John Abowd has created a virtual research data center, which provides access to synthetic data over the Internet (Cornell University, 2008). The National Opinion Research Center at the University of Chicago has created a virtual data enclave that licenses researchers for restricted access online.

Although these protective approaches are important, Straf said, they also impose new costs and risks. Protected research sites, such as the Census Bureau’s research data centers, are not easily available to many researchers, and, even when they are, they may not provide access to all relevant data. In addition, researchers are unclear about the extent to which replicated data correspond to real data.

Straf then gave a brief overview of federal privacy laws and regulations. The Confidential Information Protection and Statistical Efficiency Act of 2002 (CIPSEA) is designed to protect confidentiality of data collected for statistical purposes by government agencies. In addition, 17 federal agencies have adopted the Federal Policy for the Protection of Human Subjects, widely referred to as the Common Rule (U.S. Code of Federal Regulations, Title 45, Part 46). The Common Rule requires universities, federal agencies, and other research organizations to establish institutional review boards (IRBs). These boards review proposals to conduct research involving human participants, and they may reject proposals or require alterations in order to ensure adequate protection of individual privacy and data confidentiality. Straf said that, although these boards sometimes impose unnecessary confidentiality requirements that delay valuable research activities, they are important, providing an extra level of protection against disclosure and discovery of personal information.
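As a brief aside on the first of the two approaches Straf outlined, altering the data, the sketch below shows what three of the alterations he named might look like in practice: removing personal identifiers, collapsing individual data categories, and adding random errors. It also computes a count of eighth graders, the kind of statistical data Straf contrasted with personal information. This is a minimal illustration only, not a method presented at the workshop. The records, field names, grade bands, and noise level are all hypothetical assumptions.

```python
import random

# Hypothetical student records; every name and value is invented for illustration.
records = [
    {"name": "A. Lee", "student_id": "S001", "grade": 8, "score": 71.0},
    {"name": "B. Cruz", "student_id": "S002", "grade": 8, "score": 84.0},
    {"name": "C. Diaz", "student_id": "S003", "grade": 7, "score": 65.0},
    {"name": "D. Khan", "student_id": "S004", "grade": 11, "score": 90.0},
]

# Assumed set of fields treated as direct personal identifiers.
DIRECT_IDENTIFIERS = {"name", "student_id"}


def grade_band(grade):
    """Collapse individual grade levels into broader (assumed) categories."""
    if grade <= 5:
        return "elementary"
    if grade <= 8:
        return "middle"
    return "high"


def alter(record, rng, noise_sd=2.0):
    """Return an altered copy: identifiers removed, grade collapsed, score perturbed."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["grade"] = grade_band(out["grade"])
    # Add random error (statistical noise); noise_sd is an arbitrary choice here.
    out["score"] = round(out["score"] + rng.gauss(0.0, noise_sd), 1)
    return out


rng = random.Random(0)  # fixed seed so the sketch is repeatable
altered = [alter(r, rng) for r in records]

# Statistical data in Straf's sense: a count carrying no personal information.
eighth_graders = sum(1 for r in records if r["grade"] == 8)
```

None of these steps by itself rules out discovery when the altered data are combined with other knowledge, which is the second problem Straf identified.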
Most boards provide expedited reviews for research proposals not seen as involving significant risk. Straf said that IRBs often require researchers to ensure that written, informed consent will be obtained from individuals who provide personal information for use in research studies. Informed consent documents are designed to clarify who will have access to the personal information and how it will be used. Straf argued that informed consent should apply only to personal information and should not be required when a researcher wants to use statistical data derived from that information (because, in his view, privacy and confidentiality do not apply to statistical data). Nevertheless, informed consent is valuable to build trust with the public, he said. Straf suggested that informed consent documents clearly describe all potential uses of the data sought, including research uses (National Research Council, 1993), and refer to the larger goals of the research, such as to improve the quality of education.

Arguing that statistical agencies’ goal of zero tolerance for disclosure of confidential data is unrealistic, Straf suggested that agencies might instead adopt a standard of reasonable care, which would balance a small risk of disclosure against the great benefit of social science research. An additional protection would result if the agency placed the onus on the researcher using the data to avoid discovery or disclosure of personally identifiable information, as the National Center for Education Statistics does in its licensing arrangements (see Chapter 4).

Straf said that current tensions result from education policy makers’ “voracious” demand for data on student performance and the Department of Education’s efforts to promote rigorous research. Advances in cognitive, behavioral, and neurosciences are opening up new research avenues with potential to reduce disparities in educational achievement, he said, but researchers need access to education records to pursue these avenues. Describing many state education agencies as “wary” of providing education records for research because of uncertainty about how FERPA applies, he observed that many different methods are available to provide research access while protecting personal information, which is the goal of FERPA.

DISCUSSION OF KEY ISSUES

In response to Straf’s presentation, Levine said that the key issue is the migration of individual data to statistical data. As discussed in Putting People on the Map: Protecting Confidentiality with Linked Social-Spatial Data (National Research Council, 2007), when researchers link statistical data sets to other data sets (such as geospatial data), they sometimes create personally identifiable data without having the consent of the individual whose data are now identifiable.
When conducting a small-scale survey, a researcher routinely obtains each survey participant’s informed consent for the uses of the data (including a warning about possible disclosure or discovery), but when there is no consent process, as is the case with administrative data, it is unclear how to allow research access while protecting privacy.

Straf’s proposal for a reasonable standard of care led to a discussion of breaches of confidential data. Robert Boruch (University of Pennsylvania) said that the United States Privacy Protection Study Commission created under the Privacy Act of 1974 had searched for disclosures or risks of disclosure and found them only in marketing surveys and other private-sector information-gathering activities. Gerald Gates (Census Bureau, retired) said that, although the Census Bureau does not document disclosures, it has a staff dedicated to studying data files in order to determine whether links to external data would reveal individual identities, and this staff has identified some dangers. Myron Gutmann (Inter-University Consortium for Political and Social Research) said that a recent search for lawsuits related to data disclosure had uncovered very few (National Research Council, 2007). As the director of a data-archiving organization, Gutmann said, he had asked other data-archiving organizations for examples of disclosures but found none. When Gutmann directed his staff to study the consortium’s data files, the staff found few dangers. Gutmann explained that, because survey data have “noise,” the odds of disclosure are small and that some papers on this topic would soon be published. Barbara Schneider (Michigan State University) agreed that it is hard to identify individuals in large national data sets, but said state-level data are “easier to crack,” raising critical confidentiality issues. This led to discussion of whether state databases of deidentified education records might actually include personally identifiable information, as well as the risks of disclosure from these databases.

REPORT OVERVIEW

This report continues in Chapter 2 with discussion of the Department of Education’s current interpretation of FERPA and proposed new regulations to carry out the law, along with a description of the department’s initiative to assemble and report state educational performance data. Chapter 3 discusses the value of education research using student and school records, drawing on examples of three studies that promise to inform needed improvements in public schooling. Chapter 4 presents models that allow researchers to access education records in ways that protect confidentiality and discusses the limitations of these models. Building on that discussion, Chapter 5 describes similar models of research access and confidentiality protection in other sectors, discussing as well the limitations of these models. The final chapter includes reflections about key issues and next steps by members of the workshop planning committee and other workshop participants.
