Studies of Welfare Populations: Data Collection and Research Issues (2001)
Commission on Behavioral and Social Sciences and Education (CBASSE)

"7 Matching and Cleaning Administrative Data." Studies of Welfare Populations: Data Collection and Research Issues. Washington, DC: The National Academies Press, 2001.


names might be a good alternative while providing a better means of protecting individual identities.4
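
Footnote 4 refers to converting names to Soundex codes. As a rough illustration of what such a conversion does, the following Python sketch implements the standard American Soundex rules (first letter plus three digits). It is not the SAS routine the footnote mentions, and its output may differ in detail from what SAS or other packages produce; the function name `soundex` is chosen here for illustration only.

```python
# Letter-to-digit table for standard American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return a four-character Soundex code, or '' if name has no A-Z letters."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")      # its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                      # H and W do not separate equal codes
        code = CODES.get(ch, "")          # vowels and Y map to ""
        if code and code != prev:
            result.append(code)
        prev = code                       # a vowel resets the repeat check
    return ("".join(result) + "000")[:4]  # pad with zeros, keep four characters


print(soundex("Smith"), soundex("Smyth"))   # both spellings give the same code
```

Because spelling variants such as Smith and Smyth collapse to one code, the code can stand in for the name during matching while revealing less than the name itself.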

CONCLUSION

Recommendations

We recommend a number of activities in the cleaning of administrative data for research use. These include:

  • Examining the internal consistency of the data;

  • Examining how the data were collected, processed, and maintained before delivery to the researcher;

  • Taking every opportunity to compare with other data sets, either survey or administrative, through record linkage; and,

  • Most important, getting to know the operations of the program, not just the collection of administrative data, but also how services are provided so that inconsistencies in the data might be understood better.
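
The first recommendation, examining internal consistency, can be sketched as a set of simple rule checks applied to each record. The Python example below is only an illustration: the field names (`birth_date`, `spell_start`, `spell_end`, `benefit_amount`) and the three rules are hypothetical, and real checks would be chosen from knowledge of how the program operates.

```python
from datetime import date


def consistency_problems(record):
    """Return a list of internal-consistency problems found in one record.

    The field names and rules are hypothetical examples, not taken from
    any particular administrative system.
    """
    problems = []
    born = record.get("birth_date")
    start = record.get("spell_start")
    end = record.get("spell_end")
    amount = record.get("benefit_amount")

    if start and end and end < start:
        problems.append("spell ends before it starts")
    if born and start and born > start:
        problems.append("birth date is after spell start")
    if amount is not None and amount < 0:
        problems.append("negative benefit amount")
    return problems


record = {
    "birth_date": date(1970, 5, 1),
    "spell_start": date(1998, 3, 1),
    "spell_end": date(1997, 12, 31),   # inconsistent: earlier than the start
    "benefit_amount": 250.0,
}
print(consistency_problems(record))
```

Flagged records are candidates for review rather than automatic deletion, since an apparent inconsistency may reflect how services are actually delivered.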

We also recommend using probabilistic record linkage rather than relying on any one identifier to link records. We believe our analysis above makes this case. The golden rule of record linkage is that there is no such thing as a unique identifier: no single field can be trusted to pick out one individual, so records should be compared on several identifiers together. In many cases, for example, the same SSN has been provided to two or more individuals.
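
One common way to combine evidence from several identifiers is Fellegi-Sunter style match weighting, sketched below in Python. This is an illustration of the general technique, not necessarily the procedure used in the analysis above. For each field, m is the assumed probability that the field agrees when two records belong to the same person, and u is the probability that it agrees when they do not; all field names and m/u values here are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical (m, u) probabilities per field, for illustration only.
FIELD_PROBS = {
    "ssn": (0.95, 0.001),
    "last_name": (0.90, 0.01),
    "birth_date": (0.95, 0.005),
    "sex": (0.98, 0.5),
}


def match_weight(rec_a, rec_b, probs=FIELD_PROBS):
    """Sum log2 agreement/disagreement weights over the compared fields."""
    total = 0.0
    for field, (m, u) in probs.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue                                 # missing value: no evidence
        if a == b:
            total += math.log2(m / u)                # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))    # disagreement lowers it
    return total


base = {"ssn": "123456789", "last_name": "SMITH",
        "birth_date": "1970-05-01", "sex": "F"}
same_ssn_only = {"ssn": "123456789", "last_name": "JONES",
                 "birth_date": "1982-11-20", "sex": "M"}
all_but_ssn = {"ssn": "987654321", "last_name": "SMITH",
               "birth_date": "1970-05-01", "sex": "F"}

print(match_weight(base, same_ssn_only))   # shared SSN alone scores low
print(match_weight(base, all_but_ssn))     # agreement elsewhere scores higher
```

Under these assumed probabilities, a pair that shares only an SSN scores below a pair that agrees on name, birth date, and sex but differs on SSN, which is the behavior the golden rule calls for. In practice, pairs scoring above an upper threshold are accepted, those below a lower threshold are rejected, and those in between are reviewed by hand.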

Developments in Information Technology That May Improve Administrative Data

Much of what is discussed above is required because public policy organizations are still, for the most part, in their first generation of information systems. These “legacy” systems are typically mainframe installations a decade or more old that do not take advantage of much of today’s technology. Data entry in the legacy systems, for example, is often quite cumbersome and requires a specialized data entry function. Frontline workers are typically not trained for data entry or do not have the time or resources to take it on. An exception is entitlement programs in some jurisdictions, where the primary activity of eligibility workers is collecting information from individuals and entering it into a computerized eligibility determination tool. The development of new graphical user interfaces that are more worker friendly, in that the screens

4 Popular software programs such as SAS provide a simple method of converting names to Soundex codes.
