Modernizing the U.S. Census

APPENDIX J

Content and Quality of Federal and State Administrative Records

The Bureau of the Census has established an Administrative Record Information System (ARIS) (for more information on ARIS, see Gates and Palacios, 1993) that provides current information about the content, nature, and availability of over 60 federal administrative record systems and more than 400 state systems.

Table J.1 summarizes the subject content (population) available from a selected set of federal administrative records. These files are comprehensive in their coverage for specific universes and, in total, are estimated, or are likely, to include most if not all of the population usually enumerated in the census (excluding the homeless and the institutional populations that can be obtained from other administrative records). These files would play important roles in any census activities involving administrative records.

Not included in this summary are a number of special files with restricted universes, e.g., Veterans Administration files (disability and education files) and the files of the Office of Personnel Management and the Indian Health Service, which may have some utility but lack the broad scope and appeal needed for present purposes.

FEDERAL FILES

The federal files summarized in Table J.1 include:

  1. Internal Revenue Service (IRS) Individual Master File. This is essentially the information available on each individual income tax return (1040). Furthermore, matching to other information returns—1099s, W-2s, and other
IRS documents—would provide links to employers and place of work and considerably increase population coverage.

  2. Master Beneficiary Record and the Supplemental Security Record of the Social Security Administration (SSA). These files provide some income items not always reported in IRS returns as well as information for many nonfilers. Monthly benefit amounts for Social Security recipients and Supplemental Security Income (for very low income recipients only) are included.

  3. Numident File (SSA). This is the basic file of Social Security numbers (SSNs) assigned. Information is obtained from the application for an SSN (Form SS-5). This is the key file for finding an individual's SSN and for other matching and linking purposes.

  4. Summary Earnings File (SSA). This is where individual earnings received under covered employment are posted and maintained as a permanent record for computing Social Security benefit entitlements.

  5. Health Insurance Master Entitlement File (HCFA). This is the "Medicare" file and provides comprehensive coverage for those 65 years and older.

Some content issues (even for the limited amount available):

Address: Mailing address with major overlap with home address. It has been estimated that 10-20 percent of IRS addresses on individual tax returns may not be the address of residence. Work with information documents may reduce this percentage considerably. Similarly, addresses of beneficiaries include those of many financial institutions, but it may be possible to obtain home addresses from the Health Care Financing Administration (HCFA) files. There is the further issue of the reference date of the address relative to census day. For census purposes the ability to geocode addresses to the smallest levels of geography is paramount, and this ability will be affected by the nature of addresses.
Research carried out by the staff of the Population Division of the Census Bureau using a 1-in-1,000 sample of the 1988 individual income tax file (Form 1040) sheds light on this aspect of the problem. Specifically, the types of addresses in the IRS files were distributed as follows: city style, 81.3 percent; rural routes, 9.0 percent; and P.O. boxes, 7.7 percent. The percentages varied significantly by state, with 9 states having 90 percent or more city-style addresses and, at the other extreme, 6 states having less than 50 percent. Furthermore, city-style addresses exceeded 50 percent in fewer than half of the counties. Continuing research at the Census Bureau suggested that the TIGER/Address Control File (i.e., TIGER updated with the 1990 Census Address Control File) should be able to code 64.4 percent of all IRS addresses (Form 1040) in the United States. About 9 percent are not codable because of rural routes, 8 percent because of P.O. boxes, and 2 percent for other reasons; 17 percent were potentially codable but not coded, mainly because of street misspellings, bad abbreviations, or miskeying. Improved address standardization
should reduce these problems to a minimum. Present work on enhancing TIGER and proposals for developing and maintaining a continuously updated master address file (MAF) should overcome present shortcomings due to the nature of addresses and in the coding systems (see Schneider, 1992; Sater, 1992, 1993).

Race: Uncertain quality and poor coverage, especially for new birth cohorts. Before 1980, SSA obtained only three categories: white, black, and other. After 1980, the SS-5 calls for five categories: white (non-Hispanic), black (non-Hispanic), Hispanic, Asian and Pacific Islander, and American Indian or Alaskan Native. Furthermore, in recent years, race is not provided for those applying for an SSN for their children at birth using the birth record. Overall, race is not reported in the SSA files as follows:

  Current beneficiaries (approximately 42 million)             1.5%
  Supplemental Security Income (SSI) recipients (6 million)    3.3%
  Wage earners (130 million)                                   3.0%
  Social Security numbers issued 1980-1991 (90 million)       15.2%

Self-reporting, third-party reporting (birth and death records, for example), and consistency of reporting over time also affect comparability of data between and within various records systems.

Relationship and household composition: Ability to reconstruct households and family composition from the records requires considerable research.

Occupation: Comparability with census classifications is not clear.

Industry and class of work: Presumably available from employer's name and employer identification number (EIN), but the reference time is not clear.

Note again that the summary is based on information extracted from the Census Bureau ARIS file. Discussions with program administrators may refine or modify some of the entries. In general, not much is known about the quality, consistency, and comparability of the various subject items in administrative records. A priori expectations vary between systems and particular characteristics.
For example, income data from IRS records, age (date of birth) from birth records, or earnings information from the Summary Earnings Record would be expected to be most accurate; in fact, they represent standards against which we evaluate the accuracy of reporting of these items in other systems. Addresses of persons receiving benefits also fall into this category, except that, as noted earlier, mailing addresses do not necessarily reflect addresses of residence. Much research and evaluation would be required to fully understand the quality, including consistency and comparability, of other information in these federal record systems.
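
The address-type and geocoding percentages quoted above can be checked with simple arithmetic. The following Python sketch only tallies the shares stated in the text; the dictionary keys and function names are illustrative assumptions and do not correspond to any Census Bureau system. Because the text rounds several of the figures, the totals need not equal exactly 100 percent.

```python
# Shares (in percent) quoted in the text for IRS Form 1040 addresses.
# All names here are hypothetical; only the numbers come from the text.
ADDRESS_TYPES = {
    "city style": 81.3,
    "rural route": 9.0,
    "P.O. box": 7.7,
}

GEOCODING_OUTCOMES = {
    "coded": 64.4,
    "not codable: rural route": 9.0,
    "not codable: P.O. box": 8.0,
    "not codable: other reasons": 2.0,
    "potentially codable but not coded": 17.0,
}


def total_share(shares):
    """Sum a mapping of category -> percent, rounded to one decimal place."""
    return round(sum(shares.values()), 1)


def potential_coding_rate(outcomes):
    """Share coded if every 'potentially codable' address were also coded.

    This is an upper bound under the assumption that improved address
    standardization resolves all misspellings, abbreviations, and miskeying.
    """
    total = outcomes["coded"] + outcomes["potentially codable but not coded"]
    return round(total, 1)


if __name__ == "__main__":
    print("Address types listed:", total_share(ADDRESS_TYPES), "percent")
    print("Geocoding outcomes:", total_share(GEOCODING_OUTCOMES), "percent")
    print("Potential coding rate:", potential_coding_rate(GEOCODING_OUTCOMES), "percent")
```

Under these figures the three listed address types account for 98.0 percent of addresses, and the geocoding categories sum to 100.4 percent, which is consistent with rounding in the reported values.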

STATE FILES

A summary of the content status of various state administrative record systems is provided in Table J.2. Twelve systems are summarized, ranging from the broad (perhaps comprehensive) coverage of state income tax files and driver's license records to the more limited universe of birth records and probation and parolee files. Not all states have record systems covering the types of programs indicated, and a large number of program agencies failed to respond to this survey. In general, state programs and their files are much more limited than the federal system in the percentage of the population covered.

In terms of content, census-type information (long form) is available on only a very limited basis and in many cases is only partially reported. Name, age, sex, and address (mail or residence) are almost universally available. Most of the other items are infrequently included. Income is reported most of the time on five record systems (income tax, AFDC, food stamps, unemployment insurance, and worker's compensation), but little is known about the available detail or comparability with census data.

As stated, although little is known specifically about the quality (loosely defined) of the individual record items, the Census Bureau survey did attempt to elicit information from file managers on what is known about the quality of their data. Table J.3 provides an accuracy assessment of the state record systems and summarizes survey responses to a series of questions designed to inform on quality aspects of the files. The table shows how many program managers answered "yes" to questions on whether studies of record and file accuracy, comparability, and other types had been carried out. The survey did not ask for the results of the studies, but a "yes" response presumes that such information should be forthcoming from the originating agencies (see Figure J.1).

REFERENCES

Gates, G.W., and H.L. Palacios
  1993  ARIS: an administrative records information resource for statisticians. Pp. 189-193 in 1993 Proceedings of the Government Statistics Section. Alexandria, Va.: American Statistical Association.

Sater, D.
  1992  Geographic Coding Research—Types of Addresses on Income Tax Returns. Memorandum to J. Knott dated February 6. Population Division, Bureau of the Census, U.S. Department of Commerce, Washington, D.C.
  1993  Geographic Coding of Administrative Records—Past Experience and Current Research. Technical Working Paper No. 2. Population Division, Bureau of the Census. Washington, D.C.: U.S. Department of Commerce.

Schneider, P.J.
  1992  Year 2000 Census Research Administrative Records Geographic Coding Research. Memoranda to S. Miskura dated June 8, June 15, and June 29. Population Division, Bureau of the Census, U.S. Department of Commerce, Washington, D.C.

TABLE J.1  Content of Selected Federal Administrative Records (all files except the decennial census contain Social Security numbers)

| Census Subjects (population only) | IRS (including information documents) | Social Security Files: Master Beneficiary Record | Social Security Files: Numident File | Social Security Files: Summary Earning Record | HCFA Health Insurance Master Record |
|---|---|---|---|---|---|
| Name | x | x | x | x | x |
| Address | x | x | — | — | x |
| Relationship and/or household composition | (partial) | — | — | — | — |
| Sex | — | x | x | x | x |
| Race (15 categories) | — | w, b, other | (2) | (2) | w, b, other |
| Age | (1) (primary taxpayer) | x | x | — | x |
| Marital status | x | x | — | — | — |
| Spanish (4 categories) | — | Surname in 5 states | (2) | Spanish | — |
| State or country of birth | — | — | x | — | — |
| Citizenship | — | — | — | — | — |
| Year of immigration | — | — | — | — | — |
| School enrollment | — | — | — | — | — |
| Level of education | — | — | — | — | — |
| Ancestry/ethnic origin | — | — | — | — | — |
| Place of residence (5 years ago) | — | — | — | — | — |
| Language spoken at home | — | — | — | — | — |
| Ability to speak English | — | — | — | — | — |
| Military service/veteran status | — | — | — | — | — |
| Disability | (1) | x (if receiving disability benefits) | x (if receiving disability benefits) | x (if disabled) | x (if disabled before age 65) |
| Children ever born | — | — | — | — | — |
| Labor force/employment/work experience | — | — | — | — | — |
| Place of work | (3) | — | — | — | — |
| Commuting items | — | — | — | — | — |
| Occupation | x | — | — | — | — |
| Industry | (3) | — | — | — | — |
| Class of worker | (3) | — | — | — | — |
| Income (gross) | x | — | — | — | — |
| Income: wage/salary, self-employment, farm | x | — | — | covered earnings | — |
| Income: Social Security, pensions, other retirement | x | monthly benefits | — | — | — |
| Income: public assistance, etc. | — | SSI | — | — | — |
| Income: dividend/interest, etc. | x | — | — | — | — |
| Income: all other | x | — | — | — | — |
| Timing/availability | Individual returns: 9-12 months; information returns: 12-15 months | Updated daily | Updated daily | Updated weekly | Updated daily |

Notes: x indicates content item available in the record system; — indicates content item is not available in the record system.
(1) Indication of +65; indication of blindness.
(2) Numbers issued before 1980: white, black, other; after 1980: white (non-Hispanic), black (non-Hispanic), Asian, American Indian or Alaskan Native, and Hispanic; race and Spanish origin not available for those obtaining SSN at birth, an option available starting in 1990.
(3) Employer's name and EIN; last or current employer.

Source: Administrative Records Information System, Program and Policy Development Office, Bureau of the Census.

TABLE J.2  Content of Twelve Major Administrative Record Systems Maintained by 52 Jurisdictions (50 states, the District of Columbia, and Puerto Rico)

| Information | Birth Records | Death Records | Income Tax Records | Auto Registration | Driver's License Records | AFDC Records | Food Stamp Records | Unemployment Insurance Records | Worker's Compensation Records | Parolee Records | Probation Records | Prisoner Records | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Reporting cases | 45 | 45 | 31 | 34 | 40 | 29 | 31 | 41 | 32 | 32 | 30 | 42 | 432 |
| Eligible cases | 52 | 52 | 43 | 52 | 52 | 51 | 51 | 52 | 52 | 51 | 45 | 52 | 605 |
| Name | 45 | 44 | 30 | 33 | 40 | 29 | 30 | 41 | 32 | 32 | 30 | 42 | 428 |
| SSN | 14 | 41 | 26 | 9 | 32 | 29 | 29 | 34 | 28 | 25 | 23 | 36 | 326 |
| Date of birth | 43 | 44 | 4 | 9 | 40 | 29 | 30 | 33 | 30 | 32 | 29 | 42 | 365 |
| Sex | 44 | 44 | 1 | 4 | 39 | 29 | 29 | 34 | 31 | 31 | 30 | 40 | 356 |
| Marital status | 25 | 43 | 21 | 0 | 0 | 19 | 16 | 9 | 18 | 24 | 19 | 38 | 232 |
| Race | 43 | 44 | 0 | 0 | 13 | 26 | 25 | 32 | 3 | 30 | 29 | 41 | 286 |
| Hispanic origin | 38 | 39 | 0 | 0 | 5 | 22 | 21 | 26 | 0 | 23 | 24 | 34 | 232 |
| Mail address | 32 | 12 | 28 | 25 | 32 | 22 | 23 | 34 | 26 | 14 | 15 | 16 | 279 |
| Residence address | 32 | 28 | 12 | 23 | 37 | 20 | 18 | 18 | 22 | 18 | 15 | 29 | 272 |
| Other geograph. | 45 | 44 | 29 | 31 | 39 | 22 | 24 | 34 | 25 | 25 | 26 | 37 | 381 |
| Place of birth | 38 | 41 | 0 | 1 | 2 | 6 | 3 | 2 | 0 | 19 | 17 | 35 | 164 |
| Phone | 1 | 0 | 8 | 0 | 2 | 20 | 20 | 31 | 15 | 9 | 7 | 12 | 125 |
| Income | 0 | 0 | 29 | 0 | 0 | 28 | 29 | 30 | 25 | 8 | 6 | 4 | 159 |
| Occupation | 10 | 32 | 2 | 0 | 0 | 5 | 3 | 16 | 23 | 13 | 10 | 20 | 134 |
| Education | 25 | 37 | 0 | 0 | 0 | 17 | 13 | 17 | 2 | 23 | 19 | 33 | 186 |
| Health | 33 | 7 | 1 | 1 | 10 | 13 | 12 | 8 | 21 | 14 | 15 | 22 | 157 |

Note: Each column indicates the number of jurisdictions, out of 52, that maintain content items in the particular record system. The total column displays the total number of major state systems containing the content item (the maximum number in the column is 624, if all jurisdictions maintain all the content items in every major record system).

Source: Administrative Records Information System, Program and Policy Development Office, Bureau of the Census.
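
The counts in Table J.2 can be converted into response rates and item coverage. The Python sketch below does this for three rows copied from the table. The names are illustrative, and using the number of reporting jurisdictions as the denominator for item coverage is an assumption made here, not something the table prescribes.

```python
# Rows copied from Table J.2; list order follows the table's column order.
SYSTEMS = [
    "Birth", "Death", "Income tax", "Auto registration",
    "Driver's license", "AFDC", "Food stamp", "Unemployment insurance",
    "Worker's compensation", "Parolee", "Probation", "Prisoner",
]
REPORTING = [45, 45, 31, 34, 40, 29, 31, 41, 32, 32, 30, 42]
ELIGIBLE = [52, 52, 43, 52, 52, 51, 51, 52, 52, 51, 45, 52]
INCOME = [0, 0, 29, 0, 0, 28, 29, 30, 25, 8, 6, 4]


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def response_rates():
    """Percent of eligible jurisdictions that reported, by record system."""
    return {
        name: percent(reported, eligible)
        for name, reported, eligible in zip(SYSTEMS, REPORTING, ELIGIBLE)
    }


def majority_systems(item_counts):
    """Systems in which more than half of reporting jurisdictions hold the item.

    Assumption: the denominator is the number of reporting jurisdictions.
    """
    return [
        name
        for name, count, reported in zip(SYSTEMS, item_counts, REPORTING)
        if 2 * count > reported
    ]


if __name__ == "__main__":
    print("Overall response rate:", percent(sum(REPORTING), sum(ELIGIBLE)))
    print("Birth records response rate:", response_rates()["Birth"])
    print("Income held by most respondents:", majority_systems(INCOME))
```

With the income row, the majority rule returns the same five record systems that the text names as reporting income most of the time.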

TABLE J.3  Accuracy Assessment of 12 Major State Record Systems, 1992

| System Type | Record Accuracy | File Accuracy | Data Comparability: Multiple Sources | Data Comparability: Similar Collection | Statistical Studies | Samples Control | Eligible States | Number of States Responding |
|---|---|---|---|---|---|---|---|---|
| Birth | 35 | 20 | 12 | 31 | 41 | 31 | 52 | 45 |
| Death | 26 | 14 | 9 | 27 | 44 | 27 | 52 | 45 |
| Income tax | 6 | 11 | 2 | 1 | 20 | 19 | 43 | 31 |
| Auto registration | 4 | 0 | 2 | 2 | 11 | 10 | 52 | 34 |
| Driver's licenses | 6 | 2 | 4 | 4 | 15 | 11 | 52 | 40 |
| AFDC | 16 | 7 | 9 | 9 | 15 | 24 | 51 | 29 |
| Food stamps | 16 | 7 | 12 | 8 | 15 | 25 | 51 | 31 |
| Unemployment | 22 | 10 | 10 | 13 | 20 | 20 | 52 | 41 |
| Workers' compensation | 9 | 4 | 1 | 6 | 16 | 8 | 52 | 32 |
| Parole | 12 | 6 | 4 | 6 | 18 | 12 | 51 | 32 |
| Probation | 9 | 7 | 7 | 3 | 18 | 12 | 45 | 30 |
| Prisoner | 12 | 10 | 5 | 12 | 29 | 17 | 52 | 42 |

Note: Each column indicates the number of jurisdictions with particular accuracy assessments out of 52 jurisdictions.

Source: Administrative Records Information System, Program and Policy Development Office, Bureau of the Census.
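
To compare record systems of different sizes, the counts in Table J.3 can be expressed as a share of the states that responded. The Python sketch below does so for the record accuracy column only. The names are illustrative, and treating each count as a "yes" answer among responding states is an assumption based on the description of the survey in the text.

```python
# (record accuracy count, number of states responding), copied from Table J.3.
RECORD_ACCURACY = {
    "Birth": (35, 45),
    "Death": (26, 45),
    "Income tax": (6, 31),
    "Auto registration": (4, 34),
    "Driver's licenses": (6, 40),
    "AFDC": (16, 29),
    "Food stamps": (16, 31),
    "Unemployment": (22, 41),
    "Workers' compensation": (9, 32),
    "Parole": (12, 32),
    "Probation": (9, 30),
    "Prisoner": (12, 42),
}


def yes_share(yes_count, responding):
    """Percent of responding states with a 'yes', rounded to one decimal."""
    return round(100.0 * yes_count / responding, 1)


def ranked_shares(table):
    """Return (system, share) pairs sorted from highest to lowest share."""
    shares = {
        name: yes_share(yes_count, responding)
        for name, (yes_count, responding) in table.items()
    }
    return sorted(shares.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for system, share in ranked_shares(RECORD_ACCURACY):
        print(f"{system}: {share} percent")
```

On these figures, birth records rank highest (77.8 percent of responding states) and auto registration lowest (11.8 percent).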

FIGURE J.1  Administrative File Information Request, Extract of Survey Questions from Section D, Quality of Data. Source: Administrative File Information Request, Form ARIS-1, Program and Policy Development Office, Bureau of the Census.
