IRS documents—would provide links to employers and place of work and considerably increase population coverage.
Master Beneficiary Record and the Supplemental Security Record of the Social Security Administration (SSA). Provides some income items not always reported in IRS returns as well as information for many nonfilers. Monthly benefit amounts for Social Security recipients and Supplemental Security Income (for very low income recipients only) are included.
Numident File (SSA). This is the basic file of Social Security numbers (SSNs) assigned. Information is obtained from the application for an SSN (Form SS-5). This is the key file for finding an individual's SSN and for other matching and linking purposes.
Summary Earnings File (SSA). This is where individual earnings received under covered employment are posted and maintained as a permanent record for computing Social Security benefit entitlements.
Health Insurance Master Entitlement File (HCFA). This is the "Medicare" file and provides comprehensive coverage for those 65 years and older.
Some content issues (even for the limited amount available):
Address: Mailing address, with major overlap with home address. It has been estimated that 10 to 20 percent of IRS addresses on individual tax returns may not be the address of residence. Work with information documents may reduce this percentage considerably. Similarly, addresses of beneficiaries include many financial institutions, but it may be possible to obtain home addresses from the Health Care Financing Administration (HCFA) files. There is the further issue of the reference date of the address relative to census day.
For census purposes the ability to geocode addresses to the smallest levels of geography is paramount, and this ability will be affected by the nature of the addresses. Research carried out by the staff of the Population Division of the Census Bureau using a 1-in-1,000 sample of the 1988 individual income tax file (Form 1040) sheds light on this aspect of the problem. Specifically, the types of addresses in the IRS files were distributed as follows: city style, 81.3 percent; rural routes, 9.0 percent; and P.O. boxes, 7.7 percent.
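As a rough illustration of how such a tabulation by address type might be produced, the following minimal sketch sorts address lines into city-style, rural-route, P.O. box, and other categories and reports the share of each. The regular expressions and the function names `classify_address` and `type_shares` are assumptions made for illustration; they are not the rules used in the Census Bureau research, and the sample addresses are invented.

```python
import re
from collections import Counter

# Illustrative patterns only (assumptions, not the Census Bureau's rules).
PO_BOX = re.compile(
    r"^\s*(P\.?\s*O\.?\s*BOX|POST\s+OFFICE\s+BOX)\s+\w+", re.IGNORECASE
)
RURAL_ROUTE = re.compile(
    r"^\s*(RR|R\.\s*R\.|RURAL\s+(ROUTE|RT\.?))\s*\d+", re.IGNORECASE
)
# City style: a house number followed by at least one more token.
CITY_STYLE = re.compile(r"^\s*\d+\S*\s+\S+")


def classify_address(line):
    """Return a coarse address type for one address line."""
    if PO_BOX.search(line):
        return "po_box"
    if RURAL_ROUTE.search(line):
        return "rural_route"
    if CITY_STYLE.search(line):
        return "city_style"
    return "other"


def type_shares(addresses):
    """Return the percentage of addresses falling in each type."""
    counts = Counter(classify_address(a) for a in addresses)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Invented sample addresses, for demonstration only.
sample = ["123 Main St", "456 Oak Avenue Apt 3", "P.O. Box 45", "RR 2 Box 14"]
print(type_shares(sample))
```
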
The percentages varied significantly by state, with 9 states having 90 percent or more city-style addresses and, at the other extreme, 6 states having less than 50 percent city-style addresses. Furthermore, fewer than half the counties had city-style addresses in excess of 50 percent. Continuing research at the Census Bureau suggested that the TIGER/Address Control File (i.e., TIGER updated with the 1990 Census Address Control File) should be able to code 64.4 percent of all IRS addresses (Form 1040) in the United States. About 9 percent are not codable because they are rural routes, 8 percent because they are P.O. boxes, and 2 percent for other reasons; 17 percent were potentially codable but not coded, mainly because of street misspellings, bad abbreviations, or miskeying. Improved address standardization