Preliminary Census Design Issues

In this chapter, we give attention to certain key activities that begin far in advance of Census Day (e.g., address list development) and that support many aspects of census operations (e.g., record linkage). We also discuss the legal and operational dimensions of some innovative methods, treated in detail in later chapters, that are being considered for use in the 2000 census.

ADDRESS LIST DEVELOPMENT AND RELATED ACTIVITIES

Virtually all fundamental design changes contemplated for the 2000 census depend on the existence of an accurate list of residential addresses. Historically, the address list for a decennial census has served only limited purposes after census operations are completed. For example, the address list is used for selected areas in the development of sampling frames for current sample surveys. Such uses occur during a period following the census, but the census address file has not been linked in any way to address list development in subsequent censuses. The address list for the next census has been built from scratch using combinations of commercially available lists, listings created by Census Bureau staff, and various address list coverage checks. The duplication of effort, cost, and complexity involved in address list compilation has led outside interests and internal Census Bureau experts to suggest the creation and maintenance of a master list of addresses over the decade (Leggieri, 1994).

Development of a Master Address File

The master address file (MAF) must contain the information for each residential living quarters that is necessary to support Census Bureau contact with households by mail, by telephone, or by personal visit. The content elements for the basic MAF are (Bureau of the Census, 1992a; Leggieri, 1994):

· mailing address (including nine-digit zip code and unit designators for multiunit addresses),
· location information (e.g., house number/street name address or physical description),
· census tract/block numbering area and block number,
· number of units in structure,
· telephone number of household,
· year structure built (for 1990 census units),
· type of unit (e.g., open to elements, converted to nonresidential unit structure, tenure), and
· unit status (e.g., unconfirmed delete, unverified add, duplicate address).

For group living quarters, such as college dormitories, additional information must be maintained, including name and type of group quarters and a link to a special place (e.g., a college or university).

The MAF will be linked to the Census Bureau's automated geographic database, the TIGER (topologically integrated geographic encoding and referencing) system. This linkage should provide higher-quality address list coverage and geographic accuracy, as well as improved geographic products for census data collection and local review. In addition, a linked address/geographic file can provide support for many aspects of census research and related demographic programs. (In later sections, we discuss the possible use and support of such a file by other federal agencies, states, and local governments.)

The proposed development and ongoing maintenance of a geographically linked MAF requires completion of several major steps, as outlined by Leggieri (1994):

1. Expand the TIGER geocoding capabilities and reformat the 1990 census address control file to prepare for linkage and matching.
2. Match the reformatted 1990 address file to a primary source of new address information, such as the delivery sequence file from the Postal Service.
3. Match the updated MAF to the TIGER database for assignment of geographic codes.
4. Resolve nonmatches through clerical procedures and field work.
5. Use administrative records to supplement the primary address source.

COUNTING PEOPLE IN THE INFORMATION AGE

At the time of this report's preparation, address ranges in the TIGER database have been expanded, and the other steps listed above are being implemented for the three urban sites in the 1995 census test: Oakland, California; Paterson, New Jersey; and New Haven, Connecticut. This activity is not being undertaken for the rural test site (six parishes in northwestern Louisiana) because rural-style addresses cannot be geocoded into the current TIGER database. In the 1995 census test, questionnaire mailing will not be performed at the rural site. Instead, an update/leave procedure will be used; that is, census forms will be delivered to housing units by Census Bureau personnel, and the address list for this site will be updated using information collected at the time of delivery.

MAF/TIGER Benefits for the Decennial Census and Other Programs

Lists containing address information, but without names of residents or other personal data, support several stages of decennial census operations, including distribution of mail questionnaires, follow-up of nonresponding households, and measurement of population coverage. The MAF/TIGER system would support several fundamental census design changes being considered in the 1995 census test to reduce census costs and eliminate differentials in census coverage. In particular, MAF updating with Postal Service information would facilitate the use of letter carriers to identify vacant housing units during census mailout. The MAF would also provide a frame for sampling (and subsequent estimation) during nonresponse follow-up and integrated coverage measurement. Finally, the MAF would provide a control framework for matching administrative records, compiled by other federal, state, or local agencies, as a substitute for or supplement to traditional census enumeration.

There are numerous potential uses of a continuously updated MAF/TIGER system beyond its application during a decennial census.
Such a system would be required to support an administrative record census (see Chapter 5) or a continuous measurement survey of the type under current examination (see Chapter 6). In addition, the MAF could be used for intercensal population estimates and projections, redesign of current surveys, special censuses, and quality evaluation of other lists and activities (Leggieri, 1994). A linked address/geographic file can thus provide support for many aspects of census research and related demographic programs. Indeed, its utility would be likely to extend to other federal agencies, states, and local governments.

The panel believes that a geographic database that is fully integrated with a master address file is a basic requirement for the 2000 census, regardless of the final census design.

Recommendation 2.1: The Census Bureau should continue aggressive development of the TIGER (topologically integrated geographic encoding and referencing) system, the Master Address File (MAF), and integration of these two systems. MAF/TIGER updating activities for the 1995 census test sites should be completed in time to permit the use and evaluation of the MAF/TIGER system as part of the 1995 census test.

Successful completion of MAF/TIGER updating for the test sites will enable the Census Bureau to gain valuable experience during the 1995 census test.

Frequency of MAF/TIGER Updating

In order to maintain a MAF/TIGER system that can be used throughout the decade, continuous updating would be required. Current Census Bureau plans are to conduct the full maintenance cycle several times during the decade for high-growth counties and less frequently for areas of minimal change (Leggieri, 1994). This updating is needed primarily to maintain the quality of the system, i.e., the accuracy of the listed addresses and coverage of the housing unit stock. However, the field efforts required to update the MAF/TIGER system may have other benefits for census operations by facilitating local outreach and cooperation with state and local governments. Maintenance of the system could rely heavily on local support (see the section below on cooperation with state and local governments).

High growth in new housing unit construction may not be the only criterion in determining the frequency of updating. Certain types of new housing (e.g., trailer parks, migrant worker camps) are more difficult to locate than others. Also, it is unlikely that all hard-to-find housing units could be identified by any procedure that is part of the counting phase of census operations. Integrated coverage measurement procedures will be needed to complete census coverage of housing units.

Alternatives to decennial collection of small-area information may impose more rigorous requirements on the MAF/TIGER system. If the MAF is used only once a decade, then updating might be accelerated in the years immediately preceding a decennial census.
But more steady monthly updating would be needed to support the intercensal long-form survey that would be part of a continuous measurement program. A recent version of the continuous measurement prototype included a component for updating approximately 1,000 problem (i.e., nonmatched) addresses per month. Frequent updating would presumably also be needed if the MAF/TIGER system is to provide an accurate sampling frame for other current surveys.

Cooperation With the Postal Service

The logical source of national information on mailing addresses is the United States Postal Service (USPS). In 1991 the Census Bureau's Geography Division began negotiations with the USPS to develop a partnership that would build and
maintain an integrated MAF/TIGER system. There was initial interest in the potential value of TIGER to further automate USPS mail delivery planning and management activities. After examining special prototype TIGER files that contained customized postal information, however, the USPS concluded that the potential value of the TIGER system was not sufficient to pursue the joint venture. The Census Bureau is exploring the possibility of collaborating with state and local governments on TIGER updating, as discussed in the next section.

The Census Bureau and the USPS are still working to develop an arrangement that would allow the Census Bureau to maintain the MAF using USPS address information. Regular updating would be accomplished by matching the MAF to the USPS delivery sequence file, which contains information about delivery addresses. The two agencies have reached an agreement in principle, but the details of a cooperative arrangement have not been settled (Leggieri, 1994).

Developmental work is proceeding under a memorandum of understanding signed by the Census Bureau and the USPS in August 1993. The memorandum of understanding authorized a pilot study to share address information for five 3-digit zip code areas. That phase of limited testing has been completed, and the project is being expanded to include exchange of address information for the four sites of the 1995 census test. A national agreement for address sharing will require amendments to federal legislation governing both agencies (see the section below on legal issues).

Developmental testing will be necessary to resolve differences between the USPS and Census Bureau conceptual definitions of an address. For example, the USPS files contain both commercial and residential addresses, but the MAF is restricted to residential addresses.
Home-based businesses present a special challenge, because their addresses may be classified as commercial in the USPS file, even though one or more persons may reside there.

For USPS purposes, an address is a delivery point. This definition may cause difficulties for census coverage of some multiunit dwellings, such as high-rise apartment buildings for which the local letter carrier is responsible for only a single delivery mail drop and the building management staff sorts mail for individual residents. Multiunit dwellings also pose problems for matching, because different address sources may identify the unit designation differently or lack unit identifiers altogether.

Rural address designations (e.g., rural delivery numbers) and post office boxes are problematic for census-taking, because field enumerators cannot always locate the mailing address. Also, these addresses cannot be geocoded into the current TIGER database. Leggieri (1994) provides further discussion of these challenges and possible steps toward solving them.

Potential USPS involvement in census-taking extends beyond address list development. One of the more promising options is to use information supplied by local letter carriers during census mailout to identify vacant and nonexistent
housing units. Such a procedure could save money by eliminating one of two Census Bureau checks and would also enable address cleaning to occur earlier in census operations.

Both agencies have examined the suggestion that letter carriers be used to conduct interviews during the nonresponse follow-up stage of census operations. The USPS and Census Bureau have concluded that this is not a viable option, because of concerns about interference with mail delivery, significantly higher census costs, and erosion of public confidence in the privacy of information entrusted to the postal system. The panel accepts this conclusion. Other areas for expanded USPS-Census Bureau cooperation are described more fully in a November 1993 letter from both agencies to the House Appropriations Committee (Green and Scarr, 1993).

Cooperation With State and Local Governments

Local knowledge about living quarters can serve as an important source of information to supplement the USPS delivery sequence file. Plans for involving state and local governments in the MAF/TIGER updating process are not yet well defined. Several possible cooperative roles are under consideration for testing in the 1995 census test (see Leggieri, 1994, for further discussion). At a minimum, regional Census Bureau staff would use reference materials (e.g., maps, address lists) supplied by state and local jurisdictions to clerically resolve addresses that have not been geocoded or matched by computer.

State and local governments could be involved in an earlier and larger role if the Census Bureau accepted automated address or geographic files from local jurisdictions. Files such as tax assessment records and digital geographic files for 911 emergency response systems would be matched to the MAF/TIGER databases, with subsequent resolution of nonmatched cases, and might prove to be valuable sources of information for updating.
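
The computer matching of a local file against the MAF can be illustrated with a minimal sketch. It assumes that the local file (for example, a tax assessment extract) and the MAF are both available as simple lists of address strings. The abbreviation tables and the function names `standardize` and `match_to_maf` are hypothetical and are not taken from any Census Bureau system. Addresses are standardized before comparison so that differences in suffix spelling or unit designation do not produce spurious nonmatches, and whatever fails to match is set aside for clerical resolution.

```python
import re

# Hypothetical abbreviation tables; a production standardizer would be far larger.
SUFFIXES = {"street": "st", "avenue": "ave", "road": "rd", "boulevard": "blvd", "drive": "dr"}
UNIT_WORDS = {"apartment": "apt", "unit": "apt", "suite": "apt", "#": "apt"}


def standardize(address: str) -> str:
    """Lowercase, drop punctuation, and collapse common suffix and unit variants."""
    text = address.lower().replace("#", " # ")  # make '#' a separate token
    tokens = re.findall(r"[a-z0-9#]+", text)
    cleaned = []
    for token in tokens:
        token = SUFFIXES.get(token, token)
        token = UNIT_WORDS.get(token, token)
        cleaned.append(token)
    return " ".join(cleaned)


def match_to_maf(maf_addresses, local_addresses):
    """Split a local file into addresses already on the MAF and nonmatched cases."""
    maf_keys = {standardize(a) for a in maf_addresses}
    matched, nonmatched = [], []
    for address in local_addresses:
        if standardize(address) in maf_keys:
            matched.append(address)
        else:
            nonmatched.append(address)
    return matched, nonmatched


# Illustrative data only.
maf = ["12 Main Street, Apt. 3", "40 Oak Avenue"]
local_file = ["12 MAIN ST #3", "40 Oak Ave.", "7 Elm Road"]
matched, nonmatched = match_to_maf(maf, local_file)
print(nonmatched)  # cases left for clerical or field resolution
```

Exact comparison of standardized strings is the simplest possible design. It would miss misspellings and missing unit identifiers, which is one reason a clerical resolution step remains necessary.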

Maximum involvement of state and local governments would be achieved by turning responsibility for MAF/TIGER updating over to these jurisdictions. Under this scenario, local governments would be supplied with a copy of the MAF for their jurisdiction, along with corresponding TIGER maps or files. Matching and updating would then be carried out by the local government. Implementation of this approach would require changes to Title 13 of the U.S. Code to permit sharing address information with state and local governments (see the section below on legal issues). A potential difficulty with this approach is the presumably wide variation in technological capabilities across local jurisdictions. A procedure for MAF/TIGER updating in a large, urban area may not be feasible for a small, rural area. The implications of a differential approach to MAF/TIGER updating (e.g., for coverage measurement methods) will need to be weighed in considering this option.

State and local governments can also play a role in other components of
census operations (see Collins, 1994). Involvement in outreach and promotion programs and in planning and assisting the census enumeration is discussed in Chapter 3. State and local administrative records may also help to improve census coverage. The use of selected state and local administrative records for coverage improvement and other purposes has been examined in tests associated with special censuses for Godfrey, Illinois, and South Tucson, Arizona (Bureau of the Census, 1993e, 1993f) and will be explored more fully in the 1995 census test. Collins (1994) notes that experience from past censuses (e.g., the parolee-probationer program in 1990) suggests that cost-effective use of state and local records for coverage improvement will require them to be standardized, automated, and accessible, with appropriate provisions for confidential handling. Conditions for the statistical use of administrative records are discussed further in Chapter 5.

RECORD LINKAGE

Record linkage is the identification of records belonging to the same unit (i.e., a person, household, or housing unit) either within a single data set or across two different data sets. In decennial census applications, records are matched either to eliminate duplication or to pool information from multiple sources.

Many census operations involve matching one list of records to another. Needs for record linkage arise when address lists and other administrative records are used, when people are given multiple opportunities to respond to the census, and when dual-system estimation is used as part of a coverage measurement program. Historically, an initial match has been performed by a computer algorithm, followed by clerical verification and resolution. Many of the innovative methods being examined in the 1995 census test would place greater demands on matching technology.
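
A computer match followed by clerical resolution is often organized around a numeric score with two cutoffs. The sketch below uses log-likelihood-ratio agreement weights in the style of the Fellegi-Sunter model, a standard formulation of probabilistic record linkage. The text does not say which algorithm the Census Bureau's matchers use, so this choice, the field names, the m and u probabilities, and the thresholds are all illustrative assumptions.

```python
import math

# Hypothetical parameters per field:
#   m = P(field agrees | records are a true match)
#   u = P(field agrees | records are not a match)
# In practice these would be estimated from data, not fixed by hand.
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_year": (0.85, 0.02),
    "street": (0.80, 0.10),
}


def match_weight(rec_a: dict, rec_b: dict) -> float:
    """Sum of log2 likelihood ratios over fields (agreement adds, disagreement subtracts)."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is not None and a == b:
            total += math.log2(m / u)
        else:
            # Missing values are treated as disagreement in this sketch.
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight: float, upper: float = 8.0, lower: float = 0.0) -> str:
    """Two cutoffs give three outcomes; the middle band goes to clerical staff."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "nonmatch"
    return "clerical review"


# Illustrative records only.
census = {"last_name": "garcia", "first_name": "maria", "birth_year": 1960, "street": "oak ave"}
other = {"last_name": "garcia", "first_name": "mary", "birth_year": 1960, "street": "oak avenue"}
print(classify(match_weight(census, other)))
```

Raising the upper cutoff or lowering the lower one sends more pairs to clerical review, trading cost and time for fewer automated errors.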
Thus, improvements in the accuracy or efficiency of automated record linkage will support the 2000 census design by increasing the capability to produce reliable results within time and budget constraints.

As discussed above, the development and updating of an integrated MAF/TIGER system will require automated address matching and geocoding at various stages. For example, the 1990 address control file will be matched to more current sources, such as the USPS delivery sequence file. Effective methods for record linkage could minimize the problems posed by duplicate address listings that occur when the same residence is listed in two different ways in different address records. Multiple listings are more likely to be found in rural areas, where they are a potential source of erroneous enumeration (Leggieri, 1994).

The distribution of unaddressed questionnaires, the opportunity to respond by mail or telephone, and the application of other special methods are likely to increase the potential for duplication in the census enumeration. These activities will result in questionnaires without housing unit control identification numbers that must be matched to census files to determine whether persons and housing
units were enumerated more than once. The Census Bureau's recent research on rostering may lead to new approaches to ascertaining residency, which would present new complications in assigning people correctly to geographic areas. Matching may also be needed to obtain telephone numbers during follow-up of nonresponding households.

Record linkage technology will also support the development of an administrative records database for the 1995 census test sites (see also Chapter 5). Current plans call for the database to be used in stratification of the nonresponse follow-up and integrated coverage measurement sample designs. By incorporating information about Spanish surnames, the database may also be able to identify geographic areas in which response might be improved by distributing Spanish-language questionnaires.

The integrated coverage measurement method being tested in 1995 consists of two stages: (1) an independent housing unit listing to assess the coverage of housing units within the sampled areas and (2) a within-household reinterview to assess the coverage of persons within the housing units (see Chapter 4). Detailed procedures are still under development, but they could involve the automated comparison of records from the census enumeration and the independent operations of integrated coverage measurement. The Census Bureau will need solid capability for computer matching and elimination of duplicate records in order to perform all the above tasks in an accurate, timely, and cost-effective manner.

The Census Bureau is conducting ongoing research on methods for automated record linkage. A recent study (Bureau of the Census, 1994f) matched the 1990 census file for South Tucson, Arizona, to the file of persons enumerated in the special census of the community that was carried out in November 1992. These two independent files were used to compare the accuracy of three different computer matchers.
Two of the matchers used a probabilistic algorithm to classify individual records according to the likelihood of a match. The third, and newest, matcher first attempts to match corresponding households. Persons in nonmatched households are then matched according to individual characteristics.

In this study, the "true" match status was determined through clerical operations; because of budget constraints, no field work was done to verify people's actual status. Against this measure of performance, the three matchers yielded similar results, and none emerged as uniformly superior to the others. The two individual matchers obtained results that agreed with the "true" match status 89.6 percent and 87.8 percent of the time. The agreement rate for the household matcher was somewhat lower (83.4 percent), but a slight revision to the code would have increased the agreement rate to 87.2 percent. Cross-tabulations suggest that, although the three matchers agree on many cases, there are significant numbers of cases in which the matchers made different classifications. Some combination of the household strategy and the two individual strategies might improve capabilities for matching lists in the presence of underlying household
structures. The 1995 census test provides an opportunity for further comparative evaluation of automated record linkage technologies.

Recommendation 2.2: The Census Bureau should continue its research program on record linkage in support of the 1995 census test and the 2000 census. Efforts should include studies of the effectiveness of different matching keys (e.g., name, address, date of birth, and Social Security number) and the establishment of requirements for such components as address standardization, parsing, and string comparators. Existing record linkage technology should be tested and evaluated in the 1995 census test.

Limits on the ability to eliminate duplicate records may prove to be the controlling factor with regard to the feasibility of many of the innovations under consideration for the 2000 census design.

LEGAL ISSUES

There are many legal issues associated with the decennial census, perhaps the most obvious being the content requirements mandated by the Constitution and other law. The Panel on Census Requirements is conducting a thorough investigation of these content requirements. However, legal issues with possible implications for census methods arise in at least five contexts: (1) census starting and reporting dates, (2) the use of sampling and statistical estimation, (3) sharing Census Bureau address lists with other government agencies, (4) accessing Postal Service address information, and (5) accessing administrative records for statistical purposes.

Census Reference Date

April 1 is mandated as the census reference date by Title 13 of the U.S. Code. Title 13 also mandates that the state population counts required for reapportionment be provided 9 months after the census date and that local-area data needed for redistricting be provided no later than 12 months after the census date.
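
The calendar dates implied by these provisions can be worked out with simple date arithmetic. The sketch below assumes that each deadline is the last calendar day falling within the stated number of months of the reference date; that reading, and the helper names `add_months` and `reporting_deadlines`, are assumptions made for illustration.

```python
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Same day of month, `months` later.

    Raises ValueError if that day does not exist in the target month,
    which cannot happen for the early-month dates used here.
    """
    total = d.month - 1 + months
    return date(d.year + total // 12, total % 12 + 1, d.day)


def reporting_deadlines(reference: date) -> tuple:
    """Last days within 9 months (reapportionment) and 12 months (redistricting)."""
    one_day = timedelta(days=1)
    return add_months(reference, 9) - one_day, add_months(reference, 12) - one_day


# April 1 reference date for the 2000 census.
reapportionment, redistricting = reporting_deadlines(date(2000, 4, 1))
print(reapportionment.isoformat(), redistricting.isoformat())
```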
Thus, the respective deadlines for reapportionment and redistricting data are December 31 of the census year and March 31 of the subsequent year.

The 1995 census test will use March 4 as the census reference date. At the time of this report's writing, legislation has been proposed in the House of Representatives to establish the first Saturday in March as the new census reference date. An earlier census date should alleviate problems encountered in the enumeration of households that move during census operations. Research indicates that the peak moving season in the United States begins in mid-May (Scarr, 1994). An earlier census date may also reduce difficulties in
enumerating college students, who are more likely to be in residence on or near campus, and homeless people, who are more likely to use shelters and services in colder weather.

Because moves from one housing unit to another tend to occur at the end and the beginning of a month, conducting a census using one of the first days of the month as the reference date may lead to more frequent errors of misclassification. The shift from April 1 to March 4 will probably not significantly reduce end-of-month moving problems; delivery of questionnaires (and prenotice letters, if used) takes place a few days before the reference date. A greater shift toward the middle of the month (e.g., to the second Saturday in March) would probably be needed to minimize the enumeration difficulties posed by moving households. Other countries have acted on the basis of similar considerations; Canada expects gains in accuracy in future censuses by changing its census date from June 4 to May 14 (Choudhry, 1992). (See Chapter 3 for further discussion of residential mobility.)

However, adopting the first Saturday, instead of the second Saturday, as the census reference date does possess some operational advantages. Using a census reference date early in the month will enable all phases of the mail operation, from prenotice letter to second reminder after receipt of a replacement questionnaire, to be completed by the end of the month, thus avoiding potentially significant problems when mail follow-up occurs in a different month than the original mailing. But if the latter stages of the mail operation (e.g., a second reminder card) do not prove cost-effective, their deletion would permit reconsideration of a midmonth census reference date.

In weighing alternative methods, concern has been expressed about the ability to provide data by the legislatively mandated deadlines.
The panel believes that the need for these 9-month and 12-month deadlines should be reevaluated if otherwise promising census methods would be unlikely to meet one or both dates, especially the former. This consideration could apply, for example, to any proposed methodology for integrated coverage measurement (see Chapter 4). Maintaining the December 31 and March 31 deadlines with an earlier census reference date (the current legislative proposal), such as March 4 or 11, would allow more time to implement follow-up activities and integrated coverage measurement to produce the official census estimates. Promising new methods that can reduce the differential undercount or substantially reduce the costs of the census should not be discarded on grounds of time constraints without further consideration of those legally imposed constraints.

Recommendation 2.3: In view of the operational advantages that are likely to result, the panel endorses the proposed change in census reference date from April 1 to the first Saturday in March. Furthermore, we recommend that changing the census reference date from early in the
month to midmonth (e.g., the second Saturday in March) be reconsidered if subsequent modifications to the mailout operation would permit all census mailings to be executed within the same calendar month using a midmonth reference date.

Use of Sampling and Statistical Estimation

The legal acceptability of using sampling and statistical estimation in the decennial census is supported by rulings in every U.S. District Court case and a similarly favorable position in a recent Congressional Research Service (CRS) report (Lee, 1993). In its interim report (Committee on National Statistics, 1993a), the Panel on Census Requirements, relying on reviews by legal scholars, endorsed the CRS position that sampling and statistical estimation are acceptable provided that there has first been a bona fide attempt to count everyone (e.g., by distributing a mail questionnaire). As in our interim report, our recommendations in this report are based on this premise.

The Census Bureau is considering whether Title 13 of the U.S. Code should be amended regarding the use of sampling for appropriate purposes (Scarr, 1994). The panel has no objections in principle to enacting clarifying legislation, but we do not view passage of such legislation as necessary for implementing nonresponse follow-up sampling and integrated coverage measurement in the 2000 census.

Access to Address Information

Part of the Census Bureau's research and development program for the 2000 census has involved exploration of further possibilities for cooperative working relationships with other federal agencies and state and local governments. The ability to forge stronger cooperative relationships has sometimes been hindered by the perception that there is a one-way flow of information to the Census Bureau without reciprocal benefits to the cooperative party.
This perception has been reinforced by a Supreme Court ruling that address lists without any individually identifiable data that are collected and recorded by the Census Bureau become confidential under Section 9 of Title 13 and therefore may only be seen by sworn employees of the Census Bureau.

The Census Bureau has stated in recent congressional testimony (Scarr, 1994) that it seeks a legislative change that would allow the agency to share its address lists with federal, state, and local officials to meet three objectives: (1) to improve the accuracy and completeness of the address lists; (2) to provide meaningful participation by governmental units in the census; and (3) to minimize the costs to the taxpayer for construction of duplicative address lists by various governmental agencies in order to implement public programs.

Title 39 of the U.S. Code restricts the USPS from disclosing lists of names or addresses, and similar restrictions on the Census Bureau appear in Title 13. Legal considerations thus impose constraints on USPS-Census Bureau cooperation. Special temporary legislation was obtained to permit the USPS to share detailed address information with the Census Bureau during the 1984 Address List Compilation Test (Bureau of the Census, 1992e). Similar permanent legislation might provide an opportunity for both agencies to realize significant gains in operational efficiency and consequent cost savings, but any joint activity will need to attend to confidentiality issues regarding the sharing of address lists.

The potential utility of a geographically linked master address file suggests the possibility that development and maintenance of such a system could be undertaken by a consortium of federal, state, and local agencies. Under such a scenario, of course, the Census Bureau would be a major customer. But this arrangement might allow the realization of efficiency gains more broadly and quickly across levels of government.

Development of a national address registry that is maintained outside the Census Bureau raises complex issues with regard to current statutes (Title 13, U.S. Code). If address information flows only into the Census Bureau, then changes to Title 13 are probably unnecessary. But, if new information about addresses is obtained by the Census Bureau in using the registry, then revision of Title 13 may be needed to permit such information to be forwarded to the custodian of the registry. Confidentiality issues will need to be resolved, particularly for data such as occupied or residential units that do not appear on local property rolls or in building code records that could be used for enforcement purposes.
We believe that the development of an address registry for use by multiple government agencies requires the involvement of the Statistical Policy Office in the Office of Management and Budget.

Recommendation 2.4: The Statistical Policy Office of the Office of Management and Budget should develop a structure to permit the sharing of address lists among federal agencies and state and local governments, including the Census Bureau and the Postal Service, for approved uses under appropriate conditions.

Access to Administrative Records for Statistical Purposes

Chapter 5 discusses potential uses of administrative records in the decennial census and related demographic programs. The panel is concerned that research and development to expand the use of administrative records for census purposes, with potential benefits for improved accuracy and lower cost, might be impeded by unnecessary restrictions on access to some of the administrative record systems that offer the greatest promise for such use.
In its earlier letter report (Committee on National Statistics, 1992), the panel recommended:

· The Census Bureau should seek the cooperation of federal agencies that maintain key administrative record systems, particularly the Internal Revenue Service and the Social Security Administration, in undertaking a series of experimental administrative records minicensuses and related projects, starting as soon as possible and including one concurrent with the 2000 census.

We continue to support this type of cooperative research because of its potential benefits for improved census methodology.

Provisions regarding access to administrative records are the subject of current debate, mostly in connection with proposals for health care reform. A recent letter to the chairman of the House Subcommittee on Census, Statistics, and Postal Personnel from the Committee on National Statistics (Bradburn, 1994) distinguished research and statistical uses of administrative records, which are not concerned about specific individuals, from regulatory, administrative, and enforcement uses, which do affect specific individuals. A recent report by a panel of the Committee on National Statistics and the Social Science Research Council (Duncan et al., 1993) describes effective administrative and technical practices that federal statistical agencies can adopt to protect confidentiality of information on individuals while allowing access to the information for important research and statistical purposes. Chapter 5 provides further discussion of ways to improve access to administrative records and protect confidentiality.

OPERATIONAL ISSUES

Uniform Treatment

The legitimacy of the census depends in part on public perception that it fairly treats all geographical areas and demographic groups in the country. "Fair treatment" can be defined in either of two ways: by applying the same methods and effort to every area or by attaining the same degree of population coverage in every area so that estimates of relative populations of different areas are accurate. These alternatives are in some ways analogous to the competing principles of equality of opportunity and equality of outcome in the provision, for example, of education services. The proper balance of these principles is a subject of policy debate about the provision of services. In the case of the census, however, the priorities are clear: the objective of the census is to measure population accurately and above all to calculate accurate population shares in order to apportion representation properly. Therefore, obtaining equal coverage clearly takes priority over using the same methods in every area. In fact, since experience shows that treating every geographical area and demographic group in the same way leads to differential coverage, the Census Bureau has a positive duty to use
methods designed to close the coverage gap, a duty recognized as a mandatory criterion for any 2000 census candidate design (see Chapter 1).

The approach of developing a tool kit of special methods and a planning database (described further in Chapter 3) is one of the Census Bureau's responses to this duty. This approach involves constructing a planning database, containing information on demographic and housing characteristics, to be used to identify areas at particular risk for low mail return rates or other enumeration problems. These are areas in which an accurate enumeration is likely to benefit most from the deployment of special techniques drawn from a tool kit of candidate methods such as using specially trained enumerators or address locators, opening census assistance centers, distributing forms other than by mail, and distributing some forms in languages other than English. These tool-kit methods would be applied as needed in small areas of various sizes. The decision to use any particular tool-kit method would be controlled by some combination of administrative judgment, information in the planning database, and predictions from a formal targeting model.

Past censuses have also used different treatments with the goal of achieving equal outcomes. For example, in the 1970 and 1980 censuses, the Census Bureau used enumeration by personal visit in sparsely populated areas but used a mail questionnaire in other areas. In 1990, special enumeration methods were used, often at local discretion, but their cost-effectiveness has not been well documented. The targeting model would establish a more formal structure for such applications.

Beyond the tool-kit methods discussed in Chapter 3, targeting efforts might be useful for other census operations, particularly the development of address lists and administrative record databases.
In the past, the Census Bureau selected commercial address list vendors using a criterion of gross coverage. A better criterion might be to equalize address list coverage in easy-to-count and hard-to-count areas. The use of administrative records, described in Chapter 5, is not part of the tool kit available to local census offices, but such use might also involve some targeting of efforts to particularly hard-to-enumerate areas or population groups. For example, a list of food stamp recipients could add more names to low-income areas. Other lists, such as state (driver's license) or local government (school registration) lists, would of necessity contribute to the count only in their areas of coverage.

Some critics worry that the use of special methods in certain areas (e.g., tool-kit methods, local administrative records) might make statistical assessments of coverage more difficult or might invalidate assumptions used to combine sample-based estimates and enumeration totals. This criticism must be taken seriously. Plans for 2000 call for correcting differences in coverage across areas or groups by coverage measurement and estimation. These plans are described more completely in Chapter 4. The correction methodology is likely to involve multiplying counts for each poststratum (estimation cell) by a factor that is constant across the cell. For example, a poststratum might consist of all black males ages 18-29 who live in rented homes in large urbanized areas (population over 250,000) in the West, as in the calculation of estimates from the 1990 coverage measurement program. Counts for people in this group in any smaller area would then be multiplied by the same factor.

Suppose that a method designed especially to increase response among renters is applied in all large urbanized areas in the West. Furthermore, suppose that the collection of all households whose coverage is likely to be improved by this method coincides with some combination of poststrata. In other words, suppose that every poststratum containing renters in large urbanized areas in the West consists only of such people, but does not include homeowners, people outside large urbanized areas, or people outside the West. (Again, this was true of the cell definitions used for 1990 coverage estimation.) If the special method improves coverage for these poststrata, then the statistical estimation procedure used in integrated coverage measurement will find correspondingly higher levels of coverage than it otherwise would have found, and it can properly account for the effects in producing population estimates.

It is possible that the special method would be more effective in some urbanized areas than others, just as the census without special methods has better coverage in some areas than others, but there would not be a predictable bias. We therefore believe that the use of special methods, including tool-kit methods or local administrative records, would not create any new statistical problems when applied for a geographic area or population that is recognized by the estimation procedure.
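The argument above can be illustrated with a small numerical sketch. It is not the Census Bureau's estimation procedure: the city names, populations, and coverage rates are invented for illustration, and the sketch makes the idealized assumption that coverage measurement recovers the true poststratum total exactly. Under those assumptions, a special method that raises coverage everywhere in the poststratum simply yields a smaller correction factor, and the corrected counts remain accurate.

```python
def estimate(true_pop, coverage):
    """Apply one poststratum-wide correction factor to census counts.

    true_pop and coverage are hypothetical inputs keyed by area name.
    Idealized assumption: coverage measurement recovers the true
    poststratum total exactly, so the factor is total / counted.
    """
    counts = {area: true_pop[area] * coverage[area] for area in true_pop}
    factor = sum(true_pop.values()) / sum(counts.values())
    corrected = {area: counts[area] * factor for area in counts}
    return corrected, factor


# Two hypothetical cities that fall into the same poststratum.
true_pop = {"City X": 10_000, "City Y": 4_000}

# Without the special method: 90 percent coverage in both cities.
baseline, baseline_factor = estimate(true_pop, {"City X": 0.90, "City Y": 0.90})

# With the special method applied throughout the poststratum: 95 percent.
improved, improved_factor = estimate(true_pop, {"City X": 0.95, "City Y": 0.95})

print(baseline_factor, improved_factor)
print(baseline, improved)
```

In both runs the corrected counts match the assumed true populations; the only difference is that the run with higher initial coverage relies on a smaller correction factor.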
Any improvements that the special method causes in initial coverage through enumeration and assignment are very desirable, because with high levels of initial coverage, final estimates are less dependent on estimation and mean squared error is reduced.

More serious concerns arise if special methods are applied differentially within geographic areas or subpopulations that correspond to poststrata. For example, if a special method is applied to improve coverage for renters in only one western city, but people in several cities fall into the same poststrata, then the coverage measurement procedures would not recognize the differential effect on coverage of the method, and, even after estimation, the city in which the special method was used might predictably benefit at the expense of other western cities.

Several points should be considered in defense of census procedures that treat different areas differently. First, with any practical poststratification scheme (cell definition), there will be some heterogeneity within the cells, both in the underlying conditions affecting census coverage and in the conduct of the census. This has always been the case; for example, mail return rates and district office closeout dates varied substantially in the 1990 census. Second, if special enumeration methods can be targeted to areas that have relatively low coverage for their poststrata, these methods may make coverage more homogeneous within
the poststratum, so population counts after estimation will be made more accurate. Thus, differences in treatment can be justified by local differences in conditions, especially if the decision to use a special method is determined by an objective decision procedure. If decisions are based on knowledge about the distribution of hard-to-count populations, such differences in treatment will tend to reduce differentials in outcome. Third, differences in treatment based on objective measures of the difficulty of enumeration or the usefulness of particular techniques in different areas are more justifiable than those that result from haphazard implementation of coverage improvement programs or the assertiveness and technical capabilities of local authorities.

However, certain practices that may arouse justified suspicion should be avoided. If special enumeration methods are targeted to certain areas but not to others with similar characteristics, their application will appear to be arbitrary. The same will be true if they are targeted toward only some ethnic or socioeconomic groups but not others with similar undercoverage problems. Systematic and complete planning for the use of these methods, based on objective criteria, can defend against the appearance of arbitrariness.

Inevitably, there will be different levels of success in operations of various district offices due to varying local conditions. By considering in advance rules for closeout of district office operations and for distribution of additional resources to district offices (rules that are designed to optimize uniformity of coverage within the constraints of varying conditions), the Census Bureau will do its best to produce uniform coverage. Of course, as in past censuses, the actual degree of uniformity attained will be limited by practical constraints.
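The earlier concern about a special method applied in only one city of a poststratum can also be made concrete. The sketch below uses invented cities, populations, and coverage rates, and again assumes, as an idealization, that coverage measurement recovers the true poststratum total exactly. It is an illustration of the reasoning only, not the actual estimation procedure.

```python
def estimate(true_pop, coverage):
    """Apply one poststratum-wide correction factor to census counts.

    Idealized assumption: the true poststratum total is known exactly,
    so the single factor is (true total) / (counted total).
    """
    counts = {area: true_pop[area] * coverage[area] for area in true_pop}
    factor = sum(true_pop.values()) / sum(counts.values())
    return {area: counts[area] * factor for area in counts}

# Two hypothetical cities in the same poststratum.
true_pop = {"City X": 10_000, "City Y": 4_000}

# The special method is used only in City X, raising its coverage
# to 95 percent while City Y stays at 90 percent.
corrected = estimate(true_pop, {"City X": 0.95, "City Y": 0.90})

for area, value in corrected.items():
    print(area, round(value), "vs. assumed true", true_pop[area])
```

Because one factor is shared across the cell, City X ends up above its assumed true population and City Y below it, even though the poststratum total is reproduced. Targeting the method instead to the city with lower baseline coverage would narrow rather than widen the gap, which is the second point made in defense of differential treatment.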
Paradoxically, the Census Bureau's improved capabilities and success in tracking census operations, together with growing knowledge and awareness about factors that may affect differential undercount, create a climate in which even more than usual care must be given to avoid any appearance of arbitrariness or favoritism.

Residence Rules

The residence rules that define where each person should be counted in the census are crucial to census coverage for several reasons. First, it is important to have consistent rules so that each person is counted in only one place (especially when matching records or eliminating duplication from multiple information sources is done). Second, people should be assigned to the correct location, as defined by the residence rules in effect. Third, people should not be excluded solely because the residence rules do not easily apply to them.

The Census Bureau has conducted the census on the basis of de jure rather than de facto residence; that is, people are essentially asked "What was your usual residence on Census Day?" rather than "Where did you actually stay on Census Day?" De jure enumeration asks people to report themselves where the
"rules" say they should be counted; a de facto approach would collect information on where people are found (see further discussion in Chapter 3). The de jure approach has the advantage of defining residency in a way that does not depend on what happened on a particular day, but it requires that the respondent understand and apply the Census Bureau's definition of "usual residence." This task can be difficult for people whose de jure residency is hard to determine or who have none at all, such as homeless people and young people who move about from place to place.

Attention must be given to defining residency consistently throughout all stages of census operations: questionnaire mailing and questionnaire return and subsequent nonresponse follow-up and coverage measurement activities. The definition of residency is particularly critical for coverage measurement programs (see Chapter 4), because they must determine Census Day residency weeks or months after the fact.

The concept of residency reappears at various points in subsequent chapters. Chapter 3 contains further discussion of residence rules and reviews related research that is aimed at improving within-household coverage and handling complicated living situations. That chapter also considers ideas for collecting a "census night" roster followed by questions to assign de jure residence. Administrative records vary in how they define residency, both because of the different purposes and laws under which they are collected and because some are continuously updated while others follow set time schedules. Uses of specific administrative record systems will have to take into account the definitions used and the frequency of updating of their residence information (see Chapter 5).
Continuous Infrastructure

Common sense, complemented by anecdotal evidence, suggests several benefits associated with the Census Bureau's maintaining a continuous presence in local areas throughout the decade. Ongoing activities could contribute to more effective outreach and promotion, thus improving public response and decreasing costs associated with nonresponse follow-up. (This theme is explored more fully in Chapter 3.) Organizational efficiencies might be realized by reducing the number of temporary staff needed in the 10-year census cycle. The potential benefits could be especially significant if a continuous measurement program is adopted. Chapter 6 assesses the pros and cons of a continuous measurement census design. The Census Bureau is planning to continue work to develop this option in parallel with the 1995 census test.

As in the evaluation of tool-kit enumeration methods, the value of maintaining a continuous presence will have to be weighed against the costs of putting in place the necessary structure, staff, training, and tools. Some benefits can be readily quantified; others are more qualitative and may require subjective assessment.