The 2000 Census: Counting Under Adversity

Appendix C
Census Operations

This appendix provides additional detail on the operations of the 2000 census, noting differences from 1990 census procedures. It covers five topics: the Master Address File (MAF) (including local review and internal checks for duplicate addresses); questionnaire delivery and mail return (including redesign of mailings and materials and multiple response modes); field follow-up (including nonresponse follow-up, NRFU, and coverage improvement follow-up, CIFU); outreach efforts; and data processing (including data capture, coverage edit and telephone follow-up, unduplication of households and people, and other processing). Two important parts of census data processing, editing and imputation, are described in greater detail in separate appendices for the basic (complete-count) data (Appendix G) and the long-form-sample data (Appendix H). General theory and approaches to item imputation are discussed in Appendix F.
C.1 MASTER ADDRESS FILE

The 2000 census was conducted primarily by mailing or delivering questionnaires to addresses on a computerized mailing list, the MAF, and asking residents to fill out the questionnaires and mail them back.¹ The Census Bureau first used mailout/mailback techniques with an address list in the 1970 census,² but the procedures to develop the 2000 MAF differed in several important respects from those used in past censuses (see Working Group on LUCA, 2001; Owens, 2000). The major difference from 1990 was that the 2000 MAF was constructed using more sources.

C.1.a Initial Development

The Census Bureau used somewhat different procedures to develop the MAF for areas believed to have predominantly city-style mailing addresses (house number and street) than for areas believed to have predominantly rural route and post office box mailing addresses (see Box C.1). City-style areas were those inside the “blue line,” and non-city-style areas were those outside the “blue line.”³ For areas inside the blue line, the Bureau expected to have U.S. Postal Service carriers deliver questionnaires to most addresses on the list; for areas outside the blue line, the Bureau expected to use its own field workers to deliver questionnaires. For remote rural areas, which have less than 1 percent of the population, Census Bureau enumerators developed the address list concurrently with enumerating households in person. For special places in which people live in nonresidential settings, such as college dormitories, prisons, nursing homes, and other group quarters, the Bureau used a variety of sources to develop an address list.

¹ The Census Bureau refers to the version of the MAF that was used in the census as the Decennial Master Address File or DMAF. It is an extract of the full MAF, which includes business as well as residential addresses. Use of the term MAF in this report refers to the DMAF.
² Unaddressed short-form questionnaires were delivered by the U.S. Postal Service in the 1960 census to 80 percent of households, but residents were to hold the questionnaires for enumerators to pick up. At every fourth household, enumerators left a long-form questionnaire, which respondents were to fill out and mail back (National Research Council, 1995b:189).
³ The blue line was a late-1997 Census Bureau demarcation.
Inside the “Blue Line”

As the starting point for the MAF for city-style areas inside the blue line, the Census Bureau took the 1990 census address list for these areas and updated it from the Delivery Sequence File (DSF) of the Postal Service. The DSF contains a listing of addresses to which mail is delivered, ordered by carrier routes. It is updated regularly. The Census Address List Improvement Act of 1994 (P.L. 103-430) allowed the Postal Service to share the DSF with the Bureau.

Although not part of its original plan, the Bureau determined that a complete field check of the city-style list should be conducted, which was done in a block canvass operation for all mailout/mailback areas conducted in January–May 1999. The reason for the complete block canvass was the determination that the DSF was not as accurate or as up to date in all areas as needed for the MAF. The Bureau also provided an opportunity for local review in 1998 and 1999 (see Section C.1.b).

Approximately 101 million addresses were included in the MAF for areas inside the blue line at the time when questionnaires were labeled and prepared for mailing in July 1999. The Postal Service conducted an intensive check of the DSF in early 2000, and updates were made to the MAF based on that check prior to questionnaire delivery.

Outside the “Blue Line”

To develop the MAF for non-city-style areas, the Bureau first conducted a complete address listing operation in July 1998–February 1999. The 1990 list was not used. There was also a local review program for areas outside the blue line in 1999. Approximately 21 million addresses were included on the MAF for areas outside the blue line at the time when questionnaires were labeled and prepared for delivery. Census enumerators further updated the MAF in these areas when they delivered questionnaires in February–March 2000.
Box C.1 Basic Steps to Develop the Master Address File Prior to Census Day, 2000 and 1990

2000 CENSUS MASTER ADDRESS FILE

City-Style Areas (mailout/mailback areas inside the “blue line”)
• Start with the 1990 Census Address Control File.
• Refresh the 1990 list with periodic updates of the U.S. Postal Service Delivery Sequence File.
• Conduct complete block canvass in the field in January–May 1999 (not in original plans).
• Provide opportunity for counties, minor civil divisions, places, and tribal governments to review the MAF for their areas in the Local Update of Census Addresses Program, called LUCA 98. LUCA 98 spanned February 1998–March 2000; it included:
  – local review of initial MAF and census maps;
  – Census Bureau verification of address changes provided by localities (reconciliation);
  – local review of feedback/final determination materials from the Bureau; and
  – review of local appeals by Census Address List Appeals Office in the U.S. Office of Management and Budget.
• Provide opportunity for localities to review the address list for dormitories, nursing homes, and other special places in December 1999–April 2000 (Special Places LUCA).
• Incorporate updates from Postal Service’s final intensive check of Delivery Sequence File prior to questionnaire delivery.
• Provide opportunity for localities to supply addresses for newly constructed housing units in January–March 2000 to be enumerated in summer 2000 (New Construction LUCA).

Non-City-Style Areas (update/leave areas outside the “blue line”)
• Conduct an address list creation operation in July 1998–February 1999.
• Provide opportunity for localities to review the list in the LUCA 99 Program. LUCA 99 spanned July 1998–March 2000. It was similar to LUCA 98 except that localities were asked to challenge housing unit counts for blocks in their initial review. They had to challenge and provide evidence for specific addresses in the appeals phase.
• Provide opportunity for localities to review the address list for special places.
• Instruct Census Bureau enumerators to update the MAF when they drop off questionnaires in February–March 2000.

1990 ADDRESS CONTROL FILE

City-Style Areas
• Purchase lists from two vendors; supplement with field listing operation.
• Recheck the vendor-supplied lists in 1989 in a complete block canvass.
• Provide the opportunity for localities to review housing unit counts by block in summer 1989.
• Have the Postal Service conduct several checks in 1988–1990.

Non-City-Style Areas
• Conduct an address list creation operation (called prelisting) in fall 1989.
• Instruct Census Bureau enumerators to update the list when they drop off questionnaires.

C.1.b Local Review

The Census Address List Improvement Act of 1994, which allowed the Postal Service to share the DSF with the Census Bureau, also permitted the Census Bureau to invite local governments to review the MAF for their areas and provide additions, deletions, and corrections to the Bureau (Working Group on LUCA, 2001).⁴ The Local Update of Census Addresses (LUCA) Program, covering counties, places, and minor civil divisions (over 39,000 jurisdictions in all), was conducted separately in areas inside the blue line (LUCA 98) and areas outside the blue line (LUCA 99). There was also a Special Places LUCA Program.

LUCA required participating local governments to sign a pledge to treat the address list as confidential. The program involved several steps of local review, field verification by the Bureau, and appeal to the U.S. Office of Management and Budget when localities disagreed with the Bureau’s decision to reject local changes to the MAF. Due to time constraints, some planned LUCA operations were combined and rescheduled (see Chapter 4, Table 4.2).

In response to local concerns, a New Construction LUCA Program was added to give localities inside the blue line an opportunity during January–March 2000 to identify newly constructed housing units. Addresses identified in the program were not mailed questionnaires; instead, they were visited by enumerators during the coverage improvement follow-up operation in summer 2000.
Of the 39,051 jurisdictions that were eligible for either or both LUCA 98 or LUCA 99, it is estimated that 25 percent participated fully in one or both programs by informing the Census Bureau of needed changes to the address list for their area (Working Group on LUCA, 2001:Ch.2). Participation varied by such characteristics as geographic region of the country, population size of jurisdiction, type of government, and city-style or non-city-style area (see Chapter 4, Table 4.4).

⁴ Local review procedures were used in the 1980 and 1990 censuses, but localities were not permitted to examine the list of individual addresses for their areas.

C.1.c Further Development of MAF

The MAF was a dynamic file during the operation of the census. Addresses were added at each stage of census field operations, and they were also deleted in an effort to minimize duplicate and erroneous entries. In total, the Census Bureau estimates that about 4 million addresses were added to the MAF during census field operations: 2.3 million addresses during questionnaire delivery in update/leave, update/enumerate, and list/enumerate areas (see Section C.2) and 1.7 million addresses during follow-up. About 10.4 million addresses were removed as duplicative of other addresses or nonexistent. About 5 million of these addresses were removed on the basis of two internal consistency checks, one of which was planned and the other of which was designed and implemented while the census data were being processed; the remaining addresses were deleted on the basis of field operations (see Section C.3). Whether the combination of internal checks and field checks reduced duplicate and erroneous addresses to a minimum or went too far or not far enough is a matter for evaluation (see Section 4-E.2). The final number of occupied and vacant housing units counted in 2000 was 115.9 million (Farber, 2001a:Tables 1, 2).

C.1.d Internal Checks for Duplicates

Reducing the NRFU Workload

In April 2000 the Census Bureau conducted an internal consistency check of the MAF prior to the beginning of nonresponse follow-up in order to remove from the NRFU workload as many addresses as possible that could clearly be identified as duplicative or nonexistent (Miskura, 2000a). At the conclusion of this operation, 3.6 million addresses were dropped or merged with another MAF address.
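The address counts quoted in Sections C.1.c and C.1.d can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function and key names are hypothetical, the inputs are the rounded figures from the text (in millions of addresses), and the derived remainders are approximations implied by those rounded figures, not Census Bureau estimates.

```python
def maf_accounting(added_delivery, added_followup, removed_total,
                   removed_internal, removed_first_check):
    """Derive totals and remainders from the reported counts (millions)."""
    return {
        # Additions during questionnaire delivery plus follow-up.
        "added_total": round(added_delivery + added_followup, 1),
        # Removals not attributed to the two internal checks.
        "removed_in_field": round(removed_total - removed_internal, 1),
        # What is left for the second (summer 2000) internal check.
        "removed_second_check": round(removed_internal - removed_first_check, 1),
    }

# Rounded figures from the text: 2.3 and 1.7 added; about 10.4 removed,
# about 5 of them by internal checks, 3.6 of those in April 2000.
summary = maf_accounting(2.3, 1.7, 10.4, 5.0, 3.6)
for name, value in summary.items():
    print(f"{name}: {value:.1f} million")
```

Under these rounded inputs, the additions sum to the "about 4 million" stated above, and roughly 5.4 million removals are left to field operations.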
One source of potential duplicates and errors came about because LUCA, essentially a new, untested program, did not run as smoothly as intended (Working Group on LUCA, 2001). Because of delays in providing materials to local governments to review, the Census Bureau agreed to include every address provided by a LUCA participant on the MAF that was used to label questionnaires in July 1999, even when there had not been time to verify the address in the field. LUCA-supplied addresses that the Bureau believed likely did not exist, based on field checks after July, were flagged. Processing specifications were developed to delete many of these addresses and other addresses of doubtful existence when no questionnaire was returned for them. In all, 2.5 million addresses that the Bureau had reason to believe did not exist were deleted from the MAF prior to nonresponse follow-up.

Also as part of this review, the Bureau attempted to identify duplicate addresses originating from LUCA or other sources. About 1.1 million addresses were merged with another address on the MAF when the addresses appeared to be exact duplicates. Follow-up was conducted either only for the one (merged) address or not at all if a questionnaire had been received for that address.⁵

Unduplication and Late Additions

Another important set of MAF internal checks, not previously planned, was put into place in summer 2000. From evaluations of MAF housing unit counts during January–June 2000 against estimates prepared from other sources, such as building permits, the Census Bureau determined that there were likely still a sizable number of duplicate addresses on the MAF (West and Robinson, 2001). Field verification carried out in June 2000 in a small number of localities substantiated this conclusion (Nash, 2000).
Consequently, the Bureau mounted a special operation to identify duplicate addresses and associated duplicate census returns to remove them from the MAF and the census. Software was written for this operation to match addresses and person records to identify potential duplicates. The flagged records were deleted from the census file of valid, completed returns and further examined. After examination, it was decided that a portion of the potential duplicates were likely valid returns for addresses not already in the census, and they were restored to the census file (late additions). At the conclusion of the operation, 1.4 million housing units and all 3.6 million people in those units were permanently deleted from the census file, from a total of 2.4 million housing units and all 6.0 million people in those units that had been initially flagged as potential duplicates (Miskura, 2000b).

⁵ If questionnaires were received for two addresses that were deemed to be exact duplicates, the Primary Selection Algorithm checked for duplicate enumeration and determined the census household (see Section C.5.c).

C.1.e Comparison: Address List Development in 1990

The procedures used to develop the 1990 Address Control File (ACF) differed in important respects from those used to develop the 2000 MAF (see Box C.1). Overall, the Census Bureau used fewer sources in developing the 1990 ACF than it used for the 2000 MAF; also, the 1990 local review operation was considerably less extensive than the 2000 LUCA Program (see National Research Council, 1995b:App.B).

For 1990 in areas with city-style addresses, the Census Bureau made no use of the 1980 census address list or the Postal Service DSF. Instead, the starting point for the ACF was two files of lists purchased from vendors, supplemented by a field listing operation carried out by census field staff in summer 1988 (precanvass). The Postal Service performed several reviews of the list in 1988–1990; Bureau staff also checked the part of the ACF that derived from commercial lists in a block canvass in 1989. Governmental jurisdictions in the city-style areas were given an opportunity for review in summer 1989; however, they could not review specific addresses but only counts of addresses at the block level. About 16 percent of eligible local governments responded, adding about 400,000 housing units to the ACF (Bureau of the Census, 1993:6-44).
By comparison, twice as many eligible governments (36 percent) participated in the LUCA 98 Program in city-style areas.

In areas with non-city-style addresses, the development of the 1990 address list was similar to that in 2000; census field staff conducted an address listing operation in fall 1989. Census enumerators also checked the list in March 1990 when they delivered questionnaires in the areas in which the update/leave technique (new for the 1990 census) was used. However, there was no precensus local review program for the ACF in these areas.

C.2 QUESTIONNAIRE DELIVERY AND MAIL RETURN

The 2000 census, like the 1980 and 1990 censuses, was conducted primarily by delivering questionnaires to households and asking them to mail back a completed form. Procedures differed somewhat depending on such factors as type of addresses in an area and accessibility; in all, there were nine types of enumeration areas. Box C.2 provides brief descriptions of the nine types in 2000.

The two largest types of enumeration areas were: (1) mailout/mailback, covering almost 82 percent of the population, in which Postal Service carriers delivered questionnaires, and (2) update/leave/mailback (usually termed update/leave), covering almost 17 percent of the population, in which Census Bureau field staff delivered questionnaires and updated the MAF at the same time. These two types, together with small numbers of addresses in areas (6), (7), and (9), made up the mailback universe, covering about 99 percent of the household population (calculated from Baumgardner et al., 2001). The remaining 1 percent of the household population was counted by census enumerators (see areas (3), (4), (5), and (8) in Box C.2). Separate enumeration procedures were used for such special populations as homeless people, residents of group quarters, and transients (see Citro, 2000c).

Approaches to boost mail response were to redesign the questionnaire and mailing package, adapt enumeration procedures to special situations (the reason for having nine types of enumeration areas), and allow multiple modes for response. Advertising and outreach efforts were also expanded from 1990 (see Section C.4).
The final mail response rate in 2000 (67 percent) was slightly higher than the rate in 1990 (65 percent); it was also considerably higher than the rate that was budgeted (61 percent), which reduced the burden of field follow-up. The mail return rate in 2000 (78 percent) was higher than the rate in 1990 (75 percent). This rate is a more refined measure of public cooperation than the mail response rate, which includes vacant and nonresidential addresses in the denominator in addition to occupied housing units (see Chapter 4, Box 4.1).
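The difference between the two rates comes down to the denominator. The following is a minimal sketch under simplified definitions taken from the paragraph above: the mail response rate divides mail returns by all addresses that were sent a questionnaire (occupied, vacant, and nonresidential), while the mail return rate divides by occupied housing units only. The function names and the counts are hypothetical and are not census figures; the authoritative definitions are in Chapter 4, Box 4.1.

```python
def mail_response_rate(returns, occupied, vacant, nonresidential):
    """Returns as a share of all addresses in the mailback universe."""
    return returns / (occupied + vacant + nonresidential)

def mail_return_rate(returns, occupied):
    """Returns as a share of occupied housing units only."""
    return returns / occupied

# Hypothetical counts for a single area (not census data).
returns, occupied, vacant, nonresidential = 780, 1000, 120, 40

response = mail_response_rate(returns, occupied, vacant, nonresidential)
ret = mail_return_rate(returns, occupied)

print(f"mail response rate: {response:.1%}")
print(f"mail return rate:   {ret:.1%}")
```

Because vacant and nonresidential addresses cannot mail back a form, the return rate is at least as high as the response rate whenever every return comes from an occupied unit, which is why it is the more refined measure of public cooperation.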
Box C.2 Types of Enumeration Areas (TEAs)

(1) Mailout/mailback
In areas with predominantly city-style addresses (inside the blue line), U.S. Postal Service carriers delivered an address-labeled advance letter to every housing unit on the MAF the week of March 6. In mid-March the carriers delivered address-labeled questionnaires, followed 2 weeks later by a reminder postcard. Households were to fill out the questionnaire and mail it back.

(2) Update/leave
In areas outside the blue line in which there were many rural route and post office box addresses that could not be tied to a specific location, census enumerators dropped off address-labeled questionnaires to housing units in their assignment areas. At the same time, they checked the address list and updated it to include new units not on the list, noting for each its location on a map (map spot), so that follow-up enumerators could find units that did not mail back a questionnaire. The update/leave effort began in late February in some areas and continued through March.

(3) List/enumerate
In remote, sparsely populated, and hard-to-visit areas, census enumerators combined address listing and enumeration. There was no MAF for these areas created in advance. The enumerators searched for housing units, listed each unit in an address register (also its map spot), and enumerated the household at the same time. This operation was conducted in March–May 2000.

(4) Remote Alaska
The enumeration procedure in remote areas of Alaska was similar to list/enumerate. It was conducted earlier (in February) before ice breakup and snow melt.

(5) Rural update/enumerate
It was determined in some instances that blocks originally planned to be enumerated by update/leave would be better handled by a procedure in which address list updating and enumeration were conducted concurrently. “Rural” refers to the source of the address list: the address listing and LUCA 99 operations conducted outside the blue line.

(6) Military
Mailout/mailback procedures were used for all residential blocks on military bases (excluding group quarters). Such blocks in type 2 enumeration areas (but not those in type 1 enumeration areas) were assigned an enumeration area code of 6 because there was no need to update the address list or provide map spots.

(7) “Urban” update/leave
It was determined that some blocks originally planned to have questionnaire delivery by the Postal Service would be better handled by having census enumerators follow an update/leave procedure. Such blocks contained older apartment buildings that lacked clear apartment unit designators, or they had many residents, despite having city-style addresses, who elected to receive mail at post office boxes. “Urban” refers to the source of the address list: the 1990 list updated by the DSF, the LUCA 98 Program, and the Postal Service check in early 2000. No map-spotting was needed for these addresses.

(8) “Urban” update/enumerate
Some American Indian reservations contained blocks in more than one TEA. In these instances, all blocks in the reservation were enumerated using update/enumerate methods (see TEA 5). However, those blocks for which the mailing list was developed using “urban” procedures and for which no map-spotting was required were assigned a TEA code of 8 and not 5.

(9) Mailout/mailback conversion to update/leave
Some blocks originally in TEA 1 areas contained a significant number of non-city-style addresses. They were identified and converted to “rural” address listing procedures before the urban block canvassing operation was carried out in 1999; they were reviewed as a special component of the LUCA 99 Program.

NOTE: For details, see U.S. Census Bureau (1999b).
OCR for page 398
forecast the extent of the decline in the mail response rate from 1980 to 1990: the Bureau projected a 70 percent response rate (down from 75 percent in 1980), but the actual rate at the time NRFU began was 63 percent (the rate subsequently rose to 65 percent). The Bureau had to obtain additional appropriations and scramble to hire sufficient workers for NRFU and other follow-up activities; it raised pay rates in 140 of the 449 district offices (equivalent to LCOs) and took other steps to increase productivity. The NRFU operation was planned to take 6 weeks from when it began in late April; however, only 72 percent of the workload was completed by that time (by June 6). Another 18 percent of the workload was completed in 2 more weeks, but it took another 6 weeks (until early August) to complete the remaining 10 percent of the workload (U.S. General Accounting Office, 1992:46).

A subsequent stage of follow-up in 1990 included several coverage improvement procedures (Bureau of the Census, 1993:6-37 to 6-38; 6-53 to 6-56). An operation called field follow-up, carried out in June–August, rechecked most units classified as vacant or delete in NRFU. Units that were not rechecked included those in areas with high proportions of seasonal housing or boarded-up buildings, plus units classified as delete by two precensus address update operations and a NRFU enumerator (a more stringent criterion than that used in 2000). By August 1, 5.3 percent of deleted units and 7.1 percent of vacant units that were rechecked in field follow-up were converted to occupied. (The corresponding percentages in 1980 were 7.5 percent of deleted units and 10 percent of vacant units converted to occupied.) These figures are considerably below the rate of conversion from vacant or delete to occupied in the 2000 CIFU (24 percent).

In addition to the recheck of vacant and delete units, the 1990 field follow-up operation revisited failed-edit mail returns.
These cases were mail returns that failed computer or clerical review with regard to completeness of coverage and content (Bureau of the Census, 1995b:8-10) and for which telephone follow-up was not successful (see Section C.5). Because of backlogs in the telephone follow-up operation for questionnaires handled by processing offices (those from central city areas), after mid-June most questionnaires in these offices that failed the content review and were not resolved by telephone were not sent to field follow-up. The 1990 field follow-up also revisited a number of mailback cases for which there was no record of data capture.

Another 1990 coverage improvement operation was the “Were You Counted” campaign, in which people who thought they had been missed were encouraged by media announcements in June–July 1990 to send in a special form. Those forms with addresses that could be assigned to census geography and with complete content were put through a search and matching operation to determine if they duplicated other forms. There was no field verification of the address, except in the Detroit district office, from which an unusually large number of forms were received.

Another special operation was the recanvass, carried out in July–November 1990, in which selected blocks, including those in high growth areas and those identified by postcensus local review, were relisted. The households were then reenumerated, provided the enumerator determined that the unit existed as of April 1. In all, the Bureau recanvassed more than 650,000 blocks containing about 20 million housing units (20 percent of all units).

Blocks identified for recanvassing by localities came about because in 1990 (though not 2000), local jurisdictions nationwide were invited to review preliminary census counts of housing units by block for their areas (Bureau of the Census, 1993:6-45 to 6-46). The counts were provided in August 1990, and localities had 15 days to challenge them. Responses were received from about 25 percent of all jurisdictions, including all of the 51 largest cities. All challenged blocks in which the discrepancy between the census count and that provided by the locality exceeded a specified amount were added to the recanvass operation, for which additional funding had to be obtained.
As part of the coverage improvement effort in 1990, in 24 local offices, all households for which the questionnaires reported only one household member were reenumerated. This procedure was implemented in response to allegations in late summer 1990 that enumerators in some offices during the closeout phase of NRFU had recorded households as one-person households without actually obtaining an interview (i.e., they were curbstoning). In addition, seven local offices in New Jersey were identified in which it appeared that fabrication may have occurred; households in these offices were reinterviewed when the questionnaires indicated household size but recorded no members (Bureau of the Census, 1993:6-55).

Finally, a special program was implemented to improve the coverage of people who were on parole or probation (Bureau of the Census, 1993:6-55). The first step was to contact each state to ask its parole or probation officers to distribute census forms to their assignees to be filled out and mailed back. This operation had a very low response rate, so census enumerators were sent to correction departments in designated counties to obtain information for parolees and probationers from administrative records. No attempt was made to contact parolees or probationers unless their addresses could not be verified. The operation was not completed until late November–early December 1990. The forms obtained were processed through an unduplication operation (see Section C.5); however, subsequent analysis determined that many of the parolee/probationer forms that were accepted in the census count represented erroneous enumerations (Ericksen et al., 1991:43–46).

C.3.d Summary: 1990 and 2000

The description of 2000 and 1990 follow-up procedures makes it clear that they were large-scale, complex operations, similar in broad outline but sufficiently different in detail to make it difficult to compare results across years. It is difficult, for example, to compare results from the 2000 CIFU recheck of vacant and delete units with the 1990 field follow-up vacancy check because of differences in how the workload was defined. Also, it is not clear exactly how such terms as “proxy” (2000), “last resort” (1990), “closeout,” and “non-data-defined” were similar or dissimilar, again complicating the task of comparative evaluation.
One can, however, conclude that the Census Bureau was more successful in 2000 than in 1990 in controlling field follow-up operations and keeping them on schedule. Coverage improvement operations were more focused, and programs that appeared problematic in 1990 (e.g., the parolee and probationers check) were not repeated in 2000.
C.4 OUTREACH EFFORTS

To supplement field operations and special programs to improve population coverage and cooperation with the census, the Census Bureau engaged in large-scale advertising and outreach efforts for 2000. For the first time, the Census Bureau budget included funds ($167 million) for a paid advertising campaign (recommended by National Research Council, 1978). In previous censuses, the Advertising Council arranged for advertising firms to develop ads and air them on a pro bono, public service basis (Anderson, 2000).

The 2000 advertising campaign was extensive, involving a major contractor, Young and Rubicam, which contracted with four other agencies to prepare ads targeted to particular population groups and communities. The advertising ran from November 1, 1999, to June 5, 2000, and included a phase to alert people to the importance of the upcoming census, a phase to encourage filling out the form, and a phase to encourage people who had not returned a form to cooperate with the follow-up enumerator. Ads were placed on television (including one during the 2000 Super Bowl), radio, newspapers, and other media, using multiple languages. Based on market research, the ads stressed the benefits to people and their communities from the census, such as better targeting of government funds to needy areas for schools, day care, and other services.

In addition to the ad campaign, the Census Bureau hired partnership and outreach specialists in local census offices, who worked with community and public interest groups to develop special initiatives to encourage participation in the census. The Bureau signed partnership agreements with over 100,000 organizations, including federal agencies, state and local governments, business firms, nonprofit groups, and others. The Bureau did not fund these groups, but it provided materials and staff time to help them encourage a complete count.
A special program was developed to put materials on the census in local schools to inform schoolchildren about the benefits of the census and motivate them to encourage their adult relatives to participate. The Census Bureau director and other staff made numerous public appearances throughout the census period to stress the importance of a complete count and respond to questions and concerns. The director also put into place a program to use the Internet to challenge communities to raise their mail response rates. The 1990 response rates were posted for local areas on the Bureau’s Web site beginning in mid-March, and 2000 response rates were regularly updated on the site through mid-April. Communities were challenged to exceed their 1990 rates by 5 percent. Although few communities achieved this goal, the overall response rate did not continue its decline from previous censuses.
The 1990 census had also included advertising and outreach efforts; however, their extent was less than in 2000. The advertising was prepared by a firm selected by the Advertising Council, which conducted its work on a pro bono basis. Ads were placed as public service announcements, which meant that many ads ran at undesirable times. The partnership program was not as extensive as in 2000.
In both censuses, perhaps more so in 2000, advertising and outreach efforts varied in intensity across the country. Some localities were more active than others in coordinating and supplementing outreach and media contacts. Whether this variability narrowed or widened the difference in net undercount rates among major population groups depends on the extent to which outreach efforts were more (or less) effective in hard-to-count areas in comparison with other areas.
C.5 DATA PROCESSING
Data processing for the 2000 census was a continuing, high-volume series of operations that began with the capture of raw responses and ended with the production of voluminous data products for the user community, which were made available in 2001–2003.[11] Important innovations were adopted for 2000. For the first time, the Census Bureau contracted with outside vendors for major components of data processing. Also for the first time, data capture operations were carried out using optical character recognition technology in addition to optical mark recognition.
A telecommunications network linked Census Bureau headquarters in Suitland, Maryland; 12 permanent regional offices; the Bureau’s permanent computer center in Bowie, Maryland; 12 regional census centers and the Puerto Rico Area Office; the Bureau’s permanent National Processing Center in Jeffersonville, Indiana; 3 contracted data capture centers (in Phoenix, Arizona; Pomona, California; and Baltimore County, Maryland); 520 local census offices; and contracted telephone centers for questionnaire assistance (U.S. Census Bureau, 1999b:XI-1).
[11] Data processing also included a series of computer systems for management of operations, including payroll, personnel, and management information systems.
Five operations in 2000 are described in this section: data capture, coverage edit and telephone follow-up, unduplication, editing and imputation, and other data processing. Data processing operations for 1990 are also summarized.
C.5.a Data Capture
The first step in data processing was to check in the questionnaires and capture the data on them in computerized form. The return address on mailback questionnaires directed them to one of four data capture centers: the Bureau’s National Processing Center and three run by contractors. Each questionnaire had a bar code that was scanned to record its receipt. The questionnaires were then imaged electronically, check-box data items were read by optical mark recognition (OMR), and write-in character-based data items were read by optical character recognition (OCR). Clerks keyed data from the images when the automated technology could not interpret a response. Keying of the additional long-form-sample items was deferred until fall 2000 to permit the fastest possible processing of the basic (complete-count) data from short and long forms.
C.5.b Coverage Edit and Telephone Follow-Up
The data on mailed-back questionnaires were reviewed by computer to identify those returns that failed coverage edit specifications. These failed-edit cases were reinterviewed by telephone, using contractor-provided clerical telephone staff. The workload for the coverage edit and telephone follow-up operation totaled about 2.3 million cases.
It included returns that reported more, or fewer, household members in question one (“How many people were living or staying in this house, apartment, or mobile home on April 1, 2000?”) than the number of members for which individual information (e.g., age, race, sex) was provided; returns in which question one was left blank and individual information was provided for exactly six people (the limit of the space provided on the mail questionnaires); and returns that reported household counts of seven people or more.
The purpose of the edit and telephone follow-up was to reduce coverage errors in the households selected for follow-up and to obtain basic characteristics for household members for whom the household had no room to report their characteristics on the form. Missing characteristics were not collected for household members for whom only some characteristics were reported. There was no field follow-up for failed-edit households for which telephone follow-up was unsuccessful.
Because of computer problems, the start of the coverage edit and telephone follow-up operation was delayed. Originally planned to be conducted in April–June 2000, it was carried out in May through mid-August.
C.5.c Unduplication of Households and People
Two major, computer-based unduplication operations were carried out subsequent to field follow-up. One of those operations, the use of the primary selection algorithm (PSA) to unduplicate multiple returns for the same address, was planned from the outset and is described below. The other operation, the use of special software and procedures to reduce duplication of addresses in the MAF, was designed and implemented in summer 2000 to respond to evidence of duplicate addresses not eliminated by previous processing (described in Section C.1.d). The special unduplication operation used the results of the PSA; final determination of which returns to delete from the census because they duplicated a return from another MAF address was not made until after the PSA had processed multiple returns for the same address.
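As an aside on the coverage edit described in Section C.5.b, the three conditions that sent a mail return to telephone follow-up can be written as a short check. This is only an illustrative sketch: the record layout (`MailReturn`), its field names, and the function name are hypothetical, and the Bureau's actual edit specifications may have included details not given in the text.

```python
from dataclasses import dataclass
from typing import Optional

# The mail form had room for individual data on six people (per the text).
FORM_PERSON_LIMIT = 6


@dataclass
class MailReturn:
    # Hypothetical record layout, for illustration only.
    reported_count: Optional[int]  # answer to question one; None if left blank
    persons_with_data: int         # people with individual information


def fails_coverage_edit(ret: MailReturn) -> bool:
    """Return True if the return meets any of the three failure conditions."""
    if ret.reported_count is None:
        # Blank count with a completely filled form: others may be missing.
        return ret.persons_with_data == FORM_PERSON_LIMIT
    if ret.reported_count >= 7:
        # Larger households could not report everyone on the form.
        return True
    # Count in question one disagrees with the number of people listed.
    return ret.reported_count != ret.persons_with_data
```

For example, a return that reports three people in question one but lists only two would fail the edit, while a return with a blank question one and four listed people would pass.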
The purpose of the PSA was to identify unique households and people to include in the census when more than one questionnaire was returned with the same census address identification number. Such duplication could occur in a number of ways: when a respondent mailed back a census form after the cutoff date for determining the NRFU workload and the enumerator then obtained a second form from the household (or perhaps identified the household as vacant); when someone was enumerated in a group quarters but provided another “usual” address to which his or her information was assigned; or when a respondent filled out a Be Counted form, thinking that he or she had been missed, but another member of the household also mailed back a questionnaire for the household (which might or might not contain information for the individual).
For each housing unit, returns with one or more persons in common were combined to form a single PSA household, retaining only one response for each household member who was reported on more than one return, as well as the responses for household members who were reported on only one return. All vacant returns for a housing unit were also combined to form a PSA household. In some cases more than one PSA household might exist for a unit. For each PSA household, the algorithm selected which return best represented the Census Day household (“basic” return) and which people from the other returns were part of that household.[12]
In all, 9 percent of census housing units had two returns that were eligible for the PSA operation, and 0.4 percent had three or more eligible returns. (Extra returns for an address that had no useful information were not included in the operation.) In most instances, the operation of the PSA discarded duplicate household returns or extra vacant returns. Less often, the PSA found additional people to assign to a basic return or identified more than one household at an address (see Baumgardner et al., 2001:22–27).
C.5.d Editing and Imputation
Editing and imputation were carried out for all data-captured questionnaires that were retained in the census after the PSA operation.
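The household-formation step of the PSA described in Section C.5.c (combining returns that share one or more persons, and pooling vacant returns) can be illustrated with a small grouping routine. This is a sketch under stated assumptions, not the Bureau's algorithm: the function name `form_psa_households` is hypothetical, people are assumed to be matched by exact keys, and returns are assumed to chain together through shared persons. Because the PSA selection criteria are not public, the choice of the basic return is not attempted here.

```python
from typing import Dict, FrozenSet, List


def form_psa_households(returns: Dict[str, FrozenSet[str]]) -> List[List[str]]:
    """Group the return IDs for one housing unit into candidate households.

    `returns` maps a return ID to the set of person keys on that return.
    Returns that share a person (directly or via a chain of returns,
    which is an assumption) form one group; vacant returns form one group.
    """
    parent = {rid: rid for rid in returns}

    def find(rid: str) -> str:
        # Union-find lookup with path halving.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    first_seen: Dict[str, str] = {}  # person key -> first return listing them
    vacant: List[str] = []
    for rid, persons in returns.items():
        if not persons:
            vacant.append(rid)
            continue
        for person in persons:
            if person in first_seen:
                parent[find(rid)] = find(first_seen[person])  # merge groups
            else:
                first_seen[person] = rid

    groups: Dict[str, List[str]] = {}
    for rid, persons in returns.items():
        if persons:
            groups.setdefault(find(rid), []).append(rid)

    households = [sorted(group) for group in groups.values()]
    if vacant:
        households.append(sorted(vacant))
    return sorted(households)
```

With a mail return listing two people, an enumerator return that repeats one of them, and a Be Counted form for an unrelated person, the routine yields two groups for the unit, which corresponds to the case in which more than one PSA household exists at an address.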
The editing and imputation process included whole-household imputation, called substitution, when there was minimal or no information for the housing unit; editing content items for consistency and to fill in (assign) values for missing items on the basis of related items (e.g., to calculate age when only date of birth was provided); and imputation of content items using values reported for another person or household, called allocation, when values were missing for one or more items. See Box 4.2 in Chapter 4 for types of whole-household imputation; Table 4.1 for whole-household imputation rates by type in the 1980–2000 censuses; and Chapter 7 for basic (complete-count) and long-form-sample item imputation rates.
[12] The Census Bureau does not make public the criteria for the PSA.
All editing and imputation were computer based; unlike in past censuses, there was no clerical review or editing of any items. When it was not possible to perform an edit that used other information for the same person or housing unit, imputation was performed using hot-deck methods that made use of information for other people and households, generally in the immediate neighborhood. First used in processing the 1960 census, the Census Bureau’s computerized hot-deck procedures have since been refined and elaborated. The donor pool is geographically restricted to take advantage of common characteristics among small-area populations (see Appendix G; see also Citro, 2000b).
C.5.e Other Data Processing
A number of other data processing steps were carried out to generate data files and publications from the 2000 census records. Such steps for the complete-count records included tabulating the data on various dimensions and modifying the data appropriately on files that were to be released to the public in order to protect the confidentiality of individual responses. For the long-form-sample records, there were the added steps of coding such variables as occupation and industry and weighting the records to complete-count control totals on several dimensions.
C.5.f Comparison: 1990 Data Processing
The 1990 census data processing system was more decentralized than in 2000 and made more use of clerical review (see Bureau of the Census, 1995b:Ch.8; National Research Council, 1995b:App.B). There were seven processing offices and 559 district offices.
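The hot-deck idea mentioned in Section C.5.d can be illustrated with its simplest form, a sequential hot deck. This is a minimal sketch and not the Bureau's procedure, which is more elaborate (see Appendix G): the function name, the single-item setup, and the `start_value` used before any donor has been seen are assumptions made for illustration.

```python
from typing import List, Optional


def sequential_hot_deck(values: List[Optional[str]], start_value: str) -> List[str]:
    """Fill missing items with the most recently reported value.

    `values` is assumed to be sorted in geographic order, so the most
    recent donor tends to come from a nearby household. `start_value`
    is a hypothetical initial value used until a donor is available.
    """
    deck = start_value
    filled: List[str] = []
    for value in values:
        if value is None:
            filled.append(deck)  # allocate the value held in the deck
        else:
            deck = value         # a reported value becomes the new donor
            filled.append(value)
    return filled
```

Because the records are processed in geographic order, each missing item is filled from a neighbor rather than from a national average, which is the motivation the text gives for restricting the donor pool geographically.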
Mailback questionnaires in district offices in hard-to-enumerate areas in central cities went directly to a processing office for check-in by scanning bar codes, data capture by using the Census Bureau’s Film Optical Sensing Device for Input to Computers (FOSDIC), and computerized review to identify cases that failed to meet the edit specifications for completeness of coverage and content. Failed-edit cases went to telephone follow-up, and those cases that could not be contacted were sent to a district office for field follow-up. However, backlogs in the telephone follow-up operation necessitated curtailment of field follow-up for cases that could not be contacted by telephone, and, for cost reasons, only a 10 percent sample of mailed-back short forms that failed the content review (and not also the coverage review) was sent for telephone follow-up. Enumerator returns for central city offices were checked in at the district office and then sent to the processing office for data capture, computerized review of coverage and content, and telephone follow-up as needed. Enumerator returns were not eligible for field follow-up. Once any further data had been received from follow-up, computerized editing, whole-household imputation, and item imputation routines were used to fill in remaining missing or inconsistent data.
Mailback and enumerator returns in other district offices went first to the district office for check-in, clerical review of coverage and content, telephone follow-up as needed, and field follow-up of failed-edit mail returns for which telephone follow-up was unsuccessful. After completion of follow-up, the questionnaires were sent to the processing offices for data capture and computerized editing and imputation.
Another data processing step was the search/match operation, in which forms received from various activities were checked against microfilm images of questionnaires for the same address to determine which people should be added to the household roster and which were duplicates. This operation was carried out for “Were You Counted” forms, for parolee/probationer forms, and for people who sent in a questionnaire from one location with an indication that their usual home was elsewhere. Such people might have two homes, for example, spending the winter in a southern state and the summer in a northern state.
There was no way on the 2000 form to indicate usual home elsewhere.
The FOSDIC technology used for data capture was originally developed by Census Bureau staff for the 1960 census and reengineered and enhanced in 1970, 1980, and 1990 (see Salvo, 2000). It involved two main stages: microfilming the questionnaires and using the FOSDIC equipment to scan the microfilm and read the filled-in answer circles for each item and output responses to a computer file (the answer dots showed up as light images on a dark background). In 1990, FOSDIC processed over 130 million questionnaires; about 900,000 forms had to be “repaired” by clerks and remicrofilmed before they could be read (e.g., because the forms were torn or folded improperly and so were out of alignment for scanning).
The FOSDIC equipment could read answer dots and sense the presence of write-in entries but not capture such entries directly. Write-in responses were keyed by clerks using the paper questionnaires for long-form-sample items and a microfilm access device for keying of write-in responses to the race question. After keying, the write-in responses were coded by a combination of computer and clerical review.