VALIDATION OF ATLANTA, GEORGIA, REGIONAL COMMISSION POPULATION SYNTHESIZER

controls distinguish "single" from "2+" persons per household. For families with householders over 65 years of age, the distinction by presence of children is ignored.

For the forecast year, fewer controls are defined for each transportation analysis zone (TAZ). These capture ARC TAZ-level forecasts of household income and household size. However, ARC also forecasts some elements at a regional level that can be used for regional controls, such as the average number of workers within a household and the size of age cohorts.

The PopSyn creates a synthetic population for a base year and for each forecast year. There are two key differences between the base year and the forecast year. The initial distribution for the base year comes from PUMS, whereas for the forecast year it comes from the base-year distribution. The controls for the base year come from census tables, but for the forecast years they come from the land use forecasts. In both cases, the PopSyn produces a synthetic population, and it also produces a validation report that compares synthetic population characteristics with known characteristics.

To validate the synthesizer's ability to generate a forecast population, ARC uses Year 2000 as the base year and validates a back-cast to 1990. The initial distribution comes from the base-year PopSyn. The controls then emulate a 1990 forecast data set and synthesize a 1990 population, which is then compared with 1990 census, testing the ability to generate a synthetic population with limited forecast information. In this process, it is assumed that the forecast input, though limited in amount and detail, is correct. In other words, the procedure validates the synthesizer but does not validate the land use model forecasts.

The procedure validates by calculating both aggregate characteristics of the synthetic population and the same characteristics directly from the detailed census tables. It then compares them to see how well they match. There are four levels of geographic aggregation: tract, Public Use Microdata Area (PUMA), county, and supercounty. Reports are then repeated for multiple synthetic populations to identify the variability caused by the Monte Carlo draws used in the synthesizer.

As for software, it is object-oriented Java, Version 1.5, and consists entirely of subprograms called classes. Each class consists of member objects (that is, the information it holds) and methods (functions it can accomplish). Each class can be individually coded and tested. The PopSyn has four major groups of classes tied together by PopSyn class.

DEVELOPMENT STATUS AND VALIDATION OBJECTIVES

The initial programming of the ARC PopSyn is complete. Some improvements are known to be needed, including (a) improving the quality of the rounding procedure used after iterative proportional fitting (IPF) before drawing the households from PUMS, (b) enhancing user friendliness, and (c) adjusting the PopSyn to accommodate the recently expanded 20-county geographic scope. Enhancements would also be advisable to take advantage of enhanced inputs that may become available from the economic and land use models. Through use of the current synthesizer, base-year and back-cast synthesis have been tested, and preliminary validation results have been produced.

The ARC PopSyn allows the user to implement a variety of versions without reprogramming. For initial testing and validation, three versions were created, the simplest with 52 household demographic categories and the others with 128 and 316 categories, respectively. As more categories are used, more detail can be used from the census tables (base year) or ARC demographic and land development forecasts (forecast year) to control the synthesis procedure so that more household attributes should be synthesized precisely. However, the computation takes longer; an increase in the number of sparsely populated categories causes more rounding error; and the use of regional values and averages for the additional controls might increase the noise and introduce bias. So one of the primary purposes of the validation is to choose the best version of household categories; preliminary conclusions are reported below. The three versions are shown in Table 1, with their number of categories within each of six dimensions. They will be identified subsequently by their overall number of categories (e.g., Version 52).

TABLE 1 Three Basic PopSyn Versions

                                  Number of Categories
Dimension                         Simple  Middle  Complex
Overall                           52      128     316
Household income                  4       4       4
Household size                    4       5       5
Number of workers in household    4       4       4
Family or nonfamily               1       2       2
Age of householder                1       1       2

Validation allows better understanding of the level of geographic detail at which the aggregate population attributes can be trusted and which household variables are synthesized well enough to be used in the travel forecasting models. Results of this analysis are reported later. Also reported is the testing of other setup parameters, including the convergence criterion for IPF and the aggregation level used in the seed distribution for the forecast year.

Validation can also be used to evaluate the level of variation in results that is caused by the stochastic nature of the simulation procedure used to generate the synthetic population. Several base-year runs have been made