2
Ensuring the Availability, Accuracy, and Relevance of Urban and Housing Data

Because of HUD’s relationships with local communities and community groups, HUD can introduce local priorities into national dialogs, and can support and encourage providers of local data to meet national standards for inclusion in the National Spatial Data Infrastructure (NSDI).1 The development of a parcel-level layer of information for metropolitan areas is particularly important to HUD, to the communities HUD serves, and to the NSDI and other federal data initiatives. This chapter discusses the challenge that HUD faces in its mission to provide urban and housing data for the nation, and suggests methods to ensure the accuracy and relevance of these data. The final section describes how fully responding to this spatial data challenge is tantamount to creating urban framework data layers for the NSDI.

THE SPATIAL DATA CHALLENGE

Access to reliable spatial data is essential for research, analysis, and policy development on numerous urban and housing issues. Such data are fundamental building blocks for all agencies and organizations that have

1  

The NSDI is defined as the technologies, policies, and people necessary to promote sharing of geospatial data throughout all levels of government, the private and non-profit sectors, and the academic community (<http://fgdc.er.usgs.gov/nsdi/nsdi.html>).



resource management and allocation mandates.2 Reliable spatial data and technologies are needed to monitor and manage urban growth; maximize social, environmental, and economic well-being; and achieve important long-term goals related to quality of life. Without relevant and accurate spatial data at the base, GIS and related technologies (e.g., the global positioning system [GPS], remote sensing, computer mapping, and spatial analysis) are useless. Furthermore, tools for analysis and decision support are required for the application of geographic data to real-world issues. Public and private institutions are making resources available for long-term decisions about the collection, management, and use of spatial data (NRC, 1997).

The Federal Geographic Data Committee (FGDC), representing 17 federal agencies, coordinates the development of the NSDI (Box 2.1). The NSDI encompasses policies, standards, and procedures for organizations to cooperatively produce and share geographic data and information. It is being developed in cooperation with organizations from state, local, and tribal governments; the academic community; and the private sector.

In the United States, geographic data collection is a multibillion-dollar business. In 1993, the U.S. Office of Management and Budget conducted a survey and found that total annual expenditure in federal agencies alone was close to $4 billion.3 Another estimate places total annual revenues from GIS hardware, software, and data sales at $7 billion in 1999, with the GIS data industry being the most significant sub-sector (Longley et al., 2001).

Developing, maintaining, and disseminating reliable spatial data has been a major challenge to many organizations. Without a coordinated effort, duplicate data for the same locality or region could be collected by multiple organizations using various definitions of time, at different spatial scales, and with varying degrees of accuracy.

In the United States, President Clinton’s 1994 Executive Order 12906 established the NSDI and set a significant milestone in coordinating spatial data development (see Box 2.1). The NSDI is evolving, and its strategic goals have been redefined to reflect various stakeholders’ input and current trends. Of the NSDI’s many activities, three have significant implications for, and direct applicability to, the development of spatial data and GIS technology at HUD. They are the development of data standards, the development of framework data and the geospatial data clearinghouse, and the establishment of partnerships with state and local governments, the private sector, and local communities.

2   See NRC (2002c) for discussion of federal data collection and dissemination.

3   Source: FGDC web page, <http://www.fgdc.gov>.

BOX 2.1
The National Spatial Data Infrastructure

Created partly because of the recommendation of a National Research Council report (NRC, 1993), the NSDI refers to the technologies, policies, and people necessary to promote sharing of geospatial data throughout all levels of government, non-profit organizations, the private sector, and the academic community. The goals of the NSDI are to reduce duplication of effort among agencies; improve quality and reduce costs related to geographic information; make geographic data more accessible to the public; increase the benefits of using available data; and establish key partnerships with states, counties, cities, tribal nations, academia, and the private sector to increase data availability.

TABLE 2.1 Various Responsibilities for Data Layers of the NSDI

Subcommittees                            Federal Agency Chair
Base Cartographic Data                   U.S. Geological Survey
Cadastral                                Bureau of Land Management
Cultural and Demographic                 Bureau of Census
Federal Geodetic Control                 NOAA’s National Geodetic Survey
Geologic                                 U.S. Geological Survey
Ground Transportation                    Bureau of Transportation Statistics
International Boundaries & Sovereignty   Department of State, Office of Geographer
Marine and Coastal Spatial Data          NOAA’s Coastal Services Center
Soils                                    USDA’s Natural Resources Conservation Service
Spatial Climate                          USDA’s National Water and Climate Center
Spatial Water Data                       U.S. Geological Survey
Vegetation                               USDA’s U.S. Forest Service
Wetlands                                 U.S. Fish and Wildlife Service

SOURCE: Adapted from FGDC Chart of Partner Responsibilities.

To enable data sharing among different data-producing units, some basic information about the data, that is, the metadata,4 must be provided. FGDC data standards give terminology and definitions for the documentation of digital geospatial data. The data standard covers the availability of data for a geographic location, their fitness for an intended use, how to access them, and how to process and use them.

All federal agencies are required to work with the FGDC. Each agency is responsible for:

Cooperating as requested in the development of appropriate coordinating mechanisms;
Supplying necessary information to the interagency coordinating committee concerning its surveying, mapping, and related spatial data requirements, programs, activities, and products; and
Conducting its surveying, mapping, related spatial data gathering and product distribution activities in a manner that provides effective government-wide coordination and efficient, economical service to the general public.5

The concepts of a framework and a clearinghouse ensure that data now being developed are made available to many users. The National Geospatial Data Framework is designed to be a collaborative effort. It currently contains seven data themes: transportation, hydrology (rivers and lakes), geodetic control, digital imagery, government boundaries, elevation and bathymetry, and land ownership (Figure 2.1). The framework data represent the best available data that are certified, standardized, and described according to a common standard. They provide a foundation on which organizations can build by adding their own data. Moreover, through the National Geospatial Data Clearinghouse (a distributed, electronically connected network of geospatial data producers, managers, and users), additional data layered on top of the framework data can be accessed.6 The appropriateness of the various spatial data can then be determined from their metadata descriptions, which are required for all datasets included in the Clearinghouse.

4   Spatially enabled data include horizontal and vertical coordinates and metadata, or “data about data,” describing the content, quality, condition, and other characteristics of the data.

5   1990 Office of Management and Budget (OMB) A-16.

6   See <http://nsdi.usgs.gov/> for a description of the National Geospatial Data Clearinghouse.
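
The role of metadata described above can be illustrated with a short sketch. It checks whether a dataset description covers the kinds of information the text says the data standard addresses (coverage, fitness for use, access, and processing). The field names, the example records, and the function name are hypothetical and chosen only for illustration; they are not the element names of the FGDC standard.

```python
# Hypothetical field names, loosely following the categories named in the text.
# They are NOT the actual element names of the FGDC metadata standard.
REQUIRED_FIELDS = (
    "title",
    "geographic_extent",    # where data are available
    "fitness_for_use",      # what the data are suitable for
    "access_instructions",  # how to obtain the data
    "processing_notes",     # how to process and use the data
)


def missing_metadata_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a metadata record."""
    missing = []
    for name in required:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Two made-up records: one complete, one with gaps.
complete_record = {
    "title": "Example housing dataset",
    "geographic_extent": "Example metropolitan area",
    "fitness_for_use": "Tract-level summaries only",
    "access_instructions": "Download from the agency web site",
    "processing_notes": "Join to tract boundaries by tract identifier",
}
partial_record = {
    "title": "Example housing dataset",
    "geographic_extent": "Example metropolitan area",
    "fitness_for_use": "   ",
    "access_instructions": "Download from the agency web site",
}

print(missing_metadata_fields(complete_record))
print(missing_metadata_fields(partial_record))
```

A check of this kind could be run before a dataset is submitted to a clearinghouse, so that incomplete descriptions are caught by the data producer rather than by later users.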

FIGURE 2.1 Once framework and thematic data foundation layers are built, the data can be used for many applications. SOURCE: NSGIC and FGDC, n.d.

The success of data development activities depends on partnerships with state and local governmental and non-governmental organizations to continue building data and enriching the NSDI Clearinghouse. The NSDI is characterized by strong partnerships and collaborations, and strategic goals of increasing participation through education and outreach to derive common

solutions for discovery, access, and use of geospatial data in response to the needs of diverse communities. Attention is given to community-based approaches to developing and maintaining common collections of geospatial data for sound decision making.7

In the last decade, HUD has launched other initiatives that are closely related to the NSDI and directly relevant to the development of spatial data and the use of GIS. HUD launched Community 20/20 in 1994 (Box 2.2), and also partnered with EPA in a web-based initiative called E-MAPS (Box 2.3). Subsequently, HUD introduced another set of geospatial data CD-ROMs called R-MAPS (Box 2.4). The agency’s Enterprise GIS is currently under development (Box 2.5).

HUD has initiated several research-based uses of GIS at PD&R. Among the first were those associated with Community 20/20 (see Box 2.2), and examples are reflected in Mapping Your Community (HUD, 1998). Several other targeted applications emerged, including an investigation of the value of computer mapping to questions about mortgage lending (Wyly and Holloway, 2002); an evaluation of evidence for segregation and discrimination (NRC, 2002d); and an assessment of patterns of crime around public housing (HUD, 1999). These projects were part of a broader consideration of the use of spatial data analysis and GIS at HUD and related institutional needs.

Ongoing GIS research at HUD includes the U.S.–Mexico Border initiatives, the Global Urban Indicators project, and community-based GIS efforts. These efforts are discussed in Chapter 5 (see Boxes 5.1 and 5.2). Large-scale and long-term research projects such as these require interdisciplinary expertise and good datasets to address complex environmental and societal problems. GIS as an enabling technology can play an important role in supporting large-scale, data-intensive research. Research initiatives such as the National Science Foundation’s Information Technology Research and Biocomplexity in the Environment initiatives, NASA’s Earth Science Enterprise and its program to detect land use and land cover change, and EPA’s regional assessment initiative are examples of such recent research trends.

The Digital Earth concept,8 embraced by the research community in the late 1990s, refers to a multi-resolution, three-dimensional representation of the planet where vast quantities of geospatial data are embedded. Promoted by Vice President Gore in 1998, the Digital Earth concept provides a vision whereby geospatial data, methods, and analyses are combined so that important societal issues (such as crime, biodiversity, global change, and food security) can be tackled in a more timely and efficient manner.

7   See <http://www.fgdc.gov> for details.

8   See <http://www.digitalearth.gov/> for details.

BOX 2.2
Community 20/20

In 1994, HUD developed a geographic and statistical data resource called Community 20/20. The product aimed to provide multi-faceted planning, mapping, and communication capabilities to HUD data users. The software, which received the Ford Foundation and Harvard University’s Kennedy School Innovation in American Government Award in 1996, was sold on CDs but provided free of charge to fair housing centers nationwide. HUD sold 3,500 copies of the CD and allocated 2,000 free copies (Dick Burke, U.S. HUD, personal communication, 2002).

The purpose of this GIS product was to allow citizens to see the investments of 60 active HUD grant programs. Demographic data, encompassing 600 data items from the 1990 Census, and federal Empowerment Zone and Enterprise Communities activities were linked through this package (Thompson and Sherwood, 1999). Socioeconomic data from the U.S. Census and other sources included data on births, deaths, crimes, school performance, housing code violations, property values, and toxic emissions. These data could be coupled with information about types and locations of HUD-funded projects. The software was specifically designed to support the consolidated planning activities that HUD requires of local governments.

Community 20/20 was intended to provide diverse groups (community-based and non-government organizations, state and local governments, and housing authorities) with the capability to plan and design housing and urban development projects. To promote the use of these tools for community development, HUD published Mapping Your Community: Using Geographic Information to Strengthen Community Initiatives (1998). These products are used primarily in connection with specific HUD proposals and projects.

The Digital Earth concept extends the spatial data challenge further into an international realm. HUD, as a national agency, could play an important role in global housing and habitat studies. The Global Spatial Data Infrastructure (GSDI), established in 1996 by a number of nations and organizations, recognizes the importance of geographic data and supports ready global access to geographic information.9 Among its goals, the GSDI aims to promote awareness and implementation of complementary policies, common standards, and effective mechanisms for the development and availability of interoperable

9   See <http://www.gsdi.org> for details.

digital geographic data and technologies to support decision making at all scales for multiple purposes.

Finally, the popularization of the Internet in the last few years has revolutionized the concept and practice of NSDI and the related initiatives discussed above. The Internet allows cost-effective, user-oriented dissemination of data and information. It also enables user input and encourages online collaboration. On the other hand, Internet dissemination exacerbates some existing data concerns, such as confidentiality, privacy, and control. As administrative data (e.g., local data on income tax, employment, and public assistance) are computerized and geographically referenced, and as spatially disaggregated data become standardized and available, privacy concerns will mount. Concerns include the possibility that the data and the developed indicators might be used for red-lining or otherwise stigmatizing troubled neighborhoods. These questions are growing in complexity and importance and deserve increased attention by federal agencies such as HUD. New technologies provide a means to reconcile the need for detailed, local data with privacy and confidentiality concerns by allowing users to analyze disaggregated data without giving them case-by-case access. These are major challenges for agencies; nevertheless, full and effective participation in mandatory federal data initiatives demands attention to such questions. Efforts that HUD undertakes to meet FGDC standards will also benefit HUD’s internal efforts to collect, use, and disseminate information on urban and housing issues.

Conclusion: To participate fully with the FGDC and other federal data initiatives, HUD should develop an in-house, integrated data infrastructure. To provide reliable data and be consistent with the NSDI, data should be accurately described and assigned spatial definition (geo-referenced) according to the standards of the FGDC.

Recommendation: As a first step, HUD should meet federal data standards in all operations by:

Participating fully in the FGDC and other federal initiatives to ensure that agency efforts are consistent with the development of the NSDI; and
Supporting its program participants’ efforts to provide operational data in FGDC standard format and make these data available on the Internet along with other HUD data, subject to the limits of confidentiality.
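
The idea raised above, that users can analyze disaggregated data without case-by-case access, can be sketched as an aggregate query with small-cell suppression. The sketch below is only one possible approach under stated assumptions: the threshold of 10 records, the tract labels, and the income figures are all hypothetical, and no actual HUD disclosure rule is implied.

```python
from collections import defaultdict

# Hypothetical disclosure threshold: summaries built from fewer records
# than this are withheld. The value 10 is an assumption for illustration.
MIN_CELL_SIZE = 10


def summarize_by_tract(records, min_cell_size=MIN_CELL_SIZE):
    """Return {tract_id: mean income}, with small cells suppressed as None.

    `records` is an iterable of (tract_id, household_income) pairs. Callers
    receive only tract-level means, never the individual records.
    """
    incomes = defaultdict(list)
    for tract_id, income in records:
        incomes[tract_id].append(income)

    summary = {}
    for tract_id, values in incomes.items():
        if len(values) < min_cell_size:
            summary[tract_id] = None  # too few households: suppress the cell
        else:
            summary[tract_id] = sum(values) / len(values)
    return summary


# Made-up records: 12 households in one tract, 2 in another.
records = [("tract-A", 20_000 + 1_000 * i) for i in range(12)]
records += [("tract-B", 15_000), ("tract-B", 18_000)]

summary = summarize_by_tract(records)
print(summary)
```

Suppression alone does not rule out every disclosure risk (for example, differencing between overlapping queries), so a production system would need additional safeguards.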

BOX 2.3
E-MAPS

In September 2000, HUD launched a partnered effort with the U.S. Environmental Protection Agency (EPA), linking data on HUD-funded activity in every neighborhood across the country with EPA environmental information. The purpose of Environmental-MAPS or E-MAPS is to provide people with detailed, site-specific information about what the government is doing to protect the environment and to promote community and economic development. The goal is to ensure easy access to data so that communities can engage in informed discussions and make informed decisions about growth and development.

Data available through E-MAPS include: the location, type and performance of HUD-funded activities; site-specific information about all Superfund sites and related laws; brownfields data and Brownfields Tax Incentive Zones; and other environmental data including air pollution reports, toxic chemicals data, hazardous waste business and permitting information, trend analyses of hazardous waste generation, and company waste water discharge information. Communities interested in redeveloping abandoned or underused industrial sites can use the data to check for contamination and determine what financial resources exist for redevelopment in the area (HUD, 2000).

HUD E-MAPS are intended to enable communities to make informed decisions about new sites for facilities, such as public and assisted housing, and to help communities prioritize the demolition of existing complexes. E-MAPS can be accessed on-line at: <http://198.102.62.140/emaps/SearchFrame.asp>.

The trend is clear. With the Internet, public demand for information services will increase, and so will participation in community-based activities that use these data. Solutions to many real problems that exist today require teamwork and collaboration. The development of spatial data coupled with GIS technology is necessary for a federal agency such as HUD to continue to function efficiently in this information age and to be responsive to societal needs.

ENSURING DATA ACCURACY AND RELEVANCE

Current Data Sets at HUD

PD&R maintains a number of housing-related databases. To increase awareness of the availability of these data, PD&R has compiled The Guide to PD&R Data Sets, which can be downloaded from the HUD USER web site.10 The guide describes 13 available housing data resources and provides web links to related documents and datasets. Each dataset is provided with basic information, such as the source, geographic coverage, period covered, web address, background, intended users, and intended use. These important components of metadata will enable researchers to find the data quickly and easily. The following is a brief synopsis of these data sources.

The Low-Income Housing Tax Credit (LIHTC) database contains information on over 16,600 projects and nearly 710,000 housing units placed in service nationwide between 1987 and 1998. Geographic data for each housing project include its address, census tract, city, county, metropolitan area, and state.

The Qualified Census Tracts dataset contains information on tracts that are qualified for the low-income housing tax credit based on the 1990 census data. The dataset covers all of the United States.

The Difficult Development Areas dataset, also a national dataset, includes information on areas where incomes are substantially lower than housing costs. The data are broken down by state, metropolitan statistical areas (MSAs), and non-MSAs.

The American Housing Survey (AHS) includes two datasets. The national dataset is a nationally drawn sample of 60,700 housing units covering 878 counties and independent cities throughout the United States. The data provide detailed information on housing conditions as well as characteristics of householders, such as apartments, mobile homes, family composition, income, and neighborhood quality. Geographic indicators for each housing unit include the census region and whether it is in a central city, suburb, or non-metropolitan area. AHS’s metropolitan sample includes some 5,000 housing units from 47 metropolitan areas. The smallest geographic area identified for each unit is the zone. Zones are groups of census tracts where at least 100,000 persons live.

The Property Owners and Managers Survey (POMS) was designed to provide information about the cost and availability of rental housing and what motivates owners to rent out their property to tenants. Although a nationwide survey was conducted, the final POMS dataset includes locational

10   Available at <http://www.huduser.org>.

information for only 438 sampling areas. Geographic indicators of each housing unit include the census region and whether it is in a central city, suburb, or non-metropolitan area.

The State of the Cities Data Systems (SOCDS) consists of five databases for many metropolitan areas, central cities, and suburban places. These five databases provide information on each locality, its historical census, poverty rate estimates, labor statistics, FBI crime statistics, and building permits.

The Fair Market Rents dataset shows the fair market rents, which HUD estimates annually for each county in the United States.

The HUD Median Family Income Limits dataset contains estimates of income limits for different family sizes at the county level. The income limits are used to determine the income eligibility of applicants for public housing, Section 8, and other HUD programs.

The Annual Adjustment Factors data for each metropolitan area are determined by a formula using information such as the consumer price index and changes in residential rent and utility costs. This information is used to adjust contract rents for units participating in HUD’s housing assistance programs.

The Assisted Housing dataset sketches a picture of nearly 5 million subsidized households across the United States. Included are housing variables such as the total number of subsidized households, as well as demographic variables such as household income and number of children. The data present U.S. totals, state totals, and census tract summaries. Data are also summarized by local public housing agencies and by individual housing project.

The Government-Sponsored Enterprises (Fannie Mae and Freddie Mac) data contain information on mortgage purchases of Fannie Mae and Freddie Mac. The data are tabulated at the national, state, MSA, and census tract levels. The data will be useful to studies of the flow of mortgage credit and capital in American communities where Fannie Mae and Freddie Mac are focusing their affordable homeownership efforts.

Finally, the Research Maps (R-MAPS) datasets, volumes 2 and 3, contain a portion of the data selected from the above list and converted to GIS-readable formats (for software such as ArcView and LandView). These two volumes represent an effort by the Office of PD&R to make the above data available in spatially enabled data format (see Box 2.4).

Data Needs and Issues

PD&R has taken significant steps in collecting data and making data available to the public. HUD’s datasets are widely used, by users ranging from researchers at universities and policy agencies to town governments and community advocacy

BOX 2.4
R-MAPS

PD&R designed Research Maps (R-MAPS) to make HUD data more accessible and useful to researchers, policy makers, and practitioners. The data were initially provided free on CD-ROM but are now provided on-line as part of HUD’s EGIS. The datasets are presented as shape files that link tabular data to boundary files. Spatial query and analysis tools are provided through LandView. Data available through this software include American Housing Survey data, government-sponsored enterprise and Home Mortgage Disclosure Act information, Low-Income Housing Tax Credit locational data, housing data, data on public housing and project-based program areas, and data reported in the State of the Cities publications.

groups; however, a number of data issues should be addressed to make the disseminated data more useful. The issues of data quality and completeness are paramount. Data from different sources often need to be integrated and “cleaned” spatially and thematically11 to maximize their utility. The data development process is the most costly component of a GIS, accounting for about 60 to 85 percent of the cost for most organizations (Longley et al., 2001). The success of an enterprise GIS (see Box 2.5), and therefore its organization, is determined by its ability to provide high-quality and useful data.

Data quality also determines the accuracy of subsequent GIS analyses. Error in data tends to propagate through the analysis steps, making subsequent analysis results unreliable. For example, to produce aggregate measures of the incidence of poverty in small areas within cities, HUD uses census data on household income and household composition. Data on rental housing cost may then be used to produce measures of the proportion of household income spent on rent. Error in any of these datasets will be propagated along the chain.
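
The rent-to-income example above can be made concrete with a small numerical sketch. It uses the standard first-order approximation for the relative error of a ratio of two independent estimates, in which the relative errors combine in quadrature. The rent, income, and error figures are invented for illustration and do not come from any HUD or census dataset.

```python
import math


def rent_burden(annual_rent, annual_income):
    """Proportion of household income spent on rent."""
    return annual_rent / annual_income


def relative_error_of_ratio(rel_err_numerator, rel_err_denominator):
    """First-order relative error of a ratio of two independent estimates.

    For R = A / B with independent errors, the relative errors of A and B
    combine in quadrature.
    """
    return math.sqrt(rel_err_numerator ** 2 + rel_err_denominator ** 2)


# Invented figures: rent estimated to within 5 percent, income to within 10 percent.
burden = rent_burden(9_000.0, 30_000.0)
rel_err = relative_error_of_ratio(0.05, 0.10)

low = burden * (1 - rel_err)
high = burden * (1 + rel_err)
print(f"rent burden: {burden:.2f}, plausible range: {low:.3f} to {high:.3f}")
```

Under these assumptions the derived measure is less certain than either input, which is the sense in which error accumulates along the chain of analysis.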
Although GIS is useful for storing, retrieving, and analyzing data, it cannot assure data quality, which is essential for sound policy and planning.

In evaluating the usefulness and quality of a spatial dataset, a number of factors should be considered. These include: the geographic scale or extent (national or local coverage), the completeness of the coverage (gaps or holes),

11   Spatially, when different datasets are combined, boundaries and roads may be topologically inconsistent and require matching. Thematically, two datasets may have different attributes or coding and necessitate matching or filling in of missing attributes.

the spatial resolution (level of spatial detail), the attribute accuracy (e.g., sampling density), the frequency of updates, the degree of confidentiality, and finally the compatibility and comparability with other data sets. Another consideration is the perpetual tradeoff between data quality and its development cost (time and personnel needed). Cost is generally higher with higher data quality. Data error can never be entirely eliminated. Therefore, it is important for an organization to maintain a balance between cost and quality. The objective in this case is to “manage” rather than try to eliminate the error by acknowledging some degree of uncertainty through various statistical means (Aronoff, 1989).

The above synopsis of the 13 datasets shows that there are rich sources of data that can be utilized for housing research; however, it reveals three issues. First, although most of the existing datasets have some geographical identifiers, not all have been fully spatially enabled. These datasets will have to be address-matched so that the user can retrieve and map information by its spatial location (e.g., a city, a tract). Second, the smallest geographical areas reported for most datasets are at the city or county levels; a re-tabulation of the data into smaller geographical areas, such as at the census-tract scale, should be considered to enable micro-level analysis. Smaller geographical areas such as census blocks or block groups would be more desirable, but they are also more vulnerable to confidentiality breaches. Census tracts, however, are an appropriate geographic scale as a first step toward nationwide coverage, and some of the existing data are already reported at this scale. Third, some data are incomplete, and these gaps must be identified and filled.

Conclusion: HUD datasets derive from a multiplicity of sources. Local datasets can be a valuable source of accurate and detailed data that are relevant to HUD’s local constituents.

Recommendation: As a first step, HUD should improve existing housing and related data. Existing data should be cleaned and checked for accuracy, consistency, and completeness. Data gaps should be identified and filled. HUD should adopt accuracy and documentation standards that build on FGDC data standards.

Building on existing data is only the first step. As GIS capability improves, new national datasets as well as local, finer resolution data will be wanted, so that new possibilities for research can be realized, and a response to societal needs can be accommodated in a timely and accurate manner. New data development can be concomitant with research issues that are high on HUD’s research agenda, such as home ownership issues, the housing

conditions in the Colonias on the southwest border, and growth management problems for communities (see Box 3.1). HUD’s current GIS initiative, entitled EGIS, is described in Box 2.5.

An internal spatial data infrastructure or an agency-wide GIS can provide a common platform that facilitates data use and dissemination. An internal GIS infrastructure is an enterprise GIS implemented as an organization-wide platform to structure the collection, storage, analysis, and presentation of geographic information. To accomplish this, a governance structure must be created to adopt standards, implement consistent business practices, and develop an organization-wide strategy for this purpose.

BOX 2.5
Enterprise GIS

In 2000, HUD entered into a contract with Environmental Systems Research Institute (ESRI) to build an Enterprise GIS or EGIS. The EGIS allows agencies of all kinds and at all levels, and the public in general, to view HUD housing and community development data together with data from three other federal agencies (U.S. Census Bureau, Environmental Protection Agency, and the Federal Emergency Management Agency). The Census data include both 1990 and 2000 data, providing a temporal element to HUD’s GIS platforms. HUD’s addition of metropolitan-level data from other agencies represents an expansion of capability beyond that of earlier systems.

By accessing the EGIS web site, individuals and community groups can combine data sets in different ways to compile a rich base of information that is specific to the user’s needs. The information available through the EGIS includes spatially-referenced data on multifamily housing, brownfields tax incentive zones, public housing, hazardous waste and air pollution. Using the EGIS, users can:

Create a personalized map. Users may enter an address or click on a map and have the application take them to a map of the location.
Add data to maps. Users may display any combination of HUD Housing and Community Development data, along with data from any of HUD’s federal data partners (EPA, FEMA, Census), as well as data layers that include lakes, rivers, landmarks, city streets, highways, and other features.
Save maps. Users may save the maps they create, name them, and retrieve them whenever they use EGIS again.
Create a thematic map. Users may create a thematic map based on data and criteria that the user specifies, thus allowing the user to classify data into groups or classes that have similar characteristics and values; tables associated with the maps will also be available.
Print maps. Users can print the maps they create, and use them for their own purposes.

The EGIS is in its early stages, but several challenges have been identified (Mark Mitchell, ESRI, personal communication, 2002). First, there is a need for data integration. Data compatibility is an important issue because of the many data sets that comprise the EGIS, and their varied sources. Second, there is a need to improve data quality generally. Third, early experience with EGIS points to a compelling need to track funds allocated by HUD and create a more rigorous tracking and assessment program to determine the impacts of various interventions. Fourth, the public still has difficulties gaining access to good information. Knowledge of GIS and spatial analysis among local governments and community groups, especially in smaller cities and towns, is often lacking.

See <http://www.hud.esri.com/egis>.

This infrastructure must have authorization and support from the Secretary. It should be created and managed within PD&R, but the changes in business practices required will affect all data gathering, storage, analysis, and presentation within the department and local housing agencies. It requires a long-term commitment of financial and human resources within the department and local housing agencies, but will permit, inter alia, geographic analysis of the following:

Strength of prior HUD investments;
Effect of HUD investment on the stability of neighborhoods, municipalities, schools, and school districts;
Educational and economic opportunity present in areas of potential HUD investments; and
Future investment decisions that will foster health, education, and economic opportunity, and residential and commercial stability of neighborhoods and regions.
Conclusion: A spatial data infrastructure can provide a uniform and high quality of service delivery across HUD’s programs and missions. GIS can foster agency-wide data coordination, integration, sharing, and analysis, and can facilitate internal assessment of HUD programs as well as analysis and reporting of federal urban investments. An integrated spatial data infrastructure can aid in the delivery of services to HUD clients such as metropolitan or regional organizations and local governments. It can also enable local and regional information to be integrated in ways that allow for more accurate program assessment and for evaluation of federal investment in urban development.

Recommendation: HUD should create an internal spatial data infrastructure for an agency-wide GIS to support an appropriate urban research agenda and to integrate locally derived data.

Integrating Local Datasets

HUD’s 81 field offices nationwide represent a rich source of local data and a wealth of relationships at the local level. Local governments submit their data to the federal government and receive data back in the form of TIGER/line files12 and other products including a digital database of geographic features. The local data that go into the creation of these products can surpass the resulting TIGER/line files in terms of local accuracy and relevance. When local data users find discrepancies in the returned TIGER files (for example, in the spatial boundaries of their local areas), they may spend significant time updating and modifying the TIGER files to accurately represent known local conditions. No mechanism is in place to integrate these updated and accurate datasets from local users and to re-distribute those datasets to other users or to integrate them into a national database. For most urban areas, TIGER data are derived from dual independent map encoding (DIME) files, originally developed for the 1970 census. DIME is an encoding scheme for street addresses.

Integration of local (neighborhood and parcel-level) data with federal data from HUD and other agencies will be facilitated by new systems and technologies. At present, however, such data integration represents a formidable challenge.
Similarly, efforts to improve data quality constitute a major investment whose full range of costs and benefits is not known. The incorporation of comparable local data would make data available at multiple scales on a broad range of urban topics including real estate market conditions, neighborhood educational and economic opportunity, crime, the quality of local housing stock, and environmental risk. These data could be disseminated via the Internet, saving HUD data users the time-intensive work of data integration. The long-term commitment needed for such an effort would produce data for national comparative analysis at a resolution useful to local agencies.

12   Topologically Integrated Geographic Encoding and Referencing System (TIGER). See <http://www.census.gov/geo/www/tiger/index.html>.

As the prime unit at HUD responsible for providing reliable and relevant data for research and analysis, PD&R has an important role to play in promoting data integration and data sharing within the agency, and between HUD and its partners. Consistent data from all internal HUD units and HUD partners are needed for coordination and optimal use of data.

Conclusion: As a result of the agency’s local relationships, HUD has significant access to local data and a singular ability to mandate national standards for local data. The HUD grantee program is another valuable source of local data.

Recommendation: HUD should develop mechanisms to accept and integrate relevant locally derived data and georeference the data for integration in the agency-wide GIS. Specifically, HUD should spatially enable local data by performing address matching of individual records at the finest scale using geographic coordinates. HUD should select, tabulate, analyze, and map relevant housing variables through a GIS at multiple relevant geographic scales (census block, block group, and tract; place, county, and metropolitan area). PD&R should take the lead within HUD in efforts to integrate grantee and other data at different levels: parcel, neighborhood, municipality, school and school district, metropolitan area, state, and national.

THE NEED FOR AN URBAN SPATIAL DATA INFRASTRUCTURE

HUD’s challenge, using spatial data to promote adequate and affordable housing, economic opportunity, and a suitable living environment free from discrimination, is considerable in scope. In its mission, HUD identifies six related strategic goals (HUD, 2002b):

Increase homeownership opportunities,
Promote decent affordable housing,
Strengthen communities,
Ensure equal opportunity in housing,
Embrace high standards of ethics, management, and accountability,
Promote participation of faith-based and community organizations.

These goals emphasize the need for HUD to act responsibly and to encourage community participation in the urban and housing arenas. The recommendations outlined in this chapter include full participation in the FGDC and other federal initiatives; ensuring the accuracy, consistency, and completeness of HUD data; creating an internal spatial data infrastructure; and developing ways to integrate and disseminate local data. These goals are encapsulated below in the concept of an Urban Spatial Data Infrastructure (USDI) as a component of the NSDI for urban areas.

Because of its relationship with groups at the local level across the nation, HUD has unique access to detailed, updated data on local conditions and local needs. Local data are needed by HUD to address its agency mission and, when stored in a national database like the NSDI, are useful for multiple purposes including comparative urban analysis, resource and services allocation, and homeland security. At this time, the United States does not have a source of standardized local, parcel-level data. Figure 2.2 shows the correspondence between HUD’s agency mandates and national data needs.

The Bureau of Land Management (BLM),13 through its cadastral survey, is responsible for identifying the location and boundaries of federal lands in the United States. The agency maintains cadastral survey and historical data, along with information on the mineral estate, resource conditions, and permits or leases on federal lands. BLM’s Geographic Coordinate Data Base (GCDB) is using GIS to modernize data management for a parcel-based land information system that meets FGDC standards. Initially data were collected in the western states where most federal lands are located; collection is now proceeding in the eastern states.
BLM also has a successful online distribution system for Public Land Survey System data, but these data are also limited to the western states. The creation of a nationwide parcel-level dataset will require the participation of local government, finance agencies including Fannie Mae and Freddie Mac, realtors, and market researchers. States and metropolitan/regional-level governments (for example, the Twin Cities in Minnesota) have established programs to create or modernize parcel-level data. Because there is no nationwide source of parcel-level data, costly duplication and gaps can occur. In its effort to provide accurate and relevant data on urban and housing conditions, help the homeless, spur economic growth in distressed neighborhoods, and help local communities meet their development needs, HUD is undertaking an effort equivalent in value and significance to the creation of an urban spatial data infrastructure (USDI) for the nation. For HUD, as for other federal agencies with responsibility for providing spatial data for national initiatives, carrying out this effort demands significant time and resources.

13   See <http://www.blm.gov/cadastral/> for details.

FIGURE 2.2 This Venn diagram demonstrates the overlap between the building of a USDI and HUD’s agency mandates. The data, tools, and partners needed to carry out HUD’s mission and support agency research are the same as those needed to build a USDI.

HUD shares responsibility for the provision of these important urban data with other federal agencies, notably the Department of Health and Human Services, the Department of Transportation, and the Environmental Protection Agency. The task of cross-referencing data at this fine-grained scale is complex, and spatial features used by various agencies do not correspond one-to-one. For example, parcels and boundaries do not mesh with buildings or blocks, and streets and street centerlines are not the same as bus-stop-to-bus-stop routes. Floodplains and atmospheric plumes find their own paths. Data sharing in such circumstances involves standardizing data formats, semantics, and syntax. Standardizing 911 emergency addresses and developing master address files is an example of an ongoing multi-decade effort to facilitate address-based geo-referencing. The importance of partnerships and communication among the federal agencies is discussed in Chapter 5.

For many purposes, census tracts are too aggregated or too sensitive to problems associated with arbitrary geographic units, known as modifiable areal unit problems. On the other hand, parcel and housing unit data are often too disaggregated, cumbersome, or invasive of privacy for many analyses. HUD has a potential role to play in the development of relevant intermediate layers. In particular, HUD can influence the development of standardized procedures for computing land value surfaces, housing price indices, job accessibility measures, and other derived data layers that are finer grained than census tracts but appropriately aggregated and smoothed compared with parcel maps. HUD could promote data cleaning, interpretation, and statistical analyses needed to develop some of these specialized intermediate layers and play a lead role in making these data a meaningful and reliable component of urban models. The use of software tools to aggregate and adjust the individual data into a customizable form could facilitate the use of confidential data that may be more useful than data at the census tract level. Eventually, such estimated intermediate layers could be derived “on-the-fly” by HUD-provided online tools.

Conclusion: HUD is well-suited to be one of the lead federal agencies in providing and managing urban framework data layers for the NSDI.
HUD functions within a network of agencies at multiple levels that share responsibility for providing data on urban and community issues.

Recommendation: HUD should promote the development of a parcel-level data layer and other urban framework layers to create a USDI as a component of the NSDI for housing and urban development. The federal government should make available resources commensurate with this task. Core elements of the USDI can include:

Public and federally assisted housing data,
Tenant and housing characteristics,
Parcel-level data,
Locally updated TIGER files,
Environmental data, and
Socioeconomic data.

Examples of socioeconomic and environmental data include hazardous waste and “brownfield” site location data, crime statistics, transportation data, health data, educational data, and business activity data. Health data include mortality, morbidity, and immunization statistics. Education data include school performance, enrollment, and percent receiving subsidized lunch. Business activity data include the number of business establishments and employment data.

The development of GIS at HUD is not so much a question of purchasing GIS software systems and data as it is the development of appropriate, maintainable databases, information infrastructure, and interagency relationships. HUD must upgrade its internal spatial data infrastructure and play a more active role in shaping an urban spatial data infrastructure (USDI). Rather than a system that HUD owns or manages, the USDI would include many agencies and public-private partnerships at multiple levels of government. The USDI would comprise locally managed metropolitan information infrastructures that can feed appropriate data to HUD and use HUD-processed data, analyses, indicators, and models to improve the understanding of local conditions and the design, delivery, and evaluation of local plans and services.

Conclusion: The collection and dissemination of relevant and accurate urban data require close coordination with state, regional, and local groups. Communication with local groups ensures that the data collected are meaningful to the community. Partnerships with local groups build capacity for research and applications and promote the collection of accurate data that meet federal standards for data sharing. Local data centers can support the development of a USDI.
Recommendation: HUD should encourage and support the development of local, metropolitan, and regional data centers to facilitate local data coordination, use, and training towards the creation of a USDI.

SUMMARY

Developing, maintaining, and disseminating reliable spatial data are major challenges for HUD. HUD’s efforts will be more efficient and productive when carried out in coordination with other federal data initiatives, notably the NSDI. To provide accurate and consistent data on housing and urban issues and use these data for internal evaluations of HUD programs and investments, HUD needs an in-house spatial data infrastructure. The challenge that HUD confronts in using spatial data to address its broad and important mission goals demands an effort that is tantamount to the creation of an urban spatial data infrastructure (USDI) as a component of the NSDI. The integration of local datasets into HUD’s databases can provide relevant, high spatial and temporal resolution data for HUD’s internal program analysis and evaluation, and for national data and information needs including the creation of a USDI, resource management and allocation, and other federal data initiatives. These are formidable challenges that resist simple solutions. This chapter offers a vision of the future of GIS at HUD that demands administrative support and adequate resources for spatial data initiatives.
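
As an illustration of the recommendation earlier in this chapter that housing variables be tabulated at multiple geographic scales, the following minimal Python sketch assumes each record has already been address-matched to a 15-digit census block identifier (GEOID). Because that identifier concatenates the state (2 digits), county (3), tract (6), and block (4) codes, and the block group is the first digit of the block code, counts at coarser scales can be obtained by truncating it. The record layout, field names, function name, and sample identifiers are hypothetical and are not drawn from any HUD system. Places and metropolitan areas do not nest within this identifier and would need a separate lookup.

```python
from collections import Counter

# Prefix lengths within a 15-character census block GEOID:
# 2 state + 3 county + 6 tract + 4 block digits; the block group
# is identified by the first digit of the block code.
GEOID_PREFIX = {
    "state": 2,
    "county": 5,
    "tract": 11,
    "block_group": 12,
    "block": 15,
}


def tabulate(records, level):
    """Count records per geographic unit at the requested level.

    Assumes each record is a dict with a 'block_geoid' key (a
    hypothetical field name used only for this sketch).
    """
    width = GEOID_PREFIX[level]
    counts = Counter()
    for rec in records:
        geoid = rec["block_geoid"]
        if len(geoid) != 15 or not geoid.isdigit():
            raise ValueError(f"not a 15-digit block GEOID: {geoid!r}")
        counts[geoid[:width]] += 1
    return dict(counts)


# Made-up records for illustration only; not real housing data.
sample = [
    {"unit_id": "A", "block_geoid": "270530001001000"},
    {"unit_id": "B", "block_geoid": "270530001001001"},
    {"unit_id": "C", "block_geoid": "270530001002000"},
    {"unit_id": "D", "block_geoid": "270530002001000"},
]

if __name__ == "__main__":
    for level in ("block_group", "tract", "county"):
        print(level, tabulate(sample, level))
```

For the sample records, tabulating at the tract level groups three units into one tract and the fourth into another. A working system would also need to handle unmatched addresses and changes in census geography between decennial censuses, which this sketch does not attempt.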