5
Improving Data and Statistics on Business Dynamics—Bridging the Gap Between the Current and a Comprehensive System

In this report we argue that the constellation of business data currently produced by the statistical agencies, while impressive in many respects, could be substantially improved. An increased focus by statistical programs on business dynamics would facilitate a more complete understanding of business creation processes, of the mechanisms whereby firms adapt and change with the economy, of the role of new and young firms in economic growth, of how new sectors emerge and new markets are created, and of shifts in employment opportunities across sectoral and geographic dimensions. Additionally, while the United States is widely considered the exemplar of an “entrepreneurial nation”—largely because of its role in the creation of entirely new markets and economic sectors—there is very little precise information about how this mechanism actually works and what policy actions might enhance (or harm) this environment.

In this chapter, we provide a framework and recommendations for (1) improving the representativeness and quality of a broad range of business surveys, (2) generating more timely descriptions of changes in the U.S. economy that allow policy makers to respond more quickly to the changing business environment, and (3) expanding the scope and details of information on individual businesses. We offer specific recommendations about how to improve the business lists residing at the statistical agencies, as well as other data sources relevant to the measurement of business formation, dynamics, and performance, while recognizing the need to minimize additional costs and respondent burden. In constructing our recommendations,



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.




Understanding Business Dynamics: An Integrated Data System for America’s Future

we attempt to differentiate between strategies that could be implemented relatively quickly at modest cost and those that would require longer term commitment.

Early on, the panel identified a set of principles to guide its work and, in turn, the development of recommendations presented in this chapter. These principles are detailed in Chapter 1 and summarized here:

Confidentiality: Statistical agencies have the responsibility to data providers and data subjects to protect the confidentiality of information that is provided. Data collected by the government must be maintained in such a way that identifiable information is not disclosed for administrative, regulatory, or enforcement purposes.

Public Purpose: Subject to confidentiality requirements, data sharing among government statistical agencies and data access by others should be facilitated when it serves a substantial public purpose. Data uses that serve a substantial public purpose include those that (1) lead to improvements in the quality, breadth, and usefulness of government statistical data and systems; (2) provide evidence crucial to informing government policies on social and economic issues; and (3) encourage research that advances scientific knowledge. The rationale for the public purpose principle is straightforward: government administrative record systems and survey databases generate enormous public value in terms of informing decision makers (including those in the private sector) and are maintained at considerable cost to the public. As such, the public is entitled to the full and effective use of these assets, provided that such uses do not compromise the confidentiality assurances afforded to respondents.

Targeting Deficiencies: Improvements to data collection should focus first on areas for which policy and research relevance is high but statistics needed to inform those policies and research are weakest. For business data, this means building up the statistical infrastructure for measuring dynamics and collecting information on rapidly growing economic sectors in which the activities of smaller and younger firms are disproportionately important, but for which data coverage is relatively weak.

Cost Efficiency: The statistical agencies should give the highest priority to actions that can be done expeditiously and at low cost. In this report, we identify a number of cases for which more creative use of existing data could lead to the production of useful statistics. The idea is to get as much information out of the system as possible for a given level of resource and data protection commitment.

Reflecting the charge to the panel and the concentration of its efforts, recommendations are organized around three systemic needs:

increasing the capacity to measure activities of nascent and young businesses that rapidly enter and exit fast-growing and innovative sectors of the economy and that are central to understanding business dynamics;

improving the coverage and depth of business data through more effective coordination and integration of existing sources; and

shifting the legal and organizational environment to accommodate data sharing and confidentiality protections in a way that enables the kinds of efficiencies envisioned by the panel to occur.

5.1 EXPANDING DATA SOURCES FOR MEASURING BUSINESS DYNAMICS

Although accounting for only a small portion of economy-wide revenues, nonemployers (most are sole proprietors) and other small firms represent the vast majority of businesses in the United States. More importantly, these microbusinesses appear to disproportionately contribute to changes in the composition of the economy’s product and labor markets. A small percentage but large absolute number of these businesses evolve into firms with employees. Data on nonemployers, sole proprietors, and those involved in entrepreneurial activities are therefore essential for studying business dynamics.

Measuring key business and worker transitions typically requires longitudinal data covering the early and late life phases of businesses. Ideally, entities are tracked at the establishment level in such a way that changes (for example, in the kinds of goods or services produced) can be detected, even when location and name remain the same. In addition, data are needed that can be disaggregated to local levels. Recent examples of high-profile, localized economic transitions include the workforce mobilization out of New Orleans following Hurricane Katrina and those associated with military base closings and realignments. High-quality samples that are representative of a broad range of businesses (old and new, large and small) would allow for more timely analysis of these kinds of events and, in turn, provide opportunities for policy makers to respond more quickly to changing market conditions.

An ideal business data system would facilitate measurement of the attributes that are linked to (and possibly predictive of) business performance, outcomes for individuals and communities, and local economic trends.

5.1.1 Sampling Young and Small Firms

In designing a data collection system, nothing is more fundamental than the question of whom to survey. The optimal mix of business entities
to be covered in the statistical system’s surveys, censuses, and administrative sources must be determined. The challenge is that this is a multipurpose problem: there is a trade-off between measuring levels and measuring dynamics. Because it is important to estimate the volume of output, employment, and other variables at low cost, statistical programs have historically focused on the largest entities. However, to accurately measure changes occurring among businesses and in markets, which for many purposes is more important, data must also be collected on emerging and fading entities, typically the smallest. For measuring business dynamics, it would be beneficial to reduce the undersampling of those parts of the business population that are most likely to be in transition and that provide early indicators of the future directions of the economy.

Recommendation 1: To measure business dynamics more effectively, the Census Bureau and the Bureau of Labor Statistics (BLS) should increase the sampling of younger units in their surveys. This will require that business age is included as one of the stratifying variables and that business lists, on which the surveys are based, cover recent business entrants.

When optimizing a sampling structure, there are trade-offs in terms of picking up changes in variables versus achieving precision in population total estimates. Currently, most survey programs stratify by industry and size, and samples are chosen to minimize the sampling error of level estimates. This approach leads to lower (in most cases, much lower) sampling probability for small businesses relative to large ones. If the statistical agencies move to instead minimize a criterion function that includes sampling errors for both levels and growth rates (for example, some weighted average of the mean squared errors), then this would increase the sampling probability of small businesses.

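As a rough illustration of the trade-off just described, the sketch below compares a levels-only allocation with one that minimizes a weighted average of the sampling variances of a level estimate and a change estimate. It is a minimal sketch under stated assumptions, not a description of any agency's procedure: the strata, the standard deviations, the weight, and the function name `allocate` are all hypothetical, finite-population corrections are ignored, and the two variances are assumed to be measured in the same units.

```python
import math


def allocate(strata, n_total, w_level=1.0):
    """Allocate n_total sample units across strata (hypothetical helper).

    Minimizes w_level * Var(level total) + (1 - w_level) * Var(change total),
    ignoring finite-population corrections. The solution is
    n_h proportional to N_h * sqrt(w * S_level^2 + (1 - w) * S_change^2).
    With w_level = 1 this reduces to Neyman allocation for level estimates.
    """
    scores = {
        name: s["N"] * math.sqrt(
            w_level * s["sd_level"] ** 2 + (1.0 - w_level) * s["sd_change"] ** 2
        )
        for name, s in strata.items()
    }
    total = sum(scores.values())
    return {name: n_total * score / total for name, score in scores.items()}


# Hypothetical strata: illustrative numbers only, not agency figures.
strata = {
    "large_old": {"N": 5_000, "sd_level": 200.0, "sd_change": 10.0},
    "small_young": {"N": 500_000, "sd_level": 2.0, "sd_change": 3.0},
}

levels_only = allocate(strata, n_total=4_000, w_level=1.0)
blended = allocate(strata, n_total=4_000, w_level=0.5)

for name, s in strata.items():
    rate_levels = levels_only[name] / s["N"]
    rate_blended = blended[name] / s["N"]
    print(f"{name}: sampling rate {rate_levels:.5f} -> {rate_blended:.5f}")
```

In this made-up example the small, young stratum has little variation in levels but relatively more variation in changes, so giving weight to the change estimate raises its sampling rate. A production design would also cap each allocation at the stratum size and round to whole units, which this sketch does not do.
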
Stratifying by age, as we suggest, will have similar effects. Although the case for these recommended changes is strong, implementing them would require a rethinking of the fundamental approach by the agencies to business sampling. Acting on this recommendation would entail new costs and take time to implement but, with very little new expenditure, the agencies could immediately begin undertaking research to quantify the statistical trade-offs associated with adjusting sampling rates of businesses along age and size dimensions.

As things now stand, nonemployer firms (many of which are initiated as sole proprietorships or partnerships) are almost completely unrepresented in federal data programs, yet these businesses are frequently associated with the most dynamic elements of the economy. That sources of microdata have historically been scarce for nonemployers, and for smaller and newer businesses more generally, has hampered research progress on
business dynamics. Recently, however, data products have emerged that promise to greatly enhance available information relevant to the topic. Because of their proximity to the business lists, the Census Bureau and BLS are the key architects in this emerging data area; their efforts should be applauded and further development of these sources encouraged.

Recommendation 2: BLS and the Census Bureau should support and expand their development of statistical programs such as the Business Employment Dynamics (BED) and Statistics of U.S. Businesses (SUSB) that provide basic measures of business dynamics, including statistics on business formation and dissolution and job creation and destruction.

Some extensions of these programs would admittedly necessitate longer term commitment, while others may be initiated by more intensive use of existing data. For example, the Census Bureau could add significant value to the SUSB by incorporating information on the dynamics of nonemployer firms (taking advantage of the development of the Integrated Longitudinal Business Database or ILBD). This would clearly be an expansion, albeit a useful one, of the program. Still, progress could be made on some fronts with relatively little additional cost. BLS could accelerate the development of more disaggregated tabulations at the geographic (substate) and industry (6-digit NAICS) levels with little or no new data collection, just more aggressive processing of administrative records data.

If one accepts the premise that data on small and young businesses are inadequate and that important research and policies rely on such information, then it follows that key data programs must keep careful tabs on how long business entities have existed.

Recommendation 3: The Census Bureau and BLS should exploit their administrative record systems to produce public-release statistics on economic activity disaggregated by indicators of business age.

Readily available business age indicators in these administrative records systems include the application date for an Employer Identification Number (EIN), the point at which positive revenues are generated, and the first period with positive payroll. Acting on this recommendation will require only modest adjustments to existing data collection instruments; indeed, age tables were produced for the Census of Retail in 1939 and 1948. The Census Bureau and BLS currently publish numerous public-release statistics disaggregated along other dimensions, for example, data on productivity by industry and firm size. It would be similarly useful, in terms of monitoring comparative trends of new entrants in the economy, if statistics were maintained for firms and
industries by age. The appropriate milestone for defining a business birth will vary by purpose.

A focus on publishing statistics by business age would also dovetail well with the recent innovations in measuring producer dynamics in the microdata-based BED and SUSB/LBD programs. Since these programs rely on the accurate measurement of entry and exit of producers and the accurate tracking of existing producers over time, their statistical frames could be readily adapted to include statistics disaggregated by business age. The development of longitudinal versions of the business registers at both the Census Bureau and BLS would permit using business age as a stratifying variable for annual, monthly, and quarterly surveys. Because many key statistics (e.g., productivity by industry) integrate survey information from multiple sources, adding downstream data products delineated by business age would require increased coordination by the agencies to make definitions consistent. Initially, business age should be added to surveys for which the new information would be most valuable. Good candidates for this might be the Annual Capital Expenditures Survey and the National Science Foundation’s Research and Development Survey. The Census Bureau’s Survey of Business Owners (SBO) offers something of a model, given that it already asks respondents for information on business age.

5.1.2 Nascent Business Activity: The Essential Role of Household-Based Data

Tracing the entire life cycle of businesses and measuring and analyzing the processes through which they are born and grow require going beyond traditional data collection from employer businesses. Nascent businesses encompass the entrepreneurial activities of individuals or households before they come in contact with the legal system as business entities; thus, business registers take one only so far in measuring business dynamics. Only after acquiring an EIN as a federal business taxpayer, or as a state Unemployment Insurance taxpayer, is a business tracked in the frames used by BLS and the Census Bureau to measure economic activity. Unlike the BLS register, the Census Bureau register does include nonemployer businesses if they have taxable revenues. However, businesses are not typically included in the major surveys of these agencies until they become employer businesses with positive payrolls or taxable revenues. There is little in the current system that provides a way of tracking individuals as they enter into the business creation process and spend time and resources in an effort to organize and implement a new firm. The most direct way to capture many of the activities (and characteristics of those
carrying out the activities) associated with the early stages of business formation is to collect data from household or individual units. While there are limitations to household-based data, such as the typical absence of information on business performance, they can add unique analytic capacity for understanding business dynamics. Ideally, information collected from households would be linkable to business data sets through unique identification numbers. Integrating data from households and from employers is critical, perhaps increasingly so, for tracking the growth of new firms and emerging sectors, and for developing a more complete picture of employment flows in the economy. Existing household surveys could be used as the screening vehicle for identifying nascent and young businesses.

Recommendation 4: The Census Bureau should periodically add a module to the American Community Survey (or possibly the Current Population Survey) to identify nascent entrepreneurs. A method should be developed for linking this survey information with subsequent business identifiers in a longitudinal household-business data infrastructure so that transitions from nascent to active status (and vice versa) and from nonemployer to employer status (and vice versa) can be measured and studied.

Adding a module or a screener question to household surveys can, in principle, capture activity associated with nascent businesses. Such a plan must acknowledge that the majority of households will not be populated with individuals involved in business start-ups. Because the sampling frame is not particularly efficient for locating such activity, only very large surveys would generate sufficient numbers of eligible cases. That said, only minor modifications to the structure of items asked in the Current Population Survey (CPS) or the American Community Survey (ACS) would be required. These modifications could be implemented immediately with relatively little additional monetary cost, though we recognize that there is an opportunity cost (given the widespread interest in adding content to household surveys, there are a number of topical modules worthy of consideration).

The Census Bureau should also consider implementing a program of periodic follow-on surveys of the screened nascent entrepreneurs and young businesses using specialized topic modules. Combining responses to a well-designed set of questions with some longitudinal follow-ups and administrative record linkage would provide a pathway for studying the dynamics of these businesses over their entire life cycles.[1] If these individuals could be followed over a longer period of time (up to three or four years), activity could be tracked until their businesses entered the system through a Schedule C tax return filing or, if they had employees, via the Unemployment Insurance system records. Such modules could be rotated over time to cover a range of both firm and business owner variables. Data from these modules would provide estimates of the prevalence of independent start-ups and business-sponsored start-ups among the adult population and should be stored in a database that facilitates longitudinal analyses.

[1] Approximately one-third of new initiatives ultimately become incorporated into the various business registries.

5.1.3 Surveying Business Owners

In addition to tracking changes that business entities undergo, it would also be beneficial to be able to monitor transitions that accompany the earliest phases in the lives of the owners who start them. Given the focus of many surveys on large producers, timely information on start-up financing, human resources, and investments in research and development and physical capital is often inadequate for young and small firms. This is particularly true for the nonemployer segment of the business population. One survey vehicle that does provide coverage of both the employer and nonemployer universes is the SBO. A key feature of this survey is that it identifies business age. The SBO generates statistics on the composition of U.S. businesses and on owner characteristics. Economic policy makers in federal, state, and local governments use SBO data as a source of information on business success and failure rates. The survey is particularly useful for comparing the performance of minority and nonminority and women- and men-owned businesses (see Appendix A).

The primary shortcoming of the SBO, in terms of its value for producing statistics on business dynamics, is that it is carried out infrequently, once every five years. Because many new businesses emerge then fail quickly, this kind of information needs to be collected on a more frequent basis.

Recommendation 5: The Census Bureau’s SBO should be conducted on an annual basis. The survey should include both a longitudinal component and a flexible, modular design that allows survey content to change over time. In addition, the Census Bureau should explore the possibility of creating a public-use (anonymized) SBO or a restricted access version of the data file.

The survey could be modified to include panel elements as well, perhaps in a manner similar to what is done in the Annual Survey of Manufactures. This would facilitate measurement of the transitions that young and small firms make over their lifetimes. Finally, it would allow for flexibility in the type of questions asked over time by incorporating survey modules that differ with respect to content. For example, to minimize burden, one could
create modules on business finance, investment, and workforce training, among others, and cycle through them so that each is conducted periodically. The net result of such a program would be more detailed statistics about young small firms, provided on a more consistent basis, with overall better survey coverage than is currently available.

Implementing Recommendation 5 entails no conceptual hurdles; however, a more frequent survey would create new demands on resources and raise concerns about burden. It is possible that respondent burden associated with a more frequently conducted survey could be offset by rotating the samples and supplementing the survey data with additional administrative data. Finally, the value of the SBO would be greatly enhanced if researchers could obtain greater access to the microdata in secure settings or through creation of a public-use file.

5.2 MORE EFFECTIVE USE OF EXISTING INFORMATION

“Statistical agencies do not and should not conduct their activities in isolation. An effective statistical agency actively explores ways to work with other agencies to meet current information needs, for example, by seeking ways to integrate the designs of existing data systems to provide new or more useful data than a single system can provide…. Efforts to standardize concepts and definitions, such as those for industries and occupations, further contribute to effective coordination of statistical agency endeavors, as does the development of broad macro models such as the system of national accounts.”
Practice 11: Coordination and Cooperation with Other Statistical Agencies, Principles and Practices for a Federal Statistical Agency (National Research Council, 2005a, p. 41)

In working toward an improved and more versatile data system, the question of how much and what kinds of data are needed to fulfill important purposes must be balanced against the cost and burden associated with the enterprise. Given finite, often tightening resources, a realistic strategy to improve business data must rely heavily on effective use of current data collection efforts.

5.2.1 Linking Survey and Administrative Data Sources

A comprehensive business data system must integrate information from an array of sources (private and public, business and household based, cross-sectional and longitudinal, survey and administrative, national and subnational) in a way that permits business dynamics to be measured in ways that are just now being conceptualized. Given legal, bureaucratic, and political realities, movement toward an ideal system can be expected to be
incremental, but the basic idea to creatively combine data sources should guide the process from the beginning.

A key aspect of the strategy to better coordinate data collection and production involves solving technical and legal hurdles so that administrative data that are routinely collected by (and from) businesses can be optimally exploited. Use of administrative data can (1) improve accuracy of information, particularly when survey questions require respondent recall, (2) improve breadth of information, and (3) reduce respondent burden by minimizing the amount of information that must be gathered using surveys (National Research Council, 2005a, p. 8). In conducting their business data programs, the statistical agencies could, if permitted, make more effective use of administrative data that are collected as a matter of course.

For studying topics related to life-cycle business dynamics, such as the link between the age of businesses and their economic contributions (e.g., to employment growth or innovation), it is particularly important to develop a linking strategy that allows for the construction of comprehensive longitudinal data structures that capture events as they take place. Many of today’s surveys and censuses have longitudinally incompatible questionnaires. When survey instruments are created and revised, more weight should be given to the potential longitudinal uses of the data. It would also be highly desirable to be able to link new collections to existing data sets to maximize their research and policy value. Linkage opportunities include tapping “nonbusiness” data, such as the CPS, the ACS, and the American Time Use Survey. Looking forward, the statistical agencies should develop their administrative data and surveys with the intent to integrate them into a longitudinal household-business data infrastructure.

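To make the idea of a linked longitudinal household-business structure more concrete, the following sketch links household screener responses to business register records and traces each person's status year by year. It is a minimal illustration under stated assumptions: the record layouts, the shared `person_id` linkage key, the status labels, and all data values are hypothetical, and actual linkage would take place on protected identifiers within the agencies' confidentiality rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HouseholdRecord:
    person_id: str      # hypothetical protected linkage key
    survey_year: int
    nascent: bool       # answered "yes" to a start-up screener question


@dataclass
class RegisterRecord:
    person_id: str                      # same hypothetical key, attached to the owner
    first_receipts_year: int            # first year with taxable receipts
    first_payroll_year: Optional[int]   # None if never an employer


def status_in_year(reg: Optional[RegisterRecord], nascent: bool, year: int) -> str:
    """Classify a person's business status in a given year."""
    if reg is not None:
        if reg.first_payroll_year is not None and year >= reg.first_payroll_year:
            return "employer"
        if year >= reg.first_receipts_year:
            return "nonemployer"
    return "nascent" if nascent else "none"


def link_and_track(households, register, years):
    """Link screener responses to register records and list yearly statuses."""
    by_person = {r.person_id: r for r in register}
    paths = {}
    for h in households:
        reg = by_person.get(h.person_id)
        paths[h.person_id] = [
            status_in_year(reg, h.nascent, y) for y in years if y >= h.survey_year
        ]
    return paths


# Illustrative data only.
households = [
    HouseholdRecord("p1", 2004, True),
    HouseholdRecord("p2", 2004, True),
    HouseholdRecord("p3", 2004, False),
]
register = [RegisterRecord("p1", first_receipts_year=2005, first_payroll_year=2006)]

paths = link_and_track(households, register, years=range(2004, 2008))
for person, path in paths.items():
    print(person, " -> ".join(path))
```

This simplified version only records forward transitions (nascent to nonemployer to employer). Measuring movements in the other direction, such as an employer reverting to nonemployer status or a business closing, would require period-by-period activity records rather than first-occurrence dates.
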
Recommendation 6: The Census Bureau should develop a fully integrated longitudinal household-business data infrastructure from administrative data to serve as a platform for tracking business formation, for integrating household and business survey data for measuring economic activity associated with the business formation process, and for developing samples for new surveys of business dynamics. The integration should include the master household address files, the job frame from linked employer-employee administrative records, and data for firms (including those with no paid employees, but with receipts) from the Census Bureau business register.

The Federal Economic Statistics Advisory Committee recently advised BLS and the Census Bureau to further integrate household and employer data. One motivation is to investigate the discrepancies between the various employment statistics produced by the agencies. Given the potential differences in the treatment of young businesses and nonemployers in data originating from households versus businesses, measures of self-employment and, in turn, business formation and dynamics can be systematically affected. Again, implementing these recommendations will require little or no new data collection, just more intensive record processing.[2]

Moving toward the vision specified in this recommendation involves a long-term strategy. Elements of this strategy to innovatively integrate data and improve coverage of small and young firms are already in motion. The Census Bureau’s Longitudinal Employer-Household Dynamics (LEHD) program combines federal and state administrative data on employers and employees with federal censuses and surveys. The LEHD has created opportunities to conduct research on topics for which empirical analysis of confidential longitudinal linked employer-household microdata is required. Similarly, the Integrated Longitudinal Business Database (ILBD) combines survey and administrative data on employer and nonemployer businesses. The ILBD provides a tool for studying business start-ups and early life-cycle dynamics by tracking business entities as they transition from nonemployer to employer status (Appendix A contains detailed descriptions of these data sources). Many other opportunities exist as well. For example, the SBO, discussed above, could be linked to the ILBD, which would allow for more thorough, though still imperfect, longitudinal tracking of businesses by owner(s)’ race and gender.[3]

An efficient data collection infrastructure also requires that survey programs be well coordinated across statistical agencies. It is not sensible to expand the Census Bureau’s surveys to include more information or to collect it at more frequent intervals when similar data are already collected in BLS surveys. Similarly, it is not efficient to add output and nonlabor input measurement (like capital investments) to BLS high-frequency surveys. However, periodic measurement of all these concepts on the same questionnaire (and from the same entities) is the only way to identify and correct errors in the estimation of dynamic relationships that occur when the microdata from multiple sources are aggregated for use in statistical products. Recognizing cost limitations, the panel recommends that the two agencies use topical modules in each other’s surveys to address this defi-

[2] This is not to say that it will be easy. The quality and properties of administrative versus survey data, including the quality of the longitudinal links, should be a high-priority research topic in the development of this data infrastructure. There are also complications involving timing as it relates to linking survey and administrative data that are captured with different time lags.

[3] Something similar to this was done by Robb (2000) using 1992 business owner survey data and longitudinal establishment data.

Statistical Efficiency Act of 2002 (CIPSEA), which allows sharing of confidential business data among BLS, the Bureau of Economic Analysis (BEA), and the Census Bureau for statistical purposes—provides a foundation facilitating the kind of data coordination needed to work toward this goal.

Recommendation 11: BLS and the Census Bureau should cooperate under the auspices of CIPSEA to initiate, and CIPSEA should be enhanced to allow, the creation and use by source agencies of a reconciled, consolidated, integrated business establishment list.

The potential advantages of an integrated business sampling frame are many: Census data would help BLS improve the consistency of industry codes and (if federal tax data can be shared) BLS could obtain information on the nonemployer universe, thereby improving sampling efficiency. BLS data could benefit the Census Bureau by providing employment data for single units, and industry codes and physical location information for all records. Matching BLS’s business establishment list and the Census Bureau’s business register would allow editing processes to be developed to identify records with large discrepancies. Gains from sharing the two existing business registers include more accurate measurements of births, deaths, and ownership changes; enhanced ability to track mergers and acquisitions; improved industry output and productivity measurement; and possibly reduced costs and burden. The potential benefits to downstream users from these upgrades (many of which are documented in National Research Council, 2006) are substantial.

Implementation of this recommendation would require harmonizing key elements of the current business lists. These elements include frame maintenance, activity/industry coding, birth and death record processing, ownership change identification, and handling of missing data.
The creation of a reconciled, consolidated establishment register requires four broad steps involving information from the Census Bureau registers, BLS registers, and (indirectly) the state employment security offices, which maintain the input list used by BLS:

(1) matching and unduplicating the combined employer establishment lists of BLS and the Census Bureau (consolidated employer establishment list);
(2) integrating and unduplicating businesses from the Census Bureau’s nonemployer registry with respect to establishments in the consolidated employer establishment list;
(3) validating the births, deaths, and entity demography using information from all sources; and
(4) reconciling the activity codes, physical locations, and volume variables when the consolidated list displays disagreements between the sources.

We envision that a reconciliation can be fully automated, as a cost-saving measure; however, human value added should be used to improve the process over time. The new version of the establishment registry would include unique business identifiers, common activity codes (NAICS, presumably), common physical location identifiers (latitude, longitude, presumably), common volume variables (employment and payroll, presumably), common indicators of employer or nonemployer status, and common indicators of type of ownership and enterprise structure (e.g., multiunit).

Reconciliation of the two business lists in itself will not produce the kind of gains the panel envisions unless the most desirable characteristics of each can be brought to bear in the new product—reconciling the lists does not mean drawing from a single administrative and survey source. Even now, BLS and the Census Bureau should be sharing as much of the multiestablishment data (which does not involve IRS data) as they possibly can. A goal of the reconciliation project might be to identify the most productive data items and records to be shared under CIPSEA.

To fully integrate all useful information, reconciliation should take place at or near the end of the production processes that yield the current BLS and Census Bureau business registers. An exception might be that new establishment lists be reconciled as soon as data become available, so that sample frames can be rapidly and continuously updated.

Finally, it is worth reemphasizing that business list comparison work is already well under way at the agencies. This progress notwithstanding, the idea of a BLS-Census Bureau business list reconciliation is still very much in the discussion stages.
The legalities and procedures necessary to get this started (most specifically, the restrictions resulting from tax return data that the Census Bureau receives from the IRS) are not trivial. A high proportion of Census Bureau business records contain data that originate or that are derived from tax records, the use of which is restricted by tax law and IRS regulations.
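The matching and reconciliation steps described above can be illustrated with a minimal sketch. It is not the agencies' procedure: the record layout, the `match_key` rule (a shared identifier plus a normalized name), and the 25 percent employment `tolerance` are all hypothetical choices made for illustration, and real list matching would need far more robust record linkage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Establishment:
    # Hypothetical record layout; real registers carry many more fields.
    ein: str          # assumed shared identifier
    name: str
    naics: str        # activity code
    employment: int   # volume variable


def match_key(rec):
    """Illustrative match rule: identifier plus whitespace/case-normalized name."""
    return (rec.ein, " ".join(rec.name.lower().split()))


def reconcile(bls, census, tolerance=0.25):
    """Consolidate two lists and flag disagreements between the sources."""
    # Building dicts keyed on the match key also unduplicates within each list.
    bls_by_key = {match_key(r): r for r in bls}
    census_by_key = {match_key(r): r for r in census}
    consolidated = []
    for key in sorted(bls_by_key.keys() | census_by_key.keys()):
        b = bls_by_key.get(key)
        c = census_by_key.get(key)
        sources = [s for s, r in (("BLS", b), ("Census", c)) if r is not None]
        flags = []
        if b is not None and c is not None:
            if b.naics != c.naics:
                flags.append("naics")
            larger = max(b.employment, c.employment)
            # Flag employment gaps above the (assumed) relative tolerance.
            if larger and abs(b.employment - c.employment) / larger > tolerance:
                flags.append("employment")
        consolidated.append({"key": key, "sources": sources, "flags": flags})
    return consolidated
```

Records found in only one source are kept with a single-source marker, which is the kind of case that the validation of births and deaths would then have to examine; flagged records are candidates for the editing processes mentioned earlier.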

5.2.3 Expanding the Use of Data

Accurate and timely information about the economy is critical for effective policy making in both the private and public domains. The statistical agencies rightly view themselves primarily as data collectors, and their mission is to do so, maximizing quality subject to budget constraints. While the agencies have skilled in-house staffs of policy analysts and researchers, the vast share of expertise resides elsewhere, and public policy research must be done at universities or other nongovernmental institutions. It is this intensive use of statistical agency products that gives them their high public value.

Recommendation 12: The quality of research based on business data produced by the statistical agencies would improve with greater interaction between outside researchers and businesses and the statistical agencies. As recommended in previous Committee on National Statistics reports, statistical agencies, in particular the Census Bureau, should incorporate into their missions a broader interpretation of the criteria for access to data. Specifically, research that informs social and economic policy should be considered a valid reason for accessing confidential data.

The panel is encouraged by recent steps at the Census Bureau and IRS to emphasize the importance to public and private decision making of research that takes place at the agency’s data centers, and to work out procedures to facilitate streamlined processes for reviewing proposed research projects.6 It is also important that the statistical agencies facilitate front-end collaboration with the academic community (and, in some cases, the business community as well) with respect to survey design. Ideally, the basic elements of the employer section of the business register should be made accessible to qualified researchers to serve as a sampling frame for their surveys. Practically, we realize that this will not happen overnight, and we recommend extending access first for key government policy research purposes.

6 Our optimism is based, in part, on a January 4, 2007 memo from the Director of the Census Bureau which states: “United States Code Title 13 Chapter 5 directs the Census Bureau to carry out censuses and surveys of the U.S. population and economy. Ensuring that resulting data meet the highest standards of quality and utility requires significant supporting analytical research by Special Sworn Status researchers participating in the Census Research Data Center program. The importance of this research will only increase, as the data needs of public and private decision makers grow broader and more complex…. Accordingly, and to continue fulfilling its mandate at the highest level of technical excellence, it is the policy of the Census Bureau to undertake analytical research for authorized purposes, to the fullest possible extent.”

Recommendation 13: It would be highly desirable if the business register(s) were available to federal agencies for the purpose of constructing sampling frames. For example, the Board of Governors of the Federal Reserve System should be able to access, in a secure setting that ensures current levels of confidentiality, the Census Bureau business register for the purpose of drawing samples for the surveys that they conduct or commission. Changes in legislation protecting confidential information should permit sharing of data—including reconciled industry, location, entity identifier, volume, and employer status codes—for statistical purposes with the condition that the original source of the reconciled value and entity is not identified.

For many purposes—such as state and local planning—data must be collected and accessible in a way that allows for small-area analyses. Data with precise location identifiers are also needed to document the impact of federal government policies and actions. As a specific example, one could imagine research and policy interest in data on federal government contract awards to private businesses, linked to the Census Bureau business register using EINs or other common business identifiers. The Census Bureau could use data linked in this way to produce public-release statistics on the volume and type of contract awards by county, industry, business size, and business age. The impact on local economies of base closures, military reserve deployment, emergency relief, and extending eligibility for unemployment insurance are other examples of situations in which the capacity to fully analyze events has been compromised by data limitations.

The same attributes that make data useful for research and policy can also increase their value to businesses.
For example, most business planning takes place at the subnational level, which generates a need for small-area statistics. However, additional burden can also be created since firms, especially multiunit firms, may have trouble disaggregating information into small geographic areas. Often, firms are able to report only at a more aggregate level than that which would truly be of interest. More generally, there are increasing conceptual difficulties associated with assigning a physical location to economic activity performed by businesses. It is becoming commonplace for economic activity to be conducted by “virtual businesses”—groups of people collaborating without a formal employment relationship. Globalization has also created more complex and far-flung supply chains. The statistical agencies have done a good job of integrating small-area details in many of their data programs. Survey programs should continue to collect, and administrative record systems should maintain, data that enable (1) identification, for authorized purposes, of detailed geographic

and sectoral location of business activity, generally at the establishment level, and (2) flexible aggregation of statistics by product, industry, region, county, etc. Geographic specificity raises confidentiality issues, since finer geocodes compromise anonymity. Typically, data at the finest level of geographic detail can be made available only in restricted access settings or, for public-use data sets, in an altered form. Agencies can add value to their products to the extent that they can utilize statistical “fuzzing” techniques to maintain confidentiality without completely losing geographic details.

Recommendation 14: Using synthetic data approaches or other statistical disclosure limitation techniques, BLS and the Census Bureau should work to develop anonymized, public-use versions of their recently developed longitudinal data sets on businesses. This should include the Longitudinal Database on Businesses from BLS and the Longitudinal Business Database and the ILBD at the Census Bureau.7

Work to implement the recommendations in this subsection can begin without major cost outlays. These ideas (which are far from novel) involve no new data collection, only a fuller recognition of the public value of statistical data use and a shift in the policies regulating the scope of that use—the topic to which we now turn.

5.3 CHANGING THE DATA-SHARING ENVIRONMENT TO REALIZE SYSTEMIC EFFICIENCY

Initiatives for sharing data among statistical agencies (including individual data and address lists when permitted by law and when sharing does not violate confidentiality promises) can be helpful for such purposes as achieving greater efficiency in drawing samples, evaluating completeness of population coverage, and reducing duplication among statistical programs, as well as reducing respondent burden.
—Practice 11: Coordination and Cooperation with Other Statistical Agencies, Principles and Practices for a Federal Statistical Agency (National Research Council, 2005a, pp. 44-45)

7 It should be noted that Reiter and Kinney (2006) are engaged in work to produce a partially synthetic version of the Longitudinal Business Database.

We have emphasized a strategy for improving business data and statistics that involves more effective use of administrative and survey data that are already collected. Data sharing among the agencies is a key aspect of this idea. In order to produce the highest quality data sets and statistics at the lowest possible cost, the statistical agencies must be able to access the best information available, system-wide. Data sharing has the potential to reduce respondent burden as well which, along with assurances of confidentiality, may increase the likelihood that businesses respond to survey requests. To their credit, the statistical agencies have recognized the potential gains from data sharing, and survey and administrative data on U.S. businesses are shared to some extent among BEA, BLS, and the Census Bureau for statistical purposes.8

Recommendation 11, above, argues for extending CIPSEA to increase the flexibility with which information can be shared among statistical agencies for purposes of constructing a comprehensive business register and for designing special surveys. Sharing of business registers is essential in order to continue to improve the accuracy of measures of industry output, compensation, and productivity trends. It would permit the statistical agencies to keep abreast of the dynamic economy by producing statistical samples that are consistently and quickly adjusted to reflect entry and exit of new businesses. This is especially important for fast-growing and innovative industries, such as information technology. Such improvements would enhance our ability to perceive emerging trends in the economy and more accurately forecast economic activity.9

The panel endorses most aspects of the past efforts (reviewed in 4.4 Appendix) to expand data sharing. Effective coordination of statistical agency data programs is essential for improving the accuracy, coverage, and timeliness of business data, as well as the efficiency with which it is produced.
As discussed above, a key part of the strategy to develop the most useful business data system possible (and a valuable and low-cost first step) would be to coordinate, and improve in other ways, the business lists residing at the Census Bureau and at BLS. Expanded interagency data sharing is a prerequisite for making progress on such a project. Before work can progress much further to reconcile the business lists, and before data sharing between the three CIPSEA-designated agencies (BLS, BEA, and the Census Bureau) can be fully exploited, the IRS regulations and tax code legislation must be changed.

8 The “statistical purposes” qualifier excludes using information for administrative, regulatory, law enforcement, judicial, or other purposes that may affect the rights, privileges, or benefits of a respondent.

9 This argument was well articulated in comments made by Federal Reserve Board governor Randall S. Kroszner about “developing innovative statistics for a dynamic economy” (Kroszner, 2006).

Recommendation 15: Measures should be taken immediately to facilitate the expansion of CIPSEA to increase the kinds of information that may be shared among statistical agencies for the purpose of reconciling the business lists and for the design of special surveys. This expansion of data sharing can be accomplished by (1) Congress enacting legislation that revises Internal Revenue Code Section 6103(j) to extend authorized access of IRS tax information to BEA and BLS, (2) the Treasury Department initiating an update of the IRS regulations, which clarify purpose and detail specific items that can be shared with authorized agencies, or (3) a combination of both actions.

While recognizing that the “ideal” data system for measuring business dynamics would ultimately integrate data from an array of sources—private and public, business- and household-based, cross-sectional and longitudinal, survey and administrative, national and subnational—it is worth noting that CIPSEA will, in reality, probably expand only incrementally. A first step should be a push to amend Section 6103 of the IRS Code and Treasury regulations to allow BLS and BEA access to part or all of the tax data to which the Census Bureau has access for the specific purposes of creating a unified business list. This might entail limiting data sharing, for federal tax information, to a small number of variables needed for business list coordination (e.g., name, address, legal form of organization, ownership structure, identity of parent firm if applicable, industrial classification, geographic coding information, employment, and payroll).

The goal of the agencies charged with creating and maintaining the source lists should always be to include accurate data on all business units, large and small, new and old. There are, according to the Census Bureau, roughly 18 million nonemployer firms10—about three quarters of all firms—and they constitute a reasonably large fraction (12 percent in 2000) of aggregate U.S.
business revenues (http://www.census.gov/epcd/www/smallbus.html). In addition, for the subset of businesses in which this panel is particularly interested—the young and small ones, many of which operate in service and information sectors—there is much fluidity between those that have employees and those that do not. Indeed, a substantial portion of established firms in the United States started out as nonemployers (many of them sole proprietorships). Thus, sharing of business data would be quite limited if it did not permit an integration of the list information from the Census Bureau on nonemployer businesses.

10 For nonemployers, a firm is the same as an establishment. Because the Census Bureau counts each distinct business income tax return filed as an establishment, it is possible for an individual to account for more than one nonemployer establishment.

Recommendation 16: In order to create a comprehensive business list and to generate data that would be useful for studying the dynamics of small and young firms, interagency sharing agreements should extend to data on nonemployers. Data on all sole proprietors and partnerships must also be included, whether they have employees or not.

We believe that a compelling data-driven case has been made, in this report and elsewhere, that reconciliation of the business lists would better serve downstream users—such as BEA in the production of national accounts and the Federal Reserve in carrying out research to inform monetary policy—to an extent that more than warrants the actions recommended here. In order to effect such changes, active support will be needed from within the administration (e.g., the Office of Management and Budget and the Council of Economic Advisers, where it appears to already exist) and from congressional staff. A key element to generating this support involves ensuring that further sharing of business data for statistical purposes does not unduly compromise confidentiality. The political and legal feasibility of such an extension has certainly been enhanced by recent events. The provisions of CIPSEA provide sufficient coverage to continue to ensure that the privacy and confidentiality of records will be maintained, even with expanded sharing of information. Indeed, given the uniform set of confidentiality requirements enacted through CIPSEA, the agencies are now in a better position than ever before to protect data collected for statistical purposes under a pledge of confidentiality.

Finally, one danger associated with improving data quality through cooperative arrangements is that it may increase the risks (or perceived risks) of allowing researcher access.
Interagency data sharing efforts are clearly desirable, but precautions must be taken to ensure that the improved richness of the business data source resulting from linking, sharing, or better coordination of administrative and tax records cannot rightly be used as an argument for further restricting access.

5.4 RECOMMENDATION PRIORITIES AND COSTS

In this report, we have presented our views on (a) why the United States needs to improve its measures and understanding of business dynamics; (b) why this requires obtaining better data, especially longitudinal information, on new and small businesses than are currently produced; (c) what an ideal data collection system for monitoring business dynamics might look like; and (d) some of the steps that would need to be taken to facilitate an ongoing and feasible data collection effort that is sensitive to confidentiality, legal, and cost considerations.

In this chapter, we have presented our recommendations; however, we realize that this is a very broad outline, and creative work will have to continue within the statistical agencies, by outside researchers, and through collaborations between these groups. The statistical agencies should continue to tap into their advisory committees (such as the Federal Economic Statistics Advisory Committee) to provide further guidance on prioritizing the recommendations in this report as well as new ideas that will continue to emerge. While the panel cannot provide detailed estimates of what the recommendations in this report would cost to implement, we have attempted to provide guidance on the topic, and we summarize some of our views on costs and priorities here.11

Work could begin almost immediately to implement several of the panel’s recommendations, and with modest resource commitments. Actions associated with two of the report’s core themes could accurately be categorized as low-hanging fruit—they involve minimal long-term monetary outlay (there may be political costs) but are likely to yield high value: (1) The statistical agencies should maintain the business register in a more coordinated manner—Recommendations 11, 15, and 16—and (2) they should utilize it to produce new tabulations of economic activity by measures of business (establishment) age—Recommendations 2 and 3.

The specifics of point 1 involve expanding information sharing by government agencies and the use of such things as common identifiers to make it easier to link data across administrative and survey sources. The report provides guidance on how these objectives could be attained without unduly compromising confidentiality and without creating excessive new reporting burdens to businesses.
Part of the strategy—for example, to develop consistent and comprehensive data on nonemployer businesses—involves more intensive and better coordinated use of existing administrative records. Implementing point 2 also seems to us quite feasible. Business age variables can be constructed by linking the business registry over time; the Census Bureau’s Center for Economic Studies has already done this through its construction of the Longitudinal Business Database. Publishing information about how economic activity varies with establishment age is essentially a matter of producing new tabulations (some of this is already taking place with the SUSB program, which publishes total employment in establishments that are one year old or less). One potential cost embedded in this recommendation is that, when publishing tabulations, trade-offs must sometimes be confronted in order to maintain respondent confidentiality. So, while age is an important variable, the panel is not suggesting giving up equally and, for many applications, more important industry and geographic detail to acquire it.

11 Our organizational thinking for this section was shaped by the comments of two reviewers of the report.

The report also includes recommendations—specifically numbers 1, 4, and 5—that would be more expensive to implement and that would require a longer term concerted effort to carry out. Recommendation 1 states that young businesses should be sampled more in existing business surveys, which implies that either the number of large firms that are sampled must be reduced or that funding for the surveys must be increased marginally to cover costs associated with a larger survey. The rationale underlying this guidance is that young and small establishments should be given more weight than their receipts or total employment might suggest, because their characteristics change quickly and because they may contribute disproportionately to growth.

Recommendation 4 proposes incrementally adding questions to household surveys to identify and gather information about nascent entrepreneurs. The panel recognizes the inherent inefficiencies in using household surveys for this purpose. While the cost of creative integration of data sources and of adding a module to an existing survey is not exorbitant (the former could even lead to cost savings down the road), following firms from their genesis forward is not a trivial task. Since this is a potentially expensive undertaking, it is a lower priority, or at least a more long-term goal. Recommendation 5 proposes that the SBO be conducted on an annual basis rather than a quinquennial basis, which would require new funding.

It should be noted that none of these recommendations suggest the creation of major new data collection efforts; most involve adjustments or reorganization of existing activities.
Even the proposals for new data collection involve additions, enhancements, or modifications of existing administrative or survey efforts. In fact, almost all of the proposed additional procedures have already been developed and extensively field tested in one context or another. Most of the report’s remaining recommendations have to do with program processes and data access issues that are already within the realm of ongoing agency responsibilities. Other recommendations, such as those suggesting more rapid integration of new technologies or more effective use of existing data sources, are offered with the hope of encouraging long-term efficiency of business data collection. In summary, a major justification for the panel’s recommendations is to avoid the costs of “benefits foregone” from the absence of timely, precise data on the mechanisms by which the U.S. economy adapts and grows. Even the most liberal estimates of the financial costs associated with all the recommendations would be in the tens of millions of dollars, spread across several federal agencies. The return on such an investment would be an

improvement in the data required to understand the complexity of business dynamics in the United States and, possibly, a reduction in the likelihood of mistakes in national policy making. The cost of a minor misstep that reduces the growth of a 13 trillion dollar economy by even a small fraction is in the billions of dollars. By any calculus, the amount of resources required to provide a more timely, more accurate, and more complete description of U.S. business dynamics seems like a very good investment of public resources, yielding substantial benefits for future generations.
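The order-of-magnitude comparison in the closing paragraph can be checked with a short calculation. This is only an illustrative sketch: the 13 trillion dollar figure comes from the text, while the 50 million dollar program cost (standing in for "tens of millions") and the 0.1 percent shortfall (standing in for "a small fraction") are assumed values, and the function names are hypothetical.

```python
GDP_DOLLARS = 13e12  # the "13 trillion dollar economy" cited in the text


def output_loss(gdp, fraction):
    """Dollar value of output lost when growth falls short by `fraction` of GDP."""
    return gdp * fraction


def breakeven_fraction(program_cost, gdp):
    """Smallest avoided shortfall, as a share of GDP, that repays the program cost."""
    return program_cost / gdp


# Assumed, illustrative inputs; neither number appears in the report.
assumed_cost = 50e6        # stand-in for "tens of millions of dollars"
assumed_fraction = 0.001   # stand-in for "a small fraction" (0.1 percent)

loss = output_loss(GDP_DOLLARS, assumed_fraction)
print(f"Loss from a {assumed_fraction:.1%} shortfall: ${loss / 1e9:.0f} billion")
print(f"Break-even shortfall: {breakeven_fraction(assumed_cost, GDP_DOLLARS):.6%}")
```

Under these assumptions a shortfall of one tenth of one percent corresponds to roughly 13 billion dollars, several hundred times the assumed program cost, which is consistent with the panel's qualitative claim.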