Appendix A
Overview of Current Data Collections
This appendix provides an overview of current data sources used to construct business and employment statistics and to inform research and policy related to business formation, dynamics, and performance. Our focus is on data produced by the U.S. federal statistical system, but we also cite other examples.
The material in this appendix is organized into subsections loosely defined in terms of data source characteristics and purpose:
- data to count firms and catalogue essential characteristics (the business lists);
- longitudinal data for tracking businesses over time;
- data sources designed to improve coverage of small businesses;
- aggregate employment statistics;
- data on the self-employed, entrepreneurs, and business gestation;
- coverage of special sectors, such as agriculture, nonprofit organizations, and e-commerce; and
- financial data.
Beyond describing the basic design elements, we indicate the extent to which data are available to users outside the agencies (or other organizations) that collect them. Statistical agencies generally provide documentation for their accessible data sources, and we try to avoid reproducing detailed descriptions that can be accessed elsewhere.1
Table A-1, located at the end of this appendix, allows quick cross-comparisons of various data sets. In this appendix, we omit several important kinds of business data that are more closely linked to production of aggregate statistics and are less central to the panel’s charge. For example, we do not directly discuss price data—notably the producer price index (PPI), which measures changes over time in the selling prices received by producers of goods and services—or the array of industry and input/output data (much of it deflated by PPI) crucial to productivity measurement and to the construction of the national accounts and statistics on gross domestic product (GDP).
A.1
COUNTING FIRMS AND CATALOGING ESSENTIAL CHARACTERISTICS—THE BUSINESS LISTS
The two primary business lists administered by federal statistical agencies in the United States are the Census Bureau’s business register (BR), and the Bureau of Labor Statistics’ Quarterly Unemployment Insurance (UI) address file, more commonly referred to as the Quarterly Census of Employment and Wages (QCEW). Administrative data from the Internal Revenue Service (IRS), which maintains the Business Master File, and the Social Security Administration (SSA) underpin the BR, while the QCEW relies on data from the state UI programs. The most noteworthy business list maintained outside government is the Dun & Bradstreet Dun’s Market Identifiers (DMI).
A.1.1
The Census Business Register
In 1968, the Office of Management and Budget directed the Census Bureau to develop and maintain a comprehensive business list. Known until recently as the Census Bureau’s Standard Statistical Establishment List (SSEL), the BR covers the universe of businesses—over 7 million employer businesses and some 16.5 million nonemployer businesses. The BR serves as the master enumeration list for sampling frames drawn for the Census Bureau’s firm and establishment surveys, most notably the quinquennial
1 |
The Kauffman Foundation web page (http://research.kauffman.org) has a well-organized list, with links to government and private sources of data on U.S. and international businesses; the list focuses on entrepreneurship, small business, and self-employment information. RAND, with funding from the Kauffman Foundation, has also assembled an overview of data resources on small businesses (see http://www.rand.org/pubs/working_papers/WR293/index.html). |
economic census. The economic census, conducted during years ending in 2 and 7, covers over 5 million companies; nonemployer and small businesses are covered by sample only, not a full census (http://www.census.gov/econ/overview/mu0000.html). Domestic, nonfarm business data are collected at the metropolitan statistical area (MSA) geographic level. Because it occurs only every five years and a firm can materialize and close (or vice versa) over shorter periods, the economic census does not comprehensively capture business birth and death information.
The BR also serves the important function of providing central storage for an array of administrative data—most notably, payroll tax records, corporate and individual tax returns, and Employer Identification Number (EIN) application information. Maintenance of the BR is heavily dependent on these administrative data. Data on nonemployer firms are drawn exclusively from administrative sources, mainly business income tax records.2
Within the BR, data are organized at the establishment level, that is, a single location where goods are produced or services provided. Reflecting the composition of the economy, most establishments are single-unit businesses, but some are part of businesses operating in multiple locations. Because taxes (and in turn tax information) are collected from firms, Census researchers must break up IRS administrative data to the establishment level for multiunit enterprises.3 In the years between economic censuses, this is done using information from the Company Organization Survey (COS), an annual survey of all large employers (250 employees or more) and a sample of smaller mid-size companies, reaching approximately 50,000 of the largest multiunit enterprises. The accuracy of the single/multiunit identification is reported to improve around economic census years and then to decline thereafter (Jarmin and Miranda, 2002). The COS is used more generally in an attempt to maintain up-to-date company affiliation, location, closings, spin-offs, and operating information for multiestablishment companies. This allows for fuller coverage of such companies, which account for the vast majority of the nation’s business activity. Title 13 of the U.S. Code authorizes this and the other economic census-related surveys and stipulates mandatory responses.
A key element of the BR program is the identification and tracking of individual establishments owned by multiestablishment firms. The BR has excellent coverage of multiestablishment firms every five years, but in the years between economic censuses, the Census Bureau relies on the COS to update the multiunit segment of the business register, along with information it learns about multiestablishment firms from its other surveys (e.g., the Annual Survey of Manufactures4). The limited scope of the sampled firms and the rotation of these sampled firms over time affect the timeliness and coverage of smaller multiestablishment companies in the business register. Several studies have noted that both births and deaths of smaller multiunit establishments are more concentrated in the year prior to the economic census, when the register is being prepared for the upcoming census.5 The same clumping of changes appears in other dimensions of the data as well: McGuckin and Peck (1992) show that industry coding changes for establishments were especially concentrated in the economic census years.
A number of improvements were made to the newest version of the BR, which became fully operational in January 2002: additional data elements were added; seven years of data are now maintained, instead of three, allowing tracking of businesses from one quinquennial census to the next; processing of nonemployer statistics, which previously were not maintained, has been expedited; and industry detail has been brought into concordance with the North American Industry Classification System (NAICS). In addition, in 2005, the IRS began providing quarterly employment data from tax form 941 instead of only for the first quarter. Form 941 includes the EIN, employer-reported wages and other compensation, employment for the pay period, income and social security tax withholdings, and related information. When a new business payroll record is received from the IRS, the Census Bureau adds a business employer record to the BR. Nonemployers cannot be identified as quickly, since personal income tax returns are filed annually rather than quarterly. Form 941 now also includes an identifier for businesses filing final tax returns—useful for capturing business deaths. In July 2004, Census began receiving SS-4 form data directly from the IRS (rather than by way of SSA, as before) on a weekly basis, which allows industry codes to be assigned to new businesses more quickly.6
4 |
For noneconomic census years, the Annual Survey of Manufactures provides sample estimates of employment, plant hours, payroll, number of establishments, cost of materials, value of shipments, inventories, and detailed capital expenditures statistics for commercial manufacturing establishments with paid employees (http://factfinder.census.gov). |
5 |
Jarmin and Miranda (2002) discuss the importance of retiming both small multiunit births and deaths in these data in order to improve the accuracy of the annual birth and death statistics. |
6 |
Salyers (2004) provides a “progress report” for the BR, including a full description of the expanded use of administrative records, as well as a more general listing of recent changes and improvements (http://www.stats.gov.cn/english/18roundtable/papers/t20041230_402219768.htm). |
Data from the BR, as well as the more than 100 surveys that rely on its sampling frame, are used in the production of a wide range of publicly available aggregate statistics (many available on the Census Bureau’s American FactFinder web page at http://factfinder.census.gov). A widely used product of the BR is the Census Bureau’s County Business Patterns. First-quarter employment and payroll numbers, cross-tabulated by county and kind of business, are published, cooperatively with the SSA, in the County Business Patterns and in the ZIP Code Business Patterns statistical series.
In addition, the Census Bureau’s Non-Employer Statistics (NES) “provides U.S. and sub-national data by industry for businesses without paid employees.” Originating primarily from administrative records, the NES “summarizes the number of establishments and receipts of sole proprietorships, partnerships, and corporations without paid employees.” The Census Bureau began publishing NES data annually in 1997, and annual releases beginning with the year 2002 can be found on its American FactFinder web page.
These publications provide geographic aggregates of the BR microdata. BR data are also essential to economic research conducted at the Center for Economic Studies (for a description of these uses, see http://www.ces.census.gov/index.php/ces/1.00/researchprogram). Although a number of BR-based aggregate statistics enjoy high visibility, the BR is also structured with confidentiality very much at the fore. The BR itself is not a publicly available document, although parts of the register can be used by researchers under highly restrictive arrangements at the Census Bureau’s research data centers (RDCs). Beyond this, data from administrative records are maintained in separate tables, and IRS Title 26 data are segregated from Census-collected data. Microdata on race and gender, required for the Survey of Business Owners (SBO), are likewise stored in a separate table for use by SBO analysts only.
A.1.2
The BLS Business List
The other primary business list maintained in the federal statistical system is BLS’s QCEW—formerly the Business Establishment List, initiated in 1988. The QCEW converts data submitted by the universe of employer businesses covered by state UI systems (ES-202), as well as federal agencies subject to the Unemployment Compensation for Federal Employees program, to an establishment basis. The master file includes a number of key fields: establishment name, address, telephone number, monthly employment and quarterly wages, federal EIN—all available by NAICS code, county, and ownership sector for the entire United States.7 UI wage records
7 |
Full details are documented at the BLS QCEW home page (http://www.bls.gov/cew/). |
for individuals working in UI-covered employment are used at times by BLS and the states to validate individual cases of large wage fluctuations and include name, Social Security number, employer name and address, employer ID, and total earnings paid.
The QCEW serves as the sampling frame for most BLS surveys, and it is used to benchmark the Current Employment Statistics (CES) establishment survey. The establishment count also sets the population base in establishment birth and death estimators. The QCEW program provides a comprehensive source of employment and wage statistics, as well as a virtual census (98 percent) of employees on nonfarm payrolls (Spletzer et al., 2004). A crucial limitation of the QCEW, particularly in the context of understanding new and young business dynamics, is that it excludes nonemployer businesses and data on owner characteristics. The QCEW, which currently is geocoded to the rooftop level for 90 percent of private-sector employment, has plans for developing data at the census tract level. QCEW provides industry, employment, county, and physical location addresses on over 3 million firms, mostly new and small businesses, to the Census Bureau. However, the QCEW and BR have different structures, which makes cross-survey comparisons difficult. In addition, requirements under the UI program’s Multiple Worksite Report (MWR) vary from state to state and have size thresholds that may exclude certain businesses.8 Finally, the ability to make longitudinal and cross-state linkages is complicated because no firm ID fields other than the tax ID number exist in the database (this is discussed again in the next section).
In the MWR, “multi-location employers with a total of 10 or more employees in their secondary locations are required or requested” to break out their employment and payroll by individual establishment. The MWR is mandatory in 21 states and provides good coverage for all but the smallest multiestablishment employers on a timely basis.9 The timing of small multiestablishment births may not be accurate because reporting will depend on the secondary establishment passing a threshold size. Thus, when a single-location firm expands to a multilocation firm, one will not observe the “new” establishment until the establishment has at least 10 employees. In addition, if the expansion occurs across state lines, it may not be captured as a multiestablishment birth but as a new firm in the other state if it did not already have a presence in that state and if the firm has different EINs across states. There are also issues about firms that have multiple UI
8 |
For example, businesses are not required to report a location in another state if there is only one; other sites within the state if total employment from these sites is less than 10; or any site that is under a different UI account number (http://bls.gov/cew/). |
9 |
and EIN accounts within a state that may affect multiestablishment measurement. QCEW has tried to identify across-state expansions in two ways. First, the state staff may notice a significant change in employment and wages reported by a firm. Upon follow-up, the staff may determine that a firm should file the MWR. If the change in employment and wages is small enough that the state staff does not observe the differences, the need for the MWR filing is captured after the employer completes the Annual Refiling Survey (ARS) and reports a new location.10 About 2 million businesses are contacted annually to update such information as business name, address, and industry codes through the ARS.
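The effect of the MWR size threshold on the timing of observed establishment births can be illustrated with a minimal sketch. It assumes only the rule quoted above (a combined total of 10 or more employees at secondary locations); the function name, the constant, and the sample figures are hypothetical, and actual state requirements vary and are not modeled here.

```python
# Threshold quoted in the text: 10 or more employees, in total,
# across an employer's secondary locations.
MWR_THRESHOLD = 10


def mwr_breakout_expected(secondary_site_employment):
    """Return True when combined secondary-site employment meets the threshold.

    secondary_site_employment: list of employee counts, one per secondary site.
    This is an illustrative simplification, not the official BLS rule set.
    """
    return sum(secondary_site_employment) >= MWR_THRESHOLD


# Hypothetical single-location firm that opens a second site and grows it
# over three quarters: 4, then 7, then 12 employees.
second_site_by_quarter = [4, 7, 12]
visible = [mwr_breakout_expected([n]) for n in second_site_by_quarter]
print(visible)
```

In this hypothetical case the second site exists for two quarters before it would be broken out, which is the mistiming of small multiestablishment births described in the text.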
As with the BR, numerous data products and statistics are derived from the QCEW, most prominently the quarterly wage and employment statistics, aggregated at various industry and geographic levels. Microdata underlying the QCEW are not publicly accessible; however, BLS does offer limited opportunities for researchers to access confidential data for the purpose of conducting statistical analyses. Data access is restricted to onsite use at the BLS national office in Washington (a list of the restricted access data sets available to researchers can be found on the BLS web site, http://www.stats.bls.gov/bls/blsresda.htm).
A.1.3
Dun’s Market Identifiers
Business data are also collected by private-sector firms. These efforts are typically geared more toward marketing or informing business decisions and less toward research and public policy. The most prominent private-sector collection (and one that has been used for both purposes) is the Dun & Bradstreet (D&B) DMI. Because the BLS and Census business lists are not typically available as sampling frames outside those agencies, D&B data, along with the company’s Data Universal Numbering System (DUNS), have been widely used in a variety of applications elsewhere in government. For example, D&B data serve as the sampling frame for the Federal Reserve’s Survey of Small Business Finances. The DUNS numbers are also used by the federal government to identify entities receiving federal contracts. D&B data have been broadly used by private-sector firms to estimate numbers of businesses, establishments, and employees, as well as sales, and to perform cost-benefit analyses and risk assessment exercises. D&B data products can be purchased and used subject to the company’s terms and conditions, which differ for end users (individuals, businesses, and information professionals).11
10 |
Based on correspondence from Jim Spletzer, BLS. |
11 |
A full description of these terms and conditions can be found at http://library.dialog.com/bluesheets/htmlaa/bl0518.html. |
DMI includes basic data, updated monthly, on over 2.9 million private and public companies and 17 million U.S. business establishment locations (about 18.4 million records as of January 2006) operating in private, public, and government spheres (there are also European and other international versions). The data set is broadly representative of all businesses but limited to private and public companies with five or more employees or sales of $1 million, and consequently, it does not include many of the newest start-up firms or self-employed individuals.12 In contrast, the IRS reports that for 2003 about “19.7 million individual income tax returns reported nonfarm sole proprietorships” (Pierce, 2005), of which about 3 million filed a Schedule C-EZ, on which annual receipts totaled less than $25,000 (www.irs.gov). The file contains up to three years of basic data (the length of coverage varies by company), such as type of business, legal and trade names, physical and mailing addresses, geographical descriptions, product and industry descriptors, sales and number of employees (and the number at each corporate location), growth rates, annual sales, net worth and profit, names and titles of key executives, corporate linkages, DUNS numbers, and other marketing information.
D&B data are collected from various sources, such as in-person and telephone interviews, government publications, and business trade programs and mailings, a fact that limits the quality of information in some important ways. For example, there is no standard guideline for detecting new businesses and incorporating them into the file—information is brought in ad hoc from applications for credit, classified advertising, and other private sources. Similarly, there is no clear process for purging records. Unlike several of the government data sources, DMI does not have a mechanism for determining establishment versus firm records. Furthermore, the data are not longitudinal; in fact, DMI is not cross-sectional for a specific point in time, since there is no regular schedule for updates—the process is ongoing (Haviland and Savych, 2005).
A.2
TRACKING BUSINESSES OVER TIME: BUSINESS LIST-BASED SOURCES OF LONGITUDINAL MICRODATA
Sources of longitudinal business microdata have historically been scarce, particularly for smaller and newer businesses. However, new data programs are emerging that greatly enhance available information relevant to the topics covered in this report. Among the most promising data sets now or soon to be coming online are the Integrated Longitudinal Business Database (ILBD) and Longitudinal Employer-Household Dynamics programs at the Census Bureau and BLS’s Business Employment Dynamics.
A.2.1
ILBD and Precursors
The ILBD has evolved as a natural extension of the Longitudinal Business Database (LBD), which the Census Bureau’s Center for Economic Studies began constructing in 1999. The LBD covers employer establishments, currently for the period 1975-2003. These programs, which can be traced to the early 1990s (under various names), have expanded research capabilities to new frontiers that would not have been possible with aggregate and cross-sectional data alone.
The LBD was constructed using EINs to link year-to-year snapshots of all employer establishments, along with name and location information contained in the Census Bureau’s SSEL. Work is ongoing to add such fields as payroll employment, location, industrial activity, and firm affiliation. The LBD is useful for researching elements of business dynamics, such as firm entry and exit and job flows. Establishment identifiers also facilitate linking the LBD to other data sets. The value of the data set is enhanced by its algorithm for flagging establishment records as births, deaths, or continuers. Generally speaking, a birth is identified when a record appears for the current year that does not match any record from the previous year; a death is detected when a record for a previous year does not match any record for the current year; and continuing establishments show a match from one year to the next (see Jarmin and Miranda, 2002, for a detailed explanation of this algorithm). The practice of using EINs in conjunction with name and address information is intended to increase the accuracy with which establishment births and deaths can be identified; missing source data for some years make this a challenge.
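The year-to-year matching rule described above can be sketched in a few lines. This is a simplified illustration, not the LBD algorithm itself: it assumes each establishment carries a single stable identifier, whereas the actual procedure (Jarmin and Miranda, 2002) combines EINs with name and address matching. The function name and the sample identifiers are hypothetical.

```python
def classify_establishments(previous_ids, current_ids):
    """Flag establishment IDs as births, deaths, or continuers.

    previous_ids, current_ids: iterables of establishment identifiers
    observed in the previous and current year, respectively.
    """
    previous = set(previous_ids)
    current = set(current_ids)
    return {
        # In the current year but with no match in the previous year.
        "births": sorted(current - previous),
        # In the previous year but with no match in the current year.
        "deaths": sorted(previous - current),
        # Matched from one year to the next.
        "continuers": sorted(current & previous),
    }


# Hypothetical identifiers for two consecutive annual snapshots.
flags = classify_establishments(
    previous_ids=["E001", "E002", "E003"],
    current_ids=["E002", "E003", "E004"],
)
print(flags)
```

Under this rule, a record that is simply missing from the source data for one year would be flagged as a death and then as a birth, which is one reason missing source data make accurate identification a challenge.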
The LBD itself is an extension of another predecessor from the Center for Economic Studies, the Longitudinal Research Database (LRD), which contains longitudinally linked plant-level data from censuses and annual surveys of manufactures. With the relatively rapid growth of, and subsequent interest in, nonmanufacturing sectors, the narrow focus of the LRD has become an increasing concern. Furthermore, LRD coverage of firms with fewer than 250 employees is limited, and the plant-level data are not linked to enterprises, so the overall size and industry of enterprises owning large plants are not always known. Despite these limitations, the LRD has been intensely analyzed and has spawned a robust literature (see Bartelsman and Doms, 2000, for a review of these efforts). The LBD allowed academic research on employment dynamics issues at the establishment level (forged by Dunne, Roberts, and Samuelson, 1989; Davis and Haltiwanger, 1990, 1992; and Davis, Haltiwanger, and Schuh, 1996) to begin expanding beyond the manufacturing sectors.
The ILBD marks another discrete advance for business research aimed at understanding the processes of small and young firms over time, as its coverage is much broader than that of its predecessors. Extending work by researchers such as Boden and Nucci (2004),13 the ILBD integrates federal government administrative records and survey sources for nearly all private, nonagricultural employer and nonemployer businesses in the United States, currently covering the years 1992 and 1994-2000 (see Jarmin and Miranda, 2003, and Miranda et al., 2005, for a detailed description of the ILBD). One clear advantage of the ILBD over earlier data sets in the lineage is that it allows analysts to track a business’s characteristics as it transitions from nonemployer to employer status (or vice versa), a key but difficult-to-study aspect of business evolution. ILBD data have shown, for example, that over three-year horizons, about 5 percent of nonemployer businesses become employer businesses or are acquired by, or absorbed into, employer businesses. This translates to approximately 750,000 businesses, a large number in absolute terms, and is an important component of job creation.
Employer businesses and some nonemployers are linked from period to period by EIN; most nonemployers are linked using business owner ID (Social Security number) fields. This technique is not seamless. For example, over time, ID numbers can change for legal or other reasons.14 In addition, problems of inconsistent data formats, the volatility of young and small firms, and the sheer number of records (over 15 million nonemployers and over 5 million employer businesses) all pose challenges for the Census Bureau staff carrying out the project.
The ILBD has continually been under development and is not currently available to users outside the Census Bureau. Initial versions of some statistics are scheduled to be made available in the near future. Access to microdata will become available at RDCs, after further documentation of data quality assurance is completed, by perhaps as early as 2007. Access to ILBD data is governed by U.S. Code Title 13 (i.e., for statistical purposes only and with “predominant purpose” consistent with Census).15
13 |
Richard Boden and Alfred Nucci linked nonemployer entities to the business register both cross-sectionally and longitudinally for the years 1992 through 1999. Their paper enumerates the myriad of issues that arise when attempting to track nonemployer businesses over time, including those involving sole proprietorships (not the least of which is a change in legal form of organization). |
14 |
The technical challenges inherent in ILBD linking procedures are documented in Davis et al. (2006). |
15 |
Documentation of the Census Bureau’s RDC guidelines defines “predominant purpose” and describes Title 13 requirements generally (http://www.ces.census.gov/index.php/ces/1.00/researchguidelines). |
A.2.2
BLS’s Business Employment Dynamics (BED) Program
The BLS’s BED program produces a quarterly series of gross job gains and gross job losses statistics based on the universe of establishments covered in the QCEW (those subject to state unemployment insurance laws). Sectoral designations now conform to the NAICS classification system. Again, the major exclusions are the self-employed, along with certain nonprofit organizations. Data from the program were first published in September 2003 and are now complete for the period 1992 to the first quarter of 2006. Quarterly data will be released every three months, making them more timely than the alternative employment data sources previously available.16
The BED data allow disaggregation of employment changes into the underlying components: the number (and percentage) of gross jobs gained by opening and expanding establishments and the number (and percentage) of gross jobs lost by closing and contracting establishments.17 These data, constructed using a multistep procedure to link QCEW microdata across periods, provide a picture of the dynamics underlying aggregate employment growth statistics.18 Research based on the quarterly time series contributes to knowledge of the processes underlying the business cycle; for example, Clayton, Sadeghi, and Talan (2005) identify seasonally adjusted job changes resulting from establishment openings and closings, as opposed to expansions and contractions. In general, BED data have revealed how firm and establishment growth rates vary by size and how these results differ from those produced by analyses limited to annual data.
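The decomposition described above can be illustrated with a minimal sketch. It assumes establishment records have already been linked across two quarters and that an establishment absent from a quarter has zero employment; the function name and the sample figures are hypothetical, and the official BED methodology (including its linkage and seasonal adjustment steps) is not reproduced here.

```python
def job_flows(prev_emp, curr_emp):
    """Decompose net employment change between two quarters.

    prev_emp, curr_emp: dicts mapping establishment ID to employment.
    An establishment missing from a dict is treated as having zero
    employment in that quarter.
    """
    flows = {"openings": 0, "expansions": 0, "closings": 0, "contractions": 0}
    for est in set(prev_emp) | set(curr_emp):
        before = prev_emp.get(est, 0)
        after = curr_emp.get(est, 0)
        change = after - before
        if change > 0:
            # Gains come from openings (zero to positive) or expansions.
            flows["openings" if before == 0 else "expansions"] += change
        elif change < 0:
            # Losses come from closings (positive to zero) or contractions.
            flows["closings" if after == 0 else "contractions"] += -change
    flows["gross_gains"] = flows["openings"] + flows["expansions"]
    flows["gross_losses"] = flows["closings"] + flows["contractions"]
    flows["net_change"] = flows["gross_gains"] - flows["gross_losses"]
    return flows


# Hypothetical linked employment counts for two consecutive quarters.
prev_q = {"A": 10, "B": 5, "C": 8}
curr_q = {"A": 12, "B": 3, "D": 4}
result = job_flows(prev_q, curr_q)
print(result)
```

In this example the net change of -4 jobs conceals 6 gross jobs gained and 10 gross jobs lost, which is the kind of underlying churn that aggregate employment growth statistics do not show.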
The primary obstacle to further development of the BED is that EINs are imperfect for creating record linkages (see Okolie, 2004); however, both the QCEW and the BED incorporate a complex multi-stage process to link records across quarters.19 As with QCEW microdata, researchers must submit proposals to access BED data; if the proposal is accepted, the data must be used at the BLS research center in Washington.
16 |
These and other details can be found at http://www.bls.gov/bdm/bdmover.htm. |
17 |
Getz et al. (2005) provide a detailed description of the methodologies used to capture business births and deaths in the various Census Bureau and BLS data sources. |
18 |
Pivetz et al. (2001) describe the technique used to longitudinally link the data. |
19 |
Clayton, Sadeghi, and Talan (2005) provide some detail on the linkage procedures for the QCEW. |
A.2.3
The Longitudinal Employer-Household Dynamics (LEHD) Program
The LEHD is a relatively new microdata source being developed by the Census Bureau, which describes the LEHD as a “set of infrastructure files using administrative data provided by state agencies, enhanced with information culled from demographic and economic (business) surveys and censuses…. The LEHD Infrastructure Files provide a detailed and comprehensive picture of workers, employers, and their interaction in the U.S. economy” (http://lehd.dsd.census.gov).
The program is breaking new ground by, as its name suggests, integrating data on households and individuals with data on employers. The idea behind the LEHD originated with a 1999 National Science Foundation initiative to investigate the potential to combine large administrative data sets with data collected through censuses and surveys. The objective was to “reduce respondent burden, increase data quality, and enhance the information available to the federal, state and local agencies which rely on Census Bureau data for decision making.” The principal investigators on the project proposed linking information to permit data sets to be longitudinal in both the household/individual and firm/establishment dimensions (http://lehd.dsd.census.gov/led/about-us/FAQ.html#slehd).
The LEHD relies on BLS’s QCEW in order to crosswalk between unemployment insurance accounts and the federal EINs and for its framework of detail on industry codes, employment levels, and physical location addresses. The LEHD is composed of interrelated infrastructure data sets: (1) the Employer Characteristic File, with information about the employer, including employment, payroll, industry, size, and location (http://lehd.dsd.census.gov/led/library/tech_user_guides.html); (2) the Employment History File, with information about the employment history of the employee, including employer identity, payroll, and employment; (3) the Personal Characteristics File, with time-invariant information about the employee, including gender, race, foreign-born status, and date of birth; (4) the Employer Human Capital File, with statistics about the skill mix of businesses; and (5) the Employer Quarterly Workforce Indicators, with information at the employer level about accessions, separations, job creation, and job destruction. These data sets allow for the integration of Census economic data with employee characteristics files (http://www.icpsr.umich.edu/access/census-unpub.html).20 Core employee data originate from the QCEW. These records are supplemented with additional information on individual and firm characteristics. The resulting database contains about 80 million individual and 5 million business records from participating states. Using these data, it is possible to follow each employer-employee match on a quarterly basis.
The LEHD has created opportunities to conduct research on topics for which empirical analysis of confidential longitudinal linked employer-household microdata are required, such as research on low-wage workers and human capital and productivity. The LEHD is already facilitating the creation of new statistics describing the dynamic nature of local economies. For example, it has spawned the Local Employment Dynamics program, a voluntary partnership between state labor market information agencies and the Census Bureau to develop new information about local labor market conditions. By receiving and processing quarterly data from each of about 29 state partners, quarterly workforce indicators are being produced by industry, age, and sex for local areas. The Quarterly Workforce Indicators program generates timely statistics on job churning, such as rates of accession, separation, job creation, and job destruction by detailed industry and location. Among the interesting results: Accession and separation rates have been found to exceed 20 percent per quarter, while rates of job creation and destruction are typically 7-10 percent. These statistics, which are comparable to the job creation and destruction rates from the BED, translate into over 13 million jobs destroyed each year (http://www.bos.frb.org/economic/ppb/2004/ppb0401.htm).
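The gap between worker flows (accessions and separations) and job flows (creation and destruction) reported above follows from how the measures are defined. The sketch below uses simplified, commonly used definitions for a single employer and is an assumption for illustration only; it does not reproduce the official Quarterly Workforce Indicators definitions, and the function name and worker identifiers are hypothetical.

```python
def workforce_flows(prev_workers, curr_workers):
    """Compare worker flows and job flows for one employer across two quarters.

    prev_workers, curr_workers: iterables of worker identifiers employed
    at the employer in the previous and current quarter.
    """
    prev_set = set(prev_workers)
    curr_set = set(curr_workers)
    net = len(curr_set) - len(prev_set)
    return {
        # Worker flows: new employer-employee matches and ended matches.
        "accessions": len(curr_set - prev_set),
        "separations": len(prev_set - curr_set),
        # Job flows: net change in employment at the employer.
        "job_creation": max(net, 0),
        "job_destruction": max(-net, 0),
    }


# Hypothetical employer: two workers leave and three are hired.
flows = workforce_flows(
    prev_workers=["w1", "w2", "w3", "w4", "w5"],
    curr_workers=["w3", "w4", "w5", "w6", "w7", "w8"],
)
print(flows)
```

Here five worker moves (three accessions, two separations) produce only one net job created, which illustrates why accession and separation rates can be well above job creation and destruction rates.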
The ILBD integration of longitudinal data (survey and administrative) for all employer and nonemployer businesses has created a tool for studying business start-ups and early life-cycle dynamics (Davis et al., 2006). By incorporating geographical information system applications, analysts have been able to describe how workers travel to and from work for transportation planning purposes. One finding from this work, which has focused on workers leaving businesses (these separations and accessions are highly visible but account for only about 1 percent of the total), is that clusters of workers affected by outsourcing often move to temporary help and personnel supply jobs, helping to explain the growth of that industry (see Benedetto et al., 2004). The LEHD illustrates the tip of the iceberg in terms of the information volume and detail that can be made available through data integration and the efficiency of the approach relative to developing new surveys.
Data in the LEHD are of course very sensitive and subject to strict Census confidentiality procedures. As documented in Abowd, Haltiwanger, and Lane (2004), “only authorized researchers working from Census-controlled facilities have worked with the LEHD microdata; however, major efforts to make the data available to external researchers are underway.” Since 2005, external researchers have been able to access the LEHD data infrastructure in the Census-National Science Foundation RDCs, and several research projects are already under way. Public-use summary data from the Quarterly Workforce Indicators are currently available, and synthetic data may facilitate the release of customized LEHD microdata products.
A.2.4
National Establishment Time Series
The National Establishment Time Series (NETS), another nongovernment data source, captures business relocations and employment change (job destruction and creation) for business establishments disaggregated at fine geographic detail. The data were developed from D&B data by Walls and Associates. As discussed earlier, the primary goal of D&B data is to sell information to businesses about businesses for decision-making purposes. Walls and Associates linked the data with the goal of constructing longitudinal files for studying business dynamics (Neumark et al., 2005a, p. 10).
Neumark et al. (2005b) used NETS to analyze employment growth in California; they also provide a detailed description of the data set. The authors were given access to annual data for the universe of business establishments located in California between 1989 and 2002 (there were about 3.5 million) for the purpose of examining the extent to which job creation and losses were attributable to interstate business relocation and to business births, expansions, contractions, and deaths.21 Regarding data quality, the authors conclude that NETS is a “reliable data source although not without limitations” (p. 3). They note, in particular, that employment figures are rounded and that short-term changes are not picked up particularly well (p. 31). They suggest that analysts should use three years of data to minimize the effects of these shortcomings. Davis et al. (1996) provide a discussion of limitations of D&B data more broadly. One advantage of these private data sources is that they are not subject to the access restrictions that handcuff the statistical agencies.
A.3
DATA SOURCES DESIGNED TO IMPROVE COVERAGE OF SMALL AND YOUNG BUSINESSES
A.3.1
Survey of Small Business Finances
In addition to the Census Bureau’s BR, which has included data on nonemployers annually since 1994, there are a number of specialty surveys that focus on small business. A particularly important one is the Survey of Small Business Finances (SSBF), which has been conducted by the Board of Governors of the Federal Reserve, with assistance from the National Opinion Research Center, in 1987, 1993, 1998, and 2003. (The Federal Reserve Board intends to discontinue the SSBF.) The SSBF is a nationally representative sample of small and minority-owned businesses with fewer than 500 employees screened for eligibility using D&B data files. Interviews were ultimately conducted with 3,500 firms from this class, including oversamples of African American-, Asian American-, and Hispanic American-owned firms. The Federal Reserve’s objective with the SSBF was to collect information to better understand overall finances and credit conditions that small firms face, including:
-
factors that affect prices and availability of credit to small businesses;
-
effects that bank consolidation may have on the availability of credit and other financial services;
-
characteristics of small businesses and how these characteristics influence their credit needs;
-
experiences that small businesses have with credit applications;
-
the impact that government regulations may have on small business credit access; and
-
the financial and nonfinancial sources used by small businesses for their financing needs (http://www.federalreserve.gov/ssbf/).
The survey includes information on income and expenses, assets and liabilities, and financing sources, much of which is not available from any other sources. SSBF data are used to evaluate the impact of public policies, bank mergers and consolidations, and the rise in interstate banking on small businesses of different sizes, locations, and ownership characteristics (http://www.norc.uchicago.edu/studies/economic.htm). Summary analyses of the data are published in the “Report to Congress on the Availability of Credit to Small Business” that is produced by the Federal Reserve every five years. In addition, the SSBF has provided the foundation for a wide variety of academic research—for example, analyses of shifts in lending from the banking to the nonbanking sectors, bank mergers and consolidations, and the rise in interstate banking, as each relates to small businesses.
A.3.2
Small Business Administration-Funded Data Sources
The U.S. Small Business Administration (SBA) is the government agency most directly concerned with small business interests.22 Its mission is to “maintain and strengthen the nation’s economy by aiding, counseling, assisting and protecting the interests of small businesses and by helping families and businesses recover from national disasters” (www.sba.gov). Since Congress instructed the SBA in 1979 to begin developing data sets covering small businesses, the agency has worked to push forward data collection for use in studying firm dynamics, particularly job creation attributable to smaller businesses. Generally the SBA is interested in the enterprise unit of analysis (as opposed to small establishments that are owned or controlled by large firms).
The SBA is involved in a number of programs; one of particular note is the Business Information Tracking Series (BITS), another effort to edit and longitudinally link archived data. BITS, which at times has also been known as the Longitudinal Establishment and Enterprise Microdata (LEEM), is partially funded by the SBA and carried out by the Census Bureau using SSEL-based data. Essentially, the SBA creates an enterprise version of County Business Patterns, called the Statistics of U.S. Businesses (SUSB), that consists of annual observations on each business and includes data on number of firms, number of establishments, employment, and annual payroll of firms categorized by location and industry (http://www.sba.gov/advo/research/data.html). SUSB provides annual snapshots on businesses from 1988 to 2002; the BITS program linked these records annually from 1989 to 2001. BITS provides data on private-sector establishments (single physical locations) with positive payroll; it includes, for each year, employment, annual payroll, 4-digit Standard Industrial Classifications, location, start year, legal entity, and total employment. BITS is structured to identify firm births and deaths, expansions and contractions, and mergers and acquisitions and to examine job flows. Firms are tracked using identifiers designed to remain unchanged even if there is a change in legal or ownership status. Among its limitations, BITS does not include the self-employed, is characterized by a long lag in production, and tracks only establishments (not firms).
The SBA publishes aggregate statistics on numbers of business formations and the distribution and growth of large versus small businesses over time from these data; however, given its business list foundations, microdata are available only to researchers who successfully apply to use them at Census RDCs.
A.3.3
Kauffman Firm Survey
The Kauffman Firm Survey is a new initiative designed to produce “a data set of publicly accessible research on new businesses and their development in the United States.” The Kauffman Foundation began working with Mathematica Policy Research, Inc., to develop and administer this survey in 2005. The Kauffman Firm Survey is a longitudinal survey of the principals of 5,000 firms, sampled from D&B, that started operations in 2004. The survey is oriented primarily to generate data on the financial development of new businesses in their first four years of existence. Surveys will be conducted either by telephone or on the Internet, and owners of these businesses will be asked about the characteristics of the business, about the financing of business operations, and about characteristics of the owner(s). Three follow-up interviews are planned with these businesses in 2006-2008 (http://www.mathematica-mpr.com/surveys/kauffmanfirm.asp). Data from this survey are not currently available, but they will ultimately provide publicly available longitudinal data on new firms (http://www.kauffman.org/research.cfm).
A.4
EMPLOYMENT STATISTICS
Two monthly surveys underpin measurement of employment levels and trends over time: the Current Population Survey (CPS), a household survey, and the CES survey, a payroll or establishment survey. Employment estimates from each are published monthly. In addition to these two surveys, the BLS houses the Job Openings and Labor Turnover (JOLTS) program, which produces monthly data on job openings, new hires, and voluntary and involuntary separations.
The CPS, conducted by the Census Bureau and BLS, has been in existence for more than 50 years. It is based on a survey of approximately 60,000 households designed to estimate total employment of persons age 16 and over in the civilian noninstitutional population. The CPS captures employment broadly, including unincorporated self-employed, unpaid family workers, and agricultural workers; it also collects demographic and supplemental information.
The CES is a monthly survey of 160,000 businesses and government agencies covering approximately 400,000 establishments. Its sample is a stratified simple random sample, with strata defined by state, industry, and size, and it produces estimates of the number of nonfarm payroll jobs, hours, and earnings based on payroll records of business establishments. CES counts jobs, meaning that multiple jobholders are overrepresented (from an employment perspective) and the self-employed are excluded, which points to the importance of household data for the production of comprehensive employment statistics. QCEW micro files serve as the sampling frame for the CES; the quarterly LBD is used to identify new business births and deaths.
A.4.1
CPS
The CPS is BLS’s most widely used household survey. It is used in the production of monthly estimates of unemployed persons in the United States, providing information on employment along occupational, industry, and other dimensions. The CPS is a unique source of “business data” in that the households, rather than businesses, serve as the sampling unit. Microdata going back to 1962 are publicly available.
Specifications and uses of the CPS are documented in great detail at the BLS web site (http://www.bls.census.gov/cps/cpsmain.htm) and elsewhere. The survey, conducted by Census for BLS, involves a monthly sample of about 60,000 households and covers the civilian population age 15 and older. The survey asks respondents for basic demographic information (age, gender, race, marital status, education, immigrant status, family structure, and, for those age 16 and over, labor market status) in addition to the employment status questions. Respondents are asked if they work for private firms or government or are self-employed, and they are asked to provide hours worked, occupation, industry, and earnings information.
The CPS is valuable for identifying new businesses, detected when an individual goes from employed to self-employed status from one period to another. Supplements provide detailed information on veterans, computer/Internet use at home and at work, adult education, health care, and pension coverage. The CPS is structured as a panel, in which each housing unit is sampled for four consecutive months, out for eight, and back in for four. This creates some opportunities for studying self-employment and entrepreneurship.
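The rotation pattern just described determines which interviews of the same housing unit can be linked. The following minimal sketch lists the months in which a unit is in sample and the pairs of interviews that fall one month or one year apart. Months are represented as simple integer indices, and the function names are hypothetical, chosen only for illustration.

```python
def months_in_sample(entry_month):
    """Return the month indices in which a housing unit is interviewed.

    entry_month is an integer index for the unit's first interview month.
    Pattern taken from the text: in for 4 months, out for 8, in for 4.
    """
    first_spell = [entry_month + k for k in range(4)]
    second_spell = [entry_month + 12 + k for k in range(4)]
    return first_spell + second_spell


def matchable_pairs(entry_month):
    """Return interview pairs one month apart and twelve months apart."""
    months = months_in_sample(entry_month)
    in_sample = set(months)
    # Month-to-month links exist only within each four-month spell.
    consecutive = [(m, m + 1) for m in months if m + 1 in in_sample]
    # Year-apart links connect the first spell to the second spell.
    year_apart = [(m, m + 12) for m in months if m + 12 in in_sample]
    return consecutive, year_apart


consecutive, year_apart = matchable_pairs(0)
print(months_in_sample(0))
print(consecutive)
print(year_apart)
```

For a unit entering in month 0, the sketch yields interviews in months 0-3 and 12-15, six month-to-month pairs, and four year-apart pairs. Actual match rates are lower than this design maximum because households move and some interviews are missed.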
A.4.2
CES
The QCEW, including the payroll records it receives from the state employment security agencies (SESAs), is a benchmark data source for the CES. The CES provides employment, hours, and earnings estimates based on the SESA payroll records. Most of the CES employment series begin in 1990 for the reconstructed NAICS system, although employment by industry supersector has been available since 1939. The CES is based on a sample of about 400,000 business establishments (160,000 firms) in over 1,150 industries (hours and earnings data are available for about 850 industries). The LBD, stratified by state, industry, and employment size, serves as the sampling frame. Geographic location of establishments is designated using Census-defined MSA guidelines. Since the release of the May 2003 data, the CES has published data organized using the NAICS classification system (http://www.bls.gov/ces/home.htm).
Series for employment, hours, and earnings at detailed industry and geographic levels are collected and published monthly. In addition to the online database, the CES program produces a monthly news release and the monthly periodical Employment & Earnings, which are released on a three- and five-week lag, respectively. The database includes information on total employment, number of women employed, average weekly and monthly hours and earnings, number of production or nonsupervisory workers, and average weekly overtime hours in manufacturing industries. Microdata are not publicly available.
Like the BEL and QCEW sampling frames, the CES does not include nonemployers or detailed characteristics of business owners (http://www.bls.gov/ces/cesprog.htm).23 The hourly employment data are used in the BLS industry productivity measures. However, the CES program collects hours and earnings data only for production workers (primarily from the natural resources and mining and manufacturing sectors) and nonsupervisory workers (only from the service-providing industries). Employment data are collected for all workers, including production and nonsupervisory workers. BLS is currently collecting hours and earnings data for all employees and plans to publish these data in early 2007 (http://www.bls.gov/ces/cesww.htm). Triplett and Bosworth (2004) argue that “with the huge change in workplace organization and management in recent years, the boundary between ‘production’ and ‘non-production’ workers has lost its meaning,” and that the same applies to supervisory and nonsupervisory workers outside manufacturing. They conclude that there is a clear need for BLS to collect information on hours of work for all workers, as well as information on changes in labor quality at the industry level.
A.4.3
Job Openings and Labor Turnover
The BLS JOLTS program provides monthly data, based on a sample of about 16,000 establishments nationwide, on job openings, new hires, and both voluntary and involuntary separations. The definition of a job opening or vacancy requires that a specific position exists, that work could start within 30 days, and that the employer is actively recruiting from outside the establishment with the opening. Data are available in a national series and for four regions. Because the JOLTS data series has only recently been developed (data are available back to December 2000), it is still rather limited for business cycle analyses. However, once a sufficiently long time series has been established, the JOLTS series on hiring activity and job turnover will become useful for a range of labor market analyses not currently possible with other data sources.
JOLTS has already been used for assessing the reliability of the Conference Board’s help-wanted index as a measure of job vacancies; the index has for years been the primary monthly indicator of labor demand (http://www.bls.gov/jlt/). Because it has been available over a long time period and at the level of disaggregated regions, the help-wanted advertising index has been the leading data source for assessing important aspects of labor market operations, “such as the effectiveness of the job-matching process. Depending on the intended application, the help-wanted series may be most useful when adjusted for underlying trends that are unrelated to labor market conditions” (http://www.frbsf.org/publications/economics/letter/2005/el2005-02.html).
A.5
DATA ON THE SELF-EMPLOYED, ENTREPRENEURS, AND BUSINESS GESTATION
The vast majority of information on businesses is collected from employer firms; as a result, less is known about the self-employed. One exception among the data sources described so far is the CPS. In this section, we describe data sources that are useful for research on emerging businesses, the self-employed, and entrepreneurs; in the process, we also discuss some of the topics that can be illuminated with data collection efforts centered on households.
A.5.1
Household Surveys: CPS and the Kauffman Index of Entrepreneurial Activity
Research by Robert Fairlie, who reported to the panel at its September 2005 meeting, articulated the value of household data for research related to small business dynamics. Fairlie has used CPS data to analyze the self-employed working in their own incorporated or unincorporated businesses (BLS published estimates do not include incorporated business owners), capturing both employer and nonemployer businesses. The CPS is designed primarily to collect employment information, but, since it also captures demographic and other supplemental information, a richer picture can be created.
The four-months-in, eight-months-out, four-months-in structure of the CPS makes it possible to match individuals from one survey period to the next, as well as from month to month, to create panel data. Fairlie has been able to achieve annual match rates of about 70 percent. Linking CPS files over time thus yields longitudinal data that allow for the examination of business creations. Entrepreneurship rates for the 1980-2001 period, calculated using this method, are presented in Fairlie (2004). If individuals are tracked over an extended period of time, up to 3 to 4 years, the results of their efforts in operating a business could be tracked when the business files a Schedule C or, if it hires employees, in the state Unemployment Insurance and federal Social Security establishment registries. Entrepreneurs are defined as individuals who report owning a business but who did not own one during the previous survey year. A complication with this method is that, without any information on the economic activity of the business entity itself, it is unclear whether time spent starting a business should be counted as new business activity.
Other data sources have been used to estimate self-employment among population subgroups. For example, BLS’s National Longitudinal Survey of Youth (NLSY), a nationally representative sample of 12,686 young men and women who were between the ages of 14 and 22 in 1979, has been used to generate rates for economically disadvantaged groups. Individuals in the NLSY were interviewed annually through 1994 and are currently interviewed biennially. The NLSY collects information through an event history format and focuses primarily on labor force behavior, although it collects information on a wide range of variables, including child care costs, welfare receipt, aptitude measures, and school achievement. A second cohort of 9,000 youths has been surveyed in the NLSY since 1997 and is interviewed on an annual basis.
Similar work has been done to create the Kauffman Index of Entrepreneurial Activity, which uses matched data from the 1996-2004 CPS to develop a new measure of entrepreneurship. All individuals ages 20-64 who do not own a business as their main job are identified in the first survey month. By matching CPS files, it is then determined whether these individuals own a business as their main job with 15 or more usual hours worked per week in the following survey month. The Kauffman index is thus defined as the percentage of the population of nonbusiness owners who start a business each month. An average of 0.36 percent of the working-age population created a business each month, representing approximately 550,000 new businesses per month.24
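The index definition above can be sketched as a simple calculation over matched month-pair records. The record layout and field names below are hypothetical, and the calculation is unweighted; an actual estimate from CPS microdata would presumably apply survey weights.

```python
def entrepreneurship_rate(matched_records, min_hours=15, min_age=20, max_age=64):
    """Percentage of nonowners in month 1 who own a business in month 2.

    matched_records is an iterable of dicts with hypothetical keys:
      age               -- age in the first survey month
      owns_business_m1  -- owns a business as main job in month 1
      owns_business_m2  -- owns a business as main job in month 2
      usual_hours_m2    -- usual weekly hours in month 2
    """
    # Population at risk: ages 20-64, not a business owner in month 1.
    at_risk = [
        r for r in matched_records
        if min_age <= r["age"] <= max_age and not r["owns_business_m1"]
    ]
    if not at_risk:
        raise ValueError("no eligible nonowners in the matched sample")

    # New owners: own a business in month 2 with 15 or more usual hours.
    starters = [
        r for r in at_risk
        if r["owns_business_m2"] and r["usual_hours_m2"] >= min_hours
    ]
    return 100.0 * len(starters) / len(at_risk)


# Hypothetical matched records for illustration only.
sample = [
    {"age": 30, "owns_business_m1": False, "owns_business_m2": True, "usual_hours_m2": 40},
    {"age": 45, "owns_business_m1": False, "owns_business_m2": False, "usual_hours_m2": 0},
    {"age": 50, "owns_business_m1": False, "owns_business_m2": True, "usual_hours_m2": 10},
    {"age": 25, "owns_business_m1": False, "owns_business_m2": False, "usual_hours_m2": 0},
    {"age": 70, "owns_business_m1": False, "owns_business_m2": True, "usual_hours_m2": 40},
    {"age": 40, "owns_business_m1": True, "owns_business_m2": True, "usual_hours_m2": 40},
]
print(entrepreneurship_rate(sample))
```

In this hypothetical sample, four records are at risk (the 70-year-old and the existing owner are excluded) and one qualifies as a new owner, since the respondent working 10 hours falls below the 15-hour threshold.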
The advantages of household data include comparatively large sample sizes, long time series, and more timely estimates of business ownership and entrepreneurship. Some of the disadvantages involve limited information on business outcomes, employment, and customers. These relative strengths and weaknesses reveal the attraction of integrating household and business data, which creates the potential to pull the best information from both.
A.5.2
Survey of Business Owners (SBO)
The Census Bureau’s SBO generates information on the characteristics of business owners (e.g., gender, race) and their sources of financing. The richness of the data is enhanced because they can be linked with other longitudinal survey and administrative data collected by the Census Bureau. The SBO collects information on the minority status of owners and allows for complex responses in which each owner can report a self-identified race, as well as multiple racial groups. The sampling frame includes “all firms operating during 2002 with receipts of $1,000 or more that filed tax forms as individual proprietorships, partnerships, and any type of corporation, except those classified as: agricultural production; domestically scheduled airlines; railroads; U.S. Postal Service; mutual funds (except real estate investment trusts); religious grant operations; private households and religious organizations; public administration; and government” (www.census.gov/Press-Release/www/2005/sbo2002_presentation.ppt). The IRS compiles for the Census Bureau a list of all companies filing any of the following forms: IRS Form 1040, Schedule C (for an individual proprietorship or a self-employed person); 1065 (for a partnership); any one of the 1120 corporation tax forms; and 941 (Employer’s Quarterly Federal Tax Return) (http://www.census.gov/econ/census02/text/sbo/sbomethodology.htm).
Separate SBO reports for the Census Bureau’s Surveys of Minority and Women Owned Business Enterprises include more detailed data by geographic area, kind of business, and size of business (full details on data characteristics and reliability can be found at http://www.census.gov/csd/sbo/intro2002SBO.htm). Conducted in conjunction with the quinquennial economic census program, these surveys are used in constructing and publishing the Surveys of Minority-Owned and Women-Owned Business Enterprises Company Statistics Series. Number of firms, sales and receipts, paid employees, and annual payroll are included in the published data, which are presented by geographic area, industry, firm size, and legal form of organization. Microdata are protected under Title 13 confidentiality guidelines and are available for statistical purposes to researchers with special sworn status with the Census Bureau (http://www.census.gov/csd/mwb/).
A.5.3
Panel Study on Entrepreneurial Dynamics (PSED)
Organizations outside government have developed keen interest in measuring entrepreneurial activities. The PSED, sponsored by the Kauffman Foundation, is one such manifestation of this interest. The PSED is a nationally representative database for the United States designed to enhance understanding of the business start-up phenomenon. The PSED is a longitudinal sample of over 64,000 U.S. households that were contacted to find individuals who were actively engaged in starting new businesses. Resulting data are intended to promote research into the business gestation process, that is, the period before the business actually produces output. The PSED includes information on the proportion and characteristics of the adult population involved in attempts to start new businesses, the kinds of activities nascent entrepreneurs undertake during the business start-up process, and the proportion and characteristics of the start-up efforts that become infant firms. Prevalence rates for nascent entrepreneurs are reported by gender and ethnicity (white, black, and Hispanic) and by such demographic variables as age, education, household income, and urban context (Gartner et al., 2004; Reynolds, 2006).
The PSED focuses on four fundamental questions:
-
Who is involved in starting businesses in the United States?
-
How do they go about the process of starting companies?
-
Which of these business start-up efforts are likely to result in new firms?
-
Why are some of these business start-up efforts successful in creating high-growth businesses? (http://research.kauffman.org/cwp/appmanager/research/)
Four panels of data are currently available covering the time period 1998 to 2003. Additional work has begun that will follow a cohort of “nascent entrepreneurs” for three years. Data from the first-year interviews are scheduled to be available in 2006. The data are being used to address a number of research questions—for example, to estimate the number of individuals or teams of individuals in the United States attempting to start a business at any given time, the variation in start-up rates among minority groups, and the impact of education on entrepreneurship. Data from the PSED are maintained and made available for download by the University of Michigan’s Institute for Social Research (http://www.psed.isr.umich.edu).
A.5.4
Global Entrepreneurship Monitor (GEM)
A recent Organisation for Economic Co-operation and Development (OECD) report (Vale, 2006) provides an inventory of the different sources of data on start-ups in the OECD member countries. One data source described in the study is the GEM; in this project, a new sample is drawn each year from adult populations to generate a harmonized set of cross-national comparisons. The GEM collects data on various aspects of entrepreneurship through a series of coordinated household surveys in a gradually increasing number of countries worldwide. The GEM has focused on the study of early-stage entrepreneurial activity and is moving into “analyses of the existence and characteristics of established business owners; the degree of innovativeness, competitiveness, and growth expectations of early-stage and established business owners; and the existence and characteristics of social environments conducive to entrepreneurship” (Global Entrepreneurship Monitor, 2004; http://www.gemconsortium.org/).
Using the conceptual framework and methodology developed for the initial PSED, the GEM program was developed in 1998 to provide harmonized cross-national comparisons of the prevalence of adults participating in new firm creation. This was accomplished by commissioning surveys of the adult population in participating countries using a common interview schedule and consolidating and standardizing the responses; the result is the capacity to compare the proportion of the adult population (ages 18-64) in each country that is actively engaged in new firm creation. The major comparisons utilize the Total Entrepreneurial Activity Index, which reflects both those in the start-up process and owner-managers of new firms up to 42 months old. These interviews were supplemented by personal interviews with 30-80 national experts in entrepreneurship in each country, standardized expert questionnaire responses, and an assembly of a considerable amount of harmonized national data from standardized sources (Reynolds et al., 2005).
The GEM also produces a report on women and entrepreneurship based on survey data from more than 107,000 respondents in 35 countries. The report attempts to quantitatively characterize and describe patterns of behavior among women entrepreneurs relative to those of men and to measure the gender gap for entrepreneurial activity internationally. The first annual report was produced on 10 countries in 1999. By 2005, over 44 countries, as well as specialized regions in some countries, had been involved for one or more years. An annual report summarizes the major cross-national findings supplemented by special topic reports and annual country reports. Over 100 of these reports and the major data sets, with a three-year lag, are provided on the project web site (www.gemconsortium.org).
A.5.5
American Time Use Survey (ATUS)
Other federal data collections exist that, although not specifically designed to measure entrepreneurial activity, may eventually be useful. One example is BLS’s recently launched ATUS. The ATUS includes data on time spent working, sleeping, shopping, volunteering, participating in leisure activities, etc. It also includes self-employment identifiers and data on hours worked. In addition to its time diary, the ATUS collects demographic and labor force information, as well as summary questions on child care activities, paid work activities, and absences from home (Frazis and Stewart, 2004).
The ATUS diary can be used to assess the validity of claims that household surveys overstate hours worked (Robinson and Bostrom, 1994). A comparison of data on hours worked from the CPS and the ATUS (2003 data) provides some support for this claim. The average number of weekly hours worked estimated from the ATUS fell in the 37.3 to 37.9 range, depending on which work-related activities are included, relative to about 39 hours reported in the CPS (Frazis and Stewart, 2004). One possible explanation is that hours worked during CPS reference weeks (the week including the 12th of the month) are higher than hours worked during nonreference weeks. In fact, in their comparison work, Frazis and Stewart found that “estimates of actual hours worked from the CPS are very close to (ATUS) time-diary estimates for the CPS reference week.” In other research, Song (2005) and Eldridge and Pabilonia (2005) used ATUS data to investigate the incidence of people working at home and its relation to pay status and length of hours worked. ATUS data can also be used to compare how self-employed workers spend their time relative to wage and salary workers. However, “work” itself is a black box category, and not much can currently be done to measure specific entrepreneurial activities.
A.5.6
The American Community Survey (ACS)
The Census Bureau’s ACS will generate household-based data that will also be useful for studying self-employment trends. Question 35 of the survey asks respondents about their current or most recent job activity: specifically, whether the person was “an employee of a private for profit company; an employee of a private not for profit, tax-exempt, or charitable organization; a local government employee; a state government employee; a federal government employee; self-employed in own not incorporated business, professional practice, or farm; self-employed in own incorporated business, professional practice or farm; working without pay in a family business or farm.” Question 41 asks for “self-employment income from own non-farm business or farm business, including proprietorships and partnerships (report net income after business expenses)” (http://www.census.gov/acs/www/SBasics/SQuest/SQuest1.htm).
The ACS data undergo extensive computer editing to correct reporting deficiencies and improve consistency of the income reports. For example, “if people reported they were self employed on their own farm, not incorporated, but had reported only wage and salary earnings, the latter amount was shifted to self-employment income.”25 The ACS data are limited in that income is often underreported. In addition, the earnings data generated are not directly comparable with those from the SSA records, since SSA data are based on employer reports and income tax returns for the self-employed. Furthermore, SSA excludes some civilian government and nonprofit organization employees, workers covered by the Railroad Retirement Act, and people whose earnings are insufficient for the Social Security program.
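The quoted edit rule can be read as a consistency check applied to each income record. The following is a minimal sketch of that one rule only. The field names (`class_of_worker`, `wage_income`, `self_emp_income`) and the category label are hypothetical and are not the Census Bureau's variable names, and the actual ACS edits are far more extensive.

```python
from dataclasses import dataclass, replace

# Hypothetical label for the "self-employed, not incorporated" category.
SELF_EMPLOYED_UNINCORPORATED = "self_employed_not_incorporated"


@dataclass(frozen=True)
class IncomeRecord:
    class_of_worker: str    # hypothetical field name
    wage_income: float      # wage and salary earnings
    self_emp_income: float  # net self-employment income


def apply_self_employment_edit(rec: IncomeRecord) -> IncomeRecord:
    """Shift wages to self-employment income when a respondent is
    self-employed (not incorporated) but reported only wage earnings."""
    reported_only_wages = rec.wage_income != 0 and rec.self_emp_income == 0
    if rec.class_of_worker == SELF_EMPLOYED_UNINCORPORATED and reported_only_wages:
        return replace(rec, wage_income=0.0, self_emp_income=rec.wage_income)
    # All other records pass through unchanged.
    return rec
```

Records for wage and salary employees, and records that already report self-employment income, are left as reported.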
A.6
DATA COVERAGE OF SPECIAL SECTORS: AGRICULTURE, NONPROFITS, AND E-COMMERCE
While this report does not deal with or make recommendations specifically about issues that are unique to the agricultural sector, we point out here that there are numerous data sources covering the sector. In fact, relative to its quantitative role in the economy, more is spent in the production of statistics on agricultural output (and other variables) than on other areas of the economy, including those that are growing rapidly.
A.6.1
Agricultural Resource Management Survey (ARMS)26
The Economic Research Service (ERS) and the National Agricultural Statistics Service of the U.S. Department of Agriculture (USDA) are joint sponsors of the annual ARMS, the primary source of information on “the financial condition (including debt levels), production practices, resource use, and economic well-being of America’s farm households.” The ARMS provides data on the business side of farms (i.e., field-level farm practices and economics of the business) as well as household characteristics, such as age, education, farm and off-farm work and income, and family living expenses, making ARMS the most broad-based source of national agricultural business data.
25 American Community Survey: 2004 Subject Definitions (www.hawaii.gov/dbedt/info/census/acs/acs_subject_definitions_8_05.pdf).
26 Information on this survey can be found at the USDA web site: http://www.ers.usda.gov/data/arms/GlobalAbout.htm#Use.
The ARMS data provide the basis for various USDA estimates, including the annual cost-of-production estimates the department is congressionally mandated to produce for over 15 commodities and the annual estimates of average and net farm income, which in turn are used by the Bureau of Economic Analysis (BEA) to develop GDP and personal income estimates. In addition, the Food and Agriculture Act of 1977 requires USDA to produce the Annual Report on Family Farms. In preparing this report, ERS draws on the ARMS data for information on a host of relationships, including:
-
farm participation in agricultural programs and the distribution of farm program payments;
-
the structure and organization of farms, including family and nonfamily ownership;
-
the use of new production technologies and other management practices;
-
farm use of credit;
-
farmers’ participation in off-farm employment; and
-
the characteristics of producers purchasing crop insurance.
A.6.2
Nonprofit Organizations27
While there are gaps in data for studying the performance and economic contribution of nonprofit organizations, many of the data that are collected are publicly available because confidentiality constraints on information (including financial data) are very different. The IRS maintains a continuously updated registry of tax-exempt nonprofit organizations, which, in turn, is incorporated into the IRS Business Master File. Unlike other components of the master file, the IRS is allowed to provide public access to information on Form 990 returns filed by nonprofits, a condition for tax-exempt status. Information becomes available to the public roughly six months after the IRS rules that an organization qualifies for tax-exempt status. The IRS registry of exempt organizations can be used to identify births and deaths in the sector. Registry data have been used to measure elapsed time from birth to filing Form 990 (David, Pollak, and Arnsberger, 2005) and, with a longer lag, also transitions to inactive status.
Census Bureau lists include Business Master File and Form 941 information on nonprofit organizations, as does the QCEW at BLS. The QCEW can be used to measure employment growth for new organizations; however, since nonprofit status is not identified, estimates (such as Knaup, 2005) cannot easily be broken out by for-profit and nonprofit status. To address this gap, researchers from the Johns Hopkins Center for Civil Society Studies have used QCEW data via the BLS outside researcher program to develop a way to identify nonprofit organizations in the QCEW and produce employment statistics on the sector. The joint initiative method involves identifying tax-exempt firms in the data sets by “matching employer identification numbers on the QCEW files with those on the exempt organization master file, maintained by the Internal Revenue Service.” The Form 990 information from the IRS is limited in that the data are organization-based, rather than establishment-based (Salamon and Sokolowski, 2005). Compared with the Census Bureau’s economic census, the QCEW has broader coverage of the nonprofit sector and is more timely. The decennial population census and the CPS also cover the nonprofit sector; however, research shows that respondents’ self-identification of the profit or nonprofit status of a workplace is questionable.
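The EIN-matching idea quoted above can be illustrated with a short sketch. It is not the Johns Hopkins or BLS procedure. The record layout (dictionaries with `ein` and `employment` keys) and the helper names are assumptions made for illustration, and real record linkage must also handle missing, shared, or changing EINs.

```python
def normalize_ein(ein):
    """Reduce an EIN to its digits (e.g., '12-3456789' -> '123456789').
    Left-padding to nine digits is an assumption about stripped leading zeros."""
    return "".join(ch for ch in str(ein) if ch.isdigit()).zfill(9)


def flag_nonprofits(qcew_records, exempt_eins):
    """Return copies of QCEW-style records with an added `is_nonprofit` flag,
    set when the record's EIN appears on the exempt-organization list."""
    exempt = {normalize_ein(e) for e in exempt_eins}
    return [
        {**rec, "is_nonprofit": normalize_ein(rec["ein"]) in exempt}
        for rec in qcew_records
    ]


def nonprofit_employment(flagged_records):
    """Sum employment over establishments flagged as nonprofit."""
    return sum(r["employment"] for r in flagged_records if r["is_nonprofit"])
```

Because the match is made on the employer's EIN while the QCEW records are establishments, every establishment reporting under a matched EIN inherits the nonprofit flag.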
There is high demand for small-area statistics on the numbers of nonprofit organizations and employees. In principle, such estimates could be produced, as often as quarterly, for states and metropolitan areas from Form 941 or QCEW data. Employment information from QCEW data is more precise because organizations are disaggregated to the establishment level. Also, employment information is not collected from organizations with less than $100,000 annual revenue that file Form 990-EZ. Organizations with less than $25,000 annual revenues over three years do not need to file at all, and there are few data on slightly larger organizations without paid employees.
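The revenue thresholds just described imply three tiers of Form 990 coverage, sketched below. The sketch uses only the two dollar figures stated in the text and treats revenue as a single annual amount. That is a simplifying assumption, since the text refers to revenues over three years, and the function name and return labels are hypothetical.

```python
# Thresholds as stated in the text (dollars of annual revenue).
NO_FILING_BELOW = 25_000
EZ_BELOW = 100_000


def form_990_coverage(annual_revenue: float) -> str:
    """Classify an organization by the filing tier implied by its revenue."""
    if annual_revenue < NO_FILING_BELOW:
        # No return required, so the organization is absent from Form 990 data.
        return "no filing required"
    if annual_revenue < EZ_BELOW:
        # May file the short form, which collects no employment information.
        return "Form 990-EZ (no employment data)"
    return "Form 990"
```

Only the top tier yields employment information, which is why Form 990 data undercount employment in small organizations.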
These sources say little about the use of volunteer labor, an important input for nonprofit operation. Estimates of the supply of volunteer labor can be produced by various household data collections, particularly the CPS supplement and the newly developed ATUS. Use of this labor, particularly at the establishment level, is still largely missing.
Martin David suggested to the panel that microdata that include the exempt status of the organization should be created and made available to researchers at the Census and BLS data centers with restricted access. He noted further that the NAICS industrial classification is insufficient for nonprofit organizations. The National Taxonomy of Exempt Entities (NTEE), which is used by the IRS to classify annual information returns, is more detailed (for example, it distinguishes between elementary and high schools, which NAICS does not). Carrying over NTEE from the source documents at the organization level to statistical coverage at the establishment level would be useful. His other suggestions were to require that Form 990 be filed more promptly after the end of the fiscal year and to require it to identify revenue from government contracts, which would improve information for measuring the balance of private versus government financing of nonprofits.
The Urban Institute’s National Center for Charitable Statistics has created a digital census of charitable nonprofit organizations (classified as IRS 501(c)(3) entities), built from Form 990 data, that contains most of the fields from these forms. Industry coding is also incorporated, and the national taxonomy of exempt entities used in creating the digital census improves classification of organizations, which creates value added to the IRS registry. Microdata are linked into a panel structure using EINs. New data are released annually and are accessible to researchers.
A.6.3
E-Commerce28
The impact of electronic commerce (e-commerce) on the U.S. economy is growing and is now widely discussed; however, there is a significant data gap in the official, national economic statistics. The Census Bureau and BEA are involved in efforts to fill this data gap by measuring aspects of the “new economy.” Many of the initial efforts were oriented toward simply defining e-commerce and digital and electronic economic activities (Tehan, 2003). A number of basic questions emerged:
-
How is business-to-business and business-to-consumer e-commerce impacting the accuracy of labor surveys?
-
What are the goods and services choices, characteristics, and prices offered?
-
How difficult is it to track international transactions as well as business costs and productivity (Tehan, 2003)?
Gaps in data on e-commerce (e.g., the extent to which computers and the Internet have reduced entry barriers for small businesses, making them better equipped to compete in the economy) have kept researchers from fully exploring questions involving the extent to which new technologies are altering business dynamics.
The taxonomy of e-commerce continues to evolve to reflect that clear distinctions are not always possible—online retailers expand their inventory, and brick-and-mortar businesses develop online components. The Census Bureau collected e-commerce data in four surveys, to measure various aspects of economic activity, including shipments for manufacturing, wholesale and retail trade sales, and service industry revenues. To capture whether firms were conducting sales online and the volume of these sales, the Census Bureau added two questions to its monthly retail trade survey in fall 1999. In addition, the Census Bureau developed the U.S. Department of Commerce E-Stats web site devoted to “measuring the electronic economy,” which covers NAICS industries, or 70 percent of economic activity from the 1997 economic census (Tehan, 2003, pp. 3-5). BEA has considered whether e-commerce can be accurately measured and has proposed “a comprehensive measure of e-business and high-tech that would measure the new economy in a comprehensive and consistent fashion” (Tehan, 2003, p. 6).
At the Understanding the Digital Economy conference in May 1999, John Haltiwanger noted that existing databases were unlikely to have the capacity in the near term to measure what is happening in the digital economy, particularly by firm size. The LEEM file, discussed above, can be used to determine growth rates, the geography of fast-growing firms, birth and death information, and mergers and acquisitions. The data, which existed in the LEEM file, were not used to advance understanding of the digital economy, because “the problems were that the digital economy would have to have been defined by the 1987 SIC code designations which are outdated and would have covered only the time period 1988-1995” (Berney, 1999, p. 3). In addition, Berney speculated that the NAICS codes will also have difficulty keeping up with this continuously evolving segment of the economy.
The Statistics Canada model provides a way to both reduce the bureaucratic lags from the cooperation of multiple data collection agencies and protect the individual agency budgets. “Until the federal statistical agencies can make more current data available, researchers will need to rely on survey information by private firms and trade associations to analyze what is going on in e-commerce and the digital economy” (Berney, 1999, p. 3). Microsoft, Nathan Associates, the International Data Corporation, and the Yankee Group have been involved in developing databases on the digital economy and e-commerce. The 2003 Report to Congress lists web addresses for e-commerce statistics as of June 2003, including government, academic, and private research firms.
A.7
FINANCIAL DATA
When measuring and evaluating entrepreneurial activity, it is important to identify financial sources and to differentiate between venture-backed companies and others. Current data on financing and investments are insufficient in terms of analyzing how financing interacts with the evolution of new firms. In his presentation to the panel, Steven Kaplan (University of Chicago Graduate School of Business) noted the need to integrate financial data with performance data. Given existing databases, it is, for example, very difficult to study the extent to which economic activity and job growth are driven by venture capital.
Existing databases include Venture Economics, Venture One, the Statistics of Income (SOI), Compustat, the Annual Capital Expenditures Survey (ACES), and the Kauffman Financial and Business Database (KFBD). Currently, the databases are not linked to one another. Venture Economics, owned by Thomson, is useful for valuation history. The Dow Jones Venture One database has cleaner, higher quality data, including more valuation data and more data on people, than Venture Economics. However, the data from Venture One are difficult to access, whereas Venture Economics data are accessible for a fee.
SOI corporate data are the only publicly available source of corporate financial information; data products for S-corporations—those with 75 or fewer shareholders—are also available. The data, based on a stratified random sample of 130,000 preaudited returns, contain income statements, balance sheets and tax information, industry classification, identification of accounting periods and sizes of assets, receipts, and income taxes after credits. The SOI provides data annually to BEA on partnerships, as well as producing annual information on nonfarm sole proprietorships from Schedule C data.
Standard and Poor’s produces Compustat, a database for all publicly traded firms in the U.S. stock market. It is geared mainly toward investors and attempts to standardize financial statements and accounting statement information on companies around the world. Compustat has been longitudinal since 1980 and includes such information as quarterly and annual income statements, balance sheets, and cash flow statements. Reporting units are identified by reporting company and a 4-digit SIC code and are business or industry segments, defined by the Financial Accounting Standards Board (FASB) Statement of Financial Accounting Standards No. 14 as: a component of an enterprise engaged in providing a product or service or a group of related products or services primarily to unaffiliated customers (i.e., customers outside of the enterprise) for profit (FASB Statement 14, p. 7).
In its words, the Census Bureau began the ACES as part of a comprehensive program designed to provide detailed and timely information on capital investment in new and used structures and equipment by nonfarm businesses. The survey sample includes approximately 46,000 employer companies and approximately 15,000 nonemployer companies. The survey prior to 1999 published capital expenditures data only for “97 industries comprised of two-digit and selected three-digit industries from the Standard Industrial Classification (SIC) system…. Beginning with the 1999 ACES, for companies with employees, capital expenditures data are published for industries comprised primarily of three-digit and selected four-digit industries from the North American Industry Classification System (NAICS)” (http://www.census.gov/csd/ace/).
The Kauffman Foundation purchases data semiannually from D&B and uses them to develop the KFBD, a longitudinal file of annual data since 1996. More than 1 million records make up the KFBD, housing financial information on more than 300,000 firms and current demographic data for over 900,000 companies, each record including an annual balance sheet, an annual income statement, fourteen standard financial ratios, and various firm-level demographic items. The KFBD contains complete, consecutive financial statements for approximately 50,000 companies for a period of 3 years in length. These data may be sorted by industry (either NAICS or SIC codes), year started, number of employees, annual sales, minority ownership, and detailed location information, among other variables (http://research.kauffman.org/cwp/appmanager/research/researchDesktop).
REFERENCES
Abowd, J.M., J. Haltiwanger, and J. Lane 2004 Integrated longitudinal employee-employer data for the United States. American Economic Review Papers and Proceedings 94(May).
Armington, C. 2004 Development of Business Data: Tracking Firm Counts, Growth, and Turnover by Size of Firms. (Prepared for the SBA Office of Advocacy). Washington, DC: U.S. Small Business Administration.
Bartelsman, E.J., and M. Doms 2000 Understanding productivity: Lessons from longitudinal microdata. Journal of Economic Literature 38(3):569-594.
Berney, R. 1999 Comments. Prepared for the Small Business Session, Understanding the Digital Economy Conference, May 25-26, Department of Commerce, Washington, DC. Available: http://www.technology.gov/digeconomy/powerpoint/berney/index.html.
Boden, R., and A. Nucci 2004 Longitudinal Linking of Nonemployer Data to Nonemployer Data and the Business Register Over Time: Some Preliminary Findings. Paper presented at the Eastern Economic Association Annual Meeting, February, Washington, DC.
Clayton, R.L., A. Sadeghi, and D.M. Talan 2005 New Data on Business Employment Dynamics. Presented at the Federal Committee on Statistical Methodology Research Conference, November 14-16, Washington, DC.
David, M., T. Pollak, and P. Arnsberger 2005 Compliance with Information Reporting: Exempt Organizations. Available: http://www.irs.gov/pub/irs-soi/05david.pdf.
Davis, S.J., and J. Haltiwanger 1990 Gross job creation and destruction: Microeconomic evidence and macroeconomic implications. NBER Macroeconomics Annual 123-168.
1992 Gross job creation, gross job destruction, and employment reallocation. Quarterly Journal of Economics (August):819-864.
Davis, S.J., J. Haltiwanger, R.S. Jarmin, C.J. Krizan, J. Miranda, and A. Nucci 2006 Measuring the Dynamics of Young and Small Businesses: Integrating the Employer and Nonemployer Universes. (Center for Economic Studies Working Paper #06-04). Washington, DC: U.S. Census Bureau.
Davis, S.J., J. Haltiwanger, and S. Schuh 1996 Job Creation and Destruction. Cambridge, MA: MIT Press.
Dunne, T., M. Roberts, and L. Samuelson 1989 Plant turnover and gross employment flows in the U.S. manufacturing sector. Journal of Labor Economics 7(1):48-71.
Eldridge, L.P., and S.W. Pabilonia 2005 Are Those Who Bring Work Home Really Working Longer Hours? Implications for BLS Productivity Measures. Washington, DC: U.S. Bureau of Labor Statistics. Available: http://www.oecd.org/dataoecd/36/55/37490637.pdf.
Fairlie, R.W. 2004 Recent trends in ethnic and racial self-employment. Small Business Economics 23(3):203-218.
Financial Accounting Standards Board 1976 Financial Reporting for Segments of a Business Enterprise. In Statement of Financial Accounting Standards No. 14. Norwalk, CT: Financial Accounting Foundation.
Frazis, H., and J. Stewart 2004 Where Does the Time Go? Concepts and Measurement in the American Time Use Survey. Washington, DC: U.S. Bureau of Labor Statistics. Available: http://www.nber.org/books/CRIW03-BH/frazis-stewart3-24-05.pdf.
Gartner, W.B., K.G. Shaver, N.M. Carter, and P.D. Reynolds, eds. 2004 Handbook of Entrepreneurial Dynamics: The Process of Business Creation. Thousand Oaks, CA: Sage Publications.
Getz, P., J. Kropf, R. Clayton, R.E. Detlefsen, P. Hanczaryk, R. Jarmin, and E.D. Walker 2005 Births and Deaths in Business Surveys. Presentation to the Federal Economic Statistics Advisory Committee, June 10, Washington, DC. Available: http://www.bea.gov/about/pdf/BirthsDeathsBusinessSurveys.pdf.
Global Entrepreneurship Monitor 2004 Global Entrepreneurship Monitor 2004 Global Report. Available: http://www.gemconsortium.org/document.asp?id=364.
Haviland, A., and B. Savych 2005 A Description and Analysis of Evolving Data Resources on Small Business. (RAND Institute for Social Justice Working Paper). Available: http://www.rand.org/pubs/working_papers/2005/RAND_WR293.pdf.
Jarmin, R., and J. Miranda 2002 The Longitudinal Business Database. (Center for Economic Studies, July). Washington, DC: U.S. Census Bureau. Available: http://webserver01.ces.census.gov/index.php/ces/1.00/cespapers/index.php/ces/1.00/cespapers?down_key=101647.
Knaup, A.E. 2005 Survival and longevity in the business employment dynamics data. Monthly Labor Review 128(5):50-56.
McGuckin, R.H., and S. Peck 1992 Manufacturing Establishments Reclassified into New Industries: The Effect of the Survey Design Rules. (Center for Economic Studies Discussion Paper #92-14). Washington, DC: U.S. Bureau of the Census.
Neumark, D., J. Zhang, and B. Wall 2005a Employment Dynamics and Business Relocation: New Evidence from the National Establishment Time Series. (NBER Working Paper #11647). Cambridge, MA: National Bureau of Economic Research.
2005b Are businesses fleeing the state? Interstate business relocation and employment change in California. California Economic Policy 1(4).
Okolie, C. 2004 Why size class methodology matters in analyses of net and gross job flows. Monthly Labor Review (July):3-12. Also available: http://www.bls.gov/opub/mlr/2004/07/art1full.pdf.
Pierce, K. 2005 Sole proprietorship returns, 2003. Statistics of Income Bulletin June 22. Washington, DC: Internal Revenue Service.
Pivetz, T.R., M.A. Searson, and J.R. Spletzer 2001 Measuring job flows and establishment flows with BLS longitudinal establishment microdata. Monthly Labor Review 124(4):13-20.
Reynolds, P.D. 2006 New Firm Creation in the U.S.: A PSED I Overview. Cheltenham, England: Edward Elgar.
Reynolds, P.D., N.S. Bosma, E. Autio, S. Hunt, N. De Bono, I. Servais, P. Lopez-Garcia, and N. Chin 2005 Global entrepreneurship monitor: Data collection design and implementation: 1998-2003. Small Business Economics (24):205-231.
Robinson, J.P., and A. Bostrom 1994 The overestimated workweek? What time diary measures suggest. Monthly Labor Review 117(8):11-23.
Salamon, L., and S.W. Sokolowski 2005 Nonprofit Organizations: New insights from QCEW data. Monthly Labor Review (September):19-26.
Salyers, E. 2004 Progress Report of the U.S. Census Bureau. Presented at the 18th Roundtable on Business Survey Frames. October 17-22, Beijing, China. Available: http://www.stats.gov.cn/english/18roundtable/papers/t20041230_402219768.htm.
Song, Y. 2005 Working at Home: Pay Status and Timing. Presented at the Annual Conference of the International Association for Time-Use Research, November 2-4, Halifax, Canada. Available: http://www1.union.edu/songy/WorkingatHome.pdf.
Spletzer, J.R., J. Faberman, A. Sadeghi, D.M. Talan, and R.L. Clayton 2004 Business employment dynamics: New data on gross job gains and losses. Monthly Labor Review (April):29-42.
Tehan, R. 2003 E-Commerce statistics: Explanation and Sources. (Congressional Research Service report). Damascus, MD: Penny Hill Press.
Triplett, J.E., and B.P. Bosworth 2004 Productivity in the U.S. Services Sector: New Sources of Economic Growth. Washington, DC: Brookings Institution Press.
Vale, S. 2006 The International Comparability of Business Start-up Rates, Final Report. Organisation for Economic Co-operation and Development, November, Paris, France. Available: http://www.olis.oecd.org.
TABLE A-1 BUSINESS DATA SETS1
The following acronyms are used in the table:
ATUS | American Time Use Survey
BEA | Bureau of Economic Analysis
BED | Business Employment Dynamics
BEL | Business Establishment List
BITS | Business Information Tracking Series
BLS | Bureau of Labor Statistics
BR | Business Register
CBP | County Business Patterns
CES | Current Employment Statistics
COS | Company Organization Survey
CPS | Current Population Survey
D&B | Dun and Bradstreet
DMI | Duns Market Identifiers
ECIS | Employment Cost Index Survey
EIN | Employer Identification Number
ES-202 | Covered Employment and Wages (under the SESA UI program)
ETA | Employment and Training Administration
FRB | Federal Reserve Board
FTP | File Transfer Protocol
GDP | Gross Domestic Product
GEM | Global Entrepreneurship Monitor
ILBD | Integrated Longitudinal Business Database
IRS | Internal Revenue Service
KFBD | Kauffman Financial & Business Database
KFS | Kauffman Firm Survey
KIEA | Kauffman Index of Entrepreneurial Activity
LBD | Longitudinal Business Database
LDB | Longitudinal (establishment) Database
LED | Local Employment Dynamics
LEEM | Longitudinal Establishment and Enterprise Microdata
LEHD | Longitudinal Employer-Household Dynamics
LFO | Legal Form of Organization
LRD | Longitudinal Research Database
MSA | Metropolitan Statistical Area
NAICS | North American Industry Classification System
NORC | National Opinion Research Center
NSF | National Science Foundation
OES | Occupational Employment Statistics
PPI | Producer Price Index
PSED | Panel Study of Entrepreneurial Dynamics
QCEW | Quarterly Census of Employment and Wages
QUI | Quarterly Unemployment Insurance
QWI | Quarterly Workforce Indicators
R&D | Research and Development
RDC | Research Data Center
SBO | Survey of Business Owners
SESA | State Employment Security Agency
SIC | Standard Industrial Classification
SMOBE | Survey of Minority-Owned Business Enterprises
SOI | Statistics of Income
S&P | Standard & Poor's
SSA | Social Security Administration
SSBF | Survey of Small Business Finances
SSEL | Standard Statistical Establishment List
SUSB | Statistics of U.S. Businesses
SWOBE | Surveys of Women-Owned Business Enterprises
UCFE | Unemployment Compensation for Federal Employees
UI | Unemployment Insurance
U.S. DEPARTMENT OF LABOR, BUREAU OF LABOR STATISTICS (BLS)
Quarterly Census of Employment and Wages (QCEW) [formerly the Business Establishment List (BEL)] |
|
Purpose/uses |
Serves as the sample frame for most BLS establishment surveys. Used to publish a quarterly count of employment and wages reported by employers (covers about 98% of U.S. jobs) available at the county, MSA, state and national levels, by industry. Used as benchmarking source for CES and OES. Used by BEA in calculating personal income components of GDP. |
Design basics |
Constructed mainly from UI system administrative records maintained by SESAs. |
Frequency |
Microdata continuously updated. County wage and employment data are published quarterly, 6 to 7 months after end of quarter. |
Unit level |
Establishment (EIN provides ability to aggregate to firm and industry levels and from county to national levels). |
Coverage |
Includes all establishments and workers covered by UI and UCFE programs. This amounts to about 8.4 million establishments—a near census—and 98% of all nonfarm wage and salary employment. |
Content |
Master file includes establishment names; full mailing and physical address; federal EIN; monthly employment; and quarterly wages by NAICS industry, county, ownership sector. |
Limitations or lag time |
Excludes nonemployer firms; no ownership characteristics; no government and only partial farm coverage; ownership links for multistate firms not comprehensive. |
Accessibility of data |
Microdata available only for statistical purposes and must be accessed onsite at the BLS research center in Washington, DC. Publicly available files include data on the number of establishments, monthly employment levels and quarterly wages, aggregated by NAICS industry, county, or ownership sector. |
Business Employment Dynamics (BED) |
|
Purpose/uses |
Program produces a quarterly series of gross job gains and gross job losses statistics based on the universe of establishments covered in the QCEW. Used in CES to track establishment-level expansions, contractions, openings, and closings. |
Design basics |
Data are constructed using a multistep procedure to link QCEW microdata across periods; about 6.7 million private-sector employers are covered. Data become available about 8 months after the end of a given quarter. |
Frequency |
Quarterly, panel (1992 to present). |
Unit level |
Establishment (can be aggregated to firm level with EIN). |
Coverage |
All establishments subject to state UI laws and federal agencies subject to the UCFE program (about 98% of all employment). |
Content |
Monthly employment and wages; full address; and number and percent of gross jobs gained by opening and expanding establishments, gross jobs lost by closing and contracting establishments, establishments that are classified as openings, closings, expansions, and contractions. |
Limitations or lag time |
Excludes government employees, certain nonprofit employees, self-employed, private households, and establishments with zero employment. UI coverage differs by state and may change over time. EINs are imperfect for creating record linkages. |
Accessibility of data |
As with QCEW, public-use microdata are not available. Researchers must submit proposals and data can only be used at the BLS research center in Washington, DC. |
Current Employment Statistics (CES) [also known as Payroll Establishment Survey] |
|
Purpose/uses |
Provides employment, hours, and earnings estimates based on payroll records. Provides first economic indicator of current economic trends each month (with unemployment rate). |
Design basics |
Based on a sample of about 400,000 business establishments (160,000 firms). The LDB, stratified by state, industry, and employment size, serves as the sampling frame. |
Frequency |
Monthly |
Unit level |
Establishment |
Coverage |
Payroll employment for establishments in nonagricultural industries (over 1,150 industries). Hours and earnings data are collected from SESAs for about 850 industries. |
Content |
Total employment, full address, number of women employed, number of production or nonsupervisory workers, average hourly earnings, average weekly hours, average weekly earnings, and average weekly overtime hours in manufacturing industries. |
Limitations or lag time |
As with QCEW, there is no nonemployer, self-employed, or farm coverage, and no detailed owner or small firm characteristics. Geographic coding is available only by MSA. Establishments are not tracked over time and multiple jobholders are overrepresented. |
Accessibility of data |
Electronic access to selected indicator data is available. Microdata are not publicly available. Researchers can apply for access to the confidential microdata. |
American Time Use Survey (ATUS) |
|
Purpose/uses |
Collects information on how people in the United States spend their time, including kinds of activities and time spent doing them. Used in preparation of BLS press releases and to produce categorical time use tables on ATUS web site. |
Design basics |
Sample frame is drawn from households that have completed their final month of interviews for the CPS, utilizing a stratified, 3-stage sample. |
Frequency |
Data have been collected since 2003, and they are published annually. |
Unit level |
Individual (household) |
Coverage |
Civilian noninstitutional population and workers ages 16 and over. For 2004 and 2005, approximately 27,000 cases yielded about 13,500 completed interviews; the survey was roughly 50% larger in 2003. Diaries are used to capture time spent on various activities. |
Content |
Data are collected on major activity categories (work, sleep, eating, etc.) and on selected variables such as earnings, school enrollment, selected demographics, household, labor force characteristics, and hours worked. There is also a self-employment identifier. |
Limitations or lag time |
Little information is collected on secondary activities (those done in combination with other activities). Estimates are subject to nonsampling errors, particularly if nonresponse is correlated with time use. |
Accessibility of data |
Published tables and microdata files available on the ATUS web site. |
Job Openings and Labor Turnover Survey (JOLTS) |
|
Purpose/uses |
Data serve as demand-side indicators of labor shortages at the national level. Availability of unfilled jobs—the job openings rate—is an important measure of job market dynamics that complements measures of unemployment. |
Design basics |
Data from a sample of approximately 16,000 U.S. business establishments are collected on a voluntary basis. The sample frame consists of approximately UI million establishments on the BLS' ES-202 QCEW file. The reference period for total employment is the pay period that includes the 12th of the month; for job openings, it is the last business day of the month; for hires and separations, it is the entire calendar month. |
Frequency |
Data tables are released monthly. |
Unit level |
Establishment |
Coverage |
The survey covers all nonagricultural industries in the public and private sectors for the 50 states and the District of Columbia. |
Content |
Total employment, job openings, hires, quits, layoffs and discharges, and other separations. |
Limitations or lag time |
Data available only on a national level. No turnover rates by occupation. |
Accessibility of data |
Data are disseminated in a news release and through updated tables on the BLS website. |
BLS AND U.S. CENSUS BUREAU
Current Population Survey (CPS) |
|
Purpose/uses |
Provides information on the labor force characteristics of the U.S. population. Data are used to calculate total employment (by occupation) and unemployment statistics. Used as sample frame for ATUS. Used to produce supplements on displaced workers, job tenure and occupational mobility. CPS data have also been used to construct the KIEA (1996 to 2004), a measure of business creation defined as the percentage of nonbusiness owners who started a business each month. |
Design basics |
Uses a household-based sampling frame (from the Census Bureau) and a rotating sample design: respondents are in the survey for 4 months, out for 8 months, and back in for an additional 4 months. |
Frequency |
Monthly, longitudinal panel capability upon matching, 1962 to present (matching is imperfect—annual match rates around 70%). |
Unit level |
Individual, family, and household |
Coverage |
Civilian noninstitutionalized population ages 16 and over. Survey size is approximately 60,000 households. |
Content |
Employment (by occupation and industry), indicators for self-employed, unemployment, business ownership, some characteristics of small business employees, earnings, hours of work, age, sex, race, marital status, and educational attainment. Supplemental questions on school enrollment, income, previous work experience, health, employment benefits, and work schedules. |
Limitations or lag time |
Record matching over time is imperfect. |
Accessibility of data |
Microdata are publicly available. Data can be accessed electronically. |
U.S. CENSUS BUREAU
Census Bureau's Business Register (BR) |
|
Purpose/uses |
Provides a comprehensive database of U.S. business establishments and companies for statistical program use. Serves as the master enumeration list for sampling frames drawn for the Census Bureau's firm and establishment surveys, most notably the quinquennial economic census. Source for annual reports providing summary statistics (e.g., number of establishments, payroll, employment) by county and 6-digit NAICS industry. |
Design basics |
Sample frame draws from the IRS Business Master File, tax return data from Schedule C of Form 1040, SSA data, the economic census, COS, and other Census Bureau business surveys. |
Frequency |
Establishment listings are initiated and updated continuously with information from Census Bureau and other federal statistical and administrative records programs. Individual data items are updated anywhere from every quarter to every 5 years (with the economic census). |
Unit level |
Establishment (organized by EIN, enterprise, and alternate reporting units) |
Coverage |
Employer and nonemployer businesses: 180,000 multiunit companies representing 1.5 million affiliated establishments, UI million single- establishment companies, and approximately 16.5 million nonemployer businesses. |
Content |
Business location (mailing and physical address), organization type, EIN, NAICS, LFO code, tax status, employment and payrolls, IRS reported sales and receipts or revenue, assets, interest income, gross rents, parent EIN, activity status, and filing requirement codes. |
Limitations or lag time |
Geographic and industry codes are updated only every 5 years. Lack of detail on small business owners. Accuracy of single versus multiunit identification (and of small multiunit establishment births and deaths) declines between economic censuses; no data on government or farms; ownership links for multistate firms are not comprehensive. |
Accessibility of data |
Information is confidential under Title 13 and Title 26. No public-use data set. Researchers can apply for access to the confidential microdata. |
Company Organization Survey (COS) |
|
Purpose/uses |
Used to obtain current organization and operating information on multiestablishment firms to maintain and update the BR. Source for CBP Reports. |
Design basics |
Some multiestablishment companies receive annual mail-out/mail-back surveys. Smaller companies are selected when administrative data indicate a probable organizational change using a probability sampling procedure. About 40,000 multiunit companies with more than 250 employees and about 10,000 smaller multiunit companies are selected on a rotating basis. Content and coverage vary during the 5-year economic census program cycle. |
Frequency |
Conducted annually since 1974. |
Unit level |
Multiestablishment firms |
Coverage |
Cross-sectional survey of multiestablishment companies with payroll (and their establishments), excluding agricultural production companies. |
Content |
Companies identify establishments (including mailing and physical address) that have been sold, closed, continued, started, and acquired. Businesses are asked about first quarter and annual payroll, employment during the pay period including March 12 for each establishment, large foreign equity positions, and controlling interests held by other domestic or foreign-owned organizations. |
Limitations or lag time |
Limited scope and rotation of sampled firms over time affect the timeliness and coverage of smaller multiestablishment companies in the BR. |
Accessibility of data |
No public-use version available. Researchers can apply for access to the confidential microdata, which are typically available for a given year with about an 8-month lag. |
The Economic Census |
|
Purpose/uses |
Provides a detailed portrait of the economy once every 5 years at both national and local levels. Used to update the Census BR and to produce industry and geographic area series and supplemental surveys of minority- and women-owned businesses. |
Design basics |
More than UI million companies are mailed a census form. Large- and medium-size firms, plus all firms known to operate more than one establishment are sent forms. For most very small firms, data from existing administrative records of other federal agencies are used. Data can be linked longitudinally. Geographic detail available varies by sector and can range from state to zip code levels. |
Frequency |
Data are collected every 5 years (years ending in 2 and 7). |
Unit level |
Establishment, firm |
Coverage |
All domestic nonfarm business establishments (not operated by government). |
Content |
Statistics tabulated for all industries covered include number of establishments, number of employees, payroll, and measure of output (sales, receipts, revenue, value of shipments, or value of construction work done). Additional items are available for certain sectors. |
Limitations or lag time |
Collected every 5 years, making birth and death coverage incomplete. Nonemployer coverage by sample only (though this can be supplemented using annual nonemployer statistics); no detailed owner characteristics; and no government or farm coverage. |
Accessibility of data |
Though many publications based on the economic census are readily available, no public-use version of the underlying microdata is available. Researchers can apply for access to the confidential microdata. |
Longitudinal Research Database (LRD) |
|
Purpose/uses |
Provides company-level data that have supported research on employment dynamics as well as on the issues related to productivity, profitability, and the uses of research and development. |
Design basics |
Data collected primarily using a mail-out/mail-back process. Periodically, visits to key companies are conducted to record the changing nature of R&D activities and any reporting difficulties companies may have, and to determine collectibility of proposed new items. A probability-proportionate-to- size approach is used for selecting individual companies for participation. |
Frequency |
Underlying survey data collected annually since 1957; however, only data from 1972 forward are included in the database. The R&D database is generally updated annually within UI years after the survey reference year. |
Unit level |
Plant |
Coverage |
Sample size has been approximately 25,000 companies since 1992. In any given year, the number of sampled companies that conduct or sponsor R&D activities is in the 3,500 to 4,000 range. Due to the concentration of R&D activities among the larger companies, most companies with significant R&D activities remain in the sample for a number of years. |
Content |
Mandatory items: federal and company-financed R&D, domestic net sales, domestic employment, and R&D by state. Voluntary items: information about scientists and engineers employed; basic and applied R&D using federal and company funds; contracted-out, foreign, and budgeted research; R&D by major type of expense and technology area; and energy R&D. |
Limitations or lag time |
Data historically limited to manufacturing sectors; coverage of firms with fewer than 250 employees is limited; plant-level data are not linked to enterprises. |
Accessibility of data |
Research is conducted by permanent and specially sworn Census Bureau employees. All current research is done at the Center for Economic Studies. |
Longitudinal Business Database (LBD) |
|
Purpose/uses |
Used for researching establishment and firm dynamics (entry and exit) and job flows. Extends the LRD beyond manufacturing sectors. |
Design basics |
Contains annual observations on employment and payroll for all businesses in the U.S. private sector. The database sources are periodic business surveys conducted by the Census Bureau and federal government administrative records. It uses SSEL data and EIN-based year-to-year linking. |
Frequency |
Annual |
Unit level |
Establishment and firm |
Coverage |
Longitudinal data set of all employer business establishments from 1975-2003. |
Content |
Establishment identifiers, age and tenure, payroll, employment, firm affiliation, name, and location (mailing and physical address) information. Work is ongoing to add payroll employment, location, industry activity, and firm affiliation. |
Limitations or lag time |
Linkages can be difficult due to inconsistent data formats, changing business ID numbers, and the sheer number of records. |
Accessibility of data |
None outside the Census Bureau. Plans are in place to make the microdata available at RDCs after further documentation and quality assurance. |
Integrated Longitudinal Business Database (ILBD) |
|
Purpose/uses |
Extends LBD coverage to include nonemployer businesses, providing research data on firm and job dynamics. The database allows a business's characteristics to be tracked as it transitions from nonemployer to employer status (or vice versa). The ILBD is currently used by researchers working on a wide variety of projects at the Census Bureau's Center for Economic Studies and RDCs. |
Design basics |
Integrates federal administrative records and survey data in a longitudinal structure. Records are linked by EIN or social security number. |
Frequency |
Data compiled annually, 1992, 1994 to 2000. |
Unit level |
Firm |
Coverage |
All private, nonagricultural, employer, and nonemployer businesses in the United States, currently covering the years 1992 and 1994 to 2000. This amounts to approximately 21 million businesses (over 15 million nonemployers and over UI million employers). |
Content |
A range of business characteristics, including location (mailing and physical address), tracked as businesses transition from nonemployer to employer status and vice versa. |
Limitations or lag time |
Linkages can be difficult due to inconsistent data formats, changing business ID numbers, and the sheer number of records. |
Accessibility of data |
Data access governed by Title 13 of the U.S. Code. Microdata will be available at RDCs in the near future. |
Longitudinal Employer-Household Dynamics (LEHD) |
|
Purpose/uses |
A microdata source designed to provide a detailed and comprehensive picture of workers, employers, and their interaction in the U.S. economy. Employment-household linked microdata create opportunities to conduct longitudinal research on business start-ups, early life-cycle dynamics, and local labor market conditions. Used by QWI to measure job churning and in designing the LED program. |
Design basics |
A set of infrastructure files using administrative data provided by state agencies, enhanced with information from demographic and business surveys and censuses. Uses LBD (linked to household data); federal and state administrative data; core Census Bureau censuses and surveys. |
Frequency |
Yearly, panel (1992 to 2001). |
Unit level |
Establishment and household |
Coverage |
Establishments from about 20 states (about UI million) and 80 million individual records. |
Content |
Integrates information about employers (including employment and payroll levels, industry, location, and employment history of employees); employees (including gender, race, foreign-born status, and date of birth); the skill mix of businesses; and employer- level accessions, separations, job creation, and destruction. |
Limitations or lag time |
Data only reveal workers' quarterly earnings, not work hours. For most workers data are not available on education or family characteristics. |
Accessibility of data |
Available to authorized users at Census Bureau controlled facilities. No public-use version available. Researchers can apply for access to the confidential microdata. |
Survey of Business Owners (SBO) |
|
Purpose/uses |
Provides statistics describing the composition of U.S. businesses by gender, race, and ethnicity of owner and sources of financing. Economic policy makers in federal, state, and local governments use SBO data as a source of information on business success and failure rates. |
Design basics |
Sample frame is based on IRS administrative data. The sample size is typically around 2.3 million businesses. |
Frequency |
Tied to the economic census (every 5 years). |
Unit level |
Establishment and owner |
Coverage |
Firms operating during reference year with receipts of $1,000 or more that filed tax forms as individual proprietorships, partnerships, or any type of corporation. Excludes those classified as agricultural production, domestically scheduled airlines, railroads, U.S. Postal Service, mutual funds (except real estate investment trusts), religious grant operations, private households and religious organizations, public administration, and government. |
Content |
Legal form of organization, receipts, business owner's race (self-identified and allowing for identification of more than one racial group), gender, ethnicity, age, education level, veteran status, and primary function in the business. Also includes family- and home-based businesses, types of customers and workers, and sources of financing for expansion, capital improvements, or start-up. |
Limitations or lag time |
Infrequent; not longitudinal |
Accessibility of data |
No public-use version available. Researchers can apply for access to the confidential microdata. |
DUN & BRADSTREET (D&B)
Duns Market Identifiers (DMI) |
|
Purpose/uses |
Provides basic company data for U.S. business establishments. Used as a sampling frame for a wide variety of government (e.g., the SSBF) and private-sector applications. |
Design basics |
Data are collected from a wide range of public and private sources, including in-person and telephone interviews, government publications, business trade programs, mailings, and applications for credit. |
Frequency |
Data are updated continuously, albeit in an ad hoc fashion. |
Unit level |
Establishment and firm |
Coverage |
U.S. business establishment locations of all sizes and types, including public and private companies, government agencies and contractors, and schools and universities. Includes over 17 million establishments; over 2.9 million private and public companies. Limited to companies with UI or more employees or sales of $1 million. |
Content |
Information on owners, sales, employment and legal status, full address, names of executives and titles, corporate linkages, Duns numbers, organization status, marketing information, primary SIC code, and sometimes a NAICS code. |
Limitations or lag time |
Relies on disparate sources for detecting appearance of new businesses—there are no standard guidelines. There is no distinction between firm and establishments. Information on ownership and small firm characteristics is limited. |
Accessibility of data |
Microdata available for a fee. |
FEDERAL RESERVE BOARD (FRB)
Survey of Small Business Finances (SSBF) [conducted by NORC] |
|
Purpose/uses |
The most comprehensive source of information available on the characteristics of small businesses and their owners, focusing on financial data. Data have been used to prepare the Report to Congress on the Availability of Credit to Small Business every 5 years. Facilitates research on factors affecting prices and availability of credit; characteristics of small businesses and their influence on credit needs; experiences with credit applications; impact of government regulations on credit access; and financial and nonfinancial sources used for financing needs. The FRB intends to discontinue the SSBF. |
Design basics |
About 24,000 firms from D&B are screened for a final sample of 4,240 (for 2003). |
Frequency |
About every 5 years (1987, 1993, 1998, and 2003); cross-sectional. |
Unit level |
Firm |
Coverage |
Nationally representative sample from D&B of firms with fewer than 500 employees; oversamples African American, Asian American, and Hispanic American owned firms. |
Content |
Income and expenses, assets and liabilities, and financing sources. |
Limitations or lag time |
Infrequent—last conducted in 2003 with a low response rate—around 33%. |
Accessibility of data |
Only a small number of authorized staff at NORC and the Federal Reserve System has access to the raw microdata. A public-use version, altered to maintain respondent confidentiality, is available. |
GLOBAL ENTREPRENEURSHIP MONITOR CONSORTIUM
Global Entrepreneurship Monitor (GEM) |
|
Purpose/uses |
Measures differences in the level of entrepreneurial activity between countries and the relationship between entrepreneurship and national economic growth; uncovers factors that lead to higher levels of entrepreneurship and suggests policies that may enhance levels of entrepreneurial activity. Data have been used to produce an Indicator of Total Entrepreneurial Activity and reports on women and entrepreneurship. |
Design basics |
Data are collected through a series of coordinated household surveys in an increasing number of countries using a common interview schedule and consolidating and standardizing responses. Adult population surveys range from 1,000 to nearly 27,000 individuals per country—the average sample size is about 2,000. |
Frequency |
Samples are drawn annually. |
Unit level |
Household |
Coverage |
Cross-national assessment of entrepreneurship in 35 countries covering three types of data: adult population surveys, national expert interviews, and standardized cross-national data. |
Content |
Level of entrepreneurial activity, variance between countries, and change over time; relationship between entrepreneurship and economic growth; how national experts assess entrepreneurial climate in their countries; who becomes an entrepreneur, why and what types of businesses they are creating; and the importance of venture capital and informal finance. |
Limitations or lag time |
Individual-level data are available only after a lag of several years (national summary reports are available with a lag of less than a year). |
Accessibility of data |
GEM consortium members have access to individual-level survey data, interview schedules, data collection procedures, and other material needed for systematic analysis. Public users can view all reports. |
INTERNAL REVENUE SERVICE (IRS)
Statistics of Income (SOI) |
|
Purpose/uses |
Provides the only publicly available financial information on all corporations. Data products for S-corporations (those with 75 or fewer shareholders) are also available. The SOI provides data annually to BEA on partnerships and produces annual information on nonfarm sole proprietorships from Schedule C data. |
Design basics |
The survey is based on a stratified probability sample of 130,000 preaudited income tax returns or other forms filed with the IRS. |
Frequency |
Yearly, cross-section (1990 to 2002). |
Unit level |
Firm |
Coverage |
Corporations, S-corporations, partnerships, and nonfarm sole proprietorships. |
Content |
Net income statements, balance sheets, and tax information by industry, accounting periods, sizes of assets, receipts, and income taxes after credits. |
Limitations or lag time |
Potential reporting errors and inconsistencies, processing errors, and the effects of any early cutoff of sampling. |
Accessibility of data |
Though statistics are publicly available, researchers must apply for access to the confidential microdata. |
KAUFFMAN FOUNDATION
Kauffman Firm Survey (KFS) [with Mathematica Policy Research, Inc.] |
|
Purpose/uses |
Produces data on the financial development of new businesses and tracks them in their first four years of existence. The data set is intended to create a public-use data source that informs policy decisions and academic analysis. |
Design basics |
A longitudinal survey, sampled from D&B, of the principals of 5,000 firms that started operations in 2004. The survey is oriented primarily to generate data on the financial development of new businesses in their first four years of existence. High-technology businesses are oversampled. Surveys are conducted either by telephone or on the Internet. |
Frequency |
Begun in 2005, an annual survey with three follow-up panels over the period 2006 to 2008. |
Unit level |
Owner and firm |
Coverage |
New businesses starting in year prior to reference year in the United States. |
Content |
Business characteristics, strategy and innovation, employment, business organization and benefits, business finances, and work behaviors and demographics of owner(s). |
Limitations or lag time |
D&B sampling frame is limited in ability to quickly incorporate new firms (see D&B). |
Accessibility of data |
Data not now available; ultimately, publicly available longitudinal data on new firms will be available. |
Panel Study of Entrepreneurial Dynamics (PSED) [with the University of Michigan] |
|
Purpose/uses |
A nationally representative database designed to enhance understanding of the business start-up phenomenon. Resulting data are intended to promote research into the business gestation process (i.e., the period before the business actually produces output). |
Design basics |
A longitudinal sample of U.S. households was contacted to find individuals who were actively engaged in starting new businesses. Those identified as nascent entrepreneurs were included in the group and asked to participate in two follow up interviews (each 12 months apart). Four waves of PSED exist for 1998 to 2003; a new cohort has been developed for interview in the UI years beginning in 2006. |
Frequency |
Annual |
Unit level |
Individual entrepreneurs located by household. |
Coverage |
Approximately 670 nascent entrepreneurs, identified through a survey of 64,000 U.S. households. |
Content |
Proportion and characteristics (gender, ethnicity, age, education, household income, and urban context) of the adult population attempting to start new businesses, kinds of activities nascent entrepreneurs undertake during start-up, and proportion and characteristics of start-up efforts that become infant firms. |
Limitations or lag time |
Does not interview respondents who do not qualify as nascent entrepreneurs when initially selected (doing so could eliminate the need for comparison groups). |
Accessibility of data |
Data from the PSED are maintained and made available for download by the University of Michigan's Institute for Social Research. Four panels of data are currently available covering the time period 1998 to 2003. |
Kauffman Financial and Business Database (KFBD) |
|
Purpose/uses |
To collect financial information on U.S. businesses. |
Design basics |
Data primarily used for credit scoring purposes are purchased from D&B. Data include recent, detailed financial information. On average the longitudinal database contains complete, consecutive financial statements for a period of UI years in length. |
Frequency |
Annual—data are purchased from D&B on a semi-annual basis. |
Unit level |
Firm |
Coverage |
The longitudinal file includes data for every year since 1996 and contains more than UI million records with financial information on more than 500,000 unique firms. |
Content |
Financial statements for 3-year periods for about 50,000 companies, including annual balance sheet, annual income statement, 14 standard financial ratios, and various firm-level demographic items. Data may be sorted by industry, year started, number of employees, annual sales, minority ownership, and detailed location information. |
Limitations or lag time |
D&B is limited in its coverage of the newest start-up firms and of self-employed individuals. |
Accessibility of data |
Data are available for legitimate research from the Kauffman Foundation. |
SMALL BUSINESS ADMINISTRATION (SBA)
Statistics of U.S. Businesses (SUSB) (1989 to present) [compiled by the Census Bureau] |
|
Purpose/uses |
An annual series that provides national and subnational data on the distribution of economic activity by size and industry. Provides data on firms, establishments, employment, annual payroll, and estimated receipts (or sales) from which various tables are produced. |
Design basics |
Data items extracted from SSEL. The annual COS provides individual establishment data for multiestablishment companies. Data for single-establishment companies are obtained from the Annual Survey of Manufactures and Current Business Surveys, as well as from administrative records from IRS, the SSA, and BLS. |
Frequency |
Annual. Historical comparability is affected by definitional changes in establishments, activity status, and industrial classifications over the period 1988 to 2002. |
Unit level |
Establishment and firm |
Coverage |
The 1999 Statistics of U.S. Businesses covers all NAICS industries except crop and animal production, rail transportation, U.S. Postal Service, pension, health, welfare, vacation funds, trusts, estates, agency accounts, private households, and public administration. |
Content |
Tabulations can be made to estimate employment, annual payroll, number of firms, and number of establishments by location and industry categories. |
Limitations or lag time |
The series excludes data on self-employed individuals, employees of private households, railroad employees, agricultural production employees, and most government employees. There is a 2-year time lag in reporting. |
Accessibility of data |
Tabulations of data by enterprise size for the country, states, and/or metropolitan statistical area can be accessed for recent years. |
Business Information Tracking Series (BITS) (also known as Longitudinal Establishment and Enterprise Microdata (LEEM)) [constructed by the Census Bureau] |
|
Purpose/uses |
To identify firm births and deaths, expansions and contractions, and mergers and acquisitions and for examining job flows. |
Design basics |
BITS is constructed by longitudinally linking archived SUSB data. The data set currently includes about 13 million establishments. |
Frequency |
Yearly, panel (1989 to present) |
Unit level |
Establishment and firm |
Coverage |
Private-sector establishments (single physical locations) with positive payroll. Same industry coverage as CBP. |
Content |
Establishment- and firm-level data on annual payroll, 4-digit SIC, location, start year, legal entity, total employment, firm affiliation, census geography, starting year, census file number, and constant firm identifiers (meaning there is no change in the ID even if legal or ownership status changes). |
Limitations or lag time |
No self-employed; long lag in production (about UI years); only tracks establishments (not firms), and has no farm coverage. |
Accessibility of data |
Not publicly available. Must become a sworn Census researcher and use data at Census RDCs. |
STANDARD & POOR’S (S&P)
COMPUSTAT |
|
Purpose/uses |
Tracks firm level activity for publicly traded, listed firms since 1950. Standardizes financial and accounting statement information on companies around the world for investors. Data used by hedge funds, money managers, analysts, researchers, corporations, and government (the IRS) and regulatory agencies. |
Design basics |
Database produced by S&P. Reporting units are identified by firm and by 4-digit SIC code and are business or industry segments, defined as a component of an enterprise engaged in providing a product, service, or group of related products or services primarily to customers outside the enterprise for profit. |
Frequency |
Quarterly (longitudinal since 1980). |
Unit level |
Firm and industry segment |
Coverage |
All publicly traded firms in U.S. stock markets (about 65,000 firms). |
Content |
Data include quarterly and annual income statements, balance sheets, and cash flow statements. Source information includes annual and quarterly SEC filings, 8-K, 20-F and Proxy filings, EDGAR filings and media releases and original annual reports. |
Limitations or lag time |
By design, limited to publicly traded firms (generally means mature entities). |
Accessibility of data |
Data available for a fee. |