11
Cross-Cutting Data Issues

Five cross-cutting themes emerged from the day's discussions of how to improve understanding of innovation: (1) the need for better data integration, (2) the importance of broader industry coverage, (3) the need for data collection and reporting at lower levels of aggregation, (4) the need for greater geographic detail, and (5) the need to achieve and maintain the cooperation of private firms in the provision and use of the information.

Database Integration

The need to integrate technology, innovation, and human resource databases was a frequent theme of participants' comments at the workshop. Linkages are difficult or impossible because databases use different classification systems, contain different levels of aggregation, result from different collection methods, or lack identifying information. Such barriers are a major obstacle to research that would produce policy-relevant information and analysis. Even within the Commerce Department's portfolio of technology-related information, for example, the R&D data collected by the Census Bureau on Form RD-1 are not integrated with the Bureau of Economic Analysis's (BEA's) data on R&D of firms with foreign direct investment, with the Patent and Trademark Office's patent records, or with the Census R&D laboratory information reported in the Auxiliary Establishment Survey.

Beyond the difficulties of linking technology-related data lie the challenges of relating innovation data to relevant economic data. For example, the difficulty of matching the establishment-level data in the Longitudinal Research Data file with the enterprise-level RD-1 data has obstructed analysis of the relationship between R&D and productivity.

Workshop participants suggested several steps to overcome these barriers, ranging from integration of databases across agencies to the inclusion of a single item of identifying information in a standard survey form. For example, Adam Jaffe suggested that companies be asked on the RD-1 form to list the names of the entities under which they hold patents, allowing researchers to match the R&D data to patent data by industry. The workshop was not designed to assign priorities to these suggestions nor to examine their cost and feasibility. Nevertheless, Steve Landefeld made the encouraging observation that although considerable economic data are acquired by BEA under the condition that they not be linked to other data, a 1990 law allows BEA's foreign direct investment and services data to be linked to Census Bureau establishment data.7 Furthermore, proposed legislation would allow broad data sharing among BEA, the Census Bureau, the Bureau of Labor Statistics, and most of the principal U.S. statistical agencies.

Broad Coverage of Industries, Activities, Firms, and Technologies

High-technology and service-sector industries have not been adequately represented in the national technology and innovation-related databases. Many workshop participants endorsed NSF's current efforts to expand the coverage to service and emerging high-technology industries but agreed that a good deal of fundamental research on the structure and nature of industrial innovation systems in these industries is needed to inform new data collection efforts. Unfortunately, the Standard Industrial Classification system, in spite of its recent revision, still does not represent important new areas of economic activity such as biotechnology.

Appropriate Levels of Aggregation

Many workshop participants endorsed the proposition that collection and reporting of R&D and innovation survey information at the level of the business unit rather than the firm level or establishment level would not only enhance its utility but also improve its quality. Survey instruments should be addressed to the most appropriate person or unit in the firm. In large multiproduct firms, R&D units may not have accurate information on firm-level sales, and specific units often report operational figures that do not match those used in other units or by headquarters management.

Greater Geographic Detail

The NSF collects information on R&D activities at the state level, but the state is not the most useful level of geographic aggregation for analyzing innovation activity. If appropriate business units were required to give the addresses of their R&D operations on the RD-1 survey, the data could be linked to metropolitan statistical areas or counties, enabling the analysis of regional clustering of technology-based enterprises and employment. Nestor Terleckyj of NPA Data Service, Inc., noted that the Patent and Trademark Office has allocated recent patent grants to counties and is continuing to do so. He urged increased efforts to achieve geographic disaggregation.

Private-Sector Cooperation

In discussions of data needs the conclusion is often "more is better"; but "more" may be unaffordable or impractical to obtain from private-sector respondents. The burden is likely to be greater and the incentives for cooperation less for small firms than for large firms, but it is important not to exclude smaller businesses from consideration. It was not the intention of the sponsors of this one-day workshop to examine in depth the practical considerations that would determine the feasibility of the suggestions offered by participants. Nevertheless, the workshop did underscore the limitations on resources and the need for private-sector cooperation through, among other means, identification of essential information, exploration of common interests in acquiring information useful to both public policy makers and corporate managers and investors, and, when appropriate, formation of data collection and dissemination partnerships on a cost-shared basis.