The Panel to Review Research and Development Statistics at the National Science Foundation (NSF) was asked to look at the definition of research and development (R&D), the needs and potential uses of NSF’s R&D data by a variety of users, the goals of an integrated system of surveys and other data collection activities, and the quality of the data collected in the existing Science Resources Statistics (SRS) surveys. The panel has examined the portfolio of R&D expenditure surveys, identifying gaps, weaknesses, and areas of missing coverage.
The R&D expenditure surveys have owed their growth to a heightened interest in science and technology policy since World War II, as well as the growing federal involvement in R&D policy. Over the years, these data have become the accepted measures of the amounts of R&D spending, and of public and private investment in areas of science and engineering. These data have been called on to serve other purposes as well. They have become a proxy indicator of the direction of technological change. They are consulted to portray the locus of emphasis among the public, private, nonprofit, and college and university sectors. Most importantly, they are used by federal agencies, Congress, and the public to frame the national debate over the investment strategy for R&D.
The NSF research and development expenditure data are often ill-suited for the purposes to which they have been put. They attempt to quantify three traditional pieces of the R&D enterprise—basic research, applied research, and development—when much of the engine of innovation lies at the intersection of these components, or in the details of each. Public policy attention to early-stage technology development, the Advanced Technology Program, and process innovation requires data beyond these basic components of R&D. Similarly, the data are sometimes used to measure the output of R&D when, in reality, in measuring expenditures, they reflect only one of the inputs to innovation and economic growth. It would be desirable to devise, test, and, if possible, implement survey tools that more directly measure the economic output of R&D in terms of short-term and long-term innovation. Finally, the structure of the data collection is tied to models of R&D performance that are increasingly unrepresentative of the whole of the R&D enterprise. The growth of the service sector, the growing recognition of the role of small firms in R&D, the shift in funding from manufacturing R&D to health-related R&D, changes in geographic location, and the globalization of R&D have all challenged the current system for depicting the amount and character of R&D in today’s economy. New forms of conducting R&D in collaborative environments, using joint ventures, alliances, or outsourcing arrangements, including outsourcing R&D to foreign affiliates, are just a few of the emerging ways of conducting research and development that are not well measured by the traditional R&D surveys.
At the same time that the foundation of R&D statistics is coming under increasing pressure, the range of uses and users continues to expand. The National Science Board continues to make sophisticated use of these data in producing the comprehensive volume, Science and Engineering Indicators, every 2 years, which places additional stress on the data in terms of quality and timeliness. The data are used by the administration, particularly the U.S. Office of Management and Budget (OMB) and the Office of Science and Technology Policy, to paint a complete picture of federal and nonfederal investment in R&D. Congress not only relies on the NSF data but also has directed collection of data necessary for evaluating the need for public investment in R&D. New uses of the data for purposes for which they were not originally intended are springing up. The use of R&D investment data in the national income and product accounts and in estimates of multifactor productivity are two examples of emerging uses that refocus attention on these data sources.
Finally, as the data have come under increasing use, they have come under increasing scrutiny. Some users are deeply troubled by the apparent discrepancy between reports of federal spending on R&D and the amounts that academia and industry report that they have received from the federal government. This large discrepancy casts doubt on the reliability of some of the data sources.
Against this backdrop, the panel undertook an in-depth review of five of the recurring statistical collections by the Science Resources Statistics Division of the National Science Foundation:
The Survey of Industrial Research and Development,
The Survey of Federal Funds for Research and Development,
The Survey of Federal Science and Engineering Support to Universities, Colleges, and Nonprofit Institutions,
The Survey of Research and Development Expenditures at Universities and Colleges, and
The Survey of Scientific and Engineering Research Facilities.
In addition, the panel considered the provisional Survey of Innovation, which has been conducted twice, in somewhat different forms.
Industry R&D Survey
The panel devoted much of its attention to the critically important survey of industrial R&D. This survey is conducted for NSF by the U.S. Census Bureau. It was last redesigned in 1991, to expand the sample into the service sector of the economy and make other changes. The panel recommends that NSF address the problems associated with this survey first, and lists its recommendations below in order of priority.
The improvements to the industry survey are extensive and expensive and call for a reconsideration of the basis for administering this survey. For several reasons, including the need to increase the professionalism of the staff of the Science Resources Statistics Division, the panel urges SRS to take the lead in the work on the industrial survey. While leaving the exact form of this more active role to NSF and the Census Bureau, the panel suggests using the tools of the interagency agreement, the oversight of a high-quality methodological staff, and the input of highly qualified outside experts. This lead role should be undertaken while working collaboratively with the Census Bureau (Recommendation 8.1).
The panel strongly recommends that the National Science Foundation and the Census Bureau resume a program of field observation staff visits to a sampling of reporters to examine record-keeping practices and conduct research on how respondents fill out the forms (Recommendation 3.11). The first step in this process will be to make contact with respondents. The panel supports the initiative to identify individual respondents in companies as a first and necessary step toward developing an educational interaction with respondents so as to improve response rates and the quality of responses (Recommendation 8.3).
Although the survey has been modified and adapted over the past decade, it has largely failed to keep up with the fast-changing environment for the conduct and organization of research in the private business sector, or with advances in data collection and analysis techniques, and it is due for a redesign. Results from field observations should inform that redesign, and NSF should also conduct research into the record-keeping practices of reporting establishments, by industry and size of company, to determine whether they can report by more specific categories that further elaborate applied research and development, such as the categories used by the Department of Defense (DoD) (Recommendation 3.1).
NSF and the Census Bureau should test the ability to collect some disaggregated data by the newer, more detailed North American Industry Classification System (NAICS) codes used in the industry survey today. The record-keeping practice surveys should be used to assess the feasibility and burden of providing this additional detail on industrial reporters. With this information in hand, NSF and its advisory committee should decide whether the collection of reliable R&D line-of-business data is feasible, and, if so, whether for all or a subset of reporters, and at which frequency (Recommendation 3.5).
The panel notes that a special emphasis panel of R&D officials in large companies, which had provided advice and spending projections during the 1980s, was disbanded in 1990 for reasons of funding shortfalls and concern over whether the body was sufficiently representative of industrial R&D. Today, NSF has no standing advisory body to which it can turn for advice on measurement issues in the industry survey. We recommend that NSF again develop a panel of R&D experts, broadly representative of the R&D performing and R&D data-using communities, to serve as a feedback mechanism to provide advice on trends and issues of importance to maintaining the relevance of the R&D data (Recommendation 3.8).
Among the issues facing the managers of the industrial R&D survey is the wastefulness of surveying large numbers of establishments to find a relatively rare activity: R&D was reported by only about 3,500 of the 25,000 firms in the sample. The panel recommends the use of supplemental lists of R&D performers in drawing the sample. There are a number of practical problems to be solved in using one or more supplemental lists. Lists may overlap, and duplicates must be handled in some way. The units on the lists may not all be the same—establishments may be mixed in with companies, for example—and some editing will be needed in advance of sampling. However, the payoff in efficiency could be substantial, and the panel thinks that this approach is worth investigating (Recommendation 3.2).

Low response rates in the survey are a concern of every user, as they may signal a problem with the quality of the estimates. The panel recommends increased reliance on mandatory reporting between economic censuses, and additional research on the topic of voluntary versus mandatory reporting (Recommendation 8.5).
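The practical problems the panel notes with supplemental lists, overlap across lists and mixed unit types, can be made concrete with a small sketch. This is purely illustrative: the list formats, field names, and the use of a normalized company name as the match key are assumptions for exposition, not the Census Bureau's procedure.

```python
# Illustrative sketch of combining supplemental lists of known R&D performers
# with a base sampling frame. List formats, field names, and name-based
# matching are hypothetical assumptions.

def normalize(name: str) -> str:
    """Crude name standardization so duplicates across lists can match."""
    return " ".join(name.upper().replace(",", " ").replace(".", " ").split())

def build_frame(base_frame, supplemental_lists):
    """Merge supplemental lists into the base frame, dropping duplicates and
    keeping only company-level units; establishments mixed into a list would
    need editing into their parent companies before sampling."""
    seen = {normalize(rec["name"]) for rec in base_frame}
    frame = list(base_frame)
    for source in supplemental_lists:
        for rec in source:
            if rec.get("unit_type") != "company":
                continue  # establishment record: set aside for editing
            key = normalize(rec["name"])
            if key not in seen:  # lists overlap: keep only one copy
                seen.add(key)
                frame.append(rec)
    return frame

base = [{"name": "Acme Corp.", "unit_type": "company"}]
extras = [[{"name": "ACME CORP", "unit_type": "company"},
           {"name": "Beta Labs", "unit_type": "company"},
           {"name": "Beta Labs, Plant 2", "unit_type": "establishment"}]]
merged = build_frame(base, extras)
# merged keeps Acme Corp. once and adds Beta Labs; the establishment is held out
```

In practice the matching would rely on employer identification numbers or Business Register identifiers rather than names, but the dedup-and-edit workflow is the same.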
The panel concludes that appropriate assignment of industrial classification to industrial R&D activity requires additional breakdowns of data at the business unit level. We urge NSF and the Census Bureau to evaluate the results of the initial collection of R&D data in the Company Organization Survey to determine the long-term feasibility of collecting these data.
The panel is concerned about the possibility that the editing process, replete with analyst judgment, could introduce unmeasured and undocumented errors into the publicly released data. The panel recommends that the industrial R&D editing system be redesigned so that the current problems of undocumented analyst judgment and other sources of potential error can be better understood and addressed (Recommendation 3.12). As NSF turns to modernizing the industrial R&D survey, the panel urges it to sponsor research into the effect of imprinting prior-period data on the industrial R&D survey in conjunction with testing the introduction of web-based data collection (Recommendation 8.2).
The panel took note of a recent pioneering effort to improve understanding of the impact of foreign investment in R&D in the United States by linking Census Bureau R&D data to the foreign direct investment data of the Bureau of Economic Analysis. The panel commends the three agencies for this initiative and encourages this and other opportunities to extend the usefulness of the R&D data collected by enhancing them through matching with like datasets. We urge that the data files that result from these ongoing matching operations be made available, under the protections to assure the confidentiality of individual responses that are guaranteed by the Census Bureau’s Center for Economic Studies, for the conduct of individual research and analytical studies (Recommendation 3.9).
The panel considered the several attempts to collect data on innovation here and abroad, as well as the need for such data to illuminate the amount and outcomes of innovation activity in the economy. The panel concludes that innovation, linked activities, and outcomes can be measured and the results used to inform public debate or to support public policy development.
Furthermore, the panel recommends that resources be provided to SRS to build an internal capacity to resolve the methodological issues related to collecting innovation-related data. The panel recommends that this collection be integrated with or supplemental to the Survey of Industrial Research and Development. We also encourage SRS to work with experts in universities and public institutions who have expertise in a broad spectrum of related issues. In some cases, it may be judicious to commission case studies. In all instances, SRS is strongly encouraged to support the analysis and publication of the findings (Recommendation 4.1).
An additional recommendation is that SRS, within a reasonable amount of time after receiving the resources, should initiate a regular and comprehensive program of measurement and research related to innovation (Recommendation 4.2).
Surveys of Federal R&D Spending
In reviewing the accounting framework basis for the federal funds survey, the panel considered the growing, important uses of the federal science and technology (FS&T) budget. The panel recommends that NSF continue to collect those additional data items that are readily available in the defense agencies and expand collection of expenditures for those activities in the civilian agencies that would permit users to construct data series on FS&T expenditures in the same manner as the FS&T presentation in the president’s budget documentation (Recommendation 5.1).
The panel reviewed the basis for collection of the data from federal agencies and compared the NSF procedures with the collection methodology employed in the RAND Research and Development in the United States (RaDiUS) database, which uses data from primary contract, grant, and cooperative agreement files as the data sources. Currently, the RaDiUS database is not adequate for obtaining estimates of federal government spending by science field. The panel urges NSF, under the auspices of the E-Government Act of 2002, to begin to work with OMB to develop guidance for standardizing the development and dissemination of R&D project data as part of an upgraded administrative records-based data system (Recommendation 5.2).
Similarly, the panel recommends that NSF devote attention to further researching the issues involved with converting the federal support survey into a system that aggregates microdata records taken from standardized, automated reporting systems in the key federal agencies that provide federal support to academic and nonprofit institutions (Recommendation 5.3).
Academic R&D Surveys
Noting that it has been some three decades since the field-of-science classification system was last updated, and that the current classification structure no longer adequately reflects the state of science and engineering fields, the panel recommends that OMB now initiate a review of the Classification of Fields of Science and Engineering, last published as Directive 16 in 1978. The panel suggests that OMB appoint the Science Resources Statistics Division of NSF to serve as the lead agency for an effort that must be conducted on a government-wide basis, since the field classifications impinge on the programs of many government agencies. The fields of science should be revised after this review, in a process that is mindful of the need to maintain continuity of key data series to the extent possible (Recommendation 6.1).
The panel recommends that NSF engage in a program of outreach to the disciplines to begin to develop a standard concept of interdisciplinary and multidisciplinary research and, on an experimental basis, initiate a program to collect this information from a subset of academic and research institutions (Recommendation 6.2).
We are concerned that the apparently growing collaborative environment for the conduct of R&D is not adequately reflected in the academic spending survey. The panel recommends that NSF consider the addition of periodic collection of information on industry-government-university collaborations as a supplemental inquiry to the survey of college and university R&D spending. A decision on the viability of this collection should be preceded by a program of research and testing of the collection of these data (Recommendation 6.3).
With regard to the academic expenditure survey, the panel observes that the exact procedure used by NSF for imputation is not well documented, but it appears that imputation is used for unit nonresponse—a practice that is highly unusual in surveys. In most surveys, unit nonresponse is handled by weighting, as it was in this survey in 1999. At a minimum, NSF is urged to compare the results of imputation and weighting procedures (Recommendation 6.9).
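The difference between the two treatments of unit nonresponse can be sketched with toy numbers. All figures here are hypothetical, and the simple mean-imputation rule stands in for whatever procedure NSF actually uses, which the panel notes is not well documented.

```python
# Toy comparison of two treatments of unit nonresponse: reweighting the
# respondents versus imputing values for nonrespondents. Data and the
# mean-imputation rule are illustrative assumptions, not NSF's procedure.

reported = [10.0, 14.0, 12.0]  # R&D expenditures from responding units
n_total = 5                    # in-scope units, including 2 nonrespondents

# Weighting: scale the respondent total up by the inverse response rate.
response_rate = len(reported) / n_total
weighted_total = sum(reported) / response_rate

# Imputation: fill each nonrespondent with the respondent mean, then sum.
mean = sum(reported) / len(reported)
imputed_total = sum(reported) + (n_total - len(reported)) * mean

# Under simple mean imputation the two totals coincide; richer imputation
# models (prior-year values, donor records) can diverge from the weighted
# estimate, which is why comparing the two procedures is informative.
```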
We balance our concern over the burdensome nature of the survey of academic scientific and engineering research facilities with evidence that the data have important uses, including uses by those who provide the data. Sensitive to these concerns, the NSF staff has recently introduced several innovations in the questionnaire and in process automation. The panel recommends that the experience in the fielding of the revised questionnaire in 2003 be carefully evaluated by outside cognitive survey design experts, and that the results of those cognitive evaluations serve as the foundation for subsequent improvements to this mandated survey (Recommendation 6.7). This recommendation supplements our recommendation that NSF continue to conduct a response analysis survey to determine the base quality of the new and difficult items on computer technology and cyberinfrastructure, study nonresponse patterns, and make a commitment to a sustained program of research and development on these conceptual matters (Recommendation 6.8).
Nonprofit Sector Survey
In reviewing the attempts by NSF to collect data on the nonprofit sector, the panel noted that there were evident problems that were well documented in the methodology report on this survey. Nonetheless, the panel recommends that another attempt should be made to make a survey-based, independent estimate of the amount of R&D performed in the nonprofit sector (Recommendation 3.10). The panel also recommends that NSF evaluate the possibility of collecting for nonprofit institutions the same science and engineering variables that pertain to academia (Recommendation 5.3).
DISCREPANCY BETWEEN SURVEYS
In evaluating the potential sources of the apparent discrepancy between the federal reports of spending on R&D and the reports of performers of R&D, the panel concludes that much of the discrepancy is caused by the use of improper metrics. The panel recommends that future comparisons of federal funding and performer expenditures be based on outlays versus expenditures, not obligations versus expenditures (Recommendation 7.1). However, the discrepancy can be an early and important sign of problems in one or more of the surveys. The panel’s recommendation is that a reconciliation of the estimates of federal outlays for R&D and performer expenditures be conducted by NSF on an annual basis (Recommendation 7.2).
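The panel's point about improper metrics can be illustrated with hypothetical figures: obligations record funds committed in a year, while outlays record funds actually disbursed, so same-year comparisons against performer-reported expenditures behave differently. The amounts below are invented for exposition only.

```python
# Hypothetical figures ($ billions) illustrating why the choice of federal
# metric matters when comparing against performer-reported expenditures.

obligations = {"2002": 90.0}  # funds committed during the year
outlays     = {"2002": 85.0}  # funds actually disbursed during the year
performer   = {"2002": 78.5}  # expenditures performers report receiving

# Obligations lead outlays in time, so comparing them with same-year
# performer expenditures overstates the apparent gap.
gap_using_obligations = obligations["2002"] - performer["2002"]
gap_using_outlays     = outlays["2002"] - performer["2002"]
```

On these toy numbers the obligation-based gap is 11.5 while the outlay-based gap is 6.5; the residual outlay-based gap is the quantity the recommended annual reconciliation would track.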
SURVEY ADMINISTRATION AND MANAGEMENT
The panel makes several recommendations concerning the administrative and management functions of NSF with regard to the surveys. Noting that the SRS division is considered a full-fledged federal statistical agency but that it is somewhat buried in NSF, the panel nonetheless could find no compelling reason to suggest that SRS be relocated organizationally within NSF. However, we have the sense that an elevation of the visibility of the resource base for SRS would be a positive step and would serve to direct attention to the programs' needs for sustained support and improvement.
There are several tools that NSF has in its toolbox that will help the agency gain more control over aspects of survey operations. As a start, the panel recommends that NSF, in consultation with its contractors, revise the Statistical Guidelines for Surveys and Publications to set standards for treatment of unit nonresponse and to require the computation of response rates for each item, prior to sample weighting (Recommendation 8.4).
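The recommended item-level response rates are straightforward to compute before weighting: for each item, the share of responding units that answered it. A minimal sketch follows; the record layout and field names are illustrative assumptions.

```python
# Sketch of unweighted item response rates: for each survey item, the share
# of responding units that answered it, computed before any sample
# weighting. Field names are hypothetical.

def item_response_rates(records, items):
    """records: one dict per responding unit; a missing key or a None value
    means the item was left blank."""
    rates = {}
    for item in items:
        answered = sum(1 for r in records if r.get(item) is not None)
        rates[item] = answered / len(records)
    return rates

units = [
    {"total_rd": 5.2, "basic_research": 1.1},
    {"total_rd": 3.0, "basic_research": None},
    {"total_rd": 7.5},
]
rates = item_response_rates(units, ["total_rd", "basic_research"])
# total_rd was answered by all three units; basic_research by only one
```

Publishing such rates alongside the estimates lets users judge which items rest on thin reporting before any weighting or imputation is applied.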
The panel would like to note that significant progress has been made by the Science Resources Statistics Division in fostering an environment for the improvement of data quality. We continue to be hopeful that these recent initiatives, buttressed by additional resources and supplemented by further initiatives such as those outlined in this report, will lay the basis for further improvements.