
Suggested Citation:"Appendix F: Science, Technology, and Innovation Databases and Heat Map Analysis." National Research Council. 2014. Capturing Change in Science, Technology, and Innovation: Improving Indicators to Inform Policy. Washington, DC: The National Academies Press. doi: 10.17226/18606.


Appendix F

Science, Technology, and Innovation Databases and Heat Map Analysis

Leland Wilkinson and Esha Sinha¹

The panel assembled and analyzed underlying data on research and development (R&D), science and technology (S&T), human capital, and innovation to determine the following:

What are the primary indicators that are necessary for the National Center for Science and Engineering Statistics (NCSES) to disseminate, and are they produced by traditional or frontier methods? To address this question, cluster analysis, primarily a heat map tool, was used together with knowledge gleaned from the literature on the performance of science, technology, and innovation (STI) indicators. Reference is made to the National Science Board’s Science and Engineering Indicators (SEI) biennial publication when appropriate, but this analysis is not a full review of the SEI publication.
What are the redundant indicators that NCSES does not need to produce going forward? These indicators might be low performers; highly correlated with other, more useful indicators; or measures that are gathered by other organizations. NCSES could target these indicators for efficiency gains while curating the statistics that are in demand but reliably produced elsewhere.
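One way to make the "highly correlated with other, more useful indicators" criterion concrete is to flag pairs of indicator series whose correlation exceeds a cutoff. The sketch below is illustrative only and is not the panel's heat map procedure: the indicator names, the toy values, and the 0.95 threshold are all assumptions chosen for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def redundant_pairs(indicators, threshold=0.95):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold.

    The threshold is a hypothetical cutoff, not a value from the report.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(indicators.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Toy data (hypothetical): one value per country for each indicator.
toy_indicators = {
    "rd_spending": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rd_personnel": [2.1, 4.0, 6.2, 7.9, 10.0],
    "patents": [5.0, 1.0, 4.0, 2.0, 3.0],
}

flagged = redundant_pairs(toy_indicators)
```

A flagged pair is only a candidate for review; whether one of the two indicators can be dropped still depends on how each is used and on who else produces it.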

This appendix describes the main data on R&D, S&T, human capital, and innovation that the panel assembled and analyzed. It is divided into four sections. The first two sections contain descriptions of data sources from NCSES and other international statistical organizations. The third section presents the heat map analysis, citing the literature on methodological underpinnings of this technique. The final section gives observations based on this analysis. Not all of the data sources described were analyzed, because it was not feasible to investigate such a wide variety of data culled from various sources. Only databases of the five main STI data providers were analyzed: NCSES; OECD; Eurostat; the United Nations Educational, Scientific and Cultural Organization (UNESCO) Institute for Statistics (UIS); and Statistics Canada. Indicators published in the SEI 2012 Digest were also analyzed.

ASSEMBLED DATA

National Center for Science and Engineering Statistics

NCSES communicates its S&T data through various publications, ranging from InfoBriefs to Detailed Statistical Tables (DSTs) derived using table generation tools. The three table generation tools—the Integrated Science and Engineering Resource Data System (WebCASPAR), the Scientists and Engineers Statistical Data System (SESTAT), and the Survey of Earned Doctorates (SED) Tabulation Engine (National Center for Science and Engineering Statistics, 2013b)—are each supported by application-specific database systems. The Industrial Research and Development Information System (IRIS) is an additional searchable database of prepopulated tables.

WebCASPAR hosts statistical data for science and engineering (S&E) at U.S. academic institutions (National Science Foundation, 2012e). This database is compiled from several surveys, including:

____________________

¹Esha Sinha, CNSTAT staff, compiled the data used in the heat map analysis. Leland Wilkinson, panel member, initially ran the heat map program, based on an algorithm that he developed. Sinha then ran several versions of the program on different datasets and over several different time periods. She presented the results of the heat map analysis to the panel during its April 2012 panel meeting. She subsequently ran more sensitivity analyses to ensure the stability of the results. Panel member John Rolph reviewed the work, concluding that the statistical analysis was sound and potentially instructive as an indicators prioritization exercise that NCSES might perform in the future.


— SED²/Doctorate Records File;

— Survey of Federal Funds for Research and Development;

— Survey of Federal Science and Engineering Support to Universities, Colleges, and Nonprofit Institutions;

— Survey of Research and Development Expenditures at Universities and Colleges/Higher Education Research and Development Survey;

— Survey of Science and Engineering Research Facilities;

— National Science Foundation (NSF)/National Institutes of Health (NIH) Survey of Graduate Students and Postdoctorates in Science and Engineering; and

— National Center for Education Statistics (NCES) data sources—Integrated Postsecondary Education Data System (IPEDS):

- IPEDS Completions Survey;

- IPEDS Enrollment Surveys;

- IPEDS Institutional Characteristics Survey (tuition data); and

- IPEDS Salaries, Tenure, and Fringe Benefits Survey.

SESTAT (National Science Foundation, 2013d) is a database of more than 100,000 scientists and engineers in the United States with at least a bachelor’s degree. This is a comprehensive data collection on education, employment, work activities, and demographic characteristics, covering 1993 to 2008.³ The SESTAT database includes data from:

— the National Survey of College Graduates (NSCG);

— the National Survey of Recent College Graduates (NSRCG);

— the Survey of Doctorate Recipients (SDR); and

— an integrated data file (SESTAT).

IRIS (National Center for Science and Engineering Statistics, 2013a) is a database containing industrial R&D data published by NSF from 1953 through 2007. It comprises more than 2,500 statistical tables, which are constructed from the Survey of Industrial Research and Development (SIRD). It is, therefore, a databank of statistical tables rather than a database of firm-level microdata. The data are classified by Standard Industrial Classification and North American Industry Classification System codes (as appropriate), and by firm size, character of work (basic, applied, development), and state. Employment and sales data for companies performing R&D are also included in IRIS.

The data outlined above focus on academic and industrial R&D expenditures and funding and on human capital in S&T. NCSES conducts five surveys to capture R&D support and performance figures for various sectors of the economy. The National Patterns of Research and Development Resources series of publications presents a national perspective on the country’s R&D investment. R&D expenditure and performance data are available, as well as employment data on scientists and engineers. The National Patterns data are useful for international comparisons of R&D activities, and they also report total U.S. R&D expenditures by state. The data series span 1953 through 2011 and are a derived product of NCSES’s above-referenced family of five active R&D expenditure and funding surveys:

Business Research and Development and Innovation Survey (BRDIS; for 2007 and earlier years, the industrial R&D data were collected by the SIRD);
Higher Education Research and Development Survey (HERD; for 2009 and earlier years, academic R&D data were collected by the Survey of Research and Development Expenditures at Universities and Colleges);
Survey of Federal Funds for Research and Development;
Survey of Research and Development Expenditures at Federally Funded R&D Centers (FFRDCs); and
Survey of State Government Research and Development.⁴

The SEI biennial volume is another notable contribution from NCSES, published by the National Science Board. It not only contains tables derived from the table generation tools described above but also amalgamates information from NCSES surveys, administrative records such as patent data from government patent offices, bibliometric data on publications in S&E journals, and immigration data from immigration services. For example, tables on the U.S. S&E labor force are generated using data from the American Community Survey, the Current Population Survey (U.S. Census Bureau), SESTAT, and Occupational Employment Statistics (Bureau of Labor Statistics) (National Science Board, 2012a, Table 3-1, p. 3-8).

____________________

²SED data on race, ethnicity, citizenship, and gender for 2006 and beyond are available in the SED Tabulation Engine. All other SED variables are available in WebCASPAR except for baccalaureate institution. For more details on the WebCASPAR database, see https://webcaspar.nsf.gov/Help/dataMapHelpDisplay.jsp?subHeader=DataSourceBySubject&type=DS&abbr=DRF&noHeader=1.

³Data for 2010 were released in 2013.

⁴For details on each of these surveys, see http://nsf.gov/statistics/question.cfm#ResearchandDevelopmentFundingandExpenditures [November 2012]. A sixth survey, the Survey of Research and Development Funding and Performance by Nonprofit Organizations, was conducted in 1973 and for the years 1996 and 1997 combined. The final response rate for the 1996-1997 survey was 41 percent (see http://www.nsf.gov/statistics/nsf02303/sectc.htm). This lower-than-expected response rate limited the analytical possibilities for the data, and NSF did not publish state-level estimates. The nonprofit data cited in National Patterns reports either are taken from the Survey of Federal Funds for Research and Development or are estimates derived from the data collected in the 1996-1997 survey. See National Science Foundation (2013c, p. 2), which states: “Figures for R&D performed by other nonprofit organizations with funding from within the nonprofit sector and business sources are estimated, based on parameters from the Survey of R&D Funding and Performance by Nonprofit Organizations, 1996-97.”

Along with information on U.S. R&D capacity and outputs, the SEI Digest 2012 contains analysis of the data. The SEI indicators can be classified as follows (National Science Board, 2012b) (see Box F-1): (1) global R&D and innovation; (2) U.S. R&D funding and performance; (3) U.S. R&D federal portfolio; (4) science, technology, engineering, and mathematics (STEM) education; (5) U.S. S&E workforce trends and composition; (6) knowledge outputs; (7) geography of S&T; and (8) country characteristics.

BOX F-1
NCSES’s STI Indicators

1 Global research and development (R&D) and innovation

Worldwide R&D expenditure by regions and countries
Average annual growth of R&D expenditure for the U.S., European Union (EU), and Asia-10 economies
Annual R&D expenditure as share of economic output (R&D/gross domestic product [GDP])
R&D testing by affiliation, region/country
U.S. companies reporting innovation activities
Exports and imports of high-tech goods

2 U.S. R&D funding and performance (including multinationals and affiliates)

U.S. R&D expenditure by source of funds (including venture capital)
Types of U.S. R&D performed
Types of U.S. R&D performed by source of funds
U.S. academic R&D expenditure by source of funds

3 U.S. R&D federal portfolio

U.S. federal R&D expenditure by type of R&D
U.S. federal support for science and engineering (S&E) fields
U.S. federal R&D budget by national objectives
U.S. federal spending on R&D by performer
Federal research and experimentation tax credit claims by North American Industry Classification System (NAICS) industry
Federal technology transfer activity indicators
Small Business Innovation Research (SBIR) and Technology Innovation Program

4 Science, technology, engineering, and mathematics (STEM) education (most measures have demographic breakouts)

Average mathematics and science scores of U.S. students (National Assessment of Educational Progress [NAEP] and Programme for International Student Assessment [PISA])
Teacher participation, degrees, and professional development
High school students taking college classes
First university degrees in natural sciences and S&E fields by country/region
S&E degrees, enrollments, and related expenditures—associate’s, bachelor’s, master’s, doctoral
Doctoral degrees in natural sciences and S&E fields by country/region
Distance education classes

5 U.S. S&E workforce trends and composition

Individuals in S&E occupations and as a percentage of the U.S. workforce
S&E work-related training
Unemployment rate for those in U.S. S&E occupations
Change in employment from previous year for those in STEM and non-STEM U.S. jobs
Women and underrepresented minorities in U.S. S&E occupations
Foreign-born percentage of S&E degree holders in the United States by field and level of S&E degree

6 Knowledge outputs

S&E journal articles by region/country
Engineering journal articles as a share of total S&E journals by region/country
Citations in the Asian research literature to U.S., EU, and Asian research articles
Patents and citations of S&E articles in United States Patent and Trademark Office (USPTO) patents
U.S. patents granted to non-U.S. inventors by region/country/economy
Share of U.S. utility patents awarded to non-U.S. owners that cite S&E literature
Value added of knowledge and technology

7 Geography of S&T

Location of estimated worldwide R&D expenditure
Average annual growth rates in number of researchers by country/economy
Value added of high-tech manufacturing by region/country
Exports of high-tech manufactured goods by region/country
Cross-border flow of R&D funds among affiliates
State S&T indicators

8 Country characteristics

Macroeconomic variables
Public attitudes toward and understanding of S&T

SOURCE: National Science Board (2012b).


The primary conclusions of the SEI Digest are drawn from the indicators outlined above and are supported by more detailed STI data collected by NCSES. The list of variables is presented in Table F-1. Because of space limitations, it was not possible to highlight in this table the fact that most of the information in the SEI—such as assessment scores, S&E degrees, individuals in S&E occupations, R&D expenditures and their various components, federal R&D obligations, patents, and venture capital—is available at the state level.

OECD

R&D statistics generated by OECD are based on three databases: Analytical Business Enterprise Research and Development (ANBERD), Research and Development Statistics (RDS), and Main Science and Technology Indicators (MSTI).

The ANBERD database presents annual data on industrial R&D expenditures. These data are broken down by 60 manufacturing and service sectors for OECD countries and selected nonmember economies. The reported data are expressed in national currencies as well as in purchasing power parity (PPP) U.S. dollars, at both current and constant prices. Estimates are drawn from the RDS database and other national sources. ANBERD is part of the Structural Analysis Database (STAN) family of industrial indicators produced by the Science, Technology, and Industry directorate at OECD.
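The two conversions mentioned above (national currency to PPP U.S. dollars, and current to constant prices) can be sketched as follows. This is a minimal illustration under stated assumptions, not OECD's actual procedure: the function names are hypothetical, the PPP rate is assumed to be expressed as national currency units per U.S. dollar, and the deflator is assumed to be a price index equal to `base` in the reference year.

```python
def to_ppp_dollars(national_currency_value, ppp_rate):
    """Convert a national-currency amount to PPP U.S. dollars.

    Assumes ppp_rate is national currency units per U.S. dollar.
    """
    return national_currency_value / ppp_rate


def to_constant_prices(current_value, deflator_index, base=100.0):
    """Restate a current-price amount in reference-year prices.

    Assumes deflator_index is a price index that equals `base`
    in the reference year.
    """
    return current_value * base / deflator_index


# Hypothetical figures, for illustration only.
spend_local = 200.0   # R&D expenditure in national currency, current prices
ppp_rate = 2.0        # national currency units per U.S. dollar (PPP)
deflator = 125.0      # price index, reference year = 100

spend_ppp_current = to_ppp_dollars(spend_local, ppp_rate)
spend_ppp_constant = to_constant_prices(spend_ppp_current, deflator)
```

Applying the deflator before or after the PPP conversion gives the same number in this simplified form; published constant-price PPP series typically fix the PPP rate at the reference year, which this sketch does not model.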

The RDS database is the main source of R&D statistics collected according to the guidelines set forth in OECD’s Frascati Manual (OECD, 2002). It covers R&D expenditures by sector of performance, source of funds, and type of costs, as well as estimates of R&D personnel and researchers by occupation (in both head counts and full-time equivalents [FTEs]). It also includes data on government budget appropriations or outlays on R&D (GBAORD) (OECD, 2013b). Data are provided to OECD by member countries and observer economies through the joint OECD/Eurostat International Survey on the Resources Devoted to R&D. Series are available from 1987 to 2010 for 34 OECD countries and a number of nonmember economies. Information on sources and methods used by countries for collecting and reporting R&D statistics is provided in the Sources and Methods database.

OECD’s MSTI publication provides indicators of S&T activities for OECD member countries and seven nonmember economies (Argentina, China, Romania, Russian Federation, Singapore, South Africa, and Chinese Taipei). Going back to 1981, MSTI includes indicators on financial and human resources in R&D, GBAORD, patents, technology balance of payments, and international trade in R&D-intensive industries (see http://www.oecd.org/sti/msti).

The OECD Patent Database comprises information on patent applications from the European Patent Office (EPO) and the United States Patent and Trademark Office (USPTO), as well as patent applications filed under the Patent Cooperation Treaty (PCT) that designate the EPO and Triadic Patent Families.⁵ The EPO’s Worldwide Patent Statistical (PATSTAT) database is the primary source of these data. The following patent statistics are available on OECD’s statistical portal: patents by country and technology fields (EPO, PCT, USPTO, Triadic Patent Families); patents by regions and selected technology fields (EPO, PCT); and indicators of international cooperation in patents (EPO, PCT, USPTO). OECD has developed four different sets of “raw” patent data for research and analytical purposes, which may be downloaded from its server. OECD also provides tables on biotechnology indicators (see http://www.oecd.org/innovation/innovationinsciencetechnologyandindustry/keybiotechnologyindicators.htm).

At present, no standard OECD database covers innovation statistics based on the Oslo Manual. The reason for this is the difficulty of comparing results based on different survey methodologies, particularly those used by countries that follow the Eurostat Community Innovation Survey (CIS) model questionnaire and those used by non-European Union (EU) countries that implement the same concepts and definitions in different ways. Ad hoc data collection on selected innovation indicators has been carried out in recent years, and the results have been published in the STI Scoreboard and other related publications.

UNESCO Institute for Statistics

UIS collects its STI data from approximately 150 countries and territories. It has also partnered with three organizations to acquire additional data: on 25 Latin American countries, from the Network on Science and Technology Indicators—Ibero-American and Inter-American (RICYT); on 40 OECD member states and associated countries, from OECD; and on 7 European countries, from Eurostat. UIS conducts a biennial R&D survey, which is administered to the office responsible for national S&T policy or statistics of United Nations (UN) member nations. Even though the survey is administered every 2 years, the questionnaire items request annual information for the previous 5 years. Therefore, the data series is available for 1996 to 2010. A major accomplishment of UIS is that it adapted survey instruments and methodologies and developed other key indicators that are suited to the needs of developing countries. The aim was to enable those countries to apply concepts of the Frascati Manual that would in turn produce comparable S&T statistics across nations. The UIS S&T survey not only collects data on R&D expenditures but also elicits information on researchers involved in R&D. The survey uses a standardized occupational classification of researchers: “professionals engaged in the conception or creation of new knowledge, products, processes, methods, and systems and also in the management of the projects concerned” (OECD, 2002, p. 93). The classification includes Ph.D. students who are involved in R&D activities.

____________________

⁵See http://www.oecd.org/sti/inno/oecdpatentdatabases.htm for more details and links to sources of these data.

In 2011, UIS conducted a pilot survey on innovation in the manufacturing sector. Countries surveyed were Brazil, China, Colombia, Egypt, Ghana, Indonesia, Israel, Malaysia, the Philippines, the Russian Federation, South Africa, and Uruguay. The survey included both technological and nontechnological innovation. Survey items were (1) firms involved in innovation, (2) cooperative arrangements, and (3) factors hampering innovation.

Eurostat

Eurostat is the European Commission’s statistical office. Its main function is to provide statistical information on European nations to the European Commission. The main themes of Eurostat’s statistical portfolio are policy indicators; general and regional statistics; economy and finance; population and social conditions; industry, trade, and services; agriculture and fisheries; external trade, transport, environment, and energy; and STI. Within the STI theme, there are five domains:

Research and development—Data are collected from national R&D surveys using definitions from the Frascati Manual (OECD, 2002).
CIS—Data originate from the national CIS on innovation activity in enterprises that are based on the Oslo Manual (OECD-Eurostat, 2005).
High-tech industry and knowledge-intensive services—Various origins and methodologies are used; statistics are compiled at Eurostat.
Patents—Data originate from the patent database PATSTAT, hosted by EPO. PATSTAT gathers data on applications from the EPO and from about 70 national patent offices around the world (mainly USPTO and the Japan Patent Office); statistics are compiled at Eurostat.
Human resources in S&T—Data are derived at Eurostat from the EU Labour Force Survey (LFS) and the Data Collection on Education Systems (UOE) according to the guidelines in the Canberra Manual (OECD, 1995).

Statistics Canada

Statistics Canada is the Canadian federal statistical agency with a mandate under the Statistics Act:

(a) to collect, compile, analyze, abstract and publish statistical information relating to the commercial, industrial, financial, social, economic and general activities and condition of the people;

(b) to collaborate with departments of government in the collection, compilation and publication of statistical information, including statistics derived from the activities of those departments;

(c) to take the census of population of Canada and the census of agriculture of Canada as provided in this Act;

(d) to promote the avoidance of duplication in the information collected by departments of government; and

(e) generally, to promote and develop integrated social and economic statistics pertaining to the whole of Canada and to each of the provinces thereof and to coordinate plans for the integration of those statistics.⁶

The Canadian Socio-economic Information Management System (CANSIM) is a socioeconomic database of Statistics Canada and contains data tables from censuses and 350 active surveys. Data are provided under various subjects, such as the system of national accounts, labor, manufacturing, construction, trade, agriculture, and finance.

There are four areas within S&T:

1. R&D—Statistics on R&D expenditures and funding are collected by six surveys focused on various performing and funding sectors:

a. Research and Development in Canadian Industry

b. Research and Development of Canadian Private Non-Profit Organizations

c. Provincial Research Organizations

d. Provincial Government Activities in the Natural Sciences

e. Provincial Government Activities in the Social Sciences

f. Federal Science Expenditures and Personnel, Activities in the Social Sciences and Natural Sciences

2. Human resources in S&T—Data on personnel engaged in R&D are derived from the Federal Science Expenditures and Personnel, Activities in the Social Sciences and Natural Sciences survey.

3. Biotechnology—Currently inactive, the 2005 Biotechnology Use and Development Survey provided information on innovation by biotechnology companies.

4. Innovation—CANSIM includes tables from the 2003 and 2005 survey cycles of the Survey of Innovation. Jointly with Industry Canada and Foreign Affairs and International Trade Canada, Statistics Canada conducted the first Survey of Innovation and Business

____________________

⁶Available: http://www.statcan.gc.ca/edu/power-pouvoir/aboutapropos/5214850-eng.htm [July 2013].


TABLE F-1 Subtopics of Science, Technology, and Innovation Data Produced by Agencies/Organizations, Showing Level of Detail and Unique Variables


Agency/Organization	Total R&D	Industrial R&D	Academic R&D	Federal R&D	GBAORD	Nonprofit R&D

NCSES (NB: Statistics on R&D expenditure and SEH degrees available by state)	Total R&D by performer and funder, character of work	Industrial R&D by funder, character of work, NAICS classification, company size	Academic R&D by funder, character of work; entities and subrecipients of academic R&D	Federal R&D by funder, character of work	R&D obligations and outlays by character of work and performing sector; reported in Science and Engineering Indicators only	Nonprofit R&D by funder, character of work, S&E field, extramural entity, type of nonprofit organization
NCSES and NSB	Total R&D by performer and funder, character of work, country/economy	Industrial R&D by funder, NAICS classification, company size; R&D performed by multinational companies, foreign affiliates	Academic R&D in S&E and non-S&E fields	Federal R&D by major socioeconomic objectives, country/region	Federal obligations for R&D and R&D plant by agency, performer, character of work	Domestic R&D performed by private nonprofit sector, domestic R&D funded by private nonprofit sector
Statistics Canada	GERD by performer and funder	BERD by funding sector	HERD	GOVERD by socioeconomic objectives, type of science, components		Private nonprofit R&D by funder
OECD	GERD by performer, funder, field of science, socioeconomic objectives	BERD by funding sector, type of cost, size class, field of science, performing industry	HERD, HERD financed by industry	GOVERD, GOVERD financed by industry	GBAORD by socioeconomic objectives	GERD performed by private nonprofit sector
Eurostat (NB: Almost all data available at regional level)	GERD by funding source, sector of performance, type of cost, socioeconomic objectives, field of science	BERD by funding source, type of cost, size class, economic activity	HE intramural expenditure by funding source, type of cost, field of science, socioeconomic objectives	Government intramural expenditure by funding source, sector of performance, type of cost, socioeconomic objectives, field of science	GBAORD by socioeconomic objectives	GERD performed by private nonprofit sector; GERD funded by private nonprofit sector
UNESCO	GERD by performing sector and funding sector, field of science, character of work	GERD performed by business enterprise sector; GERD funded by business enterprise sector	GERD performed by higher education sector; GERD funded by higher education sector			GERD performed by private nonprofit sector; GERD funded by private nonprofit sector


Agency/Organization	Technology BOP and International Trade in R&D-Intensive Industries	Patents and Venture Capital	R&D Personnel, Scientists and Engineers	Science, Engineering, and Health Degrees; Assessment Scores	Innovation	Public Attitudes Toward Science and Technology

NCSES (NB: Statistics on R&D expenditure, SEH degrees available by state)			Scientists and engineers by gender, age, race/ethnicity, level of highest degree, occupation, labor force status, employment sector, primary/secondary work activity, median annual salaries	Graduate students, doctorate holders, postdoctorates, nonfaculty research staff by gender, race/ethnicity, citizenship, academic field, Carnegie classification; SEH doctorates by gender, age, race/ethnicity, occupation, labor force status, employment sector, primary/secondary work activity, postdoctoral appointments, median annual salaries	Product and process innovation by NAICS classification
NCSES and NSB	U.S. trade balance in research, development, and testing services by affiliation; exports of high-technology and manufactured products by technology level, product, region/country/economy; global value added by type of industry; ICT infrastructure index; U.S. high-technology microbusinesses	USPTO patents granted by selected technology area, region/country/economy; patenting activity in clean energy and pollution control technologies; patents granted by BRIC nations by share of resident and nonresident inventors; patent citations to S&E articles by patent technology area, article field; patenting activity of employed U.S.-trained SEH doctorate holders; U.S. venture capital investment by financing stage and industry/technology; venture capital disbursed per $1,000 of GDP; venture capital deals as a percentage of high-technology business establishments; venture capital disbursed per venture capital deal by state	Workers in S&E and STEM occupations by MSA, occupation category, educational background, R&D work activities, age, ethnicity/race; scientists and engineers reporting international engagement by demographic characteristics, education, employment sector, occupation, salary, work-related training; foreign-born workers in S&E occupations by education level	SEH doctorate holders by gender, race/ethnicity, field of doctorate, sector of employment, academic appointment, salary, unemployment rate; S&E doctorate recipients and full-time S&E graduate students by source, primary mechanism of support, Carnegie classification; foreign recipients of U.S. S&E doctorates by field, country/economy of origin; field switching among postsecondary students; time taken to receive an S&E doctorate; community college attendance among recent recipients of S&E degrees by sex, race/ethnicity, degree level, degree year, citizenship status; NAEP assessment scores in mathematics and science; advanced placement exam taking by public school students	Small Business Innovation Research funding per $1 million of GDP by state	Media coverage, news stories by topic area; correct answers to S&T and S&T-related questions by gender and country/region; public perceptions of various occupational groups’ contribution to society and public policy-making process; public assessment/opinion of stem cell research and environmental problems; source of information for S&T issues

Statistics Canada		Intellectual property commercialization by higher education sector; intellectual property management by federal department and agency	Researchers, support staff, technicians by R&D performing sector and type of science; federal personnel engaged in S&T by activity, type of science, S&T component	University degrees, diplomas, and certificates granted by program level and Classification of Instructional Programs, gender, immigration status	Innovation activities; product and process innovation; degree of novelty; hampering factors of and obstacles to innovation; important sources of information; cooperation arrangements; innovation impacts; methods of protection; geomatic activities
OECD	BOP—payments and receipts; trade—imports and exports by R&D-intensive industries	Patent applications, Triadic Patent Families, patents in selected technologies by region, international cooperation	R&D personnel by field of science, sector of performance, qualification; researchers by sector of performance and gender	Graduates by field and level of education; PISA scores in science and mathematics
Eurostat (NB: Almost all data available at subnational level)	Trade in high-tech industries and knowledge-intensive sectors within EU and ROW; employment in these sectors by gender, occupation, educational qualification, mean earnings	Patent applications at USPTO and EPO by priority year and sector, ownership of patents, patent citations; European and international copatenting, Triadic Patent Families	Human resources in S&T, R&D personnel, and researchers by gender, field of science, sector of performance, qualification; citizenship of researchers in government and higher education sector	Doctorate holders by gender, activity status; employed doctorate holders by gender, sector, occupation, field of science, job mobility	Innovation activities; product and process innovation; degree of novelty; hampering factors of and public funding for innovation; important sources of information; cooperation arrangements; environmental innovation; objectives of innovation; impacts of innovation; methods of protection; employees in innovation sector


UNESCO	R&D personnel and researchers (FTE and head count) by gender, performing sector, educational qualification, field of science; technicians and other supporting staff by performing sector	Innovation in manufacturing sector—firms involved in innovation; cooperation arrangements; hampering factors of innovation; available for 12 nations

NOTES: BERD = business enterprise expenditure on research and development; BOP = balance of payments; BRIC = Brazil, Russia, India, and China; EPO = European Patent Office; EU = European Union; FTE = full-time equivalent; GBAORD = government budget appropriations or outlays for research and development; GDP = gross domestic product; GERD = gross domestic expenditure on research and development; GOVERD = government intramural expenditure on research and development; HE = higher education; HERD = higher education expenditure on research and development; ICT = information and communication technology; MSA = metropolitan statistical area; NAEP = National Assessment of Educational Progress; NAICS = North American Industry Classification System; NSB = National Science Board; NCSES = National Center for Science and Engineering Statistics; PISA = Programme for International Student Assessment; R&D = research and development; ROW = rest of the world; S&E = science and engineering; SEH = science, engineering, and health; STEM = science, technology, engineering, and mathematics; UNESCO = United Nations Educational, Scientific and Cultural Organization; USPTO = United States Patent and Trademark Office.

SOURCES: Adapted from BRDIS, see http://www.nsf.gov/statistics/industry/ [November 2012]. Federal Funds, see http://www.nsf.gov/statistics/fedfunds/ [November 2012]. R&D Expenditure at FFRDCs, see http://www.nsf.gov/statistics/ffrdc/ [November 2012]. HERD, see http://www.nsf.gov/statistics/herd/ [November 2012]. Science and Engineering State Profiles, see http://www.nsf.gov/statistics/pubseri.cfm?seri_id=18 [November 2012]. S&E I 2012, see http://www.nsf.gov/statistics/seind12/tables.htm [November 2012]. UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012]. Statistics Canada, CANSIM, see http://www5.statcan.gc.ca/cansim/a33?lang=eng&spMode=master&themeID=193&RT=TABLE [November 2012].


Strategy (SIBS) in 2010. Information was collected from enterprises for the period 2007-2009, and survey estimates were published in 2011. SIBS data are not available in CANSIM.

World Intellectual Property Organization (WIPO)

WIPO is a specialized UN agency focused on intellectual property—patents, trademarks, copyrights, and design. It collects data by sending questionnaires to the intellectual property offices of 185 member states. It produces annual statistics on patents, utility models, trademarks, industrial designs, and plant varieties, thus creating a comprehensive national database of intellectual property. Some of the data series for particular nations go back to 1885.

Innovation Data from Other U.S. Agencies

Apart from NCSES, other U.S. agencies collect innovation statistics, ranging from patents and trademarks to grants and federal awards (see Table F-2):

USPTO—Three datasets are available from USPTO—the Patent Assignments Dataset, Trademark Casefile Dataset, and Trademark Assignments Dataset. As their names suggest, these datasets comprise ownership and changes in ownership for USPTO-granted patents and trademarks.
Economic Research Service (ERS), U.S. Department of Agriculture (USDA)—In 2013, ERS began fielding its first Rural Establishment Innovation Survey, which is aimed at business establishments funded through USDA’s Rural Development Mission Area. The purpose of the survey is threefold: to collect information on the adoption of innovative practices and their contribution to firm productivity; to discover how participation in federal, state, and local programs aids the growth of rural business units; and to determine usage of available local and regional assets, such as workforce education and local business associations, by rural business units.
NIH, NSF, and the White House Office of Science and Technology Policy—Science and Technology for America’s Reinvestment: Measuring the Effect of Research on Innovation, Competitiveness, and Science (STAR METRICS) is a multiagency venture that relies on the voluntary participation of science agencies and research institutions to document the outcomes of science investments for the public. Currently, more than 90 institutions are participating in the program. The STAR METRICS data infrastructure contains recipient-based data that include information on contract, grant, and loan awards made under the American Recovery and Reinvestment Act of 2009.
NSF—The U.S. government’s research spending and results webpage provides information on active NSF and National Aeronautics and Space Administration (NASA) awards, such as awardees, funds obligated, and principal investigator.
National Institute of Standards and Technology (NIST), Department of Commerce (Anderson, 2011)—NIST has been responsible for preparing the Department of Commerce’s report on technology transfer utilization. The Federal Laboratory (Interagency) Technology Transfer Summary Reports cover federal laboratories and FFRDCs. They contain data tables on patent applications, invention licenses, cooperative R&D agreements, and R&D obligations, both extramural and intramural.
Small Business Administration (SBA)—The Small Business Innovation Research (SBIR) program and the Small Business Technology Transfer (STTR) program fall under the administration of the SBA’s Technology Program Office. These programs award more than $2 billion each year to small high-tech businesses.⁷ The SBA-Tech.net website includes a searchable database on federal R&D funds/awards by agency, category, and state.
Department of Energy (DOE)—DOE’s visual patent search tool allows users to collect information on issued U.S. patents and published patent applications that result from DOE funding.

Data collected by federal statistical agencies, either through surveys or from administrative databases, contain rich information on various economic and social issues. Most of this information is used by private corporations and educational institutions (sometimes the agencies themselves) that either present the data in a comprehensive fashion or develop tools for dissemination and analysis. Some of those efforts are outlined below:

Google—On its website, Google hosts a bulk download tool that allows users to download data tables on patents and trademarks issued by USPTO.
NIH, NSF, and the White House Office of Science and Technology Policy—Applications of the STAR METRICS data platform include the Portfolio Explorer Project, a tool for examining public research award information by topic, region, institution, and researcher. STAR METRICS currently uses four tools to view scientific portfolios (Feldman, 2013b):

— The Portfolio Viewer provides information on proposals, awards, researchers, and institutions by program level and scientific topic.

____________________

⁷For details, see http://www.sba.gov/about-sba-services/7050 [July 2013].


TABLE F-2 Innovation Data from U.S. Agencies Other Than NCSES


Agency	Database/Survey/Data Collection Mechanism	Indicator/Data Items	Time Period

United States Patent and Trademark Office (USPTO)	Patent Assignments Dataset	Patent assignments and change of ownership of patents that are granted by USPTO	2010 onward
	Trademark Casefile Dataset	Trademarks granted by USPTO	1884-2010
	Trademark Assignments Dataset	Change of ownership of trademarks granted by USPTO	2010 onward
Economic Research Service, U.S. Department of Agriculture	Rural Establishment Innovation Survey	Inventory of innovation activities; use of technology by labor force, establishment, and community characteristics; factors hampering innovation; funding source for innovation; applications for intellectual property and trademarks; sources of information on new opportunities	First survey cycle was conducted in 2013
National Institutes of Health, National Science Foundation, and White House Office of Science and Technology Policy	STAR METRICS	Recipient-based data containing information on contract, grant, and loan awards made under the American Recovery and Reinvestment Act of 2009	2009-2012
National Science Foundation	Research Spending and Results	Recipient-based data containing information on awards made by the National Science Foundation and the National Aeronautics and Space Administration	2007 onward
National Institute of Standards and Technology, U.S. Department of Commerce	Federal Laboratory (Interagency) Technology Transfer Summary Reports	Patent applications, invention licenses, cooperative R&D agreements, R&D obligations—extramural and intramural	1987-2009
Small Business Administration	Small Business Innovation Research (SBIR) and Small Business Technology Transfer (STTR) programs	Federal R&D funds/awards by agency, category, and state	1983-2012
U.S. Department of Energy (NB: Information available at the state, county, and municipal levels, as well as from utilities and nonprofits)	Energy Innovation Portal—Visual Patent Search Tool	Issued U.S. patents and published patent applications that are created using Department of Energy funding	1979 onward
	Advanced Manufacturing Office—State Incentives and Resource Database	Energy-saving incentives and resources available for commercial and industrial plant managers

SOURCES: USPTO databases, see http://www.gwu.edu/~gwipp/Stuart%20Graham%20020712.pdf. Rural Development, USDA, see http://www.gpo.gov/fdsys/pkg/FR-2011-06-22/html/2011-15474.htm. STAR METRICS, see https://www.starmetrics.nih.gov/Star/Participate#about. Research Spending and Results, see https://www.research.gov/research-portal/appmanager/base/desktop?_nfpb=true&_eventName=viewQuickSearchFormEvent_so_rsr. Federal Laboratory (Interagency) Technology Transfer Summary Reports, see http://www.nist.gov/tpo/publications/federal-laboratory-techtransfer-reports.cfm. SBIR and STTR, see http://www.sbir.gov/. Energy Innovation Portal, see http://techportal.eere.energy.gov/. Advanced Manufacturing Office, see http://www1.eere.energy.gov/manufacturing/.


— The Expertise Locator provides information on proposals and coprincipal investigators related to different topic areas to make it possible to locate researchers working on that topic.

— The Patent Viewer provides data on patents from NSF grantees.

— The Map Viewer offers a geographic tool for viewing NSF investments by institution and topic.

DOE—DOE’s Green Energy Data Service contains bibliographic data for patents relating to various forms of green energy (e.g., solar, wind, tidal, bio-energy) resulting from research sponsored by DOE and predecessor agencies.
Indiana University—Innovation in American Regions is a project funded in part by the U.S. Commerce Department’s Economic Development Administration. The research is conducted by the Purdue Center for Regional Development, Indiana University, Kelley School of Business. The web tools available to users are the Innovation Index, Cluster Analysis, and Investment Analysis. The Innovation Index is a weighted index of indicators based on four components—human capital, economic dynamics, productivity and employment, and economic well-being. Cluster Analysis depicts occupation and industry clusters for any state, metro area, micro area, district, or county in the nation. Investment Analysis provides various kinds of information to aid regional investors.
Innovation Ecologies Inc.—The Regional Innovation Index is a single data platform consisting of a host of indicators from various sources. The indicators measure venture capital, labor inputs, personal income, education and training, globalization, Internet usage, R&D inputs, universities, quality of life, knowledge, employment outcomes in firms and establishments, social and government impacts, and innovation processes.

Innovation Data from Private Sources

A number of educational institutions and corporate organizations collect and disseminate innovation data (see Table F-3):

University of California, Los Angeles (UCLA)—Zucker and Darby (2011) developed the COMETS (Connecting Outcome Measures in Entrepreneurship Technology and Science) database. COMETS is an integrated database of principal investigators, dissertation writers and advisers, inventors, and employees at private-sector firms. COMETS data can be used to trace government expenditures on R&D from the initial grant through knowledge creation, translation, diffusion, and in some cases commercialization.
Association of University Technology Managers (AUTM)—AUTM has been conducting licensing surveys on U.S. and Canadian universities, hospitals, and research institutions since 1991. Twenty years of data from participating institutions are placed in Statistics Access for Tech Transfer (STATT), a searchable and exportable database. It contains information on income from, funding source of, staff size devoted to, and legal fees incurred for licensing; start-ups that institutions created; resultant patent applications filed; and royalties earned.
Association of Public and Land-grant Universities (APLU), Commission on Innovation, Competitiveness, and Economic Prosperity (CICEP)—APLU is involved in creating new metrics with which to measure the economic impact of universities at the regional and national levels. APLU’s CICEP has been working to identify and investigate the efficacy of potential metrics in the areas of human capital and knowledge capital. Indicators being investigated range from unfunded agreements between universities and industry (e.g., material transfer agreements, nondisclosure agreements), to student engagement in economic activities, to the impacts of technical assistance provided by universities to various actors in the region’s economy.
Harvard University—Patent Network Dataverse (Feldman, 2013a) is an online database created and hosted by the Institute for Quantitative Social Science at Harvard University. This is a “virtual web archive” that has, among other things, matched patents and publication data. Researchers use Dataverse to publish, share, reference, extract, and analyze data.
PricewaterhouseCoopers and the National Venture Capital Association—The MoneyTree Report is published quarterly and is based on data provided by Thomson Reuters. The report contains data on venture capital financing, including the companies that supply and receive the financing.
Venture capital database—CB Insights, Venture Deal, Grow Think Research, and Dow Jones VentureSource have venture capital databases that profile venture capital firms and venture capital-financed firms.

TYPES OF INFORMATION CAPTURED BY VARIOUS STI DATABASES

STI data can be broadly categorized into three distinct subtopics:

1. R&D expenditure—Total R&D activity in a nation can be further broken down into:

— Total R&D expenditure or gross domestic expenditure on R&D (GERD),


TABLE F-3 Innovation Product Data from Private Sources


Agency	Database/Survey/Data Collection Mechanism	Indicator/Data Items	Time Period

University of California, Los Angeles	Connecting Outcome Measures in Entrepreneurship, Technology, and Science (COMETS) database	Integrates data on government grants, dissertations, patents, and publicly available firm data; currently contains information on patents granted by the U.S. Patent and Trademark Office (USPTO) and on National Science Foundation (NSF) and National Institutes of Health (NIH) grants	2007-2012
Association of University Technology Managers	Statistics Access for Tech Transfer (STATT)	Academic licensing data from participating academic institutions: licensing activity and income, start-ups, funding, staff size, legal fees, patent applications filed, royalties earned	1991-2010
Association of Public and Land-grant Universities—Commission on Innovation, Competitiveness, and Economic Prosperity	New Metrics to Measure Economic Impact of Universities	Relationship with industry: agreements, clinical trials, sponsored research, external clients. Developing the regional and national workforce: student employment, student economic engagement, student entrepreneurship, alumni in workforce. Knowledge incubation and acceleration programs: success in knowledge incubation and acceleration programs, ability to attract investments, relationships between clients/program participants and host university	Pilot conducted in spring 2012 with 35 participating institutions
Harvard University	Patent Network Dataverse: U.S. Patent Inventor Database	Patent coauthorship network	1975 onward
PricewaterhouseCoopers and National Venture Capital Association	Money Tree Report	Venture capital firms and firms receiving financing: quarterly and yearly investment amounts, number of deals by industry, stage of development, first-time financings, clean technology, and Internet-specific financings	Quarterly data, 1st quarter 1995 onward
CB Insights, Venture Deal, Grow Think Research, Dow Jones VentureSource	Venture Capital Database	Profile of venture capital firms and venture capital-financed firms

SOURCES: COMETS Database, see http://scienceofsciencepolicy.net/?q=node/3265. STATT database, see http://www.autm.net/source/STATT/index.cfm?section=STATT. APLU Economic Impact, see http://www.aplu.org/page.aspx?pid=2693. Patent Network Dataverse, see http://thedata.harvard.edu/dvn/dv/patent. Money Tree Report, see https://www.pwcmoneytree.com/MTPublic/ns/index.jsp. Venture Capital databases, see http://www.cbinsights.com/; http://www.venturedeal.com/; http://www.growthinkresearch.com/; https://www.venturesource.com/login/index.cfm?CFID=2959139&CFTOKEN=53e4cab1e600d5d-9089-411f-a010-949554ae0978.

— Business R&D expenditure or business enterprise expenditure on R&D (BERD),

— Academic R&D expenditure or higher education expenditure on R&D (HERD),

— Federal R&D expenditure,

— Government intramural expenditure on R&D (GOVERD),

— Government budget appropriations or outlays on R&D (GBAORD), and

— R&D performed and/or funded by nonprofit organizations.

2. Human capital/human resources in S&T—This category comprises individuals in S&T occupations and those with degrees in S&T fields. Most of the above-mentioned agencies/organizations produce statistics on both subgroups. The variables reported are:

— total R&D personnel;

— researchers;

— technicians;

— other supporting staff;

— scientists and engineers;

— number of degrees in science, engineering, and health (SEH) fields; and

— number of graduates in S&E fields.

3. Innovation—Statistics on business innovation are being collected by NCSES through BRDIS. NCSES has released two InfoBriefs (NSF 11-300 and NSF


12-307) that provide information on technologically innovative firms and usage of methods for protecting intellectual property, both by North American Industry Classification System (NAICS) classification. Data in both InfoBriefs were gathered from the 2008 BRDIS. BRDIS focuses on technological innovation (product and process) and was inspired by Eurostat’s CIS, which looks at both technological and nontechnological innovation. Similarly, Statistics Canada’s SIBS contains many elements borrowed from the CIS. To some extent, this ensures that questions across the three surveys align, which may be helpful for international data comparisons. As the survey results become available, it will be possible to answer the question of whether the data across all three surveys are truly comparable; for now, however, it is too early to say. The subtopics within innovation statistics stem from sections/questions in survey questionnaires and can be divided into nine categories:⁸

type of innovation activity—product, process, organizational, marketing;
innovation activity and expenditure;
turnover due to innovative products;
objectives of innovation;
sources of information on innovation;
cooperation in innovation activity;
factors hampering innovation activity;
government support/public funding for innovation; and
innovation with environmental benefits.

Table F-1, presented earlier, outlines the level of detail available in the STI data produced by NCSES, Statistics Canada, OECD, Eurostat, and UNESCO. Unique variables produced by these agencies—those not available from other organizations—are highlighted in the table.

Even though agencies and other organizations try to produce STI statistics covering the subtopics, some of them clearly have an advantage over others in certain areas. Staff of the Committee on National Statistics looked at the concentration of agencies and other organizations in various subtopics (see Figures F-1 and F-2). The metric used in these figures is the percentage of tables produced on a particular subtopic relative to the total number of tables generated by the STI database. Eurostat’s statistics database, for example, has 330 tables on various STI subtopics (see Table F-1). Of those, 9 tables show GERD values of member nations disaggregated by economic activity, costs, and so on. Similarly, there are 40 tables on R&D personnel and their various attributes, which brings the percentage of tables on the R&D personnel subtopic to 12 percent (40 divided by 330). A separate figure was created for NCSES to avoid confusion. As seen in Figures F-1 to F-3, STI data produced by NCSES are oriented toward scientists and engineers and SEH degrees; Statistics Canada and Eurostat focus more on innovation topics, and OECD and UNESCO on researchers.
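The table-share metric described above is a simple ratio. The following minimal sketch reproduces the Eurostat arithmetic quoted in the text; the function and variable names are hypothetical and are introduced only for illustration.

```python
def table_share(tables_on_subtopic, total_tables):
    """Percentage of a database's tables devoted to one subtopic."""
    if total_tables <= 0:
        raise ValueError("total_tables must be positive")
    return 100.0 * tables_on_subtopic / total_tables


# Counts quoted in the text for Eurostat's statistics database.
eurostat_total = 330
eurostat_counts = {"GERD": 9, "R&D personnel": 40}

# Share of tables per subtopic, in percent.
shares = {
    subtopic: table_share(count, eurostat_total)
    for subtopic, count in eurostat_counts.items()
}

for subtopic, share in shares.items():
    print(f"{subtopic}: {share:.1f} percent of tables")
```

Applying the same ratio to every subtopic of every database yields the values plotted in the figures.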


FIGURE F-1 Subtopics of science, technology, and innovation data produced by agencies/organizations other than the National Center for Science and Engineering Statistics.
NOTES: The scale is in reverse order. As one moves closer to the epicenter, the value increases. BERD = business enterprise expenditure on R&D; BOP = balance of payments; GBAORD = government budget appropriations or outlays for research and development; GERD = gross domestic expenditure on research and development; GOVERD = government intramural expenditure on R&D; HERD = higher education expenditure on research and development; R&D = research and development; SEH = science, engineering, and health; UNESCO = United Nations Educational, Scientific and Cultural Organization.
SOURCES: Adapted from UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012]. Statistics Canada, CANSIM, see http://www5.statcan.gc.ca/cansim/a33?lang=eng&spMode=master&themeID=193&RT=TABLE [November 2012].

METHODOLOGY

The panel used cluster analysis, which includes multidimensional scaling (MDS), and a heat map tool to understand the redundancy in the main S&T indicators produced by the above-mentioned organizations/agencies. MDS and the heat map are not the only approaches to analyzing STI data; they are two of many possible paths to understanding the redundancy in the multitude of variables published by various agencies and organizations. Both methods

____________________

⁸Subtopics in the CIS.



FIGURE F-2 Subtopics of science, technology, and innovation data produced by the National Center for Science and Engineering Statistics.
NOTES: The scale is in reverse order. As one moves closer to the epicenter, the value increases. BERD = business enterprise expenditure on R&D; BOP = balance of payments; GBAORD = government budget appropriations or outlays for research and development; GERD = gross domestic expenditure on research and development; GOVERD = government intramural expenditure on R&D; HERD = higher education expenditure on research and development; R&D = research and development; SEH = science, engineering, and health.
SOURCES: Adapted from BRDIS, see http://www.nsf.gov/statistics/industry/ [November 2012]. Federal Funds, see http://www.nsf.gov/statistics/fedfunds/ [November 2012]. R&D Expenditure at FFRDCs, see http://www.nsf.gov/statistics/ffrdc/ [November 2012]. HERD, see http://www.nsf.gov/statistics/herd/ [November 2012]. Science and Engineering State Profiles, see http://www.nsf.gov/statistics/pubseri.cfm?seri_id=18 [November 2012].

offer wide-ranging applications in various fields and have helped researchers gain some understanding of the datasets on which they are working.

Generally, cluster analysis⁹ is a collection of methods for finding distinct or overlapping clusters in data. It is an analytic procedure for grouping sets of objects into subsets that are relatively similar among themselves. In a broad sense, there are two methods of clustering—hierarchical and partitioning. With hierarchical methods, small clusters are formed that merge sequentially into larger and larger clusters until only one remains, resulting in a tree of clusters. Partitioning methods split a dataset into a set of discrete clusters that are nonhierarchical in nature because they do not fit into a tree or hierarchy. Different numbers of clusters on the same dataset can result in different partitioning that may overlap. To produce clusters, there must be some measure of dissimilarity or distance among objects. Similar objects should appear in the same cluster and dissimilar objects in different clusters. Different measures of similarity produce different hierarchical clusterings. If there are two vectors consisting of values on p features of two objects, popular distance measures are:¹⁰


FIGURE F-3 Subtopics of science, technology, and innovation indicators published in Science and Engineering Indicators 2012 Digest.
NOTES: The scale is in reverse order. As one moves closer to the epicenter, the value increases. BERD = business enterprise expenditure on R&D; BOP = balance of payments; GBAORD = government budget appropriations or outlays for research and development; GERD = gross domestic expenditure on research and development; GOVERD = government intramural expenditure on R&D; HERD = higher education expenditure on research and development; R&D = research and development; S&E = science and engineering; SEH = science, engineering, and health.
SOURCE: Adapted from Science and Engineering Indicators 2012, see http://www.nsf.gov/statistics/seind12/tables.htm.

Euclidean—the square root of the sum of squared elementwise differences between the two vectors;
City Block—the sum of absolute differences between the two vectors;
cosine—1 minus the inner product of the two vectors divided by the product of their lengths (norms);
Pearson—1 minus the Pearson correlation between the two vectors; and
Jaccard—the sum of the mismatches between the elements of one vector and the elements of the other.
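For reference, the five measures listed above can be written out as formulas. The notation is an assumption made here for illustration: x and y denote the two vectors, p is the number of features (as in the text), and i indexes the features. The cosine and Pearson entries are written as 1 minus the corresponding similarity so that smaller values mean more similar objects, which is one common way of turning these similarities into distances. The Jaccard entry follows the mismatch-count wording used in the list rather than the set-based definition found elsewhere.

```latex
\begin{align*}
d_{\text{Euclidean}}(x, y) &= \sqrt{\sum_{i=1}^{p} (x_i - y_i)^2} \\
d_{\text{City Block}}(x, y) &= \sum_{i=1}^{p} \lvert x_i - y_i \rvert \\
d_{\text{cosine}}(x, y) &= 1 - \frac{\sum_{i=1}^{p} x_i y_i}{\lVert x \rVert \, \lVert y \rVert} \\
d_{\text{Pearson}}(x, y) &= 1 - \frac{\sum_{i=1}^{p} (x_i - \bar{x})(y_i - \bar{y})}
  {\sqrt{\sum_{i=1}^{p} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{p} (y_i - \bar{y})^2}} \\
d_{\text{Jaccard}}(x, y) &= \sum_{i=1}^{p} \mathbf{1}[x_i \neq y_i]
\end{align*}
```

Here \bar{x} and \bar{y} are the means of the elements of x and y, and \mathbf{1}[\cdot] equals 1 when its condition holds and 0 otherwise.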

____________________

⁹For examples of the broad analytical capabilities of cluster analysis, see Feser and Bergman (2000) on industrial clusters, Myers and Fouts (2004) on K-12 classroom environments for learning science, and Newby and Tucker (2008) in the area of medical research.

¹⁰This explanation of distance measures and linkage methods is based on the Data Analysis output of AdviseStat (see http://www.skytree.net/products-services/adviser-beta/ [December 2012]).

Euclidean distances are sensitive to differences between values because the differences are squared, so larger differences carry more weight than smaller ones. City Block distances are similarly sensitive to differences between values, but these differences are not squared. In contrast with Euclidean or City Block distances, cosine-based distances are invariant to scaling: multiplying all values by a positive constant will not change cosine distance. Pearson-based distances are invariant to both scaling and location: adding a constant to the values will not change these distances either. Jaccard distances are based only on mismatches of values and are invariant under one-to-one recodings of unique values. All of these measures are real (metric) distances; they obey the metric axioms.¹¹
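The invariance properties described in this paragraph can be checked numerically. The sketch below is illustrative only: the data values and function names are hypothetical, and the cosine and Pearson distances are assumed to be 1 minus the corresponding similarity.

```python
import math

def _dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_distance(x, y):
    return 1.0 - _dot(x, y) / math.sqrt(_dot(x, x) * _dot(y, y))

def pearson_distance(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return cosine_distance([a - mean_x for a in x], [b - mean_y for b in y])

x = [1.0, 3.0, 2.0, 5.0]              # hypothetical values
y = [2.0, 1.0, 4.0, 3.0]              # hypothetical values
y_scaled = [10.0 * v for v in y]      # multiply by a positive constant
y_shifted = [v + 100.0 for v in y]    # add a constant

# Cosine distance should ignore the rescaling but not the shift
cosine_scale_invariant = math.isclose(cosine_distance(x, y), cosine_distance(x, y_scaled))
cosine_shift_invariant = math.isclose(cosine_distance(x, y), cosine_distance(x, y_shifted))

# Pearson distance should ignore both the rescaling and the shift
pearson_scale_invariant = math.isclose(pearson_distance(x, y), pearson_distance(x, y_scaled))
pearson_shift_invariant = math.isclose(pearson_distance(x, y), pearson_distance(x, y_shifted))

# Euclidean distance should change under rescaling
euclidean_scale_invariant = math.isclose(euclidean(x, y), euclidean(x, y_scaled))

print(cosine_scale_invariant, cosine_shift_invariant,
      pearson_scale_invariant, pearson_shift_invariant,
      euclidean_scale_invariant)
```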

Multidimensional Scaling

The panel used MDS models to discover which sets of indicators are more similar to each other and to help in arriving at a set of primary and derivative indicators. Table F-1 and Figure F-1, presented earlier, show the great number of variables capturing numerous pieces of STI information. From the viewpoint of an agency or organization, it is important to understand which indicators are necessary for addressing key policy questions and, in turn, to make the production of STI variables more efficient. Various applications of MDS are documented by Young and Hamer (1987). MDS is frequently applied to political science data, such as voting preferences. It has also been applied to geographic data: Minh-Tam and colleagues used MDS to embed the network of capital cities of European nations based on their pairwise distances (Minh-Tam et al., 2012; Nishimura et al., 2009).

The original motivation for MDS was to fit a matrix of dissimilarities or similarities to a metric space. Since its origins, however, MDS has had many other applications. A popular use is to compute a distance matrix on the columns of a rectangular matrix using a metric distance function (Euclidean, cosine, Jaccard, power, etc.) and then to apply MDS to that matrix, which projects the original variables into a low-dimensional (usually 2-dimensional) space. This approach is an alternative to principal components analysis. If the structure of a given dataset is intrinsically nonlinear, MDS can provide a better view than principal components.

Young (2013, p. 1) describes the process as follows:

Multidimensional scaling (MDS) is a set of data analysis techniques that display the structure of distance-like data as a geometrical picture. It is an extension of the procedure discussed in scaling…. MDS pictures the structure of a set of objects from data that approximate the distances between pairs of the objects…. Each object or event is represented by a point in a multidimensional space. The points are arranged in this space so that the distances between pairs of points have the strongest possible relation to the similarities among the pairs of objects. That is, two similar objects are represented by two points that are close together, and two dissimilar objects are represented by two points that are far apart.

Given a configuration of points in a metric space, one can compute a symmetric matrix of pairwise distances on all pairs of points. By definition, the diagonal of this matrix is zero, and the off-diagonal elements are positive for distinct points. MDS inverts this problem: one starts from an input matrix of pairwise distances (or dissimilarities) and wants to compute a matrix X containing the coordinates of points in the metric space, such that the distance formula provided by the metric reproduces the input distances as closely as possible. A general formula for a distance metric is:

\[
d_{ij}^{\,p} = \sum_{a=1}^{r} \lvert X_{ia} - X_{ja} \rvert^{p}, \qquad p \ge 1,\; X_i \ne X_j
\]

where there are r dimensions, where X_ia is the coordinate of point i on dimension a, and where X_i is an r-element row vector from the ith row of the n by r matrix X containing the coordinates X_ia of all n points on all r dimensions. For d_ij to satisfy all of the properties of a metric, d_ij must be positive. Therefore, only the positive pth root of d_ij^p is used in determining d_ij. This is known as the Minkowski model. Three special cases of the Minkowski model are of primary interest. One of these is the Euclidean model, which is obtained when the Minkowski exponent (p) is 2. The second is the city block or taxicab model; when p = 1, d_ij is simply the sum of absolute differences in the coordinates of the two points. When p is infinitely large, the dominance model is obtained, in which d_ij is the largest absolute difference in the coordinates of the two points.
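The three special cases of the Minkowski model can be illustrated with a short function. This is a minimal sketch; the function name `minkowski` and the example coordinates are hypothetical.

```python
import math

def minkowski(x_i, x_j, p):
    """Minkowski distance between two coordinate vectors for exponent p >= 1."""
    if p < 1:
        raise ValueError("the Minkowski exponent p must be at least 1")
    diffs = [abs(a - b) for a, b in zip(x_i, x_j)]
    if math.isinf(p):
        # Dominance model: the largest coordinate difference dominates
        return max(diffs)
    # Positive pth root of the sum of pth powers of absolute differences
    return sum(d ** p for d in diffs) ** (1.0 / p)

x_i = [0.0, 0.0]   # hypothetical coordinates of point i (r = 2 dimensions)
x_j = [3.0, 4.0]   # hypothetical coordinates of point j

city_block = minkowski(x_i, x_j, 1)          # p = 1
euclidean = minkowski(x_i, x_j, 2)           # p = 2
dominance = minkowski(x_i, x_j, math.inf)    # p -> infinity
print(city_block, euclidean, dominance)
```

For these two points the city block distance is 3 + 4, the Euclidean distance is the hypotenuse of a 3-4-5 triangle, and the dominance distance is the larger of the two coordinate differences.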

The MDS analysis in this report uses the Euclidean model described above. For the application to STI indicators, the input matrix needed to be symmetric, meaning that its (i, j) entry equals its (j, i) entry. Since the raw data matrix was not symmetric, a matrix of correlation coefficients of the variables in the analysis was used as input.¹²
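One way to go from a symmetric correlation matrix to a two-dimensional configuration is classical (Torgerson) MDS. The sketch below is an illustration under stated assumptions, not the procedure AdviseStat applies: it assumes correlations are converted to dissimilarities as 1 − r, uses classical rather than iterative or nonmetric MDS, and extracts eigenvectors by simple power iteration. The correlation values and all function names are hypothetical.

```python
import math

def correlation_to_distance(corr):
    """Convert a symmetric correlation matrix into 1 - r dissimilarities."""
    return [[1.0 - r for r in row] for row in corr]

def _matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def _unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def classical_mds(dist, dims=2, iters=500):
    """Classical MDS of a symmetric distance matrix; returns n rows of coordinates."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering turns squared distances into inner products
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        v = _unit([math.sin(i + 1.0 + d) for i in range(n)])  # arbitrary fixed start
        for _ in range(iters):
            w = _matvec(b, v)
            if math.sqrt(sum(x * x for x in w)) < 1e-12:
                break  # nothing left to explain on this dimension
            v = _unit(w)
        eigenvalue = sum(x * y for x, y in zip(v, _matvec(b, v)))
        # A negative eigenvalue signals non-Euclidean input; that axis is left at zero
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate so the next pass finds the next eigenvector
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j]
    return coords

# Hypothetical correlations among three indicators
corr = [[1.0, 0.8, 0.2],
        [0.8, 1.0, 0.3],
        [0.2, 0.3, 1.0]]
config = classical_mds(correlation_to_distance(corr))
for label, point in zip(["indicator_a", "indicator_b", "indicator_c"], config):
    print(label, point)
```

In the resulting configuration, highly correlated indicators land close together, which is the property the panel relied on when reading the MDS plots.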

Heat Map

Another notable method is the cluster heat map. In certain fields, the analyst wants to cluster the rows and columns of a matrix simultaneously, and the cluster heat map is the popular display for doing so.¹³ Wilkinson and Friendly (2009) describe heat map analysis as follows:

The cluster heat map is a rectangular tiling of a data matrix with cluster trees appended to its margins. Within a relatively compact display area, it facilitates inspection of row, column,

____________________

¹¹Metric axioms are: Identity, where distance (A, A) = 0; Symmetry, where distance (A, B) = distance (B, A); and Triangle Inequality, where distance (A, C) ≤ distance (A, B) + distance (B, C). See http://www.pigeon.psy.tufts.edu/avc/dblough/metric_axioms.htm [July 2013].

¹²For more details on the derivation of the Euclidean model from the general formula for a distance metric, see Young and Hamer (1987, p. 87).

¹³The whole explanation of clustering methods is taken from “A Second Opinion on Cluster Analysis,” Whitepaper on a Second Opinion, downloaded from the AdviseAnalytics website (https://adviseanalytics.com/advisestat [December 2012]).

and joint cluster structure. Moderately large data matrices (several thousand rows/columns) can be displayed effectively on a higher-resolution color monitor, and even larger matrices can be handled in print or in megapixel displays.

The heat map also orders the variables such that similar variables are closer to each other. Cluster heat maps are built from two separate hierarchical clusterings, one on rows and one on columns; each therefore rests on the same foundation as one-way hierarchical clustering.

ANALYSIS

STI databases consist of variables, some of which measure R&D expenditures, some the numbers of scientists and engineers, others the amount of trade taking place, and so on. As there is no uniform scale for all of these variables (some are dollar figures, others are counts), the panel decided to use Pearson distance, which is 1 − the Pearson correlation between two vectors. Therefore, if a correlation is negative, the distance will be greater than 1. For standardized variables (z-scores), 1 − Pearson is equivalent to Euclidean distance in the sense that the squared Euclidean distance is proportional to 1 − the Pearson correlation, so both measures rank pairs of variables identically. Hierarchical clustering also depends on the linkage method, which determines how the distance between two clusters is calculated. The linkage methods are:

single—distance between the closest pair of objects, one object from each group;
complete—distance between the farthest pair of objects, one object from each group;
average—average of distances between all pairs of objects, one object from each group;
centroid—distance between the centroids of the clusters;
median—distance between the centroids of the clusters weighted by the size of the clusters; and
Ward—increase in the within-cluster sum of squares as a result of joining two clusters.

Once distances between clusters have been computed, the closest two are merged. Single linkage tends to produce long, stringy clusters, whereas complete linkage tends to produce compact clusters. Ward clustering usually produces the best hierarchical trees when the clusters are relatively convex and separated. Since the panel believes Ward linkage is the best all-around method, it was used for this analysis.
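The linkage rules and the merge step can be made concrete with a small sketch. It is illustrative only: clusters are represented as lists of coordinate tuples, the function names are hypothetical, Euclidean distance between points is assumed, and median linkage is omitted. The Ward function uses the standard identity that the increase in within-cluster sum of squares equals the squared distance between centroids weighted by |A||B|/(|A| + |B|).

```python
import math
from itertools import product

def point_distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(cluster):
    n = len(cluster)
    return tuple(sum(p[k] for p in cluster) / n for k in range(len(cluster[0])))

def single(a, b):
    # Closest pair of objects, one from each cluster
    return min(point_distance(p, q) for p, q in product(a, b))

def complete(a, b):
    # Farthest pair of objects, one from each cluster
    return max(point_distance(p, q) for p, q in product(a, b))

def average(a, b):
    # Mean distance over all cross-cluster pairs
    return sum(point_distance(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid_linkage(a, b):
    return point_distance(centroid(a), centroid(b))

def within_ss(cluster):
    # Sum of squared distances from each point to the cluster centroid
    c = centroid(cluster)
    return sum(point_distance(p, c) ** 2 for p in cluster)

def ward(a, b):
    # Increase in within-cluster sum of squares caused by merging a and b
    return len(a) * len(b) / (len(a) + len(b)) * centroid_linkage(a, b) ** 2

def closest_pair(clusters, linkage):
    """Indices of the two clusters that would be merged next under the given linkage."""
    candidates = [(linkage(clusters[i], clusters[j]), i, j)
                  for i in range(len(clusters))
                  for j in range(i + 1, len(clusters))]
    _, i, j = min(candidates)
    return i, j

# Hypothetical clusters of two-dimensional points
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 2.0), (6.0, 1.0)]
for name, rule in [("single", single), ("complete", complete), ("average", average),
                   ("centroid", centroid_linkage), ("ward", ward)]:
    print(name, rule(a, b))
```

Repeatedly calling a function such as `closest_pair` and merging the returned clusters until one cluster remains yields the tree that the hierarchical cluster figures display.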

To analyze STI data, the panel used hierarchical clustering, cluster heat map, and MDS methods. For purposes of analysis, we used the statistics program AdviseStat, an expert system for statistics akin to an intelligent analytics advisor. In the cluster analysis, we used Pearson correlations as the similarity measure, and the hierarchical clustering used Ward linkage. Variables were standardized before similarities were computed. Standardizing puts measurements on a common scale and prevents one variable from dominating the clustering merely because it has a larger mean and/or variance. In general, standardizing makes overall level and variation comparable across measurements. As mentioned above, a comprehensive evaluation of STI variables raises scale issues because the variables capture different types of information. Hence, it is necessary to put the variables on a common scale.
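The effect of standardizing, and its link to Pearson distance, can be shown with a small numerical sketch. The data values and names below are hypothetical, and z-scores are computed with the population standard deviation (dividing by n); under that assumption the squared Euclidean distance between two standardized variables equals 2n(1 − r), where r is their Pearson correlation.

```python
import math

def zscores(values):
    """Standardize to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def pearson(x, y):
    # Mean cross-product of the z-scores
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

def squared_euclidean(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

# Hypothetical indicators on very different scales for five countries
rd_spending = [120.0, 340.0, 95.0, 410.0, 280.0]              # millions of dollars
researchers = [15000.0, 42000.0, 9000.0, 39000.0, 30000.0]    # head counts

n = len(rd_spending)
r = pearson(rd_spending, researchers)
zx, zy = zscores(rd_spending), zscores(researchers)

raw_gap = squared_euclidean(rd_spending, researchers)   # dominated by the head counts
standardized_gap = squared_euclidean(zx, zy)            # comparable across indicators
print(raw_gap, standardized_gap, 2 * n * (1 - r))
```

The raw distance mostly reflects the difference in units, whereas the standardized distance depends only on how closely the two indicators move together.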

The analysis consisted of three segments, depending on the type of data. OECD, UNESCO, and Eurostat collect STI information from member nations; NCSES and Statistics Canada collect similar information from a single nation. Analyzing variables from all five databases would be intractable. Therefore, it was necessary to separate the analysis into two groups—many nations and single nation. In the many nations analysis, we concentrated on (1) variables from OECD, UNESCO, and Eurostat and (2) indicators from the SEI 2012 Digest. The single nation analysis has two components: (1) U.S. R&D expenditures and funding as published by NCSES, OECD, UNESCO, and Eurostat and (2) U.S. R&D human capital as published by NCSES, OECD, UNESCO, and Eurostat. The third segment focused on innovation data published by NCSES, Eurostat, and Statistics Canada. The conclusions and observations from the analysis are summarized below.

Many Nations Analysis

This analysis is restricted to OECD, UNESCO, and Eurostat as their databases contain data on more than one nation. As mentioned earlier, the data series of OECD and Eurostat go back to 1981, while that of UNESCO begins in 1996. Here again, the data were divided into two parts. The first part of the analysis focused on main S&T variables from the three databases for 1996 to 2011 (see Figures F-4, F-5, and F-6). The second part of the analysis was based on a subset of those variables, for which information was available going back to 1981, and hence was restricted to OECD and Eurostat (see Figures F-7, F-8, and F-9). The selected variables are listed in Table F-4. It should be noted that this is not an exhaustive list of all the variables available in the three databases. Early in the analysis, the panel took a “kitchen sink” approach whereby the analysis was run on all variables. We ran into multiple clusters, as many of the variables are tabulations of a main variable. For example, “number of foreign citizen female researchers in government sector” is a subset of “number of female researchers.” As can be seen in Table F-1, the number of variables that can be gleaned from a single subtopic is large; for example, Eurostat’s STI database contains at least 180 variables on human resources in S&T. The aim of the cluster analysis and MDS is to understand what redundancy exists in main S&T variables, and our analysis was therefore restricted to the selected variables shown in Table F-4 that address various subtopics. The variable names start with EURO, OECD, or UN, denoting the S&T database to which the variables belong. It should not be assumed that the excluded variables are unimportant; Chapter 3 of this report

FIGURE F-4 Heat map of main science and technology variables from OECD, UNESCO, and Eurostat databases.
NOTES: AERO = aerospace industry; BOP = balance of payments; ELEC = electronic industry; EMPL = employment; EU = European Union; EURO = Eurostat; EXP = export industry; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; HC = head count; HR = human resources; HRST = human resources in science and technology; HTECH = high technology; ICT = information and communication technology; INSTR = instrument industry; KIS = knowledge-intensive services; NONEU = non-European Union; OC = office machinery and computer; OSS = other supporting staff; PCT = Patent Cooperation Treaty; PHARMA = pharmaceutical industry; R&D = research and development; RD = R&D; RES = researchers; SE = science and engineering; TRD = trade; UN = United Nations; UNESCO = United Nations Educational, Scientific and Cultural Organization.
SOURCES: Panel analysis and UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012].

FIGURE F-5 Hierarchical cluster of main science and technology variables from OECD, UNESCO, and Eurostat databases.
NOTES: AERO = aerospace industry; ELEC = electronic industry; EMPL = employment; EU = European Union; EURO = Eurostat; EXP = export industry; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; HC = head count; HR = human resources; HRST = human resources in science and technology; HTECH = high technology; ICT = information and communication technology; INSTR = instrument industry; KIS = knowledge-intensive services; NONEU = non-European Union; OC = office machinery and computer; OSS = other supporting staff; PCT = Patent Cooperation Treaty; PHARMA = pharmaceutical industry; R&D = research and development; RD = R&D; RES = researchers; SE = science and engineering; TRD = trade; UN = United Nations; UNESCO = United Nations Educational, Scientific and Cultural Organization.
SOURCES: Panel analysis and UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012].

points out that they have a crucial role in answering policy questions. We also encountered the problem of missing data, especially in the UNESCO dataset. As described earlier, the UNESCO database consists of 216 nations, but the set is reduced to 80 if one excludes those countries for which the number of time series is limited. For the second part of the analysis, the number of observations was reduced still further to 52 nations. It is important to understand that we attempted to make our analysis as comprehensive as possible by merging information from three databases, but that effort was hampered by the unavailability of data in certain cases.

In addition to reviewing redundancy in statistics produced by UNESCO, OECD, and Eurostat, the panel expanded the analysis to include data taken from SEI 2012. The online version of SEI 2012 is available at http://www.nsf.gov/statistics/seind12/start.htm. The site provides access to tables and figures used in the digest. These tables and figures provide information on the United States, the EU, Japan, China, other selected Asian economies (the Asia-8: India, Indonesia, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand), and the rest of the world. A more detailed breakdown is available in the appendix tables.¹⁴ Because the digest is intended to inform the reader of broad trends, the source data for the figures and tables do not show a continuous time series.¹⁵ Note that this is not a review of everything that is in SEI 2012, because many tables and figures are used to highlight findings and conclusions. As pointed out in the introduction to SEI 2012:

The indicators included in Science and Engineering Indicators 2012 derive from a variety of national, international, public, and private sources and may not be strictly comparable in a statistical sense. As noted in the text, some data are weak, and the metrics and models relating them to each other and to economic and social outcomes need further development. Thus, the emphasis is on broad trends; individual data points and findings should be interpreted with care.

____________________

¹⁴Detailed appendix tables are available at National Science Foundation (2012c).

¹⁵For example, see Table 6-6 on p. 6-41 in the S&E 2012 Digest (National Science Board, 2012b).

FIGURE F-6 Multidimensional scaling of main science and technology variables from OECD, UNESCO, and Eurostat databases.
NOTES: EMPL = employment; EU = European Union; EURO = Eurostat; EXP = export; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; HC = head count; HR = human resources; HRST = human resources in science and technology; HTECH = high technology; ICT = information and communication technology; KIS = knowledge-intensive services; NONEU = non-European Union; OC = office machinery and computer; OSS = other supporting staff; PHARMA = pharmaceutical industry; PCT = Patent Cooperation Treaty; R&D = research and development; RD = R&D; RES = researchers; SE = science and engineering; TRD = trade; UN = United Nations; UNESCO = United Nations Educational, Scientific and Cultural Organization.
SOURCES: Panel analysis and UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012].

Table F-5 shows a list of indicators used in the panel’s analysis. Every effort was made to be as congruent as possible with the list outlined earlier. Because this is a many nations analysis, indicators specific to the United States are not included in the table.

Single Nation Analysis: International Comparability and Human Capital in Science and Engineering

International Comparability

Complying with the Frascati Manual, NCSES reports R&D expenditures by performer and funder (see Table F-6). For comparability purposes, NCSES reports GERD for the United States in National Patterns, the SEI, and various InfoBriefs. Categorization of R&D expenditures by government priorities provides a broad picture of the distribution of R&D activities and a basis for international comparisons (National Science Foundation, 2010b). The standards for collecting data on socioeconomic objectives were introduced in the third edition of the Frascati Manual (OECD, 2002). Godin (2008) points out that the third edition of the manual expanded the scope of the previous edition to include research on the social sciences and humanities and place greater emphasis on “functional” classifications, notably the distribution of R&D by “objectives.” The Frascati Manual

FIGURE F-7 Heat map of main science and technology variables from OECD and Eurostat databases.
NOTES: E = European currency unit; EURO = Eurostat; EXP = export; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; HC = head count; HR = human resources; ICT = information and communication technology; OC = office machinery and computer; OSS = other supporting staff; PCT = Patent Cooperation Treaty; R&D = research and development; RD = R&D.
SOURCES: Panel analysis and UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012].

recommends collecting performer-reported data in all sectors for two priorities: (1) defense and (2) control and care of the environment. In Table 4-23 of the SEI 2012 Digest, U.S. GBAORD values are reported by socioeconomic objectives for 1981, 1990, 2000, and 2009. The agency publishes those figures using special tabulations because aggregate R&D funding data from federal agencies are already allocated to various socioeconomic objective categories, but performer-based R&D totals are not. BRDIS does intermittently survey companies to report their R&D performance for defense purposes and for environmental protection applications, even though the latter category is not fully compliant with the Frascati Manual. Along with GBAORD, Eurostat and OECD report GERD by socioeconomic objectives. The Frascati Manual also recommends that major fields of S&T be adopted as the functional fields of a science classification system. This classification should be used for R&D expenditures of the governmental, higher education, and private nonprofit sectors; if possible for the business enterprise sector; and for personnel data in all sectors (OECD, 2007). OECD,

FIGURE F-8 Hierarchical cluster of main science and technology variables from OECD and Eurostat databases.
NOTES: E = European currency unit; EURO = Eurostat; EXP = export; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; HC = head count; HR = human resources; ICT = information and communication technology; OC = office machinery and computer; OSS = other supporting staff; PCT = Patent Cooperation Treaty; R&D = research and development; RD = R&D.
SOURCES: Panel analysis and UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012].

Eurostat, and UNESCO publish GERD figures by fields. NCSES publishes academic expenditures by S&E subfields. The original targets for categorization of R&D expenditures by socioeconomic objectives and fields of science were GBAORD and HERD, respectively. Revisions of the Frascati Manual have expanded the scope of the categorization to include all kinds of R&D expenditures.

Table F-6 shows total R&D expenditure and its components for the United States as published by NCSES and other international organizations. Figures F-10 and F-11 show the results of the cluster analysis performed on the data in Table F-6.

Human Capital in Science and Engineering

NCSES produces a multitude of STI human capital variables, as seen in Table F-7:¹⁶

Scientists and engineers—Scientists and engineers are individuals who satisfy one of the following criteria: (1) have ever received a U.S. bachelor’s or higher degree in an S&E or S&E-related field, (2) hold a non-S&E bachelor’s or higher degree and are employed in an S&E or S&E-related occupation, or (3) hold a non-U.S. S&E degree and reside in the United States.
Doctoral scientists and engineers—Doctoral scientists and engineers are scientists and engineers who have earned doctoral degrees from U.S. universities and colleges.
Bachelor’s and master’s degrees in S&E—Estimates of recent college graduates in S&E fields were generated from the biennial NSRCG, which provided information about individuals who had recently obtained bachelor’s or master’s degrees in an SEH field. Because the NSRCG was a biennial survey, it collected information for two academic years; therefore, the estimates of S&E bachelor’s and master’s degrees produced from the NSRCG are for two consecutive academic years. NCSES also requests special tabulations from NCES on S&E bachelor’s, master’s, and doctoral degrees, which are published in Women, Minorities and Persons with Disabilities in Science and Engineering.
Doctorate recipients—Doctorate recipients are individuals who received a doctoral degree from a U.S. institution in an SEH field.

____________________

¹⁶Definitions of these terms are found on NCSES’s website at http://www.nsf.gov/statistics.

FIGURE F-9 Multidimensional scaling of main science and technology variables from OECD and Eurostat databases.
NOTES: EMPL = employment; EU = European Union; EURO = Eurostat; EXP = export; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; HC = head count; HR = human resources; HRST = human resources in science and technology; HTECH = high technology; ICT = information and communication technology; KIS = knowledge-intensive services; NONEU = non-European Union; OC = office machinery and computer; OSS = other supporting staff; PCT = Patent Cooperation Treaty; PHARMA = pharmaceutical industry; R&D = research and development; RD = R&D; RES = researchers; SE = science and engineering; TRD = trade; UN = United Nations; UNESCO = United Nations Educational, Scientific and Cultural Organization.
SOURCES: Panel analysis and UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database European Union, 1995-2013 [November 2012].

Graduate students in S&E—Graduate students in S&E are those who have been enrolled for credit in any SEH master’s or doctorate program in the fall of the survey cycle year.
Postdoctorates in S&E—Postdoctorates are defined as individuals who (1) hold a recent doctorate or equivalent, a first-professional degree in a medical or related field, or a foreign equivalent to a U.S. doctoral degree and (2) have a limited-term appointment primarily for training in research or scholarship under the supervision of a senior scholar in a unit affiliated with a GSS¹⁷ institution.
Nonfaculty researchers—Doctorate-holding, nonfaculty researchers are defined as individuals involved principally in research activities who are not postdoctorates or members of a faculty.

Relative to NCSES, other organizations/agencies do not publish a large set of human capital variables, but they capture information on certain R&D occupations (see Table F-1) that were missing from NCSES’s surveys until very recently. OECD produces statistics on R&D personnel and researchers. In accordance with the Frascati Manual, R&D personnel include all persons employed directly in R&D activities, as well as those providing direct services, such as R&D managers, administrators, and clerical staff, while researchers are professionals engaged in the conception or creation of new knowledge, products, processes, methods, and systems and in the management of the projects concerned. Eurostat defines human resources in S&T as people who fulfill one of the following conditions:

____________________

¹⁷NSF/NIH Survey of Graduate Students and Postdoctorates in Science and Engineering at http://www.nsf.gov/statistics/srvygradpostdoc/ [December 2012].

TABLE F-4 Selected Science and Technology Variables from OECD, UNESCO, and Eurostat Database Used in the Panel’s Analysis

Variable Label	Variable
EURO_GERD	Gross domestic expenditure on R&D (millions of current PPP dollars)
EURO_HRST	Human resources in science and technology
EURO_HTEC_EMPL	Employment in high-tech sectors
EURO_HTEC_TRD_EU	Trade with EU partners in high-tech sectors
EURO_HTEC_TRD_NONEU	Trade with non-EU partners in high-tech sectors
EURO_KIS_EMPL	Employment in high-tech knowledge-intensive services
EURO_OSS_FTE	Other supporting staff—full-time equivalent
EURO_OSS_HC	Other supporting staff—head count
EURO_RD_HR_FTE	R&D personnel—full-time equivalent
EURO_RD_HR_HC	R&D personnel—head count
EURO_RES_FTE	Researchers—full-time equivalent
EURO_RES_HC	Researchers—head count
EURO_SE	Scientists and engineers
EURO_TECH_FTE	Technicians—full-time equivalent
EURO_TECH_HC	Technicians—head count
OECD_AERO_BALANCE	Trade balance: aerospace industry (millions of current dollars)

(1) successfully completed education at the third level in an S&T field of study; or

(2) were not formally qualified as above, but are employed in S&T occupations in which the above qualifications are normally required.¹⁸

Eurostat refers to scientists and engineers as persons who use or create scientific knowledge and engineering and technological principles, i.e., persons with scientific or technological training who are engaged in professional work on S&T activities and high-level administrators and personnel who direct the execution of S&T activities. UNESCO publishes information on researchers, technical professionals, and other supporting staff. OECD, Eurostat, and UNESCO produce human capital statistics by head count and FTEs.

Table F-7 shows various human capital variables for the United States that are published by NCSES and other international organizations. Figures F-12 and F-13 show results of the cluster analysis performed on the data in Table F-7. Doctoral scientists and engineers is the only NCSES variable that is closely related to the variables reported by Eurostat and OECD.

Single Nation Analysis: Innovation Statistics—Levels versus Percentages

Table F-8 provides a comparative view of innovation data by industry classification that are available from the three surveys on innovation—the CIS, BRDIS, and Canada’s Survey of Innovation. SIBS 2009 has more recent data on the status of innovation activity in Canada, but the data are not available by industry classification; hence the 2003 Survey of Innovation data are presented here. NCSES data cover the period 2006-2008, because companies were asked to report on innovation activity for those years. The EU innovation data are taken from CIS 2006 and 2008. In Tables 1 and 2 of InfoBrief NSF 11-300, data on firms producing innovative products and processes are presented as percentages—for example, the percentage of innovative firms reporting that they produced a new/significantly improved product. This is also the case with innovation data produced by Statistics Canada, while data from the CIS are available in both level and percentage form. Staff of the Committee on National

____________________

¹⁸See http://epp.eurostat.ec.europa.eu/cache/ITY_SDDS/Annexes/hrst_st_esms_an1.pdf, page 1 [December 2012].


Variable Label	Variable
OECD_AERO_EXP_SHARE	Export market share: aerospace industry
OECD_BIOTECH_PATENT_APPL_PCT	Number of patents in the biotechnology sector—applications filed under the PCT (priority year)
OECD_ELEC_BALANCE	Trade balance: electronic industry (millions of current dollars)
OECD_ELEC_EXP_SHARE	Export market share: electronic industry
OECD_GERD_$	Gross domestic expenditure on R&D (millions of current PPP dollars)
OECD_INSTR_BALANCE	Trade balance: instruments industry (millions of current dollars)
OECD_INSTR_EXP_SHARE	Export market share: instruments industry
OECD_OC_BALANCE	Trade balance: office machinery and computer industry (millions of current dollars)
OECD_OC_EXP_SHARE	Export market share: office machinery and computer industry
OECD_PATENT_APPL_ICT	Number of patents in the ICT sector—applications filed under the PCT (priority year)
OECD_PATENT_APPL_PCT	Number of patent applications filed under the PCT (priority year)
OECD_PHARMA_BALANCE	Trade balance: pharmaceutical industry (millions of current dollars)
OECD_PHARMA_EXP_SHARE	Export market share: pharmaceutical industry
OECD_R&D_HR_FTE	R&D personnel—full-time equivalent
OECD_RES_FTE	Researchers—full-time equivalent
OECD_RES_HC	Researchers—head count
OECD_TECH_BOP_PAYMENTS_$	Technology balance of payments: payments (millions of current dollars)
OECD_TECH_BOP_RECEIPTS_$	Technology balance of payments: receipts (millions of current dollars)
OECD_TRIADIC_PATENT_FAMILIES	Number of Triadic Patent Families (priority year)
UN_GERD_$	Gross domestic expenditure on R&D (millions of current PPP dollars)
UN_OSS_FTE	Other supporting staff—full-time equivalent
UN_OSS_HC	Other supporting staff—head count
UN_R&D_HR_FTE	R&D personnel—full-time equivalent
UN_R&D_HR_HC	R&D personnel—head count
UN_RES_FTE	Researchers—full-time equivalent
UN_RES_HC	Researchers—head count
UN_TECH_FTE	Technicians—full-time equivalent
UN_TECH_HC	Technicians—head count

NOTES: AERO = aerospace industry; APPL = application; BOP = balance of payments; ELEC = electronic industry; EMPL = employment; EU = European Union; EURO = Eurostat; EXP = export market share; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; HC = head count; HR = human resources; HRST = human resources in science and technology; HTECH = high technology; ICT = information and communication technology; INSTR = instruments industry; KIS = knowledge-intensive services; NONEU = non-European Union; OC = office machinery and computer; OSS = other supporting staff; PCT = Patent Cooperation Treaty; PHARMA = pharmaceutical industry; PPP = purchasing power parity; R&D = research and development; RD = R&D; RES = researchers; SE = science and engineering; TECH = technicians; TRD = trade; UN = United Nations; UNESCO = United Nations Educational, Scientific and Cultural Organization.

Statistics manipulated the available data in Table 1 of NSF 11-300 and converted percentage figures into levels; the results are shown in Table F-8. Survey results from Canada’s 2003 Survey of Innovation, which are available in CANSIM, are difficult to interpret because the denominator is often unclear. In some data tables, it is clear that the denominator is innovative firms, while for other tables the user must guess. One can calculate the total number of innovative firms receiving tax credits, or the total number of innovative firms reporting customers as an important source of innovation information, only if information on total innovative firms is available. Hence, given these unclear denominators, staff of the Committee on National Statistics could not convert percentage figures into levels in the case of innovation data from CANSIM. It would be useful if more information on the surveyed population, such as total population, sample size, and response rate, were readily available. This information needs to be published by industry classification, as is evident from Tables F-8 and F-9.¹⁹
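The conversion described above can be sketched in a few lines. This is only an illustration of the arithmetic, not the procedure the staff actually used; the function name `percent_to_level` and the figures are hypothetical and are not taken from any survey. The point is that a percentage can be turned into a level only when its denominator is known.

```python
def percent_to_level(percent, denominator):
    """Convert a reported percentage into a count, given the base it was computed on."""
    if denominator is None:
        # Without a known base (e.g., total innovative firms) no level can be derived.
        raise ValueError("denominator unknown: percentage cannot be converted to a level")
    return round(percent / 100.0 * denominator)


# Hypothetical figures, for illustration only (not taken from any survey).
innovative_firms = 2000            # assumed number of innovative firms in an industry
pct_receiving_tax_credits = 35.0   # assumed share of those firms receiving tax credits

level = percent_to_level(pct_receiving_tax_credits, innovative_firms)
print(level)
```

When the denominator is passed as `None`, the function raises an error instead of guessing, which mirrors the situation described for the CANSIM tables.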

OBSERVATIONS

Many Nations Analysis

Figures F-4 to F-6 show a cluster heat map, a hierarchical cluster tree, and the multidimensional scaling of a Pearson correlation matrix, respectively. The input matrix consists of the main S&T variables from the OECD, UNESCO, and Eurostat databases. The red and orange squares along the diagonal of the heat map in Figure F-4 show that those variables are very closely related to each other, and either they could be merged, or the most well-behaved and consistent variables among them could be selected. Figures F-5 and F-8 show clusters of variables. Broadly speaking, human resource variables form one category and trade variables another. Figures F-6 and F-9 show sets of variables that are either similar or dissimilar. In these two figures, the dimensions have no interpretation, and one is looking for clusters of variables that would indicate they belong together. Strong correlation patterns are observed in the variables on researchers, technicians, and other supporting staff. These variables are closely grouped together. Moreover, within these variables, those produced by the same organization

____________________

¹⁹For further information, see Lonmo (2005, Table 1).


TABLE F-5 Science and Engineering Indicators from SEI 2012 Digest Used in the Panel’s Analysis

Indicator Label	Indicator
RD_%_GDP	R&D expenditures as a share of economic output = R&D as percentage of GDP

Deg_NatSci	First university degrees in natural sciences
Deg_Eng	First university degrees in engineering
Doct_NatSci	Doctoral degrees in natural sciences
Doct_Eng	Doctoral degrees in engineering
S&E_Art	S&E journal articles produced
Eng_Share_S&E_Art	Engineering journal articles as a share of total S&E journal articles

Res_Art_Int_CoAuthor	Percentage of research articles with international coauthors
Share_Citation_Int_Lit	Share of region’s/country’s citations in international literature
Global_HighValue_Patents	Global high-value patents
Export_Comm_KIS	Exports of commercial knowledge-intensive services
HighTech_Exports	High-technology exports
Trade_Balance_KIS_IntAsset	Trade balance in knowledge-intensive services and intangible assets
VA_HighTech_Manu	Value added of high-technology manufacturing industries
VA_Health_SS	Global value added of health and social services
VA_Educ	Global value added of education services
VA_Whole_Retail	Global value added of wholesale and retail services
VA_Real_Estate	Global value added of real estate services
VA_Transport_Storage	Global value added of transport and storage services
VA_Rest_Hotel	Global value added of restaurant and hotel services

NOTES: GDP = gross domestic product; KIS = knowledge-intensive services; R&D = research and development; RD = R&D; S&E = science and engineering.

SOURCE: Panel analysis and Science and Engineering Indicators 2012, see http://www.nsf.gov/statistics/seind12/tables.htm [November 2012].

are more similar. Clusters of subtopics are also observed. Expenditure variables, trade variables, and patent variables are more similar to variables within their group. This shows that variables on a subtopic relay similar information; i.e., they are proxy variables. For example, if an analyst is looking at predictor variables for a regression model and is unable to obtain data on technical staff, then data on researchers can serve as a substitute. In some ways, this relieves the burden on statistical agencies/offices trying to follow the Frascati Manual’s recommendations. Even if they fall short in collecting certain variables, similar information can be gleaned from other variables on the same topic.
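A hierarchical cluster tree of the kind discussed above can be built from a correlation matrix by treating 1 - r as a distance and repeatedly merging the closest groups. The sketch below is a minimal illustration under stated assumptions: the variable labels and correlation values are invented for the example, and single linkage is an assumed merging rule, since the linkage used for the panel's figures is not stated here.

```python
def single_linkage(labels, corr):
    """Agglomerative clustering with single linkage on the distance d = 1 - r.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [(name,) for name in labels]
    index = {name: i for i, name in enumerate(labels)}

    def dist(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(1.0 - corr[index[x]][index[y]] for x in a for y in b)

    merges = []
    while len(clusters) > 1:
        pairs = [(dist(a, b), i, j)
                 for i, a in enumerate(clusters)
                 for j, b in enumerate(clusters) if i < j]
        d, i, j = min(pairs)
        merges.append((clusters[i], clusters[j], round(d, 2)))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Invented labels and correlations, for illustration only.
labels = ["HR_1", "HR_2", "TRADE_1", "TRADE_2"]
corr = [
    [1.00, 0.95, 0.20, 0.10],
    [0.95, 1.00, 0.25, 0.15],
    [0.20, 0.25, 1.00, 0.90],
    [0.10, 0.15, 0.90, 1.00],
]

merges = single_linkage(labels, corr)
for a, b, d in merges:
    print(a, "+", b, "at distance", d)
```

In this toy matrix the two human resource variables merge first and the two trade variables second, which is the pattern of subtopic clusters the text describes. Reordering the rows and columns of the matrix by the merge order is what places strongly correlated variables next to each other along the diagonal of a cluster heat map.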

Single indicators highlighted for each subtopic as primary indicators are not shown here, as that would lead to conjecture. Nations should decide which variables to collect depending on ease of collection and budgetary constraints. The panel is not asserting that statistical offices around the world should stop collecting detailed S&T data, as the utility of variables is not limited to the ability to feed them into a regression model. National statistical offices collect detailed STI information through surveys and/or by using administrative records to answer specific policy questions, such as the mobility of highly skilled labor, the gender wage gap in S&T occupations, and the amount of investment moving into certain S&T fields. It can be said that the S&T community is interested in understanding the progress of nations in attracting the best talent, or the broad careers pursued by Ph.D. holders in particular fields, or the R&D investment in environmental projects. The main concern faced by the panel was the unavailability of detailed data as main variables undergo disaggregation. Apart from OECD and Eurostat member countries, the rest of the world has yet to keep pace in terms of capturing STI information in accordance with recommendations of the Frascati and Oslo Manuals. OECD and Eurostat have been frontrunners in pursuing valuable information, and they should be commended for their efforts. At the same time, the panel is not critical of non-OECD and non-Eurostat nations, as both data collection agencies and respondents must undergo a learning process to provide such fine data in a consistent fashion.

Figures F-14 to F-16 show a cluster heat map, a hierarchical cluster tree, and the multidimensional scaling of a Pearson correlation matrix, respectively. The input matrix consists of S&E indicators from the SEI 2012 Digest. The red and orange squares along the diagonal of the heat map show that those variables are very closely related to each other, and either they could be merged, or the most well-behaved and consistent variables among them could be selected. Figure F-15 shows clusters of indicators; Figure F-16 shows sets of indicators that are either similar or dissimilar. In these two figures, the dimensions have no interpretation, and one is looking for clusters of variables that would indicate they belong together. Indicators representing the service sector are observed to be highly correlated with each other. Indicators denoting first university degrees are closely grouped together. The same conclusion can be drawn for indicators on generation of S&E knowledge (articles and citations). Therefore, clusters of subtopics are observed, similar to those observed for STI variables from the OECD, Eurostat, and UNESCO databases. Certain indicators, such as R&D as


TABLE F-6 Statistics on U.S. R&D Expenditure Produced by NCSES, UNESCO, OECD, and Eurostat (in millions of current dollars)


National Center for Science and Engineering Statistics—National Patterns of R&D Resources

Indicator	Total R&D Expenditure	Industry-Performed R&D	Industry FFRDC-Performed R&D	Federally-Performed R&D	Universities and Colleges-Performed R&D	University and College FFRDC-Performed R&D	Nonprofit-Performed R&D	Nonprofit FFRDC-Performed R&D

Variable Name Year	NCSES_RD	NCSES_RD_IND	NCSES_RD_IND_FFRDCS	NCSES_RD_FED	NCSES_RD_UC	NCSES_RD_UC_FFRDCS	NCSES_RD_NP	NCSES_RD_NP_FFRDCS

1981	72292	50425	1385	8605	7085	2484	1784	524
1982	80748	57166	1484	9501	7603	2544	1915	536
1983	89950	63683	1585	10830	8251	2840	2176	585
1984	102244	73061	1739	11916	9154	3243	2511	620
1985	114671	82376	1863	13093	10308	3616	2761	655
1986	120249	85932	1891	13504	11540	3973	2867	541
1987	126360	90160	1995	13588	12807	4287	3013	509
1988	133881	94893	2122	14342	14221	4581	3213	510
1989	141891	99860	2195	15231	15634	4756	3669	547
1990	151993	107404	2323	15671	16939	4894	4126	636
1991	160876	114675	2277	15249	18206	5120	4652	696
1992	165350	116757	2353	15853	19388	5259	4993	748
1993	165730	115435	1965	16531	20495	5289	5267	749
1994	169207	117392	2202	16355	21607	5294	5599	758
1995	183625	129830	2273	16904	22617	5367	5827	808
1996	197346	142371	2297	16585	23718	5395	6209	772
1997	212152	155409	2130	16819	24884	5463	6626	821
1998	226457	167102	2078	17362	26181	5559	7332	843
1999	245007	182090	2039	17851	28176	5652	8207	993
2000	267983	199961	2001	18374	30705	5742	9734	1465
2001	279755	202017	2020	22374	33743	6225	11182	2192
2002	278744	193868	2263	23798	37215	7102	12179	2319
2003	291239	200724	2458	24982	40484	7301	12796	2494
2004	302503	208301	2485	24898	43122	7659	13394	2644
2005	324993	226159	2601	26322	45190	7817	14077	2828
2006	350162	247669	3122	28240	46955	7306	13928	2943
2007	376960	269267	5165	29859	49010	5567	14777	3316
2008	403040	290681	6346	29839	51650	4766	16035	3724
2009	400458	282393	6446	30901	54382	4968	17531	3835



UNESCO

Indicator	GERD	GERD Performed by Business Enterprise Sector	GERD Performed by Government Sector	GERD Performed by Higher Education Sector	GERD Performed by Private Nonprofit Sector

Variable Name Year	UN_GERD	UN_GERD_BEP	UN_GERD_GOVP	UN_GERD_HEP	UN_GERD_PNPP

1996	197792	142371	25504	23708	6209
1997	212709	155409	25801	24873	6626
1998	226934	167102	26320	26171	7341
1999	245548	182090	27041	28165	8252
2000	268121	199961	27685	30693	9782
2001	278239	202017	31358	33731	11133
2002	277066	193868	33647	37202	12349
2003	289736	200724	35703	40470	12839
2004	300293	208301	36567	43128	12297
2005	323047	226159	38526	45197	13164
2006	347809	247669	39573	46983	13584
2007	373185	269267	40472	49021	14425
2008	398194	289105	42225	51163	15701



OECD

Indicator	GERD	BERD	GBAORD	GOVERD	HERD	GERD Performed by Nonprofit Sector

Variable Name Year	OECD_GERD	OECD_BERD	OECD_GBAORD	OECD_GOVERD	OECD_HERD	OECD_GERD_PNPP

1981	72750	50425	33735	13455	7085	1784
1982	81166	57166	36115	14482	7603	1915
1983	90403	63683	38768	16294	8251	2175
1984	102874	73061	44214	18149	9154	2511
1985	115219	82376	49887	19775	10308	2761
1986	120562	85932	53249	20222	11540	2867
1987	126667	90160	57069	20686	12807	3013
1988	134202	94893	59106	21877	14220	3213
1989	142226	99860	62115	23065	15632	3669
1990	152389	107404	63781	23923	16936	4126
1991	161388	114675	65897	23858	18203	4652
1992	165835	116757	68398	24700	19385	4993
1993	166147	115435	69884	24956	20489	5267
1994	169613	117392	68331	25024	21598	5599
1995	184077	129830	68791	25813	22608	5827
1996	197792	142371	69049	25504	23708	6209
1997	212709	155409	71653	25801	24873	6626
1998	226934	167102	73569	26320	26171	7341
1999	245548	182090	77637	27041	28165	8252
2000	268121	199961	83613	27685	30693	9782
2001	278239	202017	91505	31358	33731	11133
2002	277066	193868	103057	33647	37202	12349
2003	289736	200724	114866	35703	40470	12839
2004	300293	208301	126271	36567	43128	12297
2005	325936	226159	131259	40378	45190	14209
2006	350923	247669	136019	42256	46955	14043
2007	377594	269267	141890	44474	49010	14843
2008	403668	290681	144391	45246	51650	16091
2009	401576	282393	164292	47118	54382	17683
2010			148448



Eurostat

Indicator	GERD	BERD	GBAORD	GOVERD	HERD	GERD Performed by Nonprofit Sector

Variable Name Year	EURO_GERD	EURO_BERD	EURO_GBAORD	EURO_GERD_GOVP	EURO_GERD_HEP	EURO_GERD_PNPP

1981	72750	50425	33735	13455	7085	1784
1982	81166	57166	36115	14482	7603	1915
1983	90403	63683	38768	16294	8251	2175
1984	102874	73061	44214	18149	9154	2511
1985	115219	82376	49887	19775	10308	2761
1986	120562	85932	53249	20222	11540	2867
1987	126667	90160	57069	20686	12807	3013
1988	134202	94893	59106	21877	14220	3213
1989	142226	99860	62115	23065	15632	3669
1990	152389	107404	63781	23923	16936	4126
1991	161388	114675	65897	23858	18203	4652
1992	165835	116757	68398	24700	19385	4993
1993	166147	115435	69884	24956	20489	5267
1994	169613	117392	68331	25024	21598	5599
1995	184077	129830	68791	25813	22608	5827
1996	197792	142371	69049	25504	23708	6209
1997	212709	155409	71653	25801	24873	6626
1998	226934	167102	73569	26320	26171	7341
1999	245548	182090	77637	27041	28165	8252
2000	268121	199961	83613	27685	30693	9782
2001	278239	202017	91505	31358	33731	11133
2002	277066	193868	103057	33647	37202	12349
2003	289736	200724	114866	35703	40470	12839
2004	300293	208301	126271	36567	43128	12297
2005	323047	226159	131259	38526	45197	13164
2006	347809	247669	136019	39573	46983	13584
2007	373185	269267	141890	40472	49021	14425
2008	398194	289105	144391	42225	51163	15701
2009			164292
2010			148448

NOTES: BERD = business enterprise expenditure on research and development; EURO = Eurostat; FFRDC = federally funded research and development center; GBAORD = government budget appropriations or outlays for research and development; GDP = gross domestic product; GERD = gross domestic expenditure on research and development; GOVERD = government intramural expenditure on research and development; HERD = higher education expenditure on research and development; IND = industry; NCSES = National Center for Science and Engineering Statistics; NP = nonprofit; PNPP = private nonprofit performed; R&D = research and development; RD = R&D; UC = universities and colleges; UNESCO = United Nations Educational, Scientific and Cultural Organization.

SOURCES: National Science Foundation (2012). National Patterns of R&D Resources: 2009 Data Update. NSF 12-321. National Center for Science and Engineering Statistics. Available: http://www.nsf.gov/statistics/nsf12321/ [November 2012]. UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database [November 2012].


FIGURE F-10 Heat map of U.S. R&D expenditure variables from various STI databases.
NOTES: BEP = business enterprise performed; BERD = business enterprise expenditure on research and development; EURO = Eurostat; FFRDCs = federally funded R&D centers; GBAORD = government budget appropriations or outlays for research and development; GERD = gross domestic expenditure on research and development; GOVERD = government intramural expenditure on research and development; GOVP = government sector performed; HEP = higher education sector performed; HERD = higher education expenditure on research and development; IND = industry; NCSES = National Center for Science and Engineering Statistics; NP = nonprofit; PNPP = private nonprofit performed; RD = research and development; UC = universities and colleges.
SOURCES: Panel analysis and National Science Foundation (2012). National Patterns of R&D Resources: 2009 Data Update. NSF 12-321. National Center for Science and Engineering Statistics, available: http://www.nsf.gov/statistics/nsf12321/ [November 2012]. UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database [November 2012].

a share of GDP, global high-value patents, doctoral degrees in engineering, and doctoral degrees in natural sciences, are not strongly correlated with other indicators. Hence within the set of indicators analyzed, these four indicators stand apart. The reader should not assume that these indicators are unique, because the list of indicators analyzed here is small. The uniqueness might not hold if more indicators were included in the input matrix.

Single Nation Analysis

The clusters shown in Figures F-10 and F-11 are not surprising, as sector-specific expenditure variables are clustered together; i.e., business R&D expenditure figures are similar to each other irrespective of the data source. The same conclusion can be drawn for figures on expenditures on federal R&D, nonprofit R&D, and academic R&D.
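The agreement across sources can be checked directly from Table F-6. The sketch below computes Pearson correlations between NCSES and OECD series for business and academic R&D, using the 2000-2004 rows of the table. The choice of those five years and the helper name `pearson` are assumptions made for the example; this is not the panel's own code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Values for 2000-2004 copied from Table F-6 (millions of current dollars).
ncses_rd_ind = [199961, 202017, 193868, 200724, 208301]  # NCSES_RD_IND
oecd_berd = [199961, 202017, 193868, 200724, 208301]     # OECD_BERD
ncses_rd_uc = [30705, 33743, 37215, 40484, 43122]        # NCSES_RD_UC
oecd_herd = [30693, 33731, 37202, 40470, 43128]          # OECD_HERD

r_business = pearson(ncses_rd_ind, oecd_berd)
r_academic = pearson(ncses_rd_uc, oecd_herd)
print(f"business R&D, NCSES vs OECD: r = {r_business:.4f}")
print(f"academic R&D, NCSES vs OECD: r = {r_academic:.4f}")
```

The business series are identical in the two sources for these years, and the academic series differ only slightly, so both correlations are at or very near 1. Series that trend upward over time tend to correlate strongly with one another in any case, so a high coefficient alone should be read with some caution.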

Eurostat, OECD, and UNESCO report numbers of FTE researchers for the United States, but it is not clear how that number is calculated. NCSES and NCES report head counts of S&E human resources. Therefore, a disparity is seen in


FIGURE F-11 Cluster map of U.S. R&D expenditure variables from various STI databases.
NOTES: BEP = business enterprise performed; BERD = business enterprise expenditure on research and development; EURO = Eurostat; FFRDCs = federally funded R&D centers; GBAORD = government budget appropriations or outlays for research and development; GERD = gross domestic expenditure on research and development; GOVERD = government intramural expenditure on research and development; GOVP = government sector performed; HEP = higher education sector performed; HERD = higher education expenditure on research and development; IND = industry; NCSES = National Center for Science and Engineering Statistics; NP = nonprofit; PNPP = private nonprofit performed; RD = research and development; UC = universities and colleges.
SOURCES: Panel analysis and National Science Foundation (2012). National Patterns of R&D Resources: 2009 Data Update. NSF 12-321. National Center for Science and Engineering Statistics, available: http://www.nsf.gov/statistics/nsf12321/ [November 2012]. UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database [November 2012].

the metric that is reported, as U.S. head counts represent the supply of human capital not necessarily involved in R&D. FTE researchers show the contribution of labor hours to the R&D process; hence it is important for a researcher or an S&T policy maker to understand that different data sources may appear to report the same thing, but this usually is not the case. Figures F-12 and F-13 show that variables representing head counts are not strongly correlated with FTE researchers. One advantage of having so many variables is that as a user, one can select among them depending on the question being addressed. Table F-7 shows that the whole set of variables produced by NCSES is an attempt at capturing different segments of the S&E population, which range from scientists to medical researchers. Figures F-12 and F-13 show that variables from NCSES and NCES are clustered together, with variables reporting the same indicator being more strongly correlated (see the cluster of doctorate recipients, graduate students [NCSES], and doctoral degrees [NCES]). This suggests the possibility that NCSES may be overproducing some of the S&E human capital variables. As previously mentioned, however, agencies produce variables to answer particular policy questions. The end result is a trade-off between efficiency and addressing user needs. It is commendable that NCSES has been able to satisfy academicians and policy analysts alike, but a more resourceful approach is required under current budgetary conditions.
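The difference between the two metrics is easy to see with a small worked example. The staff list below is hypothetical and serves only to illustrate the definitions: a head count tallies each person once, whereas FTE adds up the fraction of working time each person devotes to R&D.

```python
# Hypothetical staff list: (person, fraction of working time spent on R&D).
staff = [("A", 1.0), ("B", 0.5), ("C", 0.25), ("D", 0.0)]

# Head count of the qualified workforce: everyone is counted once,
# whether or not they are involved in R&D.
head_count = len(staff)

# Head count of R&D personnel: anyone who spends some time on R&D.
rd_head_count = sum(1 for _, share in staff if share > 0)

# Full-time equivalent: total labor input to R&D in person-years.
fte = sum(share for _, share in staff)

print(head_count, rd_head_count, fte)
```

Here four people yield a head count of 4, an R&D head count of 3, and 1.75 FTE, so the three figures describe the same group yet cannot be used interchangeably.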

The panel would also like to highlight the efforts of NCSES to comply more closely with the recommendations of the Frascati Manual. The Survey of Industrial Research and Development (SIRD) (the old industrial survey) questionnaire contained items on FTE R&D scientists and engineers only. NCSES decided to resolve this data gap by including questions on researchers (FTE) and R&D personnel (head count) by gender; occupation (scientists and engineers, technicians, support staff); and location, including foreign locations. With the new data, it is possible to generate tabulations, for example, on female technicians working in Belgium. The Survey of Research and Development Expenditures at Universities and Colleges (the old academic survey) contained a serious data gap in terms of information on R&D personnel in the academic sector. In 2010, NCSES began using the HERD survey to collect researcher and R&D personnel head counts. The HERD redesign investigation process indicated that collecting FTE data would be highly problematic,


whereas collecting principal investigator data appeared to be rather reasonable. Therefore, obtaining information on FTE researchers or R&D personnel in the academic sector still is not possible, but one can obtain head counts of researchers, principal investigators, and R&D personnel.

One point that came to the panel’s attention is that NCSES does not publish its main STI indicators on a single webpage. For national R&D expenditures, a user accesses National Patterns, while for human capital in S&E, one must generate tables from SESTAT. For further detail on academic R&D expenditures, WebCASPAR serves as a more useful tool. IRIS contains historical data tables on industrial R&D expenditures. SESTAT data feed into various NCSES publications, including (1) Characteristics of Scientists and Engineers in the U.S.; (2) Characteristics of Doctoral Scientists and Engineers in the U.S.; (3) Doctoral Scientists & Engineers Profile; (4) Characteristics of Recent College Grads; (5) Women, Minorities, and Persons with Disabilities in Science and Engineering; and (6) various InfoBriefs. It is difficult to find summary tables that combine information across all of these publications. WebCASPAR contains detail on SEH degrees that is not available in SESTAT. When staff of the Committee on National Statistics downloaded STI databases of other agencies/organizations, they had an easier task because all variables were available on a single webpage.


TABLE F-7 Statistics on U.S. Science and Engineering Human Resources Produced by NCSES, NCES, UNESCO, OECD, and Eurostat


Indicator	National Center for Science and Engineering Statistics
	SESTAT				WEBCASPAR
	Scientists and Engineers	Doctoral Scientists and Engineers	S&E Bachelor’s Degree Recipients	S&E Master’s Degree Recipients	S&E Doctorate Recipients (Includes Medical and Other Life Sciences)	SEH Graduate Students	SEH Postdoctorates	SEH Nonfaculty Research Staff

Variable Name Year	SE	DSE	RCG_BACH	RCG_MAST	DOCREP_SE	GRADSTUD	POSTDOC	NONFACULTY_RES_STAFF

1990					23823	452113	29565	5255
1991			308500	57000	25060	471212	30865	5478
1992					25785	493522	32747	5482
1993	11615200	513460	348900	73200	26640	504304	34322	6001
1994					27500	504399	36377	6209
1995	12036200	542540	354450	74750	27864	499640	35926	6534
1996			354450	74750	28564	494079	37107	6604
1997	12530700	582080	371500	78500	28650	487208	38481	6722
1998			371500	78500	28773	485627	40086	7100
1999	13050800	626700	379150	80050	27338	493256	40800	7573
2000			379150	80050	27557	493311	43115	7879
2001		656550	468850	123350	27069	509607	43311	7531
2002			468850	123350	26263	540404	45034	7906
2003	21647000	685300	521833	138967	26916	567121	46728	8473
2004			521833	138967	27993	574463	47240	9075
2005			521833	138967	29768	582226	48555	9527
2006	22630000	711800	467000	102000	31774	597643	49343	10814
2007			467000	102000	33974	619499	50840	10752
2008	10204000	752000			34926	631489	54164	13747
2009					35562	631645	57805	14059
2010					35253	632652	63415



Indicator	National Center for Science and Engineering Statistics
	Women, Minorities and Persons with Disabilities in Science and Engineering			International Organizations
				UNESCO	OECD	Eurostat
	S&E Bachelor’s Degrees	S&E Master’s Degrees	S&E Doctoral Degrees	Researchers (FTE)	Researchers (FTE)	Researchers in Business Enterprise Sector (FTE)	Researchers (FTE)

Variable Name Year	BACH_SE	MAST_SE	DOC_SE	UN_RES_FTE	OECD_RES_FTE	EURO_RES_BEMPOCC_FTE	EURO_RES_PERSOCC_FTE

1990	329094	77788	22868			758500
1991	337675	78368	24023		981659	776400	981659
1992	355265	81107	24675			772000
1993	366035	86425	25443		1013772	766600	1013772
1994	373261	91411	26205			757300
1995	378148	94309	26536		1035995	789400	1035995
1996	384674	95313	27243			859300
1997	388482	93485	27232	1159908	1159908	918600	1159908
1998	390618	93918	27278			997700
1999			25933	1260920	1260920	1033700	1260920
2000	398622	95683	25966	1293582	1293582	1041300	1293582
2001	400435	99528	25453	1320305	1320305	1060000	1320096
2002	415983	99650	24254	1342454	1342454	1075300	1342454
2003	442755	108355	25425	1430551	1430551	1156000	1430551
2004	458658	119296	26573	1384536	1384536	1111300	1384536
2005	470214	120870	28561	1375304	1375304	1097700	1375304
2006	478858	120999	30452	1414341	1414341	1135500	1414341
2007	485772	120278	32588	1412639	1412639	1130500	1412639
2008	496168	126404	33359
2009	505435	134517	33284

NOTES: BACH = bachelor’s degrees; BEMPOCC = business enterprise sector; DOCREP = doctorate recipients; DSE = doctoral scientists and engineers; EURO = Eurostat; FTE = full-time equivalent; GRADSTUD = graduate students; MAST = master’s degrees; PERSOCC = researchers FTE; POSTDOC = postdoctorates; RCG = recent college graduates; RES = researchers; S&E = science and engineering; SEH = science, engineering, and health; SESTAT = Scientists and Engineers Statistical Data System; UN = United Nations; UNESCO = United Nations Educational, Scientific and Cultural Organization.

SOURCES: WebCASPAR, see https://webcaspar.nsf.gov [November 2012]. UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database [November 2012].

FIGURE F-12 Heat map of U.S. human capital variables from various STI databases.
NOTES: BACH = bachelor’s degrees; BEMPOCC = business enterprise sector; DOC = doctorate; DOCREP = doctorate recipients; DSE = doctoral scientists and engineers; EURO = Eurostat; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; GRADSTUD = graduate students; MAST = master’s degrees; PERSOCC = researchers FTE; RCG = recent college graduates; RES = researchers; SE = science and engineering; UN = United Nations.
SOURCES: Panel analysis and WebCASPAR, see https://webcaspar.nsf.gov [November 2012]. UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database [November 2012].
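
As a rough illustration of the kind of input a heat map such as Figure F-12 can summarize, the sketch below computes pairwise Pearson correlations for three of the human capital series tabulated above, using 2000 through 2007, the years in which all three are reported. The use of Pearson correlation, the choice of variables, and the helper names `pearson` and `corr_matrix` are assumptions made for illustration; the source does not state that the panel's heat map was computed this way.

```python
from math import sqrt

# Values transcribed from the human capital table above (2000-2007).
series = {
    "BACH_SE": [398622, 400435, 415983, 442755, 458658, 470214, 478858, 485772],
    "DOC_SE": [25966, 25453, 24254, 25425, 26573, 28561, 30452, 32588],
    "OECD_RES_FTE": [1293582, 1320305, 1342454, 1430551,
                     1384536, 1375304, 1414341, 1412639],
}


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def corr_matrix(data):
    """Nested dict of pairwise correlations (hypothetical helper)."""
    names = list(data)
    return {a: {b: pearson(data[a], data[b]) for b in names} for a in names}


matrix = corr_matrix(series)
for name, row in matrix.items():
    # One row of the matrix per line; a heat map would color these cells.
    print(f"{name:>14}", " ".join(f"{row[other]:6.2f}" for other in matrix))
```

Each cell of the resulting matrix could be mapped to a color to give a simple correlation heat map; highly correlated pairs are the candidates for redundancy that the appendix discusses.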

FIGURE F-13 Cluster map of U.S. human capital variables from various STI databases.
NOTES: BACH = bachelor’s degrees; BEMPOCC = business enterprise sector; DOC = doctorate; DOCREP = doctorate recipients; DSE = doctoral scientists and engineers; EURO = Eurostat; FTE = full-time equivalent; GERD = gross domestic expenditure on research and development; GRADSTUD = graduate students; MAST = master’s degrees; PERSOCC = researchers FTE; RCG = recent college graduates; RES = researchers; SE = science and engineering; UN = United Nations.
SOURCES: Panel analysis and WebCASPAR, see https://webcaspar.nsf.gov [November 2012]. UNESCO, see http://www.uis.unesco.org/ScienceTechnology/Pages/default.aspx [November 2012]. OECD, see http://stats.oecd.org/Index.aspx?DataSetCode=MSTI_PUB [November 2012]. Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database [November 2012].

TABLE F-8 Innovation Statistics from NCSES (2006-2008) and Statistics Canada (2003)


United States: National Center for Science and Engineering Statistics

Industry Classification	2006-2008: Number of Companies Who Answered Yes to Innovating
Industry Classification	Any Good/Service	Goods	Services	Any Process	Mfgr./Production Methods	Logistic/Delivery/Distribution Methods	Support Activities

Manufacturing industries, 31–33	27962	22878	12710	27962	22878	8897	16523
Food, 311	1547	1456	455	1547	1183	546	637
Beverage/tobacco products, 312	204	156	72	180	108	72	96
Textile/apparel/leather and allied products, 313–16	1159	915	549	1098	793	427	732
Wood products, 321	549	366	305	976	793	183	366
Chemicals, 325	2419	1947	1062	2006	1298	1003	1180
Pharmaceuticals/medicines, 3254	675	360	405	630	270	405	480
Other, 325	1760	1584	660	1364	1012	616	704
Plastics/rubber products, 326	1464	1281	671	1708	1464	488	915
Nonmetallic mineral products, 327	702	594	270	756	594	216	432
Primary metals, 331	357	273	231	399	357	84	231
Fabricated metal products, 332	4176	2871	2349	5742	4959	1305	3132
Machinery, 333	3120	2760	1320	2880	2520	600	1800
Computer/electronic products, 334	3150	3010	1260	2310	1820	770	1260
Computers/peripheral equipment, 3341	336	282	156	276	186	114	126
Communications equipment, 3342	408	408	168	264	200	56	136
Semiconductor/other electronic components, 3344	675	625	250	625	475	175	350
Navigational/measuring/electromedical/control instruments, 3345	1534	1508	624	1040	936	416	572
Other, 334	185	175	65	70	50	15	50
Electrical equipment/appliance/components, 335	1036	1008	308	784	672	336	588
Transportation equipment, 336	1512	1350	594	1242	972	270	810
Motor vehicles/trailers/parts, 3361–63	792	726	264	726	627	99	462
Aerospace products/parts, 3364	288	261	171	225	144	63	180
Other, 336	455	403	156	325	195	104	208
Furniture/related products, 337	1092	1014	468	1482	936	390	936
Manufacturing nec, other 31–33	5302	4097	2892	5543	4097	1687	3374
Nonmanufacturing industries, 21–23, 42–81	113432	42537	99253	113432	28358	42537	85074
Information, 51	6930	3696	5775	4620	1617	2310	3696
Software publishers, 5112	3080	2320	2240	2120	760	880	1720
Telecommunications/Internet service providers/Web search portals/data processing services, 517–18	2331	945	2142	1386	441	693	1260
Other, 51	1548	516	1419	1290	516	903	774
Finance/insurance, 52	4472	559	4472	4472	559	1118	3913
Real estate/rental/leasing, 53	3430	2940	2450	2940	980	980	2940
Professional/scientific/technical services, 54	22568	10416	20832	20832	6944	8680	17360

Computer systems design/related services, 5415	6790	3880	5820	4850	1552	1358	4462
Scientific R&D services, 5417	1023	744	682	806	465	186	434
Other, 54	15110	6044	13599	15110	6044	6044	12088
Health care services, 621–23	18350	5505	16515	14680	5505	5505	12845
Nonmanufacturing nec, other 21–23, 42–81	55968	27984	46640	65296	18656	18656	46640

Canada: Statistics Canada

Industry Classification	2003: Percentage of Business Units
Industry Classification	Both Product and Process Innovators	Innovators	Process Innovators	Process Innovators Only	Product Innovators	Product Innovators Only

Air transportation [481]	22.4	36.7	32.7	10.2	26.5	4.1
Airport operations [48811]	17.1	46.3	41.5	24.4	22.0	4.9
Cable and other program distribution [5175]	42.8	66.5	42.8	0	66.5	23.7
Computer and communications equipment and supplier wholesaler-distributors [4173]	28.2	65.1	37.3	9.1	56.0	27.8
Computer systems design and related services [54151]	35.4	87.2	42.0	6.6	80.6	45.2
Contract drilling (except oil and gas) [213117]	14.3	32.1	17.9	3.6	28.6	14.3
Data processing, hosting, and related services [5182]	50.0	72.4	63.8	13.8	58.6	8.6
Electronic and precision equipment repair and maintenance [8112]	18.4	53.3	33.4	15.1	38.3	19.9
Engineering services [54133]	21.1	55.3	32.0	10.9	44.5	23.4
Environmental consulting services [54162]	32.8	67.3	45.9	13.1	54.2	21.4
Geophysical surveying and mapping services [54136]	14.6	57.8	41.4	26.8	31.0	16.4
Industrial design services [54142]	27.6	53.9	31.3	3.7	50.2	22.5
Information and communication technology (ICT) service industries	37.2	78.2	44.1	6.9	71.3	34.1
Internet service providers [518111]	58.2	75.4	61.2	3	72.4	14.2
Interurban and rural bus transportation [4852]	18.8	43.8	25	6.3	37.5	18.8
Management consulting services [54161]	26.5	44.1	35	8.5	35.7	9.1
Management, scientific and technical consulting services [5416]	26.6	47.1	35.9	9.2	37.8	11.2
Office and store machinery and equipment wholesalers-distributors [41791]	37.2	61.8	42.7	5.5	56.3	19.1

Office machinery and equipment rental and leasing [53242]	30.0	52.6	37.7	7.7	44.9	14.9
Other machinery, equipment and supplies wholesaler-distributors [4179]	26.0	63.8	33.7	7.7	56.1	30.1
Other scientific and technical consulting services [54169]	23.8	52.2	35.1	11.3	40.9	17.2
Other support activities for mining [213119]	20.0	34.5	29.1	9.1	25.5	5.5
Other telecommunications [5179]
Port and harbour operations [48831]	20.7	41.4	41.4	20.7	20.7	0
Rail transportation [482]	0	53.3	33.3	33.3	20	20
Research and development in the physical, engineering and life sciences [54171]	32.2	68.1	44.3	12.1	56.1	23.9
Research and development in the social sciences and humanities [54172]	21.2	60.1	50.5	29.3	30.8	9.6
Satellite telecommunications [5174]	62.7	100	73.7	11.1	88.9	26.3
Scientific research and development services [5417]	30.1	66.6	45.5	15.4	51.3	21.1
Software publishers [5112]	53.1	94.3	59.3	6.2	88.1	35.0
Support activities for forestry [1153]	10.3	28.7	25.0	14.7	13.9	3.6
Surveying and mapping (except geophysical) services [54137]	23.6	51.2	48.2	24.6	26.6	3.0
Telecommunications resellers [5173]	29.4	74.5	29.4	0	74.5	45.0
Testing laboratories [54138]	20.0	51.9	33.5	13.5	38.4	18.4
Truck transportation [484]	10.9	25.7	20.8	9.9	15.8	5.0
Water transportation [483]	8.3	20.8	16.7	8.3	12.5	4.2
Web search portals [518112]
Wired telecommunications carriers [5171]	57.8	75.4	60.5	2.6	72.8	15.0
Wireless telecommunications carriers (except satellite) [5172]	43.5	60.0	49.1	5.6	54.4	10.9

SOURCES: Adapted from National Science Foundation (2010). NSF Releases New Statistics on Business Innovation. NSF 11-300. National Center for Science and Engineering Statistics, available: http://www.nsf.gov/statistics/infbrief/nsf11300/ [November 2012]. Statistics Canada, adapted from CANSIM Table 358-0032, Survey of Innovation, selected service industries, percentage of innovative business units [November 2012].
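
The Statistics Canada columns in Table F-8 appear to satisfy simple accounting identities, which can serve as a transcription check. The sketch below assumes (this is inferred from the column labels, not stated in the source) that innovators are the sum of three mutually exclusive groups, and that process and product innovators each equal the "both" group plus the corresponding "only" group. The function name `check_row` and the 0.15 percentage-point rounding tolerance are illustrative choices.

```python
# Rounding tolerance in percentage points (an assumed value).
TOLERANCE = 0.15

# Column order follows Table F-8: both, innovators, process,
# process only, product, product only. Values copied from the table.
rows = {
    "Air transportation [481]": (22.4, 36.7, 32.7, 10.2, 26.5, 4.1),
    "Computer systems design and related services [54151]":
        (35.4, 87.2, 42.0, 6.6, 80.6, 45.2),
    "Rail transportation [482]": (0, 53.3, 33.3, 33.3, 20, 20),
    "Software publishers [5112]": (53.1, 94.3, 59.3, 6.2, 88.1, 35.0),
}


def check_row(both, innovators, process, process_only, product, product_only):
    """Return the absolute gap for each assumed identity."""
    return {
        "innovators": abs(innovators - (both + process_only + product_only)),
        "process": abs(process - (both + process_only)),
        "product": abs(product - (both + product_only)),
    }


results = {name: check_row(*values) for name, values in rows.items()}
for name, gaps in results.items():
    ok = all(gap <= TOLERANCE for gap in gaps.values())
    print(f"{name}: {'consistent' if ok else 'check transcription'}")
```

Rows that fail such a check would point to a transcription or rounding problem rather than to anything about the underlying survey.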

TABLE F-9 Innovation Statistics from Eurostat, 2006 and 2008


	European Union: Eurostat

			2006: Number of Enterprises with Type of Innovation
Industry Classification			Technological Innovation	Novel Innovators, Product Only	Novel Innovators, Process Only	Introduced Organizational/Marketing Innovation

Agriculture, forestry and fishing
Mining and quarrying			1402	239	559	652
Manufacturing			158629	35531	42018	76297
Electricity, gas, steam and air conditioning supply			2340	228	1191	1233
Water supply; sewerage, waste management and remediation activities
Construction			19202	6528	7809	1751
Wholesale and retail trade; repair of motor vehicles and motorcycles			42233	7413	18292	15994
Transportation and storage
Accommodation and food service activities
Information and communication
Financial and insurance activities
Real estate activities
Professional, scientific and technical activities
Administrative and support service activities
Hotels and restaurants			5422	1306	2999	333
Transport, storage and communication			24702	4304	7301	13065
Financial intermediation			8792	1416	2200	4847
Real estate, renting and business activities			22748	4849	6395	6442

Industry Classification	2008: Number of Enterprises with Type of Innovation
Industry Classification	Innovation Activity	Technological Innovation Only		Innovation Activity	Novel Innovator, Product Only	Novel Innovators, Process Only

Agriculture, forestry and fishing	2799	1223		579	524	862
Mining and quarrying	4072	595		523	181	648
Manufacturing	446126	50774		39981	36255	44047
Electricity, gas, steam and air conditioning supply	4919	541		792	208	682
Water supply; sewerage, waste management and remediation activities	13891	1785		1719	702	1784
Construction	42042	9470		17805	3716	11230
Wholesale and retail trade; repair of motor vehicles and motorcycles	75102	12742		30184	7422	14042
Transportation and storage	72156	6310		12699	2822	8501
Accommodation and food service activities	15062	2492		6628	939	2048
Information and communication	27343	5300		4337	6748	2874
Financial and insurance activities	28580	1655		3065	1948	2483
Real estate activities	2631	361		1198	327	588
Professional, scientific and technical activities	19809	4521		5565	2978	3870
Administrative and support service activities	7909	1563		3557	910	1947
Hotels and restaurants
Transport, storage and communication
Financial intermediation
Real estate, renting and business activities

SOURCES: Eurostat, see http://epp.eurostat.ec.europa.eu/portal/page/portal/science_technology_innovation/data/database [November 2012].

FIGURE F-14 Heat map of science and engineering indicators from SEI 2012 Digest.
NOTES: GDP = gross domestic product; KIS = knowledge-intensive services; RD = research and development; S&E = science and engineering; SS = social services; VA = global value added.
SOURCE: Panel analysis and Science and Engineering Indicators 2012, see http://www.nsf.gov/statistics/seind12/tables.htm [November 2012].

FIGURE F-15 Hierarchical cluster of science and engineering indicators from SEI 2012 Digest.
NOTES: GDP = gross domestic product; KIS = knowledge-intensive services; RD = research and development; S&E = science and engineering; SS = social services; VA = global value added.
SOURCE: Panel analysis and Science and Engineering Indicators 2012, see http://www.nsf.gov/statistics/seind12/tables.htm [November 2012].

FIGURE F-16 Multidimensional scaling of science and engineering indicators from SEI 2012 Digest.
NOTES: GDP = gross domestic product; KIS = knowledge-intensive services; RD = research and development; S&E = science and engineering; SS = social services; VA = global value added.
SOURCE: Panel analysis and Science and Engineering Indicators 2012, see http://www.nsf.gov/statistics/seind12/tables.htm [November 2012].
