This chapter reviews opportunities to enhance data analysis and dissemination efforts and increase outreach to stakeholders. The observations herein are based on input provided during the data user workshop held for this study and throughout the panel’s work, as well as on the panel members’ own experience. It is important to note that NCSES has always actively disseminated information about its surveys, data, and data products, and continues to do so despite staff resource limitations. For example, the agency recently published an Info Brief on expansion of the data from the Survey of Doctorate Recipients (SDR) (National Center for Science and Engineering Statistics, 2017a), a potentially impactful way to distribute some of the findings based on the new data, as well as discuss the new methodology. The use of Webinars to disseminate information about new reports, such as the recent Webinar on the Women, Minorities, and Persons with Disabilities in Science and Engineering (WMPD) report, is another example of the agency’s useful outreach efforts. As mentioned in Chapter 3, NCSES also recently launched a project focused on further expanding stakeholder outreach.
The panel acknowledges the staff resource limitations that make it difficult for NCSES to increase its emphasis on additional analysis and dissemination. The recommendations in this chapter focus primarily on strategies that could leverage existing mechanisms and build collaborations that could grow and strengthen the surveys’ stakeholder base with relatively low resource investments.
To fulfill its mandate of providing information about the state of science and engineering in the United States, NCSES produces in a typical year more than 30 reports based on its science and engineering workforce surveys. These include two flagship publications that are congressionally mandated: the Science and Engineering Indicators (SEI) and WMPD reports. The SEI report is prepared under the guidance of the National Science Board (NSB) and is updated biennially with new data to inform policy decisions. WMPD, a report prepared by NCSES and mandated by the Science and Engineering Equal Opportunities Act, is focused on the participation of women, minorities, and persons with disabilities in science and engineering education and employment. The WMPD report also is updated biennially.
In addition, NCSES issues companion pieces to these two main reports. For example, there are Digests for the SEI and WMPD reports, which are condensed versions of the longer reports and highlight some of the most important indicators. Revisiting the STEM Workforce, NSB’s companion piece to the 2014 SEI report, presents 2014 SEI data in the context of policy questions and illustrates ways in which the data can inform policy debates. In addition to reports, NCSES publishes shorter information briefs that summarize issues and trends based on the data. A variety of tables also are regularly published.
For researchers who wish to conduct their own analyses, the data are available in a variety of ways:
- public-use microdata files (available to download through the NCSES Website);
- public-use online data tools (e.g., WebCASPER, Scientists and Engineers Statistical Data System [SESTAT], SED Tabulation Engine);
- restricted-use license agreements, available for access to the National Survey of Recent College Graduates (NSRCG), SDR, SESTAT integrated file, and Doctorate Records File/Survey of Earned Doctorates (SED); and
- Federal Statistical Research Data Centers (FSRDCs).
FSRDCs are managed by the Census Bureau and typically entail partnerships between federal statistical agencies and research institutions, such as universities. Currently, only National Survey of College Graduates (NSCG) data are available through FSRDCs, but NCSES is involved in discussions about expanding the use of FSRDCs for other NCSES surveys as well. A data enclave also is being tested.
As discussed in Chapter 3, NCSES has to balance requests from a variety of stakeholders when deciding on the content (and to some extent the design) of its surveys. The prioritization of resources invested in data dissemination and outreach also deserves careful consideration based on stakeholder needs, including sufficient attention to building new stakeholder groups.
From the perspective of NCSES’s mandate, the flagship reports are a priority. Beyond those reports, additional efforts invested in analysis and broader dissemination of findings would represent opportunities for NCSES to increase awareness of the data and their impact, especially among policy makers. On the other hand, researchers have an increasing need for access to the microdata, and in particular the restricted-use data. The remainder of this chapter addresses two broad categories of stakeholder needs: (1) access to microdata, and (2) availability of publications, which for the purposes of this discussion includes preformatted data tables.
The NCSES science and engineering workforce surveys are of interest to a large group of stakeholders who would like to use the data for in-depth or customized analysis. Many of these stakeholders are academic researchers, but data users from many other sectors, such as nonprofit groups, professional societies, and the media, as well as international users, have this same need.
Historically, some of the data user needs could not be met even with access to the microdata files because of the small sample sizes in the public-use files. Given the types of highly specialized work being conducted by many scientists and engineers, some workshop participants expressed the need to be able to tabulate and model the data at a finer level than what has been available through the public-use files. For example, they would like to be able to separate doctorate recipients in nuclear engineering from other types of engineers to model the supply for clients such as the Department of Energy. They also would like to be able to conduct more detailed analyses by demographic group. For example, they would like to be able to answer questions for minority females, not just minorities and females separately (National Academy of Sciences et al., 2011; Pearson et al., 2015; Slaughter et al., 2015). In the case of the SDR, more of these data will become available as a result of the redesigned sample. NCSES is also in the process of reevaluating its disclosure control practices to determine whether more data could be included in the public-use microdata file in general.
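The sample size problem described above can be illustrated with a minimal sketch. It uses synthetic records and a hypothetical minimum cell size of 10; neither reflects actual NCSES data or disclosure control rules. The point is only that a field such as nuclear engineering may have enough respondents when tabulated alone but too few once the table is crossed with sex and minority status.

```python
from collections import Counter

# Synthetic respondent records (hypothetical; not NCSES data).
# Each record is (field, sex, minority_status).
records = (
    [("nuclear engineering", "female", "minority")] * 3
    + [("nuclear engineering", "female", "non-minority")] * 12
    + [("nuclear engineering", "male", "minority")] * 9
    + [("nuclear engineering", "male", "non-minority")] * 60
    + [("other engineering", "female", "minority")] * 40
    + [("other engineering", "female", "non-minority")] * 210
    + [("other engineering", "male", "minority")] * 130
    + [("other engineering", "male", "non-minority")] * 900
)

MIN_CELL_SIZE = 10  # hypothetical threshold, chosen only for illustration


def small_cells(rows, dims, threshold=MIN_CELL_SIZE):
    """Return {cell: count} for cells with fewer than `threshold` respondents.

    `dims` lists the tuple positions to cross-tabulate, e.g. (0,) for field
    alone or (0, 1, 2) for field x sex x minority status.
    """
    counts = Counter(tuple(row[d] for d in dims) for row in rows)
    return {cell: n for cell, n in counts.items() if n < threshold}


one_way = small_cells(records, dims=(0,))
three_way = small_cells(records, dims=(0, 1, 2))
print("Small cells, field only:", one_way)
print("Small cells, field x sex x minority status:", three_way)
```

In this synthetic example no cell falls below the threshold when the data are tabulated by field alone, but two of the nuclear engineering cells do once sex and minority status are added, which is the kind of detail that a public-use file may have to suppress or coarsen.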
Some existing research questions, including the need for more in-depth research on small population groups, such as racial or ethnic minorities, could be answered with the available restricted data but are not pursued because access to these data has been limited or cumbersome in the past. Increasing the number of datasets available through FSRDCs would alleviate this problem to some extent, but continued collaboration with other agencies also is important to streamline the processes for accessing the data so as to alleviate the burden associated with establishing data access for researchers. In particular, technological innovations are making it easier to enable researchers to conduct analyses of the restricted data in a secure environment, without the need to travel.
Researchers would find it useful if linkages to other datasets were easier. Such requests currently are evaluated on a case-by-case basis, and the linkage is carried out by NCSES or an NCSES contractor. This approach is time-consuming and also makes replication difficult. A possible way to streamline the process would be to support research centers that could assist with these tasks. Many universities already are involved with the FSRDCs, and indeed, approximately 90 percent of the restricted-use licenses are granted to academic researchers; thus it might be advantageous to locate research centers in academic institutions. The centers also could take the lead on analysis projects and conduct their own dissemination. Importantly, these types of centers would lead to the growth of research communities around the surveys, starting with faculty and graduate students who wanted to conduct research on these topics. It might be particularly useful to consider establishing such a center at a minority-serving institution, which could not only increase the use of the data for research related to underrepresented groups but also boost the traditionally lower response rates among these groups.
RECOMMENDATION 6-1: The National Center for Science and Engineering Statistics should continue to add more of the data from its surveys to Federal Statistical Research Data Centers and collaborate with other government agencies on the development of procedures for streamlining data access, particularly remote access for data users.
A new initiative in this area is the Institute for Research on Innovation and Science, a research center at the University of Michigan that gathers administrative data from a consortium of member academic institutions and produces secondary research data based on those records. Currently, about 30 institutions are participating in this initiative, and various associations are encouraging broader participation.
The National Science Foundation (NSF) supports a variety of centers of excellence, which are a mechanism for fostering interdisciplinary research. One or more centers of excellence focused on science and engineering statistics could fulfill a variety of functions: (1) conducting analysis on topics of interest related to the science and engineering workforce; (2) conducting methodological research, such as the small area research described in Chapter 4; (3) performing linkages to administrative data and investigating options for expanding opportunities in this area; and (4) supporting the development of the next generation of researchers (in other words, students). While NCSES has limited in-house resources for these types of activities, centers of excellence may be feasible to fund.
RECOMMENDATION 6-2: The National Center for Science and Engineering Statistics (NCSES) should consider establishing research centers of excellence. These research centers could carry out research and dissemination activities for which NCSES does not always have the funds and could facilitate the growth of research communities around the surveys.
Given the resource constraints for in-house analysis activities in particular, developing additional programs and cultivating relationships that would support analyses conducted by other researchers would be worthwhile. In addition to research centers, NCSES could consider supporting analytic grants and establishing graduate or postdoctoral fellowships, dissertation grants/awards, or an intern program. Hosting visiting scholars for brief 1- to 2-month periods also could help expand the network of data users. An important aspect of the success of these types of programs is ensuring that quick data access is facilitated and that researchers can conduct additional analyses after their participation in the program has ended should the need arise—for example, when revising a paper for publication in a journal.
An additional low-cost option would be to host researchers from other NSF units for short periods of time to conduct substantive research. NSF has a tradition of encouraging rotating assignments, and NCSES has in the past collaborated on substantive research with other NSF units. NCSES could invite staff from other units to spend brief, 2–3 month rotations in NCSES, familiarizing themselves with NCSES data and preparing reports relevant to their respective fields. NSF units that might be interested in collaborations of this type include the Division of Graduate Education within the Directorate for Education and Human Resources, the Division of Mathematical Sciences within the Directorate for Mathematical and Physical Sciences, the Directorate for Engineering, and the Office of International Science and Engineering.
NCSES also could increase its reliance on the data collection contractors to carry out analysis tasks. Outreach is needed to encourage and support research by and on underrepresented groups, especially racial/ethnic minorities and persons with disabilities.
RECOMMENDATION 6-3: To increase awareness of the usefulness of its survey data, the National Center for Science and Engineering Statistics (NCSES) should develop programs that support data analyses conducted by other researchers. Such programs could include graduate or postdoctoral fellowships, dissertation awards, a visiting scholar program, an intern program, collaborations with researchers from other National Science Foundation divisions, and analytic grants. NCSES also could commission from its data collection contractors analysis tasks focused on dissemination goals. Special emphasis should be placed on supporting research on underrepresented groups, especially racial/ethnic minorities and persons with disabilities.
An important factor in the satisfaction of data users is thorough documentation of the data, including information on the complex sample designs, and a clear discussion of the limitations of the data. Inadequate documentation can lead to analytic errors in research that relies on secondary analysis of the microdata (for a meta-analysis of these types of errors in secondary research using the NCSES surveys, see West et al.). NCSES could consider hosting an online forum to facilitate discussion among researchers with questions related to data analysis. Existing online communities that have well-established mechanisms for sharing information about the unique characteristics of a dataset, sharing code, and discussing best practices could serve as models.1 Online forums require an initial investment and some active engagement on the part of staff to encourage participation and answer questions, but once the communities have been established, they offer several benefits. The questions and answers are preserved in archives, so staff should not have to answer the same question repeatedly. In well-functioning online communities, moreover, users can answer each other’s questions in some cases, so the staff burden is further reduced. In addition, online forums can serve as vehicles for announcements (for example, about new datasets, new reports, upcoming conferences, and so on), and also can be a source of input to NCSES about matters of interest to data users.
1 Examples of online support communities include the forums associated with providers of online courses, such as Udacity (https://discussions.udacity.com), and the forums for Stan, an open-source statistical analysis software package (http://discourse.mc-stan.org/categories); TensorFlow, an open-source software package for machine learning, released by Google (https://groups.google.com/a/tensorflow.org/forum/#!forum/discuss); Ubuntu, an open-source operating system (https://ubuntuforums.org); and Trellis, the platform created by the American Association for the Advancement of Science. All URLs active as of January 2018.
NCSES indicated that one of its long-term goals is to support longitudinal research with the data from its science and engineering workforce surveys, and this goal could be a focus of a center of excellence. Although some of the existing weighting challenges are unique to the NCSES surveys, the agency could benefit from input from those who have conducted other longitudinal studies, such as the Health and Retirement Study, the National Longitudinal Surveys, and the Wisconsin Longitudinal Study, about the types of issues they encountered and their experiences with enabling data access.
RECOMMENDATION 6-4: The National Center for Science and Engineering Statistics should continue making its work toward enabling longitudinal research based on the survey data a priority and should seek input from other agencies that provide access to longitudinal data.
RECOMMENDATION 6-5: The National Center for Science and Engineering Statistics should continue to prioritize the timeliness of its data releases and the provision of clear documentation of the data and should consider hosting an online forum or providing some other mechanism that can facilitate discussion among users of the survey data.
Because the SEI and WMPD reports are congressionally mandated, their publication is a priority for NCSES. The reports have been transitioned to electronic-only form in recent years, but NCSES still publishes the Digest versions of these reports in hard-copy form. This approach likely represents an appropriate compromise reflecting the different functions served by the full-length and condensed versions. Distributing the hard-copy Digest can increase the visibility of these reports, while the full electronic reports can be browsed for additional detail. Accessing the reports online also draws attention to the wealth of additional resources, including data tables, available on the NCSES Website.
The SEI and WMPD reports are updated every 2 years. With the increasing availability of data from a variety of sources at a rapid pace and the associated changes in expectations, the most important indicators likely will have to continue to be released with at least this frequency. In general with respect to survey data, whether in reports, tables, or other forms, timeliness is a need expressed repeatedly by stakeholders.
With the anticipated enhancements to the longitudinal data for the SDR, integrating longitudinal analyses into the SEI and WMPD reports could enhance the usefulness of these reports, and perhaps stimulate increased appreciation and use of the longitudinal microdata. These types of analyses conducted by the staff would also further facilitate the staff’s understanding of the strengths and limitations of the longitudinal data from a user perspective.
RECOMMENDATION 6-6: The National Center for Science and Engineering Statistics should facilitate the presentation of results of longitudinal analyses in the Science and Engineering Indicators and Women, Minorities, and Persons with Disabilities in Science and Engineering reports.
In the past, NCSES used e-mail blasts to announce the release of new reports or other data products. When used correctly, a mailing list can be an efficient way to communicate actively with stakeholders and to increase the visibility and use of data products. Continuing these mailings is likely worthwhile, and their impact could be enhanced by highlighting the usefulness of the data. For example, the messages could include two or three findings likely to be of particular interest to policy makers or the media.
An area in which uses of the data and communications with stakeholders could be expanded is state-level policy making. For example, there is interest in the number of students educated in science and engineering fields in public colleges and universities and the number of students who subsequently enter the workforce in the same state. NCSES already produces “state profiles,” and these could be disseminated more actively to state representatives, congressional delegations, governors’ offices, the National Governors Association, coordination boards, and state data centers. One workshop participant suggested that developing a smartphone app would be particularly useful for introducing busy state policy makers to key data elements because anyone could quickly pull up relevant data during a conversation with a lawmaker.
Although government employees’ ability to attend professional association meetings is often limited, participation in these types of meetings can increase awareness of the NCSES surveys. In addition, many events, particularly in the Washington, DC, area, such as meetings of the American Association for the Advancement of Science, can provide opportunities to maintain staff awareness of new policy issues, the research landscape, and data needs, which can guide the prioritization of proactive dissemination efforts. Attendance at professional meetings also can provide ideas for increased communications about the data from other agencies and organizations conducting similar large-scale studies. For example, Understanding Society, the UK Household Longitudinal Study, has a well-developed outreach effort, including a policy unit specifically focused on expanding use of the data to inform policy. The unit collaborates with government units in the United Kingdom, organizations, and businesses to help them use the data effectively; organizes policy events and workshops; and conducts policy-specific training for data users. Study staff host an online user support forum and launched a podcast series that features interviews with the study team, researchers using the data, and others benefiting from the data.
As discussed, outreach efforts such as Webinars on the SEI and WMPD reports can be effective dissemination tools. Although developing the presentation materials for a Webinar can be resource-intensive, the materials once developed could be used in a variety of settings—for example, briefings for congressional staffers or meetings of professional associations, such as the American Association for the Advancement of Science.
Of particular importance is outreach to underrepresented groups. NCSES has collaborated with the National Association for Equal Opportunity in Higher Education and participated in conferences and meetings focused on issues specific to underrepresented communities. These relationships are important to ensure that the agency’s surveys and reports, such as the WMPD reports, remain relevant to underrepresented groups. Response rates to the surveys are already low among minority-serving institutions and individual sample members (for example, historically the response rate to the NSCG has been approximately 10 percent lower among African Americans than overall), and this gap could increase without adequate attention in this area.
RECOMMENDATION 6-7: The National Center for Science and Engineering Statistics should evaluate options for expanding communications with former, existing, and new stakeholders to raise awareness of the usefulness of the survey data. State and local governments warrant special consideration, as do groups historically underrepresented in the science and engineering workforce.