Our panel was asked to review the NCSES communication and dissemination program, which covers the collection and distribution of information on science and engineering, and to recommend future directions for the program. Specifically, we were asked to:
Review NCSES’s existing approaches to communicating and disseminating statistical information, including the center’s information products, website, and database systems. [This review will be conducted in the context of both current “best practices” and new and emerging techniques and approaches.]
Examine existing NCSES data on websites, information gathered by and from NCSES staff, volunteered comments of users, and input solicited by the panel from key user groups and assess the varied needs of different types of users within NCSES’s user community.
Consider the impact that current federal and NSF website guidance and policies have on the design and management of NCSES’s online (Internet) communication and dissemination program.
Consider current research and practice in collecting, storing, and utilizing metadata, with particular focus on specifications for social science metadata developed under the Data Documentation Initiative (DDI).
Consider the impact of government-wide activities and initiatives (such as FedStats and Data.gov) and the emerging user capability for online retrieval of government statistics.
In their presentations to the panel, the NCSES staff produced a large hard-copy stack of tabulations, noting that the stack represented just one of the center’s periodic reports. The staff also noted that, even though the center has largely shifted to electronic dissemination, the demands of data accuracy and reliability mean that a great deal of NCSES time is spent checking and formatting data for print and electronic publication.1 For example, each page of the hard copy must be checked by someone looking at the source data. This effort comes at the expense of ensuring data integrity at the source. We believe this emphasis is misplaced.
Although edit and quality checks can never be avoided entirely, because errors can creep into data at any stage of processing, there is much to be gained by focusing primarily on the quality of the incoming “raw” data from the source. That focus is best supported by adopting a comprehensive database management framework for the process, in place of the current primary focus on reviewing the tabular presentation. A framework that ensures the integrity of the data at the source, buttressed by the availability of metadata (that is, data about the data), is the necessary foundation of real improvement in data dissemination.
RECOMMENDATION 1: The National Center for Science and Engineering Statistics should transition to a dissemination framework that emphasizes database management rather than data presentation and strive to ensure integrity of the data at the source.
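As a rough illustration of what a database-centered framework might look like in practice, the sketch below uses Python's built-in sqlite3 module. It is not a description of any NCSES system: the table names, column names, field categories, and helper functions are all hypothetical, and any rows loaded into it would be invented for demonstration. The idea is that validity rules and metadata live with the microdata, so a published table becomes a query over checked records and no longer has to be verified page by page.

```python
import sqlite3

# Hypothetical schema: all names and rules below are illustrative assumptions.
SCHEMA = """
CREATE TABLE observations (
    institution_id  INTEGER PRIMARY KEY,
    survey_year     INTEGER NOT NULL CHECK (survey_year BETWEEN 1950 AND 2100),
    field           TEXT    NOT NULL
        CHECK (field IN ('engineering', 'life sciences', 'physical sciences')),
    degrees_awarded INTEGER NOT NULL CHECK (degrees_awarded >= 0)
);
CREATE TABLE variable_metadata (
    variable    TEXT PRIMARY KEY,
    description TEXT NOT NULL,
    unit        TEXT NOT NULL
);
"""


def build_database():
    """Create an in-memory database whose constraints guard the source data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO variable_metadata VALUES (?, ?, ?)",
        ("degrees_awarded", "Degrees awarded in the survey year", "count"),
    )
    conn.commit()
    return conn


def load_observations(conn, rows):
    """Insert raw rows one at a time; rows that violate a constraint are set aside."""
    accepted, rejected = [], []
    for row in rows:
        try:
            conn.execute("INSERT INTO observations VALUES (?, ?, ?, ?)", row)
            accepted.append(row)
        except sqlite3.IntegrityError:
            rejected.append(row)  # caught at the source, before any table is built
    conn.commit()
    return accepted, rejected


def publication_table(conn, year):
    """Derive a published table by selecting one year and aggregating by field."""
    cursor = conn.execute(
        "SELECT field, SUM(degrees_awarded) FROM observations "
        "WHERE survey_year = ? GROUP BY field ORDER BY field",
        (year,),
    )
    return cursor.fetchall()


def describe_variable(conn, variable):
    """Look up the stored metadata (description, unit) for a variable."""
    cursor = conn.execute(
        "SELECT description, unit FROM variable_metadata WHERE variable = ?",
        (variable,),
    )
    return cursor.fetchone()
```

In a sketch like this, a record with a negative degree count or an unrecognized field is rejected when it is loaded, and every table produced by publication_table inherits that check. A production framework would need far richer validation and metadata (for example, metadata structured along DDI lines), but the division of labor is the same.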
All of the tables published by NCSES are selections, aggregations, and projections of the underlying microlevel observations. The recommendation above envisions that, wherever