permit meaningful disaggregation. Funding data for other nations are even scarcer, and the levels of expenditures related to industry or military programs are generally unavailable because of proprietary or national security considerations. Data indicating output or impact are also not broadly available, especially since there is disagreement as to what constitutes output, how it should be measured, and what any given measurement might tell us.
A final issue is the paucity of data available for narrowly defined fields or new subfields. Most surveys look only at large fields, precluding any further disaggregation of the results. This lack of detail in available data proved to be a major problem in examining the interdisciplinary subfield of AMO science, for example. Moreover, even data with some degree of detail presented significant difficulties. For instance, the best available sources of data on science and engineering personnel, the two OSEP surveys, have subcategories for atomic and molecular physics and for optics, but both subcategories are listed under physics only, although a substantial amount of the research in AMO science is more properly categorized under chemistry or engineering. In addition, because of the small number of individuals in the existing subcategories, any further disaggregation based on demographic characteristics such as gender, race, and age, among others, becomes statistically unreliable. This is true even for a well-defined but relatively small field such as astronomy. The small number of scientists in such fields does, however, make it feasible to conduct a customized survey.
The quality of existing data is no less a problem than its availability. Although the statistical compilations of the NSF and OSEP, as well as the other commercially available data sets reviewed by the panels, appeared to conform to rigorous statistical methods, and their limitations and uncertainties were well documented, not all sources are equally reliable. Determining the accuracy of every source often is not possible.
The problems associated with even good-quality data are compounded with every new source that is added, because each data set has its own scope, definitions, and biases. In the case of time series, this difficulty is further complicated by the unique circumstances—both internal and external—of each data collection point. Moreover, long time series such as the OSEP and NSF demographic data, which typically go back two or three decades and are the most useful sources for long-term retrospective analyses, reflect an inherent paradox. On the one hand, the discipline and subdiscipline categories chosen at the outset, intended to remain consistent so that comparisons can be made over the years, will inevitably atrophy and be supplanted by newly emerging areas of inquiry. This tendency, of course, is most likely to be a problem when the focus is the most dynamic and rapidly changing disciplines, which arguably are the ones about which we need the most information. In the case of the OSEP and NSF surveys that break major fields down into a handful of subfields, this tendency toward dynamic change, and the difficulty of accurately characterizing it, are evidenced by the great increase in survey responses that fall into the “general” or “other” subcategories. On the other hand, periodic adjustments made in the survey methodology to keep the data valid for current conditions lead to statistical inconsistencies or