of such emerging, integrated data systems in an ad hoc group such as the NASA Heliophysics Data and Computing Working Group. Interagency coordination of the data environment as a whole would benefit researchers whose efforts are funded by multiple agencies.
The information technology industry continues to generate novel technologies and capabilities faster than any federally funded, competitively sourced research program can hope to match. Agencies must be agile enough to exploit emerging technologies without investing in their original development. The best approach is to (1) focus on commercially viable technologies for which there is a demonstrated need, such as high-performance computing clusters, and (2) otherwise invest modestly in the evaluation of emerging commercial technologies through existing mission and small-scale data center activities.
NASA has funded virtual observatories and related “middleware” development. Some of these have led to useful targeted data identification and access technologies, and some are still under development. Mature capabilities should not continue to compete with research proposals for funding. A more effective approach would be for NASA and its agency partners to establish a heliophysics-wide data infrastructure, selecting the most useful efforts for stable funding and bringing other efforts to a close. Future developments can be managed through the supplemental funding mechanisms discussed in the sections “Emerging Technologies” or “Community-Based Software Tools.”
Community-Based Software Tools
In a few subdisciplines, such as solar physics, the availability of integrated open-source data reduction and analysis tools makes a significant difference in the ability of researchers to access and manipulate data. In areas where such tools are not available, immediate agency investment in community-based development would be highly productive. Where tools are already available, support to maintain and evolve them as new data sets and capabilities emerge should continue. Capabilities should expand to include data mining and assimilation in order to enable full exploitation of the large new heliophysics data sets.
The astrophysics and geophysics communities have taken the lead in adopting modern, “semantic” technologies, where machines “understand” the context and meaning of data, to enable cross-discipline data access. Promoting the development of semantic technology would enable the emerging data access capability in heliophysics to share data and knowledge with other fields.
A National Approach to Data Policies
The heliophysics data policies of the funding agencies differ or are in some cases lacking. The NSF, for instance, now requires a data management plan in all research proposals, but geosciences does not yet have a uniform data access and preservation policy. NASA Heliophysics has a well-developed data management policy, but long-term preservation of data is in a state of flux. It would be wise for the agencies to