Deborah L. Crawford¹
I have a somewhat unique perspective on this subject. Until September 2010, I worked at the National Science Foundation (NSF), where I was involved in the fashioning of NSF’s data management plan policy. Shortly afterwards, I returned to academia, joining Drexel University. I have the pleasure now of implementing the policies that I had a hand in preparing. It is an important topic with a lot of complexity.
Today, I will try to share with you my view from a university administrator’s perspective, but I will also touch on the respective roles and responsibilities of academic researchers as individuals and as members of research communities. I was asked to respond to the following question: How are university administrators thinking about data citation and related issues? What follows are some of my thoughts on this subject.
In my role as a vice provost for research at Drexel, I view the stewardship of research data as one of a number of responsibilities I have to create an environment that supports the responsible and ethical conduct of research in the public interest. Developing such an environment has implications for the management of the increasingly digital research data that we collect or create.
Let me first talk about the role of researchers, and the research communities to which they belong, in data stewardship. As is already quite well known, there are significant differences in practices among scientific communities, including the communities represented here at this workshop.
For example, some of our communities have, for a decade or more now, leveraged the economies afforded by data sharing, attribution, and citation. These tend to be the scientific and engineering communities where data have been, and continue to be, created or collected with the intent that they be shared broadly. They include, for example, the environmental, astronomical, and geosciences communities, typically those where data are collected on nationally or internationally supported, community-governed instruments or facilities. And now, thanks to the “omics” revolution, a number of the life sciences communities are also generating data with the intent that they be shared.
In other fields, cultures continue to be much more oriented toward the individual investigator. In such domains, the independence of individual investigators is fiercely guarded and research data are rarely shared, except in relatively modest ways through peer-reviewed publications. I think it is useful to keep these cultural and research differences in mind as we think about how we move forward: one size is unlikely to fit all.
1 Presentation slides are available at http://www.sites.nationalacademies.org/PGA/brdi/PGA_064019.
We need to develop explicit policies on data sharing, attribution, and citation: both domain-based policies for the scientific and engineering communities, and institutional policies that complement and support community policies. It is important that we develop these policies and supporting practices in a collaborative way, bringing all stakeholder groups along so that we can fully leverage the added value of the enormous and growing quantities of digital data to advances in science and technology.
Let us now turn our attention to the role of academic institutions. Just as some communities have well developed data policies and practices and others do not, so some institutions have data sharing policies and others do not—at least, not yet.
In tenure and promotion policies and practices that pertain to data sharing, citation, and attribution, culture matters very much. For investigators in communities more accustomed to data sharing, data attribution and citation are likely to be valued in tenure and promotion decisions. This, however, is not true across the board. So in fashioning academic policies that promote data sharing, citation, and attribution, we must be mindful of, and manage for, these differences.
Institutions should help faculty understand what is expected of them in the responsible stewardship of research data in our increasingly digital scientific world. Deans and department heads are major institutional stakeholders too, for they must provide leadership in raising awareness about this important topic and its implications in matters such as tenure and promotion (and others), and they must serve as advocates for change, where necessary.
I believe mid-career faculty play a very important role. We cannot expect our junior faculty, who are often pioneers or early adopters of new digital research modalities, to carry the weight of promoting the development of new data policies, for they have too many other pressures coming to bear on them, and in fact might be penalized for having pioneering views. Mid-career faculty members are likely to be key to moving a conversation forward on these topics. They are the ones who typically are more engaged in research where progress demands an increasing reliance on the sharing and attribution of digital data, and these faculty members may be more willing and able to speak to and be heard about the importance of these issues.
Let me now briefly address the issue of institutional repositories. Many of us believed that institutional repositories, interoperable ones of course, would be a key to the future; they would enable universities to actively manage their digital assets, manage their intellectual property with appropriate controls, and explore new forms of scholarly communications. The role of institutional repositories is especially important in the later stages of the data lifecycle, as researchers focus on new and interesting scientific opportunities and worry less about the research data of their past interests. Thus, institutional repositories were expected to play important roles in data curation and preservation.
In practice, however, institutional repositories are not living up to our expectations, partly because researchers are not routinely depositing their digital objects in the repositories that their institutions are providing. Many researchers do not see the value to their science and to their careers of doing so, which ultimately is the bottom line for most researchers. This is something we need to keep in mind as we think about the ways in which institutions encourage faculty to engage in conversations about the future of data sharing, citation, and attribution.
It is important to note that it is not at all clear that academic institutions across the board are in a position to move boldly into this new world. For one thing, as we heard yesterday, universities have not been significantly engaged in the active long-term management of research data to date. Traditionally, most investigators have managed, or been responsible stewards of, their own data, with community governance of data proving essential to advances in certain fields. It is fair to say that there is much more evidence of community-based initiatives, albeit in some fields more than others, than of university-level initiatives.
Equally important, or maybe more important today, the substantial cost implications of providing long-term stewardship of data are a very significant concern for research universities. In an increasingly difficult economic environment, where concerns already exist about the escalating costs of higher education and where the federal government is unable or unwilling to support the full cost of research in the academy, who bears the responsibility for paying to ensure long-term open, useful access to research data created in the public interest? This is a policy issue that the government-university partnership needs to resolve.
The discussions that we have been having in this workshop and in recent years raise important questions about who is, or should be, the champion for these issues at the university, and about who should do what. Unfortunately, there are no clear answers at most universities because of the complexities involved. Faculty, researchers, and students have a voice and a role, but they are largely not involved in these conversations because the value to their science and to their careers is not readily apparent to them. Deans, colleges, and schools have voices as well, but for the most part they are not serving as change agents. This is true, too, of university research offices, in part because they tend to reflect the cultures of the research communities they represent and, undoubtedly, because of concerns about costs. Libraries have served as the strongest advocates for these kinds of changes, but they probably do not have the institutional power or authority to really effect change. So, if we are going to make a difference, there has to be a clear change mandate in institutions involving all institutional stakeholders.
This is a critical topic of conversation for academic institutions, because it impinges upon their reputations as essential contributors to the national knowledge enterprise.