National Academies Press: OpenBook

For Attribution: Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop (2012)

Chapter: 11- Institutional Perspective on Credit Systems for Research Data

Suggested Citation: "11- Institutional Perspective on Credit Systems for Research Data." National Research Council. 2012. For Attribution: Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop. Washington, DC: The National Academies Press. doi: 10.17226/13564.

11- Institutional Perspective on Credit Systems for Research Data

MacKenzie Smith1
Massachusetts Institute of Technology

My presentation is about the institutional perspective on credit systems for research data. Why does credit matter to the institution? Simply put, it is because academic research institutions depend on reliable records of scholarly accomplishments for key decisions about hiring, promotion, and tenure. These mechanisms evolved over decades for books, peer-reviewed publications, and sometimes grey literature (e.g., theses, technical reports and working papers, conference proceedings, and similar kinds of information that are not peer-reviewed). Also, many services have emerged to make assessment of the record easier for the administration. These include impact factors, academic analytics, and other methods.

The traditional assessment model as we have it now is falling apart because it does not allow new emerging modes of scholarship and scientific communication to be included. For example, the current traditional evaluative process does not consider the following:

•  Preprint repositories like arXiv or SSRN (the Social Science Research Network).

•  Blogs, websites, and other social media.

•  Digital libraries like Perseus, Alexandria.

•  Software tools, e.g., for processing, analysis, visualization.

There are important reasons why institutions care very deeply about these issues. One of them is institutional reputation. There are national and world rankings of universities. One of the things they look at is the accomplishments of faculty and researchers in the institution. The measures that we come up with, like impact factor, play a big role in some of the ranking decisions, which are extremely important to the administration of the university. Then there is academic business intelligence. Many universities now have major industrial liaison programs and technology licensing offices. They are always trying to figure out what the academics are producing that might be commercialized or otherwise exploited, for both the university’s benefit and the researcher’s. Furthermore, it is important for recruitment. The institution needs to be highly ranked in order to recruit excellent students and faculty members. Finally, there are public relations and fundraising considerations that are extremely important for the university. It is easier to raise money from donors if you have a good reputation and some famous researchers. I know this can be very irritating to those of us who are working in research, but this is real life at the university.

In the past few decades, at least, the publishing process did not really involve the institution at all. The researchers did the research, wrote the papers, and published them on an outsourced basis through their societies, or increasingly with commercial publishers. The university did not get involved until the library bought it back. So, the only role that the university had was as a consumer. The researchers were acting almost like independent agents in that model.

______________________

1 Presentation slides are available at http://www.sites.nationalacademies.org/PGA/brdi/PGA_064019.

However, that is changing with data because in order to produce data, you often need institutional infrastructure. Sometimes it is infrastructure related to a discipline, but a lot of times it is institutional. This is where we get into discipline-related variations. In fields like geophysics and genomics, for example, the infrastructure is not usually provided by the institution, but in the social sciences, it frequently is. In the neuroscience field, it is often the institution that funds the various imaging machines and pays for all the storage and infrastructure to maintain the resulting data.

We thus have gone from a system where the institution was not involved in the publishing process to one where the researchers cannot really do what they need to do without support from their institution. Furthermore, institutions have other responsibilities when research is concerned. For example, they have some responsibilities when it comes to funding. The institution is the grantee and is legally responsible for enforcement of the terms of the contract. Also there is additional infrastructure that we all rely on now to do our work, such as digital networks and computing, the library, the licensing office, and the like. The university is responsible for making sure that the infrastructure is well-maintained and functioning. Lastly, institutions are responsible for the long-term storage of scholarly records so they are preserved and will be available and accessible to all interested stakeholders.

Now I need to focus on the intellectual property (IP) part. I would say that to the extent that IP exists in data, or that it has commercial potential, oversight for citation or attribution requirements is unclear (see the presentation by Sarah Pearson). Researchers assume that they control the data and have the intellectual property rights and that they can decide what terms to impose on their data. Often, however, researchers do not, in fact, have these rights. Although funders do not assert intellectual property rights, they frequently do have policies about what should happen to those rights when they give a grant. For example, this is a quote from the NSF Award and Administration Guide: “Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing.”2

Also, university copyright policies are evolving. This is another quote from an unnamed university’s faculty policy. “In the case of scholarly and academic works produced by academic and research faculty, the University cedes copyright ownership to the author(s), except where significant University resources (including sponsor-provided resources) were used in creation of the work” [italics added].

This quote is typical. You can find a similar formulation in just about every institution’s faculty policy document. This is what historically has been applied to things such as software platforms developed with university infrastructure. The same thing is being applied to data now. Note that the word “significant” in the statement is not defined.

Patent policy is similar. Here is another quote from an unnamed university: “Any person who may be engaged in University research shall be required to execute a patent agreement with the University in which the rights and obligations of both parties are defined.” In other words, researchers do not get exclusive rights to their patents. They will have to negotiate with the university. This is somewhat vague, however. When data have commercial potential, and they do sometimes, this starts to get really interesting.

______________________

2 NSF Award and Administration Guide, January 2011.

The new NSF requirement was not received well by all researchers. Some said: “I think I might be able to patent something from these data that will make me money. So please keep your hands off my research. I am not sharing.” I am exaggerating to make a point here, underscoring the fact that as commercial applications of data become better understood, especially in the life sciences and engineering, this could become a really tricky area for everyone involved in academic research.

From an institutional perspective, some of the requirements for data citation include:

•  Persistent or discoverable location

    •  Works even if the data moves or there are multiple copies

•  Verifiable content

    •  Authenticity (i.e., “I am looking at what was cited, unchanged”)

    •  Requires discovery and provenance metadata

•  Standardized

    •  Data identifiers: DataCite, DOIs

    •  People identifiers: ORCID registry

    •  Institutional identifiers: OCLC? NISO I2?

•  Financial viability

    •  Identifiers cost money to assign, maintain

    •  Metadata is expensive to produce

Let me elaborate on these requirements. First, a citation has to be persistent or provide a discoverable location. We need the citation and the discovery mechanism to work, no matter where the data are located or how many copies exist. We also need some way of proving the authenticity of the data. In other words, suppose I am looking at a URI that is referenced in a research paper. How do I know that the dataset I get to by resolving that URI is the dataset that the researcher was using at the time? That requires discovery metadata and enough provenance metadata.
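One common way to approach the authenticity question above is a fixity check: record a cryptographic digest of the dataset alongside the citation, then recompute it on whatever the identifier resolves to. The sketch below is only an illustration under that assumption. The function names and the tiny sample dataset are hypothetical, and the presentation does not prescribe SHA-256 or any particular digest.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the dataset bytes as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_citation(data: bytes, cited_digest: str) -> bool:
    """True if the retrieved bytes hash to the digest recorded with the citation."""
    return sha256_digest(data) == cited_digest.strip().lower()


# Hypothetical dataset as it existed when the paper was written.
original = b"subject,score\n1,0.82\n2,0.77\n"

# Digest that would be stored in the citation's metadata (an assumption:
# the source does not say where such a value would live).
cited = sha256_digest(original)

# An unchanged copy verifies; a copy with an extra row does not.
print(matches_citation(original, cited))
print(matches_citation(original + b"3,0.91\n", cited))
```

A check like this only shows that the bytes are unchanged. It says nothing about who produced the data or how, which is why provenance metadata is still needed.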

We also need more standardization in key areas. We have to have identifiers for the data, but we also need identifiers for the people and for the institutions involved. For example, I am involved in the ORCID (Open Researcher and Contributor Identification) initiative, which is looking at ways of creating identifiers for researchers that would be interdisciplinary, international, and portable across time.
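As a small illustration of what a standardized people identifier can offer, ORCID iDs as the registry documents them are 16 characters whose last character is a check character computed with the ISO 7064 MOD 11-2 algorithm. The sketch below validates only that format and check character. The function names are hypothetical, and passing the check does not show that an iD is actually registered or belongs to a given researcher.

```python
import re

# Four hyphen-separated groups of four; the final character may be a digit or X.
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid: str) -> bool:
    """Check the hyphenated format and the trailing check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    compact = orcid.replace("-", "")
    return orcid_check_char(compact[:15]) == compact[15]


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_well_formed_orcid("0000-0002-1825-0097"))  # True
print(is_well_formed_orcid("0000-0002-1825-0098"))  # False
```

A check character catches most transcription errors cheaply, which matters when identifiers are copied between systems by hand.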

Lastly, there is the issue of financial viability. We have to keep these efforts affordable, whether the identifiers are DOIs or DataCite URIs. I know that there has been contention over using identifiers for data in the past: if we are talking about a million researchers, that is one thing, but if we are talking about billions of datasets and data points, all of which need unique URIs, that could cost a lot of money.

Also, the metadata is currently very expensive to produce. This has to be done in a partnership between researchers and specialists who are paid to do this kind of work, whether it is in data centers or libraries. We have to involve experts whose job is to worry about quality control and metadata production, and that is also very expensive. So, we have to keep in mind these issues and requirements when we think about data citation techniques.

The growth of electronic publishing of literature has created new challenges, such as the need for mechanisms for citing online references in ways that can assure discoverability and retrieval for many years into the future. The growth in online datasets presents related, yet more complex challenges. Meeting them depends upon the ability to reliably identify, locate, access, interpret, and verify the version, integrity, and provenance of digital datasets. Data citation standards and good practices can form the basis for increased incentives, recognition, and rewards for scientific data activities that are currently lacking in many fields of research. The rapidly expanding universe of online digital data holds the promise of allowing peer examination and review of conclusions or analysis based on experimental or observational data, the integration of data into new forms of scholarly publishing, and the ability for subsequent users to make new and unforeseen uses and analyses of the same data, either in isolation or in combination with other datasets.

The problem of citing online data is complicated by the lack of established practices for referring to portions or subsets of data. There are a number of initiatives in different organizations, countries, and disciplines already underway. An important set of technical and policy approaches has already been launched by the U.S. National Information Standards Organization (NISO) and other standards bodies regarding persistent identifiers and online linking.

The workshop summarized in For Attribution -- Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop was organized by a steering committee under the National Research Council's (NRC's) Board on Research Data and Information, in collaboration with an international CODATA-ICSTI Task Group on Data Citation Standards and Practices. The purpose of the symposium was to examine a number of key issues related to data identification, attribution, citation, and linking to help coordinate activities in this area internationally, and to promote common practices and standards in the scientific community.