Suggested Citation:"13- The DataCite Consortium." National Research Council. 2012. For Attribution: Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop. Washington, DC: The National Academies Press. doi: 10.17226/13564.

13- The DataCite Consortium

Jan Brase1
DataCite

I would like to give you an overview of what DataCite is doing. The idea behind our project is that science is global and therefore we need global standards to do it. We need some global workflows and cooperation between global players such as global data centers and publishers. Of course, science is also carried out locally. Scientists themselves do global science but they are embedded in their local institutions, libraries, and funding agencies. So, that is the paradox that we are dealing with in our project. We want to have global answers but at the same time want to act on a very local basis. That was the main motive behind founding DataCite as a global consortium of local institutions.

On the one hand, DataCite is carried out by members like the Technische Informationsbibliothek (the German National Library of Science and Technology), the California Digital Library, and the British Library. These institutions act locally with local data centers. For example, a British data center works with the British Library and uses its services as the local partner. On the other hand, DataCite itself acts as a global organization that works with other global organizations, such as publishers. This is important, because the publishers now have only one central partner to work with on data citations and do not have to deal with dozens of data centers individually at the local level.

This is the list of the current membership of DataCite. There are 15 members from 10 countries.

FIGURE 13-1 List of DataCite members.

______________________

1 Presentation slides are available at http://sites.nationalacademies.org/PGA/brdi/PGA_064019.

The next question is: what are we doing? Simply put, DataCite is a registration agency for Digital Object Identifiers (DOIs). DataCite assigns DOI names to datasets. This is one of our main pillars. We also work actively with our members on standard definitions, and we try to put those standards for data citation into practice. We also have plans to establish a central metadata portal, where you will have free access to the metadata of all content that we have registered.

One of our main functionalities is to provide DOI names for datasets. We all agree that identifiers for datasets are important to make data citation possible. Let me explain with an example of citations for a dataset:

Storz, D. et al. (2009):

Planktic foraminiferal flux and faunal composition of sediment trap L1_K276 in the northeastern Atlantic.

http://dx.doi.org/10.1594/PANGAEA.724325.

As a supplement to the article:

Storz, David; Schulz, Hartmut; Waniek, Joanna J; Schulz-Bull, Detlef; Kucera, Michal (2009): Seasonal and interannual variability of the planktic foraminiferal flux in the vicinity of the Azores Current.

Deep-Sea Research Part I-Oceanographic Research Papers, 56 (1), 107-124,
http://dx.doi.org/10.1016/j.dsr.2008.08.009.

First, we have a dataset citation. One of the good things about using DOI names is that the dataset citation has the same look and feel as a classical journal citation, with a title and a DOI name that you can click on to access the article. It also has the data center as the affiliation. The most useful part, however, is that you can click on the DOI to directly access the data. This allows you to download the data into your system for your own analysis or visualization. If you decide to reuse the data for your own work, the fact that the data have a DOI name allows you to cite them and give the original author credit when you use those data.
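As a rough illustration of the citation pattern described above, the following Python sketch splits a DOI name into its prefix and suffix and assembles a citation string in the same shape as the Storz et al. example. The names `DatasetRecord`, `split_doi`, and `format_citation` are hypothetical and are not part of any DataCite software; the resolver address is simply the one that appears in the quoted citation.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: the resolver address is taken from the citation quoted in the text.
RESOLVER = "http://dx.doi.org/"


@dataclass
class DatasetRecord:
    """Hypothetical minimal record: just the fields visible in the example citation."""
    creators: str
    year: int
    title: str
    doi: str


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    # DOI names start with the directory indicator "10." and need a non-empty suffix.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


def format_citation(rec: DatasetRecord) -> str:
    """Build 'Creators (Year): Title. <resolver URL>' for a dataset."""
    split_doi(rec.doi)  # validate before building the link
    return f"{rec.creators} ({rec.year}): {rec.title}. {RESOLVER}{rec.doi}"


record = DatasetRecord(
    creators="Storz, D. et al.",
    year=2009,
    title=("Planktic foraminiferal flux and faunal composition of "
           "sediment trap L1_K276 in the northeastern Atlantic"),
    doi="10.1594/PANGAEA.724325",
)
print(format_citation(record))
```

Because the DOI name is the only part of the link that is specific to the dataset, the same record can be cited unchanged even if the data center later moves the landing page.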

If the user goes to the webpage of the article, that person then sees that there are data available for this article. In contrast to the article, which may be available only to paying customers (subscribers), access to the data is free of charge. So, if the user is interested in the article, he or she can look at the data first and then decide what to do.

That is one of the fundamentals of DataCite: we believe that the data that support the article should be freely available. In cooperation with the publishers, we have designed our system in such a way that even if you do not have the right to look at the article (without paying) and you can only access the abstract and the table of contents, the link to the data is displayed and access to the data is free of charge. In a way, this is a win-win situation for the publishers because the availability of the data enhances the value of the article and promotes its use, while the publishers do not lose any revenue.

Finally, let me briefly tell you where we are at this point. DataCite has registered over a million records with DOI names. We have also published a metadata schema2 that we use for all records. We released the beta version of the DataCite Metadata Store online in July 2011: see http://search.datacite.org.

In 2012, we expect to have around 800,000 datasets in the Metadata Store and hope to have all records available in the middle of the year. As for next steps, we are working with other organizations, such as Thomson Reuters Science, to index our content. The metadata are freely available to harvesters via http://oai.datacite.org. We are also working with Elsevier and the Pangaea data center and other publishers to find more article-data links.
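To make the harvesting remark concrete, here is a minimal sketch of how a harvester might build OAI-PMH request URLs. `ListRecords`, `metadataPrefix`, `from`, `set`, and `resumptionToken` are standard OAI-PMH request arguments, and `oai_dc` is the metadata format every OAI-PMH repository must support. The host comes from the text, but the `/oai` path and the helper names `list_records_url` and `resume_url` are assumptions made for illustration. The sketch only constructs URLs and does not contact any server.

```python
from typing import Optional
from urllib.parse import urlencode

# The host is from the text; the "/oai" endpoint path is an assumption.
OAI_BASE = "http://oai.datacite.org/oai"


def list_records_url(metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     set_spec: Optional[str] = None,
                     base: str = OAI_BASE) -> str:
    """Build the first ListRecords request of a harvest."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # selective harvesting by datestamp
    if set_spec:
        params["set"] = set_spec    # selective harvesting by set
    return f"{base}?{urlencode(params)}"


def resume_url(token: str, base: str = OAI_BASE) -> str:
    """Build a follow-up request; resumptionToken is sent with the verb only."""
    return f"{base}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"


print(list_records_url(from_date="2011-07-01"))
```

A harvester would fetch the first URL, read the `resumptionToken` element from the response if one is present, and repeat with `resume_url` until the repository returns no further token.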

______________________

2 http://schema.datacite.org.

The growth of electronic publishing of literature has created new challenges, such as the need for mechanisms for citing online references in ways that can assure discoverability and retrieval for many years into the future. The growth in online datasets presents related, yet more complex challenges. Meeting them depends upon the ability to reliably identify, locate, access, interpret, and verify the version, integrity, and provenance of digital datasets. Data citation standards and good practices can form the basis for increased incentives, recognition, and rewards for scientific data activities, which are currently lacking in many fields of research. The rapidly expanding universe of online digital data holds the promise of allowing peer examination and review of conclusions or analysis based on experimental or observational data, the integration of data into new forms of scholarly publishing, and the ability for subsequent users to make new and unforeseen uses and analyses of the same data, either in isolation or in combination with other datasets.

The problem of citing online data is complicated by the lack of established practices for referring to portions or subsets of data. There are a number of initiatives in different organizations, countries, and disciplines already underway. An important set of technical and policy approaches has already been launched by the U.S. National Information Standards Organization (NISO) and other standards bodies regarding persistent identifiers and online linking.

The workshop summarized in For Attribution -- Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop was organized by a steering committee under the National Research Council's (NRC's) Board on Research Data and Information, in collaboration with an international CODATA-ICSTI Task Group on Data Citation Standards and Practices. The purpose of the symposium was to examine a number of key issues related to data identification, attribution, citation, and linking to help coordinate activities in this area internationally, and to promote common practices and standards in the scientific community.
