Suggested Citation:"23- Roles for Libraries in Data Citation." National Research Council. 2012. For Attribution: Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop. Washington, DC: The National Academies Press. doi: 10.17226/13564.

23- Roles for Libraries in Data Citation

Michael Witt1
Purdue University

As a practicing librarian, I will focus on the roles for librarians and information professionals in data citation and attribution. I would like to start by answering the question, Why are librarians involved in data, and why are they interested in data citation? If we go back to the workshop on “New Collaborative Relationships: The Role of Academic Libraries in the Digital Data Universe” that was sponsored by the Association of Research Libraries (ARL) and the National Science Foundation (NSF) in September 2006, an important need was identified “… for new partnerships and collaborations among domain scientists, librarians, and data scientists to better manage digital data collections; necessary infrastructure development to support digital data; and the need for sustainable economic models to support long-term stewardship of scientific and engineering digital data for the nation’s cyberinfrastructure.”2

As a follow-up, in August 2010 the ARL surveyed its approximately 130 member institutions, and 57 of them responded. Among the findings: (1) 21 respondents currently provide infrastructure and services for e-Science and data support, and (2) 23 are in the planning stages.3

This shows that libraries are involved in this area of data curation, at least in the context of academic and research libraries. That is not to say that any of these issues are exclusive to those libraries. In fact, I think that a lot of these needs extend to public libraries and citizen science, and other libraries outside of the university context.

I propose that data citation has “a last mile problem.” In communication networks it is usually easier to connect countries and cities than it is to connect individual end-nodes, such as houses, especially in rural areas. In the data citation arena, the challenge is: how do we reach and effect a change in practice among end-users of data? How can we reach the people who will be writing papers and citing the data? Those users could be students, faculty researchers, citizens, or government agencies.

I believe that a role that librarians can play here is rooted in libraries’ tradition of information literacy outreach and instruction. Information literacy is a set of abilities requiring individuals to recognize when information is needed and have the ability to locate, evaluate, and use effectively the needed information.4 This includes the proper citation and attribution of sources.

______________________

1 Presentation slides are available at http://www.sites.nationalacademies.org/PGA/brdi/PGA_064019.

2 Available at: http://www.arl.org/pp/access/nsfworkshop.shtml.

3 C. Soehner, C. Steeves, and J. Ward, E-Science and Data Support Services: A Study of ARL Member Institutions. Association of Research Libraries, 2010. http://www.arl.org/bm~doc/escience_report2010.pdf.

4 American Library Association. 1989. Presidential Committee on Information Literacy: Final Report.


If you look at the Information Literacy Competency Standards for Higher Education5 from the Association of College and Research Libraries (ACRL), you can replace the word “information” with “data” and the competencies still make sense and remain relevant.

Where can users look for information on how to cite data? One natural place to turn would be style guides. I did a study with two colleagues, where we looked at 20 different style guides and performed content analysis to see what kind of instructions they are providing users explicitly to cite digital data. The answer is: they do not consistently address data citation and attribution.

FIGURE 24-1 A Description of Data Citation Instructions in Style Guides.
SOURCE: Newton, Mooney, & Witt (2010). International Digital Curation Conference, Chicago, IL. Retrieved from http://www.docs.lib.purdue.edu/lib_research/121/.

If you look at the above grid, it covers instructions for digital data, data in other formats (e.g., paper-based tables), and other electronic resources. The dark purple indicates the areas where the style guide provides explicit instructions for citation. The light colors (i.e., aqua or white) indicate that there are no explicit instructions. So, generally speaking, some style guides do a better job than others—but if this is where students and others are turning for instructions to properly cite data, they will undoubtedly be frustrated.

______________________

5 Available at: http://www.ala.org/ala/mgrps/divs/acrl/standards/informationliteracycompetency.cfm.


One thing that we see happening on our university campuses is that librarians are stepping in to address this need by creating resource guides. It is common practice for librarians to develop bibliographies and pathfinders that introduce topics and tools to users. Here are some examples of data citation resource guides from university libraries:

•    MIT: http://www.libraries.mit.edu/guides/subjects/data/access/citing.html

•    MSU: http://www.libguides.lib.msu.edu/citedata

•    Minnesota: http://www.lib.umn.edu/datamanagement/cite

•    Purdue: http://www.guides.lib.purdue.edu/datacitation

•    Oregon: http://www.libweb.uoregon.edu/datamanagement/citingdata.html

•    Cambridge: http://www.lib.cam.ac.uk/dataman/pages/citations.html

•    Virginia: http://www2.lib.virginia.edu/brown/data/citing.html

These guides are written by librarians in most cases and tailored for their particular audience. They may be tailored for undergraduate or graduate students, faculty researchers, or others.

One project that I would like to talk about briefly is Databib.6 This project was funded through the Institute of Museum and Library Services (IMLS). Here is the description of the project:

The libraries of Purdue University and Penn State University will partner to create a new online information resource for research data producers, users, publishers, librarians, and funding agencies. This resource, Databib, will be an annotated online bibliography of research data repositories, created and maintained by an online community of librarians. Databib will be an important focal point for connecting librarians more closely with other research data stakeholders and demonstrating the significant contributions libraries can make to solving the challenges posed by digital datasets. The Databib platform will also serve as a testbed for linking, integrating, and presenting information about datasets in new ways.7

Databib is essentially a bibliography that describes data repositories. We are creating a platform for librarians to submit and enhance bibliographic entries that describe these data repositories, and to do it in a way that is maximally open, using the Creative Commons Zero public domain dedication. If someone wants the list or the metadata, they are free to download and use them. If someone wants to enhance or annotate the metadata, that is also possible.

______________________

6 Databib website, http://databib.lib.purdue.edu.

7 IMLS press release, http://www.imls.gov/grant_awards_announcement_sparks_ignition_grants.aspx.

We are creating this resource for the community to help users find data and to help data producers identify repositories where they can submit their data. We also want to share this information with funding agencies that mandate data management and tell them where data have been submitted, because these directions are unclear in many cases. We also want to test the notion of a bibliography. We will have bibliographic records that can be exported as MARC records, so if someone wants to download them into their library catalogue, they can. Also, if someone wants to integrate them with other Web 2.0 tools, such as social tagging and social bookmarking, Databib will facilitate sharing links and citations. Finally, we want to use this platform to experiment with linked data. We want to create a platform where the descriptions of these data repositories can be linked in as many ways as possible to other things, whether it is in the same subject area, the same agency that supports the data repository, or any other level of linkage. This is a nine-month project, and Databib will go online in the spring of 2012.

Going back to the potential role of libraries and librarians, libraries are primary actors in the scholarly communication chain. I believe that libraries can promote persistence for links to data. Jan Brase talked about DataCite yesterday, and many libraries are participating in that effort. I think that libraries need to adopt URI policies. We are creating a lot of digital content and making it available in many different ways, with links that break. So, in addition to minting and maintaining unique, global, and persistent identifiers, we can have more general URI policies, which we can advocate for web content across our institutions.

Are libraries presenting their own data in ways that facilitate or encourage citation? Libraries maintain institutional repositories and other digital libraries where they present digital objects, but do we have supporting documentation and FAQs that give users instructions for citation? Do we provide embedded, structured metadata within the web page, such as COinS, microformats, or RDF? Do we facilitate exportable citations? Many of our libraries have data services that are doing outreach to faculty members to help them understand data management plans. Before projects are funded and data are generated, there is the opportunity to have a conversation about data sharing with the different stakeholders. There is an opportunity for advocacy.
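
As one illustration of what embedded, structured metadata could look like, here is a minimal Python sketch that builds a COinS span for a dataset landing page. The record fields, the example values, and the helper name `coins_span` are hypothetical, and the sketch assumes the Dublin Core key/encoded-value format of OpenURL; it does not describe any particular repository's practice.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(record: dict) -> str:
    """Return an HTML span carrying an OpenURL ContextObject (COinS).

    `record` is a hypothetical dict with Dublin Core style keys.
    """
    pairs = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),
    ]
    # Prefix each descriptive field that is present with "rft.".
    for key in ("title", "creator", "date", "publisher", "identifier", "type"):
        value = record.get(key)
        if value:
            pairs.append((f"rft.{key}", value))
    # urlencode percent-encodes the values; escape() makes "&" safe
    # inside an HTML attribute.
    title_attr = escape(urlencode(pairs), quote=True)
    return f'<span class="Z3988" title="{title_attr}"></span>'


# Made-up record, for illustration only.
example = {
    "title": "Example Survey Dataset",
    "creator": "Doe, Jane",
    "date": "2011",
    "publisher": "Example University Libraries",
    "identifier": "http://example.org/dataset/1",
    "type": "dataset",
}
print(coins_span(example))
```

A span like this is invisible to readers of the page, but citation management tools that recognize COinS can pick up the description from it.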

I would like to raise awareness of the work being done by the International Association for Social Science Information Services and Technology (IASSIST). I co-chair a special interest group on data citation with Mary Vardigan. Of the more than 300 members of IASSIST, about 40 or 50 are involved in this special interest group. One of our activities is an effort to derive a common set of user instructions for citing data. We realize that we will not necessarily be able to devise a perfect set of instructions for all cases, but a core set of instructions would be very useful. There has also been some work to integrate datasets as a resource type in citation management software such as EndNote or RefWorks. Moreover, we are doing some advocacy. We have been writing letters to style guide editors and publishers to encourage them to articulate policies and instructions for data citation to their authors. Also, like many other special interest groups, we are generating resources such as a website and brochure that are publicly available for use.
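
To make the idea of a core set of citation elements concrete, here is a minimal Python sketch that assembles a dataset citation string from a handful of fields. The choice of elements (creators, year, title, version, publisher, identifier), the class name `DataCitation`, and the output pattern are assumptions for illustration only; they are not the instructions the special interest group is developing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical core elements for citing a dataset."""
    creators: List[str]
    year: int
    title: str
    publisher: str
    identifier: str  # ideally a persistent identifier
    version: Optional[str] = None

    def format(self) -> str:
        """Render the elements as a single citation string."""
        authors = "; ".join(self.creators)
        # Include the version only when one is recorded.
        version = f" Version {self.version}." if self.version else ""
        return (f"{authors} ({self.year}). {self.title}.{version} "
                f"{self.publisher}. {self.identifier}")


# Made-up values, for illustration only.
citation = DataCitation(
    creators=["Doe, Jane", "Roe, Richard"],
    year=2011,
    title="Example Survey Dataset",
    publisher="Example University Libraries",
    identifier="http://example.org/dataset/1",
    version="2",
)
print(citation.format())
```

Holding the elements in a structured record rather than a free-text string is what would let the same description be exported to different style guides or to citation management software.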

To conclude, librarians and information professionals can play important roles in advocacy and outreach, and in the integration and citation of data. This includes data citation in reference services and information literacy instruction and standards. Librarians should ask themselves: if we are publishing data, are we making our data citable, and are we incorporating data into information literacy?

One last observation: many libraries are creating new data services units that can help raise awareness of and address issues related to data attribution and citation for their communities. Promoting proper data use and citation should be a part of what we normally do in libraries, a part of our regular practices. There seems to be a trend of libraries addressing research data in a specialized manner, e.g., “data reference” and “data information literacy”. I suggest that, after a period of time, the library profession will become more comfortable with data and will not need to qualify “data” services as such. The same principles of library science that apply to traditional formats can be applied to data.

The timing seems to be perfect for people to connect and collaborate to address data citation and attribution issues.
