Suggested Citation:"10- Three Legal Mechanisms for Sharing Data." National Research Council. 2012. For Attribution: Developing Data Attribution and Citation Practices and Standards: Summary of an International Workshop. Washington, DC: The National Academies Press. doi: 10.17226/13564.

10- Three Legal Mechanisms for Sharing Data

Sarah Hinchliff Pearson1
Creative Commons

Sharing data today can be easy; you can simply post them on the web. But doing so means losing some control over the data, including whether you will be accurately and properly credited. This is obviously the case when you share data without a related license, contract, or waiver. As I will explain, to a certain extent this is true even when any one of those legal mechanisms is used.

I will begin by defining some terms. For purposes of this presentation, attribution, credit, and citation all have distinct meanings. Attribution refers to the legally imposed requirement to attribute the rights holder when the data are copied or reused in a specified manner. The remedy against someone who fails to attribute is a lawsuit, either based on breach of contract or infringement of an intellectual property right, depending on the legal mechanism used to impose the attribution requirements. Credit, on the other hand, is what we all want—explicit recognition for our contribution to someone else’s work. Finally, there is citation, which is rooted in norms of scholarly communication. The purpose of citation is to support an argument with evidence. However, citation has also become a proxy for credit, albeit an imperfect one.

This is an important starting point. It reminds us that legal attribution requirements do not necessarily match our expectations for receiving credit, nor do they perfectly map to accepted standards of citation. When the remedy for failure to attribute is a lawsuit, we are well-served to recognize this incongruity. With that in mind, let us turn to the law.

There are three main legal mechanisms for sharing data: licenses, contracts, and waivers. Whenever data are shared, there is a possibility they will not be properly cited upon reuse. Licenses and contracts attempt to eliminate this risk by imposing legal attribution requirements. Waivers, however, do not legally impose attribution. Instead, they rely on community norms to ensure proper citation. There are consequences to each of the three approaches. I will address each below.

Licenses

We will start with the approach for which Creative Commons is best known: licenses. Licenses operate by granting permission to copy, distribute, and adapt data upon certain conditions. One of those conditions is attribution, as it is in all Creative Commons licenses. A license sounds a lot like a contract because it grants permission to use data under certain conditions. However, the two are actually quite different because a license is built upon an underlying exclusive right. Therefore, in order to understand the scope of a license, you have to understand the scope of the underlying right. In the context of sharing scientific data, the rights involved are typically copyright or database rights.

______________________

1 Presentation slides are available at http://www.sites.nationalacademies.org/PGA/brdi/PGA_064019.

We will begin by taking a closer look at copyright law. Copyright law grants a bundle of exclusive rights to creators of original works at the moment the work is fixed in a tangible medium. In non-legalese, that means copyright is granted automatically once you write your work down or enter it into the computer.

Copyright is limited in scope and duration, and the specific limitations vary by country. For scientific data, the most important limitation of copyright is that copyright never extends to facts. Copyright does, however, extend to a collection of facts if they are selected, arranged, and coordinated in an original way. The required threshold is low.

There is significant uncertainty about where the line of copyright extends, even among copyright lawyers. To complicate matters further, this line varies somewhat according to the laws of each country.

Determining what is subject to copyright is only the first hurdle. The next task is identifying the scope of copyright protection. Even when a database or a collection of facts is subject to copyright, the facts themselves remain in the public domain. This means that the general rule in the U.S. and elsewhere is that data can be extracted from a copyrighted database without infringing copyright law.

That is not true, however, in the European Union (EU). In the EU and a few other countries, governments have implemented what are called sui generis (“of their own kind”) database rights. These rights allow a database maker to prevent the extraction and reuse of a substantial part of the contents of a database, even if the contents are otherwise in the public domain.

A license can be built atop copyright or database rights or both. By way of example, Creative Commons (“CC”) licenses are copyright licenses. If a CC license is applied to a database, it covers both the data and the database, all to the extent each is subject to copyright. Any use of the data or database that implicates copyright requires attribution. Any use of the data that does not implicate copyright (for example, if the data are in the public domain) does not require attribution, even if it triggers database rights.

Because of the difficulty of deciphering the contours of copyright protection in scientific data and databases, it is very hard for both the data provider and data user to know when the license applies and when it does not. In other words, it is difficult to know when attribution is legally required. This creates a number of risks.

For one, it creates the risk that data providers will be misled about what they are getting when they apply a license to their data. They may believe that if they apply a license to their data, any use of the data will require attribution. As I explained earlier, that is not the case. If the data are in the public domain, or if the use of copyrighted data falls under fair use, the attribution requirement is not triggered.

It also creates the risk that data users (also referred to as licensees) will misjudge their attribution requirements because of the difficulty in determining when copyright applies. They may under- or over-comply with the license without realizing it. Either situation can be problematic.

In addition to the legal uncertainty, licenses also create the risk of imposing burdensome attribution requirements. In the science context in particular, projects often rely on data gathered from a variety of different sources. Depending on the licenses used, a project may be required to attribute each individual or institution that contributed any piece of data to it. This is a problem we call attribution stacking.

This raises yet another potential problem with attribution. Attribution obligations written into a license are, by their nature, inflexible. No lawyer can anticipate every situation in which the attribution requirements would be triggered and account for all of the circumstances in which they will be applied. This can create some absurd situations where, for example, a user or aggregator of data may technically be required to attribute 1,000 different data providers, each in the idiosyncratic manner that its rights holder has dictated. Conceivably, the user could do all this and still not satisfy people’s expectations for receiving credit or accepted standards of citation.

Contracts

The next legal mechanism for requiring attribution is contract law. Contracts can have different names and take a lot of different forms, but they are often called data use agreements or data access policies.

Unlike a license, a contract does not necessarily require an underlying intellectual property right. Technically, it requires a few legal formalities, including an offer and acceptance. In practice, sometimes that manifests in an online agreement, where the user has to click to accept the terms to access the data. Other times the user is presumed to have accepted the terms by continuing to use the site. If you read those terms, they may require attribution.

Like licenses, contracts suffer from a number of potential downsides. For one, they likely impose confusing obligations on users who get data from a variety of sources, all subject to different user agreements. This problem is even more pronounced with contracts because at least public licenses are somewhat standardized. User agreements are not, which means each data source likely has a different user agreement, filled with legalese imposing attribution and other obligations on users. The consequence is that some data sources may not be used simply because users cannot understand the terms.

Another limit to contract law is that it only binds the parties to the agreement. That may sound obvious, but this is not the case with licenses. If someone obtains licensed data and shares them, the person who obtains them from that second user is still bound by the conditions of the license. If the data were shared by contract alone, the person who obtained the data from the second user would not be bound by the terms of the contract because they were not a party to the original agreement. In this respect, contracts have a more limited reach than licenses.

In a different respect, contracts have a broader reach than licenses. Because they are not tied to an underlying right, contracts can impose obligations on actions that are not restricted by copyright or database rights. The effect could be to restrict or take away important rights granted to the public. For example, in 2011, the Government of Canada launched an open data portal with a related contract controlling access to the data. This agreement initially had a provision that forbade any use of the data that would hurt the reputation of Canada. This requirement created an uproar and was changed within a day. Nevertheless, this example shows the potential for overreaching. This sort of thing is particularly troublesome in the context of standardized contracts, where the terms are rarely read and almost never negotiated.

Waivers

The last legal mechanism is the waiver. Waivers can take many forms, but the purpose is to dedicate the data to the public domain.

Waivers are not enforceable in every jurisdiction. To deal with this problem, CC has created a tool called CC0 (read CC Zero) that uses a three-pronged approach designed to make it operable worldwide. The first layer is a waiver of copyright and all related rights. If the waiver fails, CC0 has a fall-back license that grants all permissions to the data without any conditions. As a final backup, CC0 contains a non-assertion pledge, where the rights holder promises not to assert rights in the data.

Obviously, waiving rights to a dataset means the provider no longer has control over it. Among other things, that means the data provider cannot require attribution (although they can certainly encourage it). Yet, as mentioned above, nearly every approach requires losing some measure of control over the data. Waivers also provide legal certainty in a way that contracts and licenses do not. There is no need to try to decipher the scope of copyright protection or consult a lawyer. Nor is there a need to try to parse the legalese of a variety of different user agreements. Note that this certainty does not exist when data are released without any legal mechanism. The silent approach leaves people guessing about whether property rights exist in the dataset and whether they risk liability by using it.

To summarize, each approach has consequences. With licenses, we face legal uncertainty about the scope of the license, and we risk imposing attribution requirements that are inconsistent with relevant community norms and expectations. With contracts, we gain some measure of legal certainty, but we risk imposing even more burdensome attribution obligations as each institution or data provider creates its own contractual terms. Contracts also pose the risk of overreaching and imposing obligations that may restrict important rights of users. Waivers avoid the problems associated with licenses and contracts, but they require giving up control.

It is important to remember that there is no mechanism that can impose legally binding obligations in a way that perfectly maps to our expectations for receiving credit or accepted standards of citation. By trying to use the law for control, we risk imposing unnecessary transaction costs on data sharing. We also potentially push people away from using our data sources. Choosing the right approach requires an understanding of the consequences. The conversation at this workshop is a good start.


