Intellectual Property Rights in Data
JULIE E. COHEN and WILLIAM M. MARTIN
In 1994, Matthew Zeidenberg purchased a compact disc containing phone listings and related information compiled from more than 3,000 telephone directories. Zeidenberg copied the data to his own Web site, which he planned to use for commercial purposes. The manufacturer of the compact disc, ProCD, sued Zeidenberg and ultimately succeeded in stopping him from selling the data, but not because of any protection afforded by intellectual property law. Both the district court and the court of appeals agreed that ProCD’s database did not merit copyright protection. However, the court of appeals found that the “shrinkwrap license” that accompanied the product was valid and bound Zeidenberg not to redistribute the information (ProCD, Inc. v. Zeidenberg, 1996). This decision has proved extremely controversial. Many legal commentators have criticized the court for allowing ProCD to use a mass-market, standard-form license to confer upon itself broader protection than federal intellectual property law would allow. Others have praised the court for allowing ProCD to do what was necessary to protect its investment.
The Seventh Circuit’s decision in ProCD v. Zeidenberg and the controversy that followed it illustrate both halves of a growing problem concerning legal protection for databases. The first half of the problem concerns the difficult position in which database creators find themselves. Current intellectual property paradigms were not designed for an information economy (Reichman and Samuelson, 1997). Unlike holders of more traditional types of intellectual property, database creators enjoy only limited protection under the federal intellectual property laws, and so have turned to contracts to protect their investments. The second half of the problem lies in the proposed solutions. Both standard-form contracts (shrinkwrap licenses) and legislative schemes that would grant property rights to database creators ultimately may undermine the broader purpose of intellectual property law. Given the particular structure of the database market, granting broad rights in information invites database creators to price according to profit rather than according to cost. These solutions to the problem of database protection may actually discourage knowledge sharing and hinder research and development (Samuelson, 1997).
SHORTCOMINGS OF THE PRESENT FRAMEWORK
The current framework for legal protection of databases is ill-suited to promote progress in the field of industrial ecology and allied disciplines. Industrial ecology looks to develop a systemic understanding of industrial processes, with the ultimate goals of optimizing material use and minimizing pollution and waste. This inquiry depends on access to data about how the industrial complex functions. Much of the data now being generated is not readily available. This problem may be attributable in part to the current framework of intellectual property law, which was not designed to protect and encourage the dissemination of compilations of factual information.
Consider the example of the intelligent vehicle highway system (IVHS). As proposed, this system would use sensors in the road and in vehicles, in conjunction with global positioning system satellite signals, to relay information to and from vehicles (Dingle, 1995). Such a system might collect a tremendous amount of valuable data, which could then be made available to various interested entities. For example, state departments of transportation might want to know how many vehicles use each highway so they could make better predictions regarding road repair and resurfacing. Civil engineers might be interested in information that could help improve highway safety and efficiency. The Environmental Protection Agency (EPA) might want information bearing on pollution levels, such as the number and types of vehicles that travel at particular times of the day. Public transit authorities could use the information to design bus schedules that better reflect commuter demand, and toll collection authorities could use the information for electronic debiting of tolls.
The data could also be valuable to nongovernmental entities. Advertising agencies seeking to determine the best locations for billboards might want to know traffic volumes and average speeds. Companies that produce traffic updates might want access to real-time data for radio news reports. Car manufacturers might use the information to design vehicles better suited to the actual driving conditions found on highways, resulting in safer and more fuel-efficient vehicles.
What part of an IVHS might benefit from patent protection? The electronics that actually monitor the vehicles could be patentable if they operate in a new and nonobvious way. Likewise, the structure of the database might be patentable if the data were stored in a technically novel and nonobvious structure. That protection, however, would not extend to the most valuable aspect of the database: the data.
Even if databases could be patented, however, patent law is not responsive to the concerns of database creators. It can take years for a patent application to be processed. In the case of the IVHS, database creators will want protection the very week or day that the data are first gathered. Also, to obtain a patent, the invention must be clearly defined to the Patent and Trademark Office so that competitors and the public know what the claimed invention encompasses. This can be difficult with a database that may undergo constant change. Data gathered on the day after the patent filing date would render the patent application incorrect, or would be part of a new database.
Copyright law is equally ill-suited to protect databases. Copyright law expressly bars protection for ideas, functional principles, and facts; instead, the purpose of copyright is to protect original expression. In 1991, the Supreme Court held that a compilation of facts is copyrightable only if the selection or arrangement “possesses at least some minimal degree of creativity” (Feist Publications, Inc. v. Rural Telephone Service Co., 1991). For the white pages telephone directory at issue in Feist, there was no way to organize the listings, other than the obvious alphabetical-by-surname method, and still have a directory that would be usable by customers. Because the Court grounded its holding not only in the Copyright Act, but also in the Intellectual Property Clause of the Constitution, Feist places significant limits on the ability of database makers to obtain copyright protection. Quite often, the obviousness or predictability of the selection and arrangement of data equates to usefulness, and therefore marketability. In a real sense, the database most easily copyrighted is the one that is least marketable.1
What aspects of an IVHS database might be copyrightable? One example might be a report that summarizes statistics about road usage for a particular month. The particular statistical facts cited in the report could not be protected, but the expressive aspects of the author’s arrangement of them, as well as his or her particular expression of the conclusions derived from the facts, could be protected. This level of protection is of little value to the creator of the IVHS database. There is no single report, arrangement, or expression of the data that captures the essential value of the database. A marketable digital database must be able to present data in many different and useful arrangements.
Turning to state law, there are two theories—trade secrecy and the tort of misappropriation—that under some circumstances provide protection for databases. Unfortunately, neither one is well tailored to the needs of database creators.
A trade secret can be any information that provides an economic advantage to a business relative to its competitors. The information cannot be generally known or easily ascertainable, and reasonable precautions must be taken to maintain its secrecy. The formula used to make Coca-Cola is one such trade secret. Courts will sometimes evaluate a company’s precautions to determine whether or not the company itself believes the information has value: if the company does not consider it worth protecting, why should the law? In the case of Coca-Cola, the formula is in a bank vault that can be opened only upon instructions from the company’s board of directors.
The secrecy requirement is difficult to meet for databases that are designed to be marketed or shared, as in the case of the IVHS. For example, if the EPA acquires emissions information from the IVHS database creator and publishes it on the World Wide Web so that individual neighborhoods can monitor their exposure levels, the information is no longer secret. The database creator must walk a tightrope between preserving secrecy and making the information usable for its customers. The law requires actions that run directly counter to the motivation that caused the database creator to seek legal protection in the first place.
The database creator could seek to maintain a veil of secrecy by relying on contracts that prohibit each customer from disclosing the information and require the customer to adopt precautions against disclosure. This method can be effective when the number of customers is small and customers have no need to share information among themselves. However, a system such as the IVHS, with a complex web of interested parties, quickly becomes ill-suited to a contracts-based solution because of the high transaction costs involved in making and monitoring each agreement and controlling data exchange between customers. To simplify matters, the database creator might prohibit the lateral exchange of information between customers, but this would decrease the value of the data. In short, it is difficult to reconcile the strictures imposed by trade secrecy law with the goals and potential benefits of an IVHS database.
Trade secrecy law, moreover, does not give a monopoly. It simply limits the means that a competitor may employ to learn secret information. Improper methods include economic espionage, deception, bribery, and breach of contract. Once information has been made public by any of these means, however, the secret is lost. The trade secret owner may be entitled to a monetary remedy, but that may be scant comfort if the secret was a source of continuing value. In addition, it is completely legal to use information acquired by proper methods, such as reverse engineering.
The second area of state law that bears on the protection of databases is the tort of misappropriation, which is based on the Supreme Court’s decision in International News Service v. Associated Press (1918). The International News Service (INS) had been barred from the European theater of combat and so was limited to gathering news of World War I from its competitors. INS operatives copied Associated Press (AP) newswires from bulletin boards maintained by AP affiliates on the East Coast and then telegraphed the stories to INS newspapers on the West Coast. For obvious reasons, AP sought to prevent INS from continuing this practice. The Supreme Court agreed with INS that the news was not subject to copyright, but it nonetheless held that INS’s conduct constituted actionable unfair competition because it undercut AP’s incentive to gather the news.
For the most part, the rule announced in INS has been applied narrowly to protect only “hot news” or other time-sensitive information. Much of the information that is marketed in database form lacks this quality. The IVHS database, for example, would be valuable in part because it would accumulate data over an extended period, permitting longitudinal study of traffic patterns and emissions problems. Moreover, broader application of the misappropriation tort to data that are not time sensitive may be preempted by the federal Copyright Act (National Basketball Ass’n v. Motorola, Inc., 1997).
PROPOSED STATUTORY PROTECTION FOR DATABASES
Because neither federal nor state intellectual property law provides satisfactory protection for databases, lawmakers and database creators have sought to create a sui generis statutory regime of legal protection for compiled information. Some of the proposed regimes, however, threaten to be worse than the situation that they are intended to remedy.
Part of the impetus for sui generis database protection comes from Europe. In 1996, the European Commission adopted the Directive on the Legal Protection of Databases, which requires member states to enact legislation granting database creators the “right to prevent extraction and/or reutilization of the whole or of a substantial part, evaluated qualitatively and/or quantitatively, of the contents” of a database (EC, 1996). To gain this protection, the database creator must establish only that there has been “a substantial investment in either the obtaining, verification, or presentation of the contents.” The term of protection is 15 years, but is renewable whenever the database holder makes “[a]ny substantial change, evaluated qualitatively or quantitatively, to the contents of the database.” This proviso makes the term effectively perpetual because a compiler need only add more data in order to renew protection for the entire database.
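The renewal mechanism can be made concrete with a small calculation. The following Python sketch is an illustration only, not a statement of how any member state computes terms. It assumes, as described above, a 15-year term that restarts for the entire database whenever a substantial change is made, and it tracks calendar years only. The function name and the example years are hypothetical.

```python
TERM_YEARS = 15  # term of protection as described in the text


def expiry_year(created_year, substantial_change_years):
    """Return the year protection would end under the simplified rule.

    Assumption: every substantial change restarts the full term for the
    whole database, so only the most recent qualifying year matters.
    """
    latest = max([created_year, *substantial_change_years])
    return latest + TERM_YEARS


# A database that is never updated is protected for one term.
static_expiry = expiry_year(1998, [])

# A database that receives a substantial update every ten years never
# reaches expiry while the updates continue.
rolling_expiry = expiry_year(1998, [2008, 2018, 2028])
```

Under these assumptions the static database would lose protection in 2013, while the periodically updated one would still be protected until 2043, and each further update would push that date out again. This is the sense in which the term is effectively perpetual.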
Noteworthy for the United States is that the Database Directive includes a strong reciprocity provision. Protection afforded by European Union member states under the new legislation will not be available to foreign companies from nations that have not provided comparable protection. American database companies and their lobbying organization, the Software & Information Industry Association, have invoked the European reciprocity provision to justify the enactment of legislation granting broad property rights in compilations of data. Thus far, however, their efforts have been unsuccessful.
Legislation to create a property right in databases was first introduced in Congress in 1996 (H.R. 3531, Database Investment and Intellectual Property Antipiracy Act). This bill would have granted rights substantially similar to those afforded under the European Database Directive. H.R. 3531 differed from the Database Directive, however, in that the Database Directive authorizes member states to enact limited “fair use” exceptions to database creators’ exclusive rights, whereas H.R. 3531 contained no such provision. Because of its breadth and inflexibility, H.R. 3531 quickly encountered strong opposition. In particular, organizations such as the National Education Association, the American Library Association, the National Academy of Sciences, and the National Academy of Engineering expressed concern that the bill would undermine the nation’s research capability because of the potential restrictions on access to data (Samuelson, 1997). As a result, H.R. 3531 remained tabled in subcommittee for the remainder of the 104th Congress.
Also in the fall of 1996, the Clinton administration and the European Union submitted proposals for a database protection treaty to the World Intellectual Property Organization (WIPO) for consideration at WIPO’s December 1996 conference (WIPO, 1996). The treaty language proposed by the United States was nearly identical to that of H.R. 3531. The U.S. Patent and Trademark Office Commissioner Bruce Lehman, who headed the U.S. delegation to WIPO, admitted that the administration’s treaty proposals (the database proposal and a proposed copyright treaty, the terms of which also had failed to secure congressional approval) represented “a second bite at the apple” (Samuelson, 1997). Predictably, the proposed database treaty encountered severe criticism from the same organizations that opposed H.R. 3531, as well as from their international counterparts and a number of developing nations. Lacking consensus on what, if anything, to do about legal protection for databases, the WIPO delegates set the issue aside for further study (Samuelson, 1997).
In the 105th Congress, proponents of strong database protection introduced H.R. 2652, the Collections of Information Antipiracy Act. In addition to arguing that the property right contemplated by H.R. 3531 was overbroad, opponents of H.R. 3531 also had argued that the bill would contravene the Intellectual Property Clause of the Constitution, which (per Feist) precludes grants of exclusive rights in facts. H.R. 2652 was billed as a response to both criticisms. Ostensibly, H.R. 2652 would have created a misappropriation tort based on specified unfair conduct rather than an absolute property right in compiled data. As written, however, the bill was as broad as the previous one.
H.R. 2652 would have protected any “collection of information gathered, organized, or maintained…through the investment of substantial monetary or other resources” against conduct that threatened an actual or potential market for the database. The bill set no limit on the type of data eligible for protection, few limits on the kinds of uses that might trigger liability, and no term after which the protection would expire. The bill did include a fair use exception allowing extraction of data for educational or research use, but the proviso that the use “not harm the actual or potential market” for the database indicated a very limited range of permitted uses.2 Thus, as a practical matter, the bill was no different from H.R. 3531. It would effectively have granted a monopoly right; a database maker could prevent anyone from extracting, using, or reusing any part of the database deemed “substantial.”
H.R. 2652 died at the close of the 105th Congress, but was reincarnated as H.R. 354 shortly after the 106th Congress convened in early 1999. This time, the bill faced competition; the powerful House Commerce Committee backed an alternative database protection bill, H.R. 1858. H.R. 1858 would have prohibited only the distribution of a duplicate of a database in competition with the maker of the original database, and thus would have granted only limited rights to control derivative markets and value-added uses. In addition, it was drafted to preserve substantially greater scope for fair academic and research use of duplicated information. The major national scientific and research associations, including the National Academies, the Association of American Universities, and the American Library Association, also supported H.R. 1858. The database industries and the House Judiciary Committee, however, remained committed to the basic framework set forth in H.R. 354, and the 106th Congress ended as it began, with no resolution of the database protection issue. The chair of the House Judiciary Committee’s Subcommittee on Courts, the Internet, and Intellectual Property for the 107th Congress, Rep. Howard Coble (R-NC), has vowed to reintroduce the Collections of Information Antipiracy Act yet again, and to continue seeking strong, property-like protection for databases.
The framework set forth in the Collections of Information Antipiracy Act (a very strong legal monopoly, coupled with a low standard to qualify and a likely infinite period of protection) is problematic for several reasons. First, the sole basis for the proposed grant is substantial monetary investment by the database creator; no showing of innovation is required. This scheme is paradoxical: The protection rivals that afforded by the patent laws, but unlike the patentee, the rights-holder need not demonstrate that the subject matter constitutes a contribution to society. Second, granting broad and perpetual monopoly rights likely will encourage excessive rent seeking by firstcomers. Database creators will be able to deny the use of “their” data in subsequent compilations or applications, conceivably forever. The important social benefits arising from cumulative and sequential innovation will become subject to the profit motive of rights-holders (Reichman and Samuelson, 1997).
A fundamental principle of intellectual property law is that no one should be given a monopoly on facts, ideas, or other building blocks of knowledge, thought, or communication. This principle underlies the idea-expression distinction in copyright law and its corollary, the merger doctrine, which denies protection to expression that is inseparable from the underlying idea. This is also the reason for denying patent protection to basic principles of science, such as Einstein’s theory of relativity or the laws of thermodynamics. The Collections of Information Antipiracy Act attempts no comparable separation of protectable and unprotectable aspects of databases, but asks only whether a challenged use is “substantial.” Thus, it appears that if the bill ever becomes law, protected databases will contain no substratum of public domain information that would be available to scientists and other researchers without the rights-holders’ permission.
THE PRIVACY PROBLEM
An additional legal consideration that bears on the compilation, use, and sale of data is individual privacy. Electronic databases of information pertaining to individual actions and transactions are potentially quite valuable, but also potentially invasive on an unprecedented scale. For example, various third parties might be interested in purchasing personal identifying information from an IVHS database, including retailers of car accessories (to sell such items as mobile phones and sound systems to people with longer commutes), private detective agencies (to track the movements of particular individuals), auto insurance providers (to determine whether those insured are driving safely and within speed limits), and state highway patrols (to catch speeders and car thieves). Each of these uses raises a host of difficult legal issues.
Governmental purchases of information for law enforcement purposes must be assessed according to constitutional standards. For example, the question whether a state may use IVHS data to apprehend speeders depends, at least in part, on whether such action would amount to an unreasonable search under the Fourth Amendment. In addition, some federal statutes limit the kinds of data that the federal government can collect from individuals; it remains to be seen whether these provisions apply equally to government purchases of personal identifying data from third parties. Discussion of these questions is outside the scope of this paper.
In the United States, there have been few restrictions on the acquisition and use of personal identifying information by nongovernmental entities. However, this situation is changing, again because of pressures originating in Europe.
In 1995, the European Commission enacted the Directive on the Processing of Personal Data, which required member states to adopt implementing legislation no later than October 1998 (EC, 1995). The Personal Data Directive provides that every collector or third-party recipient of personal identifying data must be required to disclose its identity and the existence of the data to each individual identified by the data. Individuals must be allowed to access the data, discover the sources and recipients of the data, and correct any inaccuracies. Individuals must also be given the right to opt out of the use or disclosure of personal data for direct marketing purposes, as well as the right to challenge other practices relating to data collection and use. The Personal Data Directive prohibits the transfer of data to countries that lack adequate privacy protection for individuals. When the Directive was enacted, European Union officials indicated that they considered the United States to be one such nation.
In 2000, the European Union and the United States reached agreement on a set of “safe harbor” information practices for United States companies and organizations receiving personal data concerning European Union nationals. It is too early to predict whether the safe harbor policy will be effective, or whether the United States will adopt similar measures to protect the privacy of United States citizens. Thus far, the government has appeared to favor decentralized solutions to domestic privacy problems, such as the adoption of voluntary codes of conduct by database creators, but this may change if the safe harbor effort fails, or if popular outrage at perceived intrusions increases.
DESIGNING APPROPRIATE LEGAL PROTECTION FOR DATABASES
The European database protection scheme and the protections proposed in Congress have in common flaws that are intrinsic to a property-based view of information. A preferable system would be designed expressly to balance the interests of database creators with those of society, rather than relying on market forces to accomplish this balancing. What might such a system look like and how would it function? J.H. Reichman, professor of law at Duke University, and Pamela Samuelson, professor of law and information management at the University of California at Berkeley, have proposed one such alternative. They call their proposal a “modified liability approach” because it is based on liability rules (Reichman and Samuelson, 1997).
Liability rules differ from property rules primarily in the absence of a right to exclude. For example, a person who has a property right in a bicycle can deny anyone the use of that bicycle. Under a liability rule system, she would have no such right. Instead, she would simply be entitled to compensation for any use of the bicycle by others. The proper amount of compensation could be determined by the bicycle owner, based on her expected costs and desired profits, or by a court in the course of resolving the bicycle owner’s claim for damages, or by a government regulatory body.
The modified liability approach proposed by Reichman and Samuelson would consist of two phases of protection. The first phase would consist of a “blocking period” designed to preserve a certain amount of lead time for the database creator. A property rule would apply during this period, and competitors would not be permitted to use or copy the new database without the database creator’s consent. Reichman and Samuelson recognize that in traditional manufacturing there exists a period of natural monopoly afforded by the developer’s lead time—the period necessary for competitors to duplicate the new product. They conclude that a regime implementing this dynamic in the database industry would permit database creators to recover what may be significant research and development costs. This would prevent the market failure that might otherwise occur if a competitor could appropriate the database, at minimal cost to itself, and then undercut the originator’s prices.
The length of the artificial lead time period afforded under the Reichman-Samuelson approach would be very short, however, for two reasons. First, they argue that the market forces that ordinarily would limit a property owner’s ability to reap excessive profits do not exist in the information marketplace. Due to high entry costs, there is a tendency in the database industry for market segments to be left unchallenged once one developer has made a substantial investment in that area. As a result, the database industry is characterized by an absence of direct competition. In such a situation, market forces alone cannot be relied on to allocate resources to would-be users according to their fair value. Second, as in any monopoly situation, the database market is threatened by excessive rent seeking. The most direct method for avoiding these market failures is to set a time after which the database owner’s right to exclude expires.
The initial blocking period afforded under the modified liability approach would be followed by an automatic license. Absent some other agreement, the database creator would be obligated, at minimum, to share the data with all secondcomers at rates established by a regulatory body composed of industry representatives and government officials. The ground rules for compensation would be designed to promote competition in the database industry and would permit adjustment of the liability framework when necessary because of changed market conditions. Compensation to owners would be tied to two criteria: (1) their costs for initial research and development and ongoing maintenance and (2) an evaluation of the relative significance of the borrowed content and the value added by the secondcomer. If the secondcomer appropriated the entire database and added little or nothing to it, the rate due the original database creator would be high. Conversely, if the secondcomer added substantial value, the rate paid would be low.
Reichman and Samuelson would not establish a strict compulsory license. They envision their framework simply as setting the baseline obligation for each party, while allowing bargaining for different terms. To prevent abuses of market power by database creators, however, they would require binding arbitration if bargaining failed to generate an agreement acceptable to both parties.
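The two phases and the two compensation criteria can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the formula Reichman and Samuelson propose: they describe the criteria but give no equation here, and the text says only that the blocking period would be "very short." The class, the field names, the cost-recovery horizon, and the multiplicative rate formula are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class DatabaseTerms:
    development_cost: float       # initial research and development (criterion 1)
    annual_maintenance: float     # ongoing maintenance (criterion 1)
    recovery_years: float         # hypothetical horizon for recouping development cost
    blocking_period_years: float  # hypothetical length of the initial blocking period


def governing_rule(terms, years_since_release):
    """Phase 1 is a property rule (consent required); phase 2 is a liability rule."""
    if years_since_release < terms.blocking_period_years:
        return "property"
    return "liability"


def baseline_annual_rate(terms, borrowed_fraction, value_added_fraction):
    """Hypothetical baseline fee owed by a secondcomer in the liability phase.

    borrowed_fraction: share of the original database that is reused (criterion 2).
    value_added_fraction: share of the new product's value that the secondcomer
    contributed (criterion 2). Both must lie between 0 and 1.
    """
    for fraction in (borrowed_fraction, value_added_fraction):
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fractions must be between 0 and 1")
    annual_cost = terms.development_cost / terms.recovery_years + terms.annual_maintenance
    return annual_cost * borrowed_fraction * (1.0 - value_added_fraction)


# Example: a database that cost 1,000,000 to build and 50,000 a year to maintain.
example = DatabaseTerms(1_000_000, 50_000, recovery_years=10, blocking_period_years=2)
wholesale_copy = baseline_annual_rate(example, borrowed_fraction=1.0, value_added_fraction=0.0)
value_added_use = baseline_annual_rate(example, borrowed_fraction=0.2, value_added_fraction=0.75)
```

With these made-up figures, the wholesale copier owes the creator's full annual cost, while the secondcomer who borrows little and adds much owes far less, which matches the direction of the rule described above. Consistent with the proposal, such a figure would serve only as a baseline: the parties could bargain for different terms, with binding arbitration if bargaining failed.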
CONCLUSION
As we have shown, current legal protection for databases and proposed property-based regimes for database protection are equally unsatisfactory. Current intellectual property law affords insufficient protection for those who invest time, effort, and money in collecting and compiling data. As a result, database creators increasingly rely on broad and arguably abusive standard-form contracts. Proposed solutions based on property rules, however, would vest in database creators extremely broad exclusive rights in basic knowledge, a result contrary to society’s interests.
Under a modified liability approach, the database creator would recover its investment in the compilation process, but the data would remain publicly accessible on fair and reasonable terms. By setting reasonable limits on the power of database creators to exclude and/or charge monopoly rents, this framework would serve society’s interests in knowledge sharing, research, and development, as well as database creators’ legitimate interests in recouping their development costs.
REFERENCES
CCC Information Services v. Maclean Hunter Market Reports, Inc., 44 F.3d 61 (2d Cir. 1994).
Dingle, J. 1995. FHWA, IVHS, and privacy. Santa Clara Computer & High Technology Law Journal 11(1):15–20.
EC (European Commission). 1995. Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the Protection of Individuals with Regard to the Processing of Personal Data and on the Free Movement of Such Data. Geneva: EC.
EC. 1996. Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the Legal Protection of Databases. Geneva: EC.
Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991).
International News Service v. Associated Press, 248 U.S. 215 (1918).
National Basketball Ass’n v. Motorola, Inc., 105 F.3d 841 (2d Cir. 1997).
ProCD, Inc. v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996).
Reichman, J.H., and P. Samuelson. 1997. Intellectual property rights in data? Vanderbilt Law Review 50(1):51–166.
Samuelson, P. 1997. Big media beaten back. Wired 5.03. Online. Available: http://www.wired.com/wired/5.03/netizen.html [1997, March]. Accessed 3/5/2000.
WIPO (World Intellectual Property Organization). 1996. Basic Proposal for the Substantive Provisions of the Treaty on Intellectual Property in Respect of Databases to be Considered by the Diplomatic Conference. Document CRNR/DC/6. Geneva: WIPO.