National Academies Press: OpenBook

Proceedings of the Workshop on Promoting Access to Scientific and Technical Data for the Public Interest: An Assessment of Policy Options (1999)

Chapter: Appendix C: Raw Knowledge: Protecting Technical Databases for Science and Industry

Suggested Citation:"Appendix C: Raw Knowledge: Protecting Technical Databases for Science and Industry." National Research Council. 1999. Proceedings of the Workshop on Promoting Access to Scientific and Technical Data for the Public Interest: An Assessment of Policy Options. Washington, DC: The National Academies Press. doi: 10.17226/9693.

Appendix C

RAW KNOWLEDGE: PROTECTING TECHNICAL DATABASES FOR SCIENCE AND INDUSTRY

Stephen M. Maurer, Attorney-at-Law

NOTE: The author wishes to thank the National Research Council for commissioning this study and to acknowledge helpful conversations with Suzanne Scotchmer, Jeannette Balko, D. Ben Borson, Jack Brown, Richard Firestone, Richard Gilbert, Karl Kenna, Elizabeth Powers, Jerry Reichman, Kenneth Rosenblatt, Pamela Samuelson, John Stattler, Tom Slezak, Paul Uhlir, and Joel White. The author is solely responsible for all opinions, errors, and omissions contained herein.

This background paper was prepared by Stephen M. Maurer for the National Research Council's Committee on Promoting Access to Scientific and Technical Data for the Public Interest and its January 14-15, 1999, workshop on the same subject. Please note that a number of exhibits were prepared as attachments to this paper; these exhibits are available for viewing in the National Research Council's Public Access Records Office.

SUMMARY

As usually defined, “databases” include numerical data, text, images, and any other “organized collection of information.” Because enormous numbers of products fit this description, it is sometimes hard to think about such apparently straightforward questions as, “Is existing legal protection adequate?” or, “Could it be improved?” This paper tries to make matters more concrete by examining existing databases and how they are produced. The results are then used as a benchmark to evaluate potential legislation. Special attention is paid to features and problems that set scientific/technology databases apart from other products.

The world of scientific and technology databases is already extremely rich and well-developed. Since the U.S. government has never enacted database legislation, this presents a paradox: If existing databases can be freely copied, why do firms continue to invest in them? The answer is that database providers have devised a bewildering number of unofficial (“self-help”) methods for protecting their investments. These include but are not limited to (1) bilateral agreements with users, (2) “shrink-wrap” or “click-wrap” language, (3) bundling with copyrighted materials, (4) continual updating and improvement that leaves would-be copiers “out of date,” (5) search-only Web sites where the underlying database cannot be downloaded, and (6) passwords and encryption. The fact that rich and diverse databases exist in today's world shows that such protection can be extremely robust. At the same time, self-help strategies may cause undesirable distortions in the economy, particularly when they discourage database suppliers from sharing products with a wider audience. Even more insidious, lack of statutory protection may mean that some databases are never created in the first place.

Scientific and technology databases present unique needs and problems. These include

  • The need to assure private firms that they can profitably invest in commercializing and extending government databases for use by a broader audience;

  • The need to keep database prices within the reach of academic users, who have traditionally driven most advances in basic knowledge;

  • The scientific community's need for value-added or edited databases that not only collect but also update, cross-check, comment on, and try to reconcile reported results;

  • The fact that virtually all scientific databases have historically been created by combining and extending earlier data sets; and

  • The scientific community's need for full and unrestricted access to data, which inevitably conflicts with self-help strategies based on secrecy or partial disclosure.

The modern history of database reform begins with the U.S. Supreme Court's 1991 decision in Feist Publications, Inc. v. Rural Telephone Service Co., which restricted “sweat-of-the-brow” protection under copyright in the United States. This was followed by the European Union's (E.U.) 1996 Directive on Databases, which required member countries to expand their statutory protection of databases. The E.U. Directive also contained a controversial threat that citizens of countries (including the United States) that did not adopt E.U.-style statutes would not be protected by the new laws when they took effect. Because of the E.U. Directive, the U.S. Congress introduced European-style legislation in 1996 and again in 1997-1998. Scholars have also suggested alternatives to the European model.

Existing reform proposals can be broadly summarized as (1) de minimis changes to existing law, (2) “unfair competition” schemes that would examine the need for protection on a case-by-case basis, and (3) so-called sui generis protection that would give database owners strong property rights modeled on the E.U. Directive. The principal difficulty has been to reconcile these proposals with the public-domain principle that “mere facts” cannot be protected. Although this is an old problem, courts were frequently able to avoid it in the past because copyright and patent law protected only a small fraction of all possible commercial knowledge. Comprehensive database protection would turn the situation on its head by making virtually all facts protectable as “organized collections of information.”

In the final analysis, the policy debate for and against database protection cannot be settled by purely legal considerations. Instead, the underlying question is largely empirical. If free ridership turns out to be a problem for all databases, then some sort of additional protection should be enacted. But if free ridership is only “sometimes” or “never” a problem, reform should be much more cautious. The fact that such questions have so far received relatively little attention makes the committee's work especially timely and represents a valuable opportunity to advance debate in this area.

PART I. TODAY'S DATABASES

The concept of a “database” is usually defined quite broadly. For example, one typical formulation describes a database as “any organized collection of information,”1 even though the same phrase could just as easily describe intellectual property in general. The problem is that such definitions are too broad to provide a concrete sense of which databases actually exist in today's economy or why they should be protected.

Part I of this paper tries to make the concept of a database more concrete through examples, anecdotes, and case studies. By way of background, Examples 1, 2, and 3 describe some nontechnical databases that are available on CD-ROM, over the Internet, and in print. Examples 4 through 8 continue the discussion by describing an assortment of databases drawn from the physical sciences, biotechnology, and engineering. The final section collects and comments on various lessons learned from these examples. The lessons provide a benchmark for evaluating proposed reforms later in this paper.

Some Commercial Databases
Example 1: A Sampler of CD-ROMs

As of January 1995, the authoritative Gale Directory of Databases listed 9,385 electronic databases for sale by commercial vendors. The list was further subdivided by format, including online and CD-ROM. Table C.1 analyzes a sample of 100 databases randomly selected from the catalog's CD-ROM listings.2

TABLE C.1 A CD-ROM Sampler

| Vendor/Type | Numerical Data and Directory | Software | Bibliography | Text, Image, and Multimedia |
|---|---|---|---|---|
| Government provider | 6 | 0 | 1 | 0 |
| Commercial provider of public domain data | 0 | 1 | 6 | 8 |
| Commercial provider of public domain data enhanced with proprietary software or other features | 7 | 0 | 0 | 0 |
| Commercial provider of original data | 9 | 3 | 21 | 38 |

The fact that such a rich and diverse selection of databases has evolved without statutory protection is striking. At the same time, Table C.1 illustrates the fact that database suppliers use a variety of nonstatutory strategies to protect their products:

  • Copyrighted Content. One of the most surprising aspects of Table C.1 is that most products continued to follow traditional print-based models. For example, nearly half of the sample (46 percent) consisted of text, image, and multimedia—predominantly electronic versions of books, journals, and newspapers. Virtually all of these materials are individually protected by copyright whether or not they are included in a database.

  • “Free” Counterparts. Another way to look at text, image, and multimedia is that the cost of producing them electronically tends to be small once print-based counterparts already exist. This makes electronic databases extremely tempting to would-be providers.

  • Updating. Seventy-three percent of the products listed in the sample were regularly updated on a quarterly or annual basis. This practice makes it extremely difficult for would-be copiers to sell a current product.

  • Enhancements. Many CD-ROM databases were packaged with advanced (and presumably copyrighted) search software. For example, many of the numerical data and directory products combined public-domain data with advanced software for making customer lists and address labels or performing searches. Since the copyright laws protect software, the presence of such enhancements forces would-be copiers to choose between selling a visibly inferior product and making investments of their own.

  • Reprints. Finally, some large providers were able to sell “reprints” of government databases despite the fact that this information is freely copyable. The tactic appears to work because large providers have advantages of scale when it comes to finding widely scattered consumers in a “thin” market. By contrast, would-be copiers tend to be too small to locate these same consumers for themselves.3
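The percentage quoted for text, image, and multimedia products can be rechecked directly from the counts in Table C.1. The short Python sketch below is only an illustration: the counts are transcribed from the table, while the names `TABLE_C1`, `COLUMNS`, and `column_shares` are hypothetical and do not come from the original study.

```python
# Counts transcribed from Table C.1 (rows: provider type; columns: product type).
# All identifiers here are illustrative, not part of the original paper.
COLUMNS = [
    "Numerical data and directory",
    "Software",
    "Bibliography",
    "Text, image, and multimedia",
]

TABLE_C1 = {
    "Government provider": [6, 0, 1, 0],
    "Commercial provider of public domain data": [0, 1, 6, 8],
    "Commercial provider of enhanced public domain data": [7, 0, 0, 0],
    "Commercial provider of original data": [9, 3, 21, 38],
}


def column_shares(table):
    """Return each product type's share of the whole sample, in percent."""
    total = sum(sum(row) for row in table.values())
    column_totals = [sum(column) for column in zip(*table.values())]
    return {
        name: 100 * count / total
        for name, count in zip(COLUMNS, column_totals)
    }


shares = column_shares(TABLE_C1)
for name, percent in shares.items():
    print(f"{name}: {percent:.0f}%")
```

Because the sample contains exactly 100 products, each column total is also its percentage share; the same function would apply unchanged to a sample of a different size.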

It may be significant that the Gale Directory showed no obvious difference between CD-ROM products and those available online. As explained below, the Web offers several distinct technical advantages for self-help security. The fact that providers chose to forgo these advantages shows that security can be accomplished in various ways.

Example 2: An Internet Sampler

Table C.2 summarizes 100 Web sites obtained by searching for the word “databases” on the Infoseek search engine. Table C.2 significantly expands the list of self-help strategies found in Table C.1:

  • Passwords and Two-tier Access. Perhaps the most traditional way to protect databases is to use passwords. Many Web sites provide free samples of password-protected data.

  • Search-Only Web Sites. The most common form of self-help found in the sample was for users to submit requests to the vendor, who would then perform searches on their behalf. This provides stronger protection against piracy than passwords.4

  • Clearinghouses. Some databases earn income by selling listings instead of charging user fees. The classic example is a job agency, in which employers pay for ads that are then distributed to the public without charge.

  • Product Ties and Come-ons. Many Web databases are offered free as an inducement to purchase related products. In such cases, the producer provides data without charge in order to promote his core business.5

TABLE C.2 An Internet Sampler

| Provider Type | Bulletin Board in Which Individual Needs Are Posted in a Single Place (e.g., Job Listings) | Compilations of Two or More Public Domain Databases | Original Data | Directory and Network Data that Identify Community Members to Each Other and/or the Public |
|---|---|---|---|---|
| Enthusiast | 0 | 6 | 0 | 0 |
| Government and education | 2 | 40 | 3 | 4 |
| Commercial provider: Access provided without charge | 6 | 0 | 7 | 3 |
| Commercial provider: Portions of data restricted to users who have purchased passwords | 0 | 0 | 8 | 0 |
| Commercial provider: Search only | 0 | 0 | 18 | 3 |

Finally, Table C.2 is a stark reminder that not all database providers want protection. This is trivially true for the enthusiast, government, and education providers whose missions are heavily slanted toward dissemination. A more subtle point is that the phenomenon also exists in the commercial field, where databases are frequently used as “market makers” to bring buyers and sellers together. It is an open question whether or not such players welcome copying (particularly when they receive attribution) as a way of reducing their own publication costs and/or reaching even larger audiences.6

Example 3: Other Types of Databases

Example 3(a): Dataquest. Even though all of the providers found in Table C.2 offered more or less standardized products, this is not the only business model available on the Web. For example, a consulting group known as Dataquest offers two types of proprietary information: (1) a library of 25,000 confidential reports that can be searched and downloaded over the Web at a cost of between $100 and $5,000 per item, and (2) custom research at a negotiated price. All of these products are subject to elaborate contractual safeguards governing each side's use and disclosure of the reports. Dataquest also sells “alert” services that notify users of developments in predefined areas of interest.7

Example 3(b): Info-Trac. One of the largest (and most useful) databases found in the course of preparing this report was a citation index called Info-Trac. Info-Trac is available both online and as a CD-ROM. Although Info-Trac is available to users (e.g., libraries) free or at nominal cost, it charges a substantial fee for copying hard-to-find articles.8 This is yet another example of using an essentially free database to market the seller's principal product.

Example 3(c): Paper-based Databases. The fact that Feist involved telephone books shows that paper-based databases are still important. Virtually all of the text and bibliographic products listed in Table C.1 have print-based counterparts.

Some Scientific Databases
Example 4: Some Electronic Database Samplers

Most of the examples listed below describe the creation, evolution, and/or capabilities of individual databases. The present section tries to set the stage by presenting broader, more impressionistic samplers of scientific and engineering databases offered over the Web or in libraries. Because the samplers show considerable overlap, they are discussed together at the end of this section.

Example 4(a): Physics. Table C.3 extends the previous discussion to the sciences by summarizing the online and CD-ROM databases offered by the University of California (UC) at Berkeley Physics Library, together with the results of a query to the Yahoo physics search engine for the word “database.”9

Because of their greater volume, the UC Berkeley and Yahoo resources for engineering are listed separately.

Example 4(b): Engineering (Library Resources). The UC Berkeley Engineering Library resources are given in Table C.4.

Example 4(c): Engineering (Internet Databases). Table C.5 summarizes the 71 relevant hits generated by polling the Yahoo engineering search engine for the word “database.”

TABLE C.3 A Physics Sampler

| Resource | Number | Comment |
|---|---|---|
| Online versions of print journals | 55 | Includes physics-related journals published by American Physical Society, Reed Elsevier, American Chemical Society, and American Astronomical Society. |
| Electronic preprint servers | 8 | Includes servers maintained by government laboratories and professional societies. The American Institute of Physics also offers its own e-journal. |
| Electronic abstracting and indexing databases | 2 | Consists of INSPEC database of 4,000 journals plus selected conferences, reports, dissertations, and books, and Web of Science database of 3,300 scientific and technical journals. |
| Other electronic resources | 12 | Includes large atomic, particle, and thermodynamic databases prepared by national labs and universities. |

TABLE C.4 An Engineering Sampler (UC Berkeley Resources)

| Resource | Number | Comment |
|---|---|---|
| Online versions of print journals | 60 | Includes journals published by professional societies (ACM, ACS, IEEE, Society of Industrial and Applied Mathematics) and private publishers (Academic Press, Elsevier, Springer, Wiley). |
| Electronic abstracting and indexing databases | 16 | Includes private, DOE, EPA, and National Technical Information Service publications. |
| Technical report databases (includes both indexed and full text) | 10 | Includes government sites and the Yahoo physics search engine. |

TABLE C.5 A Second Engineering Sampler (Internet Resources)

| Vendor/Type | Full Text | Original Data | Directory and Network Data that Identify Community Members to Each Other and/or the Public |
|---|---|---|---|
| Enthusiast | 0 | 4 | 0 |
| Government and education | 0 | 18 | 3 |
| Commercial provider/search only | 0 | 3 | 0 |
| Commercial provider/portions of data restricted to users who have purchased passwords | 0 | 10 | 0 |
| Commercial “public service” provider | 2 | 3 | 12 |
| Commercial database limited to provider's own products | 0 | 8 | 0 |
| Commercial database offered at no charge to sell enhanced or CD-ROM versions of the same data and/or related products | 0 | 4 | 0 |
| Commercial database paid for by advertising and/or selling right to post items on a public bulletin board | 0 | 4 | 0 |

Tables C.3, C.4, and C.5 are strikingly similar to the broader electronic databases discussed in the section on commercial databases above. In particular, they show

  • Richness. The sheer number and diversity of available databases is astonishing.

  • Diverse Suppliers. The products listed in Tables C.3, C.4, and C.5 are produced by government laboratories, private institutes, and commercial ventures. Researchers appear to use these sources interchangeably.

  • Online Versions of Print Media. Academic journals and societies have rushed to make online versions of their journals available. This is exemplified by the fact that the American Physical Society, American Mathematical Society, American Chemical Society, and American Astronomical Society currently place all of their journals online. Most of the index and bibliographic products listed above are also extensions of preexisting print-based counterparts.10

  • Self-help. Private publishers universally rely on passwords and/or contractual restrictions to limit access to, and republication of, their products.11

  • Electronic Options. Despite the greater technical difficulty of protecting CD-ROMs, they continue to be well represented in the sample.

Example 5: A Large Nuclear Science Database12

Since the late 1940s, the nuclear science community has struggled to reduce an exploding literature to a more manageable data set. Despite declining manpower and budgets, the Department of Energy (DOE) continues to spend approximately $4 million per year to maintain, update, edit, and disseminate nuclear science databases.13 Approximately $800,000 of this is spent to support a group at Lawrence Berkeley Laboratory (LBL) whose principal product is the Table of Isotopes. The product includes over 160,000 published references and approximately 1.5 gigabytes of data.

Historically, nuclear database creators have never started from scratch. For example, the Table of Isotopes can trace its lineage to roughly half a dozen nuclear databases, many of which still exist. The LBL group has made extensive efforts to improve and extend these sources by adding new data, checking reported calculations, comparing different experiments to arrive at best values, and deducing additional data not calculated by the original authors. The Table of Isotopes is currently 5 years behind the literature, on average.

Approximately one-half of the group's budget goes to improving its database so that it can support more advanced, relational searches; the balance is spent on disseminating the product over the Web and/or rearranging the data into new tables aimed at medicine and other non-traditional users.14 DOE has not asserted any proprietary interest over the database. The LBL group is not worried about copying, provided that proper attribution is given.

In addition to its public domain/Web-based version, the Table of Isotopes is also available as a commercial book and a CD-ROM. To protect against copying, the publisher has insisted on the following self-help provisions:

  • Updates. The group must supply new material annually, although the content of updates is discretionary. In practice, the group has concentrated on developing new tables aimed at nontraditional (and potentially lucrative) users in fields such as medicine.

  • Additional Graphics. The group must prepare copyrighted graphics that are, at least initially, superior to those found on DOE's Web site. This is an important selling point for commercial buyers who use the CD-ROM to prepare graphics for talks and presentations. The graphics material also adds copyrighted content to an otherwise public product.

  • Additional Software. The group must prepare additional software. This provides an additional selling point and copyrighted content not found at DOE's Web site.15

Although these enhancements are useful, the LBL group probably would have invested its resources differently if left to its own devices. In particular, it would have devoted more effort to updating and improving the underlying (but unprotected) database itself. This is a concrete example of how reliance on self-help solutions can distort investments by comparison with a hypothetical world in which all forms of intellectual property were identically protected by statute.

At the same time, the Berkeley group does not seem to view self-help as a significant bottleneck to new commercial projects.

Example 6: Elsevier Science16

Elsevier Science publishes (1) nearly 1,200 English-language scientific journals, (2) a variety of highly specialized reference works, (3) various bibliographies, abstracts, and reviews, and (4) paper and electronic versions of the world's “most comprehensive interdisciplinary engineering database.”17 Virtually all of these materials are available both online and as CD-ROMs. Elsevier Science's search software permits users to search multiple journals at once.

Although old print journals never had enough space to include full data sets, the advent of online journals has effectively removed this constraint. As a result, Elsevier Science now requires authors to submit underlying data sets so that they can be linked to online journals.

Elsevier Science routinely asks authors for the copyright to their work (including any underlying data) but will usually agree to accept a license instead. According to the company, there is currently no other way to manage reprint and reuse requests. The company does not ask for patent or exclusive database rights.18

Elsevier says that its nonscientific divisions have sometimes decided not to invest in new databases because of protection concerns. So far, however, this has not happened to any of Elsevier Science's science projects. At most, database protection has been one issue among many.

To date, Elsevier Science has collected only a “tiny” number of databases and has little experience with database issues. In line with its current reprint policy, the company would probably not assert its copyright against authors who tried to make commercial products from their own previously submitted databases, but probably would demand reasonable reprint fees from third parties who wanted to republish the data for commercial gain. The company has given little, if any, thought to compiling its own commercial products from authors' data sets. In theory, Elsevier Science could assert its rights more aggressively in the future. Under this scenario, the company's large number of journals might then be leveraged into a corresponding dominance over databases.19 So far, however, there is little indication that Elsevier Science's disparate databases will ever be combined into a useful, much less a dominant, commercial product.

Example 7: Biotechnology20

Bioinformatics. Finding commercially interesting genes is essentially a race to find subtle patterns in an enormous body of experimental data. (The task is often compared to that of prospectors looking for hints of gold in an otherwise featureless landscape.) The principal raw data needed to conduct academic and commercial biotech research are currently maintained in over 200 public sector databases scattered throughout the world. Virtually all of these Web sites are narrowly focused on the owner's research agenda. As a result, the system is often fragmented and redundant. From a computing perspective, many of the sites tend to be amateurish, underfunded, and unstandardized.21 This creates recurring difficulties for corporate users.22

The intersection between computer science and biology is known as bioinformatics. Next-generation bioinformatics systems will be designed to (1) convert diverse databases to a format that users can read, (2) search simultaneously the Web's 200+ sites as if they were a single database, (3) enhance existing text-based databases with relational links to make them more amenable to sophisticated searches, and (4) create software search tools that are not only powerful but also flexible enough to let researchers study the data in unanticipated ways.

GenBank. The best-known and most important public database is a National Institutes of Health (NIH) Web site called GenBank. GenBank is one of three official locations where researchers can deposit information about the precise order of base pairs found in human DNA. The current Release 110.0 of GenBank contains over 3 million sequence records and includes more than 2 billion base pairs. More than 100,000 sequences from individual laboratories and high-throughput sequencing centers are added each month. Since it was founded in 1982, GenBank's size has doubled every 14 months.
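To give a feel for what a 14-month doubling time implies, the sketch below converts it into monthly and annual growth figures. It is a rough illustration only: it assumes smooth exponential growth, and the function name `growth_factor` is hypothetical rather than anything used by GenBank or NIH.

```python
# Illustrative arithmetic for a database whose size doubles every 14 months.
# Assumes smooth exponential growth; real growth is unlikely to be this regular.

def growth_factor(months, doubling_months=14.0):
    """Factor by which size grows over `months`, given the doubling period."""
    return 2.0 ** (months / doubling_months)


monthly_rate = growth_factor(1) - 1   # fractional growth in one month
annual_factor = growth_factor(12)     # growth factor over one year

print(f"Implied monthly growth: {monthly_rate:.1%}")
print(f"Implied annual growth factor: {annual_factor:.2f}")
```

Under these assumptions the database grows by roughly 5 percent per month, or by a factor of about 1.8 per year. If record counts grew at the same pace as overall size (an assumption the text does not state), 5 percent of 3 million records would be on the order of 150,000 new records per month, the same order of magnitude as the reported figure of more than 100,000.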

Because of funding constraints, GenBank's capabilities are limited. For example, search tools can perform full text searches only for written words. This is extremely unwieldy for most biology applications. In addition, editing and comments are limited to author annotations. No effort is made to comment on related journal articles or to identify or resolve conflicts between data submitted by different researchers.23 Finally, updating comments and sequences is virtually impossible. These problems are not unique to GenBank. In recent years, several not-for-profit biotechnology databases have either closed or been threatened with closure. Commentators have complained that the community may have to get by with inadequate updating, editing, and annotations.24

Private Database Vendors. Beginning in the early 1990s, several firms began to offer private versions of a few databases to elite users willing to pay multimillion-dollar license fees.25 Initially, these biotechnology databases were attractive because they included large amounts of secret (i.e., proprietary) data, and they offered advanced bioinformatic search tools. Because public discovery was booming, the former advantage was short-lived. This has driven some firms to shift their emphasis to “the sale of new databases, software packages, and perhaps consulting.”26

One early leader in the field, Human Genome Sciences, started off by selling its proprietary database to a single research partner as part of a $125 million deal.27 More recently, Human Genome Sciences has broadened its relationships and now plans to offer market-friendly software packages ranging from “simple, low-end packages for impoverished [academics to] tailor-made luxury items for drug companies.” Raw data will be provided free of charge.28

Human Genome Sciences' principal rival, Incyte, originally charged licensees $15 million to $20 million for access to its proprietary databases over a three-year period.29 Approximately 50 companies currently subscribe. Like Human Genome Sciences, Incyte focuses on software and database enhancements. According to Incyte's chief financial officer,

There's a huge information-based business growing from the pharmaceutical industry. . . . This is not a small market segment that's going to be serviced by half a dozen companies. This is going to be a fairly large segment of service for a lot of companies, for everything from software and hardware companies to more biologically oriented companies and consulting firms that do systems integration or go in and design something specifically for a big drug firm.30

Incyte subscribers can currently buy the company's advanced LifeSeq relational databases with or without proprietary data. However, even nonproprietary databases have been cleaned and standardized to support Incyte's advanced search software.31 Incyte also develops custom databases for individual clients; these are typically resold to other companies after an initial period of exclusivity.

Another private company, Celera Genomics, recently joined the ranks of Human Genome Sciences and Incyte. Celera's proposed human genome database will reportedly include extensive proprietary human genome data and a “value-added software and informatics system.” Celera has not asked to share in the profits from any discoveries. Instead, it will offer its databases to users on a straight fee-for-service basis. Very large users will be able to purchase dedicated systems.32

Human Genome Sciences, Incyte, and Celera have many smaller rivals. These firms rarely sell proprietary information at all. Instead they concentrate on helping clients to manage their existing data in new and better ways.

Examples 8(a)-8(h): Anecdotes and Profiles.

The following examples are drawn from earlier descriptions of databases found in the literature.33

Example 8(a): POISINDEX. This CD-ROM product links approximately 750,000 poisons to 775 management and treatment protocols. Approximately 200 clinicians from 20 countries participate in editing and selection. POISINDEX also hires computer scientists to maintain its database and create search software. It is updated quarterly and sold by subscription.

Example 8(b): MDL Drug Data Report. This Reed Elsevier CD-ROM database contains molecular structure and biology information for approximately 85,000 potential drug candidates. The data (1) are updated on a monthly basis from published reports, patent applications, and scientific papers, (2) allow users to track clinical trials, (3) come with ISIS software that allows users to analyze the likely effects of modifying known drugs, and (4) can be combined with the user's own data to create individualized research tools.

MDL created the Drug Data Report (and seven other databases) because there was “an inadequate supply of scientific data for its ISIS software system.” MDL offers a preferred fee for academic users.

Example 8(c): Visible Human. This product consists of 10,000 “sliced” images of the human body and comes with software tools that include a navigator, bookmarks, and animation. Although originally developed from a database compiled by the U.S. government, Visible Human became widely available only after it had been commercialized. The government is still engaged in a major effort to update the underlying database.

Example 8(d): DERWENT World Patents Index. This database lists 7 million inventions compiled from 13 million patent documents worldwide. Scientific journals and conference papers are also reviewed. The database is updated quarterly and is available online, as a CD-ROM, and in print.

Example 8(e): National Agricultural Database Library. These animal husbandry databases are edited by a University of Wisconsin institute and are based on government publications submitted by agricultural extension offices around the country. The database is chiefly used by educational institutions. It is available both online and as a CD-ROM.

Example 8(f): Materials Science. A commercial gateway service known as the Science and Technology Network (STN) International provides access to “20 databases covering the physical and mechanical properties of thousands of materials as well as more than 100 factual and bibliographic databases.” All of STN's databases can be searched simultaneously using a single set of sophisticated search tools.

STN databases tend to be fairly permanent, but they grow when new materials, conditions, and properties are measured. Users search STN International's databases but are not allowed to download them. According to the NRC's Bits of Power study, many materials scientists believe that STN does not contain enough databases.

Example 8(g): Chemical Sciences. Because of their long-standing ties to industry, chemists tend to provide a favorable environment for new commercial databases.34 Private sector databases include the Registry of Toxic Effects of Chemical Substances (full text), Chemical Abstracts Service (bibliography), DERWENT (patents), DETHERM (thermophysical properties), and SPECINFO (nuclear magnetic resonance and infrared spectra). Tabulations of evaluated data are also compiled by the Journal of Physical and Chemical Reference Data.

Publicly maintained databases include those of the Beilstein Institute (organic substances), Gmelin Institute (inorganic and organometallic substances), National Institute of Standards and Technology (atomic species properties), and Cambridge Crystallographic Data Centre (structural data).

Chemists typically need to search multiple databases for any given task; sophisticated software search tools have been developed to do this. Most of the foregoing databases are available both online and as CD-ROMs.

Example 8(h): Geophysics and Meteorology. Large databases in the earth sciences tend to be run almost exclusively by government agencies and clearinghouses. However, individual researchers frequently create smaller commercially valuable data sets in the course of writing papers. The American Geophysical Union may have further information on how such data are handled.35

Lessons Learned

The foregoing examples contain several recurring themes that need to be considered before attempting any reforms.

Protection under Existing Law

Compilations of Copyrighted Materials. Copyrighted information does not lose its protected status simply because it has been incorporated into a database. This fact is particularly important for so-called full-text databases, which often consist entirely of copyrighted documents. Full-text databases have played a prominent role in several of the examples given.36

Copyrighted Enhancements. Databases are frequently sold together with advanced software as a single package. Since software is copyrightable (and often patentable), would-be copiers are faced with the choice of marketing a less capable product or else investing the resources needed to develop their own search tools. Copyrighted enhancements appear frequently in the examples given.37

Self-Help Protection

Bilateral Contracts. The examples include both custom databases prepared for a single customer and semicustom databases prepared for a relatively small community.38 In the limit where only one customer wants to acquire a particular database, protection against third parties is virtually automatic.

More generally, existing law allows custom and semicustom database owners to limit each customer's right to use and/or disclose the information to others. Such contracts are enforceable as trade secrets even where the underlying information does not qualify for statutory protection.39 In practice, it is probably not feasible to negotiate and monitor more than a few dozen contracts at any one time. This limits dissemination to a comparatively small number of customers.

Shrink-wrap Licenses. So-called shrink-wrap and click-wrap licenses can be used to bind an unlimited number of customers and are a ubiquitous feature of life on the Web.40 Assuming that they could be enforced, most of these licenses would create protection comparable to, if not stronger than, that found in patents or copyright. The legal validity of such licenses is briefly discussed in Part II, “Existing and Proposed Law,” below.

Search-only and Password-protected Web Sites. The examples include both search-only41 and password-protected42 Web sites.43 Such strategies are probably more viable in academia, where researchers tend to be less concerned about security. Private sector corporations, on the other hand, tend to avoid using the Web because communications (including search requests) are not secure, and Web sites can change (or disappear) overnight. For this reason, corporations insist on using CD-ROMs wherever possible.

Updating. Seventy-three percent of the CD-ROM products listed in Table C.1 were regularly updated on an annual, quarterly, or monthly basis.44 Regular updating was also a recurring feature of other examples described above.45

The consumer preference for updating is understandable when data change quickly or when even a small chance of error could compromise large investments of time, labor, and capital. A good example of a field where both factors apply is biotechnology.

Editing and Enhancements. The central importance of editing and enhancements in the sciences recurs throughout the examples.46 Given the premium that science and engineering place on edited and enhanced databases, similar features would probably exist even without the threat of copying. This does not change the fact that editing and enhancements promote self-help protection by providing added copyright content and occasions for frequent updating.47

Unprotected Products

Public Domain. A significant number of products are sold without any protection at all, sometimes for comparatively high prices. One suggestion invokes market imperfections as an explanation. Under this scenario, large vendors who can afford to circulate catalogs are able to make a profit on relatively obscure titles even if only a few customers purchase them. Would-be copiers are too small to reach these same customers and therefore do not compete.

Spin-offs. In the paradigmatic Feist case, the plaintiff created a telephone directory as an essentially cost-free spin-off of providing telephone service. In Example 8(b), a provider created eight new databases to promote sales of its core product (software).48 In both cases, the provider could reasonably expect to recover its investment whether or not its databases were later copied.

These examples show that providers will continue to produce some databases regardless of whether they are protected.49

PART II. EXISTING AND PROPOSED LAW
The Limits of Federal Copyright Protection

For most of the 20th century, the extent to which databases were or were not protected was uncertain. The federal copyright and patent statutes seemed to be exclusive. Congress had not afforded database protection, the argument went (and still goes), so there shouldn't be any.

The INS Case

The legal history of database protection in the United States begins with the Supreme Court's 1918 decision in International News Service v. Associated Press (INS).50 The case involved a wire service whose employees rewrote its rival's published dispatches and then sold them as its own. On appeal, the Supreme Court was asked to decide whether the Copyright Act's policy in favor of putting facts into the public domain created an absolute right to engage in such practices.

The Supreme Court decided that it did not. Instead, it drew a distinction between the public's right to information (which was protected by copyright) and a business competitor's right (which was not).51 The Court's policy analysis was starkly modern in its use of economic reasoning:

Indeed, it is one of the most obvious results of defendant's theory that, by permitting indiscriminate publication by anybody and everybody for purposes of profit in competition with the news-gatherer, it would render publication profitless, or so little profitable as in effect to cut off the service by rendering the cost prohibitive in comparison with the return.52

Although succeeding courts wrote a handful of opinions applying the reasoning in INS to new facts, they did little to clarify the Supreme Court's attempted distinction between “fair” and “unfair” uses of public knowledge. Instead, the case remained in a kind of legal limbo. In the words of one commentator,

Having been born over the objection of powerful dissents authored by Justices Holmes and Brandeis and thereafter subjected to the disapproval of Judge Learned Hand, it is not surprising that the INS case was “often confined strictly to its facts.” The upshot has been recognition of a claim where the defendant's commercial exploitation was likely to destroy its value but otherwise to allow the defendant to compete notwithstanding the advantage gained by use of the plaintiff's work.53

The Feist54 Case

The modern era of database law opened with the Supreme Court's 1991 decision in Feist Publications, Inc. v. Rural Telephone Service Co.55 The case stemmed from a publisher's attempt to copy a local telephone company's printed directories.56 Starting from the twin propositions that “facts are not copyrightable” but that “compilations generally are,”57 the Court explained that creating a telephone book in the usual format lacked the “minimal degree of creativity” required for copyright protection under the U.S. Constitution.58 The Court then effectively decided the case a second time by concluding that earlier cases that had extended the copyright statute to works created by “sweat of the brow” rather than creativity had been wrongly decided.59 However, the Court stopped short of overruling INS. Instead, it only distinguished the case by saying that it had been decided “on noncopyright grounds that are not relevant here.”60

Copyright Protection After Feist

Since 1991, approximately a dozen courts have analyzed and elaborated on the principles announced in Feist. In doing so, they have frequently found that the compiler's choice and arrangement of data can be sufficiently creative to trigger copyright protection. Examples include: Warren Publishing, Inc. v. Microdos Data Corp., 52 F.3d 950 (11th Cir. 1995) (taking facts from an “external universe of existing material” and arranging them according to an idiosyncratic list of “principal communities” was sufficiently creative to qualify as a copyrighted compilation);61 Key Publications, Inc. v. Chinatown Today Publishing Enterprises, Inc. (selection of businesses to be included in a directory “was in no sense mechanical, but involved creativity . . . in deciding which categories to include and under what name”);62 but see BellSouth Advertising & Publishing Corp. v. Donnelley Information Publishing, Inc. (fact that company's telephone directory limited entries to subscribers living within a certain region on or before a particular closing date did not satisfy Feist).63 The only significant qualification seems to be that such creativity must involve the arrangement of data and not the discovery of information itself. See BellSouth, supra (Copyright Act “affords no shelter to the resourceful, efficient, or creative collector”).64

From the standpoint of science, the most important post-Feist development involves case law suggesting that compilers who apply judgment to their data also qualify for copyright protection: Mason v. Montgomery Data, Inc. (plaintiff applied “discretion” to task of selecting, interpreting, and reconciling inconsistencies among sources);65 Nester's Map & Guide Corp. v. Hagstrom Map Co. (author recommended best ways to find particular buildings and approximated street addresses so that they would be easier to remember);66 CCC Information Services, Inc. v. MacLean Hunter Market Reports, Inc. (price estimates based on “professional judgment and expertise” rather than “reports of historical prices” or “mechanical derivations of historical prices or other data” were copyrightable).67

The fact that many courts have been willing to find creativity in the way that databases are arranged does not mean that the data themselves are protected. If free-riders are willing to take the time and trouble to select from and rearrange copyrighted databases, they remain free to do so. See, for example, Warren Publishing (“content of datafields” was “merely fact[s]” and not copyrightable);68 Skinder-Strauss Associates v. Massachusetts Continuing Legal Ed., Inc. (bare fact that defendant copied information from plaintiff's directory did not establish copyright violation);69 Cable News Network, Inc. v. Video Monitoring Services of America, Inc. (copyright in news broadcast extended only to compilation as a whole; individual news segments remained “factual in nature” and unprotected).70

Unfair Competition After Feist

Given Feist's extensive criticism of the “sweat of the brow” doctrine, it would have been reasonable to think that INS-type unfair competition claims had no further validity. However, this turns out not to be true. Instead, the prestigious Second Circuit Court of Appeals declared in National Basketball Assn. v. Motorola, Inc. (NBA)71 that the core situation addressed by INS, the so-called hot news cases, remained good law. Because the Second Circuit carefully restricted its discussion to “time sensitive” information, the decision does not currently include databases.72 Nevertheless, the fact that the Second Circuit continues to take unfair competition seriously suggests that the doctrine may have a future. A detailed discussion of the NBA case and its potential extension to databases can be found in Part III, “Policy Implications,” below.

Implications for Science

To date, no court has applied Feist to scientific databases. Given the relatively low standards required to find “creativity” under the case law, it seems clear that (1) extensive editing judgments, (2) attempts to choose “best values,” or (3) enhanced “relational” search capabilities would all qualify for “compilation” protection under the copyright laws. The smaller, less elaborate data sets submitted in connection with individual papers would probably also qualify, although this presents a closer question.73 At least one publisher (Elsevier Science) appears to be operating under the assumption that they do qualify.74

If copyright applies, it still would not prevent would-be copiers from extracting and rearranging data from a scientific database. The “minimum” amount of creativity that such copiers would have to show is an open question, but it might be quite small.75 In any case, it seems safe to say that the traditional scientific practice of compiling new-generation databases from earlier ones remains viable. Cf. Sinai v. California Bureau of Automotive Repair (defendant was free to compile its own manual from data listed in copyrighted chart).76

State Contract Law

Despite their ubiquity, the effectiveness of shrink-wrap or click-wrap licenses remains unclear. Traditionally, courts have often been willing to look past the fiction that purchasing an article constitutes agreement to a license, particularly where the license is one-sided. Recent case law has held that shrink-wrap licenses can indeed be used to obtain greater rights than those obtainable through copyright. (See ProCD, Inc. v. Zeidenberg, enforcing shrink-wrap license restrictions protecting a telephone listings database against copying.77) Nevertheless, the doctrine's outer bounds remain unclear. See Vault Corp. v. Quaid Software Ltd. (federal law invalidated contractual restriction against decompiling computer program).78

Implications for Science

Existing case law suggests that shrink-wrap licenses are a viable strategy for extending database rights beyond copyright and patent law. The outer limits of such protection will ultimately be set by the courts' willingness to find overreaching provisions “unconscionable” or otherwise “contrary to public policy.”

The European Union Directive

Databases have traditionally received substantial protection in the United Kingdom, Ireland, the Netherlands, and the Nordic countries. Until recently, however, they received much less protection elsewhere in Europe. Beginning in the late 1980s, the European Union (EU) began studying database protection as part of a larger project to harmonize member states' copyright laws. Although initial proposals were relatively moderate, calls for protection grew steadily stronger. In March 1996, the Council of the European Union issued the Directive on the Legal Protection of Databases.79

By its terms, the E.U. Directive applies to any “collection of independent works, data, or other materials arranged in a systematic or methodical way and individually accessible by electronic or other means.”80 The E.U. Directive protects such works against “temporary or permanent reproduction,”81 “adaption” or “alteration,”82 or “distribution to the public.”83 However, these protections do not apply unless a “substantial part” of the database, “evaluated qualitatively and/or quantitatively,” has been copied.84

The E.U. Directive provides protection for 15 years.85 However, this period can be indefinitely extended if the “accumulation of successive additions, deletions, or alterations” amounts to “substantial new investments.” This extension would extend to the database as a whole and not just to “new” components.86

To American eyes, the most striking aspect of the E.U. Directive is its refusal to extend protection to citizens of countries that do not adopt the E.U.'s standards. Formally, this is implemented by Article 11, paragraph 3, which gives the Council discretion to withhold database protection from “databases made in third countries . . . .” The E.U. Directive's preamble makes the underlying threat explicit:

[T]he right to prevent unauthorized extraction and/or re-utilization in respect of a database should apply to databases whose makers are nationals or habitual residents of third countries or to those produced by legal persons not established in a Member State, within the meaning of the Treaty, only if such third countries offer comparable protection to databases produced by nationals of a Member State or persons who have their habitual residence in the territory of the Community.87

Implications for Science

From the standpoint of the scientific community, one of the most important aspects of the E.U. Directive is found in Article 6, which gives member states the option to exempt copying “for the sole purpose of illustration for teaching or scientific research, as long as the source is indicated and to the extent justified by the non-commercial purpose to be achieved.”88

A related provision preserves member state exceptions “traditionally authorized under national law.”89 This provision presumably includes European equivalents of the U.S. “fair use” doctrine, which permits limited copying for scholarship and research.90

Proposed Legislation and Suggested Reforms
The World Intellectual Property Organization Treaty

The August 30, 1996, Draft. Shortly after issuing its E.U. Directive, the European Union asked the World Intellectual Property Organization (WIPO) to consider a worldwide database treaty based on the European model. After preliminary discussions involving the United States, WIPO published a draft version on August 30, 1996.91

By its terms, the WIPO draft would have defined “databases” to include “a collection of independent works, data or other materials arranged in a systematic or methodical way and capable of being individually accessed by electronic or other means.”92 The heart of the treaty would have required member countries to adopt legislation granting database owners an “exclusive right” to prevent “the permanent or temporary transfer of all or a substantial part of the contents of a database to another medium” without their permission.93

Unlike the E.U. Directive, the WIPO treaty would have restricted the right of member states to enact exemptions for scientific research, either directly or through fair use provisions:

Contracting Parties may, in their national legislation, provide exceptions to or limitations of the rights provided in this Treaty in certain cases that do not conflict with the normal exploitation of the database and do not unreasonably prejudice the legitimate interests of the rightholder.94

Implications for Science. Far from resisting European efforts to implement database protection, the initial U.S. reaction was to extend the E.U. Directive's coverage by requesting a 25-year period. The WIPO draft left the question open.95 As previously noted, the WIPO draft also would have sharply reduced member nations' right to exempt scientific research.

The WIPO Treaty Derailed. Although the Clinton Administration originally backed the proposed treaty, support was split after U.S. scientists and developing nations protested in late 1996. Talks in Geneva were finally derailed roughly one week before the treaty was to have been completed.96 WIPO is still examining database protection.

Proposed Congressional Legislation97

WIPO's database protection provisions could not have been implemented without domestic legislation. Such legislation was first introduced in May 1996. Later, a similar bill (H.R. 2652/S. 2291)98 was offered as an amendment to an intellectual property reform package that ultimately became known as the Digital Millennium Copyright Act of 1998.99 H.R. 2652 was dropped in Conference Committee, largely because of protests from the science and engineering community, but will be reintroduced this year. The bill can be broadly described as an implementation statute designed to meet the European Union's demands. H.R. 2652 frequently paraphrases or incorporates the E.U. Directive verbatim.

As written, H.R. 2652 would have protected “information that has been collected and has been organized for the purpose of bringing discrete items of information together in one place or through one source so that users may access them.”100 The bill would have imposed liability on

[a]ny person who extracts, or uses in commerce, all or a substantial part, measured either quantitatively or qualitatively, of a collection of information gathered, organized, or maintained by another person through the investment of substantial monetary or other resources, so as to cause harm to the actual or potential market of that other person . . . for a product or service that incorporates that collection of information and is offered or intended to be offered for sale or otherwise in commerce by that person . . . .101

Despite this provision, users who extracted individual facts or “insubstantial parts” of databases would not be liable.102

H.R. 2652 also would have provided additional exemptions for the following conduct:

  • Users could gather the same information independently and even use protected databases to verify the accuracy of their research;103

  • Users could copy material for “nonprofit educational, scientific, or research purposes in a manner that does not harm the actual or potential market for the product or service;”104 and

  • Users could use information “for the sole purpose of news reporting” so long as they did not copy time-sensitive “hot news.”105

The government and its employees would not have been allowed to claim protection under the statute. However, educational institutions would have been.106

H.R. 2652's principal enforcement mechanism would have depended on civil suits for damages, lost profits, injunctions, and treble damages in appropriate circumstances.107 Criminal penalties would also have been available.108

Implications for Science. Like the WIPO proposal, H.R. 2652 would have extended protection for databases beyond the European Union's demands by extending the term of protection from 15 to 25 years and by failing to take full advantage of the E.U. Directive's exemption for not-for-profit copying and/or fair use for teaching and scientific research.

State Legislation

Because shrink-wrap licenses are creatures of contract law, state legislatures have considerable power to limit or extend them by statute. In the interests of commerce, however, they have generally worked together to enact nationally uniform laws. One such draft currently under discussion would amend the Uniform Commercial Code (UCC) to codify and change the law of licenses. Although the proposed statute is complex, the most relevant section for present purposes involves enforceability:

If a court as a matter of law finds the contract or any term of the contract to have been unconscionable or contrary to public policies relating to innovation, competition, and free expression at the time it was made, the court may refuse to enforce the contract . . . .109

“Reporter's Notes” to the proposed statute explain that it is designed to acknowledge the fact that (1) “[s]tate laws, including the UCC, cannot alter or create federal law”;110 (2) public policy with respect to “innovation, competition, and free expression policy” may render contracts invalid even when federal law is silent;111 and (3) the UCC “take[s] no position” on “a general federal policy question.”112 On the other hand, the provision reaffirms the traditional rule that “private parties may have sound commercial reasons for contracting for limitations on use . . . ,”113 even where those restrictions exceed the bounds of copyright.114 Such vague and conflicting comments will provide little guidance to courts trying to determine whether a particular shrink-wrap contract satisfies federal law. The expectation seems to be that the courts will eventually reach some sort of consensus.

More concretely, the Reporter suggests that license terms that prevent customers from providing access to multiple users, using data for commercial purposes, or modifying a database's content “would in most cases be enforceable.”115 Assuming that courts heeded this advice, the last two restrictions could potentially eliminate the now-common practice of creating new databases from old ones.

Academic Proposals

One of the most thorough reviews of the case law and recent history of database protection is found in a recent law review article by Jerome Reichman and Pamela Samuelson.116 The article concludes with two proposals: (1) an enhanced version of existing unfair competition law similar to, but more refined than, that announced in the Second Circuit's NBA decision, or alternatively (2) a “preferred solution” in which an industry-based “collection agency” would set baseline license fees. A detailed discussion and evaluation of these proposals is presented in Part III, “Policy Implications,” below.

PART III. POLICY IMPLICATIONS
Is There Room for Improvement?

Statutory protection should not be extended to databases lightly or for no reason at all. Potentially, the best reason to enact protection would be to encourage investors to create new databases that do not currently exist.117 The chances that such an incentive would actually work are discussed in Issue 1. A second, weaker argument for statutory protection, that the existing system of self-help tends to create excessive secrecy and/or diverts investment away from databases into ancillary features, is discussed in Issue 2.

Issue 1: If Statutory Protection Were Enacted, What New Products Would Be Created?

Part I demonstrates that many types of databases can and do flourish in today's world. Traditionally, it has been argued that expanding these islands of legal protection would lead to more and better products. Human ingenuity being what it is, the NRC study committee should not dismiss this possibility lightly. Nevertheless, the interviews and research conducted for this paper did not find a single instance in which a commercial publisher decided not to start a project because it lacked statutory protection. At most, protection was one concern among many—and cost concerns unrelated to potential copying usually ended up driving the decision.118

The fact that databases have managed to prosper without statutory protection is sufficiently counterintuitive to cause one to wonder whether economic theory can account for it. One possible clue is that most of the databases described in this paper have existed for many years, so that the providers have long since recouped their initial start-up investment. A second clue is that virtually all databases began as compilations of still earlier products. The assumption that database suppliers invariably face large start-up costs needs careful examination.

But if database providers do not have a large up-front investment to protect, what are they investing in? The answer is updates, improvements, and extensions of their existing product. After all, database providers already compete against their own product because consumers can almost always continue to use last year's CD-ROM rather than buy a new one. A third party's decision to make additional copies of last year's CD-ROM is unwelcome but conceptually similar.

If the foregoing observations are correct, the only new products that protection is likely to elicit would be large databases that cannot be assembled from precursors. Whether such projects actually exist is an open question.

Issue 2: Has the Absence of Statutory Protection Resulted in Economic Distortions and, If So, How Serious Are They?

Pathologies of Self-help: Limited Access and Secrecy. One of the most effective forms of self-help relies on bilateral licenses in which users promise to preserve the database's secrecy. This method of protecting intellectual property has a long history and is not confined to databases. The fact that some of today's firms still choose secrecy over patents shows that bilateral contracts can be a very powerful form of protection.

Secrecy is most effective when the number of authorized users is small, because large numbers of customers increase the risk of leaks. For this reason, owners who rely on secrecy tend to concentrate on their most lucrative customers while forgoing other, less profitable transactions. Since the lost transactions would have made both seller and buyer better off, such conduct is socially inefficient.

Ideally, database owners could use other forms of self-help instead of secrecy. However, such strategies are usually not as reliable as secrecy. Particularly in the case of very valuable products, the owner may be afraid to use them.

Statutory protection offers a way out if (and only if) it is a reliable substitute for secrecy. In such instances, owners will typically try to charge their core customers a high price while offering lower, preferred rates to others (e.g., academic scientists). This so-called “discriminatory pricing” model119 unambiguously improves efficiency by increasing the number of transactions between willing buyers and sellers in the economy.
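The efficiency claim can be illustrated with a small arithmetic sketch. Every figure below (the two customer groups, their sizes, and the most each would pay) is hypothetical and is not drawn from this paper. The sketch also assumes that serving an extra user costs nothing and that no copies leak from one group to the other, which is precisely the condition that reliable statutory protection is supposed to secure.

```python
# Hypothetical illustration only: none of these numbers come from the paper.
# Each segment: (name, number of buyers, maximum price each buyer will pay).
SEGMENTS = [
    ("core commercial customers", 10, 100_000),
    ("academic scientists", 500, 500),
]


def uniform_price_outcome(segments):
    """Owner must charge one price and picks the revenue-maximizing one."""
    best = (0, 0, 0)  # (revenue, price, buyers served)
    for _, _, price in segments:
        # Everyone whose willingness to pay meets the price buys.
        buyers = sum(n for _, n, wtp in segments if wtp >= price)
        revenue = price * buyers
        if revenue > best[0]:
            best = (revenue, price, buyers)
    return best


def discriminatory_outcome(segments):
    """Each segment is charged its own willingness to pay (no leakage assumed)."""
    revenue = sum(n * wtp for _, n, wtp in segments)
    buyers = sum(n for _, n, _ in segments)
    return revenue, buyers


uniform_revenue, uniform_price, uniform_buyers = uniform_price_outcome(SEGMENTS)
disc_revenue, disc_buyers = discriminatory_outcome(SEGMENTS)

print(f"single price {uniform_price}: {uniform_buyers} buyers, revenue {uniform_revenue}")
print(f"two-tier pricing: {disc_buyers} buyers, revenue {disc_revenue}")
```

Under these assumed numbers the owner who must pick a single price serves only the core customers, whereas two-tier pricing serves both groups without lowering the price paid by the core group. The result depends entirely on the no-leakage assumption, which is the point taken up in the next paragraph.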

Example 7 on biotechnology above shows how secrecy can end up restricting databases to a handful of firms willing to pay multimillion-dollar license fees. But would statutory protection have led to a different result? In the case of biotechnology, the existence of customers willing to pay millions of dollars almost certainly would have led to cheating if the same database had simultaneously been offered to academic researchers at affordable prices. Nevertheless, statutory protection could still be viable under other, less extreme situations. The study committee should ask witnesses whether such situations actually exist anywhere in the sciences.

Benign Self-help: Editing and Enhancements. This paper has shown that updating, editing, bundling with advanced software, and other enhancements are all popular forms of self-help. The danger is that the need for self-help will encourage database owners to overinvest in such features. If so, this distortion will be paid for by a corresponding underinvestment in the database itself. A relatively mild instance of this appears to have occurred in connection with the privately published nuclear database discussed in Example 5.

In economic terms, the most effective self-help strategy requires providers to supply the enhancements that consumers want most. Since this is the same solution as that found in a free market, distortions should be minimal in most cases. This conclusion might not hold, however, if the total enhancements needed to effectuate self-help exceeded the amount that consumers would demand in a free market. In that case, providers would begin to overinvest in enhancements at the expense of underlying databases.

In the end, the issue of whether self-help introduces distortions depends on the empirical question of how many enhancements consumers would demand in a free market. However, this paper has pointed out that free-market demand for updating, editing, and other enhancements in the sciences is likely to be very strong. For this reason, broadly comparable activities are likely to go on regardless of whether statutory protections are enacted. The fact that self-help might produce unwanted distortions in some other, theoretical world hardly matters.

Summing Up. Most forms of self-help do not require secrecy and cause few distortions. A worrisome exception—bilateral contracts and secrecy—occurs in biotechnology but does not seem to be widespread elsewhere in the sciences. Even in biotechnology, there is little indication that database owners would actually give up secrecy if statutory protection became available.

Pitfalls and Drawbacks
Issue 3: Will Protecting First-generation Databases Discourage the Creation of Subsequent Products?

Scholars have pointed out that, under certain circumstances, protecting the rights of earlier innovators can discourage subsequent innovation. This has been called the “tragedy of the anticommons.”120 In the database context, the theory argues that giving first-generation database owners the right to demand compensation (1) encourages first-generation products but (2) raises the cost of producing later-generation products ever afterward. If effect (2) is larger than effect (1), statutory protection could actually end up reducing the total number of databases produced over time.

The tragedy of the anticommons will not occur if the creators of second-generation databases are allowed to negotiate licenses with existing database owners in advance.121 This is so because the existing owners can earn licensing revenues only if later-generation databases are actually created and sold. For this reason, the owners will always set their fees so that new projects remain profitable.

The tragedy of the anticommons might still occur under two circumstances. First, many databases are created on a not-for-profit basis, in which case there will be no revenue stream to share with the first-generation owner. In principle, this is not a problem because the first-generation owner can still make a profit by hiring someone else to do the work commercially. In the narrowly specialized world of the sciences, however, such people may not exist.

The second circumstance in which the tragedy of the anticommons might still occur involves databases that are public goods—i.e., goods that cannot be sold at a profit even though they benefit society as a whole. In theory, the government should be prepared to pay whatever license fees are needed to produce these products just like any other cost of production. In practice, however, governments have often found it politically difficult to purchase intellectual property from the private sector. There is therefore no guarantee that such expenditures would be made in the future.

Issue 4: Should Statutory Protection Include Exemptions for Socially Useful Copying?

Exemptions for Noncommercial Use. Within the scientific community, much of the debate over database protection has centered on whether there should be exemptions for research.122 The standard economist's objection to such exemptions is that they penalize the database owner. If research truly benefits society, the argument runs, it should be paid for out of general tax revenues. Conversely, forcing the database owner to give up part of his rights unfairly puts society's burden on a single individual.

On the other hand, the fact that society should use tax money to purchase databases for its researchers is no guarantee that it will. While this is essentially a political judgment, mainstream economics' so-called theory of the second best teaches that enacting part of a socially optimal plan is sometimes worse than doing nothing at all.123

The idea that businesses should make in-kind subsidies to worthwhile activities continues to exert a powerful hold over the American imagination. For example, many doctors donate their labor to charity. The appeal is particularly strong in the present situation. Having asked for unprecedented protection, the argument runs, database providers should not complain if they end up receiving slightly less than the entire pie.

Finally, society may decide that noncommercial databases are valuable in their own right and need to be protected. This question is addressed in Issue 7, below.

A Fair Use Exemption for “Honest Copying.” Most commentators assume that new databases are created by gathering information de novo or by paying for the right to use someone else's database as a starting point. However, the examples found in this paper suggest that neither model has been particularly important in the past. Instead, scientists have normally used earlier databases without payment to create fundamentally different products. Since such behavior does not fit the normal free-rider stereotype very well, it is worth asking whether it provides benefits and whether future reforms should try to retain it.124

From the economic perspective, the answer probably depends on how much additional protection databases need. If high levels of protection are needed to protect investors from free riders, the law should recognize few if any defenses to copying. But if the existing world of self-help is only slightly inadequate, statutory reform will have to include broad fair-use-type defenses to avoid overprotection. Creating a safe harbor exemption for traditional scientific practices is one way to do this.125

In addition to such economic considerations, there are also sound legal arguments why some sort of fair use exemption should protect users who genuinely try to improve what they copy. The reason is that copyright already protects authors who take facts from existing works: Refusing to implement a similar fair use exemption for database copying would lead to an anomalous situation in which database protection actually exceeded that of copyright. The importance of maintaining the traditional distinctions between copyright and lesser forms of intellectual property is further discussed in Issue 7, below.

Issue 5: Would Increased Rights Allow Database Providers to Charge Higher Prices Within Individual Niche Markets?

Many observers argue that the sciences contain large numbers of niche markets, each of which is served by only one or two providers. The result, they claim, is a tendency toward price gouging that will only get worse if additional database protection is enacted. The usual economic counterargument is that existing providers cannot set prices too high without attracting competitors. For this reason, logic suggests that prices should remain at or near competitive levels even in single-provider markets.

A more sophisticated argument suggests that the investment needed to enter a particular niche may be nearly as large as the market itself. In such cases, the entire market will never generate enough revenues for more than one or two firms to recover their investment. Since this fact deters would-be competitors from entering the market, existing providers can raise prices without fear of entry. Statutory protection would make this problem worse by adding to would-be competitors' start-up costs.
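The entry-deterrence argument can be made concrete with a short sketch. It rests on assumptions that are not in this paper: an entrant expects to capture a fixed share of total market revenue (half, by default), entry requires a one-time start-up cost, and all dollar figures are invented for illustration.

```python
# Hypothetical illustration only: the revenue split and all figures are assumptions.

def entry_is_profitable(market_revenue, entry_cost, share=0.5):
    """An entrant comes in only if its expected revenue share exceeds its start-up cost."""
    return share * market_revenue > entry_cost


def max_revenue_without_entry(entry_cost, share=0.5):
    """Largest total revenue an incumbent can collect before entry becomes profitable."""
    return entry_cost / share


START_UP_COST = 400_000   # assumed cost of assembling a rival database
LICENSE_COST = 100_000    # assumed extra cost if source data must be licensed

ceiling_today = max_revenue_without_entry(START_UP_COST)
ceiling_with_statute = max_revenue_without_entry(START_UP_COST + LICENSE_COST)

print(f"revenue ceiling before entry, self-help only: {ceiling_today:,.0f}")
print(f"revenue ceiling before entry, with licensing: {ceiling_with_statute:,.0f}")
```

In this toy model the incumbent can raise prices until total revenue reaches the ceiling without attracting a rival, and any rule that adds to the rival's start-up cost lifts that ceiling. Whether real scientific niches look like this is the empirical question discussed next.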

The question of whether niche markets actually exist will ultimately have to be settled empirically. So far, however, most studies have confined themselves to counting the number of existing competitors in each market—thereby ignoring the crucial role of potential entrants. More definitive studies will have to look at entry costs and/or evidence of abnormal returns to capital. In the meantime, suggestive evidence that niche markets may exist comes from the fact that a closely related industry—scientific journals—has recently been accused of price gouging.126 Inverting the normal economic argument, one could say that the existence of high prices in this area proves that entry is difficult. Statutory reform would make the resemblance between journals and scientific databases even closer than it is today.

Given the present state of the evidence, the NRC study committee should be careful to ask witnesses for concrete examples of price gouging. If such practices turned out to be widespread, they would constitute a strong argument against extending protection still further.

Issue 6: Could Statutory Protection Damage Science by Inadvertently Privatizing Its Databases?

Part I of this paper shows that different branches of science have created database communities that range from all public to all private, with every conceivable mixture in between. On the other hand, Issue 3 points out that protecting existing products can be a disincentive to further database creation (the anticommons problem) unless firms are able to buy licenses from one another. Since public databases do not generate the revenues needed to pay license fees, statutory protection imposes substantial (albeit inadvertent) pressures to privatize.

One problem with privatization is that most scientific databases are public goods that require government support. In theory, privatized databases could continue to receive such support in the form of government subsidies or grants. In practice, government may lack the political will to do this.127

Harder to quantify, but no less important, are the likely effects of privatization on information exchanges between scientists. As one witness said, “This'll make it even harder for me to give stuff away for free.”128 Similarly, there are already complaints that the E.U. Directive has made European scientists more reluctant to share data with their U.S. collaborators.129 Enacting a database statute could cause similar problems within the United States itself.

Issue 7: How Would Database Protection Interact with Other Forms of Intellectual Property Protection?

The definition of databases as “compilations of information” is troubling because practically anything—including Gone with the Wind—can be described as “a compilation of information.” In the words of one commentator, “no abstract definition of a database will give us a bright line border between databases and non-database works.”130 This has prompted some scholars to supplement the definition by listing products that are not “databases.”131 Given the explosion of new forms of intellectual property, such negative definitions seem doomed to failure.

The NRC study committee should recognize that phrases like “compilations of information” continue to be used because they summarize basic attributes that people want to protect. The fact that these attributes can be found in almost all intellectual property only shows that most of the arguments for database protection are very general. If the NRC study committee ultimately agrees with these arguments, it should logically be prepared to advocate the same (or greater) protection for all other forms of intellectual property. Otherwise, the most general type of intellectual property (database protection) could end up becoming more desirable than the narrowly defined categories traditionally thought to merit heightened protection under the copyright and patent statutes.

Some scholars have tried to limit databases to a special category of products for which sui generis laws can be written. This approach is unnecessary if database protection is simply thought of as the default choice for products that do not meet the relatively high standards of copyright or patent protection. Conversely, the NRC study committee should be deeply suspicious of any proposal that would afford database owners any right that is not simultaneously available to copyright or patent owners.

Threats from the European Union
Issue 8: Does the European Union's Position on Databases Change the Foregoing Analysis?

The principal reason for the E.U. Directive is that stronger incentives would encourage European companies to create more databases.132 The European Union's threat to leave U.S. databases unprotected in Europe if the United States does not pass reciprocal legislation also appears to have been motivated by the fact that databases are a worldwide market that requires a consistent set of rules.

The fact that the threats contained in the E.U. Directive did not take effect immediately suggests that they may have been intended, at least in part, as a bargaining position. While it is true that some observers have called the E.U. Directive a hunting license to copy unprotected U.S. databases, others have remained skeptical. Last year, the journal Science reported that “observers on both sides of the Atlantic doubt that Europeans will do so because of fears that such a move could spark a trade war.”133

If the United States decides that E.U.-style database protection is not in its best interests, it should ask the Europeans to negotiate. If this fails, the United States will have to decide whether its larger interests require it to enact E.U.-style legislation anyway. But there is no reason not to try.

AVAILABLE POLICY TOOLS

If the NRC study committee finds that existing law needs to be reformed, it must next consider what tools are available. This section summarizes the various strategies found in existing or proposed database protection laws (see Part II, “Existing and Proposed Law,” above) and comments on each.

For convenience, the tools open to lawmakers are grouped in ascending order of intrusiveness. Option 0 (no change) is self-explanatory. Option 1 and Option 2 would let courts decide which databases should be protected on a case-by-case basis. These options would probably be most useful in a world where some (but not all) databases were inadequately protected against free riders. Finally, Option 3 through Option 7 would grant protection to all database providers. These options would be most appropriate in a world where virtually all databases faced significant threats of free ridership.

Option 0: No Change in Existing Law

Part I, “Today's Databases,” shows that many providers are willing to offer databases based on self-help alone. Furthermore, the section on limits of federal copyright law, in Part II, has shown that most scientific databases display sufficient creativity to qualify for copyright protection, even after Feist. This probably affords a modest amount of protection despite the fact that competitors are still free to copy data if they rearrange them creatively.

In the end, it is an empirical question whether existing protection strategies prevent suppliers from investing in new databases or unacceptably distort database content. Perhaps the most that can be said is that the NRC study committee should seek evidence of need.

Option 1: Judge-made Unfair Competition Law

This is the traditional formulation that governed database protection in the United States prior to Feist. The most recent and sophisticated statement of the doctrine is found in the Second Circuit's NBA decision, which asks courts to consider five separate “elements” before deciding whether protection is appropriate:

  1. Whether the database owner generates or collects information at some cost or expense;

  2. Whether the value of the information is time sensitive;

  3. Whether the defendant's use of the information constitutes free-riding on the plaintiff's costly efforts to generate or collect it;

  4. Whether the defendant's use of the information is in direct competition with a product or service offered by the plaintiff; and

  5. Whether the ability of other parties to free-ride on the efforts of the plaintiff would so reduce the incentive to produce the product or service that its existence or quality would be substantially threatened.134

As applied by the Second Circuit, a failure to prove any one of these elements would be fatal to a database owner's claim for protection.
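Because the test is conjunctive, its logic can be sketched in a few lines. The class and function names below are invented for illustration, the fields simply follow the order in which the five elements are listed above, and the sample database is hypothetical. The sketch shows only the all-or-nothing structure of the test and says nothing about how a court would weigh evidence on any element.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HotNewsElements:
    """One boolean per element, in the order the elements are listed above."""
    costly_to_collect: bool
    time_sensitive: bool
    defendant_free_rides: bool
    direct_competition: bool
    incentive_threatened: bool


def failed_elements(claim):
    """Return the 1-based positions of the elements the owner failed to prove."""
    return [
        position
        for position, field in enumerate(fields(claim), start=1)
        if not getattr(claim, field.name)
    ]


def claim_survives(claim):
    """Conjunctive test: a single unproven element defeats the claim."""
    return not failed_elements(claim)


# Hypothetical archival database: costly to build and copied by a direct
# competitor, but its contents do not lose value quickly.
archival = HotNewsElements(True, False, True, True, True)
print(claim_survives(archival), failed_elements(archival))
```

A database whose contents stay valuable for years fails on the time-sensitivity element alone, however strong the showing on the other four.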

As previously noted, the Second Circuit's discussion in the NBA case was narrowly limited to so-called hot news cases by element 2 (time-sensitive information). At the same time, it is interesting to ask what would happen if Congress or the courts were to overrule or liberally expand this element.135 In that case, it seems clear that elements 1, 3, and 5 (which collectively encapsulate the usual arguments against free ridership) would authorize courts to extend protection on a case-by-case basis.136 Such case-by-case flexibility would be particularly appealing if the NRC study committee believed that some (but not all) databases were vulnerable to free ridership.

In the past, legislatures have often relied on judicial discretion to implement policy on a case-by-case basis. Nevertheless, sending database protection back to the courts has drawbacks. Because judicial elaboration takes time, it might be many years before would-be copiers received a clear understanding of when the defense could and could not be raised. For example, the 70-year hiatus between INS and Feist produced very little guidance.

Option 2: Improved Unfair Competition Law

This is the first of two alternative reform measures advocated by Reichman and Samuelson. Drawing on INS and its progeny, courts would use the following eight factors to determine whether “unfair extraction” had occurred:

  1. The quantum of data appropriated by the user;

  2. The nature of the data appropriated;

  3. The purpose for which the user appropriated the data;

  4. The degree of investment initially required to bring those data into being;

  5. The degree of dependence or independence of the user's own development effort and the substantiality of the user's own investment in these efforts;

  6. The degree of similarity between the contents of the database and a product developed by the user (even if only privately consumed);

  7. The proximity or remoteness of the markets in which the database owner and user are operating; and

  8. How quickly the user was able to come into the market with his or her product as compared with the time required to develop the original database.137

Reichman and Samuelson correctly note that courts could use the foregoing factors to identify instances in which “database suppliers are sometimes less vulnerable to free-riding injury than appears from superficial claims for relief.”138

At the same time, Reichman and Samuelson's eight factors are just that—factors, not rules. Even more than the Second Circuit, their schema would commit the task of developing bright-line rules to future judges.

Option 3: Sui Generis Protection for a Limited Term

Case-by-case approaches make little sense if almost all databases are vulnerable to copying. Under such circumstances, the exclusive rights model found in copyright and patent law is probably appropriate.

This still leaves the question of how much protection is needed. One way to adjust this parameter is to provide protection for a fixed number of years. Presumably, the periods chosen should be related to the typical time that database providers need to recoup their investments. However, there are at least two types of investments: the creation of the underlying database (case 1) and the updating and maintaining of the database (case 2).

In general, policymakers might decide that only one of these cases actually needs protection. In a world dominated by case 2 investments, the long protection periods associated with case 1 are probably unnecessary. Even if case 1 protections are needed, moreover, very few U.S. firms willingly make investments that cannot be recouped in 15 to 25 years. For this reason, the time periods found in the E.U. Directive, the WIPO draft treaty, and H.R. 2652 are almost certainly too long.

Finally, the question of whether to protect updates is separate and distinct from that of case 1 protection. If case 2 protection is granted, it probably should not last significantly longer than the mean time between updates (i.e., 1 or 2 years).139

Option 4: Sui Generis Protection with Not-for-Profit/Academic Exemptions

A second way to adjust the amount of protection afforded by a sui generis statute is to permit copying in certain clearly defined circumstances. One added advantage of this approach is that exemptions can be tailored to protect socially useful activities. Database providers who want to see their products disseminated on a not-for-profit basis should be encouraged. Exemptions can shelter not-for-profit databases from inadvertent statutory pressures to privatize (Issue 6).

The E.U. Directive allows member states to exempt copying “for the purposes of illustration for teaching or scientific research” or a “non-commercial purpose.” An earlier NRC committee similarly recommended that any future legislation should embrace the principle that “[d]atabase owners should never possess the right to preclude access to otherwise publicly available data when sought for purposes of basic scientific research.”140

Practically all of today's science and technology fields depend on one or more not-for-profit databases. For this reason, the effects of privatization are likely to be pervasive.

Significant adverse impacts could include reduced real government funding levels and damage to the existing culture of science (Issue 6). Future legislation should move cautiously in this area.

Option 5: Sui Generis Protection with a Defense for Improved Databases

Since advocates of extended database protection usually base their arguments on free ridership, it might make sense to exempt copiers who are willing to incur substantial costs. This view fits naturally with the existing world, in which databases are typically created by combining, improving, and extending earlier products.

The principal drawback of such a defense is that “substantial improvement” is hard to define. The defense would presumably be available to any copier who invested in improvements, updates, and/or extensions at levels comparable to those of the original owner. Short of this, there is no obvious way to determine how substantial the copier's improvements would have to be, and the concept would almost certainly require judicial elaboration over time.

Option 6: Shrink-wrap Contract Reforms

Everyday experience suggests that the lawyers who write shrink-wrap and click-wrap contracts will continue to claim as many rights as possible—even when those rights happen to exceed the normal scope of copyright. The only real question, therefore, is what the courts will enforce. The draft UCC provisions discussed above provide little guidance.

The unpredictability and uncertainty of asking the courts to evolve common law solutions to the database problem were discussed under Option 1 above. However, common law unfair competition is at least based on free-ridership and other relevant concepts. In contrast, the shrink-wrap doctrine tends to be more concerned with contract law concepts like “offer,” “consent,” and “unconscionability.” Since these concepts have little or nothing to do with free ridership, reliance on the shrink-wrap doctrine is likely to divert attention from the public policy issues most relevant to databases.

Option 7: Administrative Solutions

In their preferred (second) solution, Reichman and Samuelson argue that all databases should be protected by automatic licensing according to a predetermined fee schedule.141 Although they recognize that automatic licensing schemes have met with mixed reviews in the past, Reichman and Samuelson believe that these criticisms could be ameliorated by (1) using an industry-based “collection society” to set baseline license fees and (2) allowing would-be licensees to opt out of the baseline by negotiating fee schedules directly with the database's owner.142

Reichman and Samuelson are right to point out that the collection-society concept has a history of mixed reviews. Potential problems include:

  • Need for Market-based Solutions. Economists have traditionally justified intellectual property because it creates a mechanism for turning private knowledge of research and development opportunities into socially optimal levels of investment. Reichman and Samuelson's proposal would replace this market mechanism with a collection society's judgment of what fees should be. For a particular database, the regulated price would either be lower than that required to cover costs (thereby jeopardizing investment) or higher (thereby deterring use).

  • Transaction Costs. Allowing participants to contract around the collection society may reduce transaction costs but will not eliminate them. (This is true for the same reason that allowing litigants to settle lawsuits has not put the court system out of business.)

  • Antitrust Concerns. Reichman and Samuelson correctly note that their proposal could be enacted only after removing “any antitrust barrier that stands in the way . . . .”143 However, the dangers of collusion should not be minimized. Giving an industry-based collection society the power to set database prices would create a political lightning rod. If suppliers (consumers) eventually became dominant, the temptation to impose monopoly (monopsony) solutions could become irresistible.

Given these concerns, Reichman and Samuelson's proposal should be viewed with caution absent strong evidence that the existence of niche markets has created a natural monopoly requiring regulation. Even then, the issue of whether license fees should be set by an industry-based collection society remains an open question.

CONCLUSION

The principal argument for statutory protection is that firms do not create enough databases because doing so would require a large up-front cost that is not currently protected. However, this paper has found little evidence that lack of statutory protection has prevented the creation of new products. The NRC study committee should ask witnesses for concrete examples where this has happened. It should also ask whether the assumption of large up-front costs is realistic. Most of the database industry's products may instead consist of updates and improvements whose cost can be recouped within a year or so.

This paper has found evidence that self-help can cause distortions. From the vendor's perspective, these include overinvestment in updates, graphics, software, and other enhancements at the expense of the databases themselves. From the consumer's perspective, self-help can unnecessarily restrict access to data. The NRC study committee will have to decide how serious such distortions are and whether they constitute an adequate case for reform.

From a legal standpoint, the committee should remember that virtually all commercially valuable data can be described as “compilations of information” and hence “a database.” So-called sui generis protection is therefore unlikely to stay confined to a particular type of information for very long. Sooner or later, most commercially valuable information will probably end up receiving database protection. This may or may not be a sensible result, but that is the choice.

Finally, the benefits of reform must be weighed against its likely costs. Potential problems include, but are not limited to, deterring the creation of new databases from earlier products, creating monopoly power within niche markets, making databases unaffordable by the same university researchers whose work typically advances knowledge in the first place, and damaging the culture of science through inappropriate privatizing and hoarding of information.

Throughout this century, most arguments for and against database protection have proceeded from relatively simple assumptions about why databases are created and how they are sold. This report shows that the reality is much more subtle. The January 14-15, 1999, NRC Workshop on Promoting Access to Scientific and Technical Data for the Public Interest represents a unique opportunity to deepen and extend this understanding.

EXHIBITS*
  1. The “Gale 100” List

  2. Sample pages from various Web sites, including those of the University of California at Berkeley and Yahoo, that were used to compile Table C.3 and Table C.4

  3. Notes from the November 10, 1998 interview with Richard B. Firestone, Lawrence Berkeley National Laboratory

  4. Reprint of Appendix C of the 1997 National Research Council report, Bits of Power: Issues in Global Access to Scientific Data. National Academy Press, Washington, D.C.

  5. Notes from the November 25, 1998 interview with Karen Hunter, Elsevier Science; notes from the November 10, 1998 interview with Richard B. Firestone, Lawrence Berkeley National Laboratory

  6. Notes from the November 10, 1998 interview with Thomas R. Slezak, Lawrence Livermore National Laboratory

  7. Reprint of Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, Official Journal of the European Community, No. L 77/20, 3/96

  8. World Intellectual Property Organization Basic Proposal for the Substantive Provisions of the Treaty of Intellectual Property in Respect to Databases to be Considered by the Diplomatic Conference, CRNR/DC/6, August 30, 1996, available on U.S. Copyright Office Web site at <http://lcweb.loc.gov/copyright/wipo/wipo6.html>

  9. U.S. Congress, H.R. 2652, Collections of Information Antipiracy Act

  10. Uniform Commercial Code Article 2B-110 (August 1998 draft)

* Please note that these exhibits, which were prepared as attachments to this paper, are available for viewing in the National Research Council's Public Access Records Office.


NOTES

1 Laura D'Andrea Tyson and Edward Sherry, Statutory Protection for Databases: Economic and Public Policy Issues (1997) (report commissioned by the Information Industry Association). The breadth of this definition is intentional. Indeed, the European Union's Directive on Databases expressly extends to “literary, artistic, musical or other collections of works or collections of other material such as texts, sounds, images, numbers, facts, and data [as well as] collections of independent works, data or other materials which are systematically or methodically arranged and can be individually accessed.” E.U. Directive at ¶ 17. In fairness, the E.U. Directive does include an ad hoc exclusion for “audiovisual, cinematographic, literary, or musical work as such.” Id.

2 The Gale Directory of Databases describes itself as “easily . . . the most complete guide to the electronic database industry worldwide.” Kathleen Lopez Nolan (ed.), Gale Directory of Databases (New York and London, 1995) at p. vi. Entry-by-entry description of the sample can be found in Exhibit 1. A particularly useful feature of the Gale Directory is Professor Martha E. Williams' annual profiles of the industry.

3 The best example of this is a company called Silver Platter. Silver Platter's nuclear databases are discussed in my interview with Richard Firestone of Lawrence Berkeley National Laboratory (see Exhibit 3).

4 Tom Slezak, a computer scientist at Lawrence Livermore National Laboratory, confirmed that these methods conferred “reasonable and prudent security” when I interviewed him on November 20, 1998. A memorandum summarizing Slezak's comments can be found in Exhibit 6 of this paper.

5 Perhaps the best example of this in the sample was an online video store that allowed users to search a massive database of over 125,000 movies, many of which were not even available commercially.

6 See also J.H. Reichman and Pamela Samuelson, “Intellectual Property Rights in Data?” Vanderbilt Law Review, Vol. 50, p. 51 (January 1997) at p. 67 (“To the extent that government generated or university generated data remain noncommercialized, their vulnerability to technically refined means of [copying] may be of relatively little importance. Presumably, the originators want the broadest possible distribution of their data sets.”)

7 Joel S. White (personal communication).

8 Joel S. White (personal communication). Info-Trac copies the articles at Bay Area libraries.

9 The UC Berkeley and Yahoo Web pages used to compile Table C.3 and Table C.4 can be found in Exhibit 2. Interested readers may want to acquire a feel for existing databases by skimming through these listings.

10 By way of example, the Berkeley Physics Department Web site reports that Inspec, MathSciNet, and Chemical Abstracts all existed on paper before their current electronic incarnations. Inspec is more than 100 years old.

11 For example, the UC Berkeley Engineering Library's Web site lists 47 of its 60 Web sites as “UC only” or “UCB only.” Publisher Web sites were similarly restricted, although four offered their products on a trial access basis.

12 This section is taken from a five-hour interview between the author and Dr. Richard Firestone, the head of LBL's Table of Isotopes project. Curious readers will find full details in a memorandum attached as Exhibit 3. An earlier workshop studied Brookhaven's related but distinct ENSDF database. See National Research Council. 1997. Bits of Power: Issues in Global Access to Scientific Data, National Academy Press, Washington, D.C., at Appendix C. A copy of Appendix C is reproduced as Exhibit 4 to this paper.

13 This figure does not include the actual work of reviewing articles, which is done on a volunteer basis throughout the world.

14 The Brookhaven National Laboratory followed a similar path with respect to its related Evaluated Nuclear Structure Data File (ENSDF) database. Like Berkeley, Brookhaven has devoted extensive effort to editing ENSDF's data and improving ENSDF so that it can support advanced relational search engines. Brookhaven has also created a new version of ENSDF for use by medical workers. Finally, it is working to improve dissemination by upgrading its Web site and making the same data available on floppy disk and CD-ROM. See National Research Council. 1997. Bits of Power: Issues in Global Access to Scientific Data, National Academy Press, Washington, D.C., pp. 205-206.

15 Surprisingly, the private CD-ROM/book package competes successfully with—and indeed seems to benefit from—its Web-based counterpart. In addition to the relatively minor enhancements required by the publisher, there seem to be intrinsic reasons for this. For example, books are often easier to use; searches conducted over the Web are not confidential; and CD-ROMs are permanent, whereas data on the Web can potentially change or disappear without warning.

16 This section is taken from a brief interview with Karen Hunter, who handles copyright issues and strategic planning for Elsevier's scientific journals and databases. Curious readers will find full details in a memorandum attached as Exhibit 5, which also contains additional information reproduced from Reed Elsevier's Web site.

17 Reed Elsevier also publishes many nonscientific databases, including The Official Airline Guide.

18 Elsevier Science points out that these policies are broadly similar to those of many other journal publishers, including Academic Press, the American Chemical Society, the American Institute of Physics, and the American Geophysical Union.

19 J.H. Reichman (personal communication).

20 This section is taken from a four-hour interview between the author and Thomas R. Slezak, head of bioinformatics for Lawrence Livermore National Laboratory's human genome sequencing group. Full details can be found in a memorandum attached as Exhibit 6. A supplementary discussion of genome databases can be found in Appendix C to the NRC's Bits of Power report and is reproduced here as Exhibit 4.

21 In 1997, Science described the bottleneck this way:

Because the world's major biological databases are constructed differently, it is virtually impossible to devise search programs to tap into them all effectively. A user has to hop from one to the other using each database's search engine to retrieve information that comes in a variety of different formats.

The article also described how a “group of leading pharmaceutical companies” was putting its “considerable weight behind the development of common standards.” Nigel Williams, “Drug Firms Back Move to Link Databases,” Science, Aug. 15, 1997.

22 Because private biotechnology companies believe that submitting searches over the Web compromises security, each maintains internal copies of the 200+ public databases needed to conduct research and uses in-house software engineers to update them nightly. Since many online databases tend to change computing conventions abruptly, systems often crash without warning. These crashes cause recurring panics within corporate management information systems departments.

23 GenBank started out as a traditional database that tried to comment on and add value to journal articles. GenBank converted to its current format because it could no longer keep up with the volume of data.

24 See, e.g., Nigel Williams, “Unique Protein Database Imperiled,” Science, May 17, 1996 (international reaction to threatened closure of Swiss-Prot database); Howard M. Cann, “After the Genome Database,” Science, March 13, 1998 (user comment on closure of GDB database). Dr. Cann's letter is particularly illuminating for its discussion of the current system and how it might be fixed:

In the post-GDB-project world, the user may have to click more often to find mapping information [at other Web sites] and perform interpretation and editing personally. Problems that might be expected in the absence of GDB coordination include recognizing duplicates of new markers and conflicting map locations from different resources.

Perhaps the community will get by with the available final copy of the GDB and with database “shopping” on the Internet. If not, the international community may have to pull together to arrive at a solution. For instance, database host institutions could form a consortium for the purpose of reviewing new data and maps in a coordinated fashion before release to the public. External expert reviewers might volunteer efforts (similar to those of the “editor” group of scientists that now review and edit GDB data) within the framework of such a consortium, injecting further assurances of quality and coordination. This type of program or something with a similar intent could be provided at a minimal cost increase and would continue to support the efforts of many scientists involved in mapping and eventually identifying genes underlying complex disorders.

25 These products were concentrated in particularly lucrative areas such as expressed sequence tags or, more recently, gene sequences.

26 “Incyte Serves Up Information, Part I,” In Vivo, May 1996.

27 See, e.g., Jon Cohen, “The Genomics Gamble,” Science, Feb. 7, 1997. The database user exercised its right to terminate the partnership in late 1996.

28 “Genetic Warfare,” The Economist, May 16, 1998.

29 The fee amounted to roughly 1 percent of a typical large pharmaceutical company's research and development costs.

30 “Incyte Serves Up Information, Part II,” In Vivo, May 1996.

31 Incyte's bioinformatics capabilities are summarized on its Web site. The interested reader can find selected Web pages included as part of Exhibit 6 to this report.

32 “Perkin-Elmer's Pharmacogenomics Spin-Off Creating a New Customer for Instrumentation,” Bioventure View, June 1, 1998.

33 Except as noted, all information reported in Examples 6(a) through (f) is taken from Tyson and Sherry, Statutory Protection for Databases: Economic and Public Policy Issues, at pp. 3-6. Supplemental research was taken from the Web and is collected at Exhibit 2. Examples 8(g) through (h) are based on descriptions found in Appendix C to Bits of Power at pp. 209-210 (materials science), pp. 210-212 (chemistry), pp. 214-216 (geophysics), and pp. 217-218 (meteorology). A copy of Appendix C is attached as Exhibit 4 to this paper.

34 Interview with Richard Firestone (Exhibit 3).

35 Interview with Karen Hunter (Exhibit 5).

36 See Examples 1 (commercial CD-ROMs), 2 (Web sampler), 4(a) (full-text physics journals), 4(b) (full-text engineering journals), 5 (copyrighted nuclear science graphics), and 6 (Elsevier Science full-text journals).

37 See Examples 1 (commercial CD-ROMs), 2 (Web sampler), 5 (nuclear databases), 6 (Elsevier Science search engine), 7 (biotechnology databases), 8(a) (POISINDEX software), and 8(b) (MDL Drug Database software).

38 See Examples 3(a) (Dataquest semicustom reports) and 7 (semicustom databases in biotechnology).

39 See, e.g., ProCD, Inc. v. Zeidenberg, 86 F.3d 1447 (7th Cir. 1996) (“contracts about trade secrets may be enforced”).

40 See Examples 1 (CD-ROM sampler), 2 (Internet sampler), 4(a) (physics CD-ROMs), 4(b) (engineering CD-ROMs), and 4(c) (engineering Web sites).

41 See Examples 1 (Internet sampler) and 8(f) (materials science database).

42 See Examples 1 (Internet sampler), 4(a) (physics journals), and 4(b) (engineering resources).

43 Although technologically less secure, CD-ROM makers often use the parallel strategy of encryption to block access to their databases. This type of self-help recently received a legal boost when the U.S. Congress enacted P.L. 105-304 (the Digital Millennium Copyright Act). The statute establishes criminal fines and penalties for anyone who tries to defeat an electronic encryption system.

44 If anything, the statistic errs on the side of conservatism since it ignores products that advertise irregular updates.

45 See Examples 3(b) (Info-Trac), 4(a)-(c) (journals, indexes, and bibliographies), 5 (nuclear science), 6 (Elsevier Science), and 7 (biotechnology).

46 See Examples 5 (nuclear physics), 7 (biotechnology), 8(a) (POISINDEX), and 8(e) (animal husbandry).

47 See Example 5 (updating of nuclear physics databases to reflect improved data), 7 (biotechnology), and 8(f) (updating of materials science databases to reflect improved data).

48 This is also a popular strategy for Web-based businesses. See Example 1.

49 It might be argued that there is no reason to enact legislation that encourages cost-free databases because such spinoffs will exist whether or not they are protected. This argument ignores the role of price signals in achieving economic efficiency. If industry members are not allowed to recapture the value of spin-offs, the underlying product will be more expensive (and less used) than it should be. Suzanne Scotchmer, “Standing on the Shoulders of Giants: Cumulative Research and the Patent Law,” Journal of Economic Perspectives (Winter 1991), pp. 29-41.

50 International News Service v. Associated Press, 248 US 215 (1918).

51 Id. at p. 236.

52 Id. at p. 241.

53 Jack E. Brown, “Obscenity, Anonymity, and Database Protection: Emerging Internet Issues,” The Computer Lawyer, 1997 (citations omitted). In 1942, a federal judge argued that INS would have been decided differently if it had been heard in that year. Id. at fn. 78.

54 One confusing aspect of Feist is that many commentators who disagree with the Court's reasoning nevertheless support its final ruling. For example, Tyson and Sherry argue that telephone book data should not be protected because they are generated “with no additional effort” in the course of operating a publicly sanctioned monopoly. Laura D'Andrea Tyson and Edward Sherry, Statutory Protection for Databases: Economic and Public Policy Issues (1997) (report commissioned by the Information Industry Association). In narrowly legal terms, the same result could also be reached by arguing that firms that exercise “monopoly power” in one market should not use it to obtain an “unfair” cost advantage elsewhere.

55 Feist Publications, Inc. v. Rural Telephone Service Co., Inc., 499 US 340 (1991).

56 Id. at pp. 342-345.

57 Id. at p. 344.

58 Id. at p. 348.

59 Id. at pp. 352-353.

60 Id. at p. 354.

61 Warren Publishing, Inc. v. Microdos Data Corp., 52 F.3d 950 (11th Cir. 1995).

62 Key Publications, Inc. v. Chinatown Today Publishing Enterprises, Inc., 945 F.2d 509, 514 (2d Cir. 1991).

63 BellSouth Advertising & Publishing Corp. v. Donnelly Information Publishing, Inc., 999 F.2d 1436, 1441 (11th Cir. 1993).

64 Id. at p. 1441.

65 Mason v. Montgomery Data, Inc., 967 F.2d 135, 139 (5th Cir. 1992).

66 Nester's Map & Guide Corp. v. Hagstrom Map Co., 796 F. Supp. 729, 733-34 (E.D.N.Y. 1992).

67 CCC Information Services, Inc. v. MacLean Hunter Market Reports, Inc., 44 F.3d 61, 67 (2d Cir. 1994). According to CCC, an author's “loose judgment” that “vast regions” of the United States could be treated as a single market was also protectable.

68 Warren Publishing, supra, at pp. 951-52.

69 Skinder-Strauss Associates v. Massachusetts Continuing Legal Ed., Inc., 914 F. Supp. 665, 675 (D. Mass. 1995).

70 Cable News Network, Inc. v. Video Monitoring Services of America, Inc., 940 F.2d 1471 (11th Cir. 1991) at 1485. The opinion was subsequently vacated on other grounds and is cited here as an indication of what future courts might decide if faced with the same question. Cable News Network, Inc. v. Video Monitoring Services of America, Inc., 949 F.2d 378 (1991).

71 National Basketball Assn. v. Motorola, Inc., 105 F.3d 841 (2d Cir. 1997).

72 In theory, database owners could argue that database updates are time-sensitive in an economic sense and should therefore be protected. This would require a semantic stretch beyond anything in NBA itself.

73 The obvious counterargument is that many scientific databases follow conventions that leave little room for creativity. For example, spectra almost always show frequency on one axis and amplitude on the other. A better argument might be that the experimenter's choice of which data to present still reflects creative choices. Even this argument might not be enough for human genome sequencing or other areas of routinized inquiry.

74 See Interview with Karen Hunter (Exhibit 5).

75 The E.U. Directive on Databases suggested that existing databases could even be “rearranged electronically . . . to produce a database of identical content which, however, does not infringe any copyright in the arrangement of [the] database.” Directive at ¶ 38.

76 Sinai v. California Bureau of Automotive Repair, 25 USPQ 2d 1809, 1811 (N.D. Cal. 1992).

77 ProCD, Inc., supra. One noteworthy aspect of the ProCD decision was the court's statement that it would “refrain from adopting a rule that anything with the label ‘contract’ is necessarily outside the preemption clause.” Id. at p. 1455.

78 Vault Corp. v. Quaid Software Ltd., 847 F.2d 255 (5th Cir. 1988).

79 European Council Directive No. 96/9/EC, O.J.L 77/20 (1996). The E.U. Directive itself is not intended to be legislation. Instead, it sets forth requirements that member states must satisfy by enacting “at least materially equivalent” statutes. Id. at ¶ 32. A copy of the Council's Directive is attached as Exhibit 7.

80 Id. at Art. 1, ¶ 2.

81 Id. at Art. 5, subpart (a).

82 Id. at Art. 5, subpart (b).

83 Id. at Art. 5, subparts (c) through (e).

84 Id. at Art. 7, ¶ 1.

85 Id. at Art. 10, ¶ 2.

86 Id. at Art. 10, ¶ 3.

87 Id. at ¶ 56.

88 Id. at Art. 6, ¶ 2(b).

89 Id. at Art. 6, ¶ 2(d).

90 17 USC § 107.

91 “Basic Proposal for the Substantive Provisions of the Treaty on Intellectual Property in Respect of Databases to Be Considered by the Diplomatic Conference,” dated August 30, 1996 (hereinafter WIPO). Interested readers will find a copy of the WIPO draft at Exhibit 8.

92 Id. at Art. 2, ¶ (i). The definition would have specifically included “collections of literary, musical or audiovisual works or any other kind of works, or collections of other materials such as texts, sounds, images, numbers, facts, or data representing any other matter or substance. It is worth pointing out that in addition to many kinds of works and other information materials, databases may contain collections of expressions of folklore.” Id. at comment 2.02.

93 Id. at Art. 2, ¶ (ii) and Art. 3, ¶ (1). The definition of “substantial part” was further amplified in a note:

The substantiality of any portion of the database is assessed against the value of the database. This assessment should evaluate the qualitative and quantitative aspects of the portion, although neither aspect is more important than the other . . . . The value of a database refers to its commercial value. This value consists on the one hand of direct investments made in the database and on the other hand of the expected market value of the database. This assessment may also take into account diminution of market value that may result from the use of the portion, including the added risk that the investment in the database will not be recoverable. It may even include an assessment of whether a new product using the portion could serve as a commercial substitute for the original, diminishing the market for the original.

Id. at Note 2.09. The concept of an “investment” included any and all “human, financial, technical or other resources” devoted to “the collection, assembly, verification, organization, or presentation of the contents of the database.” Id. at Note 2.10 (iv).

94 Id. at Art. 5, ¶ (1). The accompanying notes emphasized the point by explaining that such exceptions “may never conflict with normal exploitation of the database” and could not “unreasonably impair or prejudice the legitimate interests, including economic interests, of the rightholder.” Id. at Note 5.01.

95 WIPO at Art. 8.

96 Jocelyn Kaiser (ed.), “Treaty on Database Access Stalled,” Science, Dec. 20, 1996.

97 The question of whether the U.S. Constitution allows Congress to pass European-style database legislation is outside the scope of this report. For a list of possible problems, see U.S. Copyright Office, Report on Legal Protection of Databases, August 1997.

98 A copy of H.R. 2652 is attached as Exhibit 9 hereto.

99 The Digital Millennium Copyright Act of 1998 was subsequently enacted as Public Law 105-304.

100 H.R. 2652 at § 1201.

101 Id. at § 1202.

102 Id. at § 1203(a).

103 Id. at § 1202(b) and (c).

104 Id. at § 1203(d) (emphasis supplied). The reference to “potential markets” would have been more restrictive than the corresponding E.U. Directive, which permits copying “for the purposes of . . . scientific research, as long as the source is indicated and to the extent justified by the non-commercial purpose to be achieved.” E.U. Directive at Art. 9, subpart (a). The “potential markets” language was dropped shortly before the bill went to conference committee. Paul Uhlir (personal communication).

105 Id. at § 1203(e).

106 Id. at § 1204.

107 Id. at § 1206(d). Courts would also have been given discretion to reduce damages for any employee of a nonprofit educational, scientific, or research institution who “believed and had reasonable grounds for believing that his or her conduct was permissible under this chapter.” Id. at § 1206(e).

108 Id. at § 1207.

109 UCC 2B-110 (August 1998 draft). A copy of the draft provision with accompanying notes can be found at Exhibit 10. Interested readers can view the entire file at <http://www.law.upenn.edu/library/ulc/ucc2b/2b898.htm>.

110 Id. at Note 2.

111 Id. at Note 3. Significantly, the Reporter adds that state court judges “may look to federal copyright and patent laws for guidance on what types of limitations . . . ordinarily seem appropriate.” Id. This suggests that federal law may provide persuasive reasons why state courts should refuse to enforce particular licenses even where it does not directly command them to do so.

112 Id. at Note 1.

113 Id. at Note 3.

114 Cf. Reporter's Note 3 to draft UCC provision 2-105 (copyright statute permits “contractual restrictions on use”).

115 Id. at Note 3 (emphasis supplied).

116 J.H. Reichman and Pamela Samuelson, “Intellectual Property Rights in Data?” Vanderbilt Law Review, Vol. 50 (January 1997), p. 51.

117 This policy-oriented approach to the problem necessarily ignores justice-based appeals that creators “should” be compensated. Suffice it to say here that strong normative arguments exist against rewarding inventors who knew in advance that certain types of activity would not be compensated.

118 Karen Hunter of Elsevier Science did report that her company had turned down nonscientific database products because it was afraid of copying. Furthermore, it is possible and even likely that counterexamples in science and engineering could be found if a more systematic survey were conducted. The apparent rarity of such counterexamples is nevertheless striking.

119 Walter Nicholson, Microeconomic Theory: Basic Principles and Extensions (6th ed. 1995) at pp. 625-628.

120 Michael Heller and Rebecca Eisenberg, “Can Patents Deter Innovation? The Anticommons in Biomedical Research,” Science (May 1, 1998) 280:698-701; see also Suzanne Scotchmer, “Standing on the Shoulders of Giants: Cumulative Research and the Patent Law,” Journal of Economic Perspectives (Winter, 1991), pp. 29-41.

121 Jerry Green and Suzanne Scotchmer, “On the Division of Profit Between Sequential Innovators,” Rand Journal of Economics (1996) 27:322-331.

122 See, for example, Andrew Lawler, “Database Access Fight Heats Up,” Science (November 15, 1996); see also Bits of Power, supra, p. 171 (recommending that fair-use-type provisions be included in any future database legislation).

123 See, e.g., Walter Nicholson, Microeconomic Theory: Basic Principles and Extensions (6th ed., 1995) at pp. 568-69.

124 An additional difficulty would be encountered during the transition period that followed any reform. This is because owners of pre-reform databases would receive full protection even though they had not paid for their own “head starts” under the old system.

125 Some proposed legislation suggests that copying should be permitted where the new database serves a different market than the first one. This is another kind of “honest copying” exemption.

126 Journal prices rose 115 percent between 1986 and 1994. A leading study commissioned by the Association of Research Libraries blamed the increases on an “imperfect, monopoly-like marketplace” controlled by a small group of publishers. See, e.g., Gary Taubes, “Electronic Preprints Point the Way to ‘Author Empowerment,’” Science, Feb. 9, 1996.

127 Cf. Bits of Power, supra, at p. 114 (criticizing economic argument in favor of having researchers pay for databases from their individual research budgets as politically unsustainable).

128 Interview with Thomas Slezak (bioinformatics expert), Exhibit 6; see also interview with Karen Hunter, Exhibit 5.

129 Interview with Karen Hunter, Exhibit 5; see also Eliot Marshall, “Please Pass the Data,” Science 276:1961 (June 27, 1997) (reporting “recent pressure from [the EU] to give industry first crack at any genome data”).

130 U.S. Patent and Trademark Office, Report on and Recommendations from April 1998 Conference on Database Protection and Access Rules (July 1998) at p. 16.

131 Id. at pp. 14-17; Terry M. Sanks, “Database Protection: National and International Attempts to Provide Legal Protection for Databases,” Florida State University Law Review (1998) 25:992.

132 Directive at ¶ 11. (“Whereas there is at present a very great imbalance in the level of investment in the database sector . . . between the Community and the world's largest database producing third countries.”) At first blush, the E.U.'s logic seems paradoxical since greater incentives would also encourage U.S. companies to compete even harder. However, the European Union may believe that U.S. companies have already decided to enter the database market. If so, additional protection might persuade risk-averse European firms to enter the market without eliciting still more investment by the Americans.

133 Andrew Lawler (ed.), “EU Database Directive Raises Hackles,” Science 279:165 (Jan. 9, 1998).

134 National Basketball Assn., supra, at pp. 852-853.

135 See U.S. Patent and Trademark Office Report, supra, at p. 6 (reporting suggestion by Professors Ginsburg and Reichman).

136 Element 4's limitation to use of information “in direct competition with a product or service offered by the plaintiff” is more suspect. From an economic perspective, society wants investment incentives to reflect the potential value of a proposed database to all markets, not just the ones that the owner happens to be in at any given time. Suzanne Scotchmer, “Standing on the Shoulders of Giants: Cumulative Research and the Patent Law,” Journal of Economic Perspectives (Winter, 1991), pp. 29-41.

137 Reichman and Samuelson, supra, at pp. 142-143.

138 Id. at p. 143 and fn. 423.

139 Various commentators have suggested that initial start-up protection should be extended each time a database is updated. If the database has only been updated, it makes little sense to extend start-up protection a second time.

140 Bits of Power, supra, at p. 166.

141 Reichman and Samuelson also suggest using an initial blocking period in which no databases could be copied. Reichman and Samuelson, supra, at pp. 145-146. This is conceptually identical to sui generis protection (Option 3) and will not be discussed further.

142 Id. at pp. 146-150.

143 Id. at p. 148.
