2
Incentives and Disincentives Affecting the Availability and Use of Scientific and Technical Databases
As noted in Chapter 1, scientific and technical (S&T) databases are not just a by-product of research, but also an essential foundation on which progress in science builds. Increasingly, too, databases are a research tool that can be sold or licensed and used as the input for new products, the creation of customized derivative databases, and innovations to broaden the scope and increase the pace of discovery. The incentives and disincentives to provide databases reflect all these uses, as well as financial motives. This chapter points out the divergent objectives of organizations that produce and distribute S&T databases, and it outlines some of the economic factors affecting access to such data. In addition, it elaborates on the committee's conclusion that any new legislation that would change the status quo must take into account how it would alter the incentives to produce both original and derivative databases, how it would affect the dissemination and use of databases (especially regarding whether access would be exclusive or unrestricted, particularly for the scientific community), and what the unintended consequences might be for scientific inquiry and other public-interest uses.
DIVERGENT OBJECTIVES OF ORGANIZATIONS THAT PRODUCE AND DISTRIBUTE SCIENTIFIC AND TECHNICAL DATABASES
Original and derivative S&T databases are produced by government, not-for-profit/academic, and for-profit organizations (see Table 1.1 in Chapter 1 for selected examples). Whereas individual researchers typically are motivated by curiosity, a desire to contribute to the knowledge base, and an opportunity to influence the thinking of others, their employers also have in mind a combination of organizational mission and funding considerations. Table 2.1 summarizes typical objectives of government, not-for-profit/academic, and for-profit organizations in producing and disseminating S&T databases and the different weight each places on mission versus financial considerations.
In government and not-for-profit research organizations, including universities, basic research institutes, and national laboratories, the advancement of knowledge as an intrinsic good and in the service of national goals motivates the production and distribution of S&T databases; exploiting data for financial gain is subordinate to fulfilling public-interest objectives. The data products of not-for-profit and government organizations are judged primarily by criteria that are not directly profit related, such as their value to end users, their potential value in advancing a field, their ability to enhance the status of an institution and its research or educational capabilities, and similar objectives typically associated with public-interest or public good activities related, for example, to improving knowledge of disease factors or interdependencies within ecosystems.
Of course, not all not-for-profit institutions behave in this generalized way. At one end of the spectrum are organizations that, especially if they are fully subsidized, distribute their data freely on the Internet without any effort at cost recovery. Many individual researchers or academics certainly operate this way. At the other end are not-for-profit institutions that seek to maximize the revenues from their S&T databases, subject to the constraints of their tax-exempt status, to finance future R&D and database development in order to remain at the forefront of their respective fields. Most not-for-profits, however, fall somewhere in the middle in trying to reconcile their public-interest mission, on the one hand, with the need to generate sufficient operating revenues, on the other. Universities present a good example of this dichotomy, with the trend in recent years toward greater cost recovery1 and greater attention to the protection and exploitation of their intellectual property.2
In contrast, the for-profit sector seeks mainly to generate profit for management and shareholders. Of course, market success also depends on creating value for users; otherwise, the data products would not be successful. High value can translate to high prices, and such pricing inevitably restricts access. Nevertheless, there are exceptions to the rule here as well, since not all for-profit entities attempt to charge as much as they could for their proprietary databases, perhaps instead using such databases as a marketing tool for other products or services, or choosing revenue-generating methods such as advertising as an alternative to charging users for access to data. Such exceptions, however, usually are seen in non-scientific database markets that have large user clienteles.3
1. For a discussion of the trend in academic institutions to protect their research results as intellectual property, see Kenneth W. Dam (1998), "Intellectual Property and the Academic Enterprise," John M. Olin Law & Economics Working Paper No. 68 (2d Series), University of Chicago Law School.
2. Intellectual Property Task Force (1999), "Intellectual Property and New Media Technologies: A Framework for Policy Development and AAU Institutions," Association of American Universities, May 13, Washington, D.C., 31 pp., available online at <www.tulane.edu/~aau/AAUPolicy.html>.
TABLE 2.1 Typical Objectives of Organizations That Produce and Disseminate S&T Databases
Not-for-Profit/Academic | For-Profit
Fulfill mission, including furthering research, education, creation of knowledge, and discovery; remain economically viable | Achieve corporate objectives, including profit making and growth, and ensure shareholder and customer satisfaction
Advance knowledge by conducting new research and by validating and building on the research of others; educate future researchers; contribute to basis for producing social benefits; build reputation and status of researchers and their institutions | Support development of new or improved products or services; develop databases for direct sale or lease as products or as services in support of other products or services
Encourage open sharing of ideas; enable existing data to be reused for discovery of new knowledge; invite review and validation of research results; facilitate use of research results for product development by S&T community and commercial concerns; recover costs or generate revenue in support of mission | Disseminate data to protect competitive advantage when databases are used for development of other products or are themselves products or services; disseminate via sale or license to generate revenue, enhance customer base and market position, gain competitive advantage, achieve profits, or recover costs
Open, with data and ideas shared after results have been published | Internal and confidential, or available/marketed externally at a cost set by the organization
Moderate; ranges from very low (for fully subsidized databases) to moderate (when cost recovery is necessary) to high (when data are a source of revenue required to support mission) | Very high; databases regarded as investments to be protected whether they are used in product development or are themselves products or services to be sold
SCIENTIFIC AND TECHNICAL DATABASE COSTS, PRICING, AND ACCESS
Despite their differences in mission and motivation, organizations in all three sectors are constrained by financial responsibilities: federal government agencies must justify their expenditures to Congress; not-for-profit entities must make their organizations at least viable (with any excess of income over expenses reinvested in the organization); and commercial firms must answer to shareholders. All organizations therefore give careful consideration both to the costs associated with database production and distribution and to the potential for generating revenue.
Production and Distribution Costs
The costs of databases can be categorized as production costs and distribution costs. Database production includes both data collection and database preparation. The cost of data collection can be very high and varies with the size, complexity, and difficulty of obtaining the data. It includes the costs of the observational or experimental instrumentation, deployment and maintenance of that instrumentation for the lifetime of the data collection project, and related infrastructure costs such as initial data storage. Data acquisition costs typically represent the bulk of the costs of a database. In major data collection projects such as those involving remote sensing satellites (e.g., National Oceanic and Atmospheric Administration geostationary or polar-orbiting spacecraft), or in large physics experiments requiring specialized facilities (e.g., high-energy particle accelerators), the costs can easily be in the hundreds of millions, or even billions, of dollars. Database preparation costs are those associated with preparing a database for dissemination, including ensuring data quality and accuracy, and enhancing the utility of the data for users. The cost of these efforts can vary from modest to large, depending on the nature of the database and its intended use. For example, the cost of the labor for abstracting or indexing databases can make their production expensive.
Distribution costs include the cost of making, sending, and billing for physical copies; any additional transaction costs such as those for licensing and related administrative activities; and user-specific costs such as those for database maintenance and specialized handling. Distribution costs tend to be much lower than database production costs, particularly if the Internet is the medium of dissemination and little effort is devoted to marketing or user assistance. In some cases, the distribution costs are simply the costs of copying.
Producing databases for online distribution is more a matter of potential market opportunity and incentives than of potential cost savings. Customers have come to expect online distribution, particularly in the S&T data market, in which nearly all users are now connected to networks and are technically sophisticated. For such users, online access can improve a database's accessibility, functionality, and utility. From the producer's perspective, it opens new market opportunities or broadens existing ones, but it is more likely to shift costs than to reduce them. The concern over adequate protection of a rights holder's database products is exacerbated in the online milieu by factors such as the possibility of unauthorized access, duplication of content, and mass redistribution. As discussed in Chapter 3, however, making the information available online also can reduce the opportunities for misappropriation of a database by preventing the user from accessing all the underlying data.
Pricing and Access
Because production costs for most S&T databases are high relative to the subsequent distribution costs, there is a trade-off between efficient access and cost recovery. An economist would define an efficient-access price for a database as one that is equal to the cost of making the database accessible to one additional (the marginal) user. A higher price inefficiently excludes potential users, whether the data activity is in the public or the private sector. The users' welfare could be enhanced without increasing the burden on taxpayers if users were allowed to buy access to the data at the marginal cost of dissemination.4
An efficient-access price will almost never generate revenue that matches the database production costs.5 Instead, the database must be subsidized in some way. Cost recovery has an equity justification, namely that the database is "paid for" by the users instead of being subsidized out of general revenue. It also subjects the database to a (weak) market test of whether the value to users exceeds the cost. If not, then revenue cannot exceed costs.
Efficient-access pricing is feasible only in conjunction with some other source of revenue or subsidy, such as taxpayer support (in the case of federal agencies) or endowments or industry or government contract support (in the case of not-for-profits). Federal agencies are required by law to provide efficient access, limiting charges to no more than the incremental cost of data dissemination, and not including the average cost of producing the database.6 Frequently, the federal agency simply charges for the cost of reproduction and distribution, which can be zero if the database is distributed online.7
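The trade-off between efficient access and cost recovery described above can be made concrete with a small numerical sketch. Everything in it is an assumption chosen for illustration rather than a figure from this report: the linear demand curve, the hypothetical function names, and the dollar amounts (a $1,000,000 production cost, a $10 marginal cost of serving one more user, and 10,000 potential users, none of whom would pay $1,000 or more).

```python
"""Illustrative comparison of efficient-access and cost-recovery pricing.

The linear demand curve and all figures are hypothetical assumptions.
"""


def users_at_price(price: int, max_users: int, choke_price: int) -> int:
    """Hypothetical linear demand: max_users at price 0, none at choke_price."""
    if price >= choke_price:
        return 0
    return max_users * (choke_price - price) // choke_price


def shortfall(price: int, production_cost: int, marginal_cost: int,
              max_users: int, choke_price: int) -> int:
    """Total cost minus revenue; a positive value must come from a subsidy."""
    users = users_at_price(price, max_users, choke_price)
    total_cost = production_cost + marginal_cost * users
    return total_cost - price * users


def break_even_price(production_cost: int, marginal_cost: int,
                     max_users: int, choke_price: int):
    """Lowest whole-dollar price at which revenue covers all costs, or None."""
    for price in range(marginal_cost, choke_price):
        if shortfall(price, production_cost, marginal_cost,
                     max_users, choke_price) <= 0:
            return price
    return None


if __name__ == "__main__":
    # Assumed figures, for illustration only.
    production_cost, marginal_cost = 1_000_000, 10
    max_users, choke_price = 10_000, 1_000

    recovery_price = break_even_price(production_cost, marginal_cost,
                                      max_users, choke_price)
    for label, price in [("efficient access", marginal_cost),
                         ("cost recovery", recovery_price)]:
        users = users_at_price(price, max_users, choke_price)
        gap = shortfall(price, production_cost, marginal_cost,
                        max_users, choke_price)
        print(f"{label}: price={price}, users={users}, shortfall={gap}")
```

Under these assumed numbers, pricing at marginal cost serves 9,900 users but leaves the entire production cost to be subsidized, while the lowest break-even price is $125 and serves 8,750 users. The 1,150 users excluded at the higher price are the inefficiency the text refers to; a different demand curve would change the magnitudes but not the direction of the trade-off.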
In contrast to government agencies and some not-for-profit entities, profit-motivated firms must, at a minimum, recover their costs. Of course, for-profit entities seek to generate additional income, sometimes using sophisticated pricing strategies, such as segmenting the market with differentiated products and varying the price according to volume, convenience of delivery, customer support services, documentation, scope of subject matter, geographic coverage, and so on.8 (See Box 2.1 for examples of marketing models for private-sector S&T data products and services.) Unless commercial providers are much more efficient than government providers, the profit motive leads to higher prices for access than a government agency would ideally charge under its mandate to provide wide access.
Whether a commercial database will be made available to public users (such as university laboratories) on better terms than to commercial firms depends on whether there is competition between those two sectors. If the database contains information whose value to commercial buyers is reduced by academic use, then the vendor will not sell it at preferential rates to academics.
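The reasoning about preferential academic rates can be sketched with a toy calculation. The model below is an assumption made for illustration only: it supposes a fixed number of commercial and academic buyers, a single willingness to pay per group, and an "erosion" percentage (a hypothetical parameter) by which academic access reduces what commercial buyers will pay.

```python
"""Toy model of a vendor deciding whether to offer an academic rate.

All names and figures are hypothetical assumptions for illustration.
"""


def vendor_revenue(commercial_users: int, commercial_value: int,
                   academic_users: int, academic_price: int,
                   erosion_pct: int) -> tuple:
    """Return (revenue selling to commercial buyers only,
    revenue when academics are also served at a lower price).

    erosion_pct is the assumed percentage by which academic access
    lowers the price commercial buyers are willing to pay.
    """
    commercial_only = commercial_users * commercial_value
    eroded_value = commercial_value * (100 - erosion_pct) // 100
    with_academic = (commercial_users * eroded_value
                     + academic_users * academic_price)
    return commercial_only, with_academic


def offers_academic_rate(commercial_users: int, commercial_value: int,
                         academic_users: int, academic_price: int,
                         erosion_pct: int) -> bool:
    """A revenue-maximizing vendor offers the rate only if it pays off."""
    commercial_only, with_academic = vendor_revenue(
        commercial_users, commercial_value,
        academic_users, academic_price, erosion_pct)
    return with_academic > commercial_only


if __name__ == "__main__":
    # Assumed market: 100 commercial buyers at $5,000, 200 academics at $500.
    for erosion in (0, 30):
        decision = offers_academic_rate(100, 5_000, 200, 500, erosion)
        print(f"erosion={erosion}%: offer academic rate? {decision}")
```

With no erosion, serving academics at the lower price adds revenue, so the vendor offers the rate. With a 30 percent erosion in commercial value, the added academic revenue no longer offsets the loss, and the preferential rate disappears, which is the outcome the paragraph describes.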
Stronger Statutory Protection and the Incentives for Investment
There is a natural link between cost recovery and protection of databases.9 If databases can be copied without any legal constraints or otherwise freely acquired by users or competitors, then the rights holders will not recover their costs and hence will have no incentive to produce databases.10 Stronger statutory protection might help prevent unauthorized copying, particularly in unprotected digital formats, and thus promote cost recovery and improve profit margins. In this way it could provide incentives for the creation of new databases in the private sector, especially by commercial entities. However, although this motivation sounds compelling, it should be tempered by the realization, elaborated in Chapter 3, that databases already benefit from many forms of protection that permit rights holders to recover their costs. And even when balanced, new legislation could have unintended, negative consequences that need to be avoided, as discussed in Chapter 4.
8. See National Research Council (1997), Bits of Power, note 7, p. 125.
9. The trade-off between access and cost recovery is expressed by Richard Gilbert in Chapter 4 of the committee's online workshop Proceedings as a trade-off between access by users and protection of rights holders or vendors, where protection refers to market exclusivity that comes from intellectual property rights. See National Research Council (1999), Proceedings of the Workshop on Promoting Access to Scientific and Technical Data for the Public Interest: An Assessment of Policy Options, National Academy Press, Washington, D.C., <http://www.nap.edu>.
10. Some databases, such as consumer-oriented catalogs, are a way to sell products. These databases are not the committee's concern, however, because firms already have every incentive to provide them and they do not require statutory protection. The committee's concern here is with databases for which the pricing for access is the only source of revenue, e.g., most S&T databases.
Moreover, the participants in the committee's January 1999 workshop did not report any instances in which the current lack of statutory protection had dissuaded them from investing in promising new projects. While most of the not-for-profit and commercial-sector participants noted that their organizations' data had been copied on occasion, such copying was written off as an unavoidable loss and part of doing business (similar to shoplifting in retail stores). Commercial-sector participants noted that in repeated instances of suspected infringement, the practice of issuing "cease and desist" letters normally was effective, but when it was not, they felt obliged to terminate the infringer's further access (i.e., to their password-protected online databases).11 In no case, however, had copying of their data prevented the companies represented at the workshop from earning a reasonable profit, over and above full cost recovery.12
Mounting Pressures on Government Producers and Distributors of Scientific and Technical Databases for Cost Recovery
Historically, the generation of primary S&T data, in such diverse fields as meteorology, astronomy, and high-energy physics and in the public census, has been funded by governments either directly through specific contracts or indirectly through the sponsorship of academic research. In the United States, the resulting databases of the federal government have been placed in the public domain and have provided the basis for subsequent research.
In many cases in which the government produces databases, it distributes the raw or partially processed data as a public good and lets not-for-profit organizations and commercial firms customize the data for special market segments or individual users. This achieves a better balance between requirements for cost recovery and the advantages to the public of efficient-access pricing. Under this approach, taxpayers subsidize the collection and preservation of the raw or partially processed data, but users pay the entire additional cost of customizing the databases, which are prepared and sold mostly by commercial firms. There are problems with both systems of finance, however. As mentioned above, subsidies avoid the market test of whether the willingness to pay for the data exceeds their cost. Commercial provision at higher prices must face this market test, but the higher prices inefficiently restrict access. In the evidence collected by the committee, there was no suggestion that federal agencies were collecting too much raw data, so it is reasonable to price the raw data for efficient access, relying on taxpayer subsidies to the extent possible. Of course, taxing for general revenue also involves some inefficiency.13
11. Several participants reported that pirated editions of their databases were being sold in developing countries. This kind of wholesale copying is illegal under domestic and international copyright and unfair competition laws. The ongoing failure of other nations to respect and enforce existing intellectual property law is largely a concern of new World Trade Organization rules known as the TRIPS agreement (see note 5 in Chapter 3). Increased worldwide protection of databases would require a new treaty, and its effectiveness could depend on its integration into the TRIPS agreement.
12. An extensive background report prepared in advance of the workshop also failed to uncover any negative consequences for companies. See Stephen M. Maurer, Appendix C, in the committee's online Proceedings, note 9.
Box 2.1 Marketing Models for Private-Sector S&T Data Products and Services
Traditional Marketing Methods
Many of the business plans for creating and distributing databases are similar to those used in the software industry. Referred to here as "traditional" methods, they are usually tailored to the perceived market's size and wealth:
New Marketing Methods
In addition to taking a traditional approach to marketing databases, vendors have also devised effective alternatives for selling their products. The following sampling of new methods is provided by way of illustration:
However, because of mounting budgetary pressures generally, and increasing costs of public information management specifically, government database providers worldwide have recently been coming under pressure to recover all costs, rather than only the cost of distribution. This pressure toward full cost recovery results both from the rising costs associated with the rapidly expanding rate of data collection and from foreign pressures. European governments, for example, are turning increasingly to a full cost recovery approach for their S&T database production and dissemination activities. The E.U. Database Directive puts considerable pressure on non-E.U. countries, including the United States, to do the same.14 Although existing law in the United States precludes adoption of such restrictions for government data, the enactment of a strong new database protection statute for private-sector databases could stimulate further interest in privatizing U.S. government database dissemination activities. By making databases more profitable, new protectionist legislation could shift responsibility for their creation from the public sector to the private sector. The social harm of such a shift would be an increase in the price for access, especially for highly specialized databases, such as some S&T databases, with a comparatively small market. A possible social benefit would be that the private sector would be subjected to a weak market test of whether the value of the databases exceeded their cost. As noted above, however, most data collected by public agencies are raw data whose full potential value has not yet been realized, and in many cases, user-oriented transformations of the data are already in the hands of the private and nonprofit organizations that might have a better sense of the user market.
Pressures for cost recovery also arise because the benefits that accrue to consumers and the broader society under the efficient-access pricing model are harder for legislators and administration policy makers to see (and measure) than those that accrue as reduced tax burdens under the cost-recovery model, or those that accrue as increased profits under the commercial model. Thus, science agencies in the United States are increasingly turning to the private sector, including both not-for-profit and commercial entities, to outsource government S&T database dissemination activities, or are even purchasing data from commercial suppliers. For example, in order to promote private-sector investment and development of space technologies and applications, the Commercial Space Act of 1998 encourages NASA, an agency engaged to a substantial degree in basic research activities, to purchase space and Earth science data products and services from the private sector, and to treat data as a commercial commodity under federal procurement regulations. The potential negative effects of this trend are discussed in some detail in Chapter 4.
14. See Stephen M. Maurer (1999), "Raw Knowledge: Protecting Technical Databases for Science and Industry," Appendix C in the online workshop Proceedings, note 9.