Participants discussed copyright issues, databases, repositories, and methods to retain the value of the scientific work.
Archives are a key component in the change from print to electronic publishing. The position of the ACS Publications Division is that it is prepared to stop publishing printed journals when two changes occur. One is when authors decide print no longer adds value, Robert Bovenschulte said. The other is when there is a reliable technology available to store the archives, so that the science will be preserved, even without print. At present, print is much more reliable than the available electronic technology. Martin Apple has observed that there is worldwide dissatisfaction with scholarly journals for many reasons, including the pervasive reluctance of libraries to cancel print until e-archiving arrangements are secure and durable. Another reason is the need for organizational structures to ensure access to digital archives.
Maintaining archives in the future is one of the main problems. There is a pressing question of how to maintain these functions over time, especially in the face of technology migration. “How do we take electronic technologies that exist today and, if we embody our content into those, what will happen 5 years from now, 10 years from now, 20 years from now?” Bovenschulte asked. To the extent that publishers like the ACS feel it is their professional obligation to enable such migration, they have to invest a great deal to make sure that they can handle these transitions, he added.
Some commercial publishers have already started to digitize their back issues. Patrick Jackson said that ScienceDirect today has about 2 million current articles and about 4 million articles that belong to the back-files. These are increasingly being supplemented with other types of content, including reference works, books, and handbooks. Material that was published 50 years ago, or even 100 years ago, is in many cases held to be just as valid now as it was then. Elsevier began to digitize all of its journals around the year 2000 and will be finished by 2005, at a cost of $40 million. All 1,800 Elsevier titles, including all of the discontinued, split, and name-change journals will be included. Elsevier sells about 29 different back-file packages. The key benefits are getting rid of the physical archive, access to internal links and CrossRef links, and free access to 6 million abstracts. In case disaster strikes there is a contingency. Comparing paper to electronic files in terms of reliability, Jackson reminded listeners of what can happen to libraries in times of war (e.g., Bosnian National Library) and natural disasters (e.g., earthquake, Kobe, Japan).
Not all costs can be cut, however. Gordon Hammes said that data archiving could possibly be done more efficiently, but the data still have to be reviewed. Not all data should be archived, or archives of incorrect or obsolete data would ensue.
Martin Blume said that the American Physical Society (APS) offers a CD collection at the end of the year that can be loaded on the intranet of the institution, so that everyone will have desktop access to it. There has been a call in the United Kingdom for every researcher to self-archive every article published in a peer-reviewed journal in his or her institutional archive, Stevan Harnad said.
Participants discussed some copyright issues that electronic archives might create. There was a policy forum for science some years ago arguing for authors to retain copyright the way novelists do, Stephen Berry said. “Every scientist gives the publishers the copyright, and one way that has worked very well is for the journal that holds the copyright to give the author a very, very open license,” Berry said. The APS uses this model. The only reason left to argue over copyright ownership is the original intent of giving the author or inventor the protection that comes from being the creator, according to Berry. Although the issues of intellectual property rights are quite central, they are solvable in our context, Berry maintained. He called for an analysis of the financial pathways to open access, but cautioned against expecting any one simple solution.
Alternative models to existing copyright laws already exist—Creative Commons being one of those that the participants discussed. Creative Commons is an important alternative to the conventional understanding of copyright, Anna Gold said. This initiative originated in the artistic, creative part of our culture and has moved into the area of scientific creativity. Creative Commons provides well-crafted legal agreements that allow authors to both control their creative property and enable its reuse without having to be contacted themselves, so they can maintain control while providing access.
Berry talked about the pressing problem of databases and the laws protecting the data, especially in the European Union. The European Union “database directive” created a specific new kind of protection for databases that is even more protective than copyright. As a result, a number of privately owned and distributed databases in Europe have been created, many of which are very expensive. There was one attempt in the United States to protect satellite data in this way. That idea failed because no one in space science could afford the data. There is an ongoing battle in Congress about whether to enact a law comparable to the one in Europe, Berry said. However, most scientific data are the kind of raw data that can be copyrighted. The Journal of Physical and Chemical Reference Data, for example, is copyrighted because it contains evaluated data. Berry doubted whether truly raw data still exist from scientific experiments. “When we do an experiment in my lab, we do not simply collect the electrical impulses that detectors find and print those out or put them directly onto some electronic record,” Berry said.
Berry also discussed how a data registry might be needed in the chemistry community: a registry that would not be a data repository, but merely information about whether and where data exist. He said there could be a great amount of interest in a global biologicals registry, but this might be difficult to set up because many developing countries are very protective of their native flora inventory data. They fear they will be exploited, so they are very cautious about allowing people to construct databases of such information. The chemistry community has large sectors in which work is done on potential pharmaceuticals from natural products, where having data repositories of substances recovered from organisms but not yet studied would be very valuable, according to Berry. Bristol-Myers Squibb would probably be interested in accessing such a registry, but probably would not contribute to it because a registry is a company’s intellectual capital and money, Lou Ann Di Nallo said. According to Ned Heindel, this concept is not at all new. He said the ACS had a section in its Journal of Medicinal Chemistry some years ago that listed negative results for “me-too” compounds. Di Nallo added that it is extremely helpful to the pharmaceutical industry to know which compounds have no activity.
THE FUTURE OF ARCHIVES
Gold talked about the future of archives and some archival tools. The journal is not the final stop on the scientific communication road; the archive is, Gold said. The archive is never final; it is a way to preserve scientific knowledge, to preserve the record so that it can be built on and used into the future.
One of the problems of archiving is reliability, Gold said. Librarians and scholars have had to deal with publishers who have left the scene and, subsequently, with how to recover data, records, and so forth. Reliable archives will benefit our children and our grandchildren, but in the digital realm, reliability into the future is not a foregone conclusion, she said.
The archive is important because it provides context for work—not merely a way of getting at a particular known piece of work. Libraries provide that context by bringing together the patchwork of various publishers and models, and then deal with the frustrations of trying to piece it all together. Libraries work toward a grand vision of a richer and more interoperable context, Gold said.
Some of the solutions in terms of reliability include finding ways to agree on and share responsibility, which cannot be done within a single organization, according to Gold. Open access with cost sharing in some way and Creative Commons as a means of managing access are very promising ways of handling intellectual property issues and dealing with management and governance issues to help us move into the future. Gold named some current approaches, such as JSTOR, DSpace, and E-Depot. JSTOR is a multi-institution approach to providing access to a historical journal archive. E-Depot is a national-library-plus-publisher initiative to ensure the longevity and reliability of a digital archive into the future. DSpace is MIT Libraries’ multi-institution federation. It is institutional repository software, but also preservation repository software, intended to be open-source and openly developed across many institutions. Possible content ranges across the spectrum from journals to many other kinds.
Some participants felt that depositing is an important part of archiving. The feeling was expressed that either the process of depositing should be part of a seamless workflow, which might be automated through harvesting, or it has to be stewarded. To leave this to individual faculty or their administrative assistants, some felt, is extremely unreliable.
Although there is a tendency in e-business and the Internet-enabled communications industry to charge very little or zero for content, this does not rule out charging for other value added, Gold said. Open access begins to allow scientists to work with their archive in much more creative ways. She cited some very interesting recent articles¹ about how the archive could actually begin to represent the dynamics of scholarship in more creative ways than it does now.
A new system will mean new archival choices and challenges, Gold said. The perspective of archivists is more and more becoming a core part of what libraries do throughout their organization. Archivists help preserve the context, the dynamics of knowledge, and libraries will begin to play a much greater role in this activity, she noted. The challenges for providing this kind of dynamic archive are immense as well: interoperability, selecting information into the archive, and managing an archive in environments where people have little time. She added that developing and agreeing on standards that could support such an interoperable world and the migration of functionality over time are further challenges.
According to Gold, chemistry has much at stake. Chemists are heavy users of journals, their journals are generally agreed to be the most expensive, and costly and complex data are embedded in their literature, she said. Opportunities may be lost: interoperability with related disciplines, interoperability with emerging centers of international research that are going open access, and interoperability with academic repositories. The key to the future of creating this new and lasting value in the chemistry publishing web is open access to content. “We can only imagine what is possible. Actually, we have more than imagined; we have seen what is possible with organizations like HighWire. But it is certain that it will dwarf what any one company might achieve,” Gold said.