5 What Is Publishing in the Future?

The digital revolution is changing the traditional processes of many knowledge-intensive activities, in particular the publishing or scholarly communication processes, and it is offering alternatives both to how the various stages of these processes are conducted and who does them. The various functions--whether metadata creation, credentialing review, or long-term stewardship--can be separated or disaggregated, and players different from those who traditionally have carried out these tasks can, in theory, perform them. Publications can now exist in many intermediate forms, and we are moving toward more of a continuous-flow model, rather than a discrete-batch model.

The raw ingredients--the data, the computational models, the outputs of instruments, the records of deliberation--can be online and accessible by others, and can be used to validate or reproduce results at a deeper level than traditionally has been possible. Third parties--particularly in an open-access, open-archives context--can then add value by harvesting, mining, enriching, and linking selected content from such collections.

The presentations in this session of the symposium identified some of the social processes, specific pilot projects, and the challenges and opportunities that may provide the basis for future "publishing processes," which ultimately may be more holistically integrated into the "knowledge creation process."
IMPLICATIONS OF EMERGING RECOMMENDER AND REPUTATION SYSTEMS [1]

When some people think about changing the current publication process to a more open system, they express concerns that scholarly communication will descend into chaos and that no one will know what documents are worth reading because we will not have the current peer-review process. This idea should be turned on its head. Instead of going without evaluation, there is a potential to have much more evaluation than we currently have in the peer-review process.

Public Feedback

There can be a great deal of public feedback, both before and after whatever is marked as the official publication time, and we can have lots of behavior indicators, not just citation counts. Various Web sites now evaluate different types of products or services and post reviews by individual customers. For example, the reviews at Amazon.com provide both text reviews and numeric ratings.

Even closer to the scientific publishing world is a site called Merlot, which collects teaching resources. Merlot provides for a peer-review process before a publication is included in the collection, but even after material is included, members can add comments. Many other examples of this type of public feedback, both within and outside the STM communities, already exist.

Behavioral Indicators

With behavioral indicators, the researcher does not ask people what they think about something; he or she watches what they do. For example, Amazon.com, in addition to its customer reviews, provides a sales rank for each book. Google uses behavioral metrics of links in its PageRank algorithm. The Social Science Research Network uses the download count as a behavioral metric.

[1] This section is based on the presentation by Paul Resnick, associate professor at the University of Michigan.
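As a rough illustration of how a link-based behavioral indicator can work, the sketch below applies the simplified textbook form of PageRank (power iteration with a damping factor) to a tiny citation graph. It is not Google's production system; the function name `pagerank`, the paper identifiers, and the parameter values are assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each node to the list of nodes it links to (or cites).
    Returns a dict of node -> score; the scores sum to 1.
    """
    # Collect every node, including ones that only receive links.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's rank evenly among the nodes it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: papers A and B both cite C, and C cites D.
citations = {
    "paper_a": ["paper_c"],
    "paper_b": ["paper_c"],
    "paper_c": ["paper_d"],
    "paper_d": [],
}
scores = pagerank(citations)
for paper, score in sorted(scores.items()):
    print(paper, round(score, 3))
```

The point of the sketch is that a score of this kind is derived from what authors do (whom they link to or cite) rather than from what they say in a review, which is also why it can be targeted by anyone who learns how the score is computed.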
ELECTRONIC PUBLISHING

Potential Problems

The existing examples of public feedback approaches and behavioral indicators provide some potential models for developing credentialing processes for scientific publication in the future. There are also some problems that require further examination, however.

An obvious potential problem is "gaming the system." For example, an owner of a Web site can hire consultants to help the site get higher Google rankings. No matter what the system, some people will try to figure out what the scoring metric is and attempt to influence it to boost their success in the rankings.

Another problem is eliciting early evaluations. In those systems where there is widespread sharing of evaluations, there is an advantage to being second, to let somebody else figure out whether an article is a good one to read or not.

Yet another problem can be herding, where the later evaluators are overly influenced by what the previous evaluators thought.

Experiments to Try in STM Publishing

Some potential experiments are more radical than others. Journal Web sites could publish reviewer comments. The reviewers might be more thorough if they knew their comments were going to be published, even without their names attached. The reviews for rejected articles could be published as well. Such a system could reduce really poor submissions.

Other experiments might try to gather metrics. Projects such as CiteSeer in the computer science area measure citations in real time. Experiments in evaluating the evaluators are needed as well. More attention, greater credit, and rewards need to be given to reviewers for evaluating early, often, and well. Publishers already are complaining about the difficulty of finding good reviewers.
Finally, the thread of the original version of a paper, along with reviewers' comments and authors' revisions or responses to the comments, as well as the journal editor's additional comments and interpretations, could be used as educational material and enrichment. All this could be done either anonymously or by attribution.
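The feedback signals discussed in this section (reader ratings, download counts, citation counts) could be reported side by side rather than collapsed into a single score. The sketch below is one possible way to do that: it converts each raw signal into a percentile rank so that signals on different scales can be compared. The function names, paper identifiers, and numbers are all invented for illustration and do not come from any of the systems mentioned above.

```python
def percentile_ranks(values):
    """Map each item to the fraction of items whose value is <= its own."""
    n = len(values)
    return {
        item: sum(1 for other in values.values() if other <= value) / n
        for item, value in values.items()
    }


def indicator_profile(indicators):
    """Build item -> {indicator name -> percentile rank}.

    indicators maps an indicator name to a dict of item -> raw value.
    """
    profile = {}
    for name, values in indicators.items():
        for item, pct in percentile_ranks(values).items():
            profile.setdefault(item, {})[name] = pct
    return profile


# Hypothetical raw signals for three papers.
indicators = {
    "citations": {"paper_a": 12, "paper_b": 3, "paper_c": 40},
    "downloads": {"paper_a": 900, "paper_b": 1500, "paper_c": 200},
    "reader_rating": {"paper_a": 4.5, "paper_b": 3.0, "paper_c": 4.0},
}
profile = indicator_profile(indicators)
for paper in sorted(profile):
    print(paper, profile[paper])
```

Keeping the signals separate makes disagreements visible: in this invented data, the most-cited paper is also the least downloaded, which a single blended number would hide.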
PREPRINT SERVERS AND EXTENSIONS TO OTHER FIELDS [2]

The article preprint is a well-known, well-understood concept in the physics community, but is not as well known in other communities. Preprints have a well-understood "buyer beware" connotation in the physics community. They provide a means to get informal, non-peer-review feedback that is weighted very differently by physicists than a formal refereed report. Preprints get an early version of an article out to colleagues to solicit feedback.

The e-Print arXiv in Physics and Other Similar Projects

The e-Print arXiv [3] was created by Paul Ginsparg (then at the Los Alamos National Laboratory) for the high-energy theoretical physics community. Today the archive is hosted at Cornell (where Ginsparg works) and covers 28 or so fields and subfields in physics, with more than 244,000 papers. It has succeeded in large part because Dr. Ginsparg is a physicist. He understood well how that community works and what its needs are. The arXiv clearly has increased communication in its field. It is the dominant method for authors to first register publicly their ideas. It addresses the common interests of a community in a sociologically compatible way.

This approach has spread to other fields, including biology, materials science, social sciences, and computation. Spinoffs have been organized by both universities and government agencies. CogPrints [4], at the University of Southampton in the United Kingdom, is a well-known preprint system in cognitive science. NCSTRL, the Networked Computer Science Technical Reference Library [5], was an early effort for harvesting computer science papers and building a federated collection of that literature.
The National Aeronautics and Space Administration National Technical Reports Server [6] was a pioneer in bringing together and making available a collection of federal reports, both metadata and the full text. PubMed Central is certainly well known in the life sciences community.

[2] This section is based on the presentation by Richard Luce, research library director at Los Alamos National Laboratory.
[3] For additional information on the e-Print arXiv, see http://www.arXiv.org/.
[4] For additional information on the CogPrints cognitive sciences e-print archive, see http://cogprints.ecs.soton.ac.uk/.
[5] For additional information on the NCSTRL archive, see http://www.ncstrl.org/.
[6] For additional information on NASA's National Technical Reports Server, see http://ntrs.nasa.gov/.

Inspiration for the Future

What do these developments mean beyond the physics community, and what new efforts can we foresee? A major question to be addressed is how peer review will work in the preprint and open-access contexts. How should we evaluate, recognize, and reward this scientific work? A variety of methods are worth considering in a composite approach.

Today citations are used as the sole indicator of influence, yet the citation is only one such indicator. A better approach might be to supplement the current system with a multidimensional model to balance bias. An ideal system might have the following elements: citations and co-citations; the semantics, or the content and the meaning of the content in articles to see how they are related; and user behavior, regarding how readers use scientific information online. The latter metrics raise privacy concerns, however.

INSTITUTIONAL REPOSITORIES [7]

The mission statements of many universities proclaim that they are committed not only to generating knowledge but also to disseminating and preserving it. Massachusetts Institute of Technology (MIT) decided recently to initiate two projects that would better implement this mission. OpenCourseWare [8] aims to provide free access online to the primary materials for the university's courses. The DSpace initiative [9], which is the sister project of OpenCourseWare, is a prepublication archive for MIT's research, supported by an institutional commitment from the MIT libraries.

DSpace is meant to be a federation of the intellectual output of some of the world's leading researchers. MIT will not build the whole system for DSpace.
Instead, it is creating elements that communicate through open standards, so that many users can enter at different places in the value chain and add value in different ways.

[7] This section is based on the presentation by Hal Abelson, professor of electrical engineering and computer science at MIT.
[8] For additional information on MIT's OpenCourseWare project, see http://ocw.mit.edu/index.html.
[9] For additional information on the DSpace Federation, see http://www.dspace.org/.

Both OpenCourseWare and DSpace are ways that MIT and other universities are asking what their institutional role should be in disseminating and preserving their research output. Why ask this question now?

The answer is that the increasing tendency to proprietize knowledge, to view the output of research as intellectual property, is hostile to traditional academic values and to the public missions of universities. The current research information environment includes high and increasing costs; imposition of arbitrary and inconsistent rules that restrict access and use; impediments to new tools for scholarly research; and risk of monopoly ownership and control of the scientific literature.

The basic "deal," as seen by many in universities, is that the authors, the scientists, give their property away, through copyright transfer, to STM journals. The journals then own this property and all rights to it for the duration of copyright. The publishers take this property and magnanimously grant back to the authors some limited rights. The universities, which might have a stake in ownership transfer, generally retain no rights at all, and the public is not even a factor in this discussion.

It is instructive to list some of the elements that are valuable for promoting the progress of science. They include quality publications and a publication process with integrity; open, extensible indexes of publications; automatic extraction of relevant selections from publications; automatic compilation of publication fragments; static and dynamic links among publications, publication fragments, and primary data; data mining across multiple publications; automatic linking of publications to visualization tools; integration into the semantic web; and hundreds of things no one has thought of yet.
OpenCourseWare and DSpace are only two examples of the changing role of universities in using the new technological capabilities to reinforce their public missions and promote the progress of science.

ISSUES RAISED IN THE DISCUSSION

Assessing Journals and Authors

In developing new feedback systems, it is useful to factor in the enormous value that is derived from the literature outside academia. There are two purposes for doing this. One is that it is a better way of assessing journals and authors. The other is that it also will begin to develop a means for the authors to better recognize that their larger audience is outside of their immediate peers.

Indicators versus Metrics

It is better to use the word "indicators," rather than "metrics," because there is no one number that can be used as a measure of quality. There are problems with all measures, including the opportunities for gaming the system. One has to look into all of these and use them as indicators, and it takes a number of indicators to develop a fair measure of quality.

Open versus Confidential Peer Review

Although public commentary is good, there is nevertheless a sort of "Gresham's law of refereeing," whereby bad referees tend to drive out the good ones. The knowledge that a paper is going to be peer reviewed does have an effect on authors. One way to evaluate the reviewers is to have an editor who chooses them or moderates. One could also develop a system to calibrate reviewers against each other. Unfortunately, this leads to overburdening of the good reviewers, so they are in effect punished for the good work that they do.

Some participants believe the peer-review system could be adapted to a more public version. In a public system, one does not necessarily have to give all the lousy reviewers an equal voice.

Constraints of the Print Model on the Electronic Model

Many people are still working in the paradigm of the old paper model of publishing, where there is a lot of prepublication work, because a big print run is needed in order to economize on the costs of distribution. The questions about archival policy, editorial policy, and open access all change completely if one moves to a model of continuous improvement of the materials, or continuous publication, where all peers have an opportunity to adjust the prominence of newly developed pieces. In that model, the world changes completely.

A prerequisite for this kind of innovation is that the materials that are being continually re-evaluated are an open public resource.
It is hard to see how this kind of approach would work, as a practical matter, if you still have a system in which every publisher's body of work is in a separate, restricted repository.

Relationship between Open Archives and Traditional Publishers

One can see that repositories such as DSpace have a very valuable role in shaking up the system and in helping to establish or return to better priorities in scholarly publication. There is no inherent hostility between institutional archives and traditional publishers. One can imagine a university holding both the preprint and the final edited version of an article, and the journals providing some kind of authentication and review service.

Preprint servers are repositories at the "lower" layer and provide a platform or an infrastructure on which a whole host of yet to be fully imagined value-added features could be built, some of them by for-profit entities. The goal is to create a more open environment for the primary, or upstream, parts of the value chain and then to encourage scholarly activity on top of that. In any event, of deep concern to the government science agencies and to public institutional repositories is being able to have access to material created with public monies, and to make such information publicly available.