13.
Exploring the Impacts of Enhanced Access to Publicly Funded Research1

John Houghton

Victoria University, Australia


In late 2006 Colin Steele, Peter Sheehan, and I produced a report for the Australian Department of Education, Science and Training (DEST) on the costs of research communication, emerging opportunities, and benefits.2 The aims of the project were to explore and, where possible, to quantify the costs associated with research communication, focusing mainly on scientific publishing but also, to a lesser extent, on scientific data. The study also explored the potential benefits of enhanced access to research findings and tried to compare the costs and benefits of alternative access systems. The project was funded by the Australian government as a part of the government’s consideration of open access policy and legislation, but it was also aimed at the funding agencies, research councils, and universities as a way of both providing input to their deliberations about access policies and offering a guide to the budgetary implications of alternative access models.


To explore the costs of interest, we developed an activity cost model that was based on an extensive review of the literature on the activities of research and the production of research findings. We adopted a systems perspective that was based on previous research by Donald King and colleagues in the United States.3 The cost model included all the activities related to research, publishing and dissemination in higher education, and it covered databases as well as journals and books. Due to a lack of comparable data, all other private and public sector research activities were excluded.


Most of the costs associated with research communication are related to time—the time involved in reading, research, writing, peer review, and other tasks. To convert time to dollars, we used a model for full cost recovery that included salaries and overhead costs common in universities. In fact, it was based on the model for full cost recovery for non-laboratory-based contract research that is imposed on universities in Australia by the national competition policy.


One crucial aspect is the level at which costing is done. Research communication is multidimensional, so it is useful to take a matrix approach to costing that identifies the activities, actors, objects, functions, and, to a lesser extent, the applications with the aim of being able to break down the value chain and reassemble it along any of these dimensions. For example, activity costs can be reassembled as the cost of objects, such as the cost of producing a journal article; the cost for actors, such as the costs experienced

1 Based on a presentation available at http://www.oecd.org/dataoecd/11/20/40067323.pdf

2 Houghton, J. W., C. Steele, and P. J. Sheehan. 2006. Research Communication Costs in Australia: Emerging Opportunities and Benefits. Canberra: Department of Education, Science and Training. Found at http://dspace.anu.edu.au/handle/1885/44485

3 See, for example, Tenopir, C. and D.W. King. 2000. Towards Electronic Journals: Realities for Scientists, Librarians and Publishers. Washington, DC: Special Libraries Association.







SOCIOECONOMIC EFFECTS OF PSI ON DIGITAL NETWORKS

by universities or publishers in producing the article; and the cost of various functions, such as peer review of the article, quality control, and certification.

Costing the various activities involved in research communication, we found that reading is a major activity and that in Australian universities during 2005 the time spent on reading alone may have cost approximately A$5.8 billion. Reading by those researchers who were actively publishing in 2005 (i.e., reading in order to write) cost around A$3 billion. We estimated that writing scholarly, independent, peer-reviewed publications cost approximately A$640 million during 2005, and peer review and the editorial activities of academics cost approximately A$170 million. In sum, the estimated system-wide costs of the activities associated with core scholarly communication activities in Australian higher education came to approximately A$4 billion, or 30 percent of the total expenditure for higher education.

Having adopted a matrix approach,4 we could then examine the activity costs in various ways. For example, summing the costs for objects suggested that, in 2005, producing a journal article in an Australian university cost, on average, A$21,000, excluding the research and reading time involved. By summing the costs for all actors, we calculated that writing all those journal articles counted in the Higher Education Research Data Collection cost the Australian National University approximately A$50 million in 2005.

In the second part of the study, we explored the potential benefits of enhanced access to research findings. Again, the analysis was based on an extensive review of the literature. It has been suggested that the most immediate benefits of enhanced access are probably felt within research itself, and these potential benefits might include increased speed of access resulting in a speeding up of the research and discovery process, a decrease in the amount of redundant research and a reduction in the investigation of blind alleys, and an increase in the efficiency of research and development. Wider access would also enable greater participation from poorer institutions and developing countries, provide more opportunities for interdisciplinary research and inter-sectoral collaboration, and allow researchers to study their fields more broadly, which could potentially lead to increased opportunities for commercialization.

4 The following list of the various costs included in this actor/object matrix analysis may be helpful:

• Reading: academic staff ≈$5.8 billion, published staff ≈$3 billion pa.
• Writing (HERDC publications only) ≈$636 million pa.
• Peer review (scaled to HERDC) ≈$132 million pa.
• Editorial activities (scaled to published staff) ≈$36 million pa.
• Editorial board activities (scaled to published staff) ≈$3.8 million pa.
• Preparing grant applications (ARC & NHMRC) ≈$110 million pa.
• Reviewing grant applications (ARC & NHMRC) ≈$26 million pa.
• Publisher costs (scaled to HERDC) ≈$164 million pa.
• Library acquisition costs (CAUL) ≈$199 million pa.
• Library non-acquisition costs (CAUL) ≈$321 million pa.
• Cost per download (sample of CAUL subscriptions) $3.51 (mean).
• ICT infrastructure (estimated total expenditure) ≈$1 billion pa.
• Sum of core activities ≈$4 billion (≈30% of HE expenditure).
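The footnote's activity costs can be re-added as a rough cross-check. The sketch below is illustrative only: the text does not spell out which items make up the "core activities", so the grouping used here (reading by published staff, writing, peer review, and the two editorial items) is an assumption, as are the names `CORE_ACTIVITY_COSTS`, `total_cost`, and `subtotal`.

```python
# A$ million per year, 2005 estimates as listed in footnote 4.
# Which items count as "core activities" is an assumption made for this sketch.
CORE_ACTIVITY_COSTS = {
    "reading (published staff)": 3000.0,
    "writing (HERDC publications)": 636.0,
    "peer review": 132.0,
    "editorial activities": 36.0,
    "editorial board activities": 3.8,
}


def total_cost(costs):
    """Sum a mapping of activity name -> annual cost in A$ million."""
    return sum(costs.values())


def subtotal(costs, names):
    """Sum only the named activities."""
    return sum(costs[name] for name in names)


core_total = total_cost(CORE_ACTIVITY_COSTS)
review_and_editorial = subtotal(
    CORE_ACTIVITY_COSTS,
    ["peer review", "editorial activities", "editorial board activities"],
)

print(f"Peer review and editorial: A${review_and_editorial:.1f} million")
print(f"Core activities: A${core_total / 1000:.2f} billion")
```

Under this grouping the peer review and editorial items add to about A$172 million and the core total to about A$3.8 billion, in line with the rounded figures of A$170 million and A$4 billion quoted in the text.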

ASSESSING ECONOMIC/SOCIAL BENEFITS

Providing doctors, nurses, teachers, students, and small firms with a wider access to research findings may also lead to improvements in the quality of service and productivity in those areas of the economy, and it is also possible that the emergence of new industries could be encouraged by the availability of open-access content (e.g., weather derivatives based on access to meteorological data).

For the wider community, potential benefits include such things as encouraging the development of informed citizens and informed consumers, who will be better able to make good decisions about the use of services like health and education and who will make better consumption choices. Ultimately, having informed citizens and informed consumers should increase the overall economic welfare of the country.

With such a wide range of potential benefits, the task of quantifying the impact of enhanced access is enormous, and there is no single definitive approach to the problem. Ideally, all the possible costs and benefits of all the possible alternative models for access would be accounted for, but that was beyond the scope of the project and may well be practically impossible.

In order to gain a preliminary sense of the overall effect of enhanced access, we chose to take a two-pronged approach. First, we explored some impact scenarios and case studies and then hypothesized about the possible effects of these scenarios on returns on R&D. Second, we modified a simple Solow-Swan5 model and used it for our calculations.

There is a vast literature in economics that focuses on estimating the rate of return on R&D, and the returns quoted in that literature vary from time to time, place to place, and between fields of research.
Nevertheless, we found certain patterns from that literature spanning a period of 20 to 25 years: Generally speaking, it is a characteristic finding that the rate of return is high, and it is typically in the range of 30 to 60 percent a year, sometimes higher.

The standard, neoclassical approach to estimating the rate of return on R&D makes some key simplifying assumptions:

• It assumes that all R&D generates knowledge that is useful in economic or social terms (i.e., the efficiency of R&D);
• It assumes that all knowledge is equally accessible to all entities that could make productive use of it (i.e., the accessibility of knowledge); and
• It assumes that all types of knowledge are equally substitutable across uses (i.e., the substitutability of knowledge).

The substitutability assumption is clearly not realistic, however, as a great deal of research is application specific, and much work has been done to address that fact. Much less work has been done to address the other two, equally unrealistic assumptions, and this is where we focused our efforts. Basically, we introduced accessibility and efficiency into the standard model as negative or friction variables, and we then looked at the effect on returns to R&D of reducing the friction by increasing accessibility and efficiency.

5 Solow, Robert. A Contribution to the Theory of Economic Growth. Quarterly Journal of Economics. February 1956; Swan, Trevor. 2002. Economic Growth. The Economic Record. vol. 78. issue 243. pages 375-80.

There are a number of assumptions and caveats here. First, we assumed that the increase in accessibility and efficiency would be the same. Second, we assumed that a move to open access would have no net effect on the rates of accumulation or obsolescence of the stock of knowledge (the key word here being net). Third, we assumed that the information to which access was provided would indeed be discoverable.

We explored rates of return on R&D in the range of 25 to 75 percent, and we looked at increases in accessibility and efficiency in the range of 1 to 10 percent. For each category of R&D expenditure, we produced a table. Based on the review of the literature, we assumed a very conservative 25 percent social return on public sector R&D, and we suggested that a 5 percent increase in accessibility and efficiency would be plausible.

Rates of return vary considerably, and the further one gets from the aggregate, the larger the range of uncertainty becomes. Nevertheless, to give an example:

• With government R&D funding in Australia at about A$6.5 billion in 2005 and a 25 percent return on R&D, a 5 percent increase in accessibility and efficiency would be worth around A$166 million a year;
• With higher education R&D at around A$4.3 billion, a 5 percent increase in accessibility and efficiency would be worth around A$110 million a year; and
• With the Research Council's competitive grants funding to higher education at around A$830 million, a 5 percent increase would be worth around A$20 million a year.

These are recurring annual gains from one year's R&D expenditure. Thus, if the change to enhanced access is a permanent one, they may be converted to growth rate effects.
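The three example figures can be reproduced with a few lines of arithmetic. The text does not give the functional form of the modified model, so the sketch below assumes that accessibility and efficiency each scale the return multiplicatively, giving an uplift of (1 + x)² − 1 for an equal proportional increase x in both. That assumption matches the quoted figures to rounding, but it is an inference from the numbers, and the function name `annual_gain` is hypothetical.

```python
def annual_gain(rd_spend, rate_of_return, increase):
    """Extra annual return (same units as rd_spend) from raising both
    accessibility and efficiency by the same proportion `increase`.

    Assumption for this sketch: the two factors multiply, so the combined
    uplift on the baseline return is (1 + increase)**2 - 1.
    """
    uplift = (1.0 + increase) ** 2 - 1.0
    return rd_spend * rate_of_return * uplift


# R&D expenditure in A$ million (2005), as quoted in the text.
EXAMPLES = {
    "Government R&D funding": 6500.0,
    "Higher education R&D": 4300.0,
    "Research Council competitive grants": 830.0,
}

for label, spend in EXAMPLES.items():
    gain = annual_gain(spend, rate_of_return=0.25, increase=0.05)
    print(f"{label}: about A${gain:.1f} million a year")
```

With a 25 percent return and a 5 percent increase, this gives roughly A$166.6 million, A$110.2 million, and A$21.3 million respectively, which correspond to the rounded values of A$166 million, A$110 million, and A$20 million in the list above.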
It is possible to express some of these costs and impacts as benefit-cost ratios. For example, focusing on a limited range of costs, it is possible to compare the estimated incremental cost of open-access institutional repositories in higher education with the potential incremental benefits from enhanced access to higher education research, assuming that everything else remains the same. Again, there are a number of assumptions about the rates of increase of R&D expenditure, discount rates, risk premiums, and so on. Nevertheless, we estimated that over 20 years a national system of institutional repositories costing A$10 million a year would cost around A$130 million in net present value, whereas enhanced access to higher education research would be worth around A$4.8 billion in increased returns to R&D (in net present value); the resulting benefit-cost ratio would be 37. Similarly, enhanced access to the higher education research funded by the Research Council’s competitive grants, with benefits of around A$925 million, resulted in a calculated benefit-cost ratio of just over 7.

So what was learned from the study? Clearly, this is just one way to estimate the potential overall impacts of enhanced access to publicly funded research findings, and it has limitations and weaknesses. Perhaps its strength is its simplicity, bypassing the complexity of calculating the impact of each possible change. Ideally, it would be supplemented by detailed studies of how the impacts work in particular areas, that is, by more work on actual scenarios and developing those scenarios into detailed studies that would support the macroeconomic estimates.

The main critique of this sort of traditional approach from the point of view of new growth and evolutionary economics is that it does not take account of all the ways in which research makes contributions. Consequently, the impact estimates from this study may be viewed as being on the conservative side, which was the intention.

It is debatable whether this approach could be adapted to calculations outside R&D, as it depends on having estimates of returns to R&D spending. However, differences between scientific data and public sector information, such as meteorological or geological observation, are often quite small. They are simply another form of scientific observation. Consequently, it may be possible to apply the rates of return applicable to observational sciences to various forms of public sector information and to produce preliminary estimates of returns to public sector information.
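The benefit-cost ratios quoted earlier in the chapter follow directly from the net present value figures. The sketch below simply divides benefits by costs; it assumes that the same A$130 million repository cost is the denominator in both cases, which the text implies but does not state outright, and it makes no attempt to reproduce the discounting behind the net present values. The function name `benefit_cost_ratio` is hypothetical.

```python
def benefit_cost_ratio(benefit_npv, cost_npv):
    """Ratio of discounted benefits to discounted costs (same currency units)."""
    if cost_npv <= 0:
        raise ValueError("cost_npv must be positive")
    return benefit_npv / cost_npv


# Net present values over 20 years, in A$ million, as quoted in the text.
REPOSITORY_COST = 130.0
HIGHER_EDUCATION_BENEFIT = 4800.0
COMPETITIVE_GRANTS_BENEFIT = 925.0  # assumed to share the same repository cost

higher_education_ratio = benefit_cost_ratio(HIGHER_EDUCATION_BENEFIT, REPOSITORY_COST)
competitive_grants_ratio = benefit_cost_ratio(COMPETITIVE_GRANTS_BENEFIT, REPOSITORY_COST)

print(f"Higher education research: {higher_education_ratio:.1f}")
print(f"Competitive grants research: {competitive_grants_ratio:.1f}")
```

The ratios come out at about 36.9 and 7.1, matching the reported values of 37 and "just over 7".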

DISCUSSION BY WORKSHOP PARTICIPANTS

PARTICIPANT: I would like to follow up. I do not think we should leave the impression that the U.S. federal government is monolithic in its policy and that everything is free and rosy. What I mean is that OMB Circular A-130 actually puts an upper bound for pricing at the incremental1 cost of the information management system. So it actually can exceed the marginal2 cost and often does. There is a lot of data and information sold at much more than the marginal cost of fulfilling the user request or the dissemination. So not everything is free online, and there is quite a bit of information that is sold. Even at NOAA, the National Climatic Data Center, for example, which has all the retrospective archived climate data, charges fairly high fees for accessing those data. In fact, it funds about 30 percent of its annual operations through those sales, although I think the money actually goes to the treasury. Until recently, Landsat images were $500 each, which was quite expensive. Even though it was substantially less than the French SPOT image, it was still a lot for an individual scene. So there is a large variance in the pricing of PSI in the federal government, even if the information is in the public domain.

I also should point out there is one exception to the public domain exclusion for federal government information from the Copyright Act, and that is in the National Institute of Standards and Technology. The Standard Reference Data Center has a legislative exemption from the exclusion and can copyright its standard reference data publications, and that may be the only exception to that in the federal government. I suppose that if someone were to not honor the copyright, they would not be sued by NIST, but nonetheless it does have a copyright in those publications. So I just wanted to clarify that it is a very big system, and all the agencies really operate individually. There is an overall policy, and I think marginal cost pricing is the preferred option, but it can go up to incremental costs.

PARTICIPANT: I have one question, or actually a thought, after hearing these presentations. I was quite attracted to this methodology that I, in my mind, called deprivation method, in which you go to people and ask them, If we take oxygen away from you, how much would you pay to get it back? We hear this being applied to public sector information. I was just wondering, before I heard the last presentation by John Houghton, that since we are counting on information products or services that we already have, if we were to take those away, how much would we pay for getting them back? However, for the externalities and the innovation component, can you really go to end users and ask them how much they would pay for products which they do not have yet?

1 In incremental cost pricing, “The price to secondary users is set so that revenues cover the cost to provide this incremental use, including recompiling the data, perhaps maintaining a computer site for downloading, purchasing CD-ROM blanks, recording the data, shipping to the user, and customer support, but not including the costs for the core service.” National Research Council. 1997. Bits of Power: Issues in Global Access to Scientific Data, National Academy Press, pp. 125-126.

2 In marginal cost pricing, “The cost to secondary users is set at the marginal cost of a specific unit sent to the user, including the cost of the CD-ROM blanks and postage and shipping. This price is lower than the incremental cost price, as long as the cost of output per unit declines when volume increases.” Id. at p. 126.

PARTICIPANT: One thing I noticed in the literature is the assumption that one of the things that is asked is: “How much?” “What is your time worth?” Most people will say that they are willing to spend a great deal more for information than they would, in fact, spend. Once you ask them to actually spend it, then their backs go up. They will say one thing, but they do not follow through if the price is, in fact, what they would say it should be. So you cannot really rely on their willingness to pay. The other alternative, then, is to say, “How much time are you saving?” which is considered to be more effective than asking what they are willing to pay.

PARTICIPANT: I would just add that if you are a public agency like NOAA that typically does not charge, the big problem is that it does not really know who the users are and what the user needs are. If you do not know who the users are and how they use the data, you do not know much about the marginal benefits. So I think one of the advantages of having free and open access is that there is no way that an economist can tell an agency what the market is going to be for something. Just making the data as widely available as you can, I think, is the way to maximize the benefits rather than sitting there trying to say, “What would someone be willing to pay for this product?” when the agency really does not have a good sense of how people use it. I totally agree that when you go out and ask people what they would be willing to pay, you will get numbers that are probably very inflated. A final comment from a professional point of view: I do not think the data withdrawal approach is a good one because I just think that there are ex post facto questions, and there are huge substitution possibilities among sources of data. I think it is much better to try to estimate what the marginal benefit is of additional data or new data, if you can get some handle on that.

PARTICIPANT: Yes, on the NOAA data, you were focusing primarily on the economic kinds of uses, but I understand that a lot of the NOAA data are used in classrooms, that there is significant educational reuse of data that translates into not just the actual use but also the training of new scientists and that kind of thing. The other one being basic research: the use of the data for climate research and analysis, which obviously also has major benefits.

PARTICIPANT: Yes, I should have made it clear that I was mainly talking about operational data, but you are absolutely right about the research value of the data. I was fascinated by John Houghton's presentation because of what we have been grappling with at NOAA. You think you have a big problem when you are trying to determine what the value of a better weather forecast is. The big problem we have is the value of the R&D, the research that NOAA is doing, and the last presentation gave me a few ideas about how to pursue this. There are a lot of uses for the data besides just the operational.

PARTICIPANT: Well, I have heard many interesting comments today. What has been lacking in general are some more views from industry about how they see the whole thing. I think it is a bit absent today. This is important when listening to the presentation of Robbin te Velde, who gave the impression that Dutch public sector bodies were left on their own to do whatever they thought was good for them. They were not told by the government what should be done. They could charge or not, depending on their own wishes. We are told that there will be some resistance to change, so it seems that this is only a supply issue. I would be really interested to hear much more about companies and what they are doing to fight when systems do not work well, because one of the things that is lacking, at least in Europe, is solid complaints. The European Court of Justice has not received a single case related to public sector information. I think that it is important to look at the judicial perspective.

I also was very much interested by the comment of our Google colleague, who reported today about their work with the U.S. government to see how they can work together in order to get good information for reuse. That brings me to the NOAA. Sometimes it seems to me that you are working in a very competitive model with private companies, but not competing directly with them. They are your users. You work together. You build a better world together. I would be interested to know whether the private companies in the EU would tell me that this is the way they look to the work with European Met offices and whether this is the view that European Met offices have about work with the users. The question is, what works better, i.e., how can we work together, private and public, not working against public sector policy, but working together with public sector policy?

PARTICIPANT: Recalling what Dr. Fornefeld presented this morning about the value chain of the PSI data and about how data has to be combined together so that services can be provided, I think we come back to this once again because the economic models that Ms. Nilsen presented say that the product has to be sold for the marginal costs. That is right, but it does not take into account the investments that are needed. When you sell a product at the marginal cost, you do not take into account the investment required to produce these goods. In the case of information, the marginal cost is nearly zero because it is very easy to distribute information over the Internet. So we can state that the bulk of the cost of information, from our discussion today, is investment cost. That is why you cannot sell information, because if you want to price information at its marginal cost, it makes no economic sense to sell information. But, as you remember from this morning, to provide services based on this information does have marginal cost. There is also a business opportunity to sell your services along with the information. It makes no sense for anybody to sell information priced at marginal cost because it has no marginal cost, but it does make sense to sell services.

In the public sector debate, we speak about public sector information holders who sell information to reusers under certain licensing terms, and this is considered all right because we see it as acceptable to sell information to the people who are going to provide services. When we look at this value chain, we say information should not be sold, but the service should be sold. However, we must remember that there is no such thing as a free lunch. The information producer should realize some return of the profits as well. The producer needs to cover the investment costs of producing the information.

I have very seldom seen a business model based on the notion that the information will be provided free to a service provider, but a certain percentage of the profit of the service provider will come back to the data producer. This would be an integration into a new value chain. There would be an agreement of cooperation between the data producer and the service provider, and the data would not be permitted to be sold by the service provider. The only thing that is sold is the service, but a part of the profit derived from selling the service would come back to the data producer. The licensing could be easier because it is an agreement between the data producer and the service provider. That is not how it is today: the price of the information and the licensing depends on the business model of the service provider.

For example, perhaps a local government could provide cartographic information to a small company that wants to provide services based on this cartographical information. The small company has to explain its business model to the local government, so that the local government could set the price. They might ask how many users do you intend to have, or at what rate would you want to price your service, and then the local government would set a price. This is very difficult for innovating companies because they do not know their market. They do not know the price or the benefits they have to make of it, and they have to first pay for the information before they can go into business. This way no innovation can work, and it is really an inhibition to innovation by small companies that are not able to prepay for the information.

PARTICIPANT: My company has a very positive working relationship with many different public agencies when it comes to the sharing and dissemination of data. That is not really something that is controversial or problematic. But industry is interested, as more technology and services develop, in being able to provide complex datasets not only to other companies or other customers but also to users in order to unleash the kind of creativity leading to new applications or innovations. So what industry would like to ensure, at least what my company is looking for, is that the usually positive initial contact between government and industry continues to be positive.

In a previous life I worked for an industry association that handled PSI questions, and I saw there at least two examples of very negative cases that will probably end up in court when the PSI Directive has been implemented in Sweden, and we will see if they end up in the European Court of Justice. It is clearly a risk that this initially very positive contact becomes harsher or more problematic when the depth of access requested by private companies and industries increases. I hope, of course, that this will not be the case. I think that there is most definitely a demand side in private industry to complement the supply side in the government as in the discussion we are having here today.

PARTICIPANT: I know others are quite rightly saying, “Where are the complaints? Where are the cases that people bring?” I think what we all have to acknowledge is that it is extraordinarily difficult for the private sector to take on the public sector in a court of law. It is a big step, especially if you are a small company. I think also there is the question of time, or timeliness. If a company has an idea that it wants to develop but in order to develop it, it has to go to court, then the idea is dead before it emerges into the real world.

I have a special relationship with one particular trading fund, and that has caused us, as a business, more grief than I would have thought possible. I think one of the reasons for that is that public sector bodies, understandably, do not like being attacked. If you were to attack a supermarket, for example, on fair and reasonable grounds, if it is a well-run business you are likely to get a sensible, moderate response. That is not necessarily true of state-run organizations that do not feel that they are able to defend themselves in the same way. So in dealing with a state-run organization, you have to really believe that the relationship has gone to a point of no return before you take action.

There are two other points I would like to make. The first is that obviously you have to go through the national process before you can then get to the broader EU level. That can be quite telling. Once you get as far as the EU, then I think the situation becomes a great deal more interesting. The EU institutions have got a different slant on some of these issues.

Secondly, I think the competition authorities in various EU countries may have a great deal of ability to unlock some of these problems for all of us. It only takes one or two decisions and all of a sudden the whole of Europe will have to take an interest in this issue.