Suggested Citation:"PART TWO: STATUS OF ACCESS TO SCIENTIFIC DATA." National Research Council. 2012. The Case for International Sharing of Scientific Data: A Focus on Developing Countries: Proceedings of a Symposium. Washington, DC: The National Academies Press. doi: 10.17226/17019.

PART TWO:
STATUS OF ACCESS TO SCIENTIFIC DATA

5. Overview of Scientific Data Policies

Roberta Balstad
Columbia University, United States

We have heard a lot about the practical economic and applied implications of having open access to data, but we should not lose sight of the benefits to science. One of the reasons that access to data is becoming so important is, of course, that the technology has changed, and that we can deal with massive databases in a way that we could not have 20 or 30 years ago. Another reason lies in the very nature of the scientific process itself. What is science? For many people, it is simply experimentation and testing. That narrow definition has been modified in recent years to include experimentation, observation, and testing. For other people, science is really a matter of modeling and projections. If you cannot project something accurately, many believe, it is not science. So you need data for projections, too.

Equally important, scientific research is increasingly evolving into “data-intensive science.” You read about it in the field of health care, for example, where scientists combine data from 20, 30, or 100 different studies to get a larger base in order to analyze and investigate topics that are impossible to pursue in a small, intensive study of perhaps 20 individuals. This is also true in a number of other fields. Data-intensive science relies on open access to data from all sectors, because only then are scientists able to combine datasets to ask new types of questions.

Scientists are able to address much broader questions in data-intensive science than they could if they were responsible for collecting their own data for every study that they conduct. Increasingly, for example, we find that governments collect much of the scientific data that we use. These databases in many countries are open. We would like to see them become more open in even more countries so that scientists can use them.

Open access to data advances science. It improves descriptive, comparative, and observational science; it enriches modeling and prediction; and it makes it easier to test and retest propositions using the same databases. That, of course, goes back to the philosopher of science Karl Popper, who said that true science is science that can be tested, that is falsifiable, and that you can prove wrong. To do that, you have to have access to data.

A second reason for providing better access to scientific data, in addition to advancing science, is that it levels the playing field for scientists from smaller or less-developed countries so that they are able to conduct data-intensive science using publicly available data. In short, data access makes a principal resource of scientific research available to all.

Traditionally, access to data was quite restricted in both policy and practice. Data were held to be the private property of a scientist. On completing a dissertation, we had a body of data that we could mine for a long time. That was considered to be the property of the scientist, and that was what made his or her work significant. In other cases, the kinds of data that Professor Farouk El-Baz was talking about (e.g., remote-sensing data) were often seen as a national asset that had to be protected.

Data were also seen as a commodity that had economic value for the scientist or, more often, for the government that sponsored the data collection. When science becomes a commodity, obviously, those who collect data begin to think about marketing the data, and then they easily slide into charging for data in a for-profit or even not-for-profit setting.

To summarize, the benefits of changing from restricted to open data access policies are as follows:

• Open access to science contributes to innovation and economic growth.

• Scientific advances, both substantive and methodological, are now data intensive and require open access to scientific data.

• The cost of research is reduced. This is very important right now in most countries, because there is often less money available for research. To keep science alive and vital, open access to data is a real advantage.

Limiting access to data, the other side of that coin, results in higher research costs, lost opportunities, barriers to innovation, less-effective scientific cooperation, suboptimal quality of the data (since no one is working with them, no one can provide corrections to them), and a widening gap between the Organization for Economic Co-operation and Development (OECD) countries and the developing countries.

In pushing for open access to data, however, we must acknowledge that there are some legitimate reasons for limiting access to public data:

• National security and public safety.

• Personal privacy and confidentiality, which are protected in many countries.

• Proprietary rights of private-sector parties. No one is talking about forcing open access on research that a company has done in order to advance its product.

Internationally, there have been a number of activities that have advanced open access over the past 50 or 60 years. A big impetus to open access to data was the International Geophysical Year (IGY) in the 1950s, a massive global-scale data-collection effort that stressed open access to the scientific data collected under the aegis of the IGY. One of the results of the IGY was that the International Council for Science (ICSU) formed the World Data Centers. In order to become a World Data Center, a center had to agree that it would provide scientific data to whoever asked. That does not seem to be required anymore, but it was at the time, particularly because a major goal was to make sure that data were available both to scientists in the West and scientists in the Soviet bloc. The Iron Curtain divided scientists as well as politicians, and the World Data Centers were meant to overcome the limits to exchange of data among scientists.

When the Group on Earth Observations formed the Global Earth Observation System of Systems (GEOSS) in 2005, it established the following open data principles:

• There will be full and open exchange of data, metadata, and products shared within GEOSS, recognizing relevant international instruments and national policies and legislation.

• All shared data, metadata, and products will be made available with minimum time delay and at minimum cost.

• All shared data, metadata, and products being free of charge or no more than cost of reproduction will be encouraged for research and education.

In 2007, the OECD made a strong stand on behalf of open data access, recommending that data policies show openness, flexibility, transparency, legal conformity with existing laws, protection of intellectual property, formal responsibility for the data, professionalism, interoperability, data quality, data security, data efficiency, accountability, and sustainability.

There has been gradual movement toward even more openness in data in the United States as well. In the 1980s, the Reagan administration proposed a policy of commercialization of all data that were collected under grants supported by the National Science Foundation (NSF). That would have meant that investigators would have to sell their data to anyone who wanted to use them. One division in the NSF, the Division of Social and Economic Science, established a policy of not making a grant to anyone who would not agree, before receiving the grant, to put their data in a publicly accessible data archive. If the grantees did not follow through with what they had proposed, they would not receive another grant. The policy had limited success because it covered only one division, and it ran counter to the national policy. However, in 1991, the U.S. Global Change Research Program, backed by the White House Office of Science and Technology Policy (OSTP), established a policy of open access to data related to global change. In 2005, the National Institutes of Health required data management plans for all of its large grants. And this year the National Science Foundation promulgated a new data management policy, which requires all grant proposals to include a data management plan. In sum, the role of data, and the policies governing data, are changing rapidly.

6. Implementing a Research Data Access Policy in South Africa

Michael Kahn
University of Stellenbosch, South Africa

Today I am going to speak about three things, broadly: South Africa as the gateway for the BRICS Group, whose members are Brazil, Russia, India, China, and South Africa; its innovation system and policy; and prospects for research data policy.

It is safe to say that South Africa represents, by world standards, a relatively small innovation system that is quite dynamic. In the apartheid years, it engineered all sorts of bad things, but also some good things, especially in the fields of health, plant and animal science, ecology, and environment. Those deep skills persist into the present. That is essentially why South Africa can play an important role as a higher-education hub for the rest of Africa.

The story in Africa is very interesting. Many African countries are among the fastest-growing economies in the world. This morning CNN referred to China as growing at a mere 9.7 percent per year at the moment. The country that comes immediately after that is Ethiopia, at 8.5 percent. Angola, Chad, Democratic Republic of Congo, Ethiopia, Mozambique, and Zambia are all projected to grow well in excess of 7 percent per year over the next 5 years.

South Africa is currently growing around 3 percent and is struggling to break out of what appears to be a natural confine of around 4 percent. But I want to draw your attention to scientific production in South Africa. If we take the country’s scientific article production, as recorded on Thomson Reuters Institute for Scientific Information (ISI), South Africa is the clear leader in Africa. The real question is whether growth can be driven by science and research and development (R&D), or whether growth is driven by industrialization.

Among the universities in Africa, the top 10 are all South African—hence, the higher-education hub. Also, there are a number of fields in which South Africa has scientific impact somewhat above the world average, such as immunology and space science. There is also a high level of particular expertise and activity, and therefore, necessarily, a great volume of research data in agriculture, environment, ecology, and geosciences. Furthermore, I should mention the increasing number of domestic journals that are indexed to Thomson Reuters ISI.

When talking about data policy, you can only talk about it in the context of your entire innovation system. In the case of Pakistan, mentioned this morning, the private sector plays a very small role. In the case of South Africa, the private sector is the largest performer of R&D, but also the smallest producer of scientific output in the form of articles, unlike Japan and even the United States, where many articles emanate from private-sector addresses.

When you talk about an innovation system, you talk about the main actors: business, higher education, and government. These operate in the pursuit of innovation activity, of which R&D is but one activity. You also have to look at your financial system; your cultural and political norms; the regulatory framework; the legal framework, including intellectual property; and information policy. If any one of these is suboptimal, the others are not going to flourish.

In South Africa, we have the National Research and Development Strategy of 2002, which brought about some reorganization of the science system related to reporting lines, which may or may not be significant. It did lead to changes to intellectual property law, as well as the introduction of a forward-looking incentive for R&D, which rewards R&D, much as happens in the United States, with a 150 percent deduction in taxation. It also led to some initiatives in human-resource development, such as a research chairs project modeled on that of Canada, and the introduction of centers of excellence in the universities in various fields.

This has been followed by an innovation plan in 2008, which plays into the same theme of grand challenges that you find in many countries. The five grand challenges identified are all highly data-intensive areas:

1. Space science (i.e., remote monitoring and telemetry).

2. Energy, the hydrogen economy, and new materials or catalysis.

3. Farming to pharmaceuticals and biotechnology, plant and animal science.

4. Global dynamics (i.e., climate change) and remote sensing.

5. Human and social dynamics (i.e., social sciences).

Then we have the introduction of the Technology Innovation Agency in 2010, with a mandate of converting R&D findings into commercial prospects. These initiatives, of course, operate within the larger context of other national laws on employment equity as well as immigration law. These various thrusts are now being driven forward through an industrial policy. South Africa is a very important gateway in Africa, through its work with the New Partnership for Africa’s Development. We have also been active in promoting international networking through our centers of excellence in high-speed computing.

What about the prospects for research data policy? The country has a science system arguably going back a century and a half, to the onset of large-scale mineral exploitation, so it is not a newcomer. There are many countries in Africa where science systems date back perhaps 50 years or less. We have been at it for a long time, in the same way that Brazil has been at it for a very long time.

South Africa is data rich at the system level. We have well-quantified data on R&D and innovation, and educational statistics. Bibliographic information is held in a private database, which is not unusual. Thomson Reuters, after all, is also private. We have good data on higher education, although this is inadequately exploited, and we are busy building a new database called the Research Information Management System, which will hopefully lead to better research management, both in our research councils and in the universities.

What about data held by regulators? We have a clinical trials register. We have gene banks. We have data on plant breeders’ rights, biodiversity, and indigenous knowledge compliance, and we have an ethical clearance system built into the funding awards process. We also have the Promotion of Access to Information Act, which allows access to this information. The Patent Amendment Act has introduced some potential dysfunctions into the system, in that it might well discourage people from taking out patents in South Africa and lead them instead to export their knowledge abroad, which is the exact opposite of what the drafters of the act intended.

We are also data rich at the sector level. We have the South African Earth Observation Network. We have radio astronomy data; seismic data; oceanographic, geological, and meteorological data; social science data; and biodiversity data, including about aquatic diversity. We have an extremely able statistics service.

The question is, who gets access to this and how can it actually be used? Regrettably, there is fragmentation by default. Although Statistics South Africa has a mandate to coordinate national surveys, that is as far as it goes. The national Department of Science and Technology (DST) has a mandate to coordinate science budgets, but not to coordinate information. Indeed, it has experienced extreme difficulty even in the attempt to coordinate science and technology budgets across government. Silos rule in the government domain. Originally the DST mandate was to coordinate R&D budgets. It turned out to be impossible, so it was widened to cover science and technology. That has been even more difficult.

We have a National Advisory Council on Innovation, but it lacks the authority to carry out its mandate. It is currently undergoing its third review. We also have the problem of management information systems being designed for one task but forced to address others, and there are resource limitations in the form of inadequate metadata and training.

So, what are the prospects for coordination? I would like to be optimistic and say they are reasonable and perhaps improving. There is a commitment to monitoring and evaluation at the highest level. We now have in the presidency a minister for monitoring and evaluation. That minister has required each of the ministers of state to enter into an indicator-based performance agreement. This, unfortunately, might bring about perverse behavior. If you are asking me to account for myself, it is in my interest to set the bar as low as possible, because then I will be sure to succeed, and I will get a pat on the back.

We also have a commitment to big science, and that necessitates a lot of work on data coordination. South Africa has given strong support to the African Union’s work on science and technology policy, in the form of promoting high-speed, wide-area networking and data sharing, and supporting the development of suitable infrastructure. We also have a commitment to the Organization for Economic Co-operation and Development guidelines for access to publicly funded research. A review of the science system is currently under way, which provides an opportunity to accelerate progress.

I must also draw your attention to the United Nations Educational, Scientific and Cultural Organization (UNESCO) study on the social sciences, Knowledge Divides, which was published in 2010. It looks quite critically at the issue of social data.

To conclude, I would like to focus on the North-South relationship. I parody this relationship as one of academic hunters exploiting data gatherers. With due respect to my hosts: “We from the North are the hunters, and you in the South are the gatherers. Collect the data, and we will process it for you. We will keep the datasets, because we cannot trust you to manage or to exploit them properly.”

These kinds of interactions lead to mechanisms that restrict inquiry. There are many African countries that have been exploited because they have very interesting archaeological or anthropological resources, and people from other countries wish to come in and get their Ph.D.s, carry out interesting research, and get the credit for that. As a result, hardly surprisingly, many countries have closed their doors. In addition, these unbalanced relationships are used to restrict other kinds of research, particularly social science research that might turn up unwelcome findings.

Finally, there is a large amount of information that has not been digitized, and therefore is not easily accessible. The promotion of legislation for data curation and archiving, and for open-access publication, would be desirable as well.

7. Access to Research Data and Scientific Information Generated with Public Funding in Chile

Patricia Muñoz Palma
National Commission for Scientific and Technological Research, Chile

INTRODUCTION

The progressive and significant increase in public funding for scientific research in Chile, along with advances in new technologies and international institutions, imposes a new challenge: to ensure access to, and promote the reuse and long-term preservation of, scientific and technological data generated using public funds. I will describe some policy initiatives addressing access to research data and scientific information that are currently under study in Chile.

The situation in Chile can be summarized as follows: Over 80 percent of research activity is supported by public funding, and these funds are oriented mainly to researchers, universities, and research institutions. The results of the research are published in scientific journals and proceedings, and are available in both electronic and print format. Most research data are not available for access or consultation, and commonly are managed by the researchers themselves.

In order to design and implement a policy initiative addressing access to data, other aspects must be considered besides the practices of researchers and institutions, such as legal frameworks; agreements between beneficiaries and funding agencies; intellectual property, copyright, and transparency law; and international recommendations from institutions such as the OECD, ICSU, and CODATA.

In this context, CONICYT (National Commission for Scientific and Technological Research) takes into account the country’s need for an adequate infrastructure and for policies that facilitate access to research data and scientific information. For this reason, we have developed the three main initiatives described below.

THE FIRST INITIATIVE: A Program for Management of Research Data and Scientific Information

The principal objectives of the proposed program are to manage, strengthen, and guarantee access to data and scientific information collected and produced in Chile from public funding, in order to:

• Facilitate and promote access to, and exchange of, data and scientific information generated in the country;

• Preserve the scientific patrimony (i.e., specific elements of agreed national importance);

• Promote utilization of international standards for management of data and scientific and technological information in public and private institutions;

• Create information products and services as added-value tools for scientific, productive, and economic development of the country;

• Systematize the process of managing data and scientific information, with the goal of obtaining data that will allow the creation and improvement of national indicators regarding investment, transfer, and impact of public investment in the national system of science and technology.

The program has considered four strategic items:

• Institutional framework: Respond to the needs of the country with a broader vision for development of science and technology and innovation.

• Specific studies: Generate basic inputs for decision making and generation of policies.

• National and international linkages: Create incentives for development of national and international networks.

• Human capital: Develop instruments allowing for specialization of professionals in these areas.

The most important product of this program will be the design of a National Platform for Access to Science, Technology, and Innovation Knowledge. This platform will allow consultation of and access to research data and information, and will enable their storage and preservation. It will involve all the relevant institutions, such as universities and technology research centers, that participate in the production of technological and scientific knowledge, including data collection, processing, and the production of specific reports.

THE SECOND INITIATIVE: Study of the State of the Art in Access to and Management of Research Data and Scientific Information Generated with Public Funding

This initiative was the first step in the implementation of the above-mentioned program. Its goal was to gain a basic understanding of international and national practices on access to data and information.

The specific objectives of this study were to:

• Identify the international state of the art on these topics;

• Describe the practices and policies for access to research data and scientific information in Chile; and

• Generate recommendations related to management of research data and scientific information for the Chilean institutions, considering the international context.

This study identified several deficiencies in Chilean practices, described in more detail below:

Institutional framework:

• Lack of awareness, at all institutional levels, of the importance of maintaining adequate data management practices.

• Responsibility for the management of data and scientific information is spread throughout organizations, lacking one specific organizational unit in charge of it.

• Data management within the organizations is financed through individual projects, rather than considered an activity that requires continuity and therefore a permanent source of funding.

• Patrimony is not seen as an institutional and national asset, but as researchers’ personal resources.

• Lack of standardization of data and the formats of scientific information.

Human capital:

• Lack of critical mass of human resources specialized in data management.

• Scant supply of formal and specialized academic programs focused on data management.

• Lack of incentives to create upward career paths in the field.

• No leadership or national references in the field.

• Unstable and short-term work positions.

Patrimony:

• Lack of or weak valuation of patrimony.

• Patrimony scattered throughout organizations.

• Patrimony considered personal, rather than institutional or national.

• Preservation and formatting according to individual criteria.

• Lack of initiatives for sharing or exchanging.

• No culture of giving access to patrimony.

Policies:

• Lack of institutional policies addressing access to research data and scientific information.

THE THIRD INITIATIVE: Development of a Policy on Management of Research Data and Scientific Information

The goal of this policy for the main Chilean public funding institution (CONICYT) will be to implement a series of standards for capture, registration, and management of data and information systems, to be used by all beneficiaries of public funding.

The principal objectives of this activity are to:

• Optimize and rationalize use of public resources involved in generating and managing scientific knowledge;

• Increase access to research data and scientific information; and

• Adopt and attain international standards, including OECD recommendations.

In order to implement this new policy for CONICYT, the next steps will be to:

• Generate awareness in the scientific community, and among academics, public employees, and public agencies about the relevance of having access to research data and scientific information; and

• Develop a network of institutions and individual professionals interested and willing to collaborate in the implementation of this policy.

CONCLUSIONS

These initiatives represent a significant challenge for CONICYT. As the main Chilean funding agency for scientific and technological research, it is aware of the value that access to research data and scientific information adds to the process of knowledge production. The implementation phase that comes next will encounter resistance at different levels, from legal barriers to deeply established notions of ownership of the data produced by researchers. We trust CONICYT will be able to gather the support of the scientific community and political leaders, and be able to align them around the goal of boosting the quality and quantity of the national scientific and technological production.

8. The Management of Health and Biomedical Data in Tanzania: The Need for a National Scientific Data Policy

Leonard E. G. Mboera
National Institute for Medical Research, Tanzania

Sound statistics are a key component of evidence. This is the main reason that scientific research increasingly depends on successful access to data. In particular, findings from research conducted in one place can be useful or relevant somewhere else. Many developed countries have established national policies and programs for data management and access, but the situation is different for developing countries, including Tanzania. Most of those countries have either a weak policy for data management and access or none at all.

However, there have been some initiatives taking place in the African region that are worth mentioning. For example, the Algiers Declaration (2008) aims at improving

• The availability of relevant and timely health information, and access to global health information;

• The management of health information through better analysis and interpretation of data;

• The availability of relevant, ethical, and timely research evidence;

• The use of evidence by policy makers and decision makers;

• The dissemination and sharing of information, evidence, and knowledge; and

• The use of information and communication technologies.

As for the current situation in Tanzania, we have developed guidelines on data transfer, but there is no policy on research data sharing yet. Difficulties also exist in accessing and sharing the scientific data that has been collected by different researchers. Specific barriers to the access and sharing of scientific data collected by researchers using either public or donor funding include scientific and technical, institutional and management, economic and financial, legal and policy, and normative and sociocultural aspects.

The National Institute for Medical Research (NIMR) plays a major role in health research and data sharing in the country. For example, Parliamentary Act No. 23 of 1979 gives the institution a mandate to work on some key issues: (1) to establish a system to register the findings of medical research carried out within Tanzania, and promote the practical application of those findings for the purpose of improving or advancing the health and general welfare of the people of Tanzania; and (2) to establish and operate systems of documentation and dissemination of information on any aspect of the medical research carried out by or on behalf of the institute. This is the basic mandate that the NIMR has had since its establishment more than 30 years ago.

Regarding management of health research in the country, the situation is not very good. Researchers in Tanzania, like those in other developing countries, lack the norms and traditions of open data sharing for collaborative research. If you go to individual institutions within the country, you will find that many institutions are treating their data either as a secret or as commercial commodities. They are not really willing to say, for example, “Here’s the data. Let’s do something with it. Let’s make sure that we inform the world about the research we have been doing.” Also, although the government does not actively protect such data, it lacks policies that provide guidance or identify responsibilities for the researchers in making research data available for others to use. Moreover, Tanzania does not have a central data center or digital repository in place where researchers can submit their data for use by others. This is a key challenge. We are saying that researchers are not willing to submit their data, but if they are, where are they going to put it to share with other institutions or with other people?

Next, I would like to talk about the issue of research data transfer in the country. What are the guiding principles for data-transfer agreements in Tanzania? Tanzania introduced a procedure for data transfer between Tanzanian and foreign institutions in 2010. The Data Transfer Agreement (DTA) is used as the only legal document by institutions in Tanzania to regulate the uses of data they provide to specific research projects inside and outside the country. It also provides opportunity for institutions to claim co-ownership of the improvements made from data the recipient has acquired. Upon approval of the DTA, the agreement becomes valid and the recipient is granted unique access for a period of time, depending on the duration of the project, after which the data are placed in the public domain.

Before concluding, I want to inform you about publication procedures in the country. No research work can be published without getting permission from the National Institute for Medical Research (NIMR). The procedure is simple. When you want to do research in the country, you need to get ethical clearance from the NIMR.

Finally, I want also to tell you about some new initiatives within the Institute for Medical Research. The NIMR is in the process of establishing a Web-based National Health Research Data Repository. The primary purpose of this project is to provide a storage database for all health research work that has been conducted in the country. We want to put together all research work conducted in different places in the country in a central database, whereby it can be accessed.

This has not been easy, and we still have a number of challenges. The financial challenge is the big one. Also, issues of human resources are a challenge, especially regarding technical staff. Furthermore, we have challenges related to infrastructure. We did not give up, though. The NIMR has different centers and stations around the country. Each station or center has a purpose for conducting particular research. Some centers focus on malaria; others focus on tuberculosis or neglected tropical diseases. The data repository for the NIMR will collect information from these centers and stations. Once we are done with them, we can move on to other institutions.

To conclude, with the increased need for data and information sharing globally, it is time now for Tanzania to develop and implement a national research data policy. This policy, among other things, will oversee the generation, storage, transfer, and use of research data among local users, and between local and international users, of different research findings. Also, we need to have a formal mechanism established to govern the accessibility and use of created research data.

9. The Data-Sharing Policy of the World Meteorological Organization: The Case for International Sharing of Scientific Data

Jack Hayes U.S. Permanent Representative to the World Meteorological Organization, NOAA Assistant Administrator for Weather Services, and Director, National Weather Service

The mission of the National Oceanic and Atmospheric Administration (NOAA) National Weather Service is to use the information that science has equipped us with to provide forecasts and warnings to protect life and property and enhance national economies. This will be the focus of my talk.

From the U.S. perspective, we have long believed that free data sharing benefits the United States, and that information created using U.S. tax dollars must benefit society and be made available to people. My focus will be on how it benefits the United States today. I will conclude with some additional thoughts about where free and open access can benefit capacity building in developing countries, and then, as a weather service operator, how I see that benefiting society.

If you are not familiar with the World Meteorological Organization (WMO), it is a specialized agency of the United Nations that was created after World War II, in 1951, although it existed in various forms since the 1870s. The WMO has 190 members. Its mission is to focus international partnerships on the collection, production, and exchange of weather, water, and climate information in order to protect life and property, enhance national economies, and preserve environmental quality. There are two important WMO resolutions related to data sharing:

Resolution 40: “WMO commits itself to broadening and enhancing the free and unrestricted international exchange of meteorological and related data and products.” This resolution was passed by the WMO Congress in 1995 to provide free and unrestricted international exchange of meteorological and related data and products across the world. In 1999, Resolution 25 established the same basic policy across the 190 members for hydrological data and products.

Resolution 25: This resolution calls for “committing to broadening and enhancing, whenever possible, the free and unrestricted international exchange of hydrological data and products, in consonance with the requirements for WMO’s scientific and technical programs.”

The way this policy is implemented across the globe is through giving the national owner of data and products the ability to discriminate between the data that are essential to the core mission—to protect life and property—and the data that are called supplemental. Generally the protect-life-and-property data and products are free and open, and supplemental data have variable degrees of openness, depending on the country.

U.S. data policy has its foundations in the Office of Management and Budget’s Circular A-130: “…government information is a valuable national resource, and … the economic benefits to society are maximized when government information is available in a timely and equitable manner to all.” What we have found in the United States is that free and unrestricted data sharing maximizes the value of the information we collect and create within the National Weather Service. We also find that the reverse is true. Where we have barriers, we underutilize the data that we collect and produce. Fundamentally, the more people use our information, the more value it has.

To fulfill our mission to protect life and property and to enhance the economy, we collect and produce information that can be used to do that, but this data can be used for other purposes as well. There is a commercial weather sector that uses our data, adds value to it, and provides products and services for very specific needs. This, we find, has aided economic growth within the United States.

Let me give some specific examples. When an F5 tornado hit the community of Greensburg, Kansas, in 2007, the National Weather Service detected it. The tornado occurred at night. It formed in northern Oklahoma and moved across the border into Kansas, where it leveled Greensburg. While the community expected hundreds of people to lose their lives, the number was about 10. This was possible because the National Weather Service detected the tornado and, in partnership with the media, alerted people via television, radio, and NOAA weather radio, so that they had plenty of advance warning. It was a partnership that we had with the emergency management community that worked with the hospitals, so that we had moved ambulances outside of the town that was going to be struck to a position where they could be of more use.

What did the commercial weather sector do in this incident? At that time, there was a train headed right into the path of that tornado outbreak. There was a commercial weather company that used the information we collected to relay a warning to the dispatcher in Omaha, Nebraska. They put a hold on that train several miles east of the path that we projected for the tornado outbreak. That train stopped and the company saved the lives of that train crew and the train itself.

To emphasize the value of data sharing in this incident, I would like to compare it with a very similar tornado outbreak in the early 1950s in Udall, Kansas. In that incident, a very similar tornado moved out of Oklahoma into Udall at night and destroyed the city. As a result, 100 people lost their lives. I would say that the free and open data exchange, and the partnership created between the private and public sectors in the United States, has allowed us to reduce the loss of life over the past 30 to 40 years.

Here is another example. There are various motor raceways across the United States. In the partnership we have with the private sector—for example, the Pocono Raceway in Pennsylvania—our job is to monitor severe weather and to ensure that the raceway has the warning it needs to get the people attending out of harm’s way. The private sector will use information we collect about temperature, humidity, wind—anything that might affect the conduct of the race or the comfort of the people there, add value to it, and give it to the raceway officials, to the racers, and to the people in general. That is a partnership that increases the value of the information.

In one last example, in partnership with the Federal Aviation Administration and the Department of Defense, we have roughly 150 Doppler radars spread across the United States. We use the data from these radars to detect severe weather and to warn others of severe weather. The commercial weather industry provides applications. If you have a BlackBerry or a cell phone that has an imagery capability, you can get access to that imagery.

I will switch to a discussion of Europe now. Different countries in Europe have different models. In general, however, many countries supplement their federal budgets by charging for some data and products that they produce. Protect-life-and-property data and products are generally freely available, and their warnings and observations are too. The warnings, in general, are simple cartoons, not as vivid as what we can produce in the United States. The observations are generally conveyed as text rather than as a visual representation.

Let me talk about capacity building in developing countries and give you an example of where a united effort sponsored by the WMO is certainly going to enhance the capacity within developing countries. In the wake of the December 2004 tsunami in the Indian Ocean, nations of the world got together, through the WMO and the Intergovernmental Oceanographic Commission, and created a tsunami warning system that is used internationally. It is that free and open exchange of information that aided and helped us move that along. One effort that I have been personally involved in is in southern Africa, called the Severe Weather Forecast Demonstration Project. I have long objected to the use of the word “demonstration,” because basically it means that we take model information based on geophysical equations and apply it to predicting and warning citizens. I do not think we need to demonstrate that; we have seen that for 30 or 40 years in the United States. It is a matter of growing actual predictive capabilities in developing countries.

What we have done is allied with other similarly inclined developed countries and created in South Africa the ability to bring in global model data. The government of South Africa runs a regional very-high-resolution model, combines it with satellite information, and creates products and data that it distributes to 16 developing countries in the Southern African Development Community. Similar initiatives have been undertaken in East Africa, Northwest Africa, the Southwest Pacific, and Southeast Asia, as well as the Caribbean. These initiatives are regionally driven, in which a subregion or a region makes a statement about its environmental threats, needs, and priorities. Under the WMO auspices, developed countries help to develop capacity so that we can transition to those countries things that we have become used to in the United States.

Looking at a world picture over the past 50 years, increased data sharing has contributed to a major reduction in loss of life. In the 1956-to-1963 time frame, 2.6 million people lost their lives due to environmental disasters related to weather, water, and climate. Fast-forward to 1996 through 2005, when 220,000 lost their lives. This is still too many, but it is an order-of-magnitude reduction in loss of life due to weather, water, and climate.

Looking to the future, initiatives such as the Global Earth Observation System of Systems (GEOSS) are attempts to create data sharing for future scientific and operational benefit. As we look to the nine societal benefit areas in GEOSS, we are already seeing the scientific interrelatedness. It will only grow. From my perspective, free and open data exchange helps us advance the vision of GEOSS faster, cheaper, and further than we could if we did not have that capability.

To conclude, from a U.S. perspective, free and unrestricted data sharing in our country has created great value that we realize today. We strongly support initiatives to sustain and broaden it, both in the United States and internationally.

10. DISCUSSION OF PART TWO BY THE SYMPOSIUM PARTICIPANTS

PARTICIPANT: I have a question for Dr. Hayes. When you mentioned the less-than-optimal policies of certain European countries in data sharing, do you see an improvement in recent years in that respect?

DR. HAYES: The answer is yes. In the 1990s, there was very strong opposition to more free and open data exchange policies. However, there is a trend in many of those countries today to adopt our model, if not formally, at least informally.

PARTICIPANT: My question is to Dr. Kahn. Do you have any divisions between locally published articles versus internationally recognized journals—Nature, Science, and even specialized journals?

DR. KAHN: That is indeed the very problem of utilizing Thomson Reuters or Scopus or any international repository to measure publication output. It is naturally biased in three ways: language, field, and against locally published journals. It is a restricted view, but it has become the gold standard. Garfield gave us the ISI nearly 50 years ago, and it has retained its status up to the present time, but it has to be used with caution, which is why I made the remark I did. If we take any country outside the core of the Organization for Economic Co-operation and Development (OECD), there will be a host of local journals in the local language, particularly in the social sciences, law, economics, and so on, that are not counted. To get a true picture of the contribution to the knowledge pool, you have to extend what is included. We can assume, of course, that these local journals are subject to peer review and have an appropriate frequency of appearance. Plus, in the social sciences, publication channels differ. Books are more common than journal articles.

PARTICIPANT: I have one partial correction for you and then a question. The correction is that the implementation guidelines that you showed for the Global Earth Observation System of Systems (GEOSS) are no longer proposed; they are actually accepted by the GEOSS plenary. That is progress. The group is still working on pushing that further. There are a lot of issues that those guidelines and the original principles do not address.

My question is that there were a lot of very interesting presentations about the responsibilities of the developing countries’ governments to coordinate and get their data policies and act together. We have heard less about what the responsibilities of developed countries should be, having had a tradition of being hunters. From your perspective, how should developed countries behave? What practices should they adopt for acting respectfully toward developing countries’ efforts?

MR. MAYALA: I think that there is a need for the developed countries to focus on what developing countries are doing. Of course, if there is any support they can provide, then they should provide it. We need to learn from each other. From there, we can move together.

DR. BALSTAD: A few years ago, the International Council for Science sponsored a study of various priority areas in international science. One of them was data and information. The report that came out on data and information was not intended to be a manual or a policy prescription, but something in between that discussed major issues in data and information and some of the responses that have been developed in the developing world. Part of it is that we are all learning in this new area, and we need to work together. Some countries have gone farther down this road than others. It is vitally important that they share information about how to do it.

DR. KAHN: Remember the medical oath of Hippocrates: First, do no harm. That is what doctors agree to do when they are sworn into the profession. In asking developing countries to make data available, we should expect that there would be reciprocity from the North. The reciprocity should include an understanding of the defensiveness that might be encountered from the South, which is a residue of many years of a lopsided set of relationships. A book like John Le Carré’s The Constant Gardener does not come out of nowhere. There is lopsidedness and often a legitimate concern that we in the South are, as one cabinet minister put it, a giant zoo for the benefit of others. The important thing is that the barriers need to be negotiated away, lest the general scientific enterprise suffer.

PARTICIPANT: I would like to share some of our experiences and the challenges that we face with this. Our project does computer modeling of infectious disease transmission, and for that, we typically use high-resolution surveillance data from governments. For this, we tried to form partnerships with countries in Southeast Asia (Laos, Cambodia, Vietnam, and Thailand) and in Africa (Niger and Nigeria). Our approach has been to make agreements with ministries of health to use data collaboratively and then build capacity inside the ministries to manage data, where needed, digitize paper records, and then put them up in repositories that can be used, and also train people in modeling. In some countries, like Thailand, that has worked very well. In other countries, especially very-low-income countries, we can make agreements with ministries to share and coauthor papers and provide technical reports, but our experience has also been that people in ministries of health or in governmental agencies have incredibly limited time to interact with us and to be trained. They have very limited capacity to train people and then use the data management system at all. So it has been a struggle for us to identify people and push governments to see if there are people available to be trained. There is also the issue of lack of interest in the data itself. We found that a lot of governments do not really seem to care too much or do not see the value yet. In summary, our challenge is the sustainability of training people and making sure that after we leave, the country retains the databases, data systems, and modeling capacity, and also retains the interest of the people locally.

PARTICIPANT: My question is for Ms. Muñoz. You gave us a very comprehensive description of the initiatives of the National Commission for Scientific and Technological Research (CONICYT) with regard to promoting open data-sharing policies, but I wonder whether you could expand on the efforts of CONICYT to promote, among Chilean scientists, a culture of data sharing with the international community.

MS. MUÑOZ: I think that we can work with the whole community to promote open access to data and scientific information, but at this moment, we think that the most important initiative will be to develop and implement the policy for management of publicly funded research data and scientific information, because this is the missing piece.

PARTICIPANT: My question is about how you see the new paradigm coming out from the cloud-computing utilities that, in a way, the scientific community is going to adopt. We can see that this phenomenon is growing very fast in the United States. The National Oceanic and Atmospheric Administration (NOAA), for example, is leading, with other agencies, this kind of phenomenon. I am very interested in your perspective. The second question concerns how you see this new utility from the technological point of view, as a way to boost collaboration among developing countries. I am representing Italy, and am very interested to see how we can boost this transatlantic dialogue between the United States and Europe in order to support developing countries.

DR. HAYES: From a NOAA perspective or a National Weather Service perspective, I put on rose-colored glasses and say, “How could we benefit society?” Then I take the glasses off and say, “What constraints do I have to live with?” One of my challenges here is to increase the value of weather information to the United States. I think that creating a concept like an open National Weather Service—which would include the public sector, the private sector, and the academic sector—would ensure that. This allows research managers to talk with operations managers and encourages data-sharing partnerships among a variety of missions.

Given the importance of water to our future, we have launched an integrated water resources science and services within NOAA. There are 21 federal agencies in the United States government that produce water information. I would think that we could partner across the U.S. federal government, share data and information, and increase its value and partnership to the United States. I would broaden that to the private sector as well.

Another example is one that we have just started. It has to do with space weather. We are going to have a solar maximum in 2013. The infrastructure that we have created is more technologically advanced, but it is also more vulnerable to things that happen on the sun than it was even 20 years ago. I envision a future where we could create an international partnership around the globe through an organization like the World Meteorological Organization (WMO) to bring together academic, operational, and private-sector resources to benefit society.

PARTICIPANT: I would like to address a question to Dr. Hayes about what might be related issues and possibly also get some response or input from our colleagues from Africa. Is there any information available about trends in the participation of developing countries in the WMO GTS (Global Telecommunication System) weather system? Do you see more data coming into the WMO from developing countries, particularly Africa? Also, has the WMO become more active in recent years in providing services or products, above and beyond the basic weather forecasting, that would be of special benefit for developing countries? Again, I ask particularly with respect to Africa.

DR. HAYES: I will try to address it, and if I miss something, you can come back to me. First, observations are not inexpensive. You can imagine a developing country has similar challenges. We also have commercial interests that are threatening; for example, 4G mobile communication standards are invading the space that used to be reserved for meteorological sensing and data transmission. Second, going beyond weather, there is a concept in WMO called the Regional Specialized Meteorological Center. It really, I think, stemmed out of Chernobyl, where some countries had a nuclear-dispersion capacity and could share that with regions and make those products available. I certainly see that capability. As it applies to Africa, I think the Severe Weather Demonstration Program that WMO is using is trying to anchor, within a sub-region of Africa, not just the computing and distribution infrastructure but the ownership as well.

PARTICIPANT: This is a question for panelists Muñoz, Mayala, and Kahn. In developing your respective national science data policies, did you find the greatest resistance coming from political forces, economic or business forces, the scientific community, or just plain inertia? In other words, we have not needed it before; why do we need it now?

DR. KAHN: Previously, I worked for our Human Sciences Research Council at a time when we were busy debating the meaning of the OECD 2007 guidelines, which are great and noble and agreed upon by all. The Human Sciences Research Council, where I worked, was a research performer, unlike the Economic and Social Research Council of the United Kingdom, which is a grant maker. The parody at the time was that our universities wanted only two simple things from us: all of our research funds and all of our data. The parody was, you show us yours and we will show you ours. It is a very critical issue. We are moving toward an open-access data policy. Certainly the national Department of Science and Technology believes that this is a good thing, but at the same time you get what I will call a “techno-nationalism” that creeps in. It is even present in what I heard this morning from Tanzania. We have been cheated in the past, and therefore we have to be on guard to ensure it does not happen in the future. So you get these countervailing tendencies. On the one hand, there is something we might even call data sovereignty. On the other hand, there is data openness. We all agree that data openness is the way to go, but in the evening, when we have a drink with our party colleagues, we say, “No, no, no. The nasty foreigners are going to take it all away from us, and we have to be on our guard all the time.” It comes from two sides.

You asked where most of the resistance comes from. I think there is a political dimension. The business sector is another case, because in South Africa the greatest volume of research and development is done by the business community. In that sense, we are more like an OECD core member state than an emerging economy. However, the business community does not publish very much in the scientific literature, so that does not really come up. The complication arises when you have a research project, and this takes us to the Bayh-Dole Act in your country—a research project that is jointly funded by business and public funds. How do you now decide which piece of data goes into the public domain and which remains behind the company walls? There is resistance. It is political. It is nationalistic, techno-nationalism, as I said. There are also proprietary restrictions.

MS. MUÑOZ: I think that the big problem for implementing this in Chile is our scientific community, because of the cultural aspect. In this context, the researchers are not aware of how important these matters are for the development of the country, or of their public value.

MR. MAYALA: I want to give a little bit of experience from my country. First of all, I think it is known that you cannot put science and politics in the same pot. It is very difficult to act together. In my country, we want to come up with something that people will see. There are those groups who wait and see and then they react. We are in that kind of situation. I mentioned that we are trying to build a data repository. We hope that when it is out, we can get an appropriate reaction from people.

PARTICIPANT: I have a question for Dr. Kahn. You mentioned that South Africa is now a member of the BRICS Group. I presume that means more cooperation and partnerships with the large developing countries that are members of that network. You also mentioned that there is the issue of South Africa in Africa. You did not talk about that as much. Could you describe some of the partnerships and networking that are going on in South Africa with other African countries?

DR. KAHN: First, from an economic standpoint, South Africa accounts for something like 30 percent of the continent’s gross domestic product and around 70 percent of its scientific production. Those two walk together. South Africa, for the last 10 years, has been paying the bill to host the New Partnership for Africa’s Development and now the African Union’s Science and Technology Secretariat, which is hosted in our Council for Scientific and Industrial Research in Pretoria. In addition, I mentioned the role that we are playing in supporting higher-education development. I mentioned a figure of 50,000 students from across Africa studying in South Africa, out of a student population of around half a million. It is close to 10 percent. Of these 50,000, fully 30,000 come from an economic grouping known as the Southern African Development Community of 15 southern African countries, including Madagascar. Those students are regarded as home students. They pay exactly the same fees as I pay for my own children. So, South Africa is donating 1 billion Rand (the equivalent of 122.1 million U.S. dollars) a year to support students from the Southern African Development Community. This is very important.

Because of the economic dominance of the country, there are many international agreements to which we are a party and where we play a central, coordinating role, ranging from air traffic navigation to weather and climate mapping and the like.

Southern Africa’s achievements lie in cross-border environmental modeling and monitoring, as well as in the harmonization of transport links. All of these involve considerable research and data sharing. So we in South Africa are playing a role both in capacity development and in data sharing in numerous fields, across southern Africa in particular, but further afield as well. A lot of this, of course, is driven by self-interest, and our business activities to the north are well supported by this research investment over many years.

PARTICIPANT: Some international organizations still have contributing members who are not following those data policies. How can we provide incentives rather than simply examples as we move forward? Are there lessons to be learned from other communities that have had success?

DR. HAYES: I will speak from a meteorological perspective. One of the challenges in our community is that ownership of the data is country dependent. The philosophy we use in the United States is not the same philosophy they use in Russia. It is more than data. It is how the government is organized and how they fund the services. While we may not feel a threat in the United States, the Russian Hydrometeorological Service does. You almost have to deal with those one at a time, country by country, and be willing to sit down and try to find a way to reduce the perception of threat.