PART TWO:
STATUS OF ACCESS TO SCIENTIFIC DATA
18 THE CASE FOR INTERNATIONAL SHARING OF SCIENTIFIC DATA
5. Overview of Scientific Data Policies
Roberta Balstad
Columbia University, United States
We have heard a lot about the practical economic and applied implications of having open access to data,
but we should not lose sight of the benefits to science. One of the reasons that access to data is becoming
so important is, of course, that the technology has changed, and that we can deal with massive databases
in a way that we could not have dealt with them 20 or 30 years ago. Another reason is in the very nature
of the scientific process itself. What is science? For many people, it is simply experimentation and testing.
That narrow definition has been modified in recent years to include experimentation, observation, and
testing. For other people, science is really a matter of modeling and projections. If you cannot project
something accurately, many believe, it is not science. So you need data for projections, too.
Equally important, scientific research is increasingly evolving into “data-intensive science.” You read
about it in the field of health care, for example, where scientists combine data from 20, 30, or 100
different studies to get a larger base in order to analyze and investigate topics that are impossible to
pursue in a small, intensive study of perhaps 20 individuals. This is also true in a number of other fields.
Data-intensive science relies on open access to data from all sectors, because only then are scientists able
to combine datasets to ask new types of questions.
Scientists are able to address much broader questions in data-intensive science than they could if they
were responsible for collecting their own data for every study that they conduct. Increasingly, for
example, we find that governments collect much of the scientific data that we use. These databases in
many countries are open. We would like to see them become more open in even more countries so that
scientists can use them.
Open access to data advances science. It improves descriptive, comparative, and observational science; it
enriches modeling and prediction; and it makes it easier to test and retest propositions using the same
databases. That, of course, goes back to the philosopher of science Karl Popper, who said that true science
is science that can be tested, that is falsifiable, and that you can prove wrong. To do that, you have to have
access to data.
A second reason for providing better access to scientific data, in addition to advancing science, is that it
levels the playing field for scientists from smaller or less-developed countries so that they are able to
conduct data-intensive science using publicly available data. In short, data access makes a principal
resource of scientific research available to all.
Traditionally, data access was quite restrictive in terms of both policies and practices. Data were
held to be the private property of a scientist. At the end of doing a dissertation, we had a body of data that
we could mine for a long time. That was considered to be the property of the scientist and that was what
made his or her work significant. In other cases, the kinds of data that Professor Farouk El-Baz was
talking about (e.g., remote-sensing data) were often seen as a national asset that had to be protected.
Data were also seen as a commodity that had economic value for the scientist or, more often, for the
government that sponsored the data collection. When science becomes a commodity, obviously, those
who collect data begin to think about marketing the data, and then they easily slide into charging for data
in a for-profit or even not-for-profit setting.
To summarize, the benefits of changing from restricted to open data access policies are as follows:
• Open access to science contributes to innovation and economic growth.
• Scientific advances, both substantive and methodological, are now data intensive and
require open access to scientific data.
• The cost of research is reduced. This is very important right now in most countries,
because there is often less money available for research. To keep science alive and vital, open
access to data is a real advantage.
Limiting access to data—the other side of that coin—results in higher research costs, lost opportunities,
barriers to innovation, less-effective scientific cooperation, suboptimal quality of the data (since no
one is working with them, no one can provide corrections to them), and a widening gap between the
Organization for Economic Co-operation and Development (OECD) countries and the developing
countries.
In pushing for open access to data, however, we must acknowledge that there are some legitimate reasons
for limiting access to public data:
• National security and public safety.
• Personal privacy and confidentiality, which are protected in many countries.
• Proprietary rights of private-sector parties. No one is talking about forcing open access on
research that a company has done in order to advance its product.
Internationally, there have been a number of activities that have advanced open access over the past 50 or
60 years. A big impetus to open access to data was the International Geophysical Year (IGY) in the
1950s, a massive global-scale data-collection effort that stressed open access to the scientific data
collected under the aegis of the IGY. One of the results of the IGY was that the International Council for
Science (ICSU) formed the World Data Centers. In order to become a World Data Center, a center had to
agree that it would provide scientific data to whoever asked. That does not seem to be required anymore,
but it was at the time, particularly because a major goal was to make sure that data were available both to
scientists in the West and scientists in the Soviet bloc. The Iron Curtain divided scientists as well as
politicians, and the World Data Centers were meant to overcome the limits to exchange of data among
scientists.
When the Group on Earth Observations formed the Global Earth Observation System of Systems
(GEOSS) in 2005, it established the following open data principles:
• There will be full and open exchange of data, metadata, and products shared within
GEOSS, recognizing relevant international instruments and national policies and legislation.
• All shared data, metadata, and products will be made available with minimum time delay
and at minimum cost.
• All shared data, metadata, and products being free of charge or no more than cost of
reproduction will be encouraged for research and education.
In 2007, the OECD made a strong stand on behalf of open data access, recommending that data policies
show openness, flexibility, transparency, legal conformity with existing laws, protection of intellectual
property, formal responsibility for the data, professionalism, interoperability, data quality, data security,
data efficiency, accountability, and sustainability.
There has been gradual movement toward even more openness in data in the United States as well. In the
1980s, the Reagan administration proposed a policy of commercialization of all data that were collected
under grants supported by the National Science Foundation (NSF). That would have meant that
investigators would have to sell their data to anyone who wanted to use it. One division in the NSF, the
Division of Social and Economic Science, established a policy of not making a grant to anyone who
would not agree to put their data in a publicly accessible data archive before receiving the grant. If the
grantees did not follow through with what they had proposed, they would not receive another grant. The
policy had limited success because it only covered one division, and it went counter to the national policy.
However, in 1991, the U.S. Global Change Research Program, backed by the White House Office of
Science and Technology Policy (OSTP), established a policy of open access on data related to global
change. In 2005, the National Institutes of Health required data management plans for all of its large
grants. And this year the National Science Foundation promulgated a new data management policy, which
requires all grant proposals to include a data management plan. In sum, the role of data and the policies
for data are changing rapidly.
6. Implementing a Research Data Access Policy in South Africa
Michael Kahn
University of Stellenbosch, South Africa
Today I am going to speak about three things, broadly: South Africa as the gateway for the BRICS
Group, whose members are Brazil, Russia, India, China, and South Africa; its innovation system and
policy; and prospects for research data policy.
It is safe to say that South Africa represents, by world standards, a relatively small innovation system that
is quite dynamic. In the apartheid years, it engineered all sorts of bad things, but also some good things,
especially in the fields of health, plant and animal science, ecology, and environment. Those deep skills
prevail into the present. That is essentially why South Africa can play an important role as a
higher-education hub for the rest of Africa.
The story in Africa is very interesting. There are many African countries that are among the fastest-
growing economies in the world. This morning CNN referred to China as growing at a mere 9.7 percent
per year at the moment. The country that comes immediately after that is Ethiopia, at 8.5 percent. Angola,
Chad, Democratic Republic of Congo, Ethiopia, Mozambique, and Zambia will all have growth projected
over the next 5 years well in excess of 7 percent.
South Africa is currently growing around 3 percent and is struggling to break out of what appears to be a
natural confine of around 4 percent. But I want to draw your attention to scientific production in South
Africa. If we take the country’s scientific article production, as recorded on Thomson Reuters Institute
for Scientific Information (ISI), South Africa is the clear leader in Africa. The real question is whether
growth can be driven by science and research and development (R&D), or whether growth is driven by
industrialization.
Among the universities in Africa, the top 10 are all South African—hence, the higher-education hub.
Also, there are a number of fields in which South Africa has scientific impact somewhat above the world
average, such as immunology and space science. There is also a high level of particular expertise and
activity, and therefore, necessarily, a great volume of research data in agriculture, environment, ecology,
and geosciences. Furthermore, I should mention the increasing number of domestic journals that are
indexed to Thomson Reuters ISI.
When talking about data policy, you can only talk about it in the context of your entire innovation system.
In the case of Pakistan, mentioned this morning, the private sector plays a very small role. In the case of
South Africa, the private sector is the largest performer of R&D, but also the smallest producer of
scientific output in the form of articles, unlike Japan and even the United States, where many articles
emanate from private-sector addresses.
When you talk about an innovation system, you talk about the main actors: business, higher education,
and government. These operate in the pursuit of innovation activity, of which R&D is but one activity.
You also have to look at your financial system; your cultural and political norms; the regulatory
framework; the legal framework, including intellectual property; and information policy. If any one of
these is suboptimal, the others are not going to flourish.
In South Africa, we have the National Research and Development Strategy of 2002, which brought about
some reorganization of the science system related to reporting lines, which may or may not be significant.
It did lead to the development of changes to intellectual property law, as well as the introduction of a
forward-looking incentive for R&D, which rewards R&D, much as happens in the United States, with a
150 percent deduction in taxation. It also led to some initiatives in human-resource development, such as
a research chairs project modeled on that of Canada, and the introduction of centers of excellence in the
universities in various fields.
This has been followed by an innovation plan in 2008, which plays into the same theme of grand
challenges that you find in many countries. The five grand challenges identified are all highly data-
intensive areas:
1. Space science (i.e., remote monitoring and telemetry).
2. Energy, the hydrogen economy, and new materials or catalysis.
3. Farming to pharmaceuticals and biotechnology, plant and animal science.
4. Global dynamics (i.e., climate change) and remote sensing.
5. Human and social dynamics (i.e., social sciences).
Then we have the introduction of the Technology Innovation Agency in 2010, with a mandate of
converting R&D findings into commercial prospects. These initiatives, of course, operate within the
larger context of other national laws on employment equity as well as immigration law. These various
thrusts are now being driven forward through an industrial policy. South Africa is a very important
gateway in Africa, through its work with the New Partnership for Africa’s Development. We have also
been active in promoting international networking through our centers of excellence in high-speed
computing.
What about the prospects for research data policy? The country has a science system that arguably goes
back a century and a half, to the onset of large-scale mineral exploitation, so it is not a newcomer. There
are many countries in Africa where science systems date back perhaps 50 years or less. We have been at it
for a long time, in the same way that Brazil has been at it for a very long time.
South Africa is data rich at the system level. We have well-quantified data on R&D and innovation, and
educational statistics. Bibliographic information is held in a private database, which is not unusual.
Thomson Reuters, after all, is also private. We have good data on higher education, although this is
inadequately exploited, and we are busy building a new database called the Research Information
Management System, which will hopefully lead to better research management, both in our research
councils and in the universities.
What about data held by regulators? We have a clinical trials register. We have gene banks. We have data
on plant breeders’ rights, biodiversity, and indigenous knowledge compliance, and we have an ethical
clearance system built into the funding awards process. We also have the Promotion of Access to
Information Act, which allows access to this information. The Patent Amendment Act has introduced
some potential dysfunctions into the system, in that it might well constrain people from taking out patents
in South Africa as opposed to exporting their knowledge abroad, which is the exact opposite of what the
drafters of the act intended.
We are also data rich at the sector level. We have the South African Earth Observation Network. We have
radio astronomy data; seismic data; oceanographic, geological, and meteorological data; social science
data; and biodiversity data, including about aquatic diversity. We have an extremely able statistics
service.
The question is, who gets access to this and how can it actually be used? Regrettably, there is
fragmentation by default. Although Statistics South Africa has a mandate to coordinate national surveys,
that is as far as it goes. The national Department of Science and Technology (DST) has a mandate to
coordinate science budgets, but not to coordinate information. Indeed, it has experienced extreme
difficulty even in the attempt to coordinate science and technology budgets across government. Silos rule
in the government domain. Originally the DST mandate was to coordinate R&D budgets. It turned out to
be impossible, so it was widened to cover science and technology. That has been even more difficult.
We have a National Advisory Council on Innovation, but it lacks the authority to carry out its mandate. It
is currently undergoing its third review. We have the problem of management information systems being
designed for one task, but being forced to address other tasks; there are also resource limitations, such as
inadequate metadata and training.
So, what are the prospects for coordination? I would like to be optimistic and say they are reasonable and
perhaps improving. There is a commitment to monitoring and evaluation at the highest level. We now
have in the presidency a minister for monitoring and evaluation. That minister has required each of the
ministers of state to enter into an indicator-based performance agreement. This, unfortunately, might
bring about perverse behavior. If you are asking me to account for myself, it is in my interest to set the
bar as low as possible, because then I will be sure to succeed, and I will get a pat on the back.
We also have a commitment to big science, and that necessitates a lot of work on data coordination. South
Africa has given strong support to the African Union’s work on science and technology policy, in the
form of promoting high-speed, wide-area networking and data sharing, and supporting the development
of suitable infrastructure. We also have a commitment to the Organization for Economic Co-operation
and Development guidelines for access to publicly funded research. A review of the science system is
currently under way, which provides an opportunity to accelerate progress.
I must also draw your attention to the United Nations Educational, Scientific and Cultural Organization
(UNESCO) study on the social sciences, Knowledge Divides, which was published in 2010. It looks quite
critically at the issue of social data.
To conclude, I would like to focus on the North-South relationship. I parody this relationship as one of
academic hunters exploiting data gatherers. With due respect to my hosts: “We from the North are the
hunters, and you in the South are the gatherers. Collect the data, and we will process it for you. We will
keep the datasets, because we cannot trust you to manage or to exploit them properly.”
These kinds of interactions lead to mechanisms that restrict inquiry. There are many African countries
that have been exploited because they have very interesting archaeological or anthropological resources,
and people from other countries wish to come in and get their Ph.D.s, carry out interesting research, and
get the credit for that. As a result, hardly surprisingly, many countries have closed their doors. In addition,
these unbalanced relationships are used to restrict other kinds of research, particularly social science
research that might turn up unwelcome findings.
Finally, there is a large amount of information that has not been digitized, and therefore is not easily
accessible. The promotion of legislation for data curation and archiving, and for open-access publication,
would be desirable as well.
7. Access to Research Data and Scientific Information Generated with Public Funding in Chile
Patricia Muñoz Palma
National Commission for Scientific and Technological Research, Chile
INTRODUCTION
The progressive and significant increase of public funding for the development of scientific research in
Chile, together with advancing new technologies and international institutions, imposes a new challenge: to
ensure access to, and promote the reuse and preservation over time of, scientific and technological data
generated using public funds. I will describe some policy initiatives addressing access to research data and
scientific information that are currently under study in Chile.
The situation in Chile can be summarized as follows: Over 80 percent of research activity is supported by
public funding, and these funds are oriented mainly to researchers, universities, and research institutions.
The results of the research are published in scientific journals and proceedings, and are available in both
electronic and print format. Most research data are not available for access or consultation, and commonly
are managed by the researchers themselves.
In order to design and implement a policy initiative addressing access to data, besides the practices of
researchers and institutions, other aspects must be considered, such as legal frameworks, agreements
between beneficiaries and funding agencies, intellectual property, copyright, and transparency law, and
international recommendations of institutions such as the OECD, ICSU, and CODATA.
In this context, CONICYT (the National Commission for Scientific and Technological Research) takes
into account the country’s need to have an adequate infrastructure and to generate policies that facilitate
access to research data and scientific information. For this reason, we have developed the three
main initiatives described below.
THE FIRST INITIATIVE: A Program for Management of Research Data and Scientific
Information
The principal objectives of the proposed program are to manage, strengthen, and guarantee access to data
and scientific information collected and produced in Chile from public funding, in order to:
• Facilitate and promote access to, and exchange of, data and scientific information
generated in the country;
• Preserve the scientific patrimony (i.e., specific elements of agreed national importance);
• Promote utilization of international standards for management of data and scientific and
technological information in public and private institutions;
• Create information products and services as added-value tools for scientific, productive,
and economic development of the country;
• Systematize the process of management of data and scientific information with the goal
of obtaining data that will allow the creation and improvement of national indicators regarding
investment, transference, and impact of public investment in the national system of science and
technology.
The program has considered four strategic items:
1. Institutional framework: Respond to the needs of the country with a broader vision for
development of science and technology and innovation.
2. Specific studies: Generate basic inputs for decision making and generation of policies.
3. National and international linkages: Create incentives for development of national and
international networks.
4. Human capital: Develop instruments allowing for specialization of professionals in these
areas.
The most important product of this program will be the design of a National Platform for Access to
Science, Technology, and Innovation Knowledge. This platform will allow consulting and access to
research data and information, and will enable their storage and preservation. This platform will involve
all the relevant institutions, such as universities and technology research centers, participating in the
process of production of technological and scientific knowledge, including the collection and
processing of data and the production of specific reports.
THE SECOND INITIATIVE: Study of the State of the Art of Access and Management of Research Data
and Scientific Information Generated with Public Funding
This initiative was the first step for the implementation of the above-mentioned program. Its goal was to
gain a basic understanding of international and national practices on access to data and information.
The specific objectives of this study were to:
• Identify the international state of the art on these topics;
• Describe the practices and policies for access to research data and scientific information
in Chile; and
• Generate recommendations related to management of research data and scientific
information for the Chilean institutions, considering the international context.
This study uncovered several deficiencies in Chilean practices, such as the ones explained in more detail
below:
Institutional framework:
• Lack of awareness, at all institutional levels, of the relevance of maintaining
adequate data management practices.
• Responsibility for the management of data and scientific information is spread
throughout organizations, lacking one specific organizational unit in charge of it.
• Data management within the organizations is financed through individual projects, rather
than being treated as an activity that requires continuity and therefore a permanent source of funding.
• Patrimony is not seen as an institutional and national asset, but as researchers’ personal
resources.
• Lack of standardization of data and the formats of scientific information.
Human capital:
• Lack of critical mass of human resources specialized in data management.
• Scant supply of formal and specialized academic programs focused on data management.
• Lack of incentives to create upward career paths in the field.
• No leadership or national references in the field.
• Unstable and short-term work positions.
Patrimony:
• Lack of or weak valuation of patrimony.
• Patrimony scattered throughout organizations.
• Patrimony considered personal, rather than institutional or national.
• Preservation and formatting according to individual criteria.
• Lack of initiatives for sharing or exchanging.
• No culture of giving access to patrimony.
Policies:
• Lack of institutional policies addressing access to research data and scientific
information.
THE THIRD INITIATIVE: Development of a Policy on Management of Research Data and
Scientific Information
The goal of this policy for the main Chilean public funding institution (CONICYT) will be to implement
a series of standards for capture, registration, and management of data and information systems to be used by
all beneficiaries of public funding.
The principal objectives of this activity are to:
• Optimize and rationalize use of public resources involved in generating and managing
scientific knowledge;
• Increase access to research data and scientific information; and
• Adopt and attain international standards, including OECD recommendations.
In order to implement this new policy for CONICYT, the next steps will be to:
• Generate awareness in the scientific community, and among academics, public
employees, and public agencies about the relevance of having access to research data and
scientific information; and
• Develop a network of institutions and individual professionals interested and willing to
collaborate in the implementation of this policy.
CONCLUSIONS
These initiatives represent a significant challenge for CONICYT. As the main Chilean funding agency for
scientific and technological research, it is aware of the value that access to research data and scientific
information adds to the process of knowledge production. The implementation phase that comes next will
encounter resistance at different levels, from legal barriers to deeply established notions of ownership of
the data produced by researchers. We trust CONICYT will be able to gather the support of the scientific
community and political leaders, and be able to align them around the goal of boosting the quality and
quantity of the national scientific and technological production.
8. The Management of Health and Biomedical Data in Tanzania: The Need for a National Scientific
Data Policy
Leonard E. G. Mboera
National Institute for Medical Research, Tanzania
Sound statistics are a key component of evidence. This is the main reason that scientific
research is increasingly dependent on successful access to data. In particular, the findings of
research conducted in one place can be useful or relevant somewhere else. Many
developed countries have established national policies and
programs for data management and access, but the situation is different for developing countries. This
includes Tanzania. Most of those countries have either a weak policy for management of
data and access or none at all.
However, there have been some initiatives taking place in the African region that are worth mentioning.
For example, the Algiers Declaration (2008) aims at improving
• The availability of relevant and timely health information, and access to global health
information;
• The management of health information through better analysis and interpretation of data;
• The availability of relevant, ethical, and timely research evidence;
• The use of evidence by policy makers and decision makers;
• The dissemination and sharing of information, evidence, and knowledge; and
• The use of information and communication technologies.
As for the current situation in Tanzania, we have developed guidelines on data transfer, but there is no
policy on research data sharing yet. Difficulties also exist in accessing and sharing the scientific data that
have been collected by different researchers. Specific barriers to the access and sharing of scientific data
collected by researchers using either public or donor funding include scientific and technical, institutional
and management, economic and financial, legal and policy, and normative and sociocultural aspects.
The National Institute for Medical Research (NIMR) plays a great role in health research and data sharing
in the country. For example, Parliamentary Act No. 23 of 1979 gives the institution a mandate to work
on some key issues: (1) to establish a system to register the findings of medical research carried out
within Tanzania, and promote the practical application of those findings for the purpose of improving or
advancing the health and general welfare of the people of Tanzania; and (2) to establish and operate
systems of documentation and dissemination of information on any aspect of the medical research carried
out by or on behalf of the institute. This is the basic mandate that the NIMR has had since its
establishment more than 30 years ago.
Regarding management of health research in the country, the situation is not very good. Researchers in
Tanzania, as in other developing countries, lack the norms and traditions of open data sharing for
collaborative research. If you go to individual institutions within the country, you will find that many
institutions are treating their data either as secrets or as commercial commodities. They are not really
willing to say, for example, “Here’s the data. Let’s do something with it. Let’s make sure that we inform
the world about the research we have been doing.” Also, although the government does not actively
protect such data, it lacks policies that provide guidance or identify responsibilities for the researchers in
making research data available for others to use. Moreover, Tanzania does not have a central data center
or digital repository in place where researchers can submit their data for use by others. This is a key
challenge. We are saying that researchers are not willing to submit their data, but if they are, where are
they going to put it to share with other institutions or with other people?
Next, I would like to talk about the issue of research data transfer in the country. What are the guiding
principles for data-transfer agreements in Tanzania? Tanzania introduced a procedure for data transfer
between Tanzanian and foreign institutions in 2010. The Data Transfer Agreement (DTA) is used as the
only legal document by institutions in Tanzania to regulate the uses of data they provide to specific
research projects inside and outside the country. It also provides opportunity for institutions to claim co-
ownership of the improvements made from data the recipient has acquired. Upon approval of the DTA,
the agreement becomes valid and the recipient is granted exclusive access for a period of time, depending on
the duration of the project, after which the data are placed in the public domain.
Before concluding, I want to inform you about publication procedures in the country. No research work
can be published without getting permission from the National Institute for Medical Research (NIMR).
The procedure is simple. When you want to do research in the country, you need to get ethical clearance
from the NIMR.
Finally, I also want to tell you about some new initiatives within the NIMR. The
NIMR is in the process of establishing a Web-based National Health Research Data Repository. The
primary purpose of this project is to provide a storage database for all health research work that has been
conducted in the country. We want to put together all research work conducted in different places in the
country in a central database, whereby it can be accessed.
This has not been easy, and we still have a number of challenges. The financial challenge is the big one.
Also, issues of human resources are a challenge, especially regarding technical staff. Furthermore, we
have challenges related to infrastructure. We did not give up, though. The NIMR has different centers and
stations around the country. Each station or center has a purpose for conducting particular research. Some
centers focus on malaria; others focus on tuberculosis or neglected tropical diseases. The data repository
for the NIMR will collect information from these centers and stations. Once we are done with them, we
can move on to other institutions.
To conclude, with the increased need for data- and information-sharing globally, it is time now for
Tanzania to develop and implement a national research data policy. This policy, among other things, will
oversee the generation, storage, transfer, and use of research data among local users, and between local and
international users, of research findings. Also, we need to have a formal mechanism established
to govern the accessibility and use of created research data.
9. The Data-Sharing Policy of the World Meteorological Organization: The Case for International
Sharing of Scientific Data
Jack Hayes
U.S. Permanent Representative to the World Meteorological Organization,
NOAA Assistant Administrator for Weather Services, and
Director, National Weather Service
The mission of the National Oceanic and Atmospheric Administration (NOAA) National Weather Service
is to use the information that science has equipped us with to provide forecasts and warnings to protect
life and property and enhance national economies. This will be the focus of my talk.
From the U.S. perspective, we have long believed that free data sharing is a benefit to the United States,
and that information created with U.S. tax dollars must benefit society
and must be made available to people. My focus will be on how it benefits the United States today. I will
conclude with some additional thoughts about where free and open access can benefit capacity building in
developing countries, and then, as a weather service operator, how I see that benefiting society.
If you are not familiar with the World Meteorological Organization (WMO), it is a specialized agency of
the United Nations that was created after World War II, in 1951, although it had existed in various forms since
the 1870s. The WMO has 190 members. Its mission is to focus international partnerships on the
collection, production, and exchange of weather, water, and climate information in order to protect life
and property, enhance national economies, and preserve environmental quality. There are two important
WMO resolutions related to data sharing:
Resolution 40: “WMO commits itself to broadening and enhancing the free and unrestricted international
exchange of meteorological and related data and products.” This resolution was passed by the WMO
Congress in 1995 to provide free and unrestricted international exchange of meteorological and related
data and products across the world. In 1999, Resolution 25 established the same basic policy across the
190 members for hydrological data and products.
Resolution 25: This resolution calls for “committing to broadening and enhancing, whenever possible,
the free and unrestricted international exchange of hydrological data and products, in consonance with the
requirements for WMO’s scientific and technical programs.”
The way this policy is implemented across the globe is through giving the national owner of data and
products the ability to discriminate between the data that are essential to the core mission—to protect life
and property—and the data that are called supplemental. Generally the protect-life-and-property data and
products are free and open, and supplemental data have variable degrees of openness, depending on the
country.
U.S. data policy has its foundations in the Office of Management and Budget’s Circular A-130:
“…government information is a valuable national resource, and … the economic benefits to society are
maximized when government information is available in a timely and equitable manner to all.” What we
have found in the United States is that free and unrestricted data sharing maximizes the value for the
information we collect and create within the National Weather Service. We also find that the reverse is
true. Where we have barriers, we underutilize the data that we collect and produce. Fundamentally, the
more people use our information, the more value it has.
To fulfill our mission to protect life and property and to enhance the economy, we collect and produce
information that can be used to do that, but this data can be used for other purposes as well. There is a
commercial weather sector that uses our data, adds value to it, and provides products and services for very
specific needs. This, we find, has aided economic growth within the United States.
Let me give some specific examples. When an F5 tornado hit the community of Greensburg, Kansas, in
2007, the National Weather Service detected it. The tornado occurred at night. It formed in northern
Oklahoma and moved across the border into Kansas, where it leveled Greensburg. While the community
expected hundreds of people to lose their lives, the number was about 10. This was possible because the
National Weather Service detected the tornado and, in partnership with the media, alerted people via
television, radio, and NOAA weather radio, so that they had plenty of advance warning. It was a
partnership that we had with the emergency management community that worked with the hospitals, so
that we had moved ambulances outside of the town that was going to be struck to a position where they
could be of more use.
What did the commercial weather sector do in this incident? At that time, there was a train headed right
into the path of that tornado outbreak. There was a commercial weather company that used the
information we collected to relay a warning to the dispatcher in Omaha, Nebraska. They put a hold on
that train several miles east of the path that we projected for the tornado outbreak. That train stopped and
the company saved the lives of that train crew and the train itself.
To emphasize the value of data sharing in this incident, I would like to compare it with a very similar
tornado outbreak in the early 1950s in Udall, Kansas. In that incident, a very similar tornado moved out of
Oklahoma into Udall at night and destroyed the city. As a result, 100 people lost their lives. I would say
that the free and open data exchange, and the partnership created between the private and public sectors in
the United States, has allowed us to reduce the loss of life over the past 30 to 40 years.
Here is another example. There are various motor raceways across the United States. In the partnership
we have with the private sector—for example, the Pocono Raceway in Pennsylvania—our job is to
monitor severe weather and to ensure that the raceway has the warning it needs to get the people attending
out of harm’s way. The private sector will use information we collect about temperature, humidity,
wind—anything that might affect the conduct of the race or the comfort of the people there, add value to
it, and give it to the raceway officials, to the racers, and to the people in general. That is a partnership that
increases the value of the information.
In one last example, in partnership with the Federal Aviation Administration and the Department of
Defense, we have roughly 150 Doppler radars spread across the United States. We use the data from these
radars to detect severe weather and to warn others of severe weather. The commercial weather industry
provides applications. If you have a BlackBerry or a cell phone that has an imagery capability, you can
get access to that imagery.
I will switch to a discussion of Europe now. Different countries in Europe have different models. In
general, however, many countries supplement their federal budgets by charging for some data and
products that they produce. Protect-life-and-property data and products are generally freely available, and
their warnings and observations are too. The warnings, in general, are simple cartoons, not as vivid as
what we can produce in the United States. The observations are generally conveyed as text rather than as
a visual representation.
Let me talk about capacity building in developing countries and give you an example of where a united
effort sponsored by the WMO is certainly going to enhance the capacity within developing countries. In
the wake of the December 2004 tsunami in the Indian Ocean, nations of the world got together, through
the WMO and the Intergovernmental Oceanographic Commission, and created a tsunami warning system that
is used internationally. It is that free and open exchange of information that aided and helped us move that
along. One effort that I have been personally involved in, in southern Africa, is the Severe Weather
Forecasting Demonstration Project. I have long objected to the use of the word “demonstration,” because
basically it means that we take model information based on geophysical equations and apply it to
predicting and warning citizens. I do not think we need to demonstrate that; we have seen that for 30 or
40 years in the United States. It is a matter of growing actual predictive capabilities in developing
countries.
What we have done is ally with other similarly inclined developed countries and create in South Africa
the ability to bring in global model data. The government of South Africa runs a regional very-high-
resolution model, combines it with satellite information, and creates products and data that it distributes to
16 developing countries in the Southern African Development Community. Similar initiatives have been
undertaken in East Africa, Northwest Africa, the Southwest Pacific, and Southeast Asia, as well as the
Caribbean. These initiatives are regionally driven: a subregion or a region makes a statement
about its environmental threats, needs, and priorities. Under the WMO auspices, developed countries help
to develop capacity so that we can transition to those countries things that we have become used to in the
United States.
Looking at a world picture over the past 50 years, increased data sharing has contributed to a major
reduction in loss of life. In the 1956-to-1963 time frame, 2.6 million people lost their lives due to
environmental disasters related to weather, water, and climate. Fast-forward to 1996 through 2005, when
220,000 lost their lives. This is still too many, but it is an order-of-magnitude reduction in loss of life due
to weather, water, and climate.
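As a rough check on the "order-of-magnitude" statement, the two totals quoted above can be compared directly. The sketch below is illustrative only: the death counts and year ranges are taken at face value from the talk, while the variable names and the per-year averaging (which accounts for the two quoted periods differing in length) are assumptions added here and are not part of the original presentation.

```python
# Figures as quoted in the talk; everything else is an illustrative assumption.
deaths_early = 2_600_000   # quoted total for 1956 through 1963
deaths_late = 220_000      # quoted total for 1996 through 2005

# Length of each quoted period in years, counting both end years.
years_early = 1963 - 1956 + 1
years_late = 2005 - 1996 + 1

# Ratio of the raw totals, and the percentage reduction between them.
total_ratio = deaths_early / deaths_late
percent_reduction = 100 * (1 - deaths_late / deaths_early)

# Ratio of average deaths per year, since the periods are not the same length.
annual_ratio = (deaths_early / years_early) / (deaths_late / years_late)

print(f"ratio of totals:        {total_ratio:.1f}")
print(f"percent reduction:      {percent_reduction:.1f}%")
print(f"ratio of annual rates:  {annual_ratio:.1f}")
```

Under these assumptions, both the ratio of totals and the ratio of annual averages exceed 10, which is consistent with describing the change as roughly an order of magnitude.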
Looking to the future, initiatives such as the Global Earth Observation System of Systems (GEOSS) are
attempts to create data sharing for future scientific and operational benefit. As we look to the nine societal
benefit areas in GEOSS, we are already seeing the scientific interrelatedness. It will only grow. From my
perspective, free and open data exchange helps us advance the vision of GEOSS faster, cheaper, and
further than we could if we did not have that capability.
To conclude, from a U.S. perspective, free and unrestricted data sharing in our country has created great
value that we realize today. We strongly support initiatives to sustain and broaden it, both in the United
States and internationally.
10. DISCUSSION OF PART TWO BY THE SYMPOSIUM PARTICIPANTS
PARTICIPANT: I have a question for Dr. Hayes. When you mentioned the less-than-optimal policies of
certain European countries in data sharing, do you see an improvement in recent years in that respect?
DR. HAYES: The answer is yes. In the 1990s, there was very strong opposition to more free and open
data exchange policies. However, there is a trend in many of those countries today to adopt our model, if
not formally, at least informally.
PARTICIPANT: My question is to Dr. Kahn. Do you have any divisions between locally published
articles versus internationally recognized journals—Nature, Science, and even specialized journals?
DR. KAHN: That is indeed the very problem of utilizing Thomson Reuters or Scopus or any international
repository to measure publication output. It is naturally biased in three ways: language, field, and against
locally published journals. It is a restricted view, but it has become the gold standard. Garfield gave us the
ISI nearly 50 years ago, and it has retained its status up to the present time, but it has to be used with
caution, which is why I made the remark I did. If we take any country outside the core of the Organization
for Economic Co-operation and Development (OECD), there will be a host of local journals in the local
language, particularly in the social sciences, law, economics, and so on, that are not counted. To get a true
picture of the contribution to the knowledge pool, you have to extend what is included. We can assume,
of course, that these local journals are subject to peer review and have an appropriate frequency of
appearance. Plus, in the social sciences, publication channels differ. Books are more common than journal
articles.
PARTICIPANT: I have one partial correction for you and then a question. The correction is that the
implementation guidelines that you showed for the Global Earth Observation System of Systems
(GEOSS) are no longer proposed; they are actually accepted by the GEOSS plenary. That is progress.
The group is still working on pushing that further. There are a lot of issues that those guidelines and the
original principles do not address.
My question is that there were a lot of very interesting presentations about the responsibilities of the
developing countries’ governments to coordinate and get their data policies and act together. We have
heard less about what the responsibilities of developed countries should be, having had a tradition of
being hunters. From your perspective, how should developed countries behave? What practices should
they adopt for acting respectfully toward developing countries’ efforts?
MR. MAYALA: I think that there is a need for the developed countries to focus on what developing
countries are doing. Of course, if there is any support they can provide, then they should provide it. We
need to learn from each other. From there, we can move together.
DR. BALSTAD: A few years ago, the International Council for Science sponsored a study of various
priority areas in international science. One of them was data and information. The report that came out on
data and information was not intended to be a manual or a policy prescription, but something in between
that discussed major issues in data and information and some of the responses that have been developed
in the developing world. Part of it is that we are all learning in this new area, and we need to work
together. Some countries have gone farther down this road than others. It is vitally important that they
share information about how to do it.
DR. KAHN: Remember the medical oath of Hippocrates: First, do no harm. That is what doctors agree to
do when they are sworn into the profession. In asking developing countries to make data available, we
should expect that there would be reciprocity from the North. The reciprocity should include an
understanding of the defensiveness that might be encountered from the South, which is a residue of many
years of a lopsided set of relationships. A book like John Le Carré’s The Constant Gardener does not
come out of nowhere. There is lopsidedness and often a legitimate concern that we in the South are, as
one cabinet minister put it, a giant zoo for the benefit of others. The important thing is that the barriers
need to be negotiated away, lest the general scientific enterprise suffers.
PARTICIPANT: I would like to share some of our experiences and the challenges that we face with this.
Our project does computer modeling of infectious disease transmission, and for that, we typically use
high-resolution surveillance data from governments. For this, we tried to form partnerships with countries
in Southeast Asia (Laos, Cambodia, Vietnam, and Thailand), and in Africa (Niger and Nigeria). Our
approach has been to make agreements with ministries of health to use data collaboratively and then build
capacity inside the ministries to manage data, where needed, digitize paper records, and then put them up
in repositories that can be used, and also train people in modeling. In some countries, like Thailand, that
has worked very well. In other countries, we can make agreements with ministries to share and coauthor
papers and provide technical reports, especially in very-low-income countries, but our experience has also
been that the time that people in ministries of health or in governmental agencies have to interact with us
and to be trained is incredibly limited. They have very limited capacity to train people and then use
the data management system at all. So it has been a struggle for us to identify people and push
governments to see if there are people available to be trained. There is also the issue of lack of interest in
the data itself. We found that a lot of governments do not really seem to care too much or do not see the
value yet. In summary, our challenge is the sustainability of training people and making sure that after we
leave, the country retains the databases, data systems, and modeling capacity, as well as
the interest of people locally.
PARTICIPANT: My question is for Ms. Muñoz. You gave us a very comprehensive description of the
initiatives of the National Commission of Scientific and Technological Investigation (CONICYT) with
regard to promoting open data-sharing policies, but I wonder whether you could expand on the efforts of
CONICYT to promote a culture of data sharing with the international community for Chilean scientists.
MS. MUÑOZ: I think that we can work with the whole community to promote open access to data and
scientific information, but at this moment, we think that the most important initiative will be to develop
and implement the policy for management of publicly funded research data and scientific information,
because this is the missing piece.
PARTICIPANT: My question is about how you see the new paradigm coming out from the cloud-
computing utilities that, in a way, the scientific community is going to adopt. We can see that this
phenomenon is growing very fast in the United States. The National Oceanic and Atmospheric
Administration (NOAA), for example, is leading, with other agencies, this kind of phenomenon. I am
very interested in your perspective. The second question concerns how you see this new utility from the
technological point of view, as a way to boost collaboration among developing countries. I am
representing Italy, and am very interested to see how we can boost this transatlantic dialogue between the
United States and Europe in order to support developing countries.
DR. HAYES: From a NOAA perspective or a National Weather Service perspective, I put on rose-
colored glasses and say, “How could we benefit society?” Then I take the glasses off and say, “What
constraints do I have to live with?” One of my challenges here is to increase the value of weather
information to the United States. I think that creating a concept like an open National Weather Service—
which would include the public sector, the private sector, and the academic sector—would ensure that.
This allows research managers to talk with operations managers and encourages data-sharing partnerships
among a variety of missions.
Given the importance of water to our future, we have launched an integrated water resources science and
services effort within NOAA. There are 21 federal agencies in the United States government that produce water
information. I would think that we could partner across the U.S. federal government, share data and
information, and increase their value to the United States. I would broaden that partnership to the
private sector as well.
Another example is one that we have just started. It has to do with space weather. We are going to have a
solar maximum in 2013. The infrastructure that we have created is more technologically advanced, but it
is also more vulnerable to things that happen on the sun than it was even 20 years ago. I envision a future
where we could create an international partnership around the globe through an organization like the
World Meteorological Organization (WMO) to bring together academic, operational, and private-sector
resources to benefit society.
PARTICIPANT: I would like to address a question to Dr. Hayes about what might be related issues and
possibly also get some response or input from our colleagues from Africa. Is there any information
available about trends in the participation of developing countries in the WMO GTS (Global
Telecommunication System) weather system? Do you see more data coming into the WMO from
developing countries, particularly Africa? Also, has the WMO become more active in recent years in
providing services or products, above and beyond the basic weather forecasting, that would be of special
benefit for developing countries? Again, I ask particularly with respect to Africa.
DR. HAYES: I will try to address it, and if I miss something, you can come back to me. First,
observations are not inexpensive. You can imagine a developing country has similar challenges. We also
have commercial interests that are threatening; for example, 4G mobile communication standards are
invading the space that used to be reserved for meteorological sensing and data transmission. Second,
going beyond weather, there is a concept in WMO called the Regional Specialized Meteorological Center.
It really, I think, stemmed from Chernobyl, where some countries had a nuclear-dispersion capacity and
could share that with regions and make those products available. I certainly see that capability. As it
applies to Africa, I think the Severe Weather Forecasting Demonstration Project that WMO is running is trying to
anchor within a subregion of Africa not just the computing and distribution infrastructure, but
the ownership as well.
PARTICIPANT: This is a question for panelists Muñoz, Mayala, and Kahn. In developing your
respective national science data policies, did you find the greatest resistance coming from political forces,
economic or business forces, the scientific community, or just plain inertia? In other words, we have not
needed it before; why do we need it now?
DR. KAHN: Previously, I worked for our Human Sciences Research Council at a time when we were
busy debating the meaning of the OECD 2007 guidelines, which are great and noble and agreed upon by
all. The Human Sciences Research Council, where I worked, was a research performer, unlike the
Economic and Social Research Council of the United Kingdom, which is a grant maker. The parody at the
time was that our universities wanted only two simple things from us: all of our research funds and all of
our data. The parody was, you show us yours and we will show you ours. It is a very critical issue. We are
moving toward an open-access data policy. Certainly the national Department of Science and Technology
believes that this is a good thing, but at the same time you get what I will call a “techno-nationalism” that
creeps in. It is even present in what I heard this morning from Tanzania. We have been cheated in the
past, and therefore we have to be on guard to ensure it does not happen in the future. So you get these
countervailing tendencies. On the one hand, there is something we might even call data sovereignty. On
the other hand, there is data openness. We all agree that data openness is the way to go, but in the
evening, when we have a drink with our party colleagues, we say, “No, no, no. The nasty foreigners are
going to take it all away from us, and we have to be on our guard all the time.” It comes from two sides.
You asked where most of the resistance comes from. I think there is a political dimension. The business
sector is another case, because in South Africa the greatest volume of research and development is done
by the business community. In that sense, we are more like an OECD core member state than an emerging
economy. However, the business community does not publish very much in the scientific literature, so
that does not really come up. The complication arises when you have a research project, and this takes us
to the Bayh-Dole Act in your country—a research project that is jointly funded by business and public
funds. How do you now decide which piece of data goes into the public domain and which remains
behind the company walls? There is resistance. It is political. It is nationalistic, techno-nationalism, as I
said. There are also proprietary restrictions.
MS. MUÑOZ: I think that the big problem for implementing this in Chile is our scientific community,
because of the cultural aspect. In this context, the researchers are not aware of how important these
matters are to the development of the country, or of their public value.
MR. MAYALA: I want to give a little bit of experience from my country. First of all, I think it is known
that you cannot put science and politics in the same pot. It is very difficult to act together. In my country,
we want to come up with something that people will see. There are those groups who wait and see and
then they react. We are in that kind of situation. I mentioned that we are trying to build a data repository.
We hope that when it is out, we can get an appropriate reaction from people.
PARTICIPANT: I have a question for Dr. Kahn. You mentioned that South Africa is now a member of
the BRICS Group. I presume that means more cooperation and partnerships with the large developing
countries that are members of that network. You also mentioned that there is the issue of South Africa in
Africa. You did not talk about that as much. Could you describe some of the partnerships and networking
that are going on in South Africa with other African countries?
DR. KAHN: First, from an economic standpoint, South Africa accounts for something like 30 percent of the
continent’s gross domestic product and around 70 percent of its scientific production. Those two walk
together. South Africa, for the last 10 years, has been paying the bill to host the New Partnership for
Africa’s Development and now the African Union’s Science and Technology Secretariat, which is hosted
in our Council for Scientific and Industrial Research in Pretoria. In addition, I mentioned the role that we are
playing in supporting higher-education development. I mentioned a figure of 50,000 students from across
Africa studying in South Africa, out of a student population of around half a million. It is close to 10
percent. Of these 50,000, fully 30,000 come from an economic grouping known as the Southern African
Development Community of 15 southern African countries, including Madagascar. Those students are
regarded as home students. They pay exactly the same fees as I pay for my own children. So, South
Africa is donating 1 billion Rand (the equivalent of 122.1 million U.S. dollars) a year to support students
from the Southern African Development Community. This is very important.
Because of the economic dominance of the country, there are many international agreements to which we
are a party and where we play a central, coordinating role, ranging from air traffic navigation to weather
and climate mapping and the like.
Southern Africa’s achievements lie in environmental modeling and monitoring across borders, as well as
in the harmonization of transport links. All of these involve considerable research and data sharing. So we
in South Africa are playing a role both in capacity development and in data sharing in numerous fields,
across southern Africa in particular, but further afield as well. A lot of this, of course, is driven by self-
interest, and our business activities to the north are well supported by this research investment over many
years.
PARTICIPANT: Some international organizations still have contributing members who are not following
those data policies. How can we provide incentives rather than just examples as we move
forward? Are there lessons to be learned from other communities that have had success?
DR. HAYES: I will speak from a meteorological perspective. One of the challenges in our community is
that ownership of the data is country dependent. The philosophy we use in the United States is not the
same philosophy they use in Russia. It is more than data. It is how the government is organized and how
they fund the services. While we may not feel a threat in the United States, the Russian
Hydrometeorological Service does. You have to almost deal with those one at a time, country by country,
and be willing to sit down and try to find a way to reduce the perception of threat.