Summary
Like most areas of scholarship, mathematics is a cumulative discipline: new research is reliant on well-organized and well-curated literature. Because of the precise definitions and structures within mathematics, today’s information technologies and machine learning tools provide an opportunity to further organize and enhance discoverability of the mathematics literature in new ways, with the potential to significantly facilitate mathematics research and learning. Opportunities exist to enhance discoverability directly via new technologies and also by using technology to capture important interactions between mathematicians and the literature for later sharing and reuse.
In most scientific disciplines, including mathematics, Web-based access to digital resources representing the disciplinary literature is now mature and quite effective. Through a mixture of open and proprietary tools, mathematicians are able to search the enormous and very rapidly growing literature using attributes such as subjects, titles, authors, dates, and keywords; they can follow chains of citations among works backward and forward in time. While much information is contained in individual items in the mathematical literature, a greater amount of information is represented by the way they are linked. This is not just via references but through the interrelation of concepts, insights, and techniques as they are developed, refined, and spread from one mathematical discipline to another. For example, if mathematicians were able to search the literature for instances where a specific equation was used or solved, it would allow them to consider alternative approaches toward solving their own research questions. This search capability could be facilitated through the use of a database of machine-generated and human-cultivated information about the mathematical literature, which would also allow a variety of other capabilities to be built.
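As a minimal sketch of the equation search described above, the following Python builds a toy index from normalized formula strings to the papers that contain them. Everything here is hypothetical: the `normalize` rule (renaming single-letter variables in order of first appearance), the paper identifiers, and the sample formulas are illustrative assumptions, not a method proposed in the report. Real formula search would need structural matching well beyond string normalization.

```python
import re
from collections import defaultdict


def normalize(formula: str) -> str:
    """Strip whitespace and rename single-letter variables to v0, v1, ...
    in order of first appearance, so x^2+y^2 and a^2+b^2 share one key."""
    compact = re.sub(r"\s+", "", formula)
    names = {}

    def rename(match):
        # Reuse the existing name for a repeated variable, else assign the next one.
        return names.setdefault(match.group(0), f"v{len(names)}")

    # Match only isolated letters so that names such as "sin" stay intact.
    return re.sub(r"(?<![A-Za-z])[A-Za-z](?![A-Za-z])", rename, compact)


def build_index(papers):
    """Map each normalized formula to the set of paper ids that contain it."""
    index = defaultdict(set)
    for paper_id, formulas in papers.items():
        for formula in formulas:
            index[normalize(formula)].add(paper_id)
    return index


def search(index, formula):
    """Return the sorted ids of papers containing an equivalent formula."""
    return sorted(index.get(normalize(formula), set()))


# Hypothetical corpus: ids and formulas are made up for illustration.
papers = {
    "paper-A": ["x^2 + y^2 = z^2"],
    "paper-B": ["a^2+b^2=c^2", "f(x) = sin(x)"],
}
index = build_index(papers)
print(search(index, "p^2 + q^2 = r^2"))
```

The query uses different variable names from either stored formula, yet both papers are found because all three strings normalize to the same key. A human-curated layer could then correct the cases where such a crude rule merges formulas that are not really the same.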
This report discusses how information about what the mathematical literature contains can be formalized and made easier to express, encode, and explore. Many of the tools necessary to make this information system a reality will require much more than indexing and will instead depend on community input paired with machine learning, where mathematicians’ expertise can fill the gaps left by automation. The Committee on Planning a Global Library of the Mathematical Sciences proposes the establishment of an organization; the development of a set of platforms, tools, and services; the deployment of an ongoing applied research program to complement the development work; and the mobilization and coordination of the mathematical community to take the first steps toward these capabilities.
Mathematics today has the opportunity to expand and redefine the way in which mathematical knowledge is represented and used, the character of the mathematical literature and how it evolves, and the way that mathematicians interact with this collection of knowledge. This new relationship with the literature and the mathematical knowledge corpus goes beyond new forms of access and analytical tools; it must also include the tools and services to accommodate the creation, sharing, and curation of new kinds of knowledge structures.
To be clear, what the committee proposes builds on the extensive work done by many dedicated individuals under the rubric of the World Digital Mathematics Library,1 as well as many other community initiatives.2 Comparing desired capabilities going forward with what has been achieved by these efforts to date, the committee concludes that there is little value in new large-scale retrospective digitization efforts or further aggregations of mathematical science publications (both traditional journal articles and newer preprint, blog, video, and similar resources) beyond the federation of distributed repositories already achieved through existing search services. Nor is another bibliographically based secondary indexing service needed at this time. Necessary incremental improvements will likely continue to occur in these areas, but they do not require an initiative on the scale of what is being called for in this report.
1 The World Digital Mathematics Library rubric has been used by a variety of organizations for many distinct projects. A history of many of these efforts and the current state of the art can be found on the wiki page from the International Mathematical Union’s Digital Mathematics Workshop in June 2012, http://ada00.math.uni-bielefeld.de/mediawiki-1.18.1/index.php/.
2 Examples include the Encyclopedia of Integer Sequences, the NIST Digital Library of Mathematical Functions, and the Guide to Available Mathematical Software.
The real opportunity is in offering mathematicians new and more direct ways to discover and interact with mathematical objects and mathematical knowledge through the Web. The committee’s consensus is that by some combination of machine learning methods and community-based editorial effort, a significantly greater portion of the information and knowledge in the global mathematical corpus could be made available to researchers as linked open data3 through a central organizational entity, referred to in this report as the Digital Mathematics Library (DML).
The DML would aggregate and make available collections of ontologies, links, and other information created and maintained by human contributors, curators, and specialized machine agents, with significant editorial input from the mathematical community. The DML would enable functionalities and services over the aggregated mathematical information that go well beyond simply making publications available, to include capabilities for annotating, searching, browsing, navigating, linking, computing, and visualizing both copyrighted and openly licensed content. While the DML would store modest amounts of new knowledge structures and indices, it would not generally replicate mathematical literature stored elsewhere. Instead, it would strive to represent the mathematical knowledge presented within a publication and illustrate how it is connected with other resources.
While the committee believes that the DML could begin development soon, it notes that this work would need to be complemented by an ongoing research program to fill in gaps, improve quality and performance, increase the robustness of available technologies, and increase the automation of processes that still rely heavily on human intervention.
The DML would facilitate discovery of and interaction with mathematical information from diverse sources with varying levels of copyright restriction. The committee envisions the DML as a growing corpus of public-domain and openly licensed mathematical information, Web services, and software agents, which would coexist with present mathematical publishing and indexing services for the foreseeable future.
3 Broadly defined, linked open data are structured data that are published in such a way that makes it easy to interlink them with other data, therefore making it possible to connect them with information from multiple sources. These connected data can provide a user with a more meaningful query of a subject by consolidating relevant information from a variety of places (e.g., in different research papers) and pulling out specific components that the user might be particularly interested in.
A key early issue for the DML organization is how to establish constructive and effective partnerships with existing publishers, Web services, and other resources, both those specific to mathematics and those serving the much broader scholarly community. Some of these partnerships might be challenging because of copyright concerns. However, establishing fruitful partnerships is essential to the success of the DML. While the DML would sometimes provide services and functional features that overlap with existing services and tools provided by both commercial and not-for-profit entities, the committee suggests partnering with current service providers whenever possible rather than replicating capabilities of existing resources.
For example, in MathOverflow,4 a question-and-answer website for research mathematicians, research articles and papers are often referenced in answers given. While the DML would not want to replicate the interface and social networking features of MathOverflow, it would be wholly appropriate for the DML to instigate and participate in a multi-party collaboration with MathOverflow and publishers of research mathematics to automatically capture citations entered in MathOverflow answers and republish them as linked open data annotations. In this scenario, the DML could help broker standard practices for interoperability and help maintain the software agents and annotation repositories that would allow publishers to make mathematicians coming to their websites aware of MathOverflow discussions potentially relevant to the papers they are viewing. The converse could also be supported. Posts on MathOverflow could be automatically annotated when errata or other commentary are added to the publisher’s website for an article mentioned in the MathOverflow post. This illustrates the potential for chains of annotations as a new mode of scholarly discourse (Sukovic, 2008). To visualize how an annotation chain might come about, begin by assuming that a post in MathOverflow referencing a particular article is automatically added as an annotation to this article on the publisher’s website. A subsequent reply to this annotation made by a reader of the publisher website is then automatically added to the thread on MathOverflow. A new reply subsequently added to the thread on MathOverflow is then automatically added as a further annotation on the publisher’s website, and so on. This would allow users of two disparate services (i.e., one scholar using MathOverflow and the other using only the publisher’s website) to nonetheless carry on a substantive discourse about published mathematics research in spite of the fact that each is using a different utility to access the publication being discussed.
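The citation-capture step in this scenario can be sketched in a few lines of Python. This is an illustration only: the simplified DOI pattern, the placeholder post URL, and the function names are assumptions, and the JSON shape loosely follows the W3C Web Annotation Data Model rather than any format the report specifies.

```python
import json
import re

# Simplified DOI pattern (an assumption; real DOIs allow more characters).
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def extract_dois(text: str) -> list:
    """Return the distinct DOIs found in a post, in order of appearance."""
    seen = []
    for match in DOI_PATTERN.finditer(text):
        # Drop trailing punctuation that belongs to the sentence, not the DOI.
        doi = match.group(0).rstrip(".,;)")
        if doi not in seen:
            seen.append(doi)
    return seen


def citation_annotations(post_url: str, post_text: str) -> list:
    """Build one linking annotation per cited DOI.

    The cited article is the target; the post that cites it is the body.
    """
    return [
        {
            "@context": "http://www.w3.org/ns/anno.jsonld",
            "type": "Annotation",
            "motivation": "linking",
            "body": post_url,
            "target": f"https://doi.org/{doi}",
        }
        for doi in extract_dois(post_text)
    ]


# Hypothetical answer text and placeholder post URL, for illustration only.
answer = "See Sukovic (2008), doi.org/10.1086/588444, for background."
annotations = citation_annotations("https://mathoverflow.net/a/0000", answer)
print(json.dumps(annotations, indent=2))
```

A publisher-side agent could look up annotations by `target` to show readers the discussions that cite an article. A later reply in an annotation chain might be expressed as a further annotation whose target is the earlier annotation, though how identifiers would be shared between services is exactly the kind of interoperability practice the DML would need to broker.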
Similarly, MathSciNet and Zentralblatt Math (zbMath) already classify research papers according to the Mathematics Subject Classification (MSC)5 schedule. The DML would not want to replicate this indexing. However, it might be beneficial for the DML to provide complementary indexing on other dimensions, for example by the occurrence in articles of well-known special functions (hierarchies of which are maintained by the National Institute of Standards and Technology (NIST)6 and by Wolfram Research7). With these indexes used in concert, one could then envision a collaboratively built interface that allows refinement of an initial MSC search via attributes such as which special functions are used in the articles that appear in the results from the MSC search.
4 MathOverflow, http://mathoverflow.net/, accessed January 16, 2014.
5 American Mathematical Society, 2010 Mathematics Subject Classification, http://www.ams.org/mathscinet/msc/msc2010.html, accessed January 16, 2014.
6 NIST, Digital Library of Mathematical Functions, Version 1.0.6, release date May 6, 2013, http://dlmf.nist.gov/.
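The refinement of an MSC search by special function can be sketched as a simple faceted filter. The record layout, the article identifiers, and the function tags below are hypothetical and chosen only for illustration; in practice the MSC codes would come from an existing indexing service and the function tags from a complementary DML index.

```python
from collections import Counter

# Hypothetical records: ids, MSC codes, and function tags are illustrative.
ARTICLES = [
    {"id": "art-1", "msc": {"33C10", "35Q60"}, "functions": {"Bessel J"}},
    {"id": "art-2", "msc": {"33C10"}, "functions": {"Bessel J", "Gamma"}},
    {"id": "art-3", "msc": {"11M06"}, "functions": {"Riemann zeta", "Gamma"}},
]


def search_by_msc(articles, prefix):
    """Initial search: keep articles with any MSC code starting with prefix."""
    return [a for a in articles if any(c.startswith(prefix) for c in a["msc"])]


def function_facets(results):
    """Count how many result articles use each special function."""
    counts = Counter()
    for article in results:
        counts.update(article["functions"])
    return counts


def refine(results, function_name):
    """Narrow the results to articles that use the chosen special function."""
    return [a for a in results if function_name in a["functions"]]


hits = search_by_msc(ARTICLES, "33")
print(function_facets(hits))
print([a["id"] for a in refine(hits, "Gamma")])
```

The facet counts are computed only over the initial MSC results, so the interface can offer the user just those special functions that actually occur in the articles already found.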
Such partnerships and collaborations are essential. It is vital that users see a well-integrated interface that incorporates both the DML services and commercial services for those affiliated with institutions that have access to the commercial services. The committee envisions the resources, services, and tools offered by the DML as coexisting with, and often enhancing, the offerings from existing players in the mathematical information landscape. The committee hopes that relevant organizations will contribute to the work of the DML in various ways, such as by providing financial support, allowing appropriate access to their content and services, or participating in the collaborative development, with the shared goal of enhancing the value of the mathematics literature. Building these partnerships would likely require significant negotiations and collaborations, and the DML organization would have to allocate much time and effort to their planning and execution.
The biggest challenge, however, will be in establishing the technical, organizational, and community-coordinating capabilities to deliver on the construction of the resources, services, and tools described earlier in this summary and then planning and implementing the development and deployment of the necessary systems. Some of the technologies required to build the requisite tools and services do not exist today or are not sufficiently mature. The committee sees the DML as having a minimal direct research role; rather, the committee believes that the establishment of the DML needs to be complemented by a long-term (5 to 10 years) commitment to a focused and applied research program that would encompass both needed technology, tools, and services and (to a lesser extent) independent research to understand how the DML is being used and how well it is working. Ideally, the commitment to fund this program could come in parallel with the commitment for the initial funding for the DML itself (whether from one or multiple sources). These research programs need to be well connected to the work of the DML. This could be achieved either by ensuring that the DML is deeply involved in the development of the calls for proposals and the subsequent proposal evaluation or by actually placing the DML in the role of a re-granting organization (although the committee sees some potential bureaucratic complications with the latter option).
7 Wolfram Research, Inc., The Wolfram Functions Site, http://functions.wolfram.com/, accessed January 16, 2014.
ORGANIZATION AND RESOURCES NEEDED
The committee’s vision of an incremental development of the DML starts with the creation of a small nonprofit organization, referred to here as the DML organization. The DML organization will need a small and dedicated paid staff, including a well-respected mathematician in a senior role, to ensure its development and growth. Other staff may be needed as the needs and status of the DML evolve, although much of the software development and operations could be contracted out. Ideally, the DML would be attached to and draw support from some host institution (a university, a research laboratory, or other organization) in order to facilitate sharing of services and to reduce overhead. The DML organization could be governed ultimately by the mathematical sciences community through organizations such as the International Mathematical Union and, thence, through their member organizations.
The first and foremost challenge that the DML will face is finding a set of primary funding sources that could support its initial development and early operations (a period of between 5 and 10 years). It is the committee’s hope that the DML would become a self-sustaining entity once some of its key capabilities are established and a potential sustainable business model is chosen from among options.8
For the first few years, perhaps the best approach would be to split operational governance from high-level, longer-term policy governance, because these two tasks will be quite distinct. Both in the short and the longer term, appropriate connections are needed between funding and revenue sources and governance, and these connections may well need to shift over time. Particularly in the early days, a light and agile governance mechanism is crucial. Upon launching the DML effort, there would likely be a coalition of partners with a commitment to the DML concept.
8 There are many lessons on sustainability to draw upon, including experiences with digital libraries (such as arXiv) and open or community source software as well as work on research data curation.
CONCLUSION
Like other scientific disciplines, mathematics is now completing a complex multi-decade transition from print to a digital system that closely emulates print for authors and readers. The mathematics community is thus at an inflection point where it has the opportunity to think about how its collective knowledge base is going to be constructed, used, structured, managed, curated, and contributed to in the digital world and how that knowledge base will be related to the existing literature corpus, to authoring practices in the future, and to the social and community practices of doing and learning mathematics. Colleagues in other disciplines (astronomy, molecular biology, genomics, chemistry) are in many cases well advanced in formulating their own discipline-specific answers that take into account disciplinary practices (such as the mix of experimental, observational, theoretical, and computational approaches) and the conceptual models that underlie disciplinary thinking.
Mathematics is unusual in many ways; it maintains a healthy and constructive relationship with its past, as documented in the literature of the field going back hundreds of years, and some of its literature has a long “shelf life.” The committee believes that refreshing and restructuring the corpus of mathematical literature and abstracting it into a knowledge base for future centuries is a valid and sound investment in the future of mathematical scholarship. The DML proposed in this report provides a platform and a context to achieve this and also offers a critical point of focus for the mathematical community in a genuinely digital environment to engage in discussions about the creation, curation, and management of mathematical knowledge.
REFERENCE
Sukovic, S. 2008. Convergent flows: Humanities scholars and their interactions with electronic texts. Library Quarterly 78(3):263-284, doi.org/10.1086/588444.