The following is a brief overview of some of the many information resources and tools currently available in mathematics and selected other fields, which offer some insight into the diverse ways that mathematics literature can be used.
GENERAL BIBLIOGRAPHIC RESOURCES
Library information services have well-established conceptual schemas and database tools for handling the first five classes of bibliographic objects listed in Chapter 2 (documents, people, organizations, events, and subjects) and the most common relations between objects in these classes. These library services are exemplified by the following cross-disciplinary databases and portals:
- WorldCat1—a union catalog that itemizes the collections of 72,000 libraries in 170 countries and territories that participate in the Online Computer Library Center (OCLC) global cooperative;
- Library of Congress—index of books, both academic and nonacademic;
- SciVerse Scopus—index of abstracts and citations for journal articles2;
- Web of Science3—index of abstracts and citations for journal articles;
- Google Scholar4—a search engine for research literature capable of examining full text of articles (not just metadata and abstracts), ranking returns by citation counts and other criteria, and providing links to related papers and accessible versions;
- Scopus5—a bibliographic data service covering all academic fields, offering citation analysis tools, owned by Elsevier;
- Web of Science6—a bibliographic data service covering all academic fields, offering citation analysis tools, owned by Thomson Reuters; and
- Microsoft Academic Search7—a relatively new, free search engine for academic papers and resources, with the capability to identify papers, authors, conferences, journals, and organizations as first-class objects; to display relations between these objects; and to display “citation in context” snippets from citing documents.
Larger, more loosely defined data structures and services use methods of massive data analysis (NRC, 2013) for search and discovery on the vastly larger scale of the World Wide Web. These services have become essential tools for information retrieval in mathematics as in every other field. They include the following:
- Google Web Search,8
- Google Scholar10 (an index of an unknown and not easily estimated number of academic books and articles), and
- Microsoft Academic Search11 (an index of 48 million publications and more than 20 million authors across a variety of domains with updates added each week).
6 Thomson Reuters, “Web of Science,” http://thomsonreuters.com/products_services/science/science_products/a-z/web_of_science/, accessed January 16, 2014.
11 “Microsoft Academic Search,” Wikipedia, last modified January 12, 2014, http://en.wikipedia.org/wiki/Microsoft_Academic_Search.
Other, more specialized indexes provide essential Web services to participating partners. These services provide data that are consumed to varying extents in machine processing by the above services in preparation of data for display to human users. These indexes include the following:
- CrossRef12 index of Digital Object Identifiers (DOIs),13 available only to participating publishers; and
- ORCID (Open Researcher and Contributor ID) index of nonproprietary alphanumeric codes that uniquely identify academic authors with annual open data dumps.
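As a brief illustration of how such identifier codes can be checked by machine, the following Python sketch validates the final character of an ORCID iD. It assumes the ISO 7064 MOD 11-2 check-character scheme that ORCID describes in its public documentation; the function names are illustrative only and do not come from any ORCID library.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the check character for the first 15 digits of an ORCID iD.

    Assumes the ISO 7064 MOD 11-2 scheme described in ORCID's documentation.
    """
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if a hyphenated 16-character ORCID iD has a consistent check character."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


if __name__ == "__main__":
    # 0000-0002-1825-0097 is a sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
```

A check of this kind catches most single-character typing errors but says nothing about whether the iD has actually been registered to an author.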
RESOURCES FOR THE MATHEMATICAL SCIENCES
Specialized Mathematical Databases
Specialized mathematical databases are examples of “bottom up” attempts by the mathematics community to create relatively open, accessible databases of mathematical facts. There are many specialized databases of formal information that are of interest to specific communities, such as those described below.
• On-Line Encyclopedia of Integer Sequences (OEIS)14—This searchable database of integer sequences provides a brief description for each sequence, including how that sequence is defined and how it arises in various contexts, and related formulas, generating functions, code, links, and references (Sloane, 1973; Sloane and Plouffe, 1995). This resource is extremely valuable for researchers in number theory and combinatorics, where sequences arise naturally. It is very useful for a researcher encountering an unfamiliar sequence to check quickly if this sequence has been encountered before and, if so, what is known about it. OEIS has an active user community, which it relies on heavily for user contributions. It is licensed under the Creative Commons Attribution Non-Commercial 3.0 license.15
• EZFace interface for evaluation of Euler sums16—This specialized computational tool provides for the evaluation of multiple Euler sums, also known as multiple zeta values. Multiple zeta values are functions of a finite sequence of positive integers and are known to satisfy a myriad of tricky identities. They can sometimes be reduced to polynomial functions of evaluations of the Riemann zeta function at integer values. This tool helps researchers who may encounter such sums to evaluate them using known reduction algorithms.
• Distributome: An Interactive Web-based Resource for Probability Distributions17—This is an open-source, open content-development project for exploring, discovering, navigating, learning, and computationally utilizing diverse probability distributions. Probability distributions are highly structured mathematical objects with fairly universal features, such as a probability mass or density function, distribution function, quantile function, and probability and moment generating functions; which features apply depends on the space over which a given distribution is defined (discrete, continuous, univariate, multivariate, Euclidean, non-Euclidean, etc.). The interactive Distributome graphical user Navigator and the Distributome-Editor provide the following core functions:
— Visually traverse the space of all well-defined (named) distributions;
— Explore the relations between different distributions;
— Search distributions by keyword, property, and type;
— Obtain qualitative (e.g., analytic form of density function) and quantitative (e.g., critical and probability values) information about each distribution;
— Discover references and additional distribution resources; and
— Revise, add, and edit the properties, interrelations, and meta-data for various distributions.
Complete Java source code is available under the LGPL license.
• Modular Forms Database18—This database consists of tables related to modular forms, elliptic curves, and abelian varieties, which are specialized data of interest to number theorists.
• Multiple Zeta Value Data Mine19—These pages contain tables with multiple zeta values and Euler sums to allow people to look for relations, systematics, and patterns. They are expressed in terms of a basis.
• NIST Digital Library of Mathematical Functions (DLMF)20—This is the Web version of the authoritative 1,046-page Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables (Abramowitz and Stegun, 1972). The DLMF has been constructed specifically for effective Web usage and contains features unique to Web presentation. The webpages contain many active links, for example, to the definitions of symbols within the DLMF, and to external sources of reviews, full texts of articles, and items of mathematical software. Advanced capabilities have been developed at the National Institute of Standards and Technology for the DLMF and also as part of a larger research effort intended to promote the use of the Web as a tool for doing mathematics. Among these capabilities are the following: a facility to allow users to download LaTeX and MathML encodings of every formula into document processors and software packages; a search engine that allows users to locate formulas based on queries expressed in mathematical notation; and user-manipulatable three-dimensional color graphics.
• Information on Enumerative Combinatorics21—This website contains a number of supplements to the two-volume textbook Enumerative Combinatorics,22 including a Catalan Addendum, a 94-page PDF listing 204 combinatorial interpretations of the sequence of Catalan numbers. This site structures and curates the information and makes it available in machine-readable formats to allow various means of searching, browsing, and reuse.
• Wolfram Functions Site23—This website provides a substantial collection of formulas and graphics about mathematical functions. The information is fragmented into small packages (which makes it difficult to browse) and does not include references to original sources, and it is available only in proprietary formats (Mathematica® Notebook and PDF).
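The sequence lookup described for OEIS above, in which a researcher checks whether a run of observed terms has been encountered before, can be sketched in a few lines of Python. This is a toy model only: the four-entry index is a hand-copied excerpt of the first terms of four well-known OEIS entries, and the functions `contains_run` and `lookup` are hypothetical names, not part of any OEIS interface.

```python
from typing import Dict, List, Sequence

# Toy index: OEIS A-number -> first few terms (hand-copied excerpt, not a live query).
TOY_INDEX: Dict[str, List[int]] = {
    "A000040": [2, 3, 5, 7, 11, 13, 17, 19, 23, 29],   # prime numbers
    "A000045": [0, 1, 1, 2, 3, 5, 8, 13, 21, 34],      # Fibonacci numbers
    "A000079": [1, 2, 4, 8, 16, 32, 64, 128, 256],     # powers of 2
    "A000108": [1, 1, 2, 5, 14, 42, 132, 429, 1430],   # Catalan numbers
}


def contains_run(terms: Sequence[int], query: Sequence[int]) -> bool:
    """Return True if `query` occurs as a contiguous run inside `terms`."""
    n = len(query)
    if n == 0:
        return False
    return any(list(terms[i:i + n]) == list(query)
               for i in range(len(terms) - n + 1))


def lookup(query: Sequence[int],
           index: Dict[str, List[int]] = TOY_INDEX) -> List[str]:
    """Return the sorted identifiers of all indexed sequences containing `query`."""
    return sorted(key for key, terms in index.items()
                  if contains_run(terms, query))


if __name__ == "__main__":
    print(lookup([5, 14, 42]))
```

With this toy index, `lookup([5, 14, 42])` returns `["A000108"]`, while the shorter query `[2, 3, 5]` matches both the primes and the Fibonacci numbers, which is why supplying more terms narrows a search.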
22 Stanley, R.P., Enumerative Combinatorics, Cambridge University Press, Cambridge, Volume 1 (2nd edition, 2011) and Volume 2 (2001).
Currently, there is no unified way to exchange information between these specialized databases, and it is not clear that there are any incentives to make these databases talk to each other. Libraries have approached the interoperability issues at multiple levels. The highest-level and simplest approach is the Open Archives Initiative (OAI), which provides for metadata exchange and federated search. The Protocol for Metadata Harvesting (OAI-PMH) enables spiders to gather up the cataloging information from multiple websites and then build a central search engine. The best known such service is OAISTER, now run by OCLC, which provides a search of more than 25 million records contributed by more than 1,100 institutions. For example, a search for a map of Polynesia from the 19th century turns up an 1839 map from the U.S. Hydrographic Office in the Harvard Map Collection (corrected to 1872). Entries in OAISTER typically have detailed but conventional library cataloging and refer to whole documents or objects.
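To make the harvesting step more concrete, the following Python sketch shows roughly what an OAI-PMH harvester does: it builds a `ListRecords` request and extracts identifiers and titles from the Dublin Core metadata in the response. The repository address and the sample record are invented for illustration, and no network request is made; only the verb, parameter names, and XML namespaces are taken from the OAI-PMH 2.0 specification.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     resumption_token: Optional[str] = None) -> str:
    """Build a ListRecords request URL; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_records(xml_text: str) -> List[Tuple[Optional[str], Optional[str]]]:
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for record in root.iterfind(".//oai:record", namespaces=NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Invented sample response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example map title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(list_records_url("https://repository.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

A central service would repeat this request across many repositories, following resumption tokens until each repository reports no further records, and then index the harvested metadata for search.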
More detailed interoperability methods include the linked open data movement, which tries to connect individual pieces of data using RDF (Resource Description Framework). RDF entries name two items and a relation between them, and are thus called “triples.” So, to take an example from “data.gov.uk”: the triple “John works for Ordnance Survey” would look something like:
(John Goodwin, http://data.gov.uk/blog/what-is-linked-data)
In this example, the triple contains two items which identify John Goodwin and the Ordnance Survey, and a link between them labeled with “works for” as a relational concept. In this case, URLs are used for each item, with the relation taken from an ontology of organizational relations defined by a European project on learning. Other relations are defined by groups like the Dublin Core Metadata Initiative, which has cataloging-type relations such as publication date, author, and so on. The ontology for music (musicontology.com) describes relations such as conductor or artist.
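The structure of such a triple can be sketched in Python as follows. The URLs below are placeholders under `example.org`, invented for illustration; they are not the identifiers used by data.gov.uk or by any published ontology. The output format is the line-based N-Triples serialization of RDF, in which each triple is written as three bracketed URLs followed by a period.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement: two items and the relation that links them."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialize a triple whose three parts are all URLs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


def objects_of(triples: Iterable[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


# Placeholder URLs (hypothetical, for illustration only).
EX = "http://example.org/"
JOHN = EX + "id/john-goodwin"
WORKS_FOR = EX + "ontology/worksFor"
ORDNANCE_SURVEY = EX + "id/ordnance-survey"

TRIPLES = [Triple(JOHN, WORKS_FOR, ORDNANCE_SURVEY)]

if __name__ == "__main__":
    for t in TRIPLES:
        print(to_ntriples(t))
    print(objects_of(TRIPLES, JOHN, WORKS_FOR))
```

Because every item and relation is named by a URL, triples published by different organizations can be merged into one collection and queried together, which is the interoperability benefit that linked data is meant to provide.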
Linked data are an example of the general concept of the Semantic Web introduced by Tim Berners-Lee and are in use in some very large organizations such as the British Museum. In general, these cooperative catalogs are based on volunteer contributions and run by some kind of nonprofit group.
There are currently many bibliographic resources available within the mathematical sciences as well as the larger scientific community. Some examples of these mathematical bibliographic resources include the following:
• MathSciNet24 is the online interface to the database of Mathematical Reviews maintained by the American Mathematical Society (AMS) since 1940. It is a carefully maintained and easily searchable database of reviews, abstracts, and bibliographic information for much of the mathematical sciences literature. More than 100,000 new items are added each year, most of them classified according to the Mathematics Subject Classification (MSC). Authors are uniquely identified, enabling a search for publications by individual author rather than by name string. Expert reviewers are selected by a staff of professional mathematicians to write reviews of the current published literature; more than 80,000 reviews are added to the database each year. MathSciNet contains more than 2.8 million items and more than 1.6 million direct links to original articles. Bibliographic data from retro-digitized articles dates back to the early 1800s. Reference lists are collected and matched internally from approximately 500 journals, and citation data for journals, authors, articles, and reviews is provided. This Web of citations allows users to track the history and influence of research publications in the mathematical sciences. MathSciNet is a major revenue generator for AMS, for which reason the database contents are closely protected by copyright and licensing.
• Zentralblatt MATH (zbMATH)25 is a thorough and long-running abstracting and reviewing service in pure and applied mathematics. The zbMATH database contains more than 3 million bibliographic entries with reviews or abstracts drawn from more than 3,500 journals and 1,100 serials and covers the period from 1826 to the present. Reviews are written by more than 10,000 international experts, and the entries are classified according to the MSC scheme (MSC 2010). zbMATH covers published and refereed articles, books, and conferences as well as other publication formats (CD-ROM, DVD, videotapes, Web documents). As part of current electronic library activities, retrospective journal data are being made available, including data from before 1868. Bibliographic information and, where available, links to the full text are stored within zbMATH. About 120,000 new items are currently added to zbMATH each year. More than 50 percent of the items in core areas are independent reviews by experts; the remainder are abstracts and summaries of comparable quality. zbMATH is run jointly by the European Mathematical Society, FIZ Karlsruhe, and Springer-Verlag. zbMATH is a subscription service but allows nonsubscribers to submit queries and to access the zbMATH author profile pages,26 which are freely accessible.
• Ulf Rehmann’s DML page27 lists links to retro-digitized mathematics: nearly 5,000 digitized books and nearly 600 digitized journals and seminars. This is a major resource for discovering information that has already been digitized. The webpage also lists more than 2,800 journals that have been digitized whole or in part and notes whether they are free or require a paid subscription.
• AMS Digital Mathematics Registry28 provides centralized access to certain collections of digitized publications in the mathematical sciences. The registry is primarily focused on older material from journals and journal-like book series that originally appeared in print but now are available in digital form.
• AMS eBooks29 includes retrospective digitization of Contemporary Mathematics back to the beginning of the series in 1980.
• European Digital Mathematics Library (EuDML)30 makes a significant portion of European mathematics literature available online: more than 200,000 publications, in the form of an enduring digital collection, developed and maintained by a network of institutions. A unified metadata schema was developed and adopted by all providers. The library offers a number of features including the following:
— Metadata search over the entire corpus,
— Reference and citation lists,
— Capability for users to make lists and annotations,
— An API for metadata search over the entire corpus, and
— Some capability for formula search.
27 DML: Digital Mathematics Library, http://www.mathematik.uni-bielefeld.de/~rehmann/DML/dml_links.html, accessed January 16, 2014.
Some encyclopedia resources are listed below.
• MacTutor History of Mathematics Archive31 contains biographies of several thousand historical and contemporary mathematicians as well as an index of famous curves and histories of various mathematical topics. The full text is freely available, with no formal copyright or licensing restrictions.
• On-Line Encyclopedia of Integer Sequences is described above in the “Specialized Mathematical Databases” discussion.
• Mathematics Genealogy Project32 aims to list all individuals who have received a doctorate in mathematics, providing the following information:
— The complete name of the degree recipient,
— The name of the university that awarded the degree,
— The year in which the degree was awarded,
— The complete title of the dissertation, and
— The complete name(s) of the advisor(s).
The Mathematics Genealogy Project contains more than 170,000 records. Individual pages can be freely copied without explicit licensing or copyright restriction, but data are not made available in bulk, and there is no API.
• Wolfram’s MathWorld33 is a comprehensive and interactive encyclopedia of mathematical equations, terms, derivations, and more, for students, educators, math enthusiasts, and researchers.
• Wikipedia34 is perhaps the most well known of all online encyclopedia resources. It also houses a wide array of mathematical content, generally very useful as a first place to look for the definition of a mathematical concept. Wikipedia uses the Creative Commons Attribution-ShareAlike (CC-BY-SA) license.
• Encyclopedia of Mathematics35 is an open access wiki that includes original articles from the online Encyclopedia of Mathematics (2002) as well as user-added articles, totaling more than 8,000 entries and nearly 50,000 notions in mathematics. Springer, in cooperation with the European Mathematical Society, has made the content of this encyclopedia freely open to the public. The original articles from the Encyclopedia of Mathematics remain copyrighted to Springer, but any new articles added and any changes made to existing articles within encyclopediaofmath.org will come under the CC-BY-SA license. An editorial board, under the management of the European Mathematical Society, monitors any changes to articles and has full scientific authority over alterations and deletions. This wiki is a MediaWiki that uses the MathJax extension, making it possible to insert mathematical equations in TeX and LaTeX.
• The Stacks Project36 website is an open source textbook and reference work on algebraic stacks and the algebraic geometry needed to define them. The Stacks Project aims to build up enough basic algebraic geometry to serve as foundations for algebraic stacks.
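The five fields listed above for the Mathematics Genealogy Project suggest a simple record structure in which advisor names link records into a graph. The Python sketch below is a hypothetical model of such records, not the project's actual schema or interface (the project offers no API), and the sample people, universities, and titles are invented placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DegreeRecord:
    """One doctorate, holding the five fields described in the text."""
    name: str
    university: str
    year: int
    dissertation_title: str
    advisors: Tuple[str, ...] = ()


def academic_ancestors(records: Dict[str, DegreeRecord], name: str) -> List[str]:
    """List advisors, advisors of advisors, and so on, nearest generation first."""
    found: List[str] = []
    queue = list(records[name].advisors) if name in records else []
    while queue:
        current = queue.pop(0)
        if current in found:
            continue  # already reached through another advisor
        found.append(current)
        if current in records:
            queue.extend(records[current].advisors)
    return found


# Invented placeholder records (not real people or dissertations).
RECORDS = {
    r.name: r
    for r in [
        DegreeRecord("Student A", "University X", 2010, "Title A",
                     ("Advisor B", "Advisor C")),
        DegreeRecord("Advisor B", "University Y", 1985, "Title B", ("Advisor D",)),
        DegreeRecord("Advisor C", "University Y", 1987, "Title C", ("Advisor D",)),
        DegreeRecord("Advisor D", "University Z", 1960, "Title D"),
    ]
}

if __name__ == "__main__":
    print(academic_ancestors(RECORDS, "Student A"))
```

Here `academic_ancestors(RECORDS, "Student A")` returns `["Advisor B", "Advisor C", "Advisor D"]`. Keying records by name string is a simplification; a production system would need unique person identifiers, since different mathematicians can share a name.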
Specialized Mathematical Resources
Several specialized mathematical resources are available to the mathematics community. Some of these resources include the following:
• MathOverflow37 is an online resource that allows users to ask and answer research-level mathematics questions such as arise when writing or reading articles or graduate-level books. Users gain writing authority on the site by building up reputation points. Mathematics display support is provided with MathJax from LaTeX source. MathOverflow runs on Stack Exchange, the hosted service that provides the same software as the popular programming Q&A site Stack Overflow. The hosting cost is paid from the research funds of Ravi Vakil at Stanford University. User-contributed content is licensed under Creative Commons Attribution-Share Alike.
• Wolfram|Alpha38 is a “computational knowledge engine” developed as an online service by Wolfram Research. It answers factual queries by computing the answer from an internal database of mathematical and factual data acquired from diverse data sources. Both free and premium services are available. The underlying software combines natural language processing of queries with symbolic computation using Mathematica. Numerous mathematical concepts, such as sequences, functions, and probability distributions, are recognized and displayed in ways that respect their mathematical structure.
• Selected Papers Network39,40 is a free, open-source project aimed at improving the way people find, read, and share academic papers. This project is not a website with a system for reviewing, evaluating, rewarding, etc. Rather, it is an environment that makes it easy to build one’s own systems, which allows for more flexibility when needed.
• Tricki41 is a Wiki-style site intended to develop into a large store of useful mathematical problem-solving techniques. Some of these techniques are very general, and others concern particular subareas of mathematics spanning all levels of experience. This project is largely inactive now after failing to acquire a critical mass of users.
SELECTED RELATED EFFORTS
Many disciplines have ongoing efforts that aim to bring diverse discipline-specific information together, and many of these hold valuable lessons for the mathematics community. The following are a few illustrations of such efforts.
• Digital Library Federation Aquifer (DLF Aquifer)42 promotes effective use of distributed digital library content for teaching, learning, and research in the area of American culture and life. It supports scholarly discovery and access by developing schemas, protocols, and communities of practice to make digital content available to scholars and students where they do their work, and by developing the best possible systems for finding, identifying, and using digital resources in context.
• Project Bamboo43 is a partnership of 10 research universities, led by the University of California, Berkeley, that is building shared infrastructure for humanities research. One of the goals of this project is to design research environments where scholars may discover, analyze, and curate digital texts across the 450 years of print culture in English from 1473 until 1923, along with the texts from the Classical world upon which that print culture is based.
40 “The Selected-Papers Network,” Gowers’s Weblog, June 16, 2013, http://gowers.wordpress.com/2013/06/16/the-selected-papers-network/.
• Research Papers in Economics44 is a collaborative effort of hundreds of volunteers in 75 countries to enhance the dissemination of research in economics and related sciences. The heart of the project is a decentralized bibliographic database of working papers, journal articles, books, book chapters, and software components, all maintained by volunteers. The collected data are then used in various services.
• Digital Library of Chemistry Education45 provides an integrated guide to chemistry textbooks and allows both students and educators to explore chemistry. The ChemEd DL repository can be searched for resource groups within particular domains of chemistry, such as organic or physical. Resource groups relate to specific topics, such as bonding or kinetics, and are associated with specific elements. ChemEd also allows users to search by topics and look up definitions of terms. The provided glossary is extensive.
• Digital Library of Biochemistry and Molecular Biology. BioMoleculesAlive.org is a collection of digital resources sponsored by the American Society for Biochemistry and Molecular Biology. It is part of a larger effort called the BioSciEdNet (BEN) initiative.46 The collection includes resources in five areas: software, visual resources, curriculum resources, reviews, and articles from the Biochemistry and Molecular Biology Education journal. Efforts on the Web interface, database design, and tools and guidelines for submission to BioMoleculesAlive.org began in 2003 and are still ongoing.
• Astrophysics Data Service (ADS).47 Also known as the Digital Library for Physics and Astronomy, this library is maintained by the Smithsonian Astrophysical Observatory, working with NASA and the community of astronomers and astrophysicists, and links to more than 10 million papers in astronomy and related areas. An unusual aspect of this system is that it not only catalogs papers, but also tries to link papers to the astronomical objects to which they refer. A user can see papers that refer to a specific star or galaxy, via volunteer tagging of all papers with star catalog entries. NASA provides the base funding for ADS.
• U.S. Virtual Astronomical Observatory. Astronomers have access to a variety of sky images, including some interfaces designed for the general public, such as Google Sky or the WorldWide Telescope (Microsoft). Digital imagery exists at multiple wavelengths, including the Sloan Digital Sky Survey showing visible light, the Two Micron All Sky Survey (2MASS), the Chandra X-ray Observatory, and so on. These databases are unified via the Virtual Observatory program, including the Euro-VO in Europe and others. Funding for the U.S. Virtual Observatory has been provided by NSF and NASA, but the organization is attempting to find a new support model.
• National Center for Biotechnology Information (NCBI). The National Library of Medicine maintains many important biomedical data resources. Full articles are stored in PubMed Central,48 which receives medical articles deposited by authors working on NIH-funded research (after an embargo period). It currently contains 2.8 million articles. More detailed data is stored in several specific resources such as GenBank or OMIM (Online Mendelian Inheritance in Man). NCBI also provides software tools such as BLAST (Basic Local Alignment Search Tool). All these resources are funded by NIH in the United States. A number of other organizations support tools for molecular biology. For example, EMBL (the European Molecular Biology Laboratory) provides bioinformatic services including tools for sequencing, structural analysis, microscopy, and so on. Other groups that provide molecular biology tools include the Wellcome Trust Sanger Institute, the Craig Venter Institute, and commercial suppliers. EMBL is funded by a consortium of nations not exactly overlapping the European Union, but close. The Wellcome Trust is endowed under the will of Sir Henry Wellcome, the Venter Institute is supported by J. Craig Venter and others, and so on.
• Digital Public Library of America. Numerous libraries have provided cataloging information to the Digital Public Library of America, which provides links to more than 2 million items in its member libraries. There are currently more than 400 participating libraries, including the many libraries aggregated by state library systems. The organization is a cooperative of its members, aggregated into “hubs.”
• Chemical Abstracts.49 The American Chemical Society operates one of the largest and oldest scientific information services. Chemical Abstracts Service indexes and abstracts the chemical literature and maintains an authority file of chemical compounds, with more than 70 million entries. It also keeps track of reactions, suppliers, and other chemical information resources. Chemical Abstracts dates back to 1907 and is one of the most exhaustive services, with a history of seeking out all important chemical information, wherever it is published. In its early years, it was largely supported by major chemical companies, but for decades has been funded by users, typically university libraries or industrial organizations in chemistry, chemical engineering, biomedicine, or related areas.
• Internet Public Library. The Internet Public Library is a resource to provide answers to questions, particularly questions from students and educators. It also maintains some information collections. Originally operated at the University of Michigan with funding from the W. K. Kellogg Foundation, it is now run by Drexel University with support from a group of about 20 universities.
Abramowitz, M., and I.A. Stegun, eds. 1972. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Publications, New York.
National Research Council. 2013. Frontiers in Massive Data Analysis. The National Academies Press, Washington, D.C.
Sloane, N.J.A. 1973. A Handbook of Integer Sequences. Academic Press, Boston, Mass.
Sloane, N.J.A., and S. Plouffe. 1995. The Encyclopedia of Integer Sequences. Academic Press, San Diego, Calif.