| The Mission
|
|
The National Academies Press is neither a government publication center
nor a federal office of any kind. Rather, we are the publication arm
of the National Academies, and our mission is to make public the
fruits of research performed by the National Academy of Sciences, the
National Academy of Engineering, the Institute of Medicine, and the
National Research Council. That mission is tempered by the
institutional requirement of being self-supporting--we must sell books
to be able to continue to publish them.
|
| The page image
|
|
In 1992, the National Academies Press began using
the Xerox Docutech's short-run printing capabilities and started
harvesting the TIFF images generated by that machine.
These TIFF pages were first made public in 1994, initially through the
Xerox Corporation's "DocuWeb" system and later through the
NAP-developed "Book Object" system. The limitations of both of those
presentation mechanisms, and changes in toolsets and technologies,
led us to develop the "OpenBook" page presentation system.
|
| Useful, Cheap, Coded--Choose Two
|
|
Many of the NAP's publications are made available online in HTML, PDF, and
even some XML-like presentations. But HTML and PDF (not to mention XML)
can be surprisingly expensive in personnel and time if anything but a
collection of "save-as-html" files is desired.
Further, an astonishing degree of diversity in our content stream is a
given: print runs may be anywhere from 200 to 20,000; material may
come to us camera-ready from a committee, or as manuscript, or as Word
files. The print results desired may be strip-bound digital photocopies or
rich four-color work. This variety of inputs and outputs makes almost
any coherent content strategy difficult and expensive to implement.
What we have been striving toward over the last several years is the
development of an integrated presentational system, combining an
up-to-date database, the best of HTML, full-text searching, and robust
production scripts to meet an array of goals.
|
| Multiple goals
|
|
The goals we set out for ourselves included ensuring:
- full-text searching by both internal and external search engines,
- easy, predictable, direct links to any page,
- a consistent site-wide interface structure,
- continuous improvability,
- reasonably rapid page access (even for those with 28.8 modems),
- swift, relatively inexpensive production processes,
and much more. Along with these goals came the constraints
common to nonprofit organizations: these capabilities must be
inexpensive to build, technically sustainable, and
economically self-supporting.
|
| Multiple solutions
|
|
We have consequently created a database-generated Web catalog that
links to the free versions of our publications, which may be an
HTML summary, full HTML, PDF, OpenBook, or some combination of these. The "common
denominator" of the NAP publications is
the OpenBook, a navigational/search envelope surrounding a page from a book
(whether in HTML or via a picture of the page).
Since early 2001, we have been underwriting the digitization of the text of our books,
and have replaced most page images with HTML text. We currently use page
images only for very new (prepublication or not yet coded) books, or very old
archival publications.
The OpenBook navigational envelope enables page-by-page browsing or reading,
and integrates exploration mechanisms which can search the entire full-text
corpus, or an entire book, or just one chapter. Discovery tools like
"find more like this book" and the "skim" function can help
researchers. Navigational elements and
other conveniences appear throughout (such as consistently
available Tables of Contents and a "jump to page" system).
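The three search scopes described above (whole corpus, one book, one chapter) can be pictured as a filter applied to raw search hits. The following is only a minimal sketch under stated assumptions: the `Hit` record, the `scope_hits` function, and the idea of modelling a chapter as a page range are all hypothetical, and none of it reflects Swish++ output or the NAP's actual Perl tools.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Hit:
    """One raw search hit (hypothetical shape, not Swish++ output)."""
    book_id: str   # identifier of the book the page belongs to
    page: int      # page number within that book
    score: float   # relevance score; higher is better


def scope_hits(
    hits: Iterable[Hit],
    book_id: Optional[str] = None,
    chapter_pages: Optional[Tuple[int, int]] = None,
) -> List[Hit]:
    """Narrow hits to the whole corpus, one book, or one chapter.

    Assumption: a chapter is given as an inclusive (first, last) page range.
    """
    if chapter_pages is not None and book_id is None:
        raise ValueError("a chapter scope needs a book_id")
    selected = []
    for hit in hits:
        # Book scope: drop hits from other books.
        if book_id is not None and hit.book_id != book_id:
            continue
        # Chapter scope: drop hits outside the chapter's page range.
        if chapter_pages is not None:
            first, last = chapter_pages
            if not first <= hit.page <= last:
                continue
        selected.append(hit)
    # Best matches first.
    return sorted(selected, key=lambda h: h.score, reverse=True)
```

Calling `scope_hits(hits)` keeps every hit, adding `book_id` restricts the results to one book, and adding `chapter_pages` narrows them further to one chapter.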
|
| Real URLs, Printable Documents, Buyable books
|
|
The OpenBook's HTML presentation framework means you can send a
"real URL" composed of the book's ISBN and page numbers to someone by email
("Fred, check out this recommendation on water quality:
http://www.nap.edu/books/030905984/html/15.html").
Any single page can even be printed out (though at a low
resolution of 150x300 dpi) for
offline reference.
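The "real URL" pattern above is regular enough to build and parse mechanically. The sketch below is illustrative only: the function names `page_url` and `parse_page_url` are hypothetical, the URL layout is inferred solely from the one example address given above, and page numbers are assumed to be plain positive integers.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Base path taken from the example URL in the text.
BASE = "http://www.nap.edu/books"


def page_url(book_id: str, page: int) -> str:
    """Build a direct link to one page of one book."""
    if not book_id.isalnum():
        raise ValueError("book_id must be alphanumeric")
    if page < 1:
        raise ValueError("page must be a positive integer")
    return f"{BASE}/{book_id}/html/{page}.html"


def parse_page_url(url: str) -> Tuple[str, int]:
    """Recover (book_id, page) from a link shaped like the example."""
    parts = urlsplit(url).path.strip("/").split("/")
    # Expected path segments: books / <book_id> / html / <page>.html
    if (
        len(parts) != 4
        or parts[0] != "books"
        or parts[2] != "html"
        or not parts[3].endswith(".html")
    ):
        raise ValueError(f"unrecognized page URL: {url}")
    return parts[1], int(parts[3][: -len(".html")])
```

Because the address is derived only from the book identifier and the page number, a link can be written by hand or generated by a script without consulting any lookup table, which is what makes such links easy to share by email.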
To achieve self-sustainability, we provide a consistent "buy it" button that
leads to secure commerce mechanisms for buying our books. We give away
these online versions freely for the public good, and in the belief--proven
generally true so far--that customers who otherwise wouldn't have found
the publications will find them and order the books, thereby enlarging our
customer base sufficiently to pay for the added (and significant)
expense of doing what we're doing.
The mission of the National Academies Press is fundamentally about
disseminating the fruits of scientific research. We hope that
our OpenBook system helps that process.
|
| The Nitty Gritty
|
|
The NAP's OpenBook is built using a collection of tools,
predominant among them:
- Servers: Linux/Apache
- Scripting/programming tools: Perl, PHP, BBEdit, Vedit, vi, Emacs, etc.
- Search engines: Swish++ on the backend, with locally-developed Perl tools
for processing raw results, locally-developed mechanisms for
specific book- and chapter-based searching, and "related
titles" identification
- Underlying Database: MySQL
- Database Interface: generally PHP
We strive to use open source tools whenever possible. We are hoping to
find the funding resources necessary for us to commit the time to make
much of the OpenBook system more generic, tailorable, and open source, to allow other
adventuresome publishers to make their publications available through
a navigational system similar to the NAP's. If you are interested in
more information, contact Michael Jensen, Director of Publishing Technologies at the National Academies Press.
last updated 6/29/05
|