As part of our suite of automated tools, the Web Search Builder is
designed to ease what can be a significant stumbling block for Web
searching: coming up with search terms that will bring you
back gold.
When working with a chapter from a National Academies Press report,
the Builder takes the algorithmically derived key phrases from
that chapter and presents them via JavaScript in a way that lets
you easily "pair up" key phrases, or add your own.
Those key phrase pairs can then be used to search Google Web, Google Scholar,
Google Books, Yahoo Web, and MSN Web, as well as the NAP's 4000+
books.
The Builder is also used by the Reference Finder, which takes text
you provide, extracts its key phrases, and gives you
the same Search Builder interface, but for terms from your own rough
draft, news article, or blog entry.
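As a rough illustration of the "pair up" step described above, here is a minimal Python sketch that turns a list of key phrases into pairs and builds a search URL for each pair. It is not the Builder's own code, which is written in Perl and JavaScript. The function names and the two URL templates are assumptions chosen for the example, and the sketch enumerates every possible pair, whereas the Builder lets you choose which phrases to pair.

```python
from itertools import combinations
from urllib.parse import urlencode

# Assumed URL templates, for illustration only; the Builder's real
# search targets and parameters may differ.
SEARCH_ENGINES = {
    "Google Web": "https://www.google.com/search",
    "Google Scholar": "https://scholar.google.com/scholar",
}


def phrase_pairs(phrases):
    """Return every unordered pair of distinct key phrases."""
    return list(combinations(phrases, 2))


def build_query(pair):
    """Wrap each phrase in double quotes so it is searched as an exact phrase."""
    return " ".join(f'"{phrase}"' for phrase in pair)


def search_urls(pair):
    """Map each (assumed) search engine name to a URL for this phrase pair."""
    query = build_query(pair)
    return {
        name: f"{base}?{urlencode({'q': query})}"
        for name, base in SEARCH_ENGINES.items()
    }


if __name__ == "__main__":
    # Hypothetical key phrases, not taken from any real chapter.
    for pair in phrase_pairs(["climate change", "sea level", "coastal cities"]):
        print(pair, search_urls(pair)["Google Scholar"])
```

Quoting each phrase keeps multi-word "terms of art" together in the query, which is the point of pairing phrases rather than single words.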
Frequently Asked Questions
About Web Search Builder
- How do you get those key phrases?
- Since 2001, we've been developing textual analysis tools to help improve knowledge discovery within our unique collection of 4000+ reports
of the National Academy of Sciences, the Institute of Medicine, the National Academy of Engineering, and the National Research Council.
Algorithmic key phrase weighting was a necessary precondition, so we
developed an approach that takes into account word frequency, phrase
frequency, term placement, co-occurrence with other significant terms,
containing-sentence significance, and other factors, to give individual
phrases a "significance weight" within
the context of the chapter or article. This allows us to extract
the top N phrases for a given chunk of ASCII text, which we can apply
to chapter skimming, reference finding, and search building.
- But don't you want to keep me on your site?
- The National Academies Press, as part of the National Academies,
encourages knowledge exploration, research, and refinement. The
language of our reports -- which tend to be distillations of the work of
the best experts in a given field -- also tends to include a high
proportion of "terms of art," which can be very useful for locating
high-quality resources within the surfeit of Web content. "Keeping you on the
site" is a tired tactic for commercial Websites -- we believe that if
these tools are useful, you'll return to take advantage of them, and
perhaps, eventually, find a book you want to buy in one form or
another.
- What else do you do with those phrases?
- We have three other primary applications: a) as an adjunct to full-text
searching, in the backend of our book search engine; b) as the key
element in "Chapter Skimming," where we can identify, on every chapter
page, the two-sentence chunk containing the highest concentration of
high-value phrases, providing a proxy for skimming; and c) as the key
backend system for building the Reference Finder.
- Is this open source? Where can I get this software?
- Currently, you can't -- it has been written in Perl as a set of
operational tasks, suited to our peculiar structure, needs and purposes.
Most of the underlying code has been written by Michael Jensen,
who, while not a professional programmer, uses code to solve
publishing problems. The code wouldn't pass Perl/JavaScript 101 for
elegance, but it gets the job done. We are open to, and even interested
in, making parts of this suite (for example, the key phrase weighter
and extractor) available as open source, so that they could
be improved by real programmers
with a purpose, and thus made more efficient and worthy of reuse by
others. See Michael's page to contact him about this
option, if you're interested.
- Can I link to a "search builder" from my own site?
- Please do. We hope to see this tool widely used to help people
make better use of our publications and the content within them.
Simply copy the URL shown in your browser's address bar
for any chapter's "search builder" interface.
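To make the answers to "How do you get those key phrases?" and "What else do you do with those phrases?" more concrete, here is a minimal Python sketch of phrase weighting and two-sentence "skimming." It is only an illustration under stated assumptions, not the NAP's Perl code. It uses just three of the factors named above (phrase frequency, word frequency, and term placement), the numeric coefficients are invented for the example, candidate phrases are limited to two-word phrases, and the stopword list is deliberately tiny. All function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, far from complete).
STOPWORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is", "it",
             "of", "on", "or", "that", "the", "to", "with"}


def split_sentences(text):
    """Naive splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and return its alphanumeric words."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def candidate_phrases(sentence):
    """Two-word phrases in which neither word is a stopword."""
    words = tokenize(sentence)
    return [f"{a} {b}" for a, b in zip(words, words[1:])
            if a not in STOPWORDS and b not in STOPWORDS]


def weight_phrases(text, top_n=5):
    """Return the top_n (phrase, weight) pairs, highest weight first."""
    sentences = split_sentences(text)
    word_freq = Counter(w for s in sentences for w in tokenize(s)
                        if w not in STOPWORDS)
    phrase_freq = Counter()
    first_seen = {}  # index of the first sentence containing each phrase
    for index, sentence in enumerate(sentences):
        for phrase in candidate_phrases(sentence):
            phrase_freq[phrase] += 1
            first_seen.setdefault(phrase, index)
    weights = {}
    for phrase, freq in phrase_freq.items():
        first, second = phrase.split()
        placement = 1.0 / (1 + first_seen[phrase])  # earlier scores higher
        # Coefficients are arbitrary, chosen only for this example.
        weights[phrase] = (2.0 * freq
                           + 0.5 * (word_freq[first] + word_freq[second])
                           + placement)
    # Sort by descending weight, breaking ties alphabetically.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


def best_two_sentence_chunk(text, weights):
    """Return the adjacent sentence pair with the highest total phrase weight."""
    sentences = split_sentences(text)
    if len(sentences) < 2:
        return " ".join(sentences)

    def score(sentence):
        return sum(weights.get(p, 0.0) for p in candidate_phrases(sentence))

    best = max(range(len(sentences) - 1),
               key=lambda i: score(sentences[i]) + score(sentences[i + 1]))
    return f"{sentences[best]} {sentences[best + 1]}"
```

A typical use would be to call `weight_phrases(chapter_text, top_n=20)`, turn the result into a dictionary with `dict(...)`, and pass that dictionary to `best_two_sentence_chunk` to get a short passage that stands in for skimming the chapter. A fuller version would also need the co-occurrence and containing-sentence factors mentioned above, which this sketch leaves out.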