times on a network. This approach was tested by scanning several paper documents and rescanning several documents that existed in PDF image-only format. The process chosen was to first scan the hard-copy documents to a Tagged Image File Format (TIFF) and then rescan the TIFF to PDF.

A cover sheet was created to capture metadata[2] for each document. The cover sheet is designed to be read by the capture software when the document is scanned; the values entered in each box on the sheet correspond to a field in the database. The capture software reads the value and writes it into the database field. A quality control step allows the operator to review the data entry and make corrections if needed. For hard-copy scanning, the cover sheet is printed, is placed on top of the document, and is the first page scanned into the system. The cover sheet is not used for electronic documents; instead, the metadata on these documents are entered by the operator.

As noted in Chapter 2, ECREL is to be a web-based system. URS used Cold Fusion to develop the web pages because Cold Fusion includes a license for the Verity search engine. Verity is one of the more widely used search engines and therefore allows evaluators and testers easy access to the website containing ECREL.

All ECREL design artifacts (i.e., elements and documents) are included in Appendix C. The artifacts include the complete requirements list (Appendix C.1) and technical specifications (Appendix C.2), which include use cases and architecture (Appendix C.2.1), entity-relationship diagram (ERD) (C.2.2), data dictionary (C.2.3), and the cover sheets used for document scanning (C.2.4).

Develop

Development of the ECREL prototype consisted of four components:

· Development of the web pages and database;
· Collection, scanning, and indexing of the documents;
· Loading of the documents into the database and running of the indexer; and
· Development of on-line help files.

The database was developed using SQL Server 2000 and the web pages using Cold Fusion v.6 and HTML.

Documents were collected in several different ways. The NPS has scanned multiple property submission documents and posted them on its National Register website. Twenty-three of these PDF image files were downloaded, rescanned into a searchable format, and loaded onto the ECREL site. In addition, some states have begun to post historic contexts, National Register nominations, and similar documents in PDF format on the Internet. Some of these documents were also downloaded and rescanned. Most of the documents, however, were collected in hard-copy format from SHPOs in Vermont, New Hampshire, Minnesota, and New Mexico. A project team member spent 1 to 2 days onsite at each office, copying documents identified by the SHPO staff. With some guidance from SHPO staff, the cover sheets for each document were completed. The paper files were then shipped to URS's Raleigh-Durham office for scanning. A complete list of the documents included in the prototype (and the document sources) is provided in Appendix C.3.

After the documents were scanned, the metadata values were written to the ECREL Structured Query Language (SQL) Server database and documents were indexed for full-text searching by Verity. New documents had to be loaded manually to the prototype system, and Verity re-indexed the entire document.

The final step before testing began was to develop an on-line help file and evaluation forms. RoboHelp was used to create the ECREL user's guide (Appendix C.4). The ECREL evaluation form (Appendix C.5) and test procedures (Appendix C.6) were then posted on the website.

Verify and Validate

The initial (verification) testing was completed by members of the design team, and the primary focus was on verifying that all requirements had been met and that the evaluation form was designed to capture information that could be used to validate the prototype. The testers' comments were incorporated into the final ECREL prototype.

To test ECREL, URS solicited members of the American Cultural Resource Association (ACRA) and the National Conference of State Historic Preservation Officers (NCSHPO). ACRA is a national professional association of CRM professionals and includes over 138 member firms. These firms provide historic preservation, archaeological, historic architectural, anthropological, and landscape architectural services. A request to test ECREL was posted in the "Members Only" section of ACRA's website. The NCSHPO acts as a communications vehicle among SHPOs and their staffs and represents most SHPOs in developing national agreements and protocols with federal agencies and national preservation organizations. The request to test ECREL was posted on the NCSHPO's listserv.

The original requests to ACRA and the NCSHPO were sent on December 1, 2003, and the deadline for sending in evaluations was January 30, 2004. Because of the low number of responses received, the deadline was extended to February 28, 2004, and a reminder was sent to the NCSHPO and ACRA members. The URS project team also presented a poster on the two prototypes at the TRB 2004 Annual Meeting. During the

[2] "Metadata" (which is literally "data about data") within ECREL consist of a type of index field. The index values are defined by the user and manually entered into the database. These values include author, title, areas of significance, etc. To prepare the document for full-text searching, an indexer is run to create files of every unique word in the document.
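
The indexing step described in the metadata footnote (recording every unique word in a document so that it can be searched) can be illustrated with a small inverted index. The following is only a minimal sketch under stated assumptions: it is not how Verity or the ECREL prototype was implemented, and the function names, document IDs, and sample texts are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each unique word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted IDs of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


# Hypothetical sample documents, keyed by an illustrative document ID.
documents = {
    "doc-001": "Historic context for agricultural properties in Vermont",
    "doc-002": "National Register nomination for a historic bridge",
}

index = build_index(documents)
print(search(index, "historic"))
print(search(index, "historic bridge"))
```

In this sketch a query matches only documents that contain all of its words. A production search engine would typically add features such as stemming, ranking, and on-disk index files, none of which are shown here.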