their IT staff to facilitate analysis of the data. The key challenge is how to put this information in the hands of casual users, such as marketing staff, without expecting them to become expert users or continually relying on a few expert users to produce the desired information.

The way to address this critical challenge is to develop menu-driven applications for both the RDBMS and the GIS that allow casual users to use the systems in a more "push button" environment. This approach will require working with IT or having technical resources set up this capability. A key objective should be to allow more people within the organization to access and analyze ITS data. Presently, the fact that these systems are expert systems has been a major constraint in getting the ITS and other data out to the people who can use it in the organization. ITS data will never realize its full potential if it cannot be used effectively by more people. At present, much of the market research potential of ITS data has been constrained, but could be leveraged significantly by making the data readily available within a data management architecture that accommodates additional users.

Architectures Supporting Data Integration

Once the data have been organized and stored in a database, an architecture that will automatically provide for the integration of data between various ITS and other systems can be defined. A database architecture, like a data warehouse, can provide for automated data collection, storage, archiving, and retrieval. This avoids the need for manual loading of all the information and ensures that current data are available. While this architecture takes some expertise and effort to set up, it will pay dividends to the users and also ensure a more sustainable system, costing less in the long run and saving the time and effort of continually loading data.

The intent of this section is to provide an overview of database architectures that support enterprise data. It will present examples of the most common types of data transactions and how they are set up in relation to data architectures, such as data warehouses and data marts. Several database management terms will be used throughout this section. Defining the terms up front, as well as in context, may be helpful. Various definitions exist for these terms. However, for the purposes of this section, the following definitions are used:

· Data Mart--a database that is most likely a departmental database that was designed to support a particular business function. Usually, each department or business unit is considered the "owner" of its data mart and provides for maintenance of the data. This enables each department to use, manipulate, and develop its data without altering information inside other data marts or the data warehouse.
· Data Warehouse--a collection of individual databases across departments and business functions, specifically structured for organizationwide access to the data. The term "enterprise database" is analogous to data warehouse.
· Data Repository--a common term and, for the purposes of this section, no distinction is made between a data warehouse and a data repository.
· Federated Database--a federated database is much like a data warehouse except that the implementation of the federated database integrates multiple autonomous databases into what looks like a single database. A federated database implementation provides a front end interface that stores or retrieves required data from all source data marts using a single query. While this is an option and has advantages for some organizations, it is not discussed here as a mainstream option for transit agencies due to the level of technology required for construction and sustainability. The definition is included because it is discussed in the transit literature.

Using ITS, GIS, and marketing data sources, proprietary databases, and Relational Database Management Technology as examples, an enterprise database or a data warehouse can be designed from components such as the following:

· Proprietary database(s) implemented as a set of files or defined within RDBMS BLOB for System A. BLOB (binary large object) is another storage mechanism for files within the RDBMS.
· Open database implemented within RDBMS for System B, which is an ITS system.
· System C, which is an RDBMS ITS system, produces copious amounts of time-based data, which are stored in a database.
· A GIS spatial database implemented in RDBMS with a spatial database manager "on top of" RDBMS (e.g., ESRI's SDE [spatial database engine] or Oracle Spatial technology).
· External datasets supplied in relational tables or flat files with known record definitions. This could be census data, addressing data, or customer data.
· Systems that are considered to maintain and house confidential data or secured data in their databases (data marts) in proprietary or open database technology.

Since System A has a proprietary database, data from System A will need to be exported or requested from System A in order to feed an enterprise database. In this case, there would be duplicate data--data stored in the proprietary data mart and in the data warehouse. In cases like this, as well as others, clear data ownership needs to be defined amongst the enterprise user base. If more than one system needs to be able to update attributes or define new entities, then a complex data update process needs to be defined. Since many of the ITS systems are proprietary and
support their own data structures, data will need to be duplicated and organized in a format for enterprise use.

In the case of ITS System B, its data mart is already in an open database platform (e.g., Oracle). If properly defined and of the necessary quality to support the enterprise database, System B's data may be logically defined as part of the enterprise database without duplicating the data. This approach works for ITS systems that are proprietary but offer the ability to report data out to an RDBMS.

In the case of ITS System C, massive amounts of data are produced and used by department(s). The data in System C may not necessarily be directly defined as part of the enterprise database, even if it resides in RDBMS. The enterprise organization may not require the retrieval of the specific data stored in System C. To support the enterprise, data from System C's data mart may need to be processed or summarized and then exported to the enterprise database for use by other units, such as marketing. One example would be AVL data. Typically marketing would not want all the AVL data because of the database size and its organization for operations. In this case, marketing would get summaries of the information it does want rather than sorting through reams of data to pick out what it wants.

In an enterprise GIS implementation, and depending on the GIS technology, GIS data can be stored in an enterprise database, such as Oracle. However, to satisfy business requirements of a GIS (e.g., versioned data, spatially indexed data) and to perform GIS analyses on the spatial data, a spatial database manager must store and retrieve the data owned by GIS. Other databases (e.g., event data from mobile data recorders) can be linked to the GIS so that they can be analyzed with spatial functions.

External datasets residing in Oracle or other enterprise database technologies can be integrated into the enterprise system much like System B's data. Census or addressing data are examples.

Data contained in confidential systems (e.g., customer billing data) or in systems needing secure access (e.g., financial data or accounting of fare revenues) will need to be extracted or requested from those systems and stored in accessible tables in the enterprise database. This way the whole database of the confidential system is not being used or entered in any way.

Using these components, the enterprise database would consist of some systems' data marts, data extracted from other systems' data marts, RDBMS-resident data from some systems, and spatial data from GIS. The resulting enterprise database can be defined as the enterprise data warehouse, the combination of many different databases across an entire enterprise. Figure 4-1 depicts the methods for populating the data warehouse using these examples.

Access to the Data in the Warehouse

Data can be written to, read from, or updated in the data warehouse using various methodologies. Systems can integrate with the data warehouse using various methods.
[Figure 4-1 is a diagram of the enterprise data warehouse (e.g., Oracle) and the sources that populate it:
· System A (proprietary data mart): a process in System A extracts data and exports it for loading into the warehouse.
· System B (ITS) data mart: a subset of System B's data is available as part of the warehouse.
· System C (ITS) Oracle data mart: System C extracts data from its large data mart and pre-processes it (summarize, reduce, etc.) for storage in the data warehouse.
· GIS data mart with spatial data manager: a subset of the GIS data mart is available as part of the warehouse.
· Customer system (CS) data mart holding confidential information: a subset of customer data is stored in the warehouse.
· External sources (e.g., census data): files are brought in through a load process.]
Figure 4-1. Population of the data warehouse.
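
As a rough illustration of three of the population methods in Figure 4-1, the following Python sketch uses SQLite as a stand-in for an enterprise RDBMS. It is a minimal sketch under stated assumptions, not a recommended implementation: every table name, column name, and data value is hypothetical and chosen only for the example, and a single in-memory database stands in for what would in practice be separate systems. System A's data are copied in by a load step, System B's data are exposed through a view so that no second copy is stored, and System C's raw AVL events are summarized before anything is written to the warehouse.

```python
import sqlite3


def build_demo_warehouse():
    """Populate a toy warehouse three ways, mirroring Systems A, B, and C."""
    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # System A (proprietary): records arrive as an export, so they are
    # copied (duplicated) into a warehouse table by a load step.
    # The rows below are made-up sample values.
    system_a_export = [("R1", "2009-03-02", 412), ("R2", "2009-03-02", 188)]
    cur.execute(
        "CREATE TABLE wh_a_boardings (route TEXT, day TEXT, boardings INTEGER)"
    )
    cur.executemany("INSERT INTO wh_a_boardings VALUES (?, ?, ?)", system_a_export)

    # System B (open RDBMS): its table is exposed through a view, so the
    # warehouse sees a subset of columns without storing a second copy.
    cur.execute("CREATE TABLE b_stops (stop_id TEXT, name TEXT, internal_note TEXT)")
    cur.executemany(
        "INSERT INTO b_stops VALUES (?, ?, ?)",
        [("S1", "Main St", "x"), ("S2", "Depot", "y")],
    )
    cur.execute("CREATE VIEW wh_b_stops AS SELECT stop_id, name FROM b_stops")

    # System C (high-volume AVL): raw events stay in the data mart; only a
    # per-route summary is written to the warehouse.
    cur.execute(
        "CREATE TABLE c_avl_events (route TEXT, stop_id TEXT, delay_sec INTEGER)"
    )
    cur.executemany(
        "INSERT INTO c_avl_events VALUES (?, ?, ?)",
        [("R1", "S1", 60), ("R1", "S2", 120), ("R2", "S1", 0), ("R2", "S2", 30)],
    )
    cur.execute(
        """
        CREATE TABLE wh_c_route_delay AS
        SELECT route,
               COUNT(*) AS observations,
               AVG(delay_sec) AS avg_delay_sec
        FROM c_avl_events
        GROUP BY route
        """
    )
    con.commit()
    return con


con = build_demo_warehouse()

# What a marketing user would query: the summary, not the raw AVL events.
summary = con.execute(
    "SELECT route, observations, avg_delay_sec FROM wh_c_route_delay ORDER BY route"
).fetchall()

# System B's stops as seen through the warehouse view.
stops = con.execute("SELECT * FROM wh_b_stops ORDER BY stop_id").fetchall()
```

The view used for System B reflects later changes in the source table automatically, whereas the System A and System C tables hold copies and would need a scheduled reload or refresh. That difference is one reason the data ownership and update questions raised above need to be settled for the duplicated data.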