CHAPTER 4
Data Management, Reporting, and Staffing Considerations
Effective use of ITS data to support and leverage transit market research depends on the development and maintenance of a diverse supporting infrastructure. Key elements of this infrastructure include an information system for archiving ITS data; enterprise-level applications, the most important of which in the context of this Guidebook is GIS; processes for screening and validating ITS data to ensure its integrity; software tools that support reporting and analysis; and human resources with the skills to maintain the infrastructure and, through analysis of ITS data, inform strategic decisions in marketing and other agency functions. This chapter addresses these infrastructure elements, which collectively stand between ITS technologies and decisionmaking. When this infrastructure is properly developed and supported, the potential of ITS data to support research, inform decisions, and benefit customers can be realized. When the infrastructure includes gaps, inconsistencies, or incompatibilities, the potential benefits can be entirely lost.

The presentation in this chapter draws on information recovered in the 2005 survey of properties that had deployed or were planning to deploy the technologies covered in this Guidebook. It also draws on the experiences of the three case study properties, which collectively have succeeded in bridging the technology-decisionmaking gap. Finally, the chapter draws on the literature related to each of the infrastructure elements.

Enterprise Database Architecture Supporting the Use of ITS Data

ITS technologies have the potential to provide substantial information to support and leverage transit market research. Among the transit agencies surveyed in 2005, the potential to provide information for market research was often reported to be blunted by problems with managing ITS data in a manner that allows marketing and others within the organization to readily use the data. Many of the properties surveyed did not have the capability to validate, organize, retrieve, share, and analyze the ITS data collected. Marketing was found to often rely on the "trickle down" of information collected by others. Thus, the frequently compromised capability to manage ITS data across the organization represents a crucial roadblock for market researchers seeking access to data. This section is intended to provide an overview and direction with respect to developing the enterprise or organizationwide tools and architecture for managing ITS data.

Current Data Practices

At many of the properties surveyed, the collection of ITS data is done on a departmental or project-specific basis. This information is used for specific applications and is not stored in a manner that allows others, such as marketing, to easily retrieve or share the information. While information can theoretically be shared, it usually requires technical expertise and a level of effort that marketing typically does not have. There are a number of factors contributing to this problem, including the following:

· Lack of budget
· Lack of information technology (IT) tools, such as databases
· Problems integrating proprietary ITS systems
· Lack of expertise and technical support
· Systems that are not capable of organizationwide data sharing
· Poor or undeveloped relationships and communication between IT departments and other departments
· Business practices and structures that have not adapted to the emerging ITS environment and lack direction

To varying degrees, all the transit agencies surveyed in 2005 were dealing with one or more of these constraints. Most of the problems carry over from those identified in an earlier survey of larger transit properties (Boldt 2000), which foresaw difficulties in accommodating expanding deployment of ITS
technologies within the prevailing IT environment. Addressing these constraints will require planning, time, effort, and resources as each organization evolves its IT function. The following outline of a data architecture and strategy is intended to provide an overall direction for properties that are facing these problems.

Enterprise Data Management

This section provides general direction and guidelines for organizing ITS and other data to be used by practitioners. It defines an enterprise or an organizationwide approach. The guidelines follow general data management principles and provide a suggested architecture that has been discussed in the transit literature. Among the Guidebook's case studies, TriMet serves as a good example of a transit organization that has developed an enterprise data management architecture.

A first step in enterprise data management is to develop a database design. The database design is a detailed plan for organizing and structuring data maintained across the organization. This is also called a database schema, with depictions of how the data are diagrammed and charted.

There are several reasons for developing a database design. Database designs facilitate organizing, storing, retrieving, and sharing the information across the organization. Basically, the database design shows how the whole organization will use, share, and sustain data over time when incorporated into database software. In order for a computer system to receive information from ITS or other systems, and process data in a consistent manner, the data that it receives from one or more places must be defined. The data must be defined such that the computer system knows what a data field means, in what format the data are stored, and where it should store results. This is the purpose of a data model; it defines the organization of the information such that systems can understand what data reside therein, where they reside, how they are stored, and how the data are related. Data models not only allow data sharing, they allow computer systems to work together. Fundamentally, the data model definition and how well the data model is defined will determine how well a computer system operates and how well it can share data with other systems and business domains. It is essential to define a well-functioning database in order to write applications to address business problems. If a data model is not well defined, efforts to computerize problem solving or to implement business improvements will likely be difficult and expensive, if not impossible.

The process of defining a data model passes through three stages:

· Conceptual definition
· Logical model
· Physical data model

Conceptual definition consists of diagramming the data and relationships among the data to be stored. It is a starting point for organizing the data. A logical data model finalizes entities, fields, and relationships as tables, columns, relationships, and constraints. Given a logical data model, a physical database model can be created in a commercial database software product, such as Oracle or SQL Server.

Fortunately, in defining a data model, there are commercially available data modeling methodologies, computerized tools (e.g., computer-aided software engineering [CASE] tools), and an experienced skill set to facilitate the development. Even with commercial off-the-shelf (COTS) software products, designing a data model can take significant time. Depending on which methodology and tool(s) are chosen, starting at the very beginning involves first defining a conceptual model, then a logical model, followed by a physical model. However, given the industry examples, rarely does the process of defining an enterprise data model start from scratch with the conceptual model. Much work has already been performed in defining "starting point" models that can be used as inputs to defining an enterprise transit GIS-ITS-enterprise data model. Examples of these starting points are listed as follows:

· GIS vendors publish industry standard models defined by consortiums. For example, ESRI public models include land, transportation, and address models (ESRI 2007).
· Most IT organizations will have staff expertise and CASE tools to assist in developing data models.

Since the process of designing an enterprise data model does not have to re-invent the wheel, a transit agency can benefit from what has been defined and proven in previous efforts. A typical approach in using existing models is one where the data maintained and used by an organization's current and future processes (as foreseen or as defined in a "to-be" model) are compared with the starting point model(s). Gaps in what is defined in one or more of the starting point models are addressed as needed to define the target data model. This process is called "gap analysis." Other inputs to the gap analysis process are the data requirements for existing or planned computerized systems.

The enterprise data model design and system integration process must have a clear definition of the extent, or scope, of the data and systems that will initially be contributing data to and using data from the enterprise data model. In addition, the process needs to consider the data requirements of other business areas that will be included in the future. If a well-defined and bounded scope is not established, the data modeling process could stretch out indefinitely.

It is best to start an enterprise data and system integration process/project involving a limited set of business areas in
order to prove the model and the integration methods prior to rolling the enterprise data model and associated system integration into other areas of the organization. In other words, "start small and smart." For example, an initial scope for an enterprise data model and system integration project could be defined as one that includes a subset of ITS data systems, GIS, and the use of ITS and GIS data by marketing personnel. In addition, the scope would include the impact of the enterprise definition on other business units currently using ITS systems and GIS, such as Operations.

Database Design and GIS Data

Typically, GIS data are also stored in an enterprise or relational database management system (RDBMS). GIS information, which is spatial in nature and large in volume, requires special consideration as a major aspect of the database design. In the GIS arena, physical entities are defined as objects with attributes, relationships, constraints, and behaviors. Most GIS objects have spatial characteristics: geometries (points, lines, or polygons) and network topology. The network topology defines what object is connected to what other objects, as in a street network. In a street network, for example, the GIS database and spatial data manager knows that Main Street intersects with 1st Avenue at a specific location. A GIS supports the application of geometric functions (e.g., union, intersection of spatial objects). For example, a GIS can determine the spatial relationships of boundary objects, such as how many stops are contained within a particular zone.

An object in an enterprise data model may have attributes that are defined in multiple systems. In order to relate all pieces of information about a specific object/entity from multiple systems, a common identifier is critical. A common identifier uniquely distinguishes one occurrence of an object from every other occurrence. The identifier may be made up of one or more fields in a database. For example, a stop location has an identifier that distinguishes it from all other stop locations.

Given the spatial and object-oriented nature of GIS and ITS data, integration of the two offers a great opportunity for analysis. In addition, other external databases can be linked to an enterprise database for graphic display or analyses (e.g., census data, demographic datasets, or address data). Marketing may also have other data sources, currently used solely for their purposes, that have a "locational" characteristic that can be integrated with ITS and GIS data and provide the capability for analyses.

The broad capabilities available with GIS spatial analysis tools and the data supplied by ITS technologies are ideal for application in the marketing arena. Some of the geographic and/or graphic analysis capabilities that a GIS can provide are

· Thematic presentation,
· Spatial overlays,
· Proximity analysis,
· Point density visualization,
· Customer location,
· Event or incident analysis,
· Market segments or characteristics, and
· Direct mail marketing.
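
As an illustration only, the following Python sketch shows the kind of spatial query described above: counting the stops contained within a zone and, through a common stop identifier, relating ITS boardings to those stops. The stop identifiers, coordinates, zone boundary, and boarding counts are hypothetical, and the coordinates are assumed to be planar values in a single coordinate system. An agency would normally rely on the spatial functions of its GIS or RDBMS rather than hand-written geometry.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon.

    Points lying exactly on an edge are not treated specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def stops_in_zone(stops: Dict[str, Point], zone: List[Point]) -> List[str]:
    """Return the identifiers of all stops located inside the zone."""
    return sorted(sid for sid, xy in stops.items() if point_in_polygon(xy, zone))


# Hypothetical GIS layer: stop locations keyed by a common stop identifier.
stops = {
    "S100": (1.0, 1.0),
    "S101": (3.0, 2.0),
    "S102": (6.0, 1.0),
    "S103": (2.0, 5.0),
}

# Hypothetical zone boundary (a simple square polygon).
zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]

# Hypothetical ITS data (e.g., passenger counts) keyed by the same identifier.
boardings = {"S100": 120, "S101": 45, "S102": 300, "S103": 10}

in_zone = stops_in_zone(stops, zone)
zone_boardings = sum(boardings[sid] for sid in in_zone)

print(f"Stops in zone: {in_zone}")
print(f"Boardings in zone: {zone_boardings}")
```

With this sample data, stops S100 and S101 fall inside the zone and their boardings total 165. The join works only because both datasets carry the same stop identifier, which is the role of the common identifier in the enterprise data model.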
Enterprise data that are typically stored within a GIS include the objects and their corresponding data plus a "land base" that provides a base map, which serves as a foundation on which transit-specific information can be overlain. Examples of data typically found in a land base include

· Streets and a street network;
· Aerial photography;
· Elevation contours;
· Buildings;
· Boundary data: zoning, blocks, states, county, tax, zip codes, lots, parcels; and
· Census boundary information and census data.

Combining ITS information and GIS information in the same database allows the ITS data to be related and analyzed by location. It should be noted that location values are defined in a specific coordinate system. If two or more systems contributing data to an enterprise database contain location coordinate values defined in different coordinate systems or projections, the differences can be reconciled with coordinate transformations or on-the-fly projections.

Using the Database

The actual data model is structured within a commercial RDBMS. The RDBMS serves as a repository for the data, as an analytical tool for retrieving information, and as a mechanism for exchanging data between systems.

Several of the ITS systems produce information in the format of an RDBMS. This allows data, after validation or post-processing, to be transported into the RDBMS for storage and analysis. GIS systems use the RDBMS as a storage and retrieval mechanism for data as well. In itself, the RDBMS can serve as a powerful standalone tool for retrieving ITS information and analyzing marketing data. When combined with GIS, it is a very powerful tool that can map the data as well.

The main drawback of these powerful tools is that they are not "casual user" friendly and typically take considerable technical training or trained staff to manipulate the data. Most of the transit agencies interviewed in the course of selecting the case study properties for this Guidebook were hiring or cultivating their own staff to orchestrate both RDBMS and GIS in an attempt to "get at the data." Other agencies were relying on