their IT staff to facilitate analysis of the data. The key challenge is how to put this information in the hands of casual users, like marketing staff, without expecting them to become expert users or continually relying on a few expert users to produce the desired information.

The way to address this critical challenge is to develop menu-driven applications for both the RDBMS and the GIS that allow casual users to use the systems in a more "push button" environment. This approach will require working with IT or having technical resources set up this capability. A key objective should be to allow more people within the organization to access and analyze ITS data. Presently, the fact that these systems are expert systems has been a major constraint in getting the ITS and other data out to the people who can use it in the organization. ITS data will never realize its full potential if it cannot be used effectively by more people. At present, much of the market research potential of ITS data has been constrained, but could be leveraged significantly by making the data readily available within a data management architecture that facilitates additional users.

Architectures Supporting Data Integration

Once the data have been organized and stored in a database, an architecture that will automatically provide for the integration of data between various ITS and other systems can be defined. A database architecture, like a data warehouse, can provide for automated data collection, storage, archiving, and retrieval. This avoids the need for manual loading of all the information and ensures that current data are available. While this architecture takes some expertise and effort to set up, it will pay dividends to the users and also ensure a more sustainable system, costing less in the long run and saving the time and effort of continually loading data.

The intent of this section is to provide an overview of database architectures that support enterprise data. It will present examples of the most common types of data transactions and how they are set up in relation to data architectures, such as data warehouses and data marts. Several database management terms will be used throughout this section. Defining the terms up front, as well as in context, may be helpful. Various definitions exist for these terms. However, for the purposes of this section, the following definitions are used:

· Data Mart--a database that is most likely a departmental database that was designed to support a particular business function. Usually, each department or business unit is considered the "owner" of its data mart and provides for maintenance of the data. This enables each department to use, manipulate, and develop its data without altering information inside other data marts or the data warehouse.
· Data Warehouse--a collection of individual databases across departments and business functions, specifically structured for organizationwide access to the data. The term "enterprise database" is analogous to data warehouse.
· Data Repository--a common term; for the purposes of this section, no distinction is made between a data warehouse and a data repository.
· Federated Database--a federated database is much like a data warehouse except that the implementation of the federated database integrates multiple autonomous databases into what looks like a single database. A federated database implementation provides a front end interface that stores or retrieves required data from all source data marts using a single query. While this is an option and has advantages for some organizations, it is not discussed here as a mainstream option for transit agencies due to the level of technology required for construction and sustainability. The definition is included because it is discussed in the transit literature.

Using ITS, GIS, and marketing data sources, proprietary databases, and Relational Database Management Technology as examples, an enterprise database or a data warehouse can be set up from components such as the following:

· Proprietary database(s) implemented as a set of files or defined within RDBMS BLOB for System A. BLOB (binary large object) is another storage mechanism for files within the RDBMS.
· Open database implemented within RDBMS for System B, which is an ITS system.
· System C, which is an RDBMS ITS system, produces copious amounts of time-based data, which is stored in a database.
· A GIS spatial database implemented in RDBMS with a spatial database manager "on top of" RDBMS (e.g., ESRI's SDE [spatial database engine] or Oracle Spatial technology).
· External datasets supplied in relational tables or flat files with known record definitions. This could be census data, addressing data, or customer data.
· Systems that are considered to maintain and house confidential data or secured data in their databases (data marts) in proprietary or open database technology.

Since System A has a proprietary database, data from System A will need to be exported or requested from System A in order to feed an enterprise database. In this case, there would be duplicate data--data stored in the proprietary data mart and in the data warehouse. In cases like this, as well as others, clear data ownership needs to be defined amongst the enterprise user base. If more than one system needs to be able to update attributes or define new entities, then a complex data update process needs to be defined. Since many of the ITS systems are proprietary and
support their own data structures, data will need to be duplicated and organized in a format for enterprise use.

In the case of ITS System B, its data mart is already in an open database platform (e.g., Oracle). If properly defined and of the necessary quality to support the enterprise database, System B's data may be logically defined as part of the enterprise database without duplicating the data. This approach works for ITS systems that are proprietary, but offer the ability to report out data to RDBMS.

In the case of ITS System C, massive amounts of data are produced and used by department(s). The data in System C may not necessarily be directly defined as part of the enterprise database, even if it resides in RDBMS. The enterprise organization may not require the retrieval of the specific data stored in System C. To support the enterprise, data from System C's data mart may need to be processed or summarized and then exported to the enterprise database for use by other units, such as marketing. One example would be AVL data. Typically marketing would not want all the AVL data because of the database size and its organization for operations. In this case, marketing would get summaries of the information they do want rather than sorting through reams of data to pick out what they want.

In an enterprise GIS implementation, and depending on the GIS technology, GIS data can be stored in an enterprise database, such as Oracle. However, to satisfy business requirements of a GIS (e.g., versioned data, spatially indexed data) and to perform GIS analyses on the spatial data, a spatial database manager must store and retrieve the data owned by GIS. Other databases (e.g., event data from mobile data recorders) can be linked to the GIS so that they can be analyzed with spatial functions.

External datasets residing in Oracle or other enterprise database technologies can be integrated into the enterprise system much like System B's data. Census or addressing data are examples.

Data contained in confidential systems (e.g., customer billing data) or in systems needing secure access (e.g., financial or accounting of fare revenues) will need to be extracted or requested from those systems and stored in accessible tables in the enterprise database. This way the whole database of the confidential system is not being used or entered in any way.

Using these components, the enterprise database would consist of some systems' data marts, data extracted from other systems' data marts, RDBMS-resident data from some systems, and spatial data from GIS. The resulting enterprise database can be defined as the enterprise data warehouse, the combination of many different databases across an entire enterprise. Figure 4-1 depicts the methods for populating the data warehouse using these examples.

Access to the Data in the Warehouse

Data can be written to, read from, or updated in the data warehouse using various methodologies. Systems can integrate with the data warehouse using various methods.

[Figure 4-1. Population of the data warehouse. Diagram annotations: a process in System A extracts data and exports it for loading into the warehouse; a subset of System B's data is available as part of the warehouse; System C extracts data from its large data mart and preprocesses it (summarize, reduce, etc.) for storage in the data warehouse; a subset of the GIS data mart is available as part of the warehouse.]
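
The three population methods described for Systems A, B, and C can be illustrated with a small sketch. It is only an illustration under stated assumptions, not the report's implementation: SQLite stands in for the enterprise RDBMS (the text names Oracle as an example), and every table name, column name, and data value below is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a file exported by proprietary System A.
SYSTEM_A_EXPORT = "route_id,boardings\n10,420\n22,180\n"


def build_warehouse() -> sqlite3.Connection:
    """Populate a toy warehouse by export/load, logical view, and summary."""
    wh = sqlite3.connect(":memory:")  # assumption: stands in for the enterprise RDBMS
    # System C's large data mart is kept as a separate (attached) database.
    wh.execute("ATTACH DATABASE ':memory:' AS sysc")

    # System A: proprietary, so its data is exported and loaded (duplicated).
    wh.execute(
        "CREATE TABLE wh_system_a_boardings (route_id INTEGER, boardings INTEGER)"
    )
    rows = csv.DictReader(io.StringIO(SYSTEM_A_EXPORT))
    wh.executemany(
        "INSERT INTO wh_system_a_boardings VALUES (?, ?)",
        [(int(r["route_id"]), int(r["boardings"])) for r in rows],
    )

    # System B: already in the open RDBMS, so a view exposes it without copying.
    wh.execute("CREATE TABLE sysb_stops (stop_id INTEGER, stop_name TEXT)")
    wh.executemany(
        "INSERT INTO sysb_stops VALUES (?, ?)", [(1, "Main St"), (2, "Depot")]
    )
    wh.execute("CREATE VIEW wh_stops AS SELECT stop_id, stop_name FROM sysb_stops")

    # System C: copious time-based AVL data; only a per-route summary is stored.
    wh.execute("CREATE TABLE sysc.avl_events (route_id INTEGER, delay_sec INTEGER)")
    wh.executemany(
        "INSERT INTO sysc.avl_events VALUES (?, ?)",
        [(10, 60), (10, 120), (22, 0), (22, 30), (22, 90)],
    )
    wh.execute(
        """
        CREATE TABLE wh_avl_summary AS
        SELECT route_id, COUNT(*) AS n_events, AVG(delay_sec) AS avg_delay_sec
        FROM sysc.avl_events
        GROUP BY route_id
        """
    )
    wh.commit()
    return wh


def route_report(wh: sqlite3.Connection) -> list:
    """A canned query a casual user could run without touching the raw AVL data."""
    return wh.execute(
        """
        SELECT a.route_id, a.boardings, s.avg_delay_sec
        FROM wh_system_a_boardings AS a
        JOIN wh_avl_summary AS s ON a.route_id = s.route_id
        ORDER BY a.route_id
        """
    ).fetchall()


if __name__ == "__main__":
    for row in route_report(build_warehouse()):
        print(row)
```

In this sketch only System A's data is physically duplicated. System B is exposed through a view, and System C contributes only a summary table, which mirrors the point that marketing would receive summaries rather than the full AVL data.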