4 INTEGRATOR DESCRIPTION AND FUNCTIONAL SPECIFICATION

INTRODUCTION

The 1985 Woods Hole Workshop introduced the concept of an integrator as the central function of an integrated data base (IDB). The purpose of this working group is to extend and describe the integrator further. The group reviewed the 1985 workshop concept of the integrator and validated it against the literature in the field and against the work undertaken by others. The group also assessed the status of the prototype IDB as it relates to the function of the integrator, extended the concept of the integrator, and proposed a direction for implementation of the integrator.

This chapter summarizes the findings and recommendations of the working group. First, the IDB and integrator concepts are outlined, and the integrator concept is then viewed in the context of architecture models resulting from outside research. Practical technology and user considerations that influenced the group's recommendations for integrator implementation are then described, followed by a discussion of considerations for implementation. Finally, the group's proposal for extending the prototype IDB in 1987 is presented.

Members of the working group were: Bob Mahan (Chair), Larry Dyer, Jim Goodland, Bob Heterick, Marie Roberts, Paul Scarponcini, and Dick Zitzmann.

IDB CONCEPT

The IDB for a building project will exist from the very start of the programming and planning phase of the building process, when the idea for the building originates. The IDB will be an active element of the process through the operational and facility management phase (when large cost advantages are expected to result). It may even survive the demolition of the building, since the data would have experiential value to future building projects.

The quantity of stored data will vary over the span of the life cycle. As activities and tasks are completed, the amount of stored data will increase with time, at least in the early phases. Since some data will not have to be stored for the full cycle, and other data will be added as time goes on, the total size of the stored data may stabilize at some point.
It is convenient to envisage the IDB as consisting of three broad components:

1. Data distributed among the currently active files and data base systems of current participants in the building process, but limited to data elements and instances of data elements that are prescribed in advance to be valid parts of the IDB (i.e., subject to inclusion and exclusion rules),
2. A transitional data base that holds data resulting from the tasks of participants no longer active in the process; this collection of data will grow as the building project proceeds and is eventually expected to comprise essentially all of the data of the IDB, and
3. The functional components of the IDB, which are collectively called the "integrator."

INTEGRATOR CONCEPT

The conceptual view of the integrator developed at the 1985 Woods Hole Workshop contains five components: administrator, director, translator, accessor, and communicator.

Administrator

The administrator is responsible for data security and includes functions to control reading, updating, and deleting of data. It also ensures the integrity, consistency, and currency of the data. It includes the management function for concurrent queries or updates, and for recovery and backup in case of system failure. This component contains the rules that govern the inclusion or exclusion of data resulting from building process tasks.

Director

The director contains the global data dictionary, information on where IDB data are stored, file formats, access authorizations, data currency status, and information about the specific physical implementation of the data model.

Translator

The translator contains information on the requirements for different views of data for authorized users of the IDB, as well as the formats of data sources and targets. This component contains the functionality to perform the necessary format conversions.
Accessor

The accessor determines optimum strategies for accessing data in the IDB, which may be distributed. It creates files, deletes files and data, adds data to existing files, and retrieves files and data.

Communicator

The communicator provides for the physical transfer of data among IDB components and between the IDB and user systems.

ARCHITECTURE OF THE INTEGRATOR

At the 1985 Woods Hole Workshop, the concept of the integrator was introduced as the central function of the IDB conceptual model. The five components or subfunctions of the integrator are, in fact, distributed among several components of a distributed data base. Therefore, in describing systems architecture it is more meaningful to redistribute these subfunctions to reflect architectural components that are conventionally part of a distributed data base management system (DBMS).

Architectural Model [1]

Three groups of design goals are commonly applied to distributed, heterogeneous data base systems. The first seeks transparency of the local DBMSs. This implies a common, global user view of all data to be shared in the distributed system, a common data manipulation language, and a system-wide data dictionary. The second group, which seeks to maintain the autonomy of all DBMS sites, requires that the local operation of a site be unaffected by global operations and that the addition of new sites or DBMSs be facilitated. Local DBMS optimization and data base organizations should be unaffected by global operations. The third group, which deals with communications among sites and DBMSs, suggests the use of standard communications protocols over conventional communications networks or facilities. However, special data base-oriented protocols may be necessary to preserve semantic integrity among some heterogeneous DBMSs.

[1] This section is an abstract of the article "Interconnecting Heterogeneous Database Management Systems," by Virgil D. Gligor and Gary Luckenbaugh, University of Maryland, published in the January 1984 issue of COMPUTER, the Institute of Electrical and Electronics Engineers, Inc. The article is a result of an analysis of existing approaches to interconnecting heterogeneous DBMSs.
The architectural model for the integrator, which is illustrated in Figure 4-1, consists of three sublayers of the application layer of the ISO reference model plus the data communication protocol layer. A summary description of the three sublayers follows.

Global Data Manager

The global data manager (GDM) maps local data views into the global data view and vice versa, and performs all associated input/output (I/O) operations. The input to the GDM is a query or transaction based on the global model, which the GDM transforms into the query language of the local DBMS or, when more than one site is involved, into a set of subqueries in the appropriate query languages. The output of the GDM is a plan of subquery execution, which the GDM passes to the sites involved via the communications network. Responses to the subqueries are then assembled by the GDM and returned to the user in global schema terms.

The GDM requires five functions: (1) global data model analysis, (2) query decomposition, (3) query translation, (4) execution plan generation, and (5) results integration. The global schema may be stored in a distributed data dictionary along with the local schemas. Global query translation may include the translation of a query based on a particular user's view of the data into a query based on the global schema; another translation based on the local schema may then be performed.

Distributed Transaction Manager

The distributed transaction manager (DTM) receives subqueries from the execution plan generator of the GDM and controls the execution of those queries and transactions. It ensures the consistency of the data base by providing a hierarchy of concurrency control and recovery control based on a two-phase commit protocol. For reliable recovery control, the coordination process that supports the two-phase protocol must be crash resilient. In most cases, the local DBMS will have to be modified to support the "prepared state" of the two-phase protocol.

Structured Data Transfer Protocols

Structured data transfer protocols (SDTPs) support the services of the DTM and require the services of the data presentation protocol and of other application layers such as those of the file transfer protocol. They are application-level protocols required for the interconnection of remote heterogeneous DBMSs. The SDTPs themselves use the data communications protocols and can be regarded as extensions of such protocols.
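The two-phase commit protocol on which the DTM relies can be illustrated with a minimal Python sketch. It is not taken from the working group's material: the `LocalSite` class, its methods, and the in-memory `log` list (standing in for the stable storage that a crash-resilient coordinator would need) are all hypothetical names introduced here for illustration.

```python
from enum import Enum


class Vote(Enum):
    READY = "ready"
    REFUSE = "refuse"


class LocalSite:
    """Hypothetical stand-in for a local DBMS that supports a prepared state."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self, subquery):
        # Phase 1: hold the work in a prepared state without making it permanent.
        if self.can_commit:
            self.state = "prepared"
            return Vote.READY
        self.state = "aborted"
        return Vote.REFUSE

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(plan, log):
    """Run a plan given as (site, subquery) pairs; return True if committed."""
    # Phase 1: ask every site to prepare and collect the votes.
    votes = [site.prepare(subquery) for site, subquery in plan]
    decision = all(vote is Vote.READY for vote in votes)
    # Record the decision before phase 2 (here only in an in-memory list).
    log.append("commit" if decision else "abort")
    # Phase 2: tell every site the common outcome.
    for site, _ in plan:
        if decision:
            site.commit()
        else:
            site.abort()
    return decision
```

A single refusing site causes every site in the plan to abort, which is the property that keeps the distributed data base consistent. The "prepared" state in `LocalSite.prepare` corresponds to the prepared state that the text says most local DBMSs would have to be modified to support.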
[Figure 4-1: layered diagram. Labels recoverable from the figure, in the order extracted: Global Query Formulation; Uniform Query Language; Global Schema; Global Query Translation and Decomposition; Global Data Management; Distributed Transaction Management; Global-Local Subquery Mapping (Translation); Global-Local Schema Mappings (Translation); Resource Management; Results Integration; Execution Plan Generation; Local Transaction Coordination; Interaction with Local DBMS; Local-Resource Coordination (Interface with Data Managers); Assembly of Individual Results; Structured Data Transfer Protocols; Command Transfer Protocols; Structure Transfer Protocols; Type and Format Translation Protocols; Data Communication Protocols; File Transfer Protocol; Data Presentation Protocol; Network Interprocess Communication Protocol.]

FIGURE 4-1 Functional requirement areas.
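The five functions of the global data manager (global data model analysis, query decomposition, query translation, execution plan generation, and results integration) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a "query" is reduced to a list of requested field names, and the schema contents, site names, and function names are hypothetical.

```python
# Hypothetical global schema: global field name -> (site, local field name).
GLOBAL_SCHEMA = {
    "room_area": ("design_db", "AREA_SQFT"),
    "room_name": ("design_db", "RM_NAME"),
    "unit_cost": ("cost_db", "COST"),
}


def analyze(fields):
    """Global data model analysis: reject fields unknown to the global schema."""
    unknown = [f for f in fields if f not in GLOBAL_SCHEMA]
    if unknown:
        raise KeyError(f"not in global schema: {unknown}")


def decompose(fields):
    """Query decomposition: group the requested global fields by site."""
    by_site = {}
    for field in fields:
        site, _ = GLOBAL_SCHEMA[field]
        by_site.setdefault(site, []).append(field)
    return by_site


def translate(global_fields):
    """Query translation: replace global field names with local ones."""
    return [GLOBAL_SCHEMA[f][1] for f in global_fields]


def make_plan(fields):
    """Execution plan generation: one (site, local fields) subquery per site."""
    analyze(fields)
    return [(site, translate(fs)) for site, fs in sorted(decompose(fields).items())]


def integrate(plan, responses):
    """Results integration: rename local fields back to global schema terms."""
    to_global = {(s, local): g for g, (s, local) in GLOBAL_SCHEMA.items()}
    merged = {}
    for site, local_fields in plan:
        for local in local_fields:
            merged[to_global[(site, local)]] = responses[site][local]
    return merged
```

For example, `make_plan(["room_name", "unit_cost", "room_area"])` yields one subquery for `cost_db` and one for `design_db`, and `integrate` returns the combined answer keyed by the global field names the user originally asked for.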
PRACTICAL TECHNICAL AND USER CONSIDERATIONS

Several practical considerations could affect the integrator implementation plan. Some of these are:

1. Integrators that enable heterogeneous hardware, applications, and data bases to operate together in a coordinated fashion may exist in other areas of industry and could have applicability to the needs of the building industry.
2. Some pieces of the integrator in a distributed, heterogeneous environment are resident in the various hardware and software components comprising that environment.
3. A key purpose of the integrator functionality is to provide the user with an interface to the available information at any stage in the life cycle of a building. Its function in this role is to answer user queries. Transparency to the user is an ultimate goal. Available information is that which has been released by the professional involved in each step of the building process. The integrator is the mechanism for the owner to affect program management.
4. A logical sequence of steps to design an IDB shows that the integrator cannot be functionally specified until the last two steps. The steps are as follows:
   a. Identify the data items required by every application and define the relationships among the data items,
   b. Identify the source and format of data in every application,
   c. Define a conceptual data base to satisfy the data requirements of all applications,
   d. Define user requirements for data update and selection to and from the data base,
   e. Based on (c) and (d), define the necessary software for data base update and selection,
   f. Identify the hardware and operating software under which the data base must be operational, and identify the data base manager to be used,
   g. Design the data base and the integrator, and
   h. Develop the integrator and any support software required.
5. The actual implementation of the integrator depends on the hardware, software, communication, applications, and data base components available throughout the building life cycle. The cost to produce an integrator is a function of these components and will likely decrease with the adoption of building industry standards and protocols for communication and translation functions (i.e., common languages, translation to neutral file formats, and communication protocols). The adoption of standards in these areas would directly affect the cost and performance of implementing the communicator, accessor, and translator functions. The functions of the administrator and the director will remain dependent on the needs of individual owners and operators.
6. The integrator should be viewed as a platform on which various functional modules can be added. It should be flexible, programmable, and data-driven (i.e., hard and fast linkages should not be defined or developed).
7. The role of the prototype IDB demonstration has been to help understand the integration needs of the building industry. The development of a fully operational integrator is beyond the scope of this working group. The prototype IDB demonstration was instrumental in showing requirements for data exchange, but the integrator functions were not directly addressed.
8. An extended prototype IDB demonstration for the 1987 Woods Hole Workshop should superimpose the owner's needs through an integrator. The extended prototype should use the existing prototype's hardware, software, and applications, and augment these with other products, if appropriate. It should isolate and demonstrate integrator functions. An extended prototype must be able to demonstrate to participants in the building process that there is a direct benefit. To do this, the following are suggested:
   a. Present clearly, concisely, and realistically the strategy for development, implementation, utilization, maintenance, and growth of the IDB. Show immediate possibilities as well as the future potential, and
   b. Demonstrate by prototype the possible system functions at critical junctures in the building process, with an emphasis on benefits.

IMPLEMENTATION CONSIDERATIONS

Neutral File Concept

An exchange file concept that converts data into a neutral file format addresses the problem of multiple translators at each node in the IDB network. The integrator would activate the appropriate translator program to convert unique application data files into the neutral file format. Applications requiring these data would then convert from the neutral file format into their own format. This conversion would be done by the application interface portion of the integrator, not by the application itself. The neutral file format could also be used by the IDB for global data base activity.

Menu Implementation

The integrator should provide all interactive user options in menu format. A menu selection will result in application-to-application interaction, data interchange (including translation), execution of output functions, and performance of utility and support functions.
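The neutral file concept can be illustrated with a short Python sketch. With n application formats, direct exchange needs up to n(n-1) one-way translators, whereas exchange through a neutral format needs at most 2n (one into and one out of the neutral format per application). The two application formats, the neutral record layout (a list of dictionaries), and all names below are hypothetical and chosen only to make the idea concrete.

```python
def cad_to_neutral(text):
    """Hypothetical CAD export: one 'name,area' pair per line."""
    records = []
    for line in text.strip().splitlines():
        name, area = line.split(",")
        records.append({"name": name, "area": float(area)})
    return records


def neutral_to_estimating(records):
    """Hypothetical estimating input: 'AREA=<value>;NAME=<value>' per line."""
    return "\n".join(f"AREA={r['area']};NAME={r['name']}" for r in records)


# The integrator looks translators up by application name, so adding a new
# application means registering two functions rather than one per partner.
TO_NEUTRAL = {"cad": cad_to_neutral}
FROM_NEUTRAL = {"estimating": neutral_to_estimating}


def exchange(data, source, target):
    """Convert via the neutral format; the applications never see each other."""
    neutral = TO_NEUTRAL[source](data)
    return FROM_NEUTRAL[target](neutral)
```

Calling `exchange("Lobby,400\nOffice,120.5", "cad", "estimating")` converts the CAD-style text into the estimating-style text without either application knowing the other's format, which matches the text's point that the conversion belongs to the integrator rather than to the applications.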
Application Interface

The application interface is the user's view into the IDB; it is a part of the integrator. The application interface responds to all menu prompts, processes command language requests, and enables the appropriate operating system functions to be accomplished. The application interface is functionally part of the integrator, but is physically evident to the user.

Translator

The translator is activated by the integrator in response to a request issued by a menu prompt from the requesting application. The translator converts data into or from the neutral file format from or to the specific format of the application issuing the request.

PROPOSED PROTOTYPE IDB EXTENSION

The 1986 prototype IDB demonstrated the capability of interaction of dissimilar systems in the architect-engineer-constructor continuum. The connections between systems were customized and made no attempt to use generic integrator concepts. Further expansion of the prototype should attempt to incorporate some integrator approaches.

Perhaps the simplest technically meaningful demonstration of integrator concepts occurs relative to a query for data. For purposes of the prototype IDB, incorporation of integrator concepts can be demonstrated with the existing prototype's hardware. It may be clearest to view the inter-system communication as an extended electronic mail system. Even at only this single level, all five components of the integrator are present. Queries are dispatched via mail and responses are returned via mail. The communicator, for demonstration purposes, is then only the linkage between the two systems. The administrator is likely to be quite complex (and beyond the scope of next year's prototype), perhaps evolving into a knowledge-based system. The focus is then on the director, accessor, and translator functions.
Conceptually, the IDB might appear as follows:

[Figure: conceptual layout of the prototype IDB. The only label recoverable from the figure is "E-Mail/Director (Probably VAX-VMS)".]

One possible scenario would involve:

1. Composing a mail item on the AT that contains a data base query,
2. Dispatching that item to the mail system,
3. Routing that item to the director,
4. Forwarding the item from the director to the ORACLE data base,
5. Translating the query into well-formed ORACLE commands,
6. Accessing the ORACLE data base,
7. Reformatting the response to the interchange specification,
8. Enclosing the response in a mail item directed to the mail system, and
9. Forwarding the mail item to the AT.

To implement this scenario, it is necessary to operate a mail system (probably on the VAX). The format of the mail envelope, and the location of the accessor command within the contents, need to be determined. The director (one or more pieces of software, probably resident on the VAX) needs to receive the mail item and parse the accessor command (written in SQL as a canonical language). The director will ascertain the appropriate destination (an ORACLE data base on the VAX) and forward the item to a translator (an ORACLE front end), which will again parse the command and reconstruct it as one or more well-formed ORACLE commands.
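The outbound half of this scenario (steps 1 through 4) can be sketched in Python as a mail item with an envelope and a body, plus a director that reads the accessor command and forwards the item. The envelope fields, the mailbox names, the directory contents, and the deliberately crude way of finding the table name are all assumptions made for illustration; the source does not define them.

```python
from dataclasses import dataclass


@dataclass
class MailItem:
    sender: str     # envelope: who issued the query
    recipient: str  # envelope: where the item is going next
    body: str       # contents: the accessor command in canonical SQL


# Hypothetical directory: table name -> mailbox of the data base front end.
DIRECTORY = {"ROOMS": "oracle-frontend@vax"}


def table_of(command):
    """Very rough reading of a single-table SELECT: the word after FROM."""
    words = [w.upper() for w in command.replace(";", " ").split()]
    return words[words.index("FROM") + 1]


def direct(item):
    """Director: look up the destination and forward the item unchanged."""
    destination = DIRECTORY[table_of(item.body)]
    return MailItem(sender=item.sender, recipient=destination, body=item.body)
```

A query such as `MailItem("user@pc-at", "director@vax", "SELECT NAME, AREA FROM rooms;")` would be re-addressed to `oracle-frontend@vax` with its sender and body preserved, so the response can later be mailed back to the original requester.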
The ORACLE data base is queried and the response is passed to a translator (an ORACLE back end) that reformats the response to the global data interchange specification. That specification would need to precede the response with information such as the number of records, the number of fields, and the field delimiters of the response, with the data in something like row-major order. This response is then placed in the appropriate mail envelope and sent back to the requester via the mail system. The requester (the PC/AT) then needs to receive the mail item, strip the mail envelope, and translate the response (using the DOS front end) into an appropriate format for the application executing on the AT.

In this simple scenario it can be assumed that the director has knowledge of all the files and fields of the ORACLE data base and can ascertain whether the canonical query refers to known files and fields. If it does not, the director needs to be capable of returning a mail item to the sender indicating the offending syntax (such as an undefined variable or an ill-formed canonical statement). Since the sender may be a process operating asynchronously with the "end user," the mail item containing the error diagnostic itself needs to be in some well-defined format. If the ORACLE data base front end is not able to deal successfully with the canonical query, it needs to be able to return a similar diagnostic. Similarly, if any one of the partners to an exchange is not able to participate, the director needs to know this and incorporate a strategy for handling the situation (such as holding the communication until the situation is resolved or aborting the process with an appropriate diagnostic).

Updating functions present a host of concurrency and integrity issues that are beyond the level of the proposed demonstration. It is estimated that each translator (there are three of them in this scenario) will take six man-months to construct. Effective error handling will likely require as much effort as a translator.
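One possible shape for the interchange specification and the error diagnostic described above is sketched below in Python. The source only says that the response should be preceded by the number of records, the number of fields, and the field delimiter, and that diagnostics need a well-defined format; the concrete layout here (a one-line header, one line per record, an `ERROR` prefix for diagnostics) is an assumption for illustration.

```python
def encode_response(rows, delimiter="|"):
    """Header line 'records fields delimiter', then one line per record."""
    n_fields = len(rows[0]) if rows else 0
    lines = [f"{len(rows)} {n_fields} {delimiter}"]
    lines += [delimiter.join(str(value) for value in row) for row in rows]
    return "\n".join(lines)


def decode_response(text):
    """Parse a response and check it against its own header."""
    header, *body = text.split("\n")
    n_records, n_fields, delimiter = header.split(" ")
    rows = [line.split(delimiter) for line in body]
    if len(rows) != int(n_records) or any(len(r) != int(n_fields) for r in rows):
        raise ValueError("response does not match its header")
    return rows


def encode_diagnostic(code, detail):
    """Hypothetical fixed-format error item that a process can recognize."""
    return f"ERROR {code}\n{detail}"


def is_diagnostic(text):
    """Let an asynchronous sender tell diagnostics from data responses."""
    return text.startswith("ERROR ")
```

Because the header states the record and field counts, the receiving front end can detect a truncated or malformed response before handing data to the application, and the fixed `ERROR` prefix lets an unattended process distinguish a diagnostic from data. Values are decoded as strings; typing them would require field metadata that this sketch does not model.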