method requires a technique for accumulating and operating on accurate data, as well as for mining significant relationships of certain traffic phenomena from the large amounts of accumulated data. A method for efficiently accumulating data and a methodology for uncovering such treasures are required for data mining. Data mining is a method for uncovering the true nature of the structures or phenomena behind data. Whether any treasure is buried in the mountain of data to be mined depends largely on how the data themselves are prepared. The following paragraphs summarize problems to be solved in constructing a behavior database.

Transportation data mining requires a large amount of accurate data. In an additive-type database, such as those for traffic data, both the number of records and the number of attributes continue to increase. Parallel processing and incremental processing are indispensable for handling data that range in size from gigabytes to terabytes.

Data obtained under the pervasive computing environment, which consists of GPS, mobile phones, multiple sensors, and the like, cannot be analyzed as they are. To make the data sharable, the interfaces of devices must be normalized. For example, vehicle speed data are usually measured and internally processed as pulses (hertz), and it is difficult to recognize them as speeds if they are published as they are. To share vehicle speed pulses as traffic data, they must be published externally as speeds, not as pulses. The usage of data in the entire system must be envisioned to some degree and, on the basis of that envisioned usage, the data structure must be standardized and the data published. Moreover, to extract from the data the knowledge appropriate for the objective of the analysis, spatial data, such as data on land use, and economic data must be prepared simultaneously and their mutual use enabled through XML or other means.
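The vehicle speed example above can be illustrated with a minimal sketch of interface normalization: a raw pulse frequency is converted to a speed with explicit units before it is shared. The names `SpeedSensor`, `pulses_per_km`, `pulses_to_kmh`, and `publish_speed`, as well as the record layout, are hypothetical and chosen only for illustration. The calibration constant in particular is an assumption, since the text does not specify how any given device relates pulses to distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedSensor:
    """Hypothetical device description; the calibration value is an assumption."""
    pulses_per_km: float  # pulses emitted per kilometre travelled


def pulses_to_kmh(pulse_hz: float, sensor: SpeedSensor) -> float:
    """Convert a raw pulse frequency (pulses per second) to a speed in km/h."""
    if pulse_hz < 0:
        raise ValueError("pulse frequency must be non-negative")
    if sensor.pulses_per_km <= 0:
        raise ValueError("pulses_per_km must be positive")
    # pulses/s * 3600 s/h = pulses/h; dividing by pulses/km gives km/h.
    return pulse_hz * 3600.0 / sensor.pulses_per_km


def publish_speed(vehicle_id: str, timestamp: str,
                  pulse_hz: float, sensor: SpeedSensor) -> dict:
    """Build the externally shared record: a speed with named units, no raw pulses."""
    return {
        "vehicle_id": vehicle_id,
        "timestamp": timestamp,
        "speed_kmh": round(pulses_to_kmh(pulse_hz, sensor), 1),
    }
```

With an assumed calibration of 4,000 pulses per kilometre, a reading of 50 Hz would be published as 45.0 km/h. The device-specific constant stays inside the publishing layer, so consumers of the shared data never need to know how a particular sensor counts pulses.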
Privacy and security are also major issues to be addressed. If personal travel data (identification number, time, latitude, and longitude) are recorded at intervals of 1 min for 1 year in a city with a population of one million, a large personal database of 100 TB or more will be constructed. Although the introduction of a large personal information system has been hoped for since September 11 from the standpoint of homeland defense, there is a persistent concern over a panopticon society from the standpoint of privacy, and a technique for ensuring anonymity is necessary.

CONCLUSIONS

This paper has shown the possibilities of new survey techniques for travel behavior through GPS mobile phones, a web diary, and multiple sensors. The crucial difference between such survey techniques and conventional survey methods is the resolution of behavior observation. Although conventional questionnaire surveys have allowed only zone-by-zone behavior observations that rely on subjects' memory, the use of a GPS enables the measurement of detailed behavior paths and spatial dispersion in destination without relying on subjects' memory. The use of multiple sensors makes possible the observation of variables conventionally difficult to observe, such as detailed activity contents at the locations of stay and the number of steps of the staircase at a transfer.

Such data will have a great influence on models. Researchers will be able to obtain realistic knowledge of the set of choices that a traveler is actually considering, and the use of detailed path selection data, which have conventionally been difficult to obtain, will enable the development of path selection models of higher accuracy. Furthermore, it will become possible to perform modeling by using actual data on transfer resistances, such as the steps of staircases, and service levels, such as whether a traveler is sitting on a train or standing because it is crowded.
Unlike longitudinal surveys, such as panel surveys, which observe year-to-year changes in behavior, these survey techniques make possible the collective observation of changes in behavior, from day-to-day to year-to-year, in real time. They will enable real-time data mining and model analysis of an enormous amount of accumulated data on systematic changes in behavior. It is thought that the contributions of travel behavior analysis will expand from transportation plans set up on a several-year basis to transportation management that manages transport demand on a several-minute basis.

Data mining allows easy acquisition of various contents by following hyperlinks without previously having a format for analysis. Such an approach may be called an Internet-based approach, and the database, in which data continue to be accumulated, may be compared with aerial photographs in precision. In either case, models that are validated on the basis of enormous amounts of accurate data and mechanically derived knowledge may be highly beneficial for travel behavior analysis.

Which are more beautiful, paintings or aerial photographs? An aerial photograph precisely expresses an object. However, it does not tell where to look or where to proceed in the long run (although it helps decide where to proceed). The author believes that nothing can be a match for the beauty of theoretical models, which are the paintings of data mining. The results of efforts condensed on a planar canvas brilliantly express the true nature of an object. However, paintings may be replaced by aerial photographs someday if the painter and the viewer fail to improve the technique of painting, to extract true natures, and to feel the beauty.

DATA-ORIENTED TRAVEL BEHAVIOR ANALYSIS