DATA-ORIENTED TRAVEL BEHAVIOR ANALYSIS 195

method requires a technique for accumulating and operating accurate data, as well as for mining significant relationships of certain traffic phenomena by using the large amounts of accumulated data. A method for efficiently accumulating data and a methodology for uncovering such treasures are required for data mining.

Data mining is a method for uncovering the true nature of the structures or phenomena behind data. Whether any treasure is buried in the mountain of data to be mined depends largely on how data themselves are prepared. The following paragraphs will summarize problems to be solved for constructing a behavior database.

Transportation data mining requires a large amount of accurate data. In an additive-type database, such as those for traffic data, both the number of records and the number of attributes will continue to increase. Parallel processing and incremental processing are indispensable for handling data that range in size from gigabytes to terabytes.

Data obtained under the pervasive computing environment, which consists of GPS, mobile phones, multiple sensors, and the like, cannot be analyzed as they are. To make the data sharable, the interfaces of devices must be normalized. For example, vehicle speed data are usually measured and internally processed as pulses (hertz), and it is difficult to recognize them as speeds if they are published as they are. To share vehicle speed pulses as traffic data, they must be published as speeds, not as pulses, to the outside. The usage of data in the entire system must be envisioned to some degree and, on the basis of that envisioned usage, the data structure standardized and the data published. Moreover, to extract from the data the knowledge appropriate for the objective of the analysis, the spatial data, such as data on land use, and economic data must be prepared simultaneously and their mutual use enabled through XML or other means.

Privacy and security are also major issues to be addressed. If personal travel data (identification number, time, latitude, and longitude) are recorded at intervals of 1 min for 1 year in a city with a population of one million, a large personal database of 100 TB or more will be constructed. Although the introduction of a large personal information system has been hoped for since September 11 from the standpoint of homeland defense, there is a persistent concern over a panoptic on society from the standpoint of privacy, and a technique for ensuring anonymity is necessary.

CONCLUSIONS

This paper has shown the possibilities of new survey techniques for travel behavior through GPS mobile phones, a web diary, and multiple sensors. The crucial difference between such survey techniques and conventional survey methods is the resolution of behavior observation. Although conventional questionnaire surveys have allowed only zone-by-zone behavior observations that rely on subjects' memory, the use of a GPS enables the measurement of detailed behavior paths and spatial dispersion in destination without relying on subjects' memory. The use of multiple sensors makes possible observation of variables conventionally difficult to observe, such as detailed activity contents at the locations of stay and the number of steps of the staircase at a transfer.

Such data will have a great influence on models. Researchers will be able to obtain a realistic knowledge of the set of choices that a traveler is actually considering, and the use of detailed path selection data, which have conventionally been difficult to obtain, will enable the development of path selection models of higher accuracy. Furthermore, it will become possible to perform modeling by using actual data on transfer resistances, such as the steps of staircases, and service levels, such as whether a traveler is sitting on a train or standing because it is crowded.

Unlike longitudinal surveys, such as panel surveys, which observe year-to-year changes in behavior, these survey techniques make possible collective observation from day-to-day to year-to-year changes in behavior in real time. They will enable real-time data mining and model analysis of an enormous amount of accumulated data on systematic changes in behavior. It is thought that the contributions of travel behavior analysis will expand from transportation plans set up on a several-year basis to transportation management that manages transport demand on a several-minute basis.

Data mining allows easy acquisition of various contents by following hyperlinks without previously having a format for analysis. Such an approach may be called an Internet-based approach, and the database, in which data continue to be accumulated, may be compared with aerial photographs in precision. In either case, models that are validated on the basis of enormous amounts of accurate data and mechanically derived knowledge may be highly beneficial for travel behavior analysis.

Which are more beautiful, paintings or aerial photographs? An aerial photograph precisely expresses an object. However, it does not tell where to look or where to proceed in the long run (although it helps decide where to proceed). The author believes that nothing can be a match for the beauty of theoretical models, which are the paintings of data mining. The results of efforts condensed on a planar canvas brilliantly express the true nature of an object. However, paintings may be replaced by aerial photographs someday if the painter and the viewer fail to improve the technique of painting, to extract true natures, and to feel the beauty.
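
Two quantitative points from the database discussion, publishing vehicle speed pulses as speeds and the size of a one-minute personal travel database for a city of one million, can be illustrated with a short sketch. It is not taken from the paper. The function names, the calibration constant `pulses_per_km`, and the reading of 1 TB as 10^12 bytes are assumptions made only for illustration; the per-record size is simply what the 100 TB figure would imply under those assumptions.

```python
MINUTES_PER_YEAR = 60 * 24 * 365  # one-minute samples per person per year


def pulses_to_kmh(pulse_hz: float, pulses_per_km: float) -> float:
    """Convert a speed-sensor pulse rate (Hz) into a speed in km/h.

    pulses_per_km is a hypothetical, device-specific calibration constant;
    the paper only states that pulses should be published as speeds.
    """
    if pulses_per_km <= 0:
        raise ValueError("pulses_per_km must be positive")
    km_per_second = pulse_hz / pulses_per_km
    return km_per_second * 3600.0


def yearly_records(population: int,
                   samples_per_person: int = MINUTES_PER_YEAR) -> int:
    """Number of (id, time, latitude, longitude) records stored in one year."""
    return population * samples_per_person


def implied_bytes_per_record(total_bytes: float, records: int) -> float:
    """Average stored size per record implied by a total database size."""
    return total_bytes / records


records = yearly_records(1_000_000)
print(records)  # 525600000000
# Assuming 1 TB = 10**12 bytes, 100 TB corresponds to about 190 bytes/record.
print(round(implied_bytes_per_record(100e12, records)))  # 190
```

Under these assumptions the one-year database holds about 5.3 x 10^11 records, so the record count alone, before any indexing or auxiliary attributes, already motivates the parallel and incremental processing mentioned above.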