CHAPTER 6

Data Provisioning

The primary data-sharing concern will be to ensure that the consent form language clearly states the conditions under which the data may be shared after the study is complete. The language should spell out whether the data to be shared will be identifiable and the conditions for each type of data sharing. Data-sharing agreements may be needed for identifiable data but will likely not be needed for de-identified data that are publicly available. However, some form of IRB approval may still be required for all data access. The terms of this data access are still being determined as of this report date.

Data Dictionaries

To assist all researchers, data dictionaries are being developed that will identify each data item (or data stream) collected during the study. These data dictionaries include operational definitions as well as database references. These may be especially helpful for those not involved in the original data collection. Additionally, it is anticipated that custom data dictionaries will be necessary to support data located through research-specific queries. Specifically, if new operational definitions are used to develop new variables, the definition and the approach must be documented both for the original researchers looking at these data and for future researchers.

Role-Based Access

The ability to provide scalable user access control to the combined data sets that will be collected and generated in the course of the study is governed by a role-based security protocol and is dependent on the researcher requesting the data successfully obtaining IRB approval. Role-based security protocols authorize access to data elements based upon explicit security roles that users may be granted.
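A minimal sketch of how explicit roles could authorize access to data elements, with an empty role set yielding no access. The role names, regions, and field names are hypothetical and are not taken from the study's actual security protocol; each role here unlocks a set of fields for records from a set of regions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical role: the regions (rows) and fields (columns) it unlocks."""
    name: str
    regions: frozenset  # records from these regions are visible
    columns: frozenset  # only these fields are visible in those records


def authorize(records, roles):
    """Return the view of `records` permitted by the union of `roles`.

    A user with no roles gets an empty result (default deny).
    """
    visible = []
    for rec in records:
        allowed_cols = set()
        for role in roles:
            # A role contributes its columns only for rows in its regions.
            if rec["region"] in role.regions:
                allowed_cols |= role.columns
        if allowed_cols:
            visible.append({k: v for k, v in rec.items() if k in allowed_cols})
    return visible


# Illustrative data only; not real study records.
records = [
    {"trip_id": 1, "region": "region_a", "speed_mph": 42.0, "gps": (37.2, -80.4)},
    {"trip_id": 2, "region": "region_b", "speed_mph": 55.0, "gps": (42.9, -78.8)},
]

# All regions, every field except GPS.
analyst = Role("analyst_no_gps",
               frozenset({"region_a", "region_b"}),
               frozenset({"trip_id", "region", "speed_mph"}))
# GPS only, and only within region_a.
gps_region_a = Role("gps_region_a", frozenset({"region_a"}), frozenset({"gps"}))

print(authorize(records, []))                       # no roles: nothing visible
print(authorize(records, [analyst]))                # both trips, no GPS field
print(authorize(records, [analyst, gps_region_a]))  # GPS appears for region_a only
```

Combining roles by union keeps each role small and auditable; a stricter design could instead require an explicit grant per data set.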
Without any assigned roles, the default authorization granted to users who successfully "authenticate" (successfully log in) is to deny access to any secured data.

Role-based security has successfully demonstrated its ability to cover a range of access requirements. Different roles can be defined to have different security access as appropriate. Security can be defined in terms of row-level access (e.g., where a user has access to all the data within a given region), column-level access (e.g., where a user may have access to all the data except GPS location information), or a combination of the two (e.g., cell-level access where a user may have access to GPS location data within a specified region). It should be noted that this security protocol provides protection to both database and file-based (e.g., video) data, as illustrated in Figure 6.1.

[Figure 6.1. Role-based data access diagram. Flowchart titled "Role-Based Data Access Permissions": the S06 O&I contractor processes data for access (decryption, normalization, aggregation, maintenance, video processing, data generation algorithms, creating reduced databases, etc.), and external data users receive access based on roles and permissions. Decision boxes: Limited Access by IRB Approval? Data possession by IRB Approval? Full Access by IRB Approval? Other labels: General Public; access to aggregated, de-ID'd data, no video; access to relevant de-ID'd data and video; possession of all relevant watermarked data and uncensored video; temporary access to all relevant data and uncensored video.]

Required Software

It is generally expected that most external access will be provided through a website or web service. However, provisions will be in place to allow qualified users to obtain their access via direct file server or database server access using the role-based access described previously. For example, in these cases, remote users may be able to obtain access through the use of typical Virtual Private Network (VPN) technologies. Using a role-based approach as defined accommodates and supports the use of a website/service to provide general data access. As such, commercially available data analysis tools (e.g., SAS and Matlab) can be used by researchers interested in conducting analysis on the data. A shareware-based Community Viewer will be provided also, but no specialized proprietary software will be required.

Coordination and Linking with Roadway Information

SHRP 2 Safety Project S04A, Roadway Information Database Development and Technical Coordination and Quality Assurance of the Mobile Data Collection Project, and Safety Project S04B, Mobile Data Collection, will provide valuable roadway-based data describing in detail sections of roadway traveled by the instrumented vehicles included in the NDS. Roadway data will be extracted from existing inventory databases maintained by state and local transportation departments and supplemented with other inventory data collected by highly instrumented on-road data collection vans. Successful integration of these two data sets will support many research questions that involve the interaction between the driver, the vehicle, and the roadway and appurtenances. Recognizing that these two efforts will be executed somewhat independently but simultaneously, actions have been taken in advance to ensure that integration of these data sets will be possible and feasible so as to support the desired research.

Roadway inventory data in currently existing files are generally specified using a zero point; subsequent stations are measured in length along the traveled roadway path from this zero point. Vehicle data are generally located geographically using latitude, longitude, and sometimes elevation, collected throughout a vehicle trip. Roadway inventory data collected by instrumented vans can be located in either fashion: either as a distance from a zero point or geographically.
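To illustrate how the two location conventions relate, the sketch below converts a roadway path given as latitude/longitude vertices into station distances measured along the path from its zero point (taken here to be the first vertex). It assumes a spherical Earth with a mean radius of 6,371 km and uses the haversine great-circle formula; the function names and sample coordinates are illustrative and are not part of the study's procedures.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stations_from_path(path):
    """Cumulative distance (m) of each vertex from the zero point, path[0]."""
    stations = [0.0]
    for prev, cur in zip(path, path[1:]):
        stations.append(stations[-1] + haversine_m(prev, cur))
    return stations


# Illustrative path: three vertices spaced 0.001 degrees apart along the equator.
path = [(0.0, 0.0), (0.0, 0.001), (0.0, 0.002)]
for vertex, station in zip(path, stations_from_path(path)):
    print(vertex, round(station, 1))
```

Elevation is ignored in this sketch; on steep grades the traveled length would be slightly longer than the distance computed here.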
Integration of these data will require transformation of one or both types of data to determine at what position in the roadway data the vehicle is during travel on a measured roadway. Communication between S03 contractors and NDS contractors has been conducted to determine what will be required to make these transformations and what adjustments may be required during data collection to support these transformations. A GPS measure would be required at each zero point but, though valuable, it may not be required at subsequent stations. Vehicle GPS data can be used to identify when the vehicle passes a zero point, and then vehicle speed and time data could be integrated to estimate when the vehicle passes subsequent stations. The degree of accuracy of candidate approaches will be evaluated as they are identified.

Collection of roadway data will not be feasible on all roadways on which NDS participant vehicles travel. To prioritize the measurement of roadway segments, NDS data will be processed regularly to identify roadway segments that are traveled frequently by more than one participant in the NDS effort. In this way, segments that can provide replications to support research will be identified and provided to the S04 contractors to guide collection. On the basis of investigation of previous naturalistic data collections (Dingus et al. 2006), these high-frequency locations are generally coincident with high-volume corridors within a study area, places of ingress and egress to these corridors, and highways. The accumulating actual naturalistic data will be used to narrow these general rules to determine the specific segments traveled frequently by several participants.
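The prioritization rule described above (segments traveled frequently and by more than one participant) can be sketched as a simple count over traversal records. The record layout, the segment identifiers, and the `min_participants` parameter are assumptions made for illustration; the report does not specify how the processing is implemented.

```python
from collections import defaultdict


def prioritize_segments(traversals, min_participants=2):
    """Rank segments by traversal count, keeping those with enough participants.

    traversals: iterable of (participant_id, segment_id), one pair per traversal.
    Returns segment ids sorted by descending traversal count (ties by id).
    """
    counts = defaultdict(int)
    participants = defaultdict(set)
    for participant_id, segment_id in traversals:
        counts[segment_id] += 1
        participants[segment_id].add(participant_id)
    eligible = [s for s in counts if len(participants[s]) >= min_participants]
    return sorted(eligible, key=lambda s: (-counts[s], s))


# Illustrative traversals only; not real study data.
traversals = [
    ("p1", "seg_a"), ("p2", "seg_a"), ("p1", "seg_a"),
    ("p1", "seg_b"), ("p1", "seg_b"), ("p1", "seg_b"), ("p1", "seg_b"),
    ("p2", "seg_c"), ("p3", "seg_c"),
]
# seg_b has the most traversals but only one participant, so it is excluded.
print(prioritize_segments(traversals))
```

Requiring at least two distinct participants is what separates a segment that offers replications across drivers from one driver's daily commute.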