S&S-40 form did not provide the user with the opportunity to enter this data on the Internet reporting form. The reason for including the data fields in the NTD but not collecting data for them is unknown. The inclusion of these data fields on the S&S-40 form would provide valuable information to analysts.

Finally, the NTD database includes a number of fields intended to track the status of the record itself, such as when it was submitted/edited, who submitted it, how many revisions it has undergone, etc. There were also six data fields that did not correspond to fields on the S&S-40 form and whose purpose could not be determined.

Table 5. NTD data fields by category.
Data Category | No. Data Fields | Additional Details
Mode/Service | 2 | Transit Mode, Service
Date/Time | 6 | Date, Hour, Minute, AM/PM, Time Zone, Time Period Desc.
Location | 7 | Transit Agency, City, Location Desc., Latitude, Longitude
Description of Involved Parties | 2 | Involved Party Category/Desc.
Incident Classification | 6 | Event Level/Category/Type, Collision Manner, Local Level Desc.
Consequences of Incident | 3 | No. Injuries, No. Fatalities, Property Damage ($)
Alignment/Crossing Controls | 4 | Alignment Type, Grade Crossing Control, Intersection Control
General Descriptions (i.e., actions) | 7 | Incident/Passenger/Other Veh./Action/Other Action/Event Desc.
Environmental Factors | 8 | Weather, Lighting, ROW Conditions/Configuration/Type
Contact Info (User) | 4 | Name, Phone No., Title, E-mail
Data Record ID | 6 | Incident No., Revision No., Begin/End/Submitted Date
Exposure to Risk | 5 | Pass. Trips, Veh. Rev. Miles/Hrs, Weekly Trip Cnt., Volume Cnt.
Unknown/Other | 6 |
Total | 66 |

NTD Data Quality Issues

This section outlines the deficiencies identified in the NTD database, the remedial measures employed to address them, and suggestions to avoid future data quality issues.

Data Cleaning Process

A preliminary examination of the NTD database revealed several significant issues with the quality of the data. In an effort to facilitate data analysis, the project team performed a series of "data cleaning" exercises aimed at remedying the most common data deficiencies. This was accomplished using a systematic approach involving two major steps. First, data records were examined to see if contradictions and omissions in key data fields could be eliminated using the available information. Second, records that were either not LRT collisions or duplicates of other collision records were removed from the database.

Data Record Correction. The first step of the data cleaning process was to identify and rectify any contradictory or omitted information in the collision records. Due to the number of observed errors in the records, it was critical both to avoid including errors and to avoid excluding viable records from the analysis, by addressing these deficiencies where possible. In general, the most significant contradictions/omissions in the data records occurred in the following fields: event category, collision manner, lighting conditions, dates, injury counts, right-of-way type, and grade crossing control type.

Event Category. The "event category" field classified an incident either as a "collision," "evacuation," "security," "derailment," "fire," or "not otherwise classified (NOC)." During the data cleaning process, it was observed that 73 of the 2,226 records identified as LRT-related were classified incorrectly, either due to an error in data entry, or because the classifications were not mutually exclusive. For example, it was observed that some incidents classified as derailments were actually the result of a collision, as indicated in the description field for the record. There were also collisions that were categorized as NOC, or simply had a blank category field. The classification of these collisions was updated to reflect the information provided in the "event description" field.

Collision Manner. The "collision manner" field described what other vehicles, objects, or individuals were involved in the collision. In 117 records, an entry of "with object: other object (describe)" was changed to "with motor vehicle" when the description field clearly indicated that another vehicle was involved.
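As a rough illustration of the collision-manner correction described above, the following Python sketch recodes a record only when its free-text description names a motor vehicle. The field names (`collision_manner`, `event_description`) and the keyword list are hypothetical, and keyword matching is an assumption made for this example; the report does not state how the project team reviewed the descriptions, and any automated match would still need manual confirmation.

```python
import re

# Hypothetical category labels, taken from the wording quoted in the text.
OTHER_OBJECT = "with object: other object (describe)"
MOTOR_VEHICLE = "with motor vehicle"

# Assumed keyword list for illustration only; "vehicle" alone is excluded
# because descriptions often mention the transit vehicle itself.
VEHICLE_WORDS = re.compile(
    r"\b(car|automobile|auto|truck|van|suv|taxi|motor vehicle)\b",
    re.IGNORECASE,
)


def recode_collision_manner(record):
    """Return a copy of `record`, recoded if its description names a vehicle."""
    updated = dict(record)  # leave the original record untouched
    description = record.get("event_description") or ""
    if (record.get("collision_manner") == OTHER_OBJECT
            and VEHICLE_WORDS.search(description)):
        updated["collision_manner"] = MOTOR_VEHICLE
    return updated


example = {
    "collision_manner": OTHER_OBJECT,
    "event_description": "LRV struck a car turning left across the tracks",
}
print(recode_collision_manner(example)["collision_manner"])
```

Returning a copy rather than editing in place keeps the original entry available for comparison, which matters when corrections are later audited.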
Lighting Conditions. The "lighting conditions" field described the natural and artificial illumination present at the time of the collision. For 49 records, the fields that described the prevailing lighting/weather conditions and time of day produced seemingly contradictory accounts. For example, one record indicated the time of the incident to be 3:00 a.m., while the lighting conditions were listed as "daylight, clear." In such cases, it was usually not possible to determine whether the time of day or the lighting condition had been incorrectly entered. Therefore, it was not possible to reconcile some of the apparent contradictions relating to lighting condition.

Dates. The date of the incident was omitted in 27 of the records contained in the database. In some cases, it was possible to retrieve the date from records of the same incident in either the SSO or local transit agency database. For example, a comparison of the NTD database with the California PUC (SSO) database for the Sacramento Regional Transit District allowed the recovery of four missing dates in the NTD data. In total, eight NTD records had dates recovered from either the SSO or local data for incidents with the same reported year, time, and incident description.

Injury Counts. The number of injuries provided in the "injury count" field conflicted with the number of injuries provided in the "event description" field for 7 NTD records. For example, a record listed as describing one injury included a detailed description noting that six individuals were taken to the hospital to be treated for injuries. In these cases, the "injury count" field was revised to match the "event description" field.

Alignment Type. The "alignment type" field classified the ROW as exclusive, semi-exclusive, non-exclusive, etc. The classification of the right-of-way in the "alignment type" field was updated in 62 records where the text description clearly identified the alignment. Unfortunately, the description fields provided insufficient information to update the alignment type for 176 of the data records.

Grade Crossing Control Type. The "grade crossing control type" field classified the control devices present at road/rail grade crossings. This data field was updated based on information contained in the description fields for 104 records. In most of these cases, the field was originally empty, but the text description identified the type of crossing control in sufficient detail.

Transit Vehicle. The type of transit vehicle was identified incorrectly in 19 of the data records. These records identified the transit vehicle involved in the collision as an LRT vehicle when the description indicated it was a bus. These records were later removed from the database (see the Non-LRT Vehicle section).

Many of the records in the NTD database suffered from incomplete and/or inaccurate information in key data fields, and required significant data cleaning in order to be included in subsequent data analysis. In many cases, the fields containing detailed descriptions of the incidents contained the information required to make the necessary corrections. However, in many instances it was not possible to make necessary corrections to the records because the record lacked sufficient descriptive information. This problem was partly due to the truncation of the detailed "event description" fields in both the NTD and SSO datasets. Discussion with FTA staff indicated that this was a database problem and that a fix eliminating the truncation would be forthcoming for the 2008 reporting (e-mail communications with FTA staff, Feb. 2008). Unfortunately, the problem was only identified after part of the description had already been lost.

In current and past editions of the S&S-40 online reporting form, the user is required to enter information in certain mandatory data fields (indicated with an asterisk) before the user can either save or submit the report. The NTD Safety and Security Manual (on the NTD web site at http://www.ntdprogram.gov/ntdprogram/safety.htm) has included a statement requesting that users input information into all data fields, but not all data fields are designated as mandatory. To prevent users from omitting information critical to safety analysis in the future, the NTD could expand the number of mandatory fields that require information to be entered by the user before a report can be submitted.

It is more difficult to solve the problem of inaccurate data. The NTD Safety and Security Manual already contains a detailed explanation of each data field, including descriptions of available options/answers, and instructions for completion. The NTD continues to improve the Safety and Security Manual to address identified data reporting issues. To further mitigate the problem of incorrect data reporting, the NTD could produce an annual list of common mistakes made in data reporting, including an explanation of frequently misunderstood data fields. The feasibility of this may be an issue: to identify specific problems, the NTD would have to devote time and manpower to a data cleaning procedure similar to the one employed by the project team.

Data Record Removal. Following the data correction process outlined above, the database was examined to identify and remove records that were attributed to non-existent transit agencies, were the result of duplication of records for the same incident, or incorrectly indicated that the transit vehicle was an LRT vehicle. The following deletions were carried out:

Non-existent Transit Agency. Of the 2,226 records in the database, 2 were attributed to a non-existent "ABC Agency" and were deleted.
Duplicate Records. The NTD database allows users to access previously submitted incident records to make any necessary additions or changes subsequent to the submission of the data record. Each revised record is assigned a "revision number" corresponding to the number of revisions made to the record since original submission. Upon inspection of the NTD database, it appeared that duplicate records created for the same incident did not have a unique revision number. In most cases, the incident report was perfectly cloned for all but one or two characteristics, which were likely added during the revision of the record. For example, two records would contain identical information, but one data record listed the number of fatalities and no injuries, while the second record listed the number of injuries and fatalities.

It was observed that many of the Southeastern Pennsylvania Transportation Authority (SEPTA) records suspected of being revisions contained slightly different information in many of the data fields. This made it very difficult to determine conclusively whether the two records represented one incident with a revised report, or two separate incidents.

In cases where it was clear that multiple records referred to the same incident, either the records were merged to create one record with all of the relevant information pertaining to the collision, or all of the records except the most recent update were deleted. In addition, the data records reported by SEPTA from the year 2005 onward contained many blank data fields. This further complicated the task of identifying duplicate records in the database. Thus, while some SEPTA records were repaired, the majority had to be left intact due to lack of available information.

Non-LRT Vehicle. As discussed in the Transit Vehicle section, 19 of the data records incorrectly classified the transit vehicle as an LRT vehicle when the description indicated it was actually a bus. These records were removed from the database.

Table 6 shows the number of records removed from the NTD dataset due to errors or omissions in data entry.

Table 6. Summary of records deleted from NTD dataset due to errors/omissions in data entry.
Transit Agency | Total Records | Deleted Records | % of Total Records | Remaining Records
ABC Agency | 2 | 2 | 100.0% | 0
Bi-State Development Agency | 9 | 2 | 22.2% | 7
Central Puget Sound Regional Transit Authority | 1 | 0 | 0.0% | 1
Dallas Area Rapid Transit | 92 | 18 | 19.6% | 74
Denver Regional Transportation District | 23 | 6 | 26.1% | 17
Hillsborough Area Regional Transit Authority | 12 | 4 | 33.3% | 8
King County Department of Transportation Metro Transit Division | 32 | 0 | 0.0% | 32
Los Angeles County Metropolitan Transportation Authority | 163 | 34 | 20.9% | 129
Maryland Transit Administration | 36 | 11 | 30.6% | 25
Massachusetts Bay Transportation Authority | 55 | 15 | 27.3% | 40
Memphis Area Transit Authority | 3 | 0 | 0.0% | 3
Metro Transit | 22 | 3 | 13.6% | 19
Metropolitan Transit Authority of Harris County, Texas | 104 | 14 | 13.5% | 90
New Jersey Transit Corporation | 2 | 1 | 50.0% | 1
New Orleans Regional Transit Authority | 13 | 4 | 30.8% | 9
Niagara Frontier Transportation Authority | 5 | 2 | 40.0% | 3
Port Authority of Allegheny County | 14 | 0 | 0.0% | 14
Sacramento Regional Transit District | 66 | 2 | 3.0% | 64
San Diego Trolley, Inc. | 41 | 7 | 17.1% | 34
San Francisco Municipal Railway | 185 | 17 | 9.2% | 168
Santa Clara Valley Transportation Authority | 21 | 6 | 28.6% | 15
Southeastern Pennsylvania Transportation Authority | 1130 | 108 | 9.6% | 1022
The Greater Cleveland Regional Transit Authority | 58 | 6 | 10.3% | 52
Tri-County Metropolitan Transportation District of Oregon | 90 | 17 | 18.9% | 73
Utah Transit Authority | 47 | 5 | 10.6% | 42
Total | 2226 | 284 | 12.8% | 1942
Total without SEPTA | 1096 | 176 | 16.1% | 920
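The duplicate pattern described in the Duplicate Records discussion (reports cloned except for one or two fields) can be sketched in Python as a pairwise comparison that flags candidate duplicates for manual review. This is a minimal illustration under stated assumptions: the field names, the `record_id` key, and the two-field threshold are hypothetical choices for this example, not the project team's documented procedure.

```python
from itertools import combinations


def differing_fields(a, b):
    """Names of fields whose values differ between two record dicts."""
    keys = set(a) | set(b)
    return sorted(k for k in keys if a.get(k) != b.get(k))


def candidate_duplicates(records, id_field="record_id", max_diff=2):
    """Pairs of records that match on all but `max_diff` non-ID fields.

    Returns (id_a, id_b, differing_field_names) tuples. The threshold of
    two fields is an assumption based on the "one or two characteristics"
    wording; flagged pairs are candidates only, not confirmed duplicates.
    """
    pairs = []
    for a, b in combinations(records, 2):
        diff = [f for f in differing_fields(a, b) if f != id_field]
        if len(diff) <= max_diff:
            pairs.append((a[id_field], b[id_field], diff))
    return pairs


# Hypothetical records: 1 and 2 differ only in the injury count.
records = [
    {"record_id": 1, "agency": "A", "date": "2004-03-01", "time": "08:15",
     "fatalities": 1, "injuries": 0},
    {"record_id": 2, "agency": "A", "date": "2004-03-01", "time": "08:15",
     "fatalities": 1, "injuries": 2},
    {"record_id": 3, "agency": "A", "date": "2004-06-10", "time": "17:40",
     "fatalities": 0, "injuries": 1},
]
print(candidate_duplicates(records))
```

The comparison is quadratic in the number of records, which is workable for a dataset of a few thousand rows; grouping by agency and date first would reduce the work but would miss duplicates whose date field is blank, a case the text reports for some records.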
Table 6 shows that of the original 2,226 records, 284 records (12.8%) were deleted, resulting in a total of 1,942 records remaining. It should be noted that in the SEPTA data from the year 2005 onward, there was often insufficient information to determine whether one or more records were actually duplications of the same event. It is suspected that if more information had been available, the number of SEPTA records deleted due to duplication would have increased.

Based on the data records examined, it appears that the creation of duplicate records for the same incident is a problem that needs to be addressed. It is unknown whether this problem is the result of incorrect use of the NTD database by the user, or a malfunction in the database causing the creation of multiple records when revisions are made to an incident report.

The next step in the data cleaning process was the removal of records that were identified as being incidents other than collisions. The database provided by the NTD initially contained all incidents identified as meeting the criteria of a reportable incident and involving an LRT vehicle. However, because these incidents were reported based on the "reportable incident" criteria outlined in the Data Collected by NTD section, the records included incidents that did not involve an LRT vehicle colliding with a motor vehicle or pedestrian, which were the only incidents relevant to the project. Therefore, the project team conducted an examination of the data to remove these non-collisions from the database.

Prior to 2008, the online S&S-40 form required the user to specify a "primary event" and "secondary event(s)" under the "incident classification" heading. Both primary and secondary events listed in the S&S-40 form satisfied the NTD criteria of a reportable incident. The primary event was defined as the first harmful occurrence of the incident, while the secondary event(s) were event(s) resulting from the primary event. Based on this definition, the only incidents relevant to the analysis were those with the primary event classified as a collision. In total, 222 of the remaining 1,942 records were identified as non-collisions, based on the "collision classification" field, and removed from the database.

These records included:

· 5 records that had a blank "collision classification" field,
· 91 records identified as derailments with no evidence of a collision,
· 15 records identified as evacuations with no evidence of collision,
· 22 records identified as fires with no evidence of collision,
· 77 records identified as "NOC,"
· 3 records identified as security problems, and
· 9 records stating that the transit vehicle had left the roadway, which was assumed to indicate either a non-rail vehicle or a derailment, but in either case could not be confirmed to be the result of a collision.

In summary, the data cleaning process resulted in a total of 1,720 crash records remaining from the original 2,226 total records provided by the FTA. It is this remaining dataset that is analyzed in the following sections.

Disparity in Local Transit Agency Reporting to the NTD

Examination of the NTD database revealed a large disparity both in the number of collisions reported by transit agencies, and the total number of collisions reported by year. Table 7 shows the total number of collisions by year for each transit agency.

Some variation in the number of collisions is expected across transit agencies. Variation is inevitable because of differences in the size of the LRT system, measures of exposure to risk (e.g., vehicle revenue miles), ROW classification, etc. However, it was suspected that differences in data reporting procedures across transit agencies also accounted for a significant portion of the variation observed. In particular, the number of collisions reported by SEPTA was very high from 2002 to 2005, and then dropped to levels the project team considered more in line with the approach and level of reporting expected.

As demonstrated in Table 6 and Table 7, SEPTA accounted for over half of the total number of collisions in the NTD database both before and after the data cleaning process. In contrast, agencies such as the New Jersey Transit Corporation reported only one collision over the course of six years. These two agencies represented the polar extremes of the observed collision reporting, so the project team contacted them in an effort to understand the cause of the variation observed across transit agencies.

SEPTA staff identified two primary causes that they believed contributed to the overrepresentation of SEPTA incidents in the NTD database. The first was the fact that incidents involving LRT were often reported by multiple departments within SEPTA. For example, a single incident could be reported through both the vehicle maintenance system and workers' compensation if it resulted in both damage to the LRV and injury to the operator. According to SEPTA, this over-reporting of incidents continued until 2005, when the problem was identified and rectified by SEPTA staff.

The second explanation offered by SEPTA was the nature of the transit system itself. The collisions classified as LRT-related included those on eight street trolley lines, five of which operate in mixed traffic conditions with high exposure to automobile traffic. A sample of the SEPTA records was checked using mapping software, and it was confirmed that a large proportion of the collisions actually occurred at locations of full streetcar operation, in mixed traffic for an extended section. These sections operate differently from separate or median operating alignments, which do not share space with general traffic. While streetcar alignments in mixed traffic are still
considered as LRT (Type c.1) according to the previously used classification system of TCRP Report 69, most new LRT systems tend to avoid sustained operations in mixed traffic, and so SEPTA's mixed traffic operations are not typical, and are not the focus of this study.

Table 7. Total crashes per year by transit agency from NTD database (2002–2007).
Agency | Yearly counts (2002–2007 columns; blank years omitted) | Total
Bi-State Development Agency | 1 1 1 1 1 | 5
Dallas Area Rapid Transit | 5 17 17 17 13 2 | 71
Denver Regional Transportation District | 1 4 4 5 | 14
Hillsborough Area Regional Transit Authority | 2 3 1 | 6
King County Department of Transportation Metro Transit Division | 7 9 8 8 | 32
Los Angeles County Metropolitan Transportation Authority | 42 16 30 8 10 16 | 122
Maryland Transit Administration | 8 1 5 | 14
Massachusetts Bay Transportation Authority | 1 2 6 6 4 4 | 23
Memphis Area Transit Authority | 1 2 | 3
Metro Transit | 1 5 3 3 | 12
Metropolitan Transit Authority of Harris County, Texas | 28 31 14 17 | 90
New Jersey Transit Corporation | 1 | 1
New Orleans Regional Transit Authority | 2 1 1 | 4
Niagara Frontier Transportation Authority | 2 | 2
Port Authority of Allegheny County | 6 3 2 | 11
Sacramento Regional Transit District | 12 4 22 7 7 4 | 56
San Diego Trolley, Inc. | 7 3 5 5 2 8 | 30
San Francisco Municipal Railway | 41 18 11 10 17 19 | 116
Santa Clara Valley Transportation Authority | 2 2 1 3 1 3 | 12
Southeastern Pennsylvania Transportation Authority | 202 171 147 364 47 16 | 947
The Greater Cleveland Regional Transit Authority | 10 5 14 10 5 3 | 47
Tri-County Metropolitan Transportation District of Oregon | 16 13 9 17 7 5 | 67
Utah Transit Authority | 9 9 1 5 6 5 | 35
Grand Total (Count) | 372 278 304 505 144 117 | 1720
Grand Total (Percentage of Total Crashes) | 21.6% 16.2% 17.7% 29.4% 8.4% 6.8% | 100%
Total without SEPTA | 170 107 157 141 97 101 | 773
Total without SEPTA (Percentage of Total Crashes) | 22.0% 13.8% 20.3% 18.2% 12.5% 13.1% | 100%

A third possible explanation was that SEPTA had reported collisions that do not meet the NTD reporting criteria. It appears that before 2006, the S&S-40 Internet reporting form did not filter incidents based on whether or not they satisfied the NTD reporting criteria. This feature has been added to the most recent version of the S&S-40 form (7).

The SEPTA collision database included two separate fields used to measure the severity of collisions reported. The first field was "NTD Reportable," which included a response of either "Yes" or "No" for each collision, based on whether the collision met the NTD criteria of a reportable incident. The second field was "NTD non-major/major," which classified collisions as "major" or "non-major," or was left blank (unclassified). This field was used to classify each incident based on the NTD criteria for major incidents. Table 8 shows the number of collisions reported by SEPTA to the NTD based on their classification in the above two categories.

Since transit agencies are only required to report incidents meeting the criteria of a major incident to the NTD, it is expected that an incident SEPTA classified as "major" under the NTD classification would also have an entry of "yes" in
the "NTD reportable" field. Similarly, if an incident were deemed "non-major," it should have an entry of "no" in the "NTD reportable" field. However, Table 8 shows that 25 of the 238 incidents classified as "major" were also identified as not reportable to the NTD. In addition, 19 of the 459 incidents classified as "non-major" were identified as reportable to the NTD. This shows that although the entries in the "NTD non-major" and "NTD reportable" fields for each incident coincided in most cases, they were not always consistent.

Table 8. SEPTA collision reporting by NTD classification (2002–2005).
NTD Classification in SEPTA Database | Flagged as NTD Reportable | Reported to NTD | Not Reported to NTD | Total
Major | Yes | 114 | 99 | 213
Major | No | 13 | 12 | 25
Non-major | Yes | 10 | 9 | 19
Non-major | No | 227 | 213 | 440
Unclassified | Yes | 0 | 0 | 0
Unclassified | No | 392 | 242 | 634
Total | | 756 | 575 | 1331

Table 8 also shows that the classification of each collision based on the above categories had little impact on whether the collision was reported to the NTD. For example, only 114 of the 213 collisions classified as "major" and reportable to the NTD were actually reported to the NTD. In addition, of the 440 incidents identified as both "non-major" and non-reportable to the NTD, 227 were reported to the NTD. Finally, 392 of the 634 "unclassified" incidents, all of which were determined to be not reportable to the NTD, were actually reported to the NTD. These data suggest that the primary explanation for the high proportion of SEPTA incidents in the NTD database was the reporting of incidents not meeting the NTD criteria for a "major" or "reportable" incident.

It is unknown why some of the incidents classified by SEPTA as "NTD major" were considered non-reportable, and why some of the NTD "non-major" incidents were marked as reportable in their data. Technically, the NTD "major" incidents should all be reportable, while NTD "non-major" incidents should not be reportable. Unfortunately, SEPTA was unable to provide additional data or assistance for a more detailed review due to the staff time involved on their part.

In contrast to SEPTA, there were numerous transit agencies that reported very few collisions during the six years examined. For example, the New Jersey Transit Corporation and Niagara Frontier Transportation Authority reported only one and two incidents, respectively, to the NTD over a period of six years. The limited number of collisions can be partially explained by the fact that both of these transit agencies operate most of their light-rail network along exclusive rights-of-way. However, it seemed unlikely that this was the only factor involved in the low number of reported incidents for many of the transit agencies.

Discussions with staff from New Jersey Transit during the study team's site visit to the HBLR indicated that they had until recently only reported collisions to the SSO on the assumption that those reports would be routed to the NTD, although they still reported all other regulatory requirement data on operation etc. to the NTD; the roles of the NTD and SSO in this case were not clear to the agency. Examination of the partial collision database provided by New Jersey Transit during the site visit substantiates this explanation. The New Jersey database contained a total of 50 incidents that occurred between 2002 and 2008. Of these incidents, 19 resulted in at least one injury, and 49 resulted in property damage. Although the data provided contained no further detail as to the number of injuries or extent of property damage for each collision, it seems likely that at least some of these incidents would have met the NTD reporting criteria. Thus, it is possible that many of the transit agencies under-represented in the NTD were not reporting many incidents that satisfied the NTD reporting criteria.

The variation in the number of collisions reported across the years should relate in part to changes in NTD's criteria for a reportable collision (see the Data Collected by NTD section). In 2003, all collisions at grade crossings were identified as reportable, but from 2004 to 2007 collisions at grade crossings were only reportable if they resulted in at least one injury away from the scene, or property damage exceeding $7,500. This change should be reflected in a decline in the number of collisions reported from 2003 to 2004, but the number of reported collisions did not drop until 2006. In this case, it appears that the change in collision reporting criteria did not have a discernible impact on the number of reported collisions. The significant drop in collision reporting between 2005 and 2006 can be explained by the steep decline in incidents reported by SEPTA. The decline in SEPTA reporting corresponds with the identification and elimination of the problem of multiple SEPTA divisions submitting collision reports to the NTD for the same incident.
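The observation drawn from Table 8, that SEPTA's own classification had little bearing on whether a collision was reported to the NTD, can be checked with a short Python calculation. The counts below are transcribed from Table 8; the dictionary layout and function name are choices made for this sketch only.

```python
# (classification, flagged as NTD reportable) -> (reported, not reported),
# transcribed from Table 8.
TABLE_8 = {
    ("Major", "Yes"): (114, 99),
    ("Major", "No"): (13, 12),
    ("Non-major", "Yes"): (10, 9),
    ("Non-major", "No"): (227, 213),
    ("Unclassified", "Yes"): (0, 0),
    ("Unclassified", "No"): (392, 242),
}


def reported_share(reported, not_reported):
    """Fraction of incidents reported to the NTD, or None for an empty cell."""
    total = reported + not_reported
    return None if total == 0 else reported / total


shares = {key: reported_share(*counts) for key, counts in TABLE_8.items()}
total_reported = sum(r for r, _ in TABLE_8.values())
total_not_reported = sum(n for _, n in TABLE_8.values())

for (classification, flag), share in shares.items():
    label = "n/a" if share is None else f"{share:.1%}"
    print(f"{classification:12s} reportable={flag:3s} reported share: {label}")
print("column totals:", total_reported, total_not_reported)
```

With these counts, the share reported to the NTD lies between roughly 52% and 62% in every non-empty category, so the categories expected to be reported ("major," flagged reportable) were reported at about the same rate as those expected not to be.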