Table 5. NTD data fields by category.

Data Category                         No. Data Fields  Additional Details
Mode/Service                          2                Transit Mode, Service
Date/Time                             6                Date, Hour, Minute, AM/PM, Time Zone, Time Period Desc.
Location                              7                Transit Agency, City, Location Desc., Latitude, Longitude
Description of Involved Parties       2                Involved Party Category/Desc.
Incident Classification               6                Event Level/Category/Type, Collision Manner, Local Level Desc.
Consequences of Incident              3                No. Injuries, No. Fatalities, Property Damage ($)
Alignment/Crossing Controls           4                Alignment Type, Grade Crossing Control, Intersection Control
General Descriptions (i.e., actions)  7                Incident/Passenger/Other Veh./Action/Other Action/Event Desc.
Environmental Factors                 8                Weather, Lighting, ROW Conditions/Configuration/Type
Contact Info (User)                   4                Name, Phone No., Title, E-mail
Data Record ID                        6                Incident No., Revision No., Begin/End/Submitted Date
Exposure to Risk                      5                Pass. Trips, Veh. Rev. Miles/Hrs, Weekly Trip Cnt., Volume Cnt.
Unknown/Other                         6
Total                                 66
S&S-40 form did not provide the user with the opportunity to enter this data on the Internet reporting form. The reason for including the data fields in the NTD but not collecting data for them is unknown. The inclusion of these data fields on the S&S-40 form would provide valuable information to analysts. Finally, the NTD database includes a number of records intended to track the status of the record itself, such as when it was submitted/edited, who submitted it, how many revisions it has undergone, etc. There were also six data fields that did not correspond to fields on the S&S-40 form and whose purpose could not be determined.

NTD Data Quality Issues

This section outlines the deficiencies identified in the NTD database, the remedial measures employed to address them, and suggestions to avoid future data quality issues.

Data Cleaning Process

A preliminary examination of the NTD database revealed several significant issues with the quality of the data. In an effort to facilitate data analysis, the project team performed a series of "data cleaning" exercises aimed at remedying the most common data deficiencies. This was accomplished using a systematic approach involving two major steps. First, data records were examined to see if contradictions and omissions in key data fields could be eliminated using the available information. Second, records that were either not LRT collisions or duplicates of other collision records were removed from the database.

Data Record Correction. The first step of the data cleaning process was to identify and rectify any contradictory or omitted information in the collision records. Given the number of errors observed in the records, it was critical both to avoid carrying errors into the analysis and to avoid excluding viable records, by addressing these deficiencies where possible. The most significant contradictions and omissions in the data records occurred in the following fields: event category, collision manner, lighting conditions, dates, injury counts, right-of-way type, and grade crossing control type.

Event Category. The "event category" field classified an incident as a "collision," "evacuation," "security," "derailment," "fire," or "not otherwise classified (NOC)." During the data cleaning process, it was observed that 73 of the 2,226 records identified as LRT-related were classified incorrectly, either due to an error in data entry or because the classifications were not mutually exclusive. For example, some incidents classified as derailments were actually the result of a collision, as indicated in the description field for the record. There were also collisions that were categorized as NOC, or simply had a blank category field. The classification of these collisions was updated to reflect the information provided in the "event description" field.

Collision Manner. The "collision manner" field described what other vehicles, objects, or individuals were involved in the collision. In 117 records, an entry of "with object: other object (describe)" was changed to "with motor vehicle" when the description field clearly indicated that another vehicle was involved.
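
The collision-manner correction described above relied on the project team reading each description field. Purely as an illustration, a keyword-based first pass over such records might look like the Python sketch below. The field names (`collision_manner`, `event_description`) and the keyword list are assumptions made for this example; they are not taken from the NTD schema or from the team's actual procedure, and any record flagged this way would still need manual review.

```python
import re

# Hypothetical keyword list: the report does not say which words the team relied on.
VEHICLE_WORDS = re.compile(
    r"\b(car|auto|automobile|truck|van|suv|taxi|motor vehicle)\b", re.IGNORECASE
)

OTHER_OBJECT = "with object: other object (describe)"
MOTOR_VEHICLE = "with motor vehicle"


def recode_collision_manner(record):
    """Return a copy of the record, recoded when the description names a vehicle."""
    updated = dict(record)  # leave the original record untouched
    description = record.get("event_description") or ""
    if record.get("collision_manner") == OTHER_OBJECT and VEHICLE_WORDS.search(description):
        updated["collision_manner"] = MOTOR_VEHICLE
    return updated


# Made-up example records, not NTD data.
records = [
    {"collision_manner": OTHER_OBJECT, "event_description": "LRV struck a truck turning left."},
    {"collision_manner": OTHER_OBJECT, "event_description": "LRV struck a shopping cart on the tracks."},
]
cleaned = [recode_collision_manner(r) for r in records]
```

Returning a copy rather than editing in place keeps the original entry available for comparison, which matters when a correction later turns out to be wrong.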
Lighting Conditions. The "lighting conditions" field described the natural and artificial illumination present at the time of the collision. For 49 records, the fields that described the prevailing lighting/weather conditions and time of day produced seemingly contradictory accounts. For example, one record indicated the time of the incident to be 3:00 a.m., while the lighting conditions were listed as "daylight, clear." In such cases, it was usually not possible to determine whether the time of day or the lighting condition had been incorrectly entered. Therefore, it was not possible to reconcile some of the apparent contradictions relating to lighting condition.

Dates. The date of the incident was omitted in 27 of the records contained in the database. In some cases, it was possible to retrieve the date from records of the same incident in either the SSO or local transit agency database. For example, a comparison of the NTD database with the California PUC (SSO) database for the Sacramento Regional Transit District allowed the recovery of four missing dates in the NTD data. In total, eight NTD records had dates recovered from either the SSO or local data for incidents with the same reported year, time, and incident description.

Injury Counts. The number of injuries provided in the "injury count" field conflicted with the number of injuries provided in the "event description" field for seven NTD records. For example, a record listed as describing one injury included a detailed description noting that six individuals were taken to the hospital to be treated for injuries. In these cases, the "injury count" field was revised to match the "event description" field.

Alignment Type. The "alignment type" field classified the ROW as exclusive, semi-exclusive, non-exclusive, etc. The classification of the right-of-way in the "alignment type" field was updated in 62 records where the text description clearly indicated the alignment. Unfortunately, the description fields provided insufficient information to update the alignment type for 176 of the data records.

Grade Crossing Control Type. The "grade crossing control type" field classified the control devices present at road/rail grade crossings. This data field was updated based on information contained in the description fields for 104 records. In most of these cases, the field was originally empty, but the text description identified the type of crossing control in sufficient detail.

Transit Vehicle. The type of transit vehicle was identified incorrectly in 19 of the data records. These records identified the transit vehicle involved in the collision as an LRT vehicle when the description indicated it was a bus. These records were later removed from the database (see the Non-LRT Vehicle section).

Many of the records in the NTD database suffered from incomplete and/or inaccurate information in key data fields, and required significant data cleaning in order to be included in subsequent data analysis. In many cases, the fields containing detailed descriptions of the incidents contained the information required to make the necessary corrections. However, in many instances it was not possible to make necessary corrections to the records because the record lacked sufficient descriptive information. This problem was partly due to the truncation of the detailed "event description" fields in both the NTD and SSO datasets. Discussion with FTA staff indicated that this was a database problem and that a fix eliminating the truncation would be forthcoming for the 2008 reporting (e-mail communications with FTA staff, Feb. 2008). Unfortunately, the problem was only identified after part of the description had already been lost.

In current and past editions of the S&S-40 online reporting form, the user is required to enter information in certain mandatory data fields (indicated with an asterisk) before the user can either save or submit the report. The NTD Safety and Security Manual (on the NTD web site at http://www.ntdprogram.gov/ntdprogram/safety.htm) has included a statement requesting that users input information into all data fields, but not all data fields are designated as mandatory. To prevent users from omitting information critical to safety analysis in the future, the NTD could expand the number of mandatory fields that require information to be entered by the user before a report can be submitted.

It is more difficult to solve the problem of inaccurate data. The NTD Safety and Security Manual already contains a detailed explanation for each data field, including descriptions of available options/answers and instructions for completion. The NTD continues to improve the Safety and Security Manual to address identified data reporting issues. To further mitigate the problem of incorrect data reporting, the NTD could produce an annual list of common mistakes made in data reporting, including an explanation of frequently misunderstood data fields. The feasibility of this may be an issue: to identify specific problems, the NTD would have to devote time and manpower to a data cleaning procedure similar to the one employed by the project team.

Data Record Removal. Following the data correction process outlined above, the database was examined to identify and remove records that were attributed to non-existent transit agencies, were the result of duplication of records for the same incident, or that incorrectly indicated the transit vehicle was an LRT vehicle. The following deletions were carried out:

Non-existent Transit Agency. Of the 2,226 records in the database, 2 were attributed to a non-existent "ABC Agency" and were deleted.
Duplicate Records. The NTD database allows users to access previously submitted incident records to make any necessary additions or changes subsequent to the submission of the data record. Each revised record is assigned a "revision number" corresponding to the number of revisions made to the record since original submission. Upon inspection of the NTD database, it appeared that duplicate records created for the same incident did not have a unique revision number. In most cases, the incident report was perfectly cloned for all but one or two characteristics which were likely added during the revision of the record. For example, two records would contain identical information, but one data record listed the number of fatalities and no injuries, while the second record listed the number of injuries and fatalities.

It was observed that many of the Southeastern Pennsylvania Transportation Authority (SEPTA) records suspected of being revisions contained slightly different information in many of the data fields. This made it very difficult to determine conclusively whether the two records represented one incident with a revised report, or two separate incidents.

In cases where it was clear that multiple records referred to the same incident, either the records were merged to create one record with all of the relevant information pertaining to the collision, or all of the records except the most recent update were deleted. In addition, the data records reported by SEPTA from the year 2005 onward contained many blank data fields. This further compounded the task of identifying duplicate records in the database. Thus, while some SEPTA records were repaired, the majority had to be left intact due to lack of available information.

Non-LRT Vehicle. As discussed in the Transit Vehicle section, 19 of the data records incorrectly classified the transit vehicle as an LRT vehicle when the description indicated it was actually a bus. These records were removed from the database.

Table 6 shows the number of records removed from the NTD dataset due to errors or omissions in data entry.
Table 6. Summary of records deleted from NTD dataset due to errors/omissions in data entry.

Transit Agency                                                   Total Records  Deleted Records  % of Total Records  Remaining Records
ABC Agency                                                       2              2                100.0%              0
Bi-State Development Agency                                      9              2                22.2%               7
Central Puget Sound Regional Transit Authority                   1              0                0.0%                1
Dallas Area Rapid Transit                                        92             18               19.6%               74
Denver Regional Transportation District                          23             6                26.1%               17
Hillsborough Area Regional Transit Authority                     12             4                33.3%               8
King County Department of Transportation Metro Transit Division  32             0                0.0%                32
Los Angeles County Metropolitan Transportation Authority         163            34               20.9%               129
Maryland Transit Administration                                  36             11               30.6%               25
Massachusetts Bay Transportation Authority                       55             15               27.3%               40
Memphis Area Transit Authority                                   3              0                0.0%                3
Metro Transit                                                    22             3                13.6%               19
Metropolitan Transit Authority of Harris County, Texas           104            14               13.5%               90
New Jersey Transit Corporation                                   2              1                50.0%               1
New Orleans Regional Transit Authority                           13             4                30.8%               9
Niagara Frontier Transportation Authority                        5              2                40.0%               3
Port Authority of Allegheny County                               14             0                0.0%                14
Sacramento Regional Transit District                             66             2                3.0%                64
San Diego Trolley, Inc.                                          41             7                17.1%               34
San Francisco Municipal Railway                                  185            17               9.2%                168
Santa Clara Valley Transportation Authority                      21             6                28.6%               15
Southeastern Pennsylvania Transportation Authority               1130           108              9.6%                1022
The Greater Cleveland Regional Transit Authority                 58             6                10.3%               52
Tri-County Metropolitan Transportation District of Oregon        90             17               18.9%               73
Utah Transit Authority                                           47             5                10.6%               42
Total                                                            2226           284              12.8%               1942
Total without SEPTA                                              1096           176              16.1%               920
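
The summary rows of Table 6 follow directly from the record counts. A minimal Python sketch of that arithmetic, using only totals printed in the table (the function name `deletion_summary` is illustrative and does not come from the report):

```python
def deletion_summary(total, deleted):
    """Return (remaining records, percent of records deleted rounded to 0.1)."""
    remaining = total - deleted
    pct_deleted = round(100 * deleted / total, 1)
    return remaining, pct_deleted


# Counts taken from the "Total" and SEPTA rows of Table 6.
all_agencies = deletion_summary(2226, 284)
septa_only = deletion_summary(1130, 108)

# The "Total without SEPTA" row is the difference of the two rows above.
without_septa = deletion_summary(2226 - 1130, 284 - 108)
```

Computing the non-SEPTA row as a difference rather than re-adding the 24 other agencies gives a quick cross-check on the printed totals.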
Table 6 shows that of the original 2,226 records, 284 records (12.8%) were deleted, resulting in a total of 1,942 records remaining. It should be noted that in the SEPTA data from the year 2005 onward, there was often insufficient information to determine whether one or more records were actually duplications of the same event. It is suspected that if more information had been available, the number of SEPTA records deleted due to duplication would have increased.

Based on the data records examined, it appears that the creation of duplicate records for the same incident is a problem that needs to be addressed. It is unknown whether this problem is the result of incorrect use of the NTD database by the user, or a malfunction in the database causing the creation of multiple records when revisions are made to an incident report.

The next step in the data cleaning process was the removal of records that were identified as being incidents other than collisions. The database provided by the NTD initially contained all incidents identified as meeting the criteria of a reportable incident and involving an LRT vehicle. However, because these incidents were reported based on the "reportable incident" criteria outlined in the Data Collected by NTD section, the records included incidents that did not involve an LRT vehicle colliding with a motor vehicle or pedestrian, which were the only incidents relevant to the project. Therefore, the project team conducted an examination of the data to remove these non-collisions from the database.

Prior to 2008, the online S&S-40 form required the user to specify a "primary event" and "secondary event(s)" under the "incident classification" heading. Both primary and secondary events listed in the S&S-40 form satisfied the NTD criteria of a reportable incident. The primary event was defined as the first harmful occurrence of the incident, while the secondary event(s) were event(s) resulting from the primary event. Based on this definition, the only incidents relevant to the analysis were those with the primary event classified as a collision. In total, 222 of the remaining 1,942 records were identified as non-collisions, based on the "collision classification" field, and removed from the database.

These records included:

· 5 records that had a blank "collision classification" field,
· 91 records identified as derailments with no evidence of a collision,
· 15 records identified as evacuations with no evidence of collision,
· 22 records identified as fires with no evidence of collision,
· 77 records identified as "NOC,"
· 3 records identified as security problems, and
· 9 records stating that the transit vehicle had left the roadway, which was assumed to indicate either a non-rail vehicle or a derailment, but in either case could not be confirmed to be the result of a collision.

In summary, the data cleaning process resulted in a total of 1,720 crash records remaining from the original 2,226 total records provided by the FTA. It is this remaining dataset that is analyzed in the following sections.

Disparity in Local Transit Agency Reporting to the NTD

Examination of the NTD database revealed a large disparity both in the number of collisions reported by individual transit agencies and in the total number of collisions reported by year. Table 7 shows the total number of collisions by year for each transit agency.

Some variation in the number of collisions is expected across transit agencies. Variation is inevitable because of differences in the size of the LRT system, measures of exposure to risk (e.g., vehicle revenue miles), ROW classification, etc. However, it was suspected that differences in data reporting procedures across transit agencies also accounted for a significant portion of the variation observed. In particular, the number of collisions reported by SEPTA was very high from 2002 to 2005, and then dropped to levels the project team considered more in line with the approach and level of reporting expected.

As demonstrated in Table 6 and Table 7, SEPTA accounted for over half of the total number of collisions in the NTD database both before and after the data cleaning process. In contrast, agencies such as the New Jersey Transit Corporation reported only one collision over the course of six years. These two agencies represented the polar extremes of the observed collision reporting, so the project team contacted them in an effort to understand the cause of the variation observed across transit agencies.

SEPTA staff identified two primary causes that they believed contributed to the overrepresentation of SEPTA incidents in the NTD database. The first was the fact that incidents involving LRT were often reported by multiple departments within SEPTA. For example, a single incident could be reported through both the vehicle maintenance system and workers' compensation if it resulted in both damage to the LRV and injury to the operator. According to SEPTA, this over-reporting of incidents continued until 2005, when the problem was identified and rectified by SEPTA staff.

The second explanation offered by SEPTA was the nature of the transit system itself. The collisions classified as related to LRT included eight street trolley lines, five of which operate in mixed traffic conditions with high exposure to automobile traffic. A sample of the SEPTA records was checked using mapping software, and it was confirmed that a large proportion of the collisions actually occurred at locations of full streetcar operation, in mixed traffic for an extended section. These sections operate differently from separate or median operating alignments, which do not share space with general traffic. While streetcar alignments in mixed-traffic are still
Table 7. Total crashes per year by transit agency from NTD database (2002–2007).

Agency 2002 2003 2004 2005 2006 2007 Total
Bi-State Development Agency 1 1 1 1 1 5
Dallas Area Rapid Transit 5 17 17 17 13 2 71
Denver Regional Transportation District 1 4 4 5 14
Hillsborough Area Regional Transit Authority 2 3 1 6
King County Department of Transportation Metro Transit Division 7 9 8 8 32
Los Angeles County Metropolitan Transportation Authority 42 16 30 8 10 16 122
Maryland Transit Administration 8 1 5 14
Massachusetts Bay Transportation Authority 1 2 6 6 4 4 23
Memphis Area Transit Authority 1 2 3
Metro Transit 1 5 3 3 12
Metropolitan Transit Authority of Harris County, Texas 28 31 14 17 90
New Jersey Transit Corporation 1 1
New Orleans Regional Transit Authority 2 1 1 4
Niagara Frontier Transportation Authority 2 2
Port Authority of Allegheny County 6 3 2 11
Sacramento Regional Transit District 12 4 22 7 7 4 56
San Diego Trolley, Inc. 7 3 5 5 2 8 30
San Francisco Municipal Railway 41 18 11 10 17 19 116
Santa Clara Valley Transportation Authority 2 2 1 3 1 3 12
Southeastern Pennsylvania Transportation Authority 202 171 147 364 47 16 947
The Greater Cleveland Regional Transit Authority 10 5 14 10 5 3 47
Tri-County Metropolitan Transportation District of Oregon 16 13 9 17 7 5 67
Utah Transit Authority 9 9 1 5 6 5 35
Grand Total (Count) 372 278 304 505 144 117 1720
Grand Total (Percentage of Total Crashes) 21.6% 16.2% 17.7% 29.4% 8.4% 6.8% 100%
Total without SEPTA 170 107 157 141 97 101 773
Total without SEPTA (Percentage of Total Crashes) 22.0% 13.8% 20.3% 18.2% 12.5% 13.1% 100%
considered as LRT (Type c.1) according to the previously used classification system of TCRP Report 69, most new LRT systems tend to avoid sustained operations in mixed traffic and so SEPTA's mixed traffic operations are not typical, and are not the focus of this study.

A third possible explanation was that SEPTA had reported collisions that do not meet the NTD reporting criteria. It appears that before 2006, the S&S-40 Internet reporting form did not filter incidents based on whether or not they satisfied the NTD reporting criteria. This feature has been added to the most recent installment of the S&S-40 form (7).

The SEPTA collision database included two separate fields used to measure the severity of collisions reported. The first field was "NTD Reportable," which included a response of either "Yes" or "No" for each collision, based on whether the collision met the NTD criteria of a reportable incident. The second field was "NTD non-major/major," which classified collisions as "major" or "non-major," or was left blank (unclassified). This field was used to classify each incident based on the NTD criteria for major incidents. Table 8 shows the number of collisions reported by SEPTA to the NTD based on their classification in the above two categories.

Since transit agencies are only required to report incidents meeting the criteria of a major incident to the NTD, it is expected that an incident SEPTA classified as "major" under the NTD classification would also have an entry of "yes" in
Table 8. SEPTA collision reporting by NTD classification (2002–2005).

NTD Classification in SEPTA Database  Flagged as NTD Reportable  Reported to NTD  Not Reported to NTD  Total
Major                                 Yes                        114              99                   213
Major                                 No                         13               12                   25
Non-major                             Yes                        10               9                    19
Non-major                             No                         227              213                  440
Unclassified                          Yes                        0                0                    0
Unclassified                          No                         392              242                  634
Total                                                            756              575                  1331
the "NTD reportable" field. Similarly, if an incident were deemed "non-major," it should have an entry of "no" in the "NTD reportable" field. However, Table 8 shows that 25 of the 238 incidents classified as "major" were also identified as not reportable to the NTD. In addition, 19 of the 459 incidents classified as "non-major" were identified as reportable to the NTD. This shows that although the entries in the "NTD non-major" and "NTD reportable" fields for each incident coincided in most cases, they were not always consistent.

Table 8 also shows that the classification of each collision based on the above categories had little impact on whether the collision was reported to the NTD. For example, only 114 of the 213 collisions classified as "major" and reportable to the NTD were actually reported to the NTD. In addition, of the 440 incidents identified as both "non-major" and non-reportable to the NTD, 227 were reported to the NTD. Finally, 392 of the 634 "unclassified" incidents, all of which were determined to be not reportable to the NTD, were actually reported to the NTD. These data suggest that the primary explanation for the high proportion of SEPTA incidents in the NTD database was the reporting of incidents not meeting the NTD criteria for a "major" or "reportable" incident.

It is unknown why some of the incidents classified by SEPTA as "NTD major" were considered non-reportable, and why some of the NTD "non-major" incidents were marked as reportable in their data. Technically, the NTD "major" incidents should all be reportable, while NTD "non-major" incidents should not be reportable. Unfortunately, SEPTA was unable to provide additional data or assistance for a more detailed review due to the staff time involved on their part.

In contrast to SEPTA, there were numerous transit agencies that reported very few collisions during the six years examined. For example, the New Jersey Transit Corporation and Niagara Frontier Transportation Authority reported only one and two incidents, respectively, to the NTD over a period of six years. The limited number of collisions can be partially explained by the fact that both of these transit agencies operate most of their light-rail network along exclusive rights-of-way. However, it seemed unlikely that this was the only factor involved in the low number of reported incidents for many of the transit agencies.

Discussions with staff from New Jersey Transit during the study team's site visit to the HBLR indicated that they had until recently only reported collisions to the SSO on the assumption that those reports would be routed to the NTD, although they still reported all other required regulatory data (on operations, etc.) to the NTD; the roles of the NTD and SSO in this case were not clear to the agency. Examination of the partial collision database provided by New Jersey Transit during the site visit substantiates this explanation. The New Jersey database contained a total of 50 incidents that occurred between 2002 and 2008. Of these incidents, 19 resulted in at least one injury, and 49 resulted in property damage. Although the data provided contained no further detail as to the number of injuries or extent of property damage for each collision, it seems likely that at least some of these incidents would have met the NTD reporting criteria. Thus, it is possible that many of the transit agencies under-represented in the NTD were not reporting many incidents that satisfied the NTD reporting criteria.

The variation in the number of collisions reported across the years should relate in part to changes in NTD's criteria for a reportable collision (see the Data Collected by NTD section). In 2003, all collisions at grade crossings were identified as reportable, but from 2004 to 2007 collisions at grade crossings were only reportable if they resulted in at least one injury away from the scene, or property damage exceeding $7,500. This change should be reflected in a decline in the number of collisions reported from 2003 to 2004, but the number of reported collisions did not drop until 2006. In this case, it appears that the change in collision reporting criteria did not have a discernible impact on the number of reported collisions. The significant drop in collision reporting between 2005 and 2006 can be explained by the steep decline in incidents reported by SEPTA. The decline in SEPTA reporting corresponds with the identification and elimination of the problem of multiple SEPTA divisions submitting collision reports to the NTD for the same incident.