
The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.


OCR for page 37
The use of software to check for data inconsistencies can provide a noticeable improvement with respect to the accuracy of the data, identify areas for data collection improvements, and standardize data formats. These improvements might allow for better data analysis and time-history updates.

QUALITY ACCEPTANCE

Quality acceptance activities include all procedures used for acceptance testing of both the pavement condition data that are collected by the agency and those that are collected by a service provider. These tests validate that the data meet the established requirements before they are used to support pavement management decisions. Quality management techniques commonly used for this purpose include testing of control and verification sites, sampling and re-rating, complete database checks, GIS-based quality acceptance checks, and time-history comparisons.

The most common methods/tools used for quality acceptance are the following (in order of decreasing frequency; the percentage of agencies citing each method/tool is provided within brackets):

1. Calibration of equipment and/or analysis criteria before the data collection [80%],
2. Testing of known control segments before data collection [73%],
3. Periodic testing of known control segments during production [71%],
4. Software routines that check if the data are within the expected ranges [71%],
5. Software routines that check for missing road segments or data elements [61%],
6. Statistical/software routines that check for inconsistencies in the data [50%], and
7. Comparison with existing time-series data [50%].

Quality Acceptance Plan

Figure 16 summarizes the percentage of agencies that indicated that they have a formal pavement condition data quality acceptance plan. Two additional agencies indicated that they are working on developing such a plan.

Quality acceptance processes typically require that a sample or all of the data are checked to determine if some of the data may need to be corrected or resurveyed. Although listed by some of the agencies as quality acceptance activities, it is the opinion of the authors that the first two (calibration and testing of control sections before data collection) are more correctly classified as quality control activities. The quality acceptance procedure, however, could verify that these procedures were conducted as specified and that the required tolerances were met. In addition, the procedures typically include testing of known or blind control sections, automatic checks on all the data, detailed checks on a sample of the collected data, and comparisons with data from previous data collection campaigns. In the case of data collection contracts, quality acceptance is often also linked to payments.

FIGURE 16 Percentage of highway agencies having a formal quality acceptance plan. [Bar charts. Formal plan: Yes 48.2%, No 37.5%, Not sure 7.1%, No response 7.1%. Origin of the plan: developed by agency 32.1%, prepared by independent third party 7.1%, other 8.9%.]

The percentages of the data that are typically reviewed by the agencies are shown in Figure 17. It may be noted that the survey did not differentiate between general checks for completeness and reasonableness and detailed checks for accuracy, repeatability, and reproducibility.

An example of a detailed quality acceptance plan is presented elsewhere (93). The New Mexico DOT checks the quality of the pavement condition data collected by its service providers (universities) for consistency, completeness, and reasonableness. The agency checks that all values fall within acceptable data ranges and that the distress types and severities are reasonable. The agency also randomly selects sites and conducts data checks on both blind and known locations. These checks include comparing results with previous years' data to identify locations where large changes occurred. Where large changes have occurred, the data have to be checked for reasonableness and consistency (93).

Similar procedures are followed by agencies that collect data in-house. MDSHA uses quality acceptance checks to validate that the quality control process was conducted properly. The quality acceptance is done by a quality assurance auditor, who is not the operator. This quality auditor checks the data management spreadsheet to verify that the data are complete, verifies that all data have been saved and backed up, and re-checks a random sample of 10% of the data collected (66).

FIGURE 17 Percentage of data reviewed for quality assurance. [Pie chart. Question: If you have a pavement data collection quality assurance plan, what percentage of the data collected do you typically review in this plan? Responses: > 10%, 34.1%; 2 to 5%, 24.4%; 6 to 10%, 17.1%; the < 2% and None categories together account for the remaining 24.4% (slices of 9.8% and 14.6%).]

Control and Verification Site Testing

Approximately three-fourths of the agencies use known or blind control sites as part of the quality acceptance procedure. Blind sites are often used for distress rating comparisons. Although the terms control and verification sites are often used interchangeably, there is a practical difference:

- Control sites are those in which the reference measurements have been determined and can be used to determine both accuracy and repeatability of data. The control sites are measured using the reference procedure; for example, manual visual survey for distress data.
- Verification sites are used to determine continued repeatability and/or reproducibility. They are measured periodically by the same equipment/crew or by different devices/crews. The first case is typically called oversampling and allows the repeatability to be determined. The second is referred to as cross-testing and determines the reproducibility.

The percentage of data collected that has to be corrected/resurveyed as a result of deficiencies identified by the quality acceptance process is very similar for in-house and contracted data collection. Approximately two-thirds of the respondents (64% for in-house and 67% for contracted) indicated that their staff or the service providers need to correct less than 2% of the data collected. Most of the others (30% for in-house and 33% for contracted) reported having to correct between 2% and 5% of the data.

Establishing Acceptance Criteria

A key aspect of the quality acceptance procedure is the definition of what constitutes "acceptable" data. Agencies need to define the criteria to determine how much variation is allowable between the reference value (or ground truth) and actual data measurement. The criteria are usually different for the various pavement condition indicators (e.g., smoothness and distress data), and include limits for accuracy and repeatability. It is important that the criteria reflect the actual capabilities of the available technology and that the service provider and agency agree on the acceptance criteria. For example, the tolerance for IRI might be 5%. Table 3 presents, as an example, the criteria originally defined for the Pennsylvania DOT for quality acceptance. Additional examples of tolerances used by DOTs are compared in Table 4.

TABLE 3
INITIAL PAVEMENT CONDITION ACCEPTANCE CRITERIA FOR THE PENNSYLVANIA DOT

| Reported Value | Initial Criteria | Percent Within Limits (PWL) | Recommended Action if Criteria Not Met |
|---|---|---|---|
| IRI | 25% | 95% | Reject deliverable |
| Individual Distress Severity Combination | 30% | 90% | Feedback on potential bias or drift in ratings. Retrain on definitions. |
| Total Fatigue Cracking | 20% | 90% | Reject deliverable |
| Total Non-Fatigue Cracking | 20% | 90% | Reject deliverable |
| Total Joint Spalling | 20% | 90% | Reject deliverable |
| Transverse Cracking, Jointed Plain Concrete | 20% | 90% | Reject deliverable |
| Location Reference--Segment/Offset | Correct Segment | All | Return deliverable for correction |
| Location Reference--Segment Begin | 10 ft | 95% | Return deliverable for correction and systems check |
| Panoramic Images | Legible signs | 80% | Report problem. Reject subsequent deliverables. |

Source: Ganesan et al. (94).
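The acceptance logic implied by Table 3 can be sketched in a few lines of Python. This is only an illustration, not the Pennsylvania DOT procedure: it assumes that the initial criteria are relative tolerances around the reference value, and the function names (`percent_within_limits`, `accept`) and all IRI values are hypothetical.

```python
def percent_within_limits(measured, reference, tolerance):
    """Percentage of paired values whose difference from the reference
    is within the relative tolerance (e.g., 0.25 for 25%).

    Hypothetical helper; assumes nonzero reference values.
    """
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    within = sum(
        1 for m, r in zip(measured, reference)
        if abs(m - r) <= tolerance * abs(r)
    )
    return 100.0 * within / len(measured)


def accept(measured, reference, tolerance, required_pwl):
    """True when the percent within limits meets the required level."""
    return percent_within_limits(measured, reference, tolerance) >= required_pwl


# Made-up IRI values on five control segments (same units in both lists).
reference_iri = [80.0, 95.0, 120.0, 150.0, 200.0]
measured_iri = [84.0, 90.0, 160.0, 155.0, 190.0]

pwl = percent_within_limits(measured_iri, reference_iri, tolerance=0.25)
decision = "accept" if accept(measured_iri, reference_iri, 0.25, 95.0) else "reject deliverable"
print(f"PWL = {pwl:.0f}% -> {decision}")
```

With these made-up numbers, one of the five segments differs from its reference by more than 25%, so the percent within limits falls below the 95% level listed for IRI and the sketch reports a rejection.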

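The software routines listed among the quality acceptance methods (range checks, checks for missing road segments or data elements, and comparison with earlier surveys) can be illustrated with a short Python sketch. It does not reproduce any agency's program; the data element names, the expected ranges, the threshold for a "large change," and the segment identifiers are all assumptions chosen for illustration.

```python
# Hypothetical expected ranges per data element; real limits are agency-specific.
EXPECTED_RANGES = {"iri": (30.0, 600.0), "rut_mm": (0.0, 40.0)}
# Assumed threshold for what counts as a "large change" between surveys.
MAX_YEARLY_CHANGE = {"iri": 50.0}


def check_segment(segment_id, current, previous=None):
    """Return a list of flags for one segment.

    current/previous map a data element name to its value (None = missing).
    """
    flags = []
    for element, (low, high) in EXPECTED_RANGES.items():
        value = current.get(element)
        if value is None:
            flags.append(f"{segment_id}: missing {element}")
        elif not low <= value <= high:
            flags.append(f"{segment_id}: {element}={value} outside [{low}, {high}]")
    if previous:
        for element, limit in MAX_YEARLY_CHANGE.items():
            now, before = current.get(element), previous.get(element)
            if now is not None and before is not None and abs(now - before) > limit:
                flags.append(
                    f"{segment_id}: {element} changed by {now - before:+.1f} since last survey"
                )
    return flags


def check_database(expected_ids, current_year, previous_year):
    """Flag segments absent from the delivery, then check each delivered one."""
    flags = [f"{sid}: segment missing from delivery"
             for sid in expected_ids if sid not in current_year]
    for sid in expected_ids:
        if sid in current_year:
            flags.extend(check_segment(sid, current_year[sid], previous_year.get(sid)))
    return flags


# Made-up delivery with one out-of-range value, one missing element,
# one large year-to-year change, and one segment that was never delivered.
current = {
    "A-001": {"iri": 95.0, "rut_mm": 4.0},
    "A-002": {"iri": 720.0, "rut_mm": None},
    "A-003": {"iri": 180.0, "rut_mm": 6.0},
}
previous = {"A-003": {"iri": 110.0, "rut_mm": 5.0}}

report = check_database(["A-001", "A-002", "A-003", "A-004"], current, previous)
for line in report:
    print(line)
```

Running checks of this kind on each partial delivery, rather than once at the end, is one way to catch problems before large quantities of data have been collected.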
TABLE 4
EXAMPLE OF TOLERANCE FOR VARIOUS PAVEMENT INDICATORS

| Condition Indicator | Virginia: Range | Virginia: Criteria | British Columbia: Accuracy | British Columbia: Repeatability |
|---|---|---|---|---|
| Smoothness (IRI) | | | 10% of Class I | 0.1 m/km |
| Rut Depth | | | 3 mm | 3 mm |
| Pavement Condition Index (surface distress) | 10 | 95% | 1 PDI [scale 1 to 10] | 1 SD of PDI for five runs |

PDI = Pavement Distress Index; SD = standard deviation.

Sampling for Quality Acceptance Testing

Another important aspect is the determination of how large a sample is needed for determining and verifying the accuracy and repeatability of the data collection procedure. For example, for automated or semi-automated distress data collection it is common to extract a sample of the collected pictures and review the ratings for accuracy. A 5% sample is common for this purpose; however, statistical approaches can be used to determine the required sample size. Larger samples may be required for research-quality data; for example, the LTPP distress data collection protocol requires that 10% of each lot is checked for distress mismatches, questionable severity levels, or errors in the test section or survey date.

Selezneva et al. (88) present a series of promising sampling approaches that have been evaluated for quality assurance of the data collected for the National Park Service PMS. The data are surveyed by a service provider that collects pavement and right-of-way images, rutting, smoothness, road horizontal and vertical alignment, and GPS coordinates. Surface-cracking data are determined from the images using an automated crack detection system that detects the type, severity, and amount of cracking within a 0.01-mile section. The data are then aggregated in indexes for individual surface distresses, a composite distress index [the surface condition rating (SCR)], and an overall pavement condition rating (PCR).

The quality control and acceptance checks required for field data collection are documented in the Road Inventory Program--National Park Service Quality Assurance Manual (95). The checks include equipment checks, diagnostics of data collection and processing hardware and software, and a verification survey of a sample of selected parks by a review panel. Collected distress data are subject to a two-step data quality control and acceptance process in which the service provider first applies a series of internal quality control checks and the FHWA then conducts quality acceptance checks. The quality acceptance checks verify that the collected data are rated in accordance with the approved methodology for distress identification and distress severity ratings by manually rating selected pavement distress images provided by the service provider's automated crack detection system. The size of the sample requiring these latter checks is determined using the following procedures.

Sample Size Determination for Assessing Network-Level Accuracy

This section discusses available tools to determine how large a sample is needed to verify that the accuracy of the measurements is within a specified range (e.g., 10%) with a certain degree of confidence. Statistical testing of the mean differences is typically conducted using two- and one-sided t-tests. The size of the sample required for conducting meaningful comparisons is a function of the following statistical parameters:

1. Test significance (alpha) used as threshold for statistical significance (e.g., an alpha of 0.05 is typically selected to achieve a 95% level of confidence);
2. Desired precision or maximum acceptable difference;
3. Variability in the population, determined as the standard deviations of the computed differences (unknown at the time of analysis); and
4. Test power, or probability of correctly rejecting the null hypothesis (no difference between surveys) when it is false, which is typically computed by using a parameter beta defined as a probability of not detecting a difference when the difference exists (e.g., to achieve test power of 90%, beta would be limited to 10%).

The sample size is selected by balancing accuracy and cost. Although a larger sample allows for identifying a finer mean difference as statistically significant, the cost and effort of obtaining the sample and processing and analyzing the data may offset the benefit of the added precision of the results. A pilot application of the methodology for one park showed that a tolerance level of five SCR points would require 103 to 112 sample sections, which was considered relatively high because of the high cost for the field survey. If the tolerance level for the difference in mean SCR values between manual and automated surveys is relaxed to 10 SCR points, it would require 26 to 28 sample units, which was considered more reasonable (88). This level of precision, 10 points on a 100-point index, is also used by other agencies.

Sample Size Determination for Conformity Testing

To compare a sample of individual measurements against the reference and determine what percentage of the data collected fail to meet the quality acceptance criteria (e.g., 20%), Selezneva et al. (88) recommend the use of the lot acceptance sampling plan methodology. The decisions about acceptance of production data are based on counting the number of unacceptable quality observations in a random sample of observations from a set.

To determine the required sample size for a known targeted maximum number of unacceptable observations in the sample, the analyst defines the following statistical parameters:

1. Acceptable quality level or the percentage of automatically collected data points that are expected to differ from the reference value assessment as a result of limitations of production technology;
2. Lot tolerance percent defective (LTPD) or percentage of automatically collected data points differing from the reference value assessment of the same data points that would make a dataset unacceptable by the specifications;
3. Type I Error (service provider's risk) or probability of rejecting a dataset that has a defect level equal to the acceptable quality level; and
4. Type II Error (client's risk) or probability of accepting a dataset with a defect level equal to the lot tolerance percent defective.

For large datasets, where the sample size is less than 10% of the total number of observations, the number of observations, n0, that must pass the acceptance criteria without a single unacceptable observation can be computed using Eq. (1), and the number n that must pass the criteria with no more than one unacceptable observation using Eq. (2):

$$ n_0 = \frac{\ln(1 - C)}{\ln(R)} \tag{1} $$

$$ R^{n} + n(1 - R)R^{n-1} = 1 - C \tag{2} $$

where
R = reliability of the data production procedures expressed as a fraction; the percentage of the observations not passing the pass/fail criteria because of limitations of the production methodology is equal to 100(1 - R); and
C = confidence level expressed as a fraction (for a 95% level of confidence, C = 0.95).

Other Statistical Analyses

Cohen's weighted Kappa statistic has also been proposed as a measure of agreement between raters, which evaluates the probability of agreement beyond chance. This method allows for the use of weights between disagreements, so that less important disagreements have less of an effect than more important disagreements and more weight can be given to those distresses that have the most effect on PMS decisions. For example, inconsistencies among distress severities are weighted less than disagreements among distress identification. This type of analysis has been shown to be more effective in identifying data inconsistencies than traditional methods, but assignment of weights and benchmarks needs to be carefully done (86).

Complete Database Checks

Typical quality acceptance procedures also include automatic checks on all the collected data to (1) determine if the data are within the expected ranges, (2) determine if there are any missing road segments or data elements, and/or (3) conduct simple statistical analyses and/or checks to find possible inconsistencies in the data. This can be done as the data are being submitted (e.g., weekly) or after the entire product has been submitted. It is recommended that at least some of these checks be conducted frequently to identify possible issues as soon as possible and avoid collecting large quantities of bad data.

These checks are similar to the ones discussed for quality control but are conducted as a second check by the owner agency in the case of contracted data collection or by an independent quality acceptance auditor (internal or external) in the case of in-house data collection. Examples include checks to verify essential "general" information included in the condition database, sensor checks to flag out-of-range values for different indicators, and distress checks to verify that the distresses identified match the surface type (e.g., fatigue cracking on asphalt pavements).

An example of a well-developed set of software checks has been presented by Wolters et al. (52). This publication describes an application developed for Oklahoma DOT for checking the pavement data quality. The program conducts four types of checks: preliminary, sensor, distress, and special. All of the data checks can be organized into reports to identify inconsistencies and areas where the data collection protocol may need to be modified (52).

The Iowa DOT quality acceptance process includes checks to verify that the measurements are between the expected minimum and maximum values, and to identify segments with missing roughness, rutting, or distress data, and sections in which all distress values are zero on continuous segments. The quality acceptance procedures also compare pavement condition index values from year to year. To improve communications with the service providers and reduce the amount of rejected data, some of these steps have also been incorporated into the service provider's quality control process (73).

Use of Geographic Information Systems in Pavement Condition Data Quality Management

One particular technique that is gaining acceptance is the use of GIS-based checks for quality acceptance. The visualization