Read "Guidelines to Improve the Quality of Element-Level Bridge Inspection Data" at NAP.edu

Suggested Citation:"3. Findings and Applications." National Academies of Sciences, Engineering, and Medicine. 2019. Guidelines to Improve the Quality of Element-Level Bridge Inspection Data. Washington, DC: The National Academies Press. doi: 10.17226/25397.

3 FINDINGS AND APPLICATIONS

This portion of the report describes the research and findings from the project. It includes a survey of bridge engineers conducted to determine the current state of the practice for element-level inspection. The development of a guideline to improve the quality of element-level inspections is described, including a methodology for establishing accuracy requirements and evaluating the impact of an accuracy requirement on bridge management system deterioration forecasting. The results of field exercises conducted to evaluate the guidelines and the quality of element-level data are reported.

3.1 Survey of Bridge Engineers

A survey of state bridge engineers and other stakeholders was completed during the early stages of the research to identify key data relevant to the research objectives. The survey was completed during November and December of 2015. A complete summary of the survey results and the detailed responses to each survey question are included in Appendix A. A summary of key findings from the survey is included in this section of the report.

The survey was formatted electronically and was distributed to individuals identified by members of the Standing Committee on Bridges and Structures (SCOBS) from each participating state. The survey consisted of 52 questions. Thirty-six agencies responded to the survey. The responding agencies included 34 state Departments of Transportation (DOTs), Washington, D.C., and the Corps of Engineers. A complete listing of the agencies responding to the survey is included in Appendix A. About 80% of the agencies responding to the survey began collecting element-level data prior to 2010, indicating that the population of respondents had experience in element-level inspection that pre-dated federal requirements for reporting element-level data.
A key question addressed in the survey was whether agencies using element-level inspection had element-level inspection manuals to support the inspections. It was significant that more than 2/3 of respondents indicated that they had an element-level inspection manual that included at least the FHWA-required elements and the defect elements identified in the MBEI. Among these respondents, more than ½ had a manual that included defect elements. About 1/5 of the respondents indicated they did not have a manual at this time, though half of these respondents indicated a manual was being planned. These data are significant in demonstrating that agencies are responding to the need to develop manuals to support element-level inspections and are adjusting existing element-level manuals to meet the needs of the MBEI.

It was found that the use of photographs to illustrate different CSs was limited among those agencies with element-level manuals; only about 1/3 of respondents indicated that their manual includes photographs of different CSs. It is also significant to note that 2/3 of these respondents indicated that their manual was in a format that is suitable for use in the field by inspectors. These data indicated that there was a need for a manual or guideline to provide standard photographs of element CSs, and that the manual should be usable in the field.

Several questions in the survey were designed to develop an understanding of the current state of the practice regarding the use of quantitative data to describe CSs within agencies implementing element-level inspection. For example, question 20 of the survey asked if the responding agency was using the quantitative descriptors included in the MBEI commentary to determine the CS for concrete bridge decks with cracks.
A majority of respondents, 70%, indicated that they were using the quantities identified in the commentary in the MBEI (with 2015 revisions), and another 20% indicated they were using the description in the previous version of the MBEI (without 2015 revisions) or had developed their own quantitative descriptions. In total, 33/36 respondents indicated that they were using some type of quantitative descriptors to define condition states. This was significant to the current study because quantified CS descriptions are more objective than subjective descriptions and are therefore more likely to have a measurable quality. Only three respondents indicated that they were using subjective criteria; of these, two respondents indicated that they had procedures for "calibrating" inspectors, which generally consisted of quality assurance (QA) processes conducted on a yearly basis.

Several questions were aimed at determining whether agencies were using accuracy requirements for element-level inspection as part of their business practices. For example, question 23 asked if the respondent had a specified accuracy requirement for CS quantities. In general, respondents indicated that they did not have a specified accuracy tolerance for CS quantities, with only 5/36 respondents indicating that they did have such an accuracy requirement. Among the five indicating that they had a tolerance, two respondents appeared to be in error. One respondent indicated that they had a graduated tolerance that decreased as the CS went from good to severe (CS 1 15%, CS 2 10%, CS 3 5%, and CS 4 1%). A second respondent indicated that they used tolerances that varied according to element type, ranging from 0% for elements with units of ea (each) up to 15% for wearing surfaces and protective coatings.

Question 27 asked if agencies had accuracy requirements for defects. Twenty-two agencies reported using defect elements, but only two of these indicated an accuracy requirement for defects. Of these two, one indicated that a subjective requirement was used by comparing inspector ratings to a QA review rating, and the second indicated that accuracy requirements were included in their inspection manual. A review of the subject manual revealed guidance on the measurement of steel and concrete cracking to the nearest 1/32" (1 mm).

Question 32 asked how the results of quality assurance processes are conveyed to the inspection team leaders.
Research has shown that feedback is an important factor in the quality of visual inspection (VI) results (Megaw 1979, Drury and Watson 2002, Melchore 2011, See 2012). Consequently, the feedback provided through both formal QA processes and less formal group meetings is an important component in the quality of inspection data. The results shown in Figure 1 illustrate that most respondents use a combination of face-to-face meetings and a report to convey the results of the QA process. Specifically, the data showed that 14% of respondents used a report, 23% used a face-to-face meeting, and 63% used both.

Figure 1. Pie chart showing distribution of responses for question 32.

Another form of providing feedback to inspectors is to hold periodic meetings of inspectors to discuss inspection results, rating procedures, and requirements. Survey questions 34 and 35 asked if the agency held periodic inspection meetings, and if so, the frequency of the meetings. Question 35 also differentiated between meetings of all inspection teams and meetings at a district or other sub-division level. Responses indicated that almost 90% of respondents hold periodic meetings of inspectors. Responses from those holding meetings were analyzed to include responses noted as "other" if those responses generally fit within a provided frequency option. For example, some respondents indicated a monthly meeting, but the choices provided were annual, biannual, or quarterly. Based on this analysis, it was found that 81% of the respondents held a meeting of all inspectors at least annually; 53% indicated an annual meeting and 28% could be characterized as monthly, quarterly, or biannually. These results are significant in showing that most agencies are providing feedback and information to their inspectors through face-to-face meetings.

A series of questions in the survey focused on inspection procedures and practices. Because the bridge inspection process can vary significantly for different situations encountered in the field, an example bridge was provided for consideration in responding to the survey. The example bridge provided a context for certain questions regarding how such a bridge would typically be inspected.

One of the quality dimensions commonly assessed is the consistency of area estimates for different condition states. The ability of the inspector to estimate any given area is not well understood. As such, several questions were designed to gain some insight regarding how quantity estimates were made in the field, and whether these procedures were documented. For example, question 42 inquired about how damage quantities would be estimated for the example bridge deck, assuming that the decks had a total area of 11,880 sq ft with 11,628 sq ft in CS 1 and 252 sq ft in CS 3. This question was focused on determining if the quantity is obtained subjectively, i.e., by a visual estimate of an area, or quantitatively, i.e., by using an objective measure such as a tape or measuring wheel and tallying the total area. Figure 2 shows that responses were fairly evenly divided between those that would use an area estimate and those that would use a measurement device. A small number indicated that the damage would first be drawn on a sketch, or that the respondent did not know what method would be used. A follow-up question asked if the method of estimating area was documented in an inspection manual, provided through training, or was at the discretion of the inspection team.
More than 60% of the respondents indicated that the method for making the measurement was at the discretion of the inspection team. Less than 10% of respondents indicated that the method used was documented in an inspection manual or other resource; about 30% indicated the method is described in training or periodic meetings. These data indicate that there is variation in the methodology for making a quantity estimate, and consequently one would expect some variation in the consistency of results.

Figure 2. Bar graph showing the survey response to question 42 regarding estimating defect areas.

Previous research has indicated that the time required or allowed to complete an inspection has an impact on the quality of results. It is interesting to note that, similar to previous studies, the anticipated time required to complete a routine element-level inspection of the example bridge varied significantly. Two questions focused on the time element for inspecting the example bridge. The first asked how long the respondent would expect an inspection team to spend inspecting the example bridge and preparing the required report for a first-time inspection. As shown in Figure 3, estimates ranged from less than 2 hours to greater than 8 hours. Another question asked how long the respondent would expect an inspection team to spend on a subsequent element-level inspection. These results are also shown in Figure 3 and likewise ranged from less than 2 hours to greater than 8 hours. As expected, first-time inspections were generally expected to take longer than subsequent inspections. One respondent skipped these two questions.

Given that different agencies have different business practices and are collecting different levels of detail during element-level inspections, the fact that time estimates differ is not surprising. However, the extent to which they differ, from less than 2 hours to greater than 8 hours, is indicative of a large variation in approaches to element-level inspection among different bridge owners. Another question asked if the agency required inspectors to record the actual time taken to complete an inspection. Seventy percent of respondents indicated that recording the time was not required.

Figure 3. Bar graph showing estimated time to complete a first-time and routine element-level inspection.

Access to the elements to be inspected is another factor in the quality of inspection results that was studied through the survey. Respondents were divided in their responses for the example bridge provided in the survey.
Asked if access equipment would be used, approximately 47% of respondents said yes and 53% said no; two respondents skipped this question. Asked if traffic control would be used, approximately 46% said yes and 54% said no; one respondent skipped this question. Again, these data provide insight into variations between the business practices of different agencies.

Finally, the survey asked what photographs were required as a part of the inspection. Photographs can provide important documentation of conditions and defects present in a bridge. Photographs can provide context regarding the advancement of deterioration over time for future inspections and allow for quality review of inspection results. About 70% of respondents indicated that the photographs described in the MBEI were required; other responses indicated that photographs of defects that resulted in CS 2, 3, or 4, or an NBI component condition rating below a 6, were required. About 16% of responses indicated that inspection teams determine the appropriate photographs.

Overall, the survey provided important information regarding the state of the practice for element-level bridge inspection. Given the evolving nature of element-level inspection programs, the survey provides a snapshot in time of the inspection practices. Generally, the survey illustrated that responding agencies were taking a proactive approach to implementing the MBEI and new element-level inspection requirements, developing element-level manuals, and utilizing some quantitative descriptors to identify condition states. QA programs that included feedback through both face-to-face meetings and reports illustrate a commitment to quality and an important understanding of the need for feedback to improve the quality of visual inspections. The survey also revealed that there were broad variations in approaches to inspecting a common highway bridge, with significantly different estimated times to complete the inspection and different methods of access, traffic control, and tools used during routine inspections.

The survey also indicated that most of the responding agencies did not have accuracy requirements for element-level inspection data. A methodology for developing such an accuracy requirement is described in this report. It was also shown in the survey that different methods of estimating area are used by different agencies, with about half of the respondents indicating that percentage estimates are used and half indicating that the areas are measured and tallied. The effect of these different approaches was evaluated during the field exercises completed as part of the research. It was also found that, at the time of the survey, most agencies did not have manuals that included photographs of elements in different CSs.
A visual guide including visual standards was developed as a part of the research, as described below.

3.2 Guidelines to Improve the Quality of Element-Level Data

The document "Guidelines to Improve the Quality of Element-Level Bridge Inspection Data" was completed during the course of the research. The guidelines are documented in two appendices of this report. The guidelines include recommendations for determining accuracy requirements for element-level bridge inspection data, and field test methods for improving quality. This portion of the guideline is presented in Appendix B. The primary feature of the guidelines is a visual guide that is intended for use in the field to improve the quality of element-level data. This visual guide was incorporated into the MBEI during the course of the research, and is presented in Appendix C. The development of the guidelines is described in this section.

3.2.1 Visual Guide

The visual guide is designed as a tool for determining the appropriate CS for a defect element. The guide consists of photographs (images) of defect elements in different CSs (i.e., CS 2, CS 3, etc.). The images provide visual standards to which a condition observed in the field can be compared to determine the appropriate CS assignment. The guide is generally organized according to element material, such as reinforced concrete, prestressed concrete, or steel. Defect elements for steel protective coating, movable bearings, and joints are also included.

The defect elements shown in the visual guide are applicable to any element formed from that material, because the MBEI defect descriptions are uniform for a given material. For example, a concrete spall (defect 1080) has the same description whether the defect is in a concrete deck, girder, column, pier, or abutment. Therefore, only one visual standard describing each CS for the spalling defect is required.
In this way, the assignment of the appropriate CS is more likely to be consistent as compared with having different visual standards for the spalling defect in different bridge elements (i.e., deck, girder, column, pier, or abutment).

The visual guide was developed in part using images from existing bridge inspection manuals currently in use by DOTs. However, a review of the available images in state inspection manuals indicated that the images had some limitations. In many cases, the images did not clearly show the defect element, or the subject of the image was difficult to determine. In some cases, the images were not clear, or the resolution of the image was too low for use in the new visual guide. Many manuals did not have a complete set of images showing each CS for each defect element. Additionally, many of the images did not include a quantitative scale in the image. The appearance of a scale in the image was desirable to promote more uniform interpretation of the images.

To address these limitations, the research team captured new images in the field that were suitable to produce visual standards for most of the defect elements. Special scales were investigated that would provide a normalized, easily interpretable scale in the image. Scales that are designed for use in forensic crime scene investigations were obtained because these scales are intentionally developed for photographic documentation of scenes. Consequently, the scales include dimension delineations in segments that are easy to interpret in an image. Figure 4 shows the scales that were used in developing the visual guide. Several different crack comparators were also used for documenting crack sizes.

Figure 4. Photograph of the forensic science visual scales used for developing the visual guide.

It was also found that in some cases the subject of an image may not be readily apparent to the user. For example, a photograph may include portions of an element in more than one CS. To address this issue, red boxes were superimposed on some of the images to enclose the subject of the image. The box encloses the defect CS that is to be assessed using the image (e.g., CS 2, CS 3, etc.).

The selection of images for use in a visual guide is a very subjective process, because many of the CS descriptions are qualitative. Different images have different qualities that may be preferred, and those preferences are not necessarily consistent among different individuals.
For those defects that had quantitative descriptions, such as cracking in concrete, images were sought to represent the quantitative values. The quantitative values, most of which have been moved to the commentary in the current MBEI, were used to define the boundaries between different CSs. In some cases, CS descriptions lend themselves to having "boundary images" that define the boundary between two different CSs. This is done to provide the inspector with a "greater than or less than" comparison image to use in evaluating a particular situation. Boundary images that define the boundary between different CSs were included where appropriate.

Other elements cannot have boundary images because specific conditions define the different CSs. For example, steel coating peeling/bubbling/cracking (El. 3420) has CSs defined by the layer of paint that is affected. For CS 2, only the top coat is affected; for CS 3, both the top coat and the primer are affected; and for CS 4, bare metal is exposed. Therefore, there are no boundary images that can be used to divide CS 2, CS 3, and CS 4. For these reasons, different defect element visual standards vary according to the needs of the subject element and its associated CS descriptions.

The visual guide generally includes visual standards for CS 2 and CS 3, and boundary images where appropriate. An effort was made to minimize the number of images in the guide to allow the images to appear larger on the page and to simplify the use of the guide. A visual guide with an excessive number of pages is more difficult to use in the field than a guide with a smaller number of pages. Additionally, images that are redundant can lead to confusion and reduce consistency, since different inspectors may use different images to assess the same defect.

Photographs for CS 1 were not included in the visual guide because CS 1 is defined by the lack of the defect. Therefore, a photograph of CS 1 would be unnecessary for evaluating damage. Finally, for most defect elements, CS 4 is described in the MBEI by the general condition "warrants structural review." Determining if a condition warrants structural review is a subjective decision that relies on context. There is no way to produce a visual standard that defines a subjective decision. For example, a steel beam with corrosion damage resulting in a hole through the web is likely to be assigned CS 4. However, significant thinning that has not yet produced a hole may also be a situation that warrants structural review (CS 4), depending on the amount and location of section loss. Another rationale for not including CS 4 (for cases warranting structural review) is that these conditions are somewhat uncommon as compared with CS 2 and CS 3. Including additional images for CS 4 would require reducing the size of the images for other CSs, which may reduce the effectiveness of the visual guide for the majority of inspections.

Table 1 provides a list of the defect elements for which visual standards were developed during the research. This list includes many of the most common defects in the materials of steel, concrete, and prestressed concrete. Certain defect elements for joints, bearings, and protective coatings were also included in the guide.

Figure 5 shows an example from the visual guide for defect 1080, Delamination/Spall/Patched Area, with three relevant CSs: CS 1, CS 2, and CS 3.
Characteristics of the typical page are shown in this figure. The number and name of the element are shown in the upper left corner of the page. The top row of the table shows the different CSs, with CS 1 shown in green, CS 2 in yellow, and CS 3 in a darker tone of yellow. If appropriate, CS 4 is shown in red. The description of the CS from the MBEI is presented. If there are quantitative data from the commentary that are relevant, these appear below the primary description in smaller italicized text. The boundary images between CSs are shown in the lower portion of the page, where appropriate. For the boundary images shown in Figure 5, the definition of CS 2 is a spall 1 in or less deep or 6 in or less in diameter; the description for CS 3 is a spall greater than 1 in deep or greater than 6 in in diameter. The boundary image between CS 2 and CS 3 shows a spall that is 6 in in diameter and ~1 in deep. Therefore, any spalling defect that is larger (or deeper) than the spall shown in the image should be assigned to CS 3, while any spalling defect that is smaller (or less deep) should be assigned to CS 2. This allows for a "greater than / less than" comparison in the field when assigning the appropriate CS.
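The greater than / less than comparison for defect 1080 can be written as a simple rule. The following is a minimal sketch, assuming only the commentary values quoted above (1 in of depth and 6 in of diameter). The function name `classify_spall`, its parameters, and the constant names are hypothetical and are not part of the MBEI or the guideline. The sketch covers only the CS 2 / CS 3 boundary; it does not attempt CS 4, which depends on whether the condition warrants structural review.

```python
# Hypothetical illustration of the CS 2 / CS 3 boundary for defect 1080 (spall).
# The limits are the commentary values quoted in the text; everything else is assumed.
MAX_CS2_DEPTH_IN = 1.0      # spalls deeper than this are CS 3
MAX_CS2_DIAMETER_IN = 6.0   # spalls wider than this are CS 3


def classify_spall(depth_in: float, diameter_in: float) -> int:
    """Return 2 or 3 for a measured spall, using a greater-than comparison."""
    if depth_in <= 0 or diameter_in <= 0:
        raise ValueError("depth and diameter must be positive")
    # A spall larger OR deeper than the boundary image is assigned CS 3.
    if depth_in > MAX_CS2_DEPTH_IN or diameter_in > MAX_CS2_DIAMETER_IN:
        return 3
    return 2


if __name__ == "__main__":
    print(classify_spall(0.5, 4.0))   # shallower and smaller than the boundary image
    print(classify_spall(1.5, 4.0))   # deeper than the boundary image
    print(classify_spall(0.5, 8.0))   # wider than the boundary image
```

A spall that exactly matches the boundary image (1 in deep, 6 in in diameter) is treated as CS 2 in this sketch, which follows the "1 in or less deep or 6 in or less in diameter" wording.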

Table 1. Listing of defect elements included in the visual guide.

El. No.  Element Name                                                        Material / Element Type
1000     Corrosion                                                           Steel
1010     Cracking                                                            Steel
1020     Connection                                                          Steel
1080     Delamination/Spall/Patched Area                                     Concrete
1110     Cracking (PSC)                                                      PS Concrete
1090     Exposed Rebar                                                       Concrete
1120     Efflorescence/Rust Staining                                         Concrete
1130     Cracking (RC and Other)                                             Concrete
1190     Abrasion / Wear                                                     Concrete
2310     Leakage                                                             Joints
2330     Seal Damage                                                         Joints
2350     Debris Impaction                                                    Joints
2220     Alignment                                                           Bearing
2240     Loss of Bearing Area                                                Bearing
3410     Chalking (Steel Protective Coatings)                                Protective Coating
3420     Peeling/Bubbling/Cracking (Steel Protective Coatings)               Protective Coating
3430     Oxide Film Degradation Color/Texture Adherence (Stl Protect Coat)   Protective Coating
3440     Effectiveness                                                       Protective Coating

Figure 5. Visual standard for Defect Element 1080-Delamination / spall / patched area.

3.2.1.1 Spatial Estimation Diagrams

The guideline also includes diagrams intended to assist in making accurate quantity estimates in the field. Figure 6 shows an example of a spatial estimating diagram for use on bridge elements recorded in units of sq ft. The diagram is used by comparing the areas shown in the diagram with the appearance of damage in the bridge element being assessed. The diagram shows 5% of the area as damaged with different distributions of damage. For example, the top diagram shows 5% of the area, with damage distributed throughout the element. The bottom diagram shows 5% of the area with the damage in a single area. The diagrams for elements with units of sq ft are standardized to a scaled area of 4000 sq ft, measured as 40 ft x 100 ft. This size was chosen to provide a typical aspect ratio for a two-lane bridge with shoulders.

Linear estimating diagrams for comparison with elements described by units of ft are also included. The diagrams for elements described by units of ft are standardized to a scaled length of 100 ft. A guide for judging the area (sq ft) affected by pattern cracking is also included in the guide, as well as diagrams illustrating crack spacing in concrete.

Figure 6. Example comparison image for estimating area (5% of area).

3.2.2 Development of Accuracy Requirements

This chapter of the report describes a methodology for developing accuracy requirements for element-level bridge inspection data to support bridge management and deterioration modeling. It was found from the survey that very few agencies had accuracy requirements or tolerances for element-level data collection. There was no established method for determining the accuracy requirements for quantities recorded as part of an element-level bridge inspection. As a result, an approach for determining accuracy requirements needed to be developed during the course of the research.
The approach was developed based on the general concepts typically used for nondestructive evaluation (NDE) technology requirements. A process for measuring the impact of different accuracy requirements on bridge management systems was also developed. This process was used to determine the effect of different accuracy requirements on the outcome of a common deterioration model.

The methodology for developing accuracy requirements was generally based on the approach used for developing technology requirements for NDE in the aerospace and petroleum industries. Typically, this approach uses a statistical model that describes the anticipated variation in the inspection result for a given NDE technology. This variation is then compared with a decision threshold such as a critical crack length. The critical crack length is commonly determined based on fracture mechanics. The threshold length value for rejecting a crack based on an NDE result is determined such that a crack of critical size would be detected with a certain probability, usually 90% or more. Because there is some probability that an inspector will undersize or even miss a crack, the threshold crack size (i.e., the rejected crack size per the specification) may be smaller than the actual critical crack size. This is necessary to account for the statistical variation associated with the NDE results and to ensure a crack of critical size is likely to be detected at some prescribed confidence level.

The same concept has been adopted for developing an accuracy requirement for element-level inspection. This chapter explains an approach for element-level inspections in which the inspection result is a quantity of an element in a particular CS, as opposed to a crack size. The statistical variation of the inspection result is assumed to be normally distributed in the model developed. The dispersion in the inspection result is considered in determining the appropriate thresholds for decision-making and in examining the effect of that variation on decision-making.

Decision-making based on element-level inspection results is typically based on the percentage of an element in a particular CS. For example, the Michigan Bridge Deck Preservation Matrix utilizes element quantities to discern different actions such as overlay, sealing, or replacement of a bridge deck. The component-level condition ratings are also considered.
Generally, the Michigan rules indicate that decks with less than 10% damage (i.e., less than 10% in CS 3) will receive preservation-type actions, decks with 10 to 25% damage will receive maintenance actions, and decks with greater than 25% damage will receive repairs (Juntunen 2007, Weykamp, Kimball et al. 2010). Therefore, there are decision "thresholds" or "boundaries" at 10% and 25% of a deck in CS 3 that are used to decide on preservation, maintenance, or repair. These decision boundaries are shown in Figure 7A, which illustrates the decision boundaries and two normal distributions.

Now, assume that there are two different inspection procedures and one is known to be more accurate than the other. For example, assume procedure A characterizes damage in a deck by visual observation from alongside the roadway, and procedure B measures spalling and delamination using a chain drag and measuring tools (e.g., tape measures). Further, assume that the inspection results from these two procedures are normally distributed such that some inspections will underestimate the quantity in CS 3, while some will overestimate the quantity in CS 3. Such a normal distribution is shown in Figure 7. The standard deviation (σ) characterizes the variation in the inspection results statistically. With a normal distribution assumed, about 68% of inspection results will fall within ±1σ and about 95% of all results will fall within ±2σ. For this illustration, we assume that σ for visual inspection (procedure A) is 5% and σ for chain drag (procedure B) is 2.5%, as shown in Figure 7A. Given these assumptions, ~95% of inspection results would lie within ±10% of the mean for procedure A and within ±5% of the mean for procedure B (i.e., 2σ), as shown in Figure 7A. Now, assume that the actual condition of the deck is equal to 17.5%, the mean value of the decision thresholds (10% and 25%), and that the mean value of the inspection results is also 17.5%.
For procedure B (σ = 2.5%) almost all the inspection results would lie correctly between the decision boundaries of 10 and 25%, meaning that the correct decision would be made according to the agency rules. For procedure A (σ = 5%), some portion of the inspection results would produce an incorrect decision; those to the left of 10% would indicate preservation activity, while those located to the right of 25% would require repair, when the correct decision according to the agency rules is maintenance. This illustrates how the statistical variation of inspection results has an impact on decision-making; inspection procedures with lower variation (i.e., lower σ values) are more likely to result in a correct decision for a given set of decision thresholds or boundaries.
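The comparison of procedures A and B can be checked numerically. The following is a minimal sketch, assuming (as the text does) that inspection results are normally distributed and centered on the actual condition; the function name `prob_between` is hypothetical and introduced only for illustration.

```python
from statistics import NormalDist


def prob_between(actual, sigma, lower, upper):
    """Probability that a normally distributed inspection result falls
    between two decision boundaries (all values in % of the element)."""
    dist = NormalDist(mu=actual, sigma=sigma)
    return dist.cdf(upper) - dist.cdf(lower)


# Values from the illustration: actual condition 17.5% in CS 3,
# decision boundaries at 10% and 25%.
p_a = prob_between(17.5, 5.0, 10.0, 25.0)  # procedure A, sigma = 5%
p_b = prob_between(17.5, 2.5, 10.0, 25.0)  # procedure B, sigma = 2.5%

print(f"Procedure A: {p_a:.1%} of results indicate maintenance")
print(f"Procedure B: {p_b:.1%} of results indicate maintenance")
```

Under these assumptions the boundaries sit 1.5σ from the mean for procedure A and 3σ from the mean for procedure B, so roughly 87% and 99.7% of results, respectively, fall within the maintenance range.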

Figure 7. Examples of probability density functions for normally distributed inspection results.

This model can be used to determine the appropriate decision-making thresholds considering the variation in the inspection results and the importance of making a certain decision accurately. For example, assume that it is very important that almost no decks with 25% or more damage in CS 3 are assigned incorrectly for maintenance (<25%) rather than repair (>25%), and procedure B is used for inspection. In this case, the decision threshold value (assuming inspection procedure B, σ = 2.5%) should not be set at 25%, but rather at 20% (25% - 2σ = 20%), as shown in Figure 7B. In this way, the probability that a deck that actually has 25% damage would be characterized correctly as needing repair would be ~98%. This can be visualized by considering the normal distribution with a mean of 25% shown in Figure 7B. Assuming that the mean value is the actual amount of damage, most of the inspections that underestimate the amount of damage would still exceed the 20% threshold, as shown in the figure. This accounts for the probability that an inspector might underestimate the amount of damage in the deck. Of course, this also means decks with only 20% actual damage have a 50% chance of repair, because half the inspection results would exceed the threshold (overestimate the amount of damage) when the actual damage in the deck was 20%, as shown in Figure 7B (dashed line). If the decision was not critical, such that it was acceptable to have a lower likelihood of a correct decision at a particular threshold value, then the threshold value could be set higher. This simple example illustrates the considerations for determining an accuracy requirement. The accuracy requirement must consider the threshold values for decision-making as compared with the variability of the inspection results and the importance of the correct decision.
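The shifted-threshold example can be reproduced with a short calculation. This is a sketch under the same normality assumption used in the text; `prob_exceeds` is a hypothetical helper name, not part of any bridge management software.

```python
from statistics import NormalDist


def prob_exceeds(actual, sigma, threshold):
    """Probability that an inspection result exceeds the decision
    threshold when the deck's actual damage is `actual` (all in %)."""
    return 1.0 - NormalDist(mu=actual, sigma=sigma).cdf(threshold)


sigma_b = 2.5                                 # procedure B (chain drag)
critical = 25.0                               # damage level that should trigger repair
shifted_threshold = critical - 2 * sigma_b    # 25% - 2*sigma = 20%

p_repair_at_25 = prob_exceeds(25.0, sigma_b, shifted_threshold)
p_repair_at_20 = prob_exceeds(20.0, sigma_b, shifted_threshold)

print(f"Threshold: {shifted_threshold:.0f}%")
print(f"Deck with 25% damage flagged for repair: {p_repair_at_25:.1%}")
print(f"Deck with 20% damage flagged for repair: {p_repair_at_20:.1%}")
```

The two probabilities correspond to the ~98% and 50% figures quoted in the text: moving the threshold two standard deviations below the critical value protects against underestimates at the cost of flagging some decks that are below 25%.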
The impact of an accuracy requirement on decision-making is to affect the probability of a correct decision, and as a result, decision thresholds may need to be changed if the demand for a correct decision is high. In the case of deterioration modeling and system-level planning, there may be other factors in decision-making which are not related to inspection results, such as level of traffic, importance of the asset, costs, or forecasting future deterioration. Regardless, the trigger for preservation, maintenance, or repair will be affected by the rules associated with the accuracy of the inspection data, even if these decisions are prioritized differently due to other factors. The variation in the inspection results from element-level inspections is not known, although some initial data have been developed from the field exercises completed during the course of this research. To establish initial accuracy requirements without data describing the actual variation in inspection results, the following approach was used. First, it was assumed that when the actual condition lies at the mean of a given decision interval, it is acceptable for "typical" decisions that 68% of decisions are correct (i.e., +/- 1σ). In this case, the accuracy requirement is equal to the interval between the decision boundaries divided by 2. For an "important decision," where a higher degree of accuracy is required, it is assumed that it is desirable to have 95% of decisions be correct. In this case the accuracy requirement is equal to the interval divided by 4 (+/- 2σ). Figure 8 illustrates these accuracy requirements graphically for decision boundaries of 10 and 20%.
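The rule described above can be written out as a short calculation. This is an illustrative sketch only: the function names are hypothetical, and it assumes that the accuracy requirement is interpreted as the standard deviation σ of normally distributed inspection results.

```python
from statistics import NormalDist


def accuracy_requirement(lower, upper, important=False):
    """Accuracy requirement (+/- %) for a decision interval:
    interval / 2 for a typical decision (+/- 1 sigma spans the interval),
    interval / 4 for an important decision (+/- 2 sigma spans the interval)."""
    interval = upper - lower
    return interval / 4 if important else interval / 2


def prob_correct_at_midpoint(lower, upper, sigma):
    """Probability of a correct decision when the actual condition sits
    at the midpoint of the interval and results have std. dev. sigma."""
    midpoint = (lower + upper) / 2
    dist = NormalDist(mu=midpoint, sigma=sigma)
    return dist.cdf(upper) - dist.cdf(lower)


# Decision boundaries of 10% and 20%, as in Figure 8.
typical = accuracy_requirement(10, 20)
important = accuracy_requirement(10, 20, important=True)
p_typical = prob_correct_at_midpoint(10, 20, typical)
p_important = prob_correct_at_midpoint(10, 20, important)

print(f"Typical:   +/- {typical}%  -> {p_typical:.0%} correct at midpoint")
print(f"Important: +/- {important}% -> {p_important:.0%} correct at midpoint")
```

For boundaries of 10% and 20% this gives +/- 5% for a typical decision and +/- 2.5% for an important decision, with about 68% and 95% of results, respectively, falling inside the interval when the actual condition is 15%.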

For example, assume that a bridge deck with less than 10% damage in CS 3 will have a preservation activity, a deck with 10-20% damage will have a maintenance activity, and a deck with greater than 20% damage will have a repair activity. In this case, the decision interval is 10%, and the mean of the interval is 15%. For a typical situation the accuracy requirement is +/- 5%, such that for a deck with an actual condition of 15% damage, the probability that a given inspection result falls within the decision interval is 68%. If the actual condition were greater than or less than 15%, then the probability of a correct decision would be reduced, to a minimum of 50-50 at the boundaries of the interval. For preservation and maintenance activities, this may be an acceptable probability because the consequence of an incorrect decision is relatively low. If the decision is "important" such that a 95% probability of a correct decision is desirable, then an accuracy requirement of +/- 2.5% could be used.

Figure 8. Model for accuracy requirements showing normal distributions.

This approach has some limitations. First, it assumes that the actual condition of the deck corresponds with the average value determined from many inspections. Given the subjective nature of the CS ratings and the historical reliance of the decision boundaries on visual inspection results and experience, this may be an acceptable assumption. Second, data on the actual variation in inspection results are necessary to determine if the accuracy requirement is achievable. In other words, the accuracy requirement can be set at any chosen value, but the accuracy of the inspection process is a characteristic of the inspection process itself, and is unaffected by the requirement. Without measuring the variation in the inspection result, there is no way to determine if the accuracy requirement is achieved (or achievable).
Furthermore, decision boundaries may need to be different to achieve the desired result when the variation in the inspection result is considered, as described above. Third, this simple approach does not consider the impact of the accuracy requirement on deterioration modeling. Therefore, this simple initial model was further developed by analyzing the impact of different accuracy requirements on deterioration modeling to develop recommended accuracy requirements for the guidelines, as described in the following section.

3.2.3 Effect of Accuracy Requirements on Deterioration Modeling

For system-level decision-making and deterioration modeling, the accuracy of the inspection results will affect the time period for a given element to develop damage that is sufficient to exceed a threshold established for decision-making. The most commonly used deterioration model for bridge management is the Markov chain model, a state-based stochastic process model that estimates future deterioration based on the probability of transition from one state to another. The Markov model was used historically in the Pontis bridge management system, and provides some operational capabilities of the new AASHTOWare Bridge Management (AASHTO BrM). During the execution of the research, technical data on the AASHTO BrM modeling process was still emerging. To provide a means for analyzing the effect of accuracy requirements on deterioration modeling, the research team developed a Markov chain model that matched the performance characteristics of the Pontis model, based on previously published information on the Pontis software and available literature. The output from this model was subsequently verified using the AASHTO BrM software to confirm that the research team's model produced the same output as the AASHTO BrM software. The impact of inspection accuracy requirements was evaluated by analyzing the effect of different accuracy requirements, i.e., different parameters of variation (σ) for input data, on the output of the idealized deterioration model. The output was assessed by comparing the quantity predicted in CS 3 with an assumed threshold value. The assumed threshold value represents a decision-making boundary. For example, assume that a given inspection process has an accuracy requirement of +/- 10%. Figure 9 shows Markov results for an element with two different initial condition vectors; Figure 9A shows the deterioration projected for an element with an input vector of [80, 20, 0, 0], that is, the element is 80% in CS 1, 20% in CS 2, and 0% in CS 3 and CS 4. Assuming that there is a decision-making threshold of 15% of the element in CS 3, the time interval for the element to deteriorate such that 15% of the element is in CS 3 is approximately 21 years. In contrast, assume that the same element has a condition vector of [80, 10, 10, 0], which meets an accuracy requirement of +/- 10% and represents the 10% error in the inspection result.
In this case, the time interval until the decision threshold is exceeded is only 15 yrs. In this way, the impact of the accuracy requirement is determined in terms of the number of years until a given element deteriorates such that a decision threshold is exceeded.

Figure 9. Markov results for an element with two different condition state vectors.

The effect of different accuracy requirements (i.e., tolerances) can be measured in this way using the Markov chain model. Analysis was completed on the effect of accuracy requirements by modifying the condition vectors to simulate different accuracy level requirements and determining the impact in terms of the number of years until the decision threshold is exceeded. This was done for accuracy requirements of +/- 2.5, 5, 10, 15, 20, and 25%. Different decision-making threshold values were also analyzed, including 10%, 20%, and 30% thresholds. The analysis was completed for an element with a "rapid" rate of deterioration, in which 50% of the element is expected to transition to the next lower CS over the course of 10 yrs. This results in a transition probability of 6.7% to the lower CS, or a 93.3% probability of staying in a CS, in a given year. The analysis was also completed for an element with a "slow" deterioration rate, in which the same transition is expected over 30 yrs, which results in a transition probability of 2.3% to the lower CS, or a 97.7% probability of staying in a CS, in a given year. These analyses were completed to provide some range of rational values for an inspection accuracy requirement, and to illustrate the trends in these data. The results of the analysis were evaluated to consider the effect of different tolerances on longer-term modeling. The following figures illustrate the effect of the different accuracy values on the time period before a given decision threshold is exceeded. Figure 10 shows results based on an element with a "slow" rate of deterioration and Figure 11 shows results based on a "fast" rate of deterioration. Figure 10 shows the effect of using a 2.5% tolerance for a 30% decision threshold, where the time interval until the decision threshold is crossed is shown. As shown in this figure, assuming the actual condition of the element at the time of the inspection is 15% in CS 3, and the accuracy of the inspection is +/- 2.5%, the range of time for the decision threshold to be exceeded is 24 to 27 yrs. Were the inspection result accurate, then the time of the decision would be 26 yrs. This may be an acceptably low error in the predicted time for decision, given the long time interval.

Figure 10. Markov model for 30% decision threshold A) -2.5% accuracy level, B) +2.5% accuracy level for a slow rate of deterioration.

The results from this analysis are illustrated in Figure 11 for an accuracy level of +/- 2.5% and a threshold value of 30% for the "fast" deterioration case. In this case, the time range for the decision threshold to be exceeded is between eight and ten years.

Figure 11. Markov model for 30% decision threshold A) -2.5% accuracy level, B) +2.5% accuracy level for a fast rate of deterioration.

Table 2 shows the results of the analysis for different threshold values and accuracy levels considered for an element with slow rate and fast rate of deterioration.
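The Markov chain analysis described in this section can be sketched in a few lines. This is a simplified illustration and not the research team's model: it assumes a single annual stay probability shared by CS 1 through CS 3, derived from the stated 50% transition time, and it compares the CS 3 quantity alone with the threshold. All function names are hypothetical, and the sketch is not expected to reproduce the years reported in the figures or in Table 2.

```python
def stay_probability(median_years):
    """Annual probability of remaining in a CS, given that 50% of the
    element is expected to leave that CS within `median_years`."""
    return 0.5 ** (1.0 / median_years)


def step(vector, stay):
    """Advance a [CS1, CS2, CS3, CS4] condition vector (in %) by one year.
    Each year a fraction (1 - stay) of CS 1-3 moves to the next lower CS."""
    move = 1.0 - stay
    cs1, cs2, cs3, cs4 = vector
    return [
        cs1 * stay,
        cs2 * stay + cs1 * move,
        cs3 * stay + cs2 * move,
        cs4 + cs3 * move,
    ]


def years_to_threshold(vector, stay, threshold, max_years=200):
    """Years until the CS 3 quantity first reaches the threshold,
    or None if that does not happen within max_years."""
    for year in range(max_years + 1):
        if vector[2] >= threshold:
            return year
        vector = step(vector, stay)
    return None


rapid = stay_probability(10)   # stated "rapid" case: 50% in 10 yrs
slow = stay_probability(30)    # stated "slow" case: 50% in 30 yrs
print(f"rapid: {1 - rapid:.1%} transition per year")
print(f"slow:  {1 - slow:.1%} transition per year")

# Two condition vectors differing by a 10% shift between CS 2 and CS 3.
years_a = years_to_threshold([80, 20, 0, 0], rapid, threshold=15)
years_b = years_to_threshold([80, 10, 10, 0], rapid, threshold=15)
print(years_a, years_b)
```

The stay probabilities reproduce the 93.3% and 97.7% values quoted in the text. The two example vectors show the qualitative effect discussed above: shifting quantity from CS 2 to CS 3 in the initial vector shortens the time until the decision threshold is reached.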
For example, for a threshold value of 10%, and an accuracy tolerance of +/- 2.5%, the threshold value would be exceeded after 10 years if the inspection error were +2.5%, and the threshold value would be exceeded after 17 years if the inspection error were -2.5%. If the inspection were accurate, such that the inspection indicates the true value of 5% in CS 3, then the threshold value of 10% would be exceeded after 14 years.

Table 2. Result of Markov model for different decision thresholds and accuracy levels for slow and fast rate deterioration.

Slow Rate Deterioration
Decision    Accurate       Accuracy     Interval   Interval   Interval
Threshold   Result (yrs)   Tolerance    Min (yr)   Max (yr)   Range (yr)
10%         14             +/-2.5%      10         17         7
10%         14             +/-5%        0          19         19
20%         20             +/-2.5%      18         22         4
20%         20             +/-5%        15         24         9
20%         20             +/-10%       0          28         28
30%         26             +/-2.5%      24         27         3
30%         26             +/-5%        22         29         7
30%         26             +/-10%       16         32         16
30%         26             +/-15%       0          35         35

Fast Rate Deterioration
Decision    Accurate       Accuracy     Interval   Interval   Interval
Threshold   Result (yrs)   Tolerance    Min (yr)   Max (yr)   Range (yr)
10%         5              +/-2.5%      4          6          2
10%         5              +/-5%        0          7          7
20%         7              +/-2.5%      6          8          2
20%         7              +/-5%        5          8          3
20%         7              +/-10%       0          10         10
30%         9              +/-2.5%      8          10         2
30%         9              +/-5%        7          10         3
30%         9              +/-10%       6          11         5
30%         9              +/-15%       0          12         12

To normalize these data and provide a rational assessment of the inspection accuracy requirements for element-level inspection, a "Quality Ratio" (QR) was calculated for each accuracy level for each threshold value. The QR is a measure of the quality of the data resulting from a given accuracy requirement for the inspection, based on the outcome of the deterioration prediction. In simple terms, the QR indicates if the magnitude of the range of predicted values resulting from a given accuracy requirement is larger or smaller than the prediction stemming from a perfectly accurate inspection result. As such, if the QR value is greater than one (i.e., 100%), the range of values resulting from the error is greater than the prediction from an accurate inspection. Table 3 indicates the QR values resulting from the sensitivity study for the fast and slow deterioration rates considered. As shown in the table, there is a clear trend that the quality of the prediction from the Markov model deteriorates as the accuracy level tolerance increases. When the QR is greater than 100%, the quality of the prediction is low because the range of resulting values is actually greater than the predicted decision time from an accurate inspection.
Therefore, QR values greater than 100% were subjectively identified as undesirable.
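The report does not print a formula for the QR, but the tabulated values in Tables 2 and 3 are consistent with dividing the range of predicted decision times by the decision time from an accurate inspection. The sketch below applies that inferred relationship to the slow-rate rows of Table 2; the function name and the row layout are assumptions made for illustration.

```python
def quality_ratio(min_years, max_years, accurate_years):
    """Range of predicted decision times divided by the decision time
    predicted from a perfectly accurate inspection result."""
    return (max_years - min_years) / accurate_years


# Slow-rate rows from Table 2:
# (decision threshold %, tolerance +/- %, min yr, max yr, accurate yr)
slow_rows = [
    (10, 2.5, 10, 17, 14),
    (10, 5.0, 0, 19, 14),
    (20, 2.5, 18, 22, 20),
    (20, 5.0, 15, 24, 20),
    (20, 10.0, 0, 28, 20),
    (30, 2.5, 24, 27, 26),
    (30, 5.0, 22, 29, 26),
    (30, 10.0, 16, 32, 26),
    (30, 15.0, 0, 35, 26),
]

for threshold, tolerance, lo, hi, accurate in slow_rows:
    qr = quality_ratio(lo, hi, accurate)
    flag = "undesirable" if qr > 1.0 else "acceptable"
    print(f"{threshold}% threshold, +/-{tolerance}%: QR = {qr:.0%} ({flag})")
```

Rows with a QR above 100% are the ones excluded from the recommended accuracy requirements.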

Table 3. Quality ratio for different threshold values and accuracy levels.

Decision          Accuracy     Quality Ratio   Quality Ratio
Threshold Value   Tolerance    (Slow)          (Fast)
10%               +/-2.5%      50%             40%
10%               +/-5%        136%            140%
20%               +/-2.5%      20%             29%
20%               +/-5%        45%             43%
20%               +/-10%       140%            143%
30%               +/-2.5%      12%             22%
30%               +/-5%        27%             33%
30%               +/-10%       62%             56%
30%               +/-15%       135%            133%

Based on this analysis of the QR, suitable inspection accuracy requirements to meet the needs for bridge deterioration modeling and future planning needs can be assigned as shown in Table 4. These data illustrate that increasing the accuracy tolerance results in diminished quality for the modeling results. As shown in the table, accuracy requirements depend on the decision threshold under consideration.

Table 4. Recommended accuracy requirements for element-level inspection.

Decision          Accuracy     Quality Ratio   Quality Ratio
Threshold Value   Tolerance    (Slow)          (Fast)
10%               +/-2.5%      50%             40%
20%               +/-2.5%      20%             29%
20%               +/-5%        45%             43%
30%               +/-2.5%      12%             22%
30%               +/-5%        27%             33%
30%               +/-10%       62%             56%

These results should be considered in the determination of the accuracy requirements for element-level inspection. These results were implemented in the guidelines within the section discussing the determination of accuracy requirements for element-level data collection. The outcome of the simple Markov chain model implemented for this analysis presents an idealized approach for estimating future deterioration patterns for the purpose of identifying suitable proposed accuracy requirements. Implementation of such a model for specific bridges within a bridge management program may provide different results that can be applied for evaluating a given accuracy requirement in practice.

3.2.4 Status of the Guidelines for Improving Element-Level Bridge Inspection Data

During the course of the research, the visual guide was implemented in a revised version of the MBEI. As part of that implementation, the MBEI was reorganized based on the element material such as reinforced concrete, steel, etc.
The defect elements were listed once for each material, and the visual standards (photographs) developed through the research were inserted with the appropriate materials and defect elements. A section containing the spatial estimating diagrams was included in the revised MBEI. The element listing in the MBEI was also reorganized to list elements according to the material from which the element is formed, and a section containing the element commentary was added. The revisions to the MBEI reduced the number of pages in the manual by more than 50%. The revised version of the MBEI was submitted for consideration by AASHTO and approved in 2018. The second part of the guideline included the suggested accuracy requirements described herein, as well as descriptions of methods for improving the quality of element-level data, primarily through field exercises and inspector calibration. This portion of the guideline is included in Appendix B.

3.3 Field Inspection Exercises

This portion of the report presents a summary of the results from the field exercises. The complete field exercise plan is included in Appendix D. The detailed results and analysis are included as Appendix E. The first objective of the field exercises was to compare the quality of element-level data when using the visual guide developed through the research with traditional approaches for element-level inspection. This included analyzing the accuracy of inspection in terms of spatial estimates and assignment of CSs. This also included comparison of these data between different groups of inspectors, within groups of inspectors, and with a control inspection. A second objective of the field trials was to capture data on the impact of making changes to the MBEI. The changes included changing the unit of measure for certain elements to improve the quality of data. This included using ft instead of each (ea) for columns and using ft instead of sq ft for steel bridge coatings.
In some cases these changes were evaluated through tasks conducted at the S-BRITE Center and in other cases the changes were assessed through the field trials. The third objective was to evaluate the quality of element-level inspection data. The field exercises provided data on the quality of element-level data overall. These data are needed for rational assessment of tolerances (i.e., accuracy requirements) for inspection results used for preservation, maintenance, and repair decision-making. These data illustrate the quality of element-level inspection and the distribution of quantity estimates. Inspectors were divided into two groups. Test Group A (TGA) used the visual guide throughout the field exercises, and Test Group B (TGB) did not use the visual guide, conducting the inspection according to their normal, routine inspection process. To evaluate the variability of data between inspectors and inspector groups, two measures were used. These measures focused on the quantity of damage assigned to different CSs. The mean value was calculated as the average value from all of the responses in the particular group that assigned a given CS. The sample standard deviation, σ, was used to characterize the variation between individual inspectors in the group. The σ value provides a statistical estimate of the variation in the results, and +/- 1σ from the mean represents 68% of the responses, based on the assumed normal distribution. Lower σ values denote less scatter in the results. The Coefficient of Variation (COV) was used to characterize the variation in results, on a percentage basis, normalized to the mean value calculated for a group of inspection results. The COV is calculated by dividing σ by the mean value. For example, if the mean value is 3% and σ is 2%, then the COV is 67%. If the mean is 50% and σ is 2%, then the COV is only 4%. These data are useful for determining the magnitude of variation in the inspection results relative to the mean.
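The two variability measures can be computed directly with the Python standard library. The following is a small sketch; the inspector estimates in `estimates` are hypothetical values invented for illustration and are not data from the field exercises.

```python
from statistics import mean, stdev


def coefficient_of_variation(mean_value, sigma):
    """COV: standard deviation divided by the mean value."""
    return sigma / mean_value


def group_statistics(values):
    """Mean, sample standard deviation, and COV for one group of
    inspector estimates (values in % of the element in a given CS)."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return m, s, coefficient_of_variation(m, s)


# Worked examples from the text.
print(f"{coefficient_of_variation(3.0, 2.0):.0%}")   # mean 3%, sigma 2%
print(f"{coefficient_of_variation(50.0, 2.0):.0%}")  # mean 50%, sigma 2%

# Hypothetical estimates from five inspectors (not report data).
estimates = [6.0, 8.0, 10.0, 12.0, 14.0]
m, s, cov = group_statistics(estimates)
print(f"mean = {m:.1f}%, sigma = {s:.2f}%, COV = {cov:.0%}")
```

The same absolute scatter (σ = 2%) produces a very different COV depending on the mean, which is why both σ and COV are reported.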
In some cases, the average of the COV values is presented to illustrate the general magnitude of results. For certain tasks where the actual value of the quantity being estimated was known, the error was calculated by subtracting the actual value from the mean value. The error was analyzed by determining the average normalized error according to the equation:

$$\text{Normalized error} = \frac{\left| \bar{E} - A \right|}{A} \times 100\% \qquad (1)$$

where $A$ is the actual value and $\bar{E}$ is the average of the estimates provided by inspectors. In this way the difference between the estimated value (in %) and the actual value is presented as a fraction (%) of the actual value, which adjusts the data considering its magnitude. For example, if the actual value was 12%, and the average (mean) value from both test groups combined was 8%, the error was calculated as -4% and the normalized error was calculated as 4/12 = 33%. Analysis was typically conducted separately for TGA and TGB, and for the combined group (TGA + TGB). Generally, analysis was completed by comparing mean values and dispersion among the inspectors that participated in the field exercises. When analyzing the combined group, statistics were calculated for all of the inspectors participating as a single population. In this way, the number of samples was increased as compared with treating TGA and TGB separately. In Indiana, 14 inspectors participated, and TGA and TGB each had a population of seven while the combined group had a population of 14. In Michigan, ten inspectors participated such that TGA and TGB had populations of five each, and the combined group had a population of ten. It should be noted that the number of samples, as compared with the variability of the data overall, was limited for a statistical analysis.

3.3.1 Indiana Field Exercises

This portion of the summary provides results from the field exercises completed in Indiana. The field exercises in Indiana included several tasks completed at the S-BRITE Center at Purdue University and a routine inspection of twin steel girder bridges located on I-65 in the West Lafayette area. Complete details of the test bridges and the test protocols completed during the exercises are included in Appendix D. The following section provides a summary of results for the testing at the S-BRITE Center and at the two steel girder bridges.
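The normalized error defined earlier in this section can be illustrated with the worked example from the text (actual value 12%, mean estimate 8%). This is a sketch only: the function name is hypothetical, the individual estimates are invented so that their mean is 8%, and the use of the absolute difference is inferred from the worked example.

```python
def normalized_error(actual, estimates):
    """Absolute difference between the mean estimate and the actual
    value, expressed as a fraction of the actual value."""
    mean_estimate = sum(estimates) / len(estimates)
    return abs(mean_estimate - actual) / actual


# Hypothetical estimates (in %) whose mean is 8%; the actual value is 12%.
estimates = [6.0, 8.0, 10.0]
error = sum(estimates) / len(estimates) - 12.0
print(f"error = {error:+.0f}%")
print(f"normalized error = {normalized_error(12.0, estimates):.0%}")
```

Normalizing by the actual value allows errors on small and large damage quantities to be compared on the same scale.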
A summary of the bridge elements examined during the exercise is shown in Table 5. Complete, detailed results and analysis can be found in Appendix E. The raw inspection data provided by inspectors can be found in Appendix F. In Appendix F, a commonly used format of bar graphs depicting the proportion of the element assigned to different CSs is shown.

Table 5. Bridge inspection exercise summary for Indiana

Number of Inspectors for Indiana Field Exercise
TGA (used the newly developed visual guide)   7
TGB (routine inspection practice)             7

Indiana Bridge Characteristics
No. of Bridges   Bridge Type
2                Steel continuous girder

Elements included in the inspection exercise
Element No.   Element Name
107           Open Steel Beam/Girder
515           Steel Protective Coating
12            Reinforced Concrete Deck
510           Wearing Surface
302           Compression Joint Seal
210           RC Pier Wall
215           RC Abutment
313           Fixed Bearing
311           Movable Bearing

3.3.1.1 S-BRITE Testing

This portion of the summary provides data from the inspection tasks completed at the S-BRITE Center. The objectives of the field exercises at the S-BRITE Center were to evaluate the fundamental capabilities of inspectors for estimating quantities of damage and to evaluate if the use of the visual guide improved the quality of inspection results. Five of six tasks completed at the S-BRITE Center were focused on measuring the capabilities of inspectors to estimate the quantity of simulated damage. Task 1 included a page test, in which inspectors were asked to estimate the area of damage depicted by irregular shapes on a standard 8.5 x 11 sheet of paper. Area estimates were also made for simulated damage on the web of two steel plate girders. The quantity estimates were assessed in units of area (sq ft) in Tasks 2 and 3, and using units of length (ft) in Tasks 4 and 5. The methods used by the inspector to make the quantity estimate were also examined. This included tasks to make the quantity estimate by estimating the percent of the element with damage and multiplying by the total quantity for the element (Tasks 2 and 4), and by tallying individual areas (or lengths) of damage (Tasks 3 and 5). Photographs from the tasks conducted at the S-BRITE Center are shown in Figure 12. Task 6 at the S-BRITE Center consisted of the routine inspection of a single-span decommissioned truss section that was removed from service and constructed at the S-BRITE Center. Elements assessed during this task included steel truss elements, the protective coating, and the gusset plates connecting truss members. This section of the report summarizes key findings from these S-BRITE Tasks 1-6. The results of the page test indicated that there was slightly less error in the mean results from TGA as compared with TGB. The normalized error was 18% for TGA and 20% for TGB.
The data indicated that the variation in the results was similar between the two groups, with the average of the COV values for TGA and TGB being 37 and 33%, respectively. S-BRITE Tasks 2 and 3 evaluated the capabilities of an inspector to estimate the areas of damage using simulated damage on the webs of plate girders. In Task 2, the inspectors were asked to estimate the area of damage based on the percentage of the area. In Task 3, inspectors were asked to tally and sum the individual areas. Figure 13 shows example data from Task 2 and Task 3, showing the area estimates provided by inspectors. As shown in the figure, there was scatter between the estimates provided by the participants. In Task 2, the normalized error was 25% for TGA and only 11% for TGB. The variability of the data as determined from the average COV values was similar between the two test groups. The COV values were 57% for TGA and 50% for TGB. It was also observed that the σ values increased as the area of damage increased, meaning the magnitude of the scatter increases with increasing areas of damage. These data did not indicate that there was an improvement in accuracy, or a reduction in variability, associated with the use of the visual guide.

Figure 12. Inspectors conducting an assessment of simulated damage on a plate girder (A) and inspection of a truss element (B) at the S-BRITE Center.

In Task 3, area estimates were made by making diagrams of the damage and tallying the individual areas. Surprisingly, it was found that the normalized error was greater for both TGA and TGB when using tallying, as compared with simply making a percentage estimate. When the data were analyzed for all of the participants as a single group, the average normalized error was found to be 16% for Task 2 and 41% for Task 3. These data indicated that there was not an increase in the accuracy when using tallying as compared with simply estimating the area using a percentage. Some portion of the reduction in accuracy may be attributed to "squaring off" of irregularly shaped areas when tallying is used. It was also shown that for both TGA and TGB, there was a reduction in the variation in results when tallying was used as compared with making the quantity estimate based on percentage, based on the averaged COV values. These data can be interpreted in the following way: In S-BRITE Tasks 2 and 3, estimating the area by percentage was more accurate than estimating the area by tally, when examining the mean values. However, there was less variation between individual inspectors when tallying areas, as compared with estimating percentage. In other words, there was more consistency between inspectors when tallying areas as compared with estimating as a percentage. It was also found that tallying areas took approximately four times as long as estimating area by a percentage. Estimating areas by percentage required approximately 5 minutes, while tallying individual areas required almost 20 minutes.

Figure 13. Results from S-BRITE Task 2 (A) and Task 3 (B) showing area estimates provided by TGA and TGB.

In S-BRITE Tasks 4 and 5, damage estimates were made using units of length (ft). In Task 4, inspectors estimated the length of damage as a percentage of the total length. In Task 5, inspectors tallied individual lengths. The results of Task 4 indicated that the normalized error was greater for TGA as compared with TGB. The normalized error for TGA was 29%, while the normalized error for TGB was only 7.7%, indicating that TGB had improved accuracy as compared with TGA. The results of Task 5, during which tallying was used, showed there was a measurable increase in accuracy for both TGA and TGB, and the average COV values were reduced. These data indicated that tallying length provided an increase in accuracy, and reduction in variability, as compared with estimating the length as a percentage. Examining the combined group of inspectors, the normalized error was 17% when using percentage estimates. This value was very similar to the result for the combined group when estimating area by percentage in Task 2 (16%). The best accuracy was found when tallying lengths, as measured by the normalized error for the combined group of 8%. It was found that there was a negligible difference in the time required to tally length as compared with making a percentage estimate. S-BRITE Task 6 consisted of a routine element-level inspection of a single-span truss bridge constructed at the S-BRITE Center, shown in Figure 12. Inspectors assessed the quantity of corrosion damage in the truss, and Figure 14 shows the individual inspector results from this task. These data illustrate that the quantities of damage assigned to CS 2 and CS 3 vary significantly in both groups TGA and TGB, but there is agreement in the total amount of damage in the truss (CS 2 + CS 3). The variability in assigning CS 2 and CS 3, as measured by the COV value for the combined group of inspectors, was 61 and 82%, respectively.
When considering the total amount of damage, i.e., CS 2 + CS 3, the variability was reduced to only 15%. These data indicated that the amount of damage indicated by the inspectors was consistent, but the assignment of CS 2 or CS 3 was not.

Figure 14. Inspector results for Defect 1000 Corrosion damage in a steel truss element.

The protective coating on the truss section was assessed using units of area (sq ft) by TGA and by units of length (ft) by TGB. It was found that the quantity assigned to damage in CS 2, CS 3, and CS 4 was increased when units of ft were used as compared with units of sq ft. It was also found that when units of ft were used, only CSs 3 and 4 were assigned by most inspectors, but when sq ft were used, CSs 2, 3, and 4 were assigned, indicating that there was increased precision when area (sq ft) was used. Only 2/7 inspectors assigned any damage to CS 2 when units of ft were used. In contrast, 6/6 inspectors assigned CS 2 when units of sq ft were used. The overall time used to complete Task 6 was measured to determine if using different units of measure for steel coating assessment would result in a significantly different amount of time required to complete the inspection. It was found that the mean time for TGA was 32 minutes. The mean time for TGB was 26.3 minutes. These data indicate that it did not take significantly more time to rate the coating system in sq ft. In fact, the group that rated the coating in sq ft (TGB) actually took less time on average than TGA, which rated the coating in ft. This may be explained by the fact that TGA was unfamiliar with the use of the visual guide. The gusset plates in the truss section were also assessed. Inspectors were asked to rate the 72 gusset plates included in the truss section. The gusset plates were rated individually to provide additional data as compared with rating each joint. The control inspection rated 14 gusset plates in CS 4. Generally, inspectors in TGA and TGB rated between 2 and 5 gusset plates in CS 4; one member of TGA indicated that 22 gusset plates were in CS 4. In total, 14 inspectors provided a rating for this element, and five (about 1/3) of these inspectors did not indicate any gusset plate in CS 4. This result is important in illustrating that there was variation in the reporting of the need for structural review of damaged gusset plates. The assignment of CS 4 is a subjective assessment such that one inspector might consider that the level of damage requires structural review while another might assign CS 3. For this reason, data were analyzed by combining the quantities in CS 3 and CS 4 (CS 3 + CS 4). These data indicated that for the combined group, the mean value was 39% with a COV of 56%.
In practical terms, these data indicate that the mean number of gusset plates assigned either CS 3 or CS 4 was 28 plates (0.39 x 72), and typical values would range +/- 16 plates (i.e., +/- 1σ). Similar values were found when TGA and TGB were treated as separate groups. The range of values for the number of gusset plates in CS 3 + CS 4 was from 4 to 48 plates. These data are significant in illustrating that there was variation in assessing the severity of damage and assignment of CSs for gusset plates.

3.3.1.2 Indiana Task I1 and I2, Routine Inspection of Steel Girder Bridges

Indiana Tasks I1 and I2 consisted of the routine element-level inspection of twin, three-span continuous steel girder bridges carrying I-65. Figure 15 shows photographs of the inspectors conducting the inspection. The inspection included the assessment of the primary superstructure members, the deck of the bridge, and the substructure elements.

Figure 15. Inspectors conducting an assessment of test bridges I1 and I2.

Significant results from these tests included a high degree of variation in the assessment of corrosion damage for Element 107-Open Steel Girder/Beam. Figure 16 shows the raw inspection data from TGA and TGB. In this figure, the length estimate (ft) of damage is shown for CS 2 and CS 3, and for CS 2 + CS 3. From these data it is apparent that there is a high degree of variation between different inspectors assessing the steel girder element. Results for bridge I1 indicated a mean value of 32% with a COV of 94% for total damage reported (CS 2 + CS 3) for the combined group of inspectors. The combined results had a mean value of 53% and a COV of 74% for bridge I2. These values are indicative of the high variation found in the inspection results for this element.

Variation in the inspection results was also found for other elements of bridges I1 and I2. For Element 12-RC Deck, COV values as high as 217% were found for the combined group assignment of CS 2. Two outliers in the data were identified, in which very large damage estimates for both bridges were provided by the same inspector. With the outliers removed, the mean value for damage (i.e., CS 2 + CS 3) of the combined group dropped from 15 to 7.7% for bridge I1, and from 13 to 10% for bridge I2. The COV values dropped from 167 to 55% for bridge I1 and from 108 to 76% for I2. Generally, outliers of this magnitude were not found in the data from other elements. The results from the assessment of the deck element with the outliers removed are shown in Table 6. These data represent typical results from elements of bridges I1 and I2. COV values were typically found to be greater than 50%, indicating a variation in the inspection results for CS 2, CS 3, and the total damage represented by CS 2 + CS 3. However, in practical terms, σ values in some cases were less than 5% of the total area. For example, for the combined group result, CS 3 of bridge I1, the mean value was 4.0% and the σ was 3.8%.
Based on these data, most inspection results would report between ~0% and ~8% in CS 3.

It was also found that there was variation in the assignment of defects, with different inspectors assigning different defects to the same elements. For example, there were four different defects assigned to Element 302-Compression Joint Seal for bridge I1. Eleven inspectors identified defects in the joint; 1/11 identified Defect Element 2310 (leakage), 5/11 identified Defect Element 2330 (damage), 5/11 identified Defect Element 2350 (debris impaction), and 4/11 identified Defect Element 2360 (adjacent deck or header). For the deck of bridge I1, 8/11 identified Defect Element 1080 (spalling), 5/11 identified Defect Element 1120 (efflorescence), and 8/11 identified Defect Element 1130 (cracking). These data illustrate that there was variation in the assignment of the defect elements. Three inspectors did not identify the defect, providing only the CS.

Figure 16. Individual inspection results for Element 107-Open Steel Girder (ft) for bridges I1 and I2.

Table 6. Results for Element 12-RC Deck for bridges I1 and I2 with two outliers removed from TGB.

Condition    Control        TGA Results (%)       TGB Results (%)       Combined (%)
State (CS)   Quantity (%)   Mean   σ      COV     Mean   σ      COV     Mean   σ      COV
Bridge I1
CS2          7.0            5.3    4.5    85      3.7    1.92   52      4.7    3.7    79
CS3          0.13           4.9    4.5    92      3.1    3.0    96      4.0    3.8    94
CS2+CS3      7.1            9.5    4.8    51      5.5    2.3    42      7.7    4.3    55
Bridge I2
CS2          1.5            11     10     93      8.2    4.8    59      9.6    7.8    81
CS3          4.2            1.0    0.94   94      0.20   0.18   0.93    0.70   0.83   118
CS2+CS3      5.7            12     9.8    84      8.2    4.8    58      10     7.6    76

3.3.2 Michigan Field Exercises

The field exercises in Michigan consisted of a routine inspection of the superstructure and substructure elements of twin prestressed concrete girder bridges (M1 and M2). These bridges were in generally good condition, with only small amounts of damage to the superstructure and substructure elements. During the inspection of these bridges, inspectors provided assessments of the concrete columns in both units of ea and units of ft to provide a comparison between these two approaches. The Michigan field exercises also included the inspection of two concrete bridge decks of bridges with relatively low Average Daily Traffic (M3 and M4). Ten inspectors participated in the field exercises in Michigan, with five inspectors assigned to TGA and five assigned to TGB. A summary of the elements examined in the Michigan exercise is shown in Table 7.

Table 7. Bridge inspection exercise summary for Michigan.

Number of Inspectors for Michigan Field Exercise
TGA (used the newly developed visual guide)    5
TGB (routine inspection practice)              5

Bridge Characteristics
No. of Bridges   Bridge Type
2                Prestressed girder
2                Steel girder (bridge deck inspection only)

Elements included in the inspection exercise
Element No.   Element Name            Element No.   Element Name
109           PSC Open Girder/Beam    310           Elastomeric Bearing
205           RC Column               12            RC Deck
215           RC Abutment             300           Strip Seal
234           RC Pier Cap             301           Pourable Joint Seal
313           Fixed Bearing

3.3.2.1 Routine Inspection of Twin Prestressed Girder Bridges

Figure 17 shows photographs of inspectors conducting an assessment of bridges M1 and M2. The prestressed girders were in generally good condition, and there was agreement between TGA and TGB regarding the amount of damage when mean values were considered. For example, for bridge M1, the mean value of damage (CS 2 + CS 3) was found to be 2% for TGA and 3% for TGB. The COV values indicated that there was variation in the inspection results between inspectors. Typical COV values were found to be greater than 70% among assigned quantities for CS 2, CS 3, and the combined damage in CS 2 + CS 3. Figure 18 illustrates the variation in the inspection results found for the prestressed members of bridges M1 and M2. This figure shows the lengths reported by each inspector for CS 2, CS 3, and the total damage length (CS 2 + CS 3). Two inspectors from TGA and one inspector from TGB did not report any CS 3 in either bridge. The range of values assigned to CS 3 for bridge M1 was from 0 ft to 38 ft (5.0%). The range of values assigned to CS 3 for bridge M2 was from 0 ft to 33 ft (4.4%). These data illustrate that although there was variation between different inspectors, all inspectors reported 5% or less for the damage in CS 3. It was notable that three inspectors did not report any quantity in CS 3.

Bridges M1 and M2 each had eight reinforced concrete columns that were assessed by the inspectors. These columns were assessed by two different units of measure in separate tasks. During the routine inspection, the inspectors reported the condition of the columns in traditional units of ea. After the inspection was completed, the inspectors were asked to go back and reassess the columns using units of ft. The purpose of this task was to evaluate how changing the unit of measure for columns would affect the inspection results.
The averaged values in Table 8 show that TGA and TGB were consistent in estimating that about 50% of the columns were damaged (CS 2 + CS 3) in bridge M1; there was less consistency for bridge M2, where the mean value was 53% for TGA and 30% for TGB. There was some agreement between TGA and TGB in assigning CS 3 for bridge M1; TGA assigned 42% and TGB assigned 34% of the columns to CS 3. The COV values for CS 3 were 17% and 18% for TGA and TGB, respectively. When examining the combined group, the COV value was only 19% for CS 3. These data indicated that there was limited variation in the inspection results for CS 3. Examining the combined CS 2 and CS 3, the COV value for the combined group was 35% for M1 and 74% for M2, illustrating that there was some dispersion in the data from the different inspectors assigning damage (i.e., CS 2 + CS 3) in the columns.

Figure 17. Inspectors conducting an assessment of twin prestressed girder bridges M1 and M2.

Table 8. Inspection results for Element 205-Reinforced Concrete Column for bridges M1 and M2 inspected using units of ea and reported as a percentage of the total quantity.

Condition    CI Qua.   TGA Results (%)       TGB Results (%)       Combined Result (%)
State (CS)   (%)       Mean   σ      COV     Mean   σ      COV     Mean   σ      COV
Bridge M1
CS2          75        88(1)  0      0       29     19     65      44     33     76
CS3          0         42     7.2    17      34     6.3    18      38     7.2    19
CS2+CS3      75        53     24     45      45     11     25      49     17     35
Bridge M2
CS2          38        63     45     72      30     14     48      42     31     74
CS3          0         25(1)  0      0       0      0      0       25     0      0
CS2+CS3      38        53     41     78      30     14     48      40     30     74

(1) Only a single inspector reported damage in this CS.

Figure 18. Inspection results for PS girder damage for bridges M1 (top) and M2 (bottom).

In a separate task, inspectors were asked to assess the columns using units of length (ft) instead of the traditional units of ea. Table 9 shows the analysis results from assessing the columns using units of ft. As shown in the table, there was agreement regarding the quantity of damage for each bridge. For example, for bridge M1, TGA estimated that 11% of the total column length was damaged, and TGB estimated 9.3%. The combined group estimated 10% with a COV of 33%. In other words, based on the normal distribution assumption, 68% of results could be expected to lie roughly between 7% and 13% of damage reported by inspectors. For bridge M2, TGA estimated 6.4% while TGB estimated 5.0% of total damage (CS 2 + CS 3). In this case, the combined group results showed a mean of 5.7% with a COV of 51%. It should be noted that only a limited number of inspectors reported CS 3 in bridge M2, with only 2 inspectors in TGA and a single inspector in TGB. For bridge M1, only 2/5 inspectors in TGA reported CS 3, while 4/5 members in TGB reported quantities in CS 3.

Table 9. Inspection results for Element 205-RC Column for bridges M1 and M2 using units of ft, reported as a percentage of the total quantity.

Condition    CI Qua.   TGA Results (%)       TGB Results (%)       Combined Result (%)
State (CS)   (%)       Mean   σ      COV     Mean   σ      COV     Mean   σ      COV
Bridge M1
CS2          11        8      3.7    46      6.8    2      30      7.5    3      40
CS3          0         8(1)   3.2    40      6.5    4.9    74      7      4.1    58
CS2+CS3      11        11     3.4    31      9.3    3.4    36      10     3.3    33
Bridge M2
CS2          9         6.3    3.9    62      5.4    0.6    10      5.8    2.6    45
CS3          0         3.4(1) 0      0       3.4(2) 0      0       3.4    0      0
CS2+CS3      9         6.4    4.1    65      5.0    1      20      5.7    2.9    51

(1) Result from only 2 inspectors.
(2) Result from a single inspector.

It was found that the quantity of damage reported was reduced significantly when units of ft were used as compared with units of ea, when considered on a percentage basis. For example, for bridge M1, the quantity of damage (CS 2 + CS 3) was reduced from 49% to 10%, and for bridge M2, the quantity of damage was reduced from 40% to 5.7%.
It was also found that the inspector assessment changed when units of ft were used. For example, two inspectors who assigned only CS 2 to columns when assessed by ea subsequently assigned CS 3 when using units of length (ft). Another inspector who had assigned the defect of cracking when using units of ea did not assign any quantity to cracking when using units of ft. These data illustrated that there was variation in the assignment of defect elements and condition states.

The time required to complete the assessment of the columns using units of ft was reported by the inspectors as part of this task. The average time that inspectors reported to complete the assessment of the columns using units of ft was 7 minutes per bridge. The routine inspection task, in which the inspectors completed the overall inspection of the superstructure and substructure, was completed in an average time of 37 minutes per bridge.

Other elements of bridges M1 and M2 that were assessed during the field exercise included abutments, pier caps, and bearings. The abutments and pier caps were in generally good condition, and the inspection groups consistently reported only small amounts of damage in CS 2 and CS 3. For the bearing elements, the fixed bearings were reported by the control inspection to be 100% in CS 2. The combined results from TGA and TGB indicated a mean value of 74% for bridge M1 and 78% for bridge M2 assigned to CS 2.

Element 310-Elastomeric Bearing had some reported damage in CS 3, and the inspection results are shown in Table 10. For bridge M1, the data indicated that there was good agreement regarding the total quantities of bearings with damage in either CS 2 or CS 3. TGA members estimated the damage as 79% of the bearings, and TGB as 84%. When the results from both groups were combined to form a single group of ten inspectors, the total quantity of damage was estimated to be 81% of the bearings with a COV of 28%. This was a relatively low COV value as compared with many other elements evaluated in the study. The inspection results for elastomeric bearings for bridge M2 showed less consistency between the groups, with TGA estimating the total quantity of damage (CS 2 + CS 3) at 58% while TGB estimated the total amount at 73%. The combined group estimate was 65% with a COV of almost 50%, indicating the variation in the results from inspectors.

Table 10. Inspection results for Element 310-Elastomeric Bearing for bridges M1 and M2 reported as a percentage of the total quantity.

Condition    CI Qua.   TGA Results (%)       TGB Results (%)       Combined Result (%)
State (CS)   (%)       Mean   σ      COV     Mean   σ      COV     Mean   σ      COV
Bridge M1
CS2          50        66     23     35      69     37     53      67     26     39
CS3          50        30     2.5    8.3     64     10     16      47     20     43
CS2+CS3      100       79     30     39      84     12     14      81     23     28
Bridge M2
CS2          0         50     40     81      76     30     39      60     37     62
CS3          100       20     7.6    37      45     29     64      35     25     72
CS2+CS3      100       58     38     66      73     27     37      65     32     49

There were two concrete bridge decks assessed (M3 and M4), as shown in Figure 19. The inspection results from these two decks are shown in Table 11. For deck M3, there was very little variation in the mean value of total quantities of damage recorded. Looking at the quantity of damage assigned in CS 2, the results were 3.7, 3.0, and 3.4% for the control, TGA, and TGB, respectively. The test groups identified some areas in CS 3, whereas the control did not.
The total amount of damage (CS 2 + CS 3) was 3.7, 3.8, and 3.3% for the control, TGA, and TGB, respectively. These mean results were very consistent; the mean combined team result was 3.6%. The σ for the combined group was 1.9%, which indicates that there was good agreement among all of the inspectors that there was a limited amount of damage on the bridge deck in terms of the percentage of deck damaged.

For bridge M4, there was again consistency in the mean values between the groups; TGA assessed 8.8% and TGB assessed 10% for damage (CS 2 + CS 3). However, there was significant variation among the individual inspectors, as illustrated by the high COV values. For example, the COV value when all inspectors were placed into the same population (combined results) was 60% (σ = 5.6%), meaning that the majority of assessments could range from ~4 to 15%. For CS 3, the COV values were greater than 100%, illustrating the large variation in the inspection results between inspectors.

Table 11. Inspection results for the decks of bridges M3 and M4.

Condition    CI Qua.   TGA Results (%)       TGB Results (%)       Combined Result (%)
State (CS)   (%)       Mean   σ      COV     Mean   σ      COV     Mean   σ      COV
Bridge M3
CS2          3.7       3.0    0.4    14      3.4    2.1    61      3.2    1.3    42
CS3          0         1.4    1.6    117     0.5    0.4    80      0.9    1      120
CS2+CS3      3.7       3.8    1.4    37      3.3    2.4    74      3.6    1.9    54
Bridge M4
CS2          11        6      2.7    45      11     6.5    60      8.1    5.1    63
CS3          0         3.6    4      113     1.6    1.1    64      2.6    2.9    112
CS2+CS3      11        8.8    4.3    48      10     7.2    72      9.3    5.6    60

Figure 19. Inspectors conducting an assessment of the deck elements of bridges M3 and M4.
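The "~4 to 15%" range quoted for bridge M4 can be reproduced as a one-sigma band around the combined mean. The sketch below is illustrative only: it takes the combined CS 2 + CS 3 mean and σ from Table 11 and assumes, as the report does elsewhere, that inspector estimates are roughly normally distributed. The variable names are not from the report.

```python
from statistics import NormalDist

# Combined-group values for bridge M4, CS 2 + CS 3, as read from Table 11.
mean_pct = 9.3   # mean damage, percent of deck area
sigma_pct = 5.6  # standard deviation, percent of deck area

# Assumption: inspector estimates are treated as normally distributed.
dist = NormalDist(mu=mean_pct, sigma=sigma_pct)

# One-sigma band around the mean.
low = mean_pct - sigma_pct
high = mean_pct + sigma_pct

# Fraction of a normal population that falls inside that band.
share_within = dist.cdf(high) - dist.cdf(low)

print(f"one-sigma band: {low:.1f}% to {high:.1f}%")
print(f"share of results inside the band: {share_within:.3f}")
```

Under these assumptions the band runs from about 3.7% to 14.9%, and roughly 68% of results fall inside it. A normal model also assigns some probability to negative damage percentages, which are not physically possible, so the band is best read as a rough guide.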

The inspection results were also analyzed considering the raw data provided by the ten inspectors who participated in the study, as shown in Figure 20. These data illustrate the significant variation in the amount of deck assigned to CS 3. For example, two inspectors did not assign any of the deck to CS 3. Among the inspectors that did assign CS 3, the range of values assigned was from 64 sq ft (<1%) to 1554 sq ft (almost 10%). If we discard the high (1554 sq ft) and low (one of the 0 sq ft), assuming these are outliers, the range is between 0 sq ft (0%) and 460 sq ft (~3%) in CS 3. Examining the total damage assigned (CS 2 + CS 3), the range of reported values was from 460 sq ft (~3%) to 3451 sq ft (21%). Again, if we assume that the high and low values are outliers, the range would be between 790 sq ft (~5%) and 2307 sq ft (14%) damage in the deck. These data indicate that there was significant variation between inspectors in reporting damage in the bridge deck, although the average values when considered as a group were more consistent. However, there was agreement among both groups that the area of damage in the deck was about 10%.

3.3.3 Discussion

This study provided unique and important data on the variability that is found in element-level inspection data. One of the primary objectives of the field trials was to provide data on the quality of element-level data overall by assessing the distribution of spatial estimates and the consistent assignment of CSs. It was found in the field exercises that the dispersion of quantity estimates among different inspectors was sometimes very high. The variation in quantity estimates was illustrated using the COV values to provide a normalized value for the magnitude of the standard deviation relative to the total mean damage quantity. This provided a measure of the variation in inspection results that could be compared between TGA and TGB.
Table 12 shows the average of the COV values for the primary elements in the study. The table includes the averaged COV values for CS 2, CS 3, and CS 2 + CS 3. The groupings of TGA, TGB, and the combined TGA + TGB are shown. These data include the primary superstructure, substructure, and deck elements that had CS assignments in CS 2 and CS 3 of sufficient quantity to provide a meaningful measure. These data do not include ancillary elements such as joint seals, bearings, or bridge railings. (Note: The linear average of the magnitude of COV values is shown for illustration, which is different statistically from the average COV value.)

Figure 20. Inspection results for RC deck M4 showing areas of damage in the deck.

It was found that overall, the average of the COV values was greater than 50%. In both the Indiana and Michigan field exercises, the averages of the COV values were very similar between TGA and TGB. In Indiana, the averages of the COV values for TGB were slightly lower than for TGA, indicating there was more consistency in the results from TGB. The significant findings shown in this table are twofold: first, the averages of the COV values were somewhat high for both TGA and TGB. Second, the use of the visual guides did not have an effect that was reflected through a reduction in the variation of the data, as measured by the COV values.

It can also be observed that the averages of the COV values were generally lower for Michigan as compared with Indiana. The fact that Michigan has been conducting element-level inspection for longer than Indiana may contribute to this difference. However, the two groups inspected different bridges which had different levels of damage, so it is not possible to determine how the increased experience of the MI inspectors may have affected the results. Qualitatively, the inspectors in Michigan were more likely to record defects during the field exercises as compared with inspectors participating in the field exercises in Indiana. This may be associated with the increased experience of inspectors from Michigan as compared with inspectors from Indiana.

Table 12. COV values determined for TGA and TGB for the field exercises.
               Average of the COV Values
Group          CS2    CS3    CS 2+3
IN - TGA       80     89     69
IN - TGB       70     79     59
MI - TGA       57     88     62
MI - TGB       74     62     67
IN Combined    82     96     69
MI Combined    74     83     70

It is also notable that the variation in the total damage, i.e., CS 2 + CS 3, was relatively high, with a minimum of ~60% as shown in Table 12. These data are significant because they indicate the variation in the reporting of total damage quantities, not just in differentiating between CS 2 and CS 3.

However, the COV values require some context to place the results in practical terms. The COV values reflect the ratio of σ to the mean value. If the mean damage quantity is small, then the variation in the inspection result is small when expressed as a percentage of the total quantity. For example, assume that the damage in a deck was an area equal to 10% of the bridge deck. If the COV were 50%, then the variation as expressed by the σ value is only 5% of the total quantity of deck. This may be an acceptable variation. However, if the damage in the deck were 50% of the total deck area, then the variation would be +/- 25% of the total deck area, which may not be acceptable. In other words, assuming the mean of the inspection result was equal to the quantity of 50% of the deck, the range of inspection results would be from 25% to 75% of the deck area.
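The deck example above can be written as a short calculation. This is a minimal sketch of the relationship σ = COV x mean, using the two hypothetical deck cases from the text (10% and 50% mean damage at a COV of 50%). The function names are illustrative and do not come from the report.

```python
def sigma_from_cov(mean_pct: float, cov_pct: float) -> float:
    """Standard deviation implied by a mean and a COV, both given in percent."""
    return mean_pct * cov_pct / 100.0


def one_sigma_band(mean_pct: float, cov_pct: float) -> tuple[float, float]:
    """Range of mean +/- one standard deviation, in percent of total quantity."""
    sigma = sigma_from_cov(mean_pct, cov_pct)
    return (mean_pct - sigma, mean_pct + sigma)


# The two cases discussed in the text: the same COV, different mean damage.
for mean_pct in (10.0, 50.0):
    sigma = sigma_from_cov(mean_pct, cov_pct=50.0)
    low, high = one_sigma_band(mean_pct, cov_pct=50.0)
    print(f"mean {mean_pct:.0f}%: sigma = {sigma:.0f}% of deck, "
          f"band = {low:.0f}% to {high:.0f}%")
```

The same COV therefore corresponds to a band 10 percentage points wide (5% to 15%) when mean damage is 10% of the deck, and 50 percentage points wide (25% to 75%) when mean damage is 50% of the deck.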

To further illustrate the test results, Figure 21 includes two graphs showing the measured σ values from the field exercises for CS 3 (Figure 21A) and for CS 2 + CS 3 (Figure 21B). The figures include a trend line for the combined data set from Indiana and Michigan. As shown in these figures, the σ values increase as the mean amount of damage increases. In this way, when the quantity of damage is small, the variation in inspection results is small, but when the quantity is large, the variation is also large. This is significant because it indicates that the more damaged a bridge element is, the lower the quality of the inspection data that will be obtained. From the perspective of decision-making and bridge management, the data suggest that as bridge deterioration increases to the point where decisions regarding maintenance and repair actions are required, the quality of inspection data decreases. This is the opposite of what would be most desirable from a bridge management perspective.

The results from MI and IN were also analyzed to determine how the assignment of CS 3 related to the quantity of an element in CS 3. Figure 22 shows the relationship between the number of inspectors reporting CS 3 and the mean value of the quantity assigned to CS 3. A trend line is drawn on the figure that shows the trend for the results with mean values less than 10%. These data illustrate that when the mean value in CS 3 is less than 10%, there is variation in the number of inspectors that report any quantity in CS 3. However, the trend of these data shows that as the amount of damage increases, the likelihood of an inspector reporting damage in CS 3 also increases. This illustrates that the assignment of CS 3 is not simply random, but in fact follows the trend that might be expected. When the amount of damage in CS 3 is small, some inspectors assign CS 3 and some do not.
But as the amount of damage increases, an increasing number of inspectors assign CS 3. The data with mean values above 10% are probably too sparse to assess effectively; more data are needed for situations where the quantity of damage in CS 3 is greater than 10%.

Figure 21. Standard deviation (σ) as a function of damage for CS 3 (A) and CS 2 + CS 3 (B).
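The trend of σ increasing with mean damage can be illustrated with a simple least-squares line. The sketch below is not the data set or the fitting method behind Figure 21, which are not given in this section. It uses only the combined-group CS 2 + CS 3 mean and σ pairs printed in Tables 6 and 8 through 11, and an ordinary least-squares fit is assumed for illustration.

```python
# (mean %, sigma %) pairs for the combined group, CS 2 + CS 3,
# as printed in Tables 6, 8, 9, 10, and 11 of this section.
points = [
    (7.7, 4.3),   # I1 deck (Table 6)
    (10.0, 7.6),  # I2 deck (Table 6)
    (49.0, 17.0), # M1 columns, units of ea (Table 8)
    (40.0, 30.0), # M2 columns, units of ea (Table 8)
    (10.0, 3.3),  # M1 columns, units of ft (Table 9)
    (5.7, 2.9),   # M2 columns, units of ft (Table 9)
    (81.0, 23.0), # M1 elastomeric bearings (Table 10)
    (65.0, 32.0), # M2 elastomeric bearings (Table 10)
    (3.6, 1.9),   # M3 deck (Table 11)
    (9.3, 5.6),   # M4 deck (Table 11)
]


def least_squares_line(pairs):
    """Return (slope, intercept) of the ordinary least-squares line y = slope*x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares_line(points)
print(f"sigma ~ {slope:.2f} * mean + {intercept:.2f}")
```

A positive slope for these tabulated values is consistent with the observation that the spread between inspectors grows as the mean amount of reported damage grows.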

One of the objectives of the field trial was to compare the use of the visual guide with the traditional inspection approach. The data were studied to determine if there was a measurable improvement in the accuracy of spatial estimates as a result of using the visual guide. In the studies conducted at the S-BRITE Center, there was no consistent pattern showing that use of the visual guide resulted in improved accuracy. Three tasks considered the ability to estimate a quantity with the use of the visual guide: the page test, estimating the area of simulated damage on the web of a plate girder, and estimating the length of the simulated damage on the web of a plate girder. TGA, which used the visual guide, had smaller normalized error in only one of the three tests, the page test. TGA also had a lower normalized error when completing the page test in Michigan. As such, of the four tests that examined the ability to estimate a quantity without also needing to consider the appropriate CS, TGA had a lower normalized error in two, and these were both page tests. Given that the page test was very similar in appearance to the images in the visual guide, this result is not very significant. For the S-BRITE tests examining the area estimates of simulated damage placed on a plate girder, the averaged COV values were 57% and 50% for TGA and TGB, respectively. These data indicate that for any given quantity being estimated, the variation of 1σ is approximately +/- 50% of the mean value of the estimate. Given this significant amount of variation in the results, it is unlikely that the influence of using the guide could be effectively detected. An additional explanation may be that more training and experience in using the spatial estimating guides is required than could be provided within the constraints of the field exercises.
The results from the page test suggest that the guide may be helpful, if those results can be transferred to full-scale elements.

In terms of the assignment of CS, it was found that the variability of the results combined with the relatively small data set did not allow for an effective evaluation of the effect of using the visual guide as compared with not using the visual guide. Qualitatively, several participants noted in the post-test questionnaire that the visual guide assisted them in making assessments. It may be that additional training and experience in utilizing the visual guide is necessary to realize its full benefits. Several inspectors noted that the guide was difficult to use due to its size, the need to look up defects, and the fact that it was not computer-based. These limitations are easily resolved; the guide can be printed smaller, and once inspectors are familiar with the content, finding defects would not be a limitation. Given the variation found in the assignment of defects and CSs in the study, a visual guide appears to be needed to improve the quality of element-level data.

Figure 22. Rate of detection for CS 3 as a function of the mean.

Part of the study evaluated the differences between making estimates based on a percentage as compared with tallying individual areas, and making estimates in area (sq ft) as compared to length (ft). Exercises at the S-BRITE Center that used simulated damage on the web of a plate girder were designed to examine these differences. For the estimation of area, it was found that the normalized error actually increased when tallying individual areas as compared with providing a percentage estimate of the areas. However, the variation in results from individual inspectors, as expressed by the COV, was found to be reduced when tallying areas as compared with estimating areas as a percentage. For the evaluation of length (i.e., units of ft), different results were found. The results indicated that tallying individual lengths provided both the lowest normalized error and the lowest variation between inspectors. For example, when examining all of the participating inspectors in a single group, the normalized error was about the same for area (sq ft) and length (ft) when percentage estimates were used, about 16%. When tallying individual areas (sq ft), the normalized error for the group was 41%, but when tallying individual lengths (ft), the normalized error was only 8%. It was also found that the quantity of damage recorded increased when units of length were used as compared with units of area.

Exercises completed at the S-BRITE Center also included analyzing the effect of changing the units of steel protective coatings (Element 515) from sq ft to ft. Data were analyzed considering the total amount of damage identified by inspectors, i.e., CS 2, 3, or 4. It was found that the total quantity of damage increased when using units of ft (98%) as compared with sq ft (72%).
The variation in results, as expressed by the COV value, was lower when using units of ft (4.2%) as compared with sq ft (33%). These data illustrated the reduced precision of using units of ft, where any portion of the truss defines the rating for that linear foot of truss panel, as compared to using units of sq ft. However, the consistency between different inspectors was increased when using units of ft (i.e., lower COV). This result may have been affected by the quantity of damage in the truss, since almost 100% of the truss was damaged when considered in this way. Therefore, there was a limitation on the overestimate that could be made in this exercise, because values greater than 100% were not possible.

A different field exercise was used to examine the effect of changing the unit of measure for RC columns from ea to ft. In this task, inspectors rated eight columns in each of two bridges (M1 and M2). Again, it was shown that there was more precision when using units of ft, which resulted in significantly less damage being reported when using units of ft as compared with units of ea. This would be expected, since each column is divided into 1 ft sections when using units of ft, whereas a single CS describes the entire column when using units of ea.

Overall, the results showed that using units of length (ft) as compared to area (sq ft) resulted in a higher quantity of damage reported, but reduced variation between inspectors. Using units of length (ft) as compared to ea resulted in decreased quantities of damage and decreased variation between inspectors.

It is noted that statistical analysis of the data was conducted after the field exercises. The F-test was used to evaluate the dispersion of TGA as compared to TGB. This analysis showed that there was not enough data to show a statistically significant result indicating that the use of the guide improved the quality of results from TGA as compared with TGB.
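As a rough illustration of how an F-test compares the dispersion of two groups, the sketch below computes the F statistic as the ratio of two sample variances. The report does not give the details of its analysis, so this is a generic variance-ratio calculation, and the two samples are hypothetical values invented for the example, not study data.

```python
from statistics import variance


def f_statistic(sample_a, sample_b):
    """Variance-ratio F statistic (larger variance over smaller).

    Returns (F, numerator degrees of freedom, denominator degrees of freedom).
    Assumes both samples have at least two values and nonzero variance.
    """
    var_a = variance(sample_a)  # sample variance, n - 1 in the denominator
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# HYPOTHETICAL damage estimates (percent of element quantity) for two
# groups of five inspectors. These numbers are NOT from the study.
group_a = [4.0, 9.0, 12.0, 6.0, 14.0]
group_b = [5.0, 8.0, 11.0, 7.0, 9.0]

f_value, df_num, df_den = f_statistic(group_a, group_b)
print(f"F = {f_value:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

The statistic would then be compared with a critical value of the F distribution for the stated degrees of freedom, taken from tables or a statistics library. With only a handful of inspectors per group, the critical value is large, which is one reason a small sample may not yield a statistically significant difference.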
Analysis to compare the mean from the combined group with the control inspection using the t-test showed that there was not enough data to show statistically significant similarity between these results. Given the variation found in the results, a larger sample would be needed to prove these hypotheses.

3.3.4 Conclusions from the Field Exercises

This section of the report includes the conclusions from the field exercises with regard to the primary objectives of the field trials:

1. Compare the use of the visual guide with the traditional inspection approach.

The field exercises did not show that there was a decrease in inspection variability when using the visual guide as compared with not using the visual guide. The variation in the data from the field exercises did not allow for recognizable trends regarding the assignment of the appropriate CSs. It was also found that inspector groups using the visual guide tended to require more time than those not using the visual guide. Based on feedback from the post-test questionnaire, more training and experience with the visual guide is needed to make the guide more effective for improving the quality of element-level data. Qualitatively, inspectors indicated that the guide was helpful and assisted in identifying the correct assignment of CSs. Inspectors also indicated that the guide was relatively easy to use, but could be improved if it was reformatted to be more suitable for field use.

Analysis: The results of the field exercises indicate that more training and experience with the use of a visual guide is needed to realize positive results. Given the short period of time that the inspectors had to work with the guide, it may have been unrealistic to expect a recognizable improvement. Analysis of the assignment of defect elements and CSs indicates that more training is needed in these areas. Given the variability found in this area, the need for the visual guide appears obvious; more training and experience is needed to realize the benefits.

2. Assess potential changes to the MBEI

The analysis of potential changes to the units of measure for elements such as protective coating and columns indicated the following: Overall, the results showed that using units of length (ft) as compared to area (sq ft) resulted in a higher quantity of damage reported, but reduced variation between inspectors. Using units of length (ft) as compared to ea resulted in decreased quantities of damage and decreased variation between inspectors.

3. Evaluate the quality of element-level inspection data

The results of the field exercises showed that there was variability in the damage quantities determined from element-level inspections. It was found that the variation was on the order of greater than 50% of the quantity being measured, based on statistical analysis of the data. It was found that the variation in the inspection data increased as the quantity of damage increased. It was also found that the likelihood of detecting CS 3, Poor, varied when quantities were less than 10% of the total element quantity, but trended toward increased detection as the quantity in CS 3 increased. There was insufficient data to assess the assignment of CS 3 when quantities were greater than 10%.

It was also found that there was variability in the assignment of CS 4 for gusset plate elements in the Indiana field exercises. It was found that about 1/3 of the inspectors did not report any gusset plates in CS 4, while about 2/3 of the inspectors did report gusset plate elements in CS 4.

The results from the field exercises also showed that there was inconsistency in the assignment of defect elements. Different inspectors tended to report different defect elements for the same bridge element. For example, some inspectors reported only cracking in concrete, some reported only delamination/spalling, and some inspectors reported both, all for the same element.

It was found from the survey that there was variability in the methods used for estimating quantities for element-level inspections, with some inspectors basing estimates on percentage and some using tallying. Results from tests on simulated damage indicated that for units of area, tallying did not improve the accuracy of results, but did reduce variability between estimates. When units of length were used, tallying resulted in improved accuracy and reduced variability.
Analysis: A statistical analysis of the data showed an increase in variability as the quantity of damage increased. The quantity of data is limited for such a statistical analysis, but this trend seems apparent and is consistent with what might be expected. The variation in the results may not be problematic when damage quantities are small. As damage quantities increase, the variation may be more problematic for decision-making and bridge management.
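One common way to express variation as a fraction of the quantity being measured is the coefficient of variation (sample standard deviation divided by the mean). The sketch below is illustrative only: it assumes this measure, which the report may or may not have used, and the function name and the inspector quantities are hypothetical values that are not taken from the field exercises.

```python
from statistics import mean, stdev


def coefficient_of_variation(quantities):
    """Return the sample standard deviation divided by the mean (unitless)."""
    avg = mean(quantities)
    if avg == 0:
        raise ValueError("mean quantity is zero; the ratio is undefined")
    return stdev(quantities) / avg


# Hypothetical CS 3 quantities (sq ft) reported by five inspectors for the
# same element. Illustrative numbers only, not data from the field exercises.
reported = [40.0, 100.0, 60.0, 160.0, 140.0]

cov = coefficient_of_variation(reported)
print(f"mean = {mean(reported):.1f} sq ft, COV = {cov:.0%}")
```

Repeating the calculation for elements with small and large mean damage quantities would show whether the spread between inspectors grows with the quantity, which is the trend described above.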


