3
Diagnostics

TESTING OF DOMESTICATED ANIMALS

Tests for Johne’s disease (JD) can be divided into two categories: those that detect the organism and those that assess the host response to infection. The first category includes fecal smear and acid-fast stain, culture, and polymerase chain reaction (PCR) tests. There are no tests of metabolic products or unique antigens of Map. The second category, detection of host response, includes clinical signs in combination with gross and microscopic pathology and immunologic markers of infection, which include antibody response to Map (serology), delayed-type hypersensitivity (DTH) reaction, lymphocyte proliferation, and increased cytokine (IFN-γ) production. Most of the development and evaluation of diagnostic tests has occurred in domesticated cattle and sheep. Despite considerable research effort, all methods are fraught with difficulties that have impeded the control and eradication of JD.

Diagnostic test interpretation and evaluation are important subjects that are discussed in greater detail in Appendix A. The best measure of diagnostic-test performance is predictive value, which is the ability of a test to accurately predict whether disease is present in a particular population. Rates of sensitivity (the ability of a test to correctly identify a diseased individual in a population) and specificity (the ability of a test to correctly identify a healthy individual in a



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.





population) are established by comparing how well a test correlates with an established “gold standard” for the condition. The gold standard for JD is the identification of the etiologic agent, Map, in tissues that show characteristic histopathologic lesions. A test’s sensitivity and specificity must be determined, along with the prevalence of the condition, in order to calculate the positive predictive value of a positive test and the negative predictive value of a negative test.

Diagnostic test performance depends on the stage of disease. JD in dairy cattle is clinically categorized into four stages (see Table 2–2). In Stage 1, animals are infected and asymptomatic, and no organisms are detected in feces. In Stage 2, animals are asymptomatic, but organisms can be detected in feces. Stage 3 animals are symptomatic, with weight loss and diarrhea. Stage 4 is advanced clinical disease: animals are lethargic and emaciated and have profuse diarrhea. Diagnostic tests generally perform better in individual animals in the later stages of the disease. (This may not be true for immunologic tests, because anergy of either cell-mediated or antibody responses to Map has been noted in animals with a heavy bacterial burden.) At the herd level, tests tend to perform better as the proportion of individuals in more advanced stages of disease increases. For control programs, it is important to make this distinction between test performance at the individual-animal level and test performance at the herd level.

One valuable screening test for control programs is the enzyme-linked immunosorbent assay (ELISA) for antibodies against Map. This test has relatively low sensitivity at the individual-animal level but fairly good sensitivity at the herd level. It also has significant advantages over fecal culture for screening, which is important in large-scale control programs.
These advantages include relatively low cost, simplicity, and rapid results (Tables 3–1, 3–2, 3–3).

Table 3–1. Detectability of Johne’s Disease at Varying Clinical Stages

                    Stage I   Stage II   Stages III, IV
Signs of disease    No        No         Yes
Fecal culture       No        Maybe      Yes
PCR                 No        Maybe      Yes
Acid-fast bacilli   No        Maybe      Yes
IFN-γ               Maybe     Yes        Maybe
Serology            No        Maybe      Yes

Notes: PCR: polymerase chain reaction. IFN-γ: gamma interferon.
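The predictive-value arithmetic described above combines a test’s sensitivity and specificity with the prevalence of infection. A minimal sketch in Python follows; the sensitivity and specificity are illustrative values, chosen to be roughly in the range reported later in this chapter for fecal culture, not figures from any cited study:

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule."""
    tp = sensitivity * prevalence               # true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)   # false-positive fraction
    tn = specificity * (1 - prevalence)         # true-negative fraction
    fn = (1 - sensitivity) * prevalence         # false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)

# Illustrative only: Se 0.50, Sp 0.99 at three within-herd prevalences
for prev in (0.01, 0.05, 0.30):
    ppv, npv = predictive_values(0.50, 0.99, prev)
    print(f"prevalence {prev:.0%}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

Even with high specificity, the positive predictive value falls sharply as prevalence declines, which is why test performance must always be judged against the population in which the test is used.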

Table 3–2. Comparison of Diagnostic Tests for Johne’s Disease

Test               Cost      Turn-Around Time  Sensitivity(a)  Specificity  Species(b)                            Clinical Stage
Fecal culture      Moderate  Months            Moderate        High         All                                   II, III, IV
PCR                High      Hours             Low             High         All                                   II, III, IV
Acid-fast bacilli  Low       Hours             Low             Moderate     All                                   III, IV
IFN-γ              High      Days              Moderate        Moderate     Bovine, ovine, caprine                II, III
Serology: ELISA    Low       Hours             Low–high        Moderate     Bovine, ovine, caprine, alpaca, deer  II, III, IV
Serology: AGID     Low       Days              Low–moderate    High         Bovine                                III, IV

Notes: PCR: polymerase chain reaction. IFN-γ: gamma interferon. ELISA: enzyme-linked immunosorbent assay. AGID: agar gel immunodiffusion.
a Sensitivity is highly variable with stage of disease.
b Payeur, 1998.

Table 3–3. Utility of Diagnostic Tests in Clinical Stages of Johne’s Disease

Test                           Stage I           Stage II                                             Stages III, IV
Pathologic lesions             Generally useful  Generally useful                                     Useful
Signs of disease               n/a               n/a                                                  Very useful
Fecal culture                  n/a               Very useful                                          Extremely useful
PCR                            Limited utility   Useful, depending on clinical progression            Useful
Acid-fast bacilli fecal smear  n/a               n/a                                                  Useful, depending on clinical progression
IFN-γ                          Limited utility   Generally useful, depending on clinical progression  Useful
Serology                       Generally n/a     Generally useful, depending on clinical progression  Useful

Notes: n/a: not applicable. PCR: polymerase chain reaction. IFN-γ: gamma interferon.
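The distinction drawn earlier between individual-animal and herd-level performance can be illustrated with a short calculation. Assuming independent test outcomes across animals (a simplification) and purely illustrative numbers, a test that misses most infected individuals can still flag most infected herds, while per-animal false positives accumulate across a large herd:

```python
def herd_sensitivity(individual_se, n_infected):
    """P(at least one infected animal tests positive), assuming
    independent test outcomes across animals."""
    return 1 - (1 - individual_se) ** n_infected

def herd_specificity(individual_sp, n_tested):
    """P(no false positives among n_tested uninfected animals)."""
    return individual_sp ** n_tested

# Illustrative: a test with 25% individual-animal sensitivity
for k in (1, 5, 10, 20):
    print(f"{k:2d} infected: herd-level Se = {herd_sensitivity(0.25, k):.2f}")
print(f"herd-level Sp, 100 head at Sp 0.99: {herd_specificity(0.99, 100):.2f}")
```

This is why a screening test such as the ELISA can be useful for finding infected herds even though its sensitivity in any single animal is low, and why herd-level positives still require confirmatory, organism-based testing.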

An ideal diagnostic test for a control program would identify most animals in Stage I. No antemortem test meets this standard. Surveillance sampling for histopathologic changes is both impractical and likely to be insensitive, because visible mucosal changes are later manifestations of disease. A more practical goal for a control program would be to identify animals as they enter Stage II and to continue to improve the sensitivity of detection so that an increasing proportion of Stage I animals can be recategorized as Stage II. Fecal culture is the accepted standard for identification of Stage II animals. Several other laboratory methods are available as alternatives to fecal culture, but none has proved superior in identifying individual infected animals.

It has been erroneously claimed that, because Map is an obligate pathogen, there are no false-positive fecal cultures; in other words, whenever the organism is isolated from a clinical specimen, the JD pathologic process will be present in the animal. This assertion is based on the fastidiousness of the organism and its inability to proliferate without externally provided cofactors, which would make it an obligate (as opposed to facultative) intracellular pathogen. That now seems unlikely, however, in view of the large number of species Map can infect, its widespread geographic distribution, and its capacity to persist under harsh environmental conditions. Little is known about whether there are microenvironments that permit proliferation outside the animal host, but it is reasonable to expect that this could be the case. In addition, in cattle from contaminated environments, pass-through Map can be detected in uninfected animals, yielding a false-positive result (Sweeney et al., 1992a).

Bacteriologic Culture

The Johne’s bacillus was first described in the inflamed intestine of a cow in 1895. It was eventually grown in 1912 on enriched culture media to which extracts of M. tuberculosis and M. phlei had been added (Twort and Ingram, 1912). Culture techniques were refined after the mycobactins, which are iron-chelating compounds (siderophores), were identified as the essential ingredient from those extracts that enabled cultivation of the Johne’s bacillus.

Detection of Map in conjunction with histopathologic lesions of JD is regarded as the gold standard for determining whether an animal is infected. The most common cultivation method involves processing of a sample matrix (most commonly feces, but also intestinal tissues, milk, or fetal tissues) and detection of subsequent visible growth of the agent on any of a variety of artificial culture media. Occasionally, fecal culture by itself has been erroneously regarded as a gold-standard procedure for determining an animal’s infection status. Although the specificity of fecal culture is high, it is not absolute, because of the potential for pass-through of orally ingested organisms by uninfected cattle (Sweeney et al., 1992a) and because of laboratory errors, such as sample misidentification or cross-contamination. Because of the relatively low sensitivity of conventional fecal culture in subclinically infected animals, and because Map grows very slowly on artificial media, much has been done to improve performance of the procedure and, more recently, to develop hybrid test procedures (e.g., Collins et al., 1990b; Secott et al., 1999).

All culture methods require long incubation (8–16 weeks or more, in some cases), decontamination of the specimen to selectively kill faster-growing non-mycobacterial organisms that would otherwise overgrow the culture medium, and some way to concentrate the organisms from within the specimen before inoculation of the medium. A variety of methods and parameters have been evaluated over the years to improve the sensitivity of Map detection by culture. The following discussion briefly highlights the most common methods and techniques for isolating Map from clinical specimens.

Decontamination of Specimens

Various chemicals and antibiotics that are selectively toxic to organisms other than Map have been included in decontamination methods. Hexadecylpyridinium chloride (HPC) is probably the most commonly used decontaminant; it is less toxic to Map than are other commonly used decontaminants (Merkal et al., 1982). HPC is commonly used in North America, but sodium hydroxide is more common in Europe (Whitlock et al., 1992). Other decontaminants are benzalkonium chloride and oxalic acid (Merkal et al., 1982); sodium hydroxide and oxalic acid followed with neomycin and amphotericin B (Jorgensen, 1982); cycloheximide with nystatin and tetracycline (Merkal and Richards, 1972); and PANTA, a mixture of amphotericin B with polymyxin B, nalidixic acid, trimethoprim, and azlocillin (Merkal and Richards, 1972). PANTA is a premixed antibiotic supplement (Becton Dickinson Laboratories, Sparks, Maryland, USA) that has been recommended by the manufacturer for use with the BACTEC radiometric culture system.

The double-incubation method (also called the Cornell method) is commonly used for decontamination (Shin, 1989; Whitlock et al., 1992). It includes a preincubation step with brain-heart infusion medium that initiates germination of bacterial and fungal spores, followed by centrifugation, and then a second step with the addition of antibiotics (amphotericin B, vancomycin, and nalidixic acid) to kill the spores that subsequently germinate (Shin, 1989; Whitlock et al., 1992). Singh and co-workers (1992) reported that the double-incubation culture method demonstrated a higher sensitivity of detection and reduced contamination more effectively than did the conventional sedimentation method.

Concentration of Map from Specimens

After the culture specimen is decontaminated, the Map organisms must be concentrated to increase the sensitivity of the technique (i.e., to enable detection of lower numbers of organisms). Several methods, including centrifugation, sedimentation, filtration, and immunomagnetic separation, have been used to accomplish this task. Centrifugation of the supernatant increases recovery of Map (Whipple et al., 1991; Whitlock and Rosenberger, 1990). Various combinations of speed and duration of centrifugation have been suggested, including 900g (gravity units) for 30 min (Whitlock and Rosenberger, 1990), 1000g for 30 min

(Whitlock et al., 1989), and 1700g for 20 min (Stabel, 1997). High-speed centrifugation that compacts the pellet can increase contamination rates and make the pellet difficult to resuspend (Whipple et al., 1991). Kalis and co-workers (1999b) reported that high-speed centrifugation (3000g for 15 minutes) did not increase the efficiency of identification of animals that shed Map.

Allowing the supernatant to settle overnight or longer improves recovery of Map organisms by sedimentation. Stabel (1997) compared three methods of Map concentration and reported that centrifugation yielded more organisms than did sedimentation, but fewer than a modified version of double incubation; centrifugation also had higher rates of contamination than did sedimentation or double incubation.

Filtration of the supernatant through polycarbonate filters with a 3-micron pore size (to take advantage of the organisms’ tendency to form large clumps) has been used with the BACTEC system to concentrate Map while letting contaminants pass through (Collins et al., 1990b).

Paramagnetic beads coated with antibodies specific to Map have been used to quickly extract Map organisms from feces and milk by immunomagnetic separation. The beads and the attached Map organisms are easily separated from the specimen with the use of a magnetic field and thorough washing. This method has been highly effective for milk specimens, in which the quantity of Map per unit volume is low (Grant et al., 1998), but it is much less effective in isolating Map from feces.

Conventional Culture for Map

Conventional Map culture consists of decontaminating the specimen, concentrating the organisms, and inoculating a growth medium. Solid media are most commonly used; Herrold’s egg yolk medium (HEYM) is popular in many regions, but a modified Lowenstein-Jensen medium is preferred in some areas of Europe. Each medium is supplemented with mycobactin J (Jorgensen, 1982; Whitlock et al., 1992). A thorough discussion of medium constituents and techniques for these and other less commonly used solid media was published by Whitlock and colleagues (1992). Multiple tubes of medium, with and without mycobactin J supplementation, are used for each specimen to assess the mycobactin dependence commonly associated with Map (Whitlock et al., 1992). The tubes are monitored for at least 16 weeks, after which the presence and number of colonies and the demonstration of mycobactin dependence consistent with Map are recorded (Whitlock et al., 1992). A molecular confirmatory test, such as PCR to detect the Map marker sequence IS900, can be used to confirm positive specimens.

Automated Culture

Radiometric Systems

The radiometric system in greatest current use is the BACTEC system (Becton Dickinson Laboratories, Sparks, Maryland, USA). It was originally designed for human clinical laboratories for the diagnosis of tuberculosis and other human mycobacterial infections. Collins and co-workers (1990b)

demonstrated that the system could be modified to culture Map. BACTEC is highly automated and faster than conventional culture systems, and it apparently has a slightly higher sensitivity, but it is quite expensive and requires the use of radioisotopes (14C-labeled palmitic acid). Specimens are decontaminated and then mixed with PANTA and a specific liquid medium that contains 14C-labeled palmitic acid in sealed tubes. The instrumentation detects the 14C-labeled CO2 that is produced by metabolism of the labeled palmitic acid. Although many microorganisms can metabolize labeled palmitic acid, the decontamination step and the antibiotic brew (PANTA) select for mycobacterial species (Collins et al., 1990b). A confirmatory test, such as IS900 PCR, is required on positive specimens. When the BACTEC system is combined with IS900 PCR, rapid and specific identification of Map is possible, often within only a few weeks (Whittington et al., 1998b). This system is used in only a few laboratories throughout the world, primarily because the equipment is expensive, but also because it requires the use and disposal of radioactive materials.

Nonradiometric Systems

Several new nonradiometric automated culture systems have become available. They are highly automated, incubating and evaluating the culture vials simultaneously and downloading results directly to a computer. The systems require special defined media and incorporate a detector system that reacts to alterations in oxygen, CO2, or pressure within a sealed tube (Nielsen et al., 2001). Studies evaluating their utility for the detection of Map in clinical specimens have not been published, but several studies of their ability to detect mycobacteria in human clinical specimens have been published: the MB/BacT system by Organon Teknika (Brunello et al., 1999), the BACTEC MGIT 960 and BACTEC 9000 MB systems by Becton Dickinson (Tortoli et al., 1998), MB Redox by Bioquest (Piersimoni et al., 1999), and the ESP system by Trek Diagnostics (Tortoli et al., 1998).

Pooled Fecal Culture

The use of pooled fecal specimens from several animals within a herd has been evaluated as a means of determining a herd’s infection status. Pooling samples reduces the number of fecal cultures necessary to determine infection, thereby reducing the cost of a large-scale JD control or eradication program. One study (Kalis et al., 2001) compared the results of strategically pooled culture specimens (five animals of the same age per pool) with those of individual fecal specimens. Using individual fecal culture as the gold standard, the authors reported a sensitivity of 86 percent and a specificity of 96 percent, higher than expected. The authors concluded that pooling of the samples influenced diagnostic sensitivity and that strategic pooling of samples would considerably reduce testing costs when a herd is not suspected of being infected. If a herd were infected, however, individual cultures or other organism-based tests would be necessary to identify infected individuals (Kalis et al., 2001). The same research group also evaluated strategically pooled fecal

cultures as a method for certifying Map-free dairy herds. They concluded that the absence of clinical signs of JD in a herd was not a good indicator of the herd’s infection status; that culture of pooled fecal specimens could be used to detect Map infections in a herd that was believed to be uninfected; that repeated cultures of strategically pooled fecal specimens, in combination with closed-herd management, were necessary to effectively determine a herd’s Map infection status; and that the optimal number of herd cultures necessary to provide sufficient confidence that a herd is indeed free of infection remains an open question (Kalis et al., 1999a).

At the Seventh International Colloquium on Paratuberculosis, Garner and co-workers (2002) presented a preliminary report of a continuing study of the diagnostic sensitivity of pooled fecal culture for Map in dairy herds. The results indicated a sensitivity of 30–100 percent for pools of 5 and 10 animals; sensitivity depended on pool size and on the number of heavy versus light shedders (Garner et al., 2002).

Sensitivity of Fecal Culture

Merkal et al. (1970) reported that fecal culture has a diagnostic sensitivity of roughly 50 percent and that it detects animals shedding more than 100 colony-forming units per gram (cfu/g) of feces. Several other studies have evaluated the sensitivity of various fecal-culture methods and report sensitivities ranging from 38 to 55 percent and specificities of near 100 percent (Eamens et al., 2000; Sockett et al., 1992a; Stabel, 1997; Whitlock et al., 2000b). Sockett and colleagues (1992a) reported on one study in which radiometric culture (BACTEC) produced a slightly higher sensitivity than did conventional fecal culture (54.4 percent vs. 45.1 percent). Another study (Eamens et al., 2000) evaluated five conventional and radiometric culture methods. The authors reported that Whitlock decontamination to BACTEC medium was most sensitive for detecting shedder cattle, followed in order by Whitlock decontamination to HEYM, modified Whitlock decontamination to BACTEC medium, conventional decontamination with sedimentation to HEYM, and conventional decontamination and filtration to BACTEC medium (Eamens et al., 2000).

Immunologic Tests

Detecting the Cell-Mediated Response

Various tests can detect the cell-mediated immune response to Map, but the most common are assays for IFN-γ production (Rothel et al., 1990; Wood et al., 1989, 1992). In the past, intradermal skin tests (Johnson et al., 1977; Wentink et al., 1992) and lymphocyte blastogenesis, also called lymphocyte transformation (Buergelt et al., 1977, 1978a; Johnson et al., 1977), were used more commonly.

Skin Testing

The skin test takes advantage of the development of a delayed-type hypersensitivity (DTH) reaction to the intradermal injection of a mycobacterial

extract, purified protein derivative (PPD). Intradermal skin testing has been, and continues to be, commonly used for the diagnosis of both bovine and human tuberculosis. The PPD used in JD skin testing in the United States has been named Johnin; the PPD extract was isolated from M. avium serovar 2 (formerly M. paratuberculosis strain 18), which, before the development of more sensitive genetic methods of identification (Chiodini, 1993), was presumed to be a laboratory-adapted strain of Map. A positive test results in an increase in skin thickness (greater than 3 mm) at the site of injection within 72 hours of intradermal injection of Johnin PPD. Significant cross-reactions occur with exposure to other environmental mycobacteria, such as M. avium species and M. bovis, or with vaccination for JD, resulting in a lack of specificity and a poor correlation with the infection status of the animal (Cocito et al., 1994; Collins, 1996; Wentink et al., 1992, 1994). A nonspecific response can be clarified in cattle by the use of the comparative cervical skin test, because a stronger reaction occurs at the M. avium (PPD-A) injection site than at the M. bovis (PPD-B) site (Manning and Collins, 2001).

Lymphocyte Blastogenesis and Transformation

Lymphocyte blastogenesis, or transformation, is a relatively complex in vitro bioassay that uses the antigen Johnin PPD to stimulate lymphocytes in fresh bovine whole blood co-incubated with 125I-5-iodo-2'-deoxyuridine (Buergelt et al., 1977, 1978a) or, in some cases, 3H-thymidine (Kreeger et al., 1991). The incorporation of labeled deoxyuridine is measured with a gamma counter to determine the degree of lymphocyte blastogenesis in response to Johnin stimulation (Buergelt et al., 1977, 1978a). Lymphocyte blastogenesis also has been used as a diagnostic test in North American wild ruminants and domesticated sheep (Williams et al., 1985). The method requires the use and disposal of radioisotopes, relatively expensive instrumentation, and a fairly large volume of fresh whole blood. It must be done immediately after specimen collection, and it suffers from specificity problems, related to exposure to other mycobacteria, similar to those of skin testing.

Gamma Interferon

The basis of the IFN-γ test is the production and release of IFN-γ by sensitized bovine lymphocytes in response to in vitro stimulation with a series of mycobacterial antigens, including Johnin PPD and M. bovis PPD (Wood et al., 1992). The method was developed for the diagnosis of bovine tuberculosis (M. bovis infection) as an in vitro correlate of skin testing (Rothel et al., 1990). Two methods have been developed to detect bovine IFN-γ: a bioassay (Wood et al., 1989) and a sandwich enzyme immunoassay (EIA) (Wood et al., 1992). An EIA test kit is commercially available (BOVIGAM, CSL Ltd., Parkville, Australia). The sandwich EIA is more sensitive and specific than the bioassay for detecting Map-infected cattle at all stages of the disease: it offers a sensitivity of 71.8 percent for subclinical cases with no fecal shedding, 93.3 percent for subclinical cases with fecal shedding, and 100 percent for clinical cases. The

bioassay, by comparison, had a sensitivity of 16.7 percent for subclinical cases with no fecal shedding, 33.3 percent for subclinical cases with fecal shedding, and 80 percent for clinical cases (Billman-Jacobe et al., 1992; Nielsen et al., 2001; Wood et al., 1992). Collins and Zhao (1995) showed that it is possible to differentiate experimentally infected animals from noninfected animals at 17 months post-challenge. Jungersen and co-workers (2002) reported that cross-reactivity with Map could be documented in cattle with specificities of 95–99 percent after Johnin stimulation, irrespective of interpretation relative to M. bovis PPD or no antigen stimulation. However, false-positive test results were found in animals under the age of 15 months, and false negatives were reported for infected animals when the test was performed on day-old specimens (Jungersen et al., 2002). EIA evaluated in young cattle has produced mixed results related to specificity and uncertain test interpretation; the cause could be exposure to other mycobacterial antigens, as described for other cell-mediated tests (Collins and Zhao, 1995; McDonald et al., 1999). A modification of this test is reported to be available soon for use in Cervidae (Manning and Collins, 2001).

Serologic Tests to Detect the Humoral Response

Serologic tests for Map are most useful for establishing the herd prevalence of infection, for presumptive identification of infected animals, and for confirming the diagnosis of JD in animals that demonstrate compatible clinical signs (Nielsen et al., 2001). A variety of tests can detect humoral antibodies to Map in bovine serum, but because only the agar gel immunodiffusion (AGID) test, the complement fixation test (CFT), and the absorbed indirect ELISA have been used widely to diagnose JD, the discussion below is limited to those tests. In a recent review of Map in veterinary medicine, Harris and Barletta (2001) briefly describe the results of studies conducted over the past decade on assays using serum or blood to detect humoral and cell-mediated responses to Map.

In general, because cell-mediated immunity (CMI) develops early in Map infection and humoral immunity develops 10–17 months after infection, serologic tests are not recommended for animals younger than 15 months (Lepper et al., 1989; Nielsen et al., 2001). Because the humoral response tends to occur relatively late in infection, these tests are better suited to detection of clinical than of subclinical disease (Nielsen et al., 2001). All of them depend heavily on the use of Map-specific antigens to prevent cross-reaction with antibodies that develop after exposure to other mycobacteria, such as M. bovis or other M. avium species (Nielsen et al., 2001). This has been a particular problem because the original antigen used for some of the tests (particularly ELISA) was derived from a lysate of M. avium serovar 2, formerly known as M. paratuberculosis strain 18 (Chiodini, 1993).

Complement Fixation Test

The complement fixation test (CFT) was one of the earliest serologic tests for JD, and it is still required by some countries for the export or import of livestock (Colgrove et al., 1989; Larsen et al., 1963). CFT’s major advantage is its ability to detect heavily infected animals (de Lisle et al., 1980). Most animals with a CFT titer of 1:32 or above are likely to be fecal-culture positive, but animals with lower titers also can be positive on fecal culture, indicating a lack of sensitivity (Whitlock, 1994). CFT’s usefulness is limited by its tendency to produce false-positive results and by its lack of sensitivity (Colgrove et al., 1989; Larsen, 1973; Merkal, 1984). CFT detects serum antibodies 1–5 months later than does ELISA (Ridge et al., 1991), and its sensitivity and specificity have been reported as 38.4 and 99 percent, respectively (Sockett et al., 1992b).

Agar Gel Immunodiffusion

Agar gel immunodiffusion (AGID) was developed after CFT as a quick test for animals showing clinical signs of JD (Sherman et al., 1984, 1989). Positive test results correlate well with clinical signs of JD, but failure to detect subclinically infected animals is a major drawback. Whitlock (1994) estimated that when AGID results are positive, there is a 95 percent chance of actual Map infection. Sherman and colleagues (1990) reported that AGID has slightly better sensitivity (18.9 percent) and specificity (99.4 percent) for detecting subclinically infected animals than does CFT (sensitivity, 10.8 percent; specificity, 94.7 percent). However, other results have not been conclusive. Sockett and colleagues (1992b) reported sensitivities of 26.6 percent (AGID) and 38.4 percent (CFT) and specificities of 100 percent (AGID) and 99 percent (CFT) (a titer of ≥1:8 was considered positive).
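Comparisons such as these are derived from 2x2 tables that score a candidate test against a reference test. A sketch follows with hypothetical counts (not the data of any study cited here), showing how sensitivity, specificity, and an agreement statistic such as Cohen's kappa are computed from such a table:

```python
def two_by_two_stats(tp, fp, fn, tn):
    """Sensitivity, specificity, and Cohen's kappa from a 2x2 table
    comparing a candidate test against a reference test."""
    n = tp + fp + fn + tn
    se = tp / (tp + fn)             # candidate positives among reference positives
    sp = tn / (tn + fp)             # candidate negatives among reference negatives
    observed = (tp + tn) / n        # raw agreement
    # chance agreement from the marginal proportions
    expected = ((tp + fp) / n) * ((tp + fn) / n) + ((fn + tn) / n) * ((fp + tn) / n)
    kappa = (observed - expected) / (1 - expected)
    return se, sp, kappa

# Hypothetical counts for a low-sensitivity, high-specificity serologic
# test scored against fecal culture
se, sp, kappa = two_by_two_stats(tp=10, fp=1, fn=27, tn=154)
print(f"Se {se:.2f}, Sp {sp:.2f}, kappa {kappa:.2f}")
```

Note that raw percent agreement can look high simply because most animals are negative on both tests; kappa corrects for that chance agreement, which is why "poor agreement" can coexist with high specificity.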
Colgrove and co-workers (1989) reported poor agreement when the results of AGID testing on 192 cattle were compared with fecal-culture results, although there was fair to good agreement among fecal-culture, CFT, and ELISA results. Nielsen and colleagues (2001) reported that AGID was generally less sensitive than ELISA or CFT, particularly for subclinical infection. Because the antigen used for AGID differs from one country to the next, results for the United States and European countries, for example, are not always directly comparable (Goudswaard et al., 1976).

Enzyme-Linked Immunosorbent Assay

Most ELISA tests in current use are modifications of the method developed by Yokomizo and colleagues (1983). For many years, this test used a protoplasmic antigen derived from M. avium serovar 2, formerly known as M. paratuberculosis strain 18 (Chiodini, 1993; Whitlock, 1994; Yokomizo et al., 1983), although by the mid-1990s this antigen was finally replaced with a Map-derived antigen. Current versions employ a step to reduce cross-reaction with nonspecific mycobacterial antigens by absorbing the bovine test serum with M. phlei before performing the ELISA (Bech-Nielsen et al., 1992). Other antigens have been used in some ELISA studies, including various extracts of Map,

purified lipoarabinomannan, and lipid-free arabinomannan (Harris and Barletta, 2001; Sugden et al., 1987, 1989). Care must be taken in comparing the results of the various ELISA studies to be sure that similar antigens were used. ELISA test kits or services are commercially available from a number of sources (IDEXX, Portland, Maine, USA; Allied Monitor, Fayette, Missouri, USA; Synbiotics, San Diego, California, USA; Biocor Animal Health, Inc., Omaha, Nebraska, USA; CSL, Parkville, Victoria, Australia). Depending on the specific ELISA, sensitivities range from 43.4 to 58.8 percent and specificities range from 95.4 to 99.8 percent (Collins et al., 1991; Nielsen et al., 2001; Ridge et al., 1991; Sockett et al., 1992b). Apparently the best currently available serologic test for JD is the absorbed ELISA (Collins, 1996; Sockett et al., 1992b). An ELISA for bulk-tank milk also has been developed for estimating the prevalence of JD in dairy cattle (Nielsen et al., 2001). Those authors reported a sensitivity of 97 percent and a specificity of 83 percent. However, the number of infected animals that must be present in the herd to result in a positive bulk-tank sample is apparently unknown, and it is likely that a herd with only a few infected animals would be missed (Nielsen et al., 2001). An earlier study (Hardin and Thorne, 1996) compared individual-animal milk ELISA with serum ELISA in 821 dairy cattle from 12 Missouri herds and reported a low correlation (κ = 0.08 and R² = 0.02) between the two tests. More study will be necessary to determine the usefulness of ELISA on individual and bulk-tank milk samples for use in JD status or control programs. Interpretation of ELISA results generally has been based on a single, arbitrary cutoff value with a resultant positive-negative test result. Although this makes results easier for veterinary practitioners and herd owners to understand, valuable information is lost.
Ranking of the values obtained can be used to determine the likelihood ratio that cattle are infected with Map or are fecal shedders (Collins, 1996; Nielsen et al., 2001; Spangler et al., 1992).

Test Result Measures

Fixed-Decision Thresholds or Cutoffs

For serologic tests with a continuous result, such as conventional ELISA, a positive test is one that yields a signal of sufficient intensity (optical density) that it is above or below an arbitrary decision threshold or cutoff value (Greiner et al., 2000). Such a cutoff is selected to optimize the tradeoff between false-negative and false-positive results, which incur different costs. Because of inherent procedure variability and a surprising degree of individual-animal response variability across repeated samples, recent findings suggest that such cutoffs are particularly problematic for Map-antibody ELISA (Barrington et al., 2003; Sockett, 2000). For example, using a commercial kit licensed by the U.S. Department of Agriculture, Sockett (2000) found that 6 of 22 ELISA-positive cows were ELISA-negative 30 days later. On testing a subset of samples in 72 replicates, Sockett found that the coefficient of variation for the procedure was 19 percent; a procedure is considered robust if its

coefficient of variation is below 10 percent. As a consequence of this inherent lack of precision, samples from cows with true values near the cutoff are likely to yield false results on any given test and therefore are likely to produce inconsistent results on repeated testing. Because of the high prevalence of subclinical infection, a large proportion of Map-infected cattle in a herd could be close to the cutoff. Furthermore, for unknown reasons, serum antibody concentrations are more variable in infected animals than our understanding of antibody kinetics would predict (Barrington et al., 2003). These findings support the anecdotal reports of difficulty with replicating Map-antibody ELISA in the field. Variation in results, particularly near a threshold value for test interpretation, seriously undermines the confidence of producers and veterinary practitioners in such tests. One approach to reducing the effect of procedure variability is to run replicates of each specimen or, to minimize total testing costs, to repeat only those tests that yield results within a given range of the cutoff value.

Information Cost and Degree of Result Categorization

When a result from a laboratory procedure that is inherently continuous, such as a colony count or an analyte concentration, is classified into one of a few categories (positive, suspect, negative), potentially useful information about the disease or infection state in the individual is often lost (Shapiro, 1999). For example, reporting the result of fecal bacterial culture only as positive indicates that organisms were shed in feces but provides no information about the level of shedding. Many laboratories increase the utility of the reported information by further dividing positive culture results into several categories (light, moderate, heavy, too numerous to count [TNTC]).
Because shedding is likely related both to an animal’s disease state and to the risk it poses to other animals in its environment (Whitlock et al., 2000b), information on shedding is useful for management decisions, such as whether to cull an animal immediately or to wait for a later stage in the production cycle. Increasing the information yield by further categorization, such as providing a more precise estimate of fecal shedding, increases the cost of testing. The value of improving the yield of information from procedures depends on the balance between the increased utility of the information to its users—producers and veterinary practitioners—and the increased costs. Producers generally are willing to pay more for information that is more accurate or more useful. As the analytic sensitivity of a test procedure improves and additional result categories become available, additional research to validate the relationship between these categories and biologically or economically important outcomes also is needed. Some work has been done to determine the relationship between minimally categorized fecal shedding and such factors as the likelihood of in utero transmission (Sweeney et al., 1992c), the presence of the agent in colostrum or milk (Sweeney et al., 1992b), the degree of milk and other production losses (Abbas et al., 1983; Benedictus et al., 1987; Buergelt and Duncan, 1978), and the time to development of clinical disease. The

analytic sensitivity of fecal-culture procedures has improved considerably since much of this work was done, and more research is needed to improve test credibility and to provide a better basis for management decisions. The relationships between test result categories and outcomes beyond the presence or absence of infection should be studied. Little work has been published that relates serologic test results to infection states and to the associated production and health effects; however, because of their rapid turnaround, the use of such tests is increasing.

Likelihood Ratios and ROC Curves

Likelihood ratios that correspond to specific test results are a useful approach for reporting test procedures that yield a continuous quantitative result, such as optical density from conventional-format ELISA (Radack et al., 1986). The power of the likelihood ratio is that it allows clinicians to revise estimates of infection or disease probability (Giocoli, 2000). Using a conversion algorithm, the test value—optical density, for example—for each animal is converted to a likelihood ratio that is then reported as the result. A major advantage of the algorithmic approach is that the effects of covariates on test performance (age, breed, vaccination status, gestational status) can be incorporated to increase the accuracy—and thus the usefulness—of the test information (Greiner and Gardner, 2000). The failure to formally incorporate this readily available covariate information into the calculation of test likelihood ratios has been named as a major oversight in this field (Brenner et al., 2002; Feinstein, 2002). Using a likelihood ratio for each tested animal and a known or assumed prior probability of infection in the herd, the producer or veterinarian can obtain positive and negative predictive values for each animal tested.
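The odds-based conversion just described can be sketched in a few lines. The pretest probability and likelihood ratio below are hypothetical illustration values, not figures from the cited studies, and the mapping from optical density to likelihood ratio is assumed to have been established elsewhere.

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Revise a pretest probability of infection using a likelihood ratio.

    Probability is converted to odds, multiplied by the likelihood ratio,
    and converted back to a probability (Bayes' theorem in odds form).
    """
    pre_odds = pretest_prob / (1 - pretest_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Hypothetical example: assumed herd prevalence of 10 percent, and an ELISA
# optical density that an assumed conversion algorithm maps to LR = 8.
p = post_test_probability(0.10, 8.0)
```

Here a moderately elevated result raises the probability of infection from 10 percent to roughly 47 percent, which illustrates why the same test value carries different weight in herds of different prevalence.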
These indices can be calculated from formulas (Fletcher et al., 1996; Sackett et al., 1991) or obtained directly from a nomogram relating likelihood ratios to pretest and post-test disease probability (Fagan, 1975; Giocoli, 2000; Moller-Petersen, 1985). Using farm-specific estimates of prevalence together with false-positive and false-negative costs, a farm-specific cutoff can be calculated to satisfy an optimality criterion (Greiner et al., 2000). The use of likelihood ratios avoids the problem of information loss associated with the too-broad categorization of results, as well as the confusion that arises from repeated testing and the use of a fixed cutoff in the presence of procedure and animal variability. Producers and their veterinarians need educational materials explaining the interpretation of likelihood ratios as test results and the use of calculations or other indices, because those measures are not yet in wide use. Recent evidence from the practice of human medicine shows that very few physicians are familiar with or use these measures correctly, despite their clinical usefulness (Hoffrage et al., 2000; Reid et al., 1998; Steurer et al., 2002). Veterinary practitioners are likely to have a similar lack of familiarity and understanding. Establishing the algorithm for converting a particular test value (optical density, percentage inhibition) to the corresponding likelihood ratio, whether by equation or empiric table, requires determining the test performance over a full

range of typical results in groups of infected and uninfected animals representative of those in which the test will be used. The results of such evaluations are expressed as receiver operating characteristic (ROC) curves, which are essentially plots of all true-positive (sensitivity) versus false-positive (1−specificity) pairs over the range of the test (Shapiro, 1999; Zweig and Campbell, 1993). Because of these common scales, the use of ROC curves also permits direct accuracy comparisons among test formats. The application of ROC curves to medical laboratory test evaluation is relatively recent, and ROC data analysis methodology is an active area of research in biostatistics. ROC curves are not yet widely accepted, particularly for tests that provide continuous results (Pepe, 2000b). Recent examples of advances are the ROC analysis approaches that use specialized regression techniques, as published by Lloyd (2002) and Pepe (2000a). The application of ROC curves to veterinary diagnostic test evaluation has been reviewed by Greiner and colleagues (2000). As with the evaluation of any laboratory test, proper design and execution of a study used to establish a ROC curve are critical, and the fundamental principles are no different. Despite the well-established methodology of test evaluation, such studies often have serious flaws (Reid et al., 1995), with the design flaws most often resulting in an overestimation of test performance (Lijmer et al., 1999). Greiner and Gardner (2000) provided a design checklist and discussed sample-selection methods and strategies for avoiding bias in veterinary laboratory test evaluation studies. Because test sensitivity and specificity generally vary across herds (Brenner and Gefeller, 1997; Greiner and Gardner, 2000), the results from studies involving multiple herds are best analyzed as clustered data.
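The construction of an empirical ROC curve from continuous test values, ignoring cluster effects, can be sketched as follows. The optical-density values are invented for illustration and do not come from any study cited here.

```python
def roc_points(values_infected, values_uninfected):
    """Empirical ROC curve: (1 - specificity, sensitivity) pairs.

    Each observed value is tried as a cutoff; a value at or above the
    cutoff is called positive.
    """
    cutoffs = sorted(set(values_infected) | set(values_uninfected))
    points = []
    for c in cutoffs:
        sens = sum(v >= c for v in values_infected) / len(values_infected)
        fpr = sum(v >= c for v in values_uninfected) / len(values_uninfected)
        points.append((fpr, sens))
    return points

# Hypothetical optical densities from known-infected and known-uninfected animals.
infected = [0.9, 0.7, 0.6, 0.4]
healthy = [0.5, 0.3, 0.2, 0.1]
curve = roc_points(infected, healthy)
```

Sweeping the cutoff from the lowest to the highest value traces the curve from (1, 1) toward (0, 0); the closer the curve lies to the upper-left corner, the better the discrimination.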
At least one paper has been published on the analysis of clustered ordinal, rather than continuous, data for ROC curves (Beam, 1998). Although a sufficient number of test subjects must be carefully selected for sampling, clear guidelines for estimating the sample size required for developing ROC curves for a test with a continuous result are not yet available. Without regard to cluster effects, Sintchenko and Gilbert (2001) suggest that a minimum of 500 animals is needed on each side (infected and noninfected) of a study. Although not directly applicable to establishing ROC curves for laboratory tests that yield continuous results, Obuchowski (2000) provides tables containing some smaller sample sizes as preliminary estimates.

MOLECULAR METHODS

The first molecular studies of Map were connected with efforts to identify a small number of mycobacterial isolates obtained from long-term cultures of human intestine from patients with Crohn’s disease (Chiodini et al., 1984; McFadden et al., 1987a). The organisms were first evaluated by DNA-DNA hybridization with known strains of mycobacteria and were found to be indistinguishable from organisms of the M. avium complex (McFadden et al., 1987b). Genetic fingerprinting techniques using restriction fragment length polymorphism (RFLP) analysis of DNA subsequently identified these isolates as

identical to known isolates of Map. Application of these and other techniques to standard laboratory strains of Map led to the proposed taxonomic reclassification of Map as a subspecies of M. avium (Rogall et al., 1990; Thorel et al., 1990a). Those analyses showed that, with more than 98 percent genetic identity, Map was more closely related to M. avium subsp. avium than to M. intracellulare or M. scrofulaceum.

IS900

In 1989, a novel DNA insertion sequence in Map was reported independently from two laboratories (Collins et al., 1989; Green et al., 1989). This insertion sequence, IS900, was the first described in mycobacteria and has attributes that, when combined with modern molecular methods, make it a powerful tool for studying Map infection. Studies of mycobacterial strain collections showed the presence of multiple copies of IS900 in all isolates of Map and in no other mycobacterial species analyzed. However, recent studies have shown that insertion sequences similar to IS900 exist and can yield false-positive test results if stringent procedures are not followed (Cousins et al., 1999). The apparent specificity of IS900 for Map was quickly exploited by using PCR to develop a rapid assay for identification of Map from clinical specimens (Vary, 1990). A commercially available assay kit (IDEXX Laboratories, Inc., Westbrook, Maine, USA) has been evaluated in several host species. In cows, IS900 PCR testing of feces is highly specific, as well as sensitive when fecal shedding is frequent (Sockett et al., 1992a; Thoresen and Saxegaard, 1991). Although PCR testing is not as sensitive as fecal culture, results can be obtained within a few days rather than after many weeks—an important attribute for JD control programs. Other genetic sequences (hspX and F57), purportedly specific for Map, also have been reported, but additional work is needed to clarify their utility (Bannantine and Stabel, 2000; Coetsier et al., 2000; Ellingson et al., 1998, 2000; Poupart, 1993).
At least two studies have raised doubts about reliance on IS900 PCR typing alone for identification of Map from clinical specimens. Critical review reveals some of the uncertainties in interpretation that can arise when new findings are presented. Cousins and colleagues (1999) reported that mycobacteria isolated from the feces of four healthy ruminants were positive by IS900 PCR typing. Those findings are relevant to JD control programs because the isolates were obtained during monitoring of herds that had been established as JD-free. A careful analysis of the amplified product by DNA sequencing showed that the IS900 primers used had amplified DNA elements that were only moderately similar to IS900 from Map (71–83 percent DNA-DNA homology). Three of these isolates showed closer sequence relatedness to strains of M. paraffinicum, and the fourth to M. scrofulaceum. In addition, DNA typing by RFLP analysis showed patterns distinct from those of IS900. Thus, isolates that were presumptive Map by PCR were found, on closer examination, to be distinct mycobacterial species.

Another study (Naser et al., 1999) evaluated IS900 PCR on M. avium isolates derived from human clinical sources and reported positive amplification reactions in 15 of 28 isolates. The results were confirmed by hybridizing the PCR reaction products with a labeled plasmid probe derived from Map. However, hybridization of this probe with M. avium subsp. avium has been accomplished by Hampson and colleagues (1989) under low-stringency conditions, which permit interaction with chromosomal sequences flanking the IS900 insertion or with non-IS900 sequences of lower specificity. Naser et al. (1999) did not report the stringency conditions, so it is difficult to interpret the results of their method of validation. Sequencing of the amplified products would have provided a more convincing demonstration of the presence of IS900 in these M. avium clinical isolates. When sequencing of PCR products was done by Cousins and colleagues (1999), it became clear that their mycobacteria were distinct from Map and that IS900 was not present. These results highlight several concerns about the use of molecular methods as a surrogate for isolation of the whole organism. It is essential that the methods’ specificity be tested in well-designed collaborative studies and measured against clear and accepted standards. Any departures from established methods should be validated by a similar comparison with the primary standard or with a widely accepted surrogate, although consensus standards should be used whenever possible. Care in interpretation must be exercised whenever an established test is applied to a new clinical entity or to a new clinical context (such as when prevalence differs markedly from the conditions for which the assay was developed). Molecular diagnostic methods, such as PCR for IS900, are often erroneously claimed to have 100 percent specificity, despite the occurrence of clearly demonstrated false-positive results, even in the absence of contamination.
Limitations of PCR in routine diagnostic applications include its high cost, its demand for technical sophistication, and the need for well-defined, stringent quality control (Collins, 1996).

Molecular Strain Typing

The capacity to differentiate individual strains of Map is essential for evaluating routes of transmission and characteristics of pathogenesis. It is important for producers to be able to identify the source of a new infection because that information often will dictate corrective action. Different control strategies are warranted depending on whether a new infection is the result of introducing livestock from another herd or is attributable to animal contact with something in the farm environment, such as contaminated pasture. Isolates of Map from different clinical sources have few distinguishing phenotypic characteristics. The only features that differentiate strains of Map in culture are the rate at which they grow and, sometimes, variations in pigment (Stevenson et al., 2002). However, several methods have been developed to discriminate closely related strains. Nonmolecular methods are usually based on serology, differences in biochemical properties, antimicrobial susceptibility, and phage typing. Multilocus enzyme electrophoresis (MEE), which compares strain differences in the size of common metabolic enzymes, could be useful, but few

of these techniques are practically useful for differentiating strains of Map. MEE is also technically ponderous, and its use is limited to a few research laboratories (Feizabadi, 1997; Thorel et al., 1990b; Wasem et al., 1991). Molecular strain typing has had a great influence on studies of Map. Among the techniques used have been RFLP analysis of DNA, pulsed-field gel electrophoresis of DNA, and multiplex PCR typing. RFLP has been used most extensively. In the most common application of RFLP analysis, DNA is cut into small, nonrandom lengths that are separated by size. A probe that recognizes regions occurring in multiple copies in the genome reveals the fragment sizes that contain those regions. In the correct circumstances, the resulting pattern will be characteristic of closely related strains but will change as relatedness between strains becomes more distant. There are many variations on RFLP typing, and the results can vary depending on the type of probe used, the manner in which the DNA is cut, and the choice of reaction conditions. Sequences from IS900 are the most widely used probes in RFLP analysis of Map. IS900 has been particularly useful because 15–20 copies typically are present in the genome, and its insertion remains stable over many generations of growth of the organism (Cousins et al., 2000; Green et al., 1989; Pavlik et al., 1995; Whipple, 1990). IS900 RFLP typing has been widely applied to isolates of Map from animal and human sources (Thoresen and Olsaker, 1994). It is now well documented that there are at least two main strain types of Map, designated C (cattle) and S (sheep), which can be distinguished by RFLP patterns. These strain types are discussed in greater detail in Chapter 2.

EPIDEMIOLOGIC TOOLS

Test Protocol Standardization

Because of the relatively poor epidemiologic sensitivity and specificity of current laboratory tests for subclinical infections, considerable research is focused on improving test performance.
Some have proposed methods for optimizing conventional Map culture of bovine-origin specimens (Whipple et al., 1991; Whitlock and Rosenberger, 1990), but specific standard protocols have not been established for any JD tests. Even the International Office of Epizootics (OIE) Manual of Standards, the principal source of standard methodologies prescribed for international trade (OIE, 2000), allows a large degree of flexibility in diagnostic methods. OIE does not prescribe a JD test for international trade or detail an international primary reference standard. For example, it acknowledges that conventional culture is technically difficult and time-consuming, and it cites seven variations in the method (OIE, 2000). The USDA National Veterinary Services Laboratory (NVSL) recommends three protocols (Cornell, National Animal Disease Center, University of Pennsylvania) for culture of bovine fecal specimens (Payeur and Kinker, 1999). However, the version of the Pennsylvania protocol in the reference cited by Payeur and Kinker (Stabel, 1997) as the source of these protocols is different from

that protocol as published by its originator (Whitlock and Rosenberger, 1990). Protocols for conventional culture are therefore not readily available in the refereed primary scientific literature and are in a continuous state of improvement (Shin, 2000). In light of relatively poor test standardization and the considerable research to improve it, the absence of specific standard protocols is probably justified. Beyond reagent and reference-standard quality control, the incorporation of rigid procedural standards into control program requirements is likely to retard the development and incorporation of improved tests into infection control plans. Although the use of ELISA serologic testing for bovine JD is increasing, the large variation in test procedures compounds the problem of relatively poor diagnostic performance, even with standardized protocols in the form of commercial test kits. After repeated testing of six serum samples across multiple plates and days with the IDEXX ELISA, Sockett (2000) reported an average coefficient of variation of 19 percent, with variation greatest at the lower sample-to-positive control (S/P) ratios. Because the arbitrary positive-negative threshold is 0.25, positive samples with a mean S/P of 0.27 would test negative 35 percent of the time, and negative samples with a mean S/P of 0.21 would test positive 25 percent of the time. Sockett (2000) noted that discrepant results from repeated testing—a consequence of this relatively large variation—seriously undermine confidence in the reliability of testing as part of control and eradication programs. One response to the problem of variability is to move away from using a fixed S/P cutoff for positive or negative classification to using S/P ranges as an indication of the likelihood that an animal is infected (Collins, 2002).
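Under a simple assumption of normally distributed replicate values, a reported coefficient of variation can be translated into misclassification probabilities near the cutoff. The sketch below uses the 19 percent average CV reported by Sockett (2000) and approximately reproduces the roughly one-in-three false-negative rate for a truly positive sample with a mean S/P of 0.27; the normality assumption is ours, not Sockett's.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def p_below_cutoff(mean_sp, cv, cutoff=0.25):
    """Probability that one run of a sample with the given mean S/P falls
    below the cutoff, assuming normally distributed replicates with the
    stated coefficient of variation (an assumption for illustration)."""
    return normal_cdf(cutoff, mean_sp, cv * mean_sp)

# Truly positive sample, mean S/P = 0.27, CV = 19 percent (Sockett, 2000).
p_false_neg = p_below_cutoff(0.27, 0.19)
```

The result is close to the 35 percent figure quoted above; the 25 percent false-positive rate at a mean S/P of 0.21 is consistent with the higher CVs Sockett observed at lower S/P ratios.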
OIE provides international standard sera for many diseases, which is crucial for international standardization of testing procedures, but it is not yet providing such standards for JD in cattle or other species. The international availability of standard sera for all ruminant livestock species is important because they would make it possible to establish a baseline for diagnostic test performance in any laboratory, worldwide. This also would provide a comparison standard within and between laboratories for developing and improving tests and for establishing international trade criteria (Wright, 1998).

Laboratory Proficiency Evaluation

The variability in ELISA test performance makes laboratory proficiency testing and quality control all the more important for JD control programs, but commercial test design can run counter to this goal. In a study involving eight laboratories using an earlier version of the IDEXX ELISA kit, the within-plate coefficient of variation averaged 7 percent, but it ranged from 5 to 29 percent across laboratories. The across-day coefficient of variation averaged 14 percent, but it ranged from 6 to 29 percent (Collins et al., 1993). The authors indicated that, because of economic pressure to keep the cost of the test low, the kit was being marketed as a single-well assay, and they concluded that this was justified based on the results. Running ELISA as a single-well test is contrary to what is considered best laboratory practice. The lack of data on

within-sample variability also markedly reduces a laboratory’s ability to monitor coefficients of variation as a component of laboratory quality control. Because the effects of well position within a 96-well plate on ELISA variability are not random—owing to edge effects—a duplicate-placement scheme was developed several decades ago that takes advantage of this systematic pattern (Stemshorn et al., 1983; Wright, 1987). Placing duplicates in a systematic pattern (rather than next to one another) reduces the coefficient of variation by 30 percent (Stemshorn et al., 1983); yet, for convenience, ELISA kit instructions from two companies suggest placing duplicate positive and negative controls in sequential wells at the beginning of each plate (CSL Laboratories, 2000; IDEXX, 2000). The number and types of controls in these kits are inadequate: the OIE guidelines specify that four replicates of an antibody-negative control, a weak-positive control, and a strong-positive control be run on each plate (Wright et al., 1993). Within-sample variability data would enable continuous monitoring of quality control using computerized procedures such as the Shewhart cumulative sum (Shewhart-CUSUM) program (Blacksell et al., 1994, 1996), which allows continuous evaluation of many factors that affect performance within the laboratory. Such procedures also enable comparison of performance between laboratories on a near real-time basis. Sharing quality-control data via the Internet would allow rapid identification of differences between lots of reagents or kits (Rebeski et al., 2001). A strong, unbiased, external laboratory proficiency evaluation or quality-assessment program is critical to establishing and maintaining confidence in the testing aspects of disease control programs. Indeed, proficiency evaluations are required under the ISO 9000 guidelines implemented by OIE for international trade harmonization (OIE, 2000).
Proficiency programs should be designed to encourage laboratories to adopt better-performing tests and to refine existing protocols rather than restrict them to performing a set of standard test protocols (Salkin et al., 1997). Given the range of protocols—whether for culture or serology—comparing performance across laboratories in a continuous proficiency evaluation program provides the basis for determining what factors or procedures are critical to improved performance (Somoskovi et al., 2001). Publication of proficiency-testing results in media read by producers and veterinary practitioners will encourage underperforming laboratories to adopt better methods. It also will encourage improvement in quality control and will provide consumers of laboratory services (producers and veterinarians) with important information for selecting laboratories. The result will be an increase in confidence in JD test results. At the request of the Laboratory Certification Subcommittee of the Johne’s Disease Committee of the United States Animal Health Association, in 1996 NVSL began an annual check-testing program through which veterinary diagnostic laboratories can become approved for JD fecal culture or serology of bovine-origin specimens (Whitlock et al., 2000a). The procedure is based on a set of serum and fecal-sample unknowns that NVSL provides from known, naturally infected, culture-positive cattle and from known culture-negative cattle herds. In 1999, testing involved 30 samples; in 2000, 25 samples were provided.

Laboratories must correctly identify all of the culture-negative fecal samples, all of the culture-positive TNTC (too numerous to count) samples, and 70 percent of the remaining samples identified as positive by 70 percent of the participating laboratories. Laboratories must correctly identify 90 percent of the serology samples. In the first year, the performance of only five of 35 laboratories met fecal-culture approval standards, and none of 16 met those for serology. This increased to 35 of 41 for fecal culture, to 2 of 6 for fecal PCR, and to 61 of 63 for serology in the 2000 round of testing. Veterinary diagnostic laboratories have an incentive to participate in this program: the current U.S. voluntary cattle herd status program recommends that all tests be performed in an approved laboratory (USAHA, 1998). No such check-testing program exists for veterinary diagnostic laboratories that test JD specimens originating from other ruminant livestock species. In contrast to the proficiency assessment procedures currently in use, and those recommended by OIE (OIE, 2000), considerable research has shown that assessment programs should use a process that is as blind as possible: the participating laboratory should not be able to distinguish check samples from routine samples. Open schemes, such as those currently in use, overestimate day-to-day laboratory proficiency, presumably because more care is taken with the check samples than with regular diagnostic samples. Rather than estimating routine laboratory proficiency, open schemes indicate the optimal proficiency of participating laboratories (Black and Dorse, 1976; Reilly et al., 1999). Libeer (2001) reported that assurance plans that use identifiable samples overestimated clinical chemistry laboratory performance by 25 percent, compared with blind samples.
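The fecal-culture approval rule described above can be expressed as a small check. The function and parameter names are illustrative, not NVSL terminology, and the sample counts are hypothetical.

```python
def fecal_culture_approved(neg_correct, neg_total,
                           tntc_correct, tntc_total,
                           consensus_correct, consensus_total):
    """Sketch of the NVSL fecal-culture approval rule described in the text:
    all culture-negative samples, all TNTC samples, and at least 70 percent
    of the remaining consensus-positive samples must be correctly identified.
    Names here are illustrative, not NVSL terminology."""
    return (neg_correct == neg_total
            and tntc_correct == tntc_total
            and consensus_correct >= 0.70 * consensus_total)

# Hypothetical laboratory: all 10 negatives, all 5 TNTC samples, and
# 8 of 10 consensus-positive samples identified correctly.
ok = fecal_culture_approved(10, 10, 5, 5, 8, 10)
```

Note that the rule is conjunctive: a single missed negative or TNTC sample fails the laboratory regardless of its performance on the consensus-positive samples.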
In a Centers for Disease Control and Prevention study (Hansen et al., 1985) of samples submitted for drug testing, the average error rate across six drug classes was 49 percent higher for blind positive samples than for mailed positive samples. In a study of mycology samples, the frequency of errors on covert samples was as much as 25 percent higher than that on overt samples (Reilly et al., 1999). This bias occurs even when both the laboratory director and the individuals who perform the testing are required by law to sign statements that the proficiency-program samples are handled in the same manner as regular submissions. In a study of 42 laboratories in proficiency-testing programs requiring such a statement, 18 percent of results from blind proficiency-testing samples were unacceptable, but only 4 percent of results from open tests were unacceptable. In that study, 60 percent of the laboratories exhibited significantly better performance on the open samples than on the blind samples (Parsons et al., 2001). Use of the appropriate sample matrix, in addition to blinding, is important: only 47 percent of laboratories properly identified E. coli in a blind urine specimen, whereas 94 percent did so in a lyophilized specimen (Black and Dorse, 1976). The conclusion is that the current USDA laboratory certification system considerably overestimates the actual performance of veterinary diagnostic laboratories on routine submissions, particularly for difficult, labor-intensive procedures, such as fecal culture for Map, or for procedures with many sources of variability, such as bovine JD ELISA.