APPENDIX F
RELIABILITY OF FIELD WORK
Despite the care with which the field team was selected and trained and
the thorough editing of data, an independent assessment of the
reliability of the field team's work was performed. A consultant who had
assisted in training the field team independently re-abstracted a
subsample of records and compared her results to those initially
compiled by the field team. This Appendix presents the methods and
findings from that activity.
Methods
The original seventy-one hospitals were first divided into four groups,
depending on which field team member had visited each facility. The
hospitals in each group were then categorized according to whether they
were visited during the first or last half of the abstractor's field
work. From each of the resulting eight groups, one hospital was chosen
at random for inclusion in the study. Thus, all seventy-one hospitals
had a possibility of selection. All eight hospitals agreed to a
second site visit by the consultant.
In each hospital one-half the records from the original sample
(approximately thirty-six records) were selected for review. To
accomplish this, the records were equally divided between those in which
no discrepancy had been found between the IOM abstract and the Medicare
record and those in which one or more discrepancies had been found.
Within each group, the assessment records were then chosen at random.
Two hundred eighty-one records were available for analysis. Each new
abstract was assigned a weight to reflect the probability of selection
of both abstract and hospital in the assessment analysis. The results
can be generalized to the universe of all IOM abstracts.
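The weighting step described above can be sketched as follows. The text
does not give the weighting formula, so this sketch assumes the
conventional inverse-probability weight: one hospital is drawn from its
abstractor-by-period group, and records are drawn within a discrepancy
stratum. The function name and the example counts are hypothetical.

```python
from fractions import Fraction

def selection_weight(hospitals_in_group, records_in_stratum, records_sampled):
    """Inverse of the joint selection probability (assumed weighting scheme).

    hospitals_in_group: hospitals in the abstractor-by-period group,
                        one of which is chosen at random.
    records_in_stratum: original-sample records in the hospital's stratum
                        (with or without a discrepancy).
    records_sampled:    records drawn from that stratum for re-abstraction.
    """
    p_hospital = Fraction(1, hospitals_in_group)
    p_record = Fraction(records_sampled, records_in_stratum)
    return 1 / (p_hospital * p_record)

# Hypothetical counts: 9 hospitals in the group, 18 of 50 records drawn.
print(selection_weight(9, 50, 18))  # 25
```

Under this assumption, each re-abstracted record in the example stands
for 25 abstracts in the universe of IOM abstracts.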
The forms and instructions used in the assessment are the same as those
used by the field team (see Appendix D). However, the consultant
was not asked to consult any Medicare claim forms because of time
constraints. The consultant did not know which member of the field team
had done the initial abstracting or whether any discrepancies had
initially been detected. After completing the independent abstracting,
the consultant reviewed the Medicare record (also used by the field team)
and reconciled discrepancies according to the process used by the field
team. The goals of this process were to check whether the IOM
abstracter made a reasonable judgment about the accuracy of the
Medicare data and whether the field team's assessment of the reasons
for discrepancy was plausible.
The independent re-abstracting does not answer definitively the
question of the reliability of the field team's work. The re-abstracting
sample was very small. The perspectives of the consultant and field
team may have been somewhat dissimilar. An alternative assessment
method would have been to have the field team members check on one another,
but this option was precluded by time and budget constraints.
Nevertheless, in a situation where the concept of data accuracy is tenuous at
best (for some abstracted items, there is no clear "right answer"),
the independent assessment was intended to help in determining the
soundness of the basic study data.
Analysis
The analysis involved a comparison of three sources of data: that
generated by HCFA, the IOM abstracters, and the consultant. Special
attention was given to determining whether the field team and
consultant initially abstracted the medical record in a similar manner
and, where there were differences, whether they agreed on the correct
source of data and the reasons for discrepancies.
Table 1 shows that data on dates of admission and discharge and
patient's sex were highly reliable, thus confirming the findings of
the initial analysis. The levels of agreement on the presence of
additional diagnoses and principal procedure were quite high;
however, there was considerably less agreement on principal diagnosis.
The "no discrepancy" figures slightly under-estimate the data
reliability, because they do not include those cases where there
was a discrepancy between the field team member and the consultant,
but the consultant agreed with the re-abstractor's determination
of correct data source.
Table 1. Comparison of Data Abstracted by the Consultant and the Field
Team (weighted percent)

                                           Agreement on correct data
                                           source where a discrepancy
                                           exists
                          No discrepancy   Agree      Disagree     Total
Admit date                     99.6          0.4         -         100.0%
Discharge date                 99.2          0.8         -         100.0
Sex                           100.0           -          -         100.0
Principal diagnosis*           75.8          0.3        23.9       100.0
Presence of additional
  diagnoses                    93.0          0.6         6.4       100.0
Principal procedure            88.4          0.2        11.4       100.0

Note: Unweighted N = 281 abstracts
*compared to four digits
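The three-way breakdown in Table 1 can be illustrated with a short
sketch. It assumes each re-abstracted record carries a selection weight
and a flag saying whether the two abstractors agreed on the correct data
source; the function names, category labels, and example records are
hypothetical, not taken from the study.

```python
def classify(field_value, consultant_value, agreed_on_source):
    """Place one data item into a Table 1 column."""
    if field_value == consultant_value:
        return "no discrepancy"
    return "agree" if agreed_on_source else "disagree"

def weighted_percentages(records):
    """records: iterable of (category, weight) pairs.

    Returns the weighted percent of records in each category.
    """
    totals = {}
    for category, weight in records:
        totals[category] = totals.get(category, 0.0) + weight
    grand_total = sum(totals.values())
    return {c: 100.0 * t / grand_total for c, t in totals.items()}

# Hypothetical records: (field value, consultant value, agreed on source, weight)
raw = [
    ("412.9", "412.9", True, 2.0),
    ("574.0", "574.0", True, 1.0),
    ("250.0", "427.0", False, 1.0),
]
table_row = weighted_percentages(
    (classify(f, c, agreed), w) for f, c, agreed, w in raw
)
print(table_row)
```

With these made-up weights, three quarters of the weighted total falls
under "no discrepancy" and one quarter under "disagree".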
It should be noted that the levels of agreement between the two
data sources in this assessment are not fully comparable with similar
data in the body of the report because of different weighting
factors and differences in the populations to which the statistics
are generalizable. More specifically, for the total data set
examined by the field team, the data were weighted to reflect the
universe of all diagnoses included in the ICDA-8 classification system,
as adopted by HCFA (see page 8). The weighted percent of the sample
devoted to the 15 specific diagnoses was 40.8, while 59.2 percent
of the sample fell into the "all else" category and included those
diagnoses necessary to represent the rest of the universe and permit
the calculation of net and gross difference rates (see Table 7 on page 33).
On the other hand, in the assessment of the field work, the sample was
drawn to be representative of the unweighted data set produced by the
field team. (Because of the small sample size, applying the basic weights
from the total data set to the assessment abstracts would have produced
serious distortions.) With this approach, 75.4 percent of the assessment
abstracts represented the 15 specific diagnoses, while only 24.6 percent fell
into the "all else" category (see Table 4, below). The over-representation
of the specific diagnoses in the assessment means that the two data bases
are not comparable. Accordingly, it is misleading to use the percent of
abstracts with no discrepancy on principal diagnosis between the field
team and consultant as an indication of the overall quality of the field
team's work. The fact of variation is apparent, however. For that
reason further analyses were done to try to determine the extent of
differences between the consultant and field team and the underlying
reasons.
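The effect of the different sample composition can be shown with a small
calculation: an overall agreement rate is a share-weighted average of the
rates in each category, so the same category-level rates give different
overall figures under different mixes. The shares below come from the
text; the category-level agreement rates are hypothetical and chosen only
for illustration.

```python
def overall_rate(shares, stratum_rates):
    """Overall agreement rate as a share-weighted average of stratum rates."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(shares[k] * stratum_rates[k] for k in shares)

# Shares of the two samples, taken from the text above.
full_study = {"specific": 0.408, "all_else": 0.592}
assessment = {"specific": 0.754, "all_else": 0.246}

# HYPOTHETICAL agreement rates by category, for illustration only.
rates = {"specific": 0.70, "all_else": 0.90}

print(round(overall_rate(full_study, rates), 4))
print(round(overall_rate(assessment, rates), 4))
```

Under these assumed rates, the assessment mix yields a lower overall
figure than the full-study mix even though agreement within each
category is identical.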
Where there were discrepancies between the field team and consultant,
quite often both also disagreed with the Medicare record (see Table 2).
This occurred for about fifty percent of the principal diagnoses and
about forty-one percent of the principal procedures. In other words,
each of the three data sources contained different pieces of information
all based on the same patient medical record. For the remaining cases
of discrepancies between the field team and consultant, agreement
between the field team and the Medicare record was more likely than
agreement between the consultant and the Medicare record.
Table 2. Data Source in Agreement with Medicare Record when Discrepancies
were Found Between the IOM Abstract and Assessment Abstract

Data item              Assessment   IOM abstract   Neither   Total
Principal Diagnosis       17.2          33.7         49.1    100.0%
Principal Procedure                                  40.8    100.0
To determine whether the sampling categories were related to varying
levels of data accuracy, the diagnoses were grouped according to their
reason for inclusion in the sample--entire Diagnosis Related Groupings
and specific and residual diagnoses within the DRGs (see Table 3).
The sampling categories were not influential (confirming the findings
of Chapter 3); the level of coding refinement was. Diagnoses compared
to four digits were least accurate; AUTOGRP comparisons were most
accurate.
Table 3. Comparison of Data Abstracted by the Consultant and the Field
Team by Sampling Categories and Level of Coding Refinement
(weighted percent)

                                Percent with no discrepancy
Sampling Category               AUTOGRP   Three-digit   Four-digit
Entire DRGs* (N = 215)            85.8       82.3          79.0
Specific diagnoses (N = 187)      85.4       81.4          77.7
Residual diagnoses**

*Excludes abstracts with a diagnosis listed in the "all else" category
in Appendix C.
**Results are not presented because of the small number of cases.
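The three-digit and four-digit comparisons in Table 3 can be sketched as
a truncated-code match. This sketch assumes codes are written as three
digits, a decimal point, and a fourth digit; the code pairs and weights
are made up for illustration. The AUTOGRP comparison is not shown
because it depends on a grouping scheme not reproduced here.

```python
def truncate_code(code, digits):
    """Return the first `digits` digits of a code, ignoring the decimal point."""
    return code.replace(".", "")[:digits]

def codes_agree(code_a, code_b, digits):
    """True when two codes match at the given level of refinement."""
    return truncate_code(code_a, digits) == truncate_code(code_b, digits)

def percent_no_discrepancy(pairs, digits):
    """pairs: list of (field_code, consultant_code, weight) tuples."""
    total = sum(w for _, _, w in pairs)
    agree = sum(w for a, b, w in pairs if codes_agree(a, b, digits))
    return 100.0 * agree / total

# Hypothetical code pairs and weights.
pairs = [
    ("412.9", "412.0", 1.0),  # same at three digits, different at four
    ("574.0", "574.0", 1.0),  # same at both levels
    ("250.0", "427.0", 2.0),  # different at both levels
]
print(percent_no_discrepancy(pairs, 3))  # 50.0
print(percent_no_discrepancy(pairs, 4))  # 25.0
```

A coarser comparison can only keep or raise the agreement rate, which
matches the ordering of the columns in Table 3.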
Diagnostic-specific discrepancy rates are not presented because the
numbers of abstracts per diagnosis are so small. However, Table 4
shows the distribution of discrepancies between the field work and
assessment by diagnosis. The assessment confirms the finding that
reliability is lowest for chronic ischemic heart disease.
Where both the field team and assessment data differed from that on
the Medicare record but agreed with each other, the extent of
agreement on reasons selected to explain discrepancies with the Medicare
record was also examined. The possibly subjective nature of this
assessment and the need to apply judgment in selecting from the
several options were noted in Chapter 2. Because of the sizable
number of options and the small number of abstracts reviewed in
the assessment, only the general categories of reasons for discrepancy
are considered here. Table 5 shows the extent to which the field
team and consultant agreed that the reasons for discrepancy between
Table 4. Distribution of Discrepancies between the Field Work and
Assessment by Diagnosis

                                Weighted percent   Unweighted      Weighted percent
                                of the total       number of       of the total number
                                number of          abstracts       of abstracts in
                                discrepancies      with            the assessment
                                for each           discrepancies   for each
                                diagnosis                          diagnosis
Chronic ischemic
  heart disease                      14.1               7               6.9
Cerebrovascular
  disease                            12.2               7               6.3
Fracture, upper neck
  of femur                            -                 0               6.5
Cataract                              -                 0               3.3
Acute myocardial
  infarction                          5.7               4               9.4
Inguinal hernia without
  mention of obstruction              -                 0               4.5
Diabetes                              8.1               4               5.2
Hyperplasia of the
  prostate                            7.2               4               5.7
Bronchopneumonia-
  organism not
  specified and
  pneumonia-organism
  and type not
  specified                           -                 0               9.8
Cholelithiasis/
  cholecystitis                       2.8              12               4.4
Intestinal obstruction
  without mention of
  hernia                              4.5               6               4.0
Congestive heart
  failure                             4.2               3               2.0
Diverticulosis of
  intestine                           4.7               2               4.0
Bronchitis                            -                 0               1.1
Malignant neoplasm of
  bronchus and lung                   -                 0               5.6
All Else                             36.5              19              24.6
Total                               100.0%             68             100.0
the original abstract and the Medicare record stemmed from difficulties
in deciding which diagnosis or procedure was principal (an ordering
discrepancy) or from errors in assigning the proper diagnostic
or procedural code number (a coding discrepancy). "General agreement"
means that both chose the same general category, but may have
selected different specific reasons. For example, one may have decided
that the reason was "coding completeness" while the other selected
"coding judgment." Where there was complete agreement, they selected
the same general and specific reasons. Where the consultant disagreed
with the Medicare record, the reason selected was similar to the
one chosen by the field team for about 75 percent of the diagnostic
discrepancies (compared to four digits). There was either complete
or general agreement between the two on all reasons to explain
discrepancies on principal procedure.
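The three levels of agreement defined above can be expressed as a simple
rule. This sketch assumes each reason is recorded as a (general category,
specific reason) pair; the function name and labels are hypothetical.

```python
def reason_agreement(field_reason, consultant_reason):
    """Compare two (general category, specific reason) pairs."""
    general_a, specific_a = field_reason
    general_b, specific_b = consultant_reason
    if general_a != general_b:
        return "complete disagreement"
    if specific_a == specific_b:
        return "complete agreement"
    return "general agreement"

# The example from the text: same general category, different specific reason.
print(reason_agreement(("coding", "completeness"), ("coding", "judgment")))
# general agreement
```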
Table 5. Agreement between the Consultant and Field Team on Reasons
for Discrepancies when both Disagreed with the Medicare
Record Data (weighted percent)

                          Diagnosis (4 digit)   Procedure
Complete agreement               36.2              40.7
General agreement                39.1              59.3
Complete disagreement            24.7               -
Total                           100.0%            100.0%
(Unweighted N)                   (67)              (49)

Summary
The reliability of the Institute of Medicine field work was assessed by
comparing data provided by HCFA, the IOM abstracters who performed the
field work, and the consultant who performed the assessment. The
results of the assessment confirm both the findings and the caveats
reported in Chapter 3.
Data were most reliable for information on hospital admission date,
discharge date, and sex. The indication of whether additional
diagnoses are present was reported with a high level of reliability.
Some difficulty was encountered in conclusively determining which
diagnosis or procedure should be regarded as "principal." The reliability
of diagnostic data varied, depending on the level of coding refinement
and the specific diagnosis. Overall, agreement between the field team
and consultant on principal diagnosis ranged from 75.8 percent with
four-digit comparisons to 85.8 percent with AUTOGRP. These figures should
not be compared with findings in the body of the report because of
different weighting factors and sampling approaches. In some cases
all three data sources contained distinctly different notations
for principal diagnosis. Reliability was lowest for chronic
ischemic heart disease. For principal procedure the level of
agreement reached eighty-eight percent. Where both the field team
member and assessment consultant disagreed with the Medicare record,
they generally agreed on the reasons for about seventy-five percent
of the discrepancies on principal diagnosis and for all the discrepancies
with principal procedure.
These findings should be tempered by the limitations of the assessment.
The sample size was very small. The time available for the assessment
was limited.