Copyright © 2009. National Academy of Sciences. All rights reserved.
Appendix 11
Types of Errors Common in
Machine Translation
Two studies have recently been made of the types of errors made
in mechanical translation. The first study was very kindly made
available to the Committee by the IBM Thomas J. Watson Research
Center, Yorktown Heights, New York. By counting and classifying
the corrections made by posteditors, this study determined the
types and frequency of errors found in the output of four machine
translations (Russian to English).
GENERAL CLASSIFICATION AND PERCENTAGE
OF ERRORS OF ARTICLE I
Total number of words: Approximately 1,200
                                             No.      %
Transliterated words
Multiple meanings and ambiguities             96    8.0
Word order rearranged                         23    2.0
Miscellaneous insertions and corrections      45    3.6
Total                                        164   13.6

GENERAL CLASSIFICATION AND PERCENTAGE
OF ERRORS OF ARTICLE II
Total number of words: Approximately 1,200
                                             No.      %
Transliterated words                           6    0.5
Multiple meanings and ambiguities            132   11.0
Word order rearranged                         17    1.4
Miscellaneous insertions and corrections      77    6.4
Total                                        232   19.3
GENERAL CLASSIFICATION AND PERCENTAGE
OF ERRORS OF ARTICLE III
Total number of words: Approximately 1,700
                                             No.      %
Transliterated words                          17      1
Multiple meanings and ambiguities            143      9
Word order rearranged                         36      2
Miscellaneous insertions and corrections     122      7
Total                                        318     19

GENERAL CLASSIFICATION AND PERCENTAGE
OF ERRORS OF ARTICLE IV
Total number of words (including individual
digits and symbols in all formulas): Approximately 1,600
                                             No.      %
Transliterated words                           1
Multiple meanings and ambiguities             87    5.8
Word order rearranged                         14    0.9
Miscellaneous insertions and corrections     436   29.0
Total                                        538   35.7
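The percentage columns in these tables appear to express each count of corrections as a share of the article's approximate word total. The following is a minimal Python sketch under that assumption, using the Article II figures (6, 132, 17, and 77 corrections in approximately 1,200 words). The helper name `percent_of_words` is hypothetical and is not part of either study.

```python
# Counts copied from the Article II table. The word total is the report's
# approximate figure, so the resulting percentages are approximate as well.
TOTAL_WORDS = 1200

article_ii_counts = {
    "Transliterated words": 6,
    "Multiple meanings and ambiguities": 132,
    "Word order rearranged": 17,
    "Miscellaneous insertions and corrections": 77,
}


def percent_of_words(count, total_words):
    """Return corrections per 100 words of text, rounded to one decimal.

    Assumption: the report's % column is count / total words * 100.
    """
    return round(100.0 * count / total_words, 1)


percentages = {
    name: percent_of_words(count, TOTAL_WORDS)
    for name, count in article_ii_counts.items()
}
total_count = sum(article_ii_counts.values())
total_percent = percent_of_words(total_count, TOTAL_WORDS)

# Print the rows in the same layout as the table: label, No., %.
for name, count in article_ii_counts.items():
    print(f"{name:<42}{count:>6}{percentages[name]:>7.1f}")
print(f"{'Total':<42}{total_count:>6}{total_percent:>7.1f}")
```

Under this assumption the Article II column is reproduced exactly. The other articles agree only roughly, which is unsurprising when the word totals are themselves approximate.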
The second study was made by Arthur D. Little, Inc., and was
done in a manner similar to the IBM study. That is, machine
translation output was postedited and the errors were classified
and counted. From the study, the A. D. Little group was able to
tell the percentage of total corrections made in each category.
The original consisted of approximately 200 pages of scientific
Russian. One set of approximately 100 pages was edited by two
different editors. The second set contained "approximately 100
pages from seven MT articles edited by at least four different
editors."*
*An Evaluation of Machine-Aided Translation Activities at F.T.D., Contract
AF 33(657)-13616, Case 66556, May 1, 1965, p. ~10.
PERCENTAGE OF TOTAL CORRECTIONS COUNTED*
Error                                      %    Category %
Word omission                                      34.74
   A. Articles                          18.76
   B. Others                            15.98
Wrong words                                        25.58
   A. Prepositions                       3.78
   B. Verb tense, voice, suffix          5.56
   C. Others                            16.24
Russian left in                                     4.48
Choice                                             11.74
   A. Choice of two                      8.17
   B. Choice of two, both wrong          3.57
Unnecessary word                                    3.09
Symbol                                              4.5
Phrase not interpreted                              3.14
Word order                                         12.73
Total Number of Corrections: 7,573
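The figures 34.74, 25.58, and 11.74 in the percentage column read as category totals for word omission, wrong words, and choice. The sketch below is a rough check of that reading: it confirms that the lettered subcategories add up to those totals and that the eight categories together account for all corrections. The function names are hypothetical. The converted counts are estimates derived from the stated total of 7,573 corrections, since the study reports percentages only.

```python
from math import isclose

TOTAL_CORRECTIONS = 7573

# category -> (category %, {subcategory: %}), copied from the table.
categories = {
    "Word omission": (34.74, {"Articles": 18.76, "Others": 15.98}),
    "Wrong words": (25.58, {"Prepositions": 3.78,
                            "Verb tense, voice, suffix": 5.56,
                            "Others": 16.24}),
    "Russian left in": (4.48, {}),
    "Choice": (11.74, {"Choice of two": 8.17,
                       "Choice of two, both wrong": 3.57}),
    "Unnecessary word": (3.09, {}),
    "Symbol": (4.5, {}),
    "Phrase not interpreted": (3.14, {}),
    "Word order": (12.73, {}),
}


def subtotals_consistent(table, tol=0.005):
    """True if every category % equals the sum of its subcategory %."""
    return all(
        isclose(sum(subs.values()), pct, abs_tol=tol)
        for pct, subs in table.values()
        if subs
    )


def estimated_count(pct, total=TOTAL_CORRECTIONS):
    """Estimate a number of corrections from a percentage of the total.

    This is an approximation: the study does not give per-category counts.
    """
    return round(total * pct / 100)


grand_total = sum(pct for pct, _ in categories.values())

print("Subtotals consistent:", subtotals_consistent(categories))
print("Categories sum to 100%:", isclose(grand_total, 100.0, abs_tol=0.005))
for name, (pct, _) in categories.items():
    print(f"{name:<26}{pct:>7.2f}%  about {estimated_count(pct)} corrections")
```

Working from percentages keeps the check independent of how the editors tallied individual corrections. The estimated counts are useful only for a sense of scale.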