Copyright © 2012. National Academy of Sciences. All rights reserved.
Machine Translation:
From a Translation to a Communications
and Information Challenge
Developmental work on machine translation has been under way for more
than 30 years. Some say that we have come a long way, while others question
whether the goal is in sight. The reality is that perceptions of what machine
translation is (its purpose and scope) are shifting. The paradigm that dominated
thinking during most of this 30-year period of research and development (R&D)
was the expectation that computer technologies could be developed to build
machines capable of doing the work of human translators. Today, machine
translation, defined as translation generated by a computer, with or without
human intervention, embraces a broad spectrum of technologies. Included are
machine translation systems that run on large mainframes and those that run on
stand-alone personal computers, enhanced with automatic aids for the human
translator.1 Rather than eliminating human translators, machine translation and
related technologies are now seen as ways of facilitating their work.2
The focus of the symposium organized by the Office of Japan Affairs and the
Computer Science and Technology Board was machine translation from Japanese
to English. Growing interest in machine translation between this language pair
1 This report deals with computer-based translation, including machine translation systems and
machine aids for translators.
2 The concept includes machine translation systems that are used primarily for information scanning,
where little or no post-editing is done. Such a system can in some circumstances serve as a
mechanical translator for a monolingual researcher. Some of the commercial systems available today
are beginning to see this kind of usage in limited circumstances.
reflects recognition of the importance of science and technology information
produced in Japan. In order to understand the special challenges of Japanese to
English machine translation, a comparatively new field of R&D, it is important to
appreciate efforts in machine translation involving other languages.
CHANGING CONTEXT FOR MACHINE TRANSLATION
Machine translation technology development has taken on broader
significance in an age of rapid international communication and intense market
competition. Competition for global markets has intensified the need for
companies to get their messages across to overseas customers who speak foreign
languages. Companies doing business around the world must be able to speak
the languages of their customers. Some large companies have targeted translation
technologies as a component of their competitive strategies: IBM sees translation
as a "eating" items for its marketing objectives in the 1990s; Xerox emphasizes
the importance of machine translation in launching products simultaneously in
multiple markets. As the importance of global markets grows and cycle times
shorten for the introduction of new products, translation increasingly becomes an
expensive bottleneck for international companies.
Another, related explanation for changes in perspectives on machine
translation is the information explosion. More specifically, U.S. businessmen,
researchers and product developers, and policymakers need a better
understanding of what is going on in Japan, because that country has now joined
the ranks of the world's technological leaders. The need for translations of
Japanese technical documents is now apparent, but only a minute fraction of our
science and technology community can speak and read Japanese. Growing
recognition of the importance of technical information produced in Japan has
stimulated interest in the role that machine translation might play in making it
possible for Americans to access reports of new inventions, products, and
financial developments in Japan.
Changes in thinking about machine translation also reflect the evolution of
new concepts of how machine translation systems might be developed and used.
Progress in natural language processing technology, the development of more
powerful computers, the increasing availability of large, information-laden
dictionary data sets, and advances in some aspects of linguistic theory suggest
opportunities for R&D. Translations can be delivered through electronic mail
and quickly incorporated in successive editions of technical manuals. Interest in
"machine-aided translation" has been spurred by the development of workstations
3 Translation of documents is a prerequisite for entering overseas markets.
with dictionaries and other tools that can be used by professional translators. The
expectation that translation machines will replace people has now been
transformed into the view that these technologies are tools to enhance the efforts
of professional translators, researchers, and secretaries.
Today the challenges of machine translation development illustrate the
broader challenges of information technology research, development, and use.
Machine translation technologies pose a range of theoretical, software, hardware,
and even sociological problems that require integration of technologies and
improved interactions among developers and users. For all these reasons,
machine translation today is more than a linguistics problem. It is a
communications and information challenge that demands a diverse range of
expertise and resources.
SETTING AND MEETING GOALS FOR MACHINE
TRANSLATION DEVELOPMENT
The dream that stimulated early R&D efforts was a machine that produces
high-quality translations from a wide variety of source texts at low cost. Even the
most ardent supporters of machine translation agree that three decades of effort
have not produced the breakthroughs necessary to achieve this dream.
Why the dream remains unfulfilled is the subject of some disagreement.
Viewed from one angle, the failure to achieve the goal is a result of giving up too
early. Negative evaluations of machine translation in the 1960s were based on
the argument that the understanding of text by computer was too difficult,
rendering machine translation infeasible.4 The ALPAC report by the National
Research Council concluded that the basic technology for machine translation
had not been developed, and recommended a focus on long-term research in
computational linguistics and improvement of translation methods. While the
report made no recommendations with regard to funding for research and
development on machine translation, the overall negative evaluation of the state-
of-the-art is now seen by many as a major cause of the subsequent decline in
funding for such research in the United States. Between 1960 and 1970, funding
for R&D on machine translation declined from about $10 million to $1 million.
4 This argument was made by Bar-Hillel. The ALPAC report written by the Automatic Language
Processing Advisory Committee of the National Research Council, entitled Language and Machines:
Computers in Translation and Linguistics, is widely seen as the most influential of the studies of
machine translation in the 1960s. Published in 1966, the report concluded that the quality of machine
translation was poor and cost savings had not been achieved. The report analyzed the products of
machine translation at U.S. government agencies after development work had been under way for
more than ten years.
Supporters of machine translation say that we would be closer to the dream if we
had not given up so soon.5
Viewed from another perspective, however, the fact that the initial dream has
not been fulfilled is no reason to dismiss the promise of machine translation. The
machine translation "heaven" of high-quality, low-cost, general-purpose systems
is still distant.6 The high-quality systems that exist today are in most cases
special-purpose systems working in restricted domains, but none of these are for
the translation of Japanese to English. (See Figure 1.) In addition, there are also a
number of cost-effective systems that operate in broader domains. Among them,
Systran is the only company to have developed a general-purpose commercial
system that translates between Japanese and English, as well as 14 other language
pairs. A dozen or more prototype machine translation systems have not been able
to attain cost effectiveness after a decade of development.
The process of machine translating from Japanese to English and vice versa is
comparatively difficult because of important differences in the structures of the
two languages. Typical Japanese text consists of Chinese characters and two
different styles of Japanese phonetic symbols. These characters and symbols are
written without any spaces between individual words, and phrases are rarely
separated by punctuation. Grammatically, Japanese differs from English in that it
does not distinguish singular from plural nouns, has no articles, and often omits
subjects. In general, the Japanese sentence puts the verb at the
end, and the text preceding the verb is in no particular order and contains many
compound clauses. The grammatical structure of Japanese may omit pronouns,
subjects, and objects, so that the context must be understood in order to choose
among alternative possible interpretations. Because of these and many other
characteristics of the language, pre-editing is especially helpful to make the text
more tractable for machine translation processing.
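The mixture of scripts and the absence of word spacing described above can be illustrated with a short sketch. This is only an illustration under stated assumptions, not a description of how any system discussed in this report works: the function names script_of and script_runs are hypothetical, the sample sentence is invented for the example, and the script test relies on three coarse Unicode block ranges (Hiragana, Katakana, and CJK Unified Ideographs).

```python
from itertools import groupby


def script_of(ch: str) -> str:
    """Classify one character by Unicode block (a coarse approximation)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"   # first Japanese phonetic script
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"   # second Japanese phonetic script
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"      # Chinese characters
    return "other"


def script_runs(text: str) -> list:
    """Split text into maximal runs of consecutive characters in one script."""
    return [(script, "".join(chars))
            for script, chars in groupby(text, key=script_of)]


# Invented sample: "use a machine translation system", written without spaces.
sample = "機械翻訳システムを使う"
runs = script_runs(sample)

for script, chunk in runs:
    print(script, chunk)
```

Splitting at script boundaries is a heuristic and not true word segmentation. In this sample the verb 使う is a single word, yet it is divided into a Chinese character and a phonetic ending, which is one reason dictionaries and context are needed to find word boundaries.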
Most of the Japanese to English systems now in use in Japan succeed because
they are limited to particular domains. In the eyes of many, these systems
represent a significant step forward, even if they do not fulfill the initial dream.
5 In July 1989 the Japan Electronic Industry Development Association published a report entitled: A
Japanese View of Machine Translation in Light of the Considerations and Recommendations
Reported by ALPAC, U.S.A. This report argues that two major conclusions of the ALPAC report are
no longer valid: the claim that there is no translation shortage is refuted by estimates of today's
translation market in Japan, and numerous examples of successful machine translation are also cited
in response to ALPAC's conclusion that it will have no practical use in the near future. The Japan
Electronic Industry Development Association's Machine Translation System Research Committee,
which prepared the report, was chaired by Dr. Makoto Nagao of Kyoto University and included
representatives from Japanese corporations involved in machine translation development.
6 Some argue that the major explanation for the failure to reach the target is that there is a much
bigger difference between general language and domain-specific language than has heretofore been
suspected.
FIGURE 1 Machine translation quality/function matrix. Matrix labels: High-Quality,
Cost-Effective, Prototype, and Demonstration Systems, against Special-Purpose,
Interactive, Constrained, Domain-Specific, and General-Purpose Systems. SOURCE:
Bernard Scott, Logos Corporation. See the legend to Figure 1.
The argument may be rephrased as follows: Japanese to English machine
translation systems can be used effectively for carefully targeted purposes and it
will only be possible to improve these systems if we are willing to put resources
into development and experimentation. When we also consider the growth of
machine aids for human translators, such as dictionaries and other composition
tools, it is clear that machine translation technologies have practical uses today,
even if the dream of a fully automated, high-quality, low-cost, general-purpose
system remains over the horizon.
Given this picture of promise and problems associated with machine
translation, we need to examine the challenges that lie ahead for U.S. industry
and the U.S. government. One set of challenges is commercial and relates to the
fact that while a number of Japanese companies are working on Japanese to
English machine translation systems, little similar work is going on in the United
LEGEND TO FIGURE 1
ALPAC Russian-English machine translation systems being developed under U.S.
Department of Defense funding prior to the publication of the ALPAC Report in
1966, which led to their discontinuation.
ALPNET Interactive multilingual translation aid, developed in the United States. Previously
known as ALPS for Automated Language Processing System.
EEC European Economic Community.
EUROTRA I Large-scale multilingual machine translation prototype effort sponsored by the
European Community. Eurotra I is intended to lead to a full-scale industrialized
version, Eurotra II, encompassing 72 language pairs.
FTD Russian-English and German-English SYSTRAN-based machine translation
systems of the Foreign Technology Division of the U.S. Air Force. FTD's
installation in the late 1960s represented the first operational use of machine
translation.
GETA Machine translation system for various language pairs entailing French, developed
at the University of Grenoble, France. (GETA - Groupe d'Etudes pour la
Traduction Automatique.)
LOGOS Multilingual machine translation system developed and marketed by Logos
Corporation, a U.S. company. Language pairs are: German-English, -French,
-Italian; English-French, -Spanish, -German, and -Italian.
METAL German-English, English-German machine translation system originally developed
at the Linguistic Research Center of the University of Texas and later under joint
development with Siemens AG in Munich.
METEO English-French machine translation system developed for the nationwide weather
communications network of the Canadian Meteorological Center. METEO is a
derivative of the TAUM system developed at the University of Montreal.
PAHO English-Spanish/Spanish-English machine translation systems of the Pan American
Health Organization designated as SPANAM and ENGSPAN, respectively.
SYSTRAN Machine translation system for many language pairs, developed in the U.S. and
controlled by French and Japanese interests. SYSTRAN is a much enhanced
derivative of the early Georgetown Automatic Translation (GAT) system.
TAUM Aviation An English-French adaptation of TAUM technology to a specific subject matter
domain or language subset, developed at the University of Montreal. (TAUM -
Traduction Automatique de l'Université de Montréal.)
J-E/E-J Japanese-English/English-Japanese machine translation systems. There are about a
dozen such systems in Japan, either on the market or being prepared for the market.
XEROX English-multitarget machine translation system developed for XEROX Corporation
by the developers of SYSTRAN, and used by XEROX to translate technical
manuals written in controlled English.
States. A second set of challenges includes technical problems: issues relating
to the choice of R&D focus and problems in evaluating system performance. The
third set of challenges may prove to be the most pressing: defining R&D policy
(either at the company or the U.S. government level). Without a clear consensus
on what constitutes the "state-of-the-art," or comprehensive data on market
prospects and user needs, forging an appropriate policy response is not an easy
task.
The sections that follow address each of these three sets of challenges in turn,
highlighting areas of uncertainty and issues that deserve further study and debate.