Copyright © 2009. National Academy of Sciences. All rights reserved.
The Commercial Challenges
CURRENT STATUS OF MACHINE TRANSLATION DEVELOPMENT
How, where, and by whom are machine translation technologies and machine
aids for translation being developed? Who is using them and for what purposes?
Before examining Japanese to English machine translation as it is being
developed and used in the United States and Japan, it is helpful to review the
broader context of machine translation development throughout the world.
Machine translation development has advanced rapidly in the past few years,
and while numerous challenges remain, machine-assisted translation is no longer
a dream, but is actively and increasingly used around the world. Companies and
governments are developing and/or using machine translation technologies in the
Americas, Europe, Japan, and in the newly industrializing countries of Asia.
Machine translation and machine-aided translation are now being used in
organizations as diverse as translation bureaus, multinational corporations, and
government defense departments. These technologies help to scan information,
collect intelligence, translate product manuals for export, and improve the
efficiency of professional translators. The overview below provides a basis for
understanding the organizational, political, financial, and societal reasons for
differences in their development and use in the United States and Japan.
The major systems in use today are not, however, aiding translation from
Japanese to English; most are performing what is considered by many the easier
task of translating from one Western language to another. Many examples can be
found in the public or government sector. The Pan American Health
Organization, for example, uses its internally developed system to translate health
and agricultural information between English and Spanish. The U.S.
government, including the Air Force's Foreign Technology Division, has long
used Systran to translate Russian, German, and French into English for
intelligence purposes. The government is currently supporting Systran's
development of a Japanese language system. Similarly, the Canadian
government uses the Logos system for translating English to French at the
Departments of National Defense and State, among others, and uses METEO to
translate weather forecasts. The Smart Translator is used to announce job
openings. The European Community is supporting a large initiative aimed at
developing a system to translate among all the official languages of the
Community.
In the private sector, manufacturing companies are using machine translation
technology to produce product manuals and translation companies are using it to
improve the volume and efficiency of translation efforts. Xerox, for example,
uses the Systran machine translation system to translate its photocopier product
manuals for distribution throughout the world. A European computer maker,
Nixdorf, uses the Logos system to achieve the same end. Translation companies
such as Lexitech in Canada also use Logos, and ALPNET has developed a
worldwide network of translation services based on its own internally developed
interactive and machine-aided software. METAL has a number of clients in
Europe and ATAMIRI is used by Wang for the translation of product manuals
into several languages.
There are a few examples of machine translation technology used for Asian
languages. Logos, one of the oldest U.S. machine translation companies, got its
start in 1971 when it developed an English to Vietnamese system for the U.S.
Defense Department. Currently, the Defense Department is funding machine
translation from and into Korean.
It can be seen from the examples above that the world of machine translation
spans the globe and includes efforts in many language pairs for a variety of uses.
Some of the oldest, most successful, and well-established machine translation
companies are American or are based on technology developed in the United
States. Systran is perhaps the grandfather of U.S. machine translation companies
and its efforts in language pairs other than those involving Japanese continue to
be U.S.-based. The next oldest commercial machine translation system is that of
Logos, also an American company. One of the best-known Japanese machine
translation companies, Bravice International, which claims to have sold 4,500
software units, bought its technology in the United States through the acquisition
of a U.S. firm. It is useful to consider this context as one examines the more
specific case of Japanese to English machine translation.
Who, then, are the developers of machine or machine-aided translation
between Japanese and English? The major commercial efforts in the United
States are being conducted by a handful of companies. As mentioned above,
Systran, with U.S. government support, is supporting the Japanese-English
combination; it should be noted, however, that Systran's commercial Japanese
efforts have been Japanese-owned. Some large U.S. corporations, including IBM
and DEC, are pursuing internal research in this field, but their Japanese to
English efforts are located in Japan.7 IBM's English to Japanese machine
translation system, developed for internal use in the corporation's Tokyo
facilities, is currently in operational use for the translation of computer manuals.
There are some small-scale efforts, experiments in assembling and using
machine translation technologies for very limited domains. Examples are Smart
Communications and a small ongoing effort at the Veterans Administration
Medical Center in Baltimore that uses public domain software to translate a
narrow range of medical documents between Japanese and English. Efforts in
machine-aided translation development are currently underway at ALPNET and
LinguaTech, both of which are based in Utah and originated at Brigham Young
University. Logos, which continues to invest heavily in machine translation
development, has conducted research in Japanese but has never undertaken
development of a Japanese system.
There is also considerable theoretical research under way among
computational linguists at U.S. universities, such as work at Carnegie Mellon
University's Center for Machine Translation and the Linguistics Research Center
at the University of Texas, and the sub-language approach pursued cooperatively
by Hunter, Monmouth, and N.Y.U. If basic research on natural language
processing and computational linguistics is taken into account, the United States
still maintains a significant research effort.8 What is lacking in the United States
is a strong development effort on Japanese to English machine translation.
The world of Japanese to English machine translation development in Japan
offers a sharp contrast to that in the United States. Every major Japanese
computer or electronics firm has invested considerable effort in machine
translation research and development and many claim to have introduced
workable systems. Without evaluating quality, it is nonetheless significant that
there are at least twenty operational9 systems in Japan that translate from English
to Japanese. Operational systems in Japan that translate from Japanese to English
7 At its Tokyo Research Lab, IBM is working on a Japanese to English machine translation system for
use in translating newspaper editorials and economic materials, which is now in a research prototype
stage.
8 It should be noted, however, that research in computational linguistics that is not related to machine
translation will not necessarily contribute to machine translation. On the other hand, the case can be
made that work on machine translation serves as a test bed and stimulus for other kinds of natural
language processing investigation.
9 There is considerable ambiguity about what constitutes an "operational" or "usable" system. The
sheer number of Japanese developers who even claim to have an operational system is nonetheless an
indication of the comparatively strong interest and resources devoted to machine translation in Japan.
are available from NEC, Fujitsu, Oki Electric, Bravice International, Sharp,
Toshiba, Hitachi, and Sanyo Electric to mention a few of the companies.
Japanese to English systems are also under development at NTT, Mitsubishi
Electric, KDD, and Toshiba.
In addition, the Japanese government has supported an important effort in
machine translation development. This effort, which involves the Ministry of
International Trade and Industry's (MITI) Electrotechnical Lab, the Science and
Technology Agency's (STA) Japan Information Center of Science and
Technology (JICST), and Kyoto University, was started in 1982. The Electronic
Dictionary Research Project conducted by MITI in connection with the Fifth
Generation Computer Project aims at the development of a detailed dictionary
with more than 200,000 words and multiple usages.10 Supported by the Ministry
of Post and Telecommunications, research is underway at ATR on speech
translation telephony.11 Technology transfer to industry has been made possible
throughout these projects via industrial participation. The projects feed into the
effort at JICST for translating science and technology abstracts. These projects
are developmental vehicles that spin off nationwide results; their continuity and
commercial emphasis build capability in the companies.
Who, then, is using Japanese to English machine translation? If the typical
pattern is for translation to be done in the country of the target language, then
Japanese to English machine translation should be done most efficiently in the
United States, rather than Japan. At the present time, however, this is not the
case. Most Japanese to English machine translation is conducted in Japan. The
fact that many Japanese computer makers have developed their own machine
translation capability reflects their orientation toward product exports and their
need to control the quality of translated manuals. It also reflects an
understanding of the importance of machine translation technology and its
possible spin-offs to the information industry as a whole.
Japanese-developed Japanese to English systems are not widely used in the
United States. Lack of compatibility between hardware and software is one of
the impediments. Microelectronics and Computer Technology Corporation
(MCC), a private U.S. microelectronics and computer science consortium, is a
relatively new user of Japanese to English machine translation (MCC is using a
Japanese-developed system) and has begun to use the technology in a particularly
forward-looking manner that will be discussed in more detail below. The
University of Wisconsin's Biotechnology Center planned to use a Bravice system
10 The 10-year EDR project that began three years ago includes an English dictionary, a Japanese
dictionary, and a neutral dictionary that connects the Japanese and English dictionaries.
11 See Hitoshi Iida, "Advanced Dialogue Translation Techniques," ATR Interpreting Telephony
Research Laboratories, ATR Symposium on Basic Research for Telephone Interpretation, December
11-12, 1989.
on an experimental basis to scan and translate Japanese language databases on
biotechnology, but has been unable to do so because of the high cost of the
necessary hardware. Currently, there is no U.S.-developed Japanese to English
machine translation system on the market. The U.S. defense and intelligence
community will, in all likelihood, be the first major user of Systran's Japanese to
English system when it is fully operational.
MARKET PROSPECTS
Today, the volume of machine-translated documents remains comparatively
small. In contrast, some Japanese believe that the annual market for all translated
materials in 1988 was about 800 million yen, and that the quantity of translation
will double over the 1990-1992 period. One developer even estimates that by
the year 2000 there will be 500,000 to 2 million machine translation systems in
use throughout the world,13 assuming that substantial improvements are made in
intermediate systems that can run on small personal computers. According to this
vision, international businessmen will need small computers with built-in
machine translation systems.
At the same time, Japanese experts note that large Japanese companies
working on machine translation do not believe that this business will yield great
profits, at least in the short term. They do, however, see machine translation as a
mechanism for learning more about natural language processing technology in
general, which they judge to be a key technology in the next century.
Despite the interest in machine translation technology, profits in the United
States and Europe are very slim, if there are any at all. While there are
companies developing systems for internal use, the independent developers are
(as noted above) few in number. This contrast with the situation in Japan may be
explained in a number of ways. Critics of machine translation argue that the
products are based on dated technology. They argue that translators harbor
serious doubts about machine translation on quality grounds, even if translators
have a hard time quantifying the concept of quality. Translation, particularly in
Europe, is poorly integrated with office and publishing computer environments
where the potential benefits of machine translation could be substantial.
From a Japanese viewpoint, however, it is possible to create demand. Viewed
from this angle, the more machine translation systems are made available and put
into practical use, the more the demand will grow. Japanese industry and
government are willing to plow large investments into Japanese to English
12 See Japan Electronic Industry Association, A Japanese View of Machine Translation . . ., op. cit.,
p. 5 and Appendix 5.
13 Many experts see this as a very high estimate.
machine translation. It is estimated that a four- to five-year effort involving 50 to
70 people is needed to develop a general-purpose mainframe system for delivery,
at a cost of $13 million for the entire period.14 This is, however, only the
beginning of the investment that is necessary. After delivery of the system,
considerable resources must be invested in maintaining and improving it in
response to user complaints and needs. In short, the required investments are so
large that most companies find it impossible to recover costs by selling only
hundreds of systems. This is a primary reason for Japanese government support
not only of researchers, but also of commercial developers, although there are
other important reasons such as reducing the language barrier between Japan and
other countries and disseminating Japanese technical information worldwide.
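
The scale of the cost-recovery problem described above can be illustrated with simple arithmetic. The sketch below is not part of the symposium material: only the $13 million figure, the staffing range of 50 to 70 people, and the four- to five-year duration come from the text. The unit volumes and the function names are hypothetical, chosen only to show how steep the break-even price becomes when sales are counted in hundreds, and maintenance costs after delivery are ignored.

```python
# Illustrative arithmetic only. The $13 million development cost, the
# 50-70 person staff, and the 4-5 year duration come from the text;
# the unit volumes below are hypothetical.
DEVELOPMENT_COST_USD = 13_000_000


def break_even_price(units_sold: int,
                     development_cost: float = DEVELOPMENT_COST_USD) -> float:
    """Price per system needed to recover the development cost alone."""
    if units_sold <= 0:
        raise ValueError("units_sold must be positive")
    return development_cost / units_sold


def cost_per_person_year(people: int, years: int,
                         development_cost: float = DEVELOPMENT_COST_USD) -> float:
    """Average development cost per person-year of effort."""
    if people <= 0 or years <= 0:
        raise ValueError("people and years must be positive")
    return development_cost / (people * years)


if __name__ == "__main__":
    # Hypothetical sales volumes, in the "hundreds of systems" range.
    for units in (100, 500, 1000):
        print(f"{units} systems: ${break_even_price(units):,.0f} each")
    # The two extremes of the staffing estimate given in the text.
    print(f"50 people, 4 years: ${cost_per_person_year(50, 4):,.0f} per person-year")
    print(f"70 people, 5 years: ${cost_per_person_year(70, 5):,.0f} per person-year")
```

Under these assumptions, a developer selling 500 systems would have to charge $26,000 per system before covering any of the maintenance and improvement costs that follow delivery.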
Given that the major commercial machine translation systems have been
mainframe-based, the traditional markets for machine translation have been
limited to translation bureaus, multinational corporations, and intelligence and
information gatherers, particularly in government. Developers have identified
two major targets for development. One is the large-scale, general-purpose,
mainframe-based system for use by big companies and governments. The goal of
this type of machine translation system using large-scale hardware is high volume
translation of documents, sometimes for mass distribution. Even those involved
in development work on general-purpose, publication quality machine translation
systems say that such high-quality, general-purpose systems are 10 to 20 years
over the horizon.
The second is a small-scale machine translation system for use by small to
medium size companies and even individual researchers. The purpose of this
type of smaller scale system is to translate for specialized applications. Many
Japanese commercial developers are focusing on small-scale system
development. There is a third kind of development that is intermediate between
the two systems mentioned above.15 From a commercial perspective, it is this
kind of intermediate system development that is seen by the Japanese as the only
feasible target for at least the next decade. Such systems are in use today for
information scanning in restricted domains.
In developing and enhancing machine translation systems, developers stand to
benefit from close interaction with users. Japanese developers rarely consult with
users on the details of systems design, but they do seek out users' views on what
features are needed in pre- and post-editing and on general issues of man-machine
interface. Japanese developers prefer to interact intensively with a
limited number of users so that they can respond effectively to their needs. A
14 Makoto Nagao, written response to "stimulation questions" prepared for the Symposium on
Japanese to English Machine Translation.
15 Systems like those at Xerox run on microcomputers and are specialized, but cannot be considered
"small-scale" since they translate more than 10 million words annually.
conscious strategy of selling a limited number of systems is often pursued in
order to facilitate this process. New users benefit later from this accumulated
feedback embodied in system configurations developed and perfected for other
users. Although Japanese observers complain that mechanisms for exchange of
information between developers and users are inadequate, in comparison to the
situation in the United States there has been closer interaction among the
research, development, and user communities in Japan.16
One of the limiting factors on the machine translation market is the fact that
most Japanese systems operate on only one type of hardware. In fact, some large
mainframe computer makers pursue machine translation development as a
strategy for marketing their hardware to large companies. Bravice produces the
only Japanese to English machine translation system that can be used on a variety
of hardware. It runs on most small microcomputers sold in Japan. A practical
barrier to widespread usage of small-scale Japanese-developed machine
translation systems is the limited availability of hardware that supports Japanese
characters (kanji, hiragana, katakana) outside of Japan.
International competition in machine translation is not mature. Given the
barriers to hardware interoperability, the most prominent examples of
competition occur among large-scale communication systems. In 1988, a number
of computer companies competed for contracts associated with the development
of a large communication system for the Korean and U.S. armies and some
machine translation system developers received contracts.
Reflecting the practical limitations on machine translation technology today,
there are a number of efforts that focus on combining machine translation
technology and human translation. ALPNET has developed a strategy based on
the assessment that machine translation is not the solution to all user needs, and
that it is not an all-or-nothing alternative. Technology tools now in existence
(such as multilingual word processors, dictionary look-up systems, character
recognition and word processing systems) can be effectively used by human
translators at much less cost than fully automated machine translation systems.
New "linguistic engine" options can be added to a user's existing applications,
thereby increasing ease of use without forcing the customer to learn new user
interfaces to complex systems. The experiences of developers like ALPNET
illustrate the fact that successful application of machine translation and related
technologies depends on an understanding of user needs, establishment of
expectation levels, matching the tools to the job, and the training of skilled
professional people to use the tools.
Another possible approach is to redesign systems using new technology and
locate smaller, modular systems in places such as schools where translators are
16 Exceptions to this general statement about interactions in the United States include relationships
between developers and users such as Systran and the U.S. Air Force or between Logos and AT&T.
being trained. Also, a broader appreciation of machine translation and related
technologies to cover a wide range of communications problems could enhance
the use of currently existing tools in solving practical problems. For example,
many potential users in Europe are not professional translators but secretaries
working for large multinational corporations who must write correspondence in
foreign languages. Market prospects could be broadened by integrating linguistic
tools into the office environment, the publishing business, and even the engineering
development environment.
If we consider particular applications of machine-aided translation technologies
to specific consumer products, demand may indeed grow significantly over the next
few years. A composition aid for technical writers, for example, could improve
understanding of cultural nuances, not to mention grammar and syntax. Another
product application that illustrates the point is a hand-held electronic phrase book
for use by foreign travelers. Such applications, it should be noted, offer promise,
but they remain distant from the machine translation heaven of general-purpose,
high-quality systems.
USER NEEDS
People who want to read translated material care little whether the translation
is done by people or machines or a combination. They simply want reliable
translation that is cheap and fast. For some of these users, such as those who
must scan very large volumes of information in order to follow trends in research
and development or identify documents for full translation, the speed of
translation is important. Turnaround time is, in the experience of the users of the
Air Force's system, more important than quality, defined in terms of naturalness
of expression and precision in conveying meaning.17
Other large volume users, such as companies that require translation of their
operating manuals, require precise and understandable translation, but in fairly
limited domains. The classic example of such a user in the United States is
Xerox. Xerox has been using machine translation for more than a decade to
translate technical documents. These are large volume projects which involve
high reproduction and updating rates. For large companies that sell a wide range
of products in global markets, translation costs represent a significant component
of new product expenditures.
The Xerox experience deserves further mention. Xerox has developed an
approach that integrates desktop editing programs and a Systran machine
translation system to produce service manuals, training programs, and operator
17 Some argue that quality standards must include a mixture of turnaround time, accuracy, and
readability, the relative importance of each varying with the needs of the particular user.
manuals for use in Europe, Latin America, and North America. This approach,
which has evolved over time, involves the training of technical writers and pre-
editors in a set of simple writing tools (Multinational Customized English)
developed by Xerox. Dictionary building is constantly under way as technical
writers and translators from different parts of the world, joined through the Xerox
worldwide network, submit new words and phrases.
At Xerox, machine translation has shown concrete results. Significant
improvements have been seen in productivity per finished page of translated text.
Producing more than 40,000 pages annually of documentation translated through
this process, Xerox now finds that translation is no longer a barrier to product
launch. Machine translation enhances the company's ability to introduce new
products almost simultaneously all over the world.
Xerox found that management initially had high expectations for what
machine translation could do, while linguists were skeptical. Over time and with
growing experience, the perceptions of these two groups have begun to converge.
(See Figure 2.) By gradually developing a system that translators are able to see
as a tool in their work, and one that produces high-volume product in restricted
technical domains, machine translation has been fully integrated with business
operations.
In contrast to large users, there are countless potential users of small machine
translation systems who have different kinds of needs. In many instances, these
are researchers (scientists and technical personnel) in the United States who need
to know about developments in Japanese science and technology. For many of
them, readability and smoothness of translation is less important than timeliness
and access to a narrowly targeted range of technical documents of interest.
[Figure 2: expectation level (high to low) plotted against time; the curves for Managers and Linguists converge on Reality.]
FIGURE 2 Expectations versus reality of computer translation. SOURCE: Mana Russo.
MCC is now experimenting with machine translation to monitor overseas
technology developments and provide translated materials to member firms. For
an investment of $50,000, MCC has assembled a machine translation system that
includes an optical character reader, a Japanese-made personal computer that is
also used as a JICST terminal, a Japanese to English machine translation software
package, and a U.S.-developed workstation. MCC encountered problems in
connecting the computer systems developed in Japan and the United States and in
getting adequate vendor support for the machine translation system developed in
Japan. Fifty percent of the sentences in the trial output have been judged to be
accurate.18 For a modest investment ($50,000 to $60,000), the machine translation
experiment at this U.S. industry consortium is considered to be worth the effort.
MCC's International Liaison Office expects to augment its already successfully
established functions of monitoring developments in Japan and providing
technical support to its researchers by using machine translation first to translate
titles and, eventually, abstracts and short papers. MCC has, in addition, recently
initiated a research effort on knowledge-based natural language processing
targeted toward machine translation technology development.
Beyond corporations, individual professional translators and researchers also
qualify as potential users of machine translation systems and machine aids for
translators. Companies such as LinguaTech are developing workstations for
translators who can work at home using a microcomputer with a high-capacity
disk drive, a printer, and a port for telecommunications. A translator can receive
source texts and glossaries with specific subject matter along telephone lines. A
variety of options are available to the translator, including composition tools, a
bilingual corpus that permits text retrievals when necessary, multilevel
terminology files, and optional use of machine translated texts. To use a baking
analogy, machine translation is one ingredient in a range of elements that the
translator (baker) can combine in myriad ways to accomplish his work. The
baker may wish to use the heavy-duty bread mixer (machine translated text) for
some tasks, but he also keeps his drawer of spoons (manual tools).19
Potential users, particularly individuals or small companies, may find it hard
to learn about machine translation. In Japan, potential users learn about machine
translation through newspapers and television as well as through demonstrations
at computer company service centers. Only limited information is obtained
through these channels. The potential Japanese user is, however, in a much better
18 In judging the value of machine translation, it must be remembered that, unlike the case with
European languages with cognates and familiar script, Japanese is totally unintelligible to the
potential English-speaking user. Some, therefore, believe that capturing even 50% of the Japanese text
can be a useful step forward in communication.
19 This analogy is developed by Alan K. Melby in his article "Recipe for a Translator Work Station,"
Multilingua, 3-4 (1984), pp. 225-228.
position than his U.S. counterpart. For the U.S. translation community,
demonstrations and meetings of professional organizations provide channels for
transmitting information that augments the articles appearing in specialist
journals. For other users who have no direct contact with the translation
community (such as researchers and people in small business), acquiring
information about machine translation and making judgments about purchase and
use are even more difficult. We shall return to this point again in the final section
on R&D policy.
JAPANESE AND U.S. USERS: CONTRASTING NEEDS
As we think about what type of Japanese to English machine translation
system would help users in the United States, it is important to keep in mind that
their needs are different from those of most users in Japan. The profile of the
typical user of Japanese to English machine translation in Japan is a large
corporation engaged in global exports. This user can "control" the source
document because it is in his native language and pre-editing is easier in this
case. This user can customize the system to suit his or her needs and tolerate
marginal machine translation because there is a direct cost justification for this
effort.
The profile of the typical user of Japanese to English machine translation in
the United States is someone who needs expanded access to Japanese-language
technical information but who has no fluency in Japanese. This user has no
control over the source text and little ability to pre-edit. For monolingual users,
the "raw" output must be more reliable and accurate than for the typical Japanese
user. For the user in the United States, there are many problems associated with
gaining knowledge of and access to Japanese databases, inputting text, and post-
editing requirements. In view of the broad-based needs of the U.S. user, marginal
machine translation is in many cases not useful. (See Figure 3.) However, there
are many possible uses of machine translated text now available, particularly if
some post-editing is done, for scanning information.
Japanese to English machine translation systems developed in Japan do not, in
the opinion of some leading U.S. developers, fit the needs of many potential
high-volume U.S. users. The requirements are for high-quality, broad-domain
systems which will be based on new technology.
What are our choices? Should we wait for Japan to develop high-quality
Japanese to English machine translation systems in the year 2000 and later? After
all, Japan has a head start and Japanese is their language not ours. Or should we
develop Japanese to English machine translation systems here in the United
States? Are there other alternatives? Answering these questions requires an
understanding of the technical challenges facing those engaged in R&D on
machine translation.
[Figure 3: system levels (Demonstration Systems, Prototype Systems, Cost-Effective Systems, High-Quality Systems) set against system types (Constrained Systems, Domain-Specific Systems, Special Purpose Systems, Interactive Systems, General Purpose Systems), with two user profiles.]
JAPAN: Exporter's production tool
· controls source document
· can customize system to narrow domain
· tolerates high-cost, low-quality, raw output
USA: Researcher's information utility
· doesn't control source document
· needs read-only broad-domain system
· needs low-cost, usable raw output
Legend:
· Japanese to English machine translation systems useful for users in Japan only
· Japanese to English machine translation systems useful for users in both Japan and the United States
· Japanese to English machine translation systems useful for users in Japan and possibly the United States
FIGURE 3 User profiles for Japanese-English machine translation. SOURCE: Bernard Scott, Logos Corporation.