Machine Translation and Translation Memory

Depending on language, content volume, type of source text and publishing timeframes, it is possible to improve the efficiency and reduce the cost of publishing multilingual information by using Translation Memory Software (CAT tools) or even Machine Translation.

The primary reasons for implementing either technology are speed, cost savings, and consistency:

  • Speed—Machine translation significantly reduces the time required to translate large volumes of text. CAT tools can also speed up the process for documents that contain a large amount of repeated text.
  • Cost savings—By reducing the amount of human work required, both technologies can reduce overall translation costs.
  • Consistency—Because MT draws on predefined dictionaries and CAT tools draw on databases of previous translations, both technologies allow for significant gains in translation consistency.

Note, however, that the involvement of human translators is still required with either technology.

Translation Memory (TM) — a translator’s tool

Translation Memory software, also called Computer Assisted Translation (CAT) software, is designed to improve the quality and efficiency of the human translation process, not to replace it. CAT tools work by saving the words and phrases that the human translator has encountered and translated, and storing them in a database for future use. When the same or a similar text segment is encountered again, the software searches the translations previously stored in the memory. The result may be an exact match, a “fuzzy” match (nearly the same but not quite) or no match. The translator then reviews the translation as a final edit. As the translation progresses, the translation memory grows, becoming a kind of reference built and used by the translators themselves. This can decrease the time spent on future translations while increasing consistency.
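The exact / fuzzy / no-match lookup can be illustrated with a minimal Python sketch. It is not how any particular CAT tool is implemented: the function name lookup, the 0.75 threshold and the sample segments are assumptions made for illustration, and the similarity score comes from the standard library's difflib rather than from a commercial matching algorithm.

```python
from difflib import SequenceMatcher

# Assumed cut-off for a "fuzzy" match; real tools let the user configure this.
FUZZY_THRESHOLD = 0.75


def lookup(segment, memory, threshold=FUZZY_THRESHOLD):
    """Return (match_type, translation, score) for a source segment."""
    # Exact match: the segment has been translated before, word for word.
    if segment in memory:
        return "exact", memory[segment], 1.0

    # Otherwise find the stored source segment most similar to the new one.
    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_source, best_score = source, score

    # Fuzzy match: similar enough to be offered to the translator for editing.
    if best_source is not None and best_score >= threshold:
        return "fuzzy", memory[best_source], best_score

    # No match: the translator works from scratch.
    return "none", None, best_score


# Hypothetical English-to-French memory with two stored segments.
memory = {
    "Press the power button.": "Appuyez sur le bouton d'alimentation.",
    "Close the cover.": "Fermez le couvercle.",
}

for text in ("Press the power button.",
             "Press the reset button.",
             "Contact customer support."):
    print(text, "->", lookup(text, memory))
```

In this sketch a fuzzy match returns the stored translation of the closest segment, which the translator would still have to edit; once the corrected translation is saved back into the memory, the same segment becomes an exact match next time.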

Documents and projects that are good candidates for TM tools include those that are:

• Highly repetitive
• Technical
• Large in volume
• Updates of previously translated materials

Examples of CAT tools include DejaVu, memoQ, Trados and Wordfast, but there are many others. Language Professionals uses DejaVu.

Machine Translation (MT)

MT software aims to replace the human translator. It uses algorithms to analyze the grammar and syntax of source segments according to previously defined rules. It then queries a dictionary to produce a translated segment without human intervention. MT output is generally not good enough to be published without extensive human post-editing. In addition, machine translation can only be used for a limited number of supported languages.

This is how Google explains the process behind Google Translate in “Inside Google Translate”:

Google Translate basically sifts through large piles of data (in this case billions of words of text, consisting of examples of human translations) and finds patterns. Google refers to this process as “Statistical Machine Translation”, or SMT. SMT is built on the premise of probabilities. For each segment of source text, there are a number of possible target segments, each with a different probability of being the correct translation. The software selects the segment with the highest statistical probability.
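The selection step alone can be sketched in a few lines of Python. This is a toy illustration, not Google's system: the candidate segments and their probabilities are invented, and the name most_probable is hypothetical. A real SMT engine estimates such probabilities from very large collections of human translations.

```python
# Hypothetical candidate table: each source segment maps to possible target
# segments, each with a made-up probability of being the correct translation.
candidates = {
    "Guten Morgen": [
        ("Good morning", 0.92),
        ("Good morrow", 0.05),
        ("Nice morning", 0.03),
    ],
}


def most_probable(source, table):
    """Return the candidate with the highest probability, or None if unknown."""
    options = table.get(source)
    if not options:
        return None
    # Compare candidates by their probability (the second item of each pair).
    best_text, _best_probability = max(options, key=lambda pair: pair[1])
    return best_text


print(most_probable("Guten Morgen", candidates))
```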

Another model of Machine Translation is the so-called “Rule-Based Machine Translation”, or RBMT. RBMT is built on the premise that a language is based on sets of grammatical and syntactic rules. In order to translate a segment, the software needs bilingual dictionaries or glossaries for the specified language pair, a set of linguistic rules for the sentence structure of each language, and a set of rules to link the two sentence structures together. These resources can be time-consuming and expensive to create, and they must be created again each time a new language pair is required. RBMT, however, will produce better quality.
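The three ingredients named above (a bilingual dictionary, part-of-speech information and a rule linking the two sentence structures) can be shown in a deliberately tiny Python sketch for English to Spanish. Everything in it is an assumption made for illustration: the five-word LEXICON, the tag names and the single reordering rule. It ignores gender and number agreement, which is exactly the kind of rule a real RBMT system has to add for each language pair.

```python
# Hypothetical bilingual dictionary: English word -> (Spanish word, part of speech).
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "new": ("nuevo", "ADJ"),
    "car": ("coche", "NOUN"),
    "book": ("libro", "NOUN"),
}


def translate(sentence):
    """Translate word by word, then apply one structural linking rule."""
    words = sentence.lower().split()
    # Dictionary lookup; unknown words pass through untranslated as "UNK".
    tagged = [LEXICON.get(word, (word, "UNK")) for word in words]

    # Linking rule: English "ADJ NOUN" becomes Spanish "NOUN ADJ".
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1

    return " ".join(word for word, _tag in tagged)


print(translate("the red car"))
print(translate("the new book"))
```

Adding a second language pair to this sketch would mean writing a new dictionary and new linking rules from scratch, which is the cost described above.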

Machine Translation quality is not perfect and may never be so, but it is continuously improving. In experienced and informed hands, it is a useful linguistic tool to translate large volumes of the right content, in conjunction with Translation Memory, glossaries, style guides and human translators or post-editors. It’s also useful for individuals to get the gist of some documents.