Post-editing machine translation using Systran and Trados

In this post I describe the process that we use at GTS for post-editing machine translation (MT). To run this process you will need licenses for Systran Desktop 6 and SDL Trados Workbench. The process yields human translation quality for various kinds of texts and is well suited to high-volume projects that need to be delivered quickly. We have been using it successfully for technical translation projects such as tenders and technical manuals.

BACKGROUND

MT has long been considered a tool that is good for ‘gisting’ but not good enough for translations that require high quality and accuracy. Several things have happened in the last two years that are changing the way the translation industry and its clients perceive MT:

1. Major improvements in the quality of MT with the deployment of SMT (Statistical MT) systems that can be trained with lots of bilingual texts.

2. Pressure to reduce translation prices due to world economic conditions.

3. CONTENT EXPLOSION! The Internet, coupled with the slow death of printed material, has brought about a dramatic increase in the amount of text being published on websites, blogs and social networks. There are not enough professional translators to keep up with all of this, and software/MT is the only way to deal with it.

4. Improvements in post-editing/crowdsourcing tools and the gradual education of the professional translation community.

THE PROCESS

1. Open the STPM (Systran Translation Project Manager) application. Create a new project and set the source/target languages. Select the User Dictionaries and Translation Memories (TMs) that will be used in the MT process.

2. Import the files you are translating and run the translation. Systran will populate each segment with either a 100% TM match or, when no 100% match is found, its own MT.

3. When the translation has completed, open the Sentence Review window in the STPM. Flag all of the sentences and then select Mark all as TM entry.

4. Click the Send to SDM icon on the toolbar.

5. When prompted, click Yes.

6. In the Systran Dictionary Manager (SDM), select File-Export and export the translation memory in TMX file format. The TMX file contains the TM of the files you translated.

7. Open Trados Workbench and import the TM you created in the previous step into your established TM. Make sure to select the Keep Oldest option so that you do not overwrite any of the TUs in your established TM. This is important, as the TM created with Systran will contain TUs that are machine-translated and are therefore of questionable quality.

8. In Trados, translate the file with the new TM (which now holds both good TUs and MT TUs). The match threshold can be set to 95%. Every segment will be populated, some from good TUs and some from MT.

9. Send the unclean files to your translators. At this stage your translators will work in their own Trados Workbench. Make sure that they deliver the unclean files when they are done.

10. Clean up the files using Trados to update the TM.
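
The Keep Oldest rule from step 7 can be illustrated outside Trados. The following is a minimal Python sketch, not Trados's actual import logic. It assumes TMX 1.4-style files (tu/tuv/seg elements with an xml:lang attribute) and matches TUs by exact source text only. The sample segments and the function names read_tus and merge_keep_oldest are made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up, heavily trimmed sample data (a real TMX header carries more attributes).
ESTABLISHED_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Press the button.</seg></tuv>
    <tuv xml:lang="fr"><seg>Appuyez sur le bouton.</seg></tuv></tu>
</body></tmx>"""

SYSTRAN_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en"><seg>Press the button.</seg></tuv>
    <tuv xml:lang="fr"><seg>Pressez le bouton.</seg></tuv></tu>
<tu><tuv xml:lang="en"><seg>Close the cover.</seg></tuv>
    <tuv xml:lang="fr"><seg>Fermez le couvercle.</seg></tuv></tu>
</body></tmx>"""


def read_tus(tmx_text, src_lang, tgt_lang):
    """Return {source segment: target segment} for one language pair."""
    root = ET.fromstring(tmx_text)
    pairs = {}
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # Older TMX versions use a plain "lang" attribute instead of xml:lang.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                segs[lang] = "".join(seg.itertext()).strip()
        if src_lang in segs and tgt_lang in segs:
            # Within one file, the first TU seen for a source segment wins.
            pairs.setdefault(segs[src_lang], segs[tgt_lang])
    return pairs


def merge_keep_oldest(established, incoming):
    """Add incoming TUs, but never overwrite a TU that already exists."""
    merged = dict(established)
    for source, target in incoming.items():
        merged.setdefault(source, target)
    return merged


established = read_tus(ESTABLISHED_TMX, "en", "fr")
incoming = read_tus(SYSTRAN_TMX, "en", "fr")
merged = merge_keep_oldest(established, incoming)

for source, target in merged.items():
    print(source, "->", target)
```

In this sample the human translation of "Press the button." survives the merge, and only the segment that was missing from the established TM is filled from the MT export.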

Notes:

a) Make sure that your translators do not get sloppy and just accept the 100% matches. Insist that they thoroughly review each TU and rewrite it if necessary. Your project managers will need to review the end result and make sure that nobody is taking shortcuts. This is critical!

b) You can use the SDM to build custom dictionaries and language rules that can further improve the translation quality.

c) Systran can be run in standalone Desktop mode, or it can be licensed in the Enterprise version. The Enterprise system is suitable for mid-size to large organizations that have wide-ranging language requirements. The process described above will work with either type of license.

d) Systran recently introduced its new Enterprise Server 7 which supports training of the MT using aligned, bilingual corpora. Systran is touting this system as a hybrid RbMT/SMT system.
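
The review in note a) can be supported with a simple check. The sketch below is only an assumption about how a project manager might spot-check delivered files, not part of Systran or Trados: it compares the raw MT target of each segment with the delivered target and lists the segments that came back unchanged. The dictionaries and the function name unchanged_segments are hypothetical, and the sample segments are made up.

```python
def normalize(text):
    """Collapse whitespace so spacing differences do not count as edits."""
    return " ".join(text.split())


def unchanged_segments(mt_pairs, edited_pairs):
    """Return source segments whose delivered target equals the raw MT."""
    return [
        source
        for source, mt_target in mt_pairs.items()
        if source in edited_pairs
        and normalize(edited_pairs[source]) == normalize(mt_target)
    ]


# Made-up sample data: {source segment: target segment}.
MT = {
    "Press the button.": "Pressez le bouton.",
    "Close the cover.": "Fermez le couvercle.",
    "Remove the battery.": "Enlevez la batterie.",
}
EDITED = {
    "Press the button.": "Appuyez sur le bouton.",
    "Close the cover.": "Fermez  le couvercle.",
    "Remove the battery.": "Retirez la batterie.",
}

flagged = unchanged_segments(MT, EDITED)
share = len(flagged) / len(MT)
print(f"{len(flagged)} of {len(MT)} segments unchanged ({share:.0%})")
for source in flagged:
    print("check:", source)
```

An unchanged segment is not necessarily wrong, since the MT may have been correct. A high share of unchanged segments is only a signal that the file deserves a closer look.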

THE BENEFITS OF USING THIS PROCESS

Using this or another efficient post-editing MT process has several benefits:

1. You will be able to deliver projects faster than ever. The rule of thumb that we use at GTS is that a professional translator can translate about 2,500 words a day. A very fast and industrious translator can do more. At any rate, if a translation agency gets a project of 100,000 words, it would take about 40 translator days to deliver the project. So if you assign the project to 8 translators, the project would take 5 days to complete. But what if you need to deliver the 100k words in 3 days? Assign more translators? It gets messy. Using our post-edit process, we have seen good translators deliver 8,000-10,000 words a day of high quality text.

2. You will save money: it’s cheaper than plain human translation.

3. The quality of the MT improves steadily when using this process. Your investment in MT will pay off.

4. Your resources remain your own. As opposed to the ‘cloud’ approach of the Google Translator Toolkit, your information is private and will not be shared with anyone else.
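
The delivery arithmetic behind benefit 1 can be worked through in a few lines. This is a rough sketch using the throughput figures quoted above (2,500 words a day for translation from scratch, 8,000-10,000 for post-editing). It assumes the work splits evenly across translators and ignores project management and review time. The function names are made up for illustration.

```python
import math


def delivery_days(total_words, translators, words_per_day):
    """Working days needed when the job is split evenly across translators."""
    return math.ceil(total_words / (translators * words_per_day))


def translators_needed(total_words, deadline_days, words_per_day):
    """Smallest team that can finish within the deadline."""
    return math.ceil(total_words / (deadline_days * words_per_day))


PROJECT_WORDS = 100_000
DEADLINE_DAYS = 3

from_scratch = translators_needed(PROJECT_WORDS, DEADLINE_DAYS, 2_500)
post_edit_low = translators_needed(PROJECT_WORDS, DEADLINE_DAYS, 8_000)
post_edit_high = translators_needed(PROJECT_WORDS, DEADLINE_DAYS, 10_000)

print("From scratch:", from_scratch, "translators")
print("Post-editing at 8,000 words/day:", post_edit_low, "translators")
print("Post-editing at 10,000 words/day:", post_edit_high, "translators")
```

Under these assumptions, a 3-day deadline for 100,000 words needs 14 translators working from scratch, but only 4 or 5 post-editors.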

EDUCATING THE PROFESSIONAL TRANSLATION COMMUNITY

Many professional translators do not accept projects which entail post-editing of MT. The typical excuses are: “the quality is too poor” or “it will take me less time to do it from scratch.”

I think that the professional community is gradually being convinced that MT is the way of the future. As the old saying goes, “if you can’t beat them, join them!” Using the process I described here, translator throughput increases dramatically, which means that the translator makes more money. Additionally, since many professional translators already own and use Trados Workbench, no training or new tools are required.

SUMMARY

This is an efficient, low-cost way of integrating MT into your operation. If you are using other methods for post-editing of MT, or have any comments or questions, please let me know.

ABOUT GTS

GTS provides high quality human translation of technical, legal and medical documents; translation of websites; and localization of software products. For price quotes, please send an email to sales@gts-translation.com.

17 thoughts on “Post-editing machine translation using Systran and Trados”

  1. It looks like your solution might still need some fine-tuning, at least judging from the poor quality of the Italian translations on this site…

  2. Dear Paolo, thanks. The Italian version of our blog is unedited machine translation. It did not go through the process described here. We are developing a system for post-editing of machine-translated blogs. Please check this blog periodically for updates on our new system. Dave

  3. It is very interesting and thanks for sharing!

    We have optimized the method for other customers, so that they can achieve higher productivity, but it’s great to share your deep knowledge.

    Diego

  4. Hello Dave,

    I fully agree with you that the translation community has to educate itself about new technologies available to us. Alas, people usually hate change.

    Have you found MT to be useful only for specific languages or across the board?

  5. Hi Leah, thanks for your comment. Our experience in post-editing of MT in a production environment has, for the most part, been in German to English, French to English and English to French. The process I described will work equally well for all the language pairs supported by Systran (about 30 or so). Having said that, MT handles some language pairs better than others. But since the MT in this process is post-edited the quality should not suffer at all.

  8. Hi Dave,

    At the moment we are testing the Systran Enterprise solution for the Greek language. So far it looks like the best results come after we feed the system with good terminology. The difficult part is integrating the system, along with a CAT tool, into an existing translation project management system. Do you have a process for feeding terms into the system?

  9. Hi Michael, thanks. Please keep us posted on the progress of your testing. It sounds interesting.

    Regarding the feeding of terms I have a couple of comments:

    1. We use a product called SDL Multiterm Extract for term extraction. Having mentioned that, I think I will write a blog post about our use of this product. Look out for it in the next few days.
    2. Systran Enterprise Server 7 has some built-in terminology extraction software. I have not received it yet, so I cannot say anything more about it.

    Best wishes, Dave

  10. Looks like a very helpful tool in translating languages. This will indeed save more money than having more translators.
    This is a great tool for someone like me who is studying international language as well and doing some writings in their language.
