Integrating grammar checking tools into the MT post-editing process

Post-editing of machine-translated (MT) text is becoming mainstream in the translation industry. Several factors have contributed to this:

  1. Companies are demanding more for their money.
  2. Content explosion! The amount of content keeps growing, and there are not enough human translators to do all of the work.
  3. Improvements in MT have made post-editing more economically viable and less of a compromise to translation quality.

Many translators and translation companies still refuse to deal with post-editing of MT. But the big companies are doing it, and they are setting the trend for the future. SDL says on its website that it post-edits over 200 million words per year. It is only a matter of time before post-editing of MT becomes a de facto standard.

A few days ago, I had an interesting thought: what if we were to take machine translation (MT) output and run it through a grammar checking application? Would it help improve MT quality? And if so, should you be integrating grammar tools into your post-editing process?

We ran a series of tests which anyone (with an Internet connection and MS Word) can do on their own. We took some benchmark text, ran it through the MT and then ran the grammar/spell checker in MS Word. We did this in several languages (MS Office provides grammar checking tools in multiple languages).

The results were reasonably encouraging: both the spelling and the grammar checker find mistakes in the MT output. And even if the grammar checker only improves the MT marginally, it provides human editors with some additional examination points in the review process. So it is probably a good idea to incorporate grammar/spell checking into your post-editing process.
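To give a rough idea of what "additional examination points" could look like as an automated step, here is a minimal Python sketch. It is not the MS Word checker or any real grammar tool: the names `Flag`, `check_segment` and `review_points` are hypothetical, and the single repeated-word rule is only a toy stand-in for whatever checker you actually use.

```python
import re
from dataclasses import dataclass


@dataclass
class Flag:
    """One point in the MT output that a human editor should look at."""
    segment_index: int
    message: str
    excerpt: str


# Toy rule (an assumption for illustration): the same word twice in a row.
# A real checker would apply many language-specific rules instead.
REPEATED_WORD = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def check_segment(index, text):
    """Return the flags raised by the toy rule for one MT segment."""
    return [
        Flag(index, "repeated word", match.group(0))
        for match in REPEATED_WORD.finditer(text)
    ]


def review_points(segments):
    """Collect flags for all MT segments, in document order."""
    flags = []
    for index, text in enumerate(segments):
        flags.extend(check_segment(index, text))
    return flags


if __name__ == "__main__":
    mt_output = [
        "The checker found found two errors.",
        "This segment looks fine.",
    ]
    for flag in review_points(mt_output):
        print(f"segment {flag.segment_index}: {flag.message}: {flag.excerpt!r}")
```

The point of the design is that the checker only flags; it does not change the text. The post-editor still decides what to fix, which matches the idea of using the checker as an extra pair of eyes rather than as an automatic corrector.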

But I found something puzzling in our tests: when we translated the benchmark text using Microsoft Translator and ran the output through the MS Word spell/grammar checker, it came up with errors in both spelling and grammar. Now wouldn’t it stand to reason that Microsoft would integrate its own tools into its MT process in order to improve the MT output quality?

I put this question to Will Lewis of Microsoft Research, the team that develops the Bing Translator, and this is what he had to say:

This is an interesting idea, something we’ve talked about, but is not something we’ve experimented with.  We didn’t pursue it mostly because the kinds of errors that MT makes are not the same that speakers of a language will make.  Since the grammar checkers are tuned to language speakers, not MT, the corrections will be of less use in an MT scenario.  That being said, there is some value in seeing what improvements the grammar checkers would make, especially as the quality of MT improves.

Mr. Lewis’s view of grammar checking tools in post-editing of MT output is corroborated by a research project done at the Department of Computer and Information Science at Linköping University in Sweden. The conclusions of the research team in Sweden are:

(a) use of commercial grammar tools will not improve the quality of MT significantly but can still provide value in the editing process; and

(b) there is a need to develop special grammar checking tools which are designed specifically for MT output. These tools can be integrated into the MT software itself or can be run as a standalone application on the MT output.

So who will be the first company to introduce special grammar tools for MT? SDL? Microsoft? Google? You? Let’s wait and see.

5 thoughts on “Integrating grammar checking tools into the MT post-editing process”

  1. We (Asia Online) use it all the time as it also helps to detect errors faster in a more visual way.

    We are even looking at automating this but I agree that it is a good idea and worth some experimentation. Like many new techniques it requires some effort to establish a clear process and value.

  2. Grammar-checking on the output of MT will only expose flaws in the MT software. It makes much more sense to check (and edit) the input to MT so the MT results are more accurate. We do this using Simplified Technical English, and it dramatically improves the quality and accuracy of both the source and any translations.