Recently I have carried out several assessments of machine translation (MT) output to evaluate the overall quality of MT engines before they are put to use. These evaluations can also help a translation agency estimate the likely effort required to post-edit the machine-translated text: each segment in a sample text is given a score on a scale ranging from “zero editing required” to “complete retranslation”.
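To make the scoring idea concrete, here is a minimal sketch of how segment scores might be summarised. The five-point scale, its intermediate labels and the sample scores are assumptions for illustration only; just the two end points come from the description above.

```python
from statistics import mean

# Hypothetical 0-4 scale. Only the two end-point labels come from the text;
# the intermediate levels are assumptions made for illustration.
SCALE = {
    0: "zero editing required",
    1: "minor edits",
    2: "moderate edits",
    3: "heavy edits",
    4: "complete retranslation",
}

def summarise_effort(segment_scores):
    """Return the mean score and the share of segments at each level."""
    if not segment_scores:
        raise ValueError("need at least one scored segment")
    if any(score not in SCALE for score in segment_scores):
        raise ValueError("score outside the scale")
    total = len(segment_scores)
    shares = {level: segment_scores.count(level) / total for level in SCALE}
    return mean(segment_scores), shares

# Made-up scores for an eight-segment sample.
sample = [0, 1, 1, 2, 4, 0, 3, 1]
average, shares = summarise_effort(sample)
print(f"mean score: {average:.2f}")
for level, share in shares.items():
    print(f"{level} ({SCALE[level]}): {share:.0%}")
```

The mean gives a single figure for comparing engines, while the per-level shares show whether that figure hides a handful of segments that need retranslating from scratch.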
This work led me to an interesting research paper on the phenomenon of “post-editese”, which compares post-edited machine translation (PEMT) with human translation and describes some of the linguistic differences found between the two. The author, Antonio Toral, works in the Computational Linguistics group at the Center for Language and Cognition at the University of Groningen. Below I have linked to a blog post, which gives an accessible summary of Toral’s research in this area.
Three datasets were used in the research, all involving news and subtitles. Toral reflects that different findings may result if the focus is on more technical texts, particularly where terminology and consistency are key. What I found interesting was the nature of the differences between post-edited and human-translated text (PE versus HT). Firstly, in terms of lexical variety (i.e. size of vocabulary), HT comes out on top, followed by PE, with pure machine translation (MT) bringing up the rear. Lexical density, the proportion of content words in a text, reflects the amount of information conveyed; once again HT scores more highly, this time with PE and MT being roughly equivalent.
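Both measures are easy to approximate. The sketch below computes lexical variety as a type-token ratio and lexical density as the share of words that are not function words. The short function-word list and the two sample sentences are assumptions for illustration; a proper analysis would identify content words with a part-of-speech tagger.

```python
import re

# Tiny illustrative list of English function words (an assumption; a real
# analysis would identify content words with a part-of-speech tagger).
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "by", "is", "are", "was", "were", "be", "it", "this",
    "that", "he", "she", "they", "we", "you", "i", "not",
}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def type_token_ratio(text):
    """Lexical variety: distinct word forms divided by total words."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def lexical_density(text):
    """Share of tokens that are not in the function-word list."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)

# Made-up sentences: the second reuses a verb where the first varies it.
varied = "The minister rejected the proposal and dismissed the critics"
repetitive = "The minister rejected the proposal and rejected the critics"

for label, text in [("varied", varied), ("repetitive", repetitive)]:
    print(label, round(type_token_ratio(text), 3), round(lexical_density(text), 3))
```

The type-token ratio falls as texts get longer, so it only makes sense to compare translations of the same source or samples of similar length.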
Toral also investigated the overall length of the translated text compared to the source text, capturing the result in a score referred to as the length ratio. The higher the score, the more the translated text diverges from the source in length. Once again, HT has the highest score, with MT having the lowest score and PE somewhere in between. This is interpreted as reflecting the greater freedom a human translator may have in producing the target text.
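A score of this kind can be sketched in a few lines. The version below assumes the ratio is the absolute difference in character length between source and target, divided by the length of the source; that is my reading of the description, so check the paper for the exact definition. The sentences are made up.

```python
def length_ratio(source, target):
    """Absolute character-length difference, normalised by source length."""
    if not source:
        raise ValueError("source text must not be empty")
    return abs(len(source) - len(target)) / len(source)

# Made-up example: one German source sentence with two English renderings.
source = "Das ist ein kurzer Satz."
close = "This is a short sentence."
free = "Here we have a sentence that is rather short."

print("close rendering:", round(length_ratio(source, close), 3))
print("free rendering:", round(length_ratio(source, free), 3))
```

A rendering that tracks the source closely scores near zero, while a freer one scores higher, which is the pattern described for MT and HT respectively.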
The final measure looks at part-of-speech sequences in the translations, as an indicator of potential interference from the source text on the translated output. Once again, a significant difference was found between PE and HT, suggesting residual interference from the source language on PE output.
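For a rough feel of what interference in part-of-speech sequences means, the sketch below checks how many of a translation's adjacent tag pairs also occur in the source. This is a deliberately crude stand-in and not the measure used in the paper; the tag sequences are hand-written and assume that source and target share one coarse tagset.

```python
def pos_bigrams(tags):
    """Return the set of adjacent tag pairs in a tag sequence."""
    return set(zip(tags, tags[1:]))

def shared_bigram_share(source_tags, target_tags):
    """Share of the target's POS bigrams that also occur in the source."""
    target = pos_bigrams(target_tags)
    if not target:
        return 0.0
    return len(target & pos_bigrams(source_tags)) / len(target)

# Hand-tagged toy sequences with a shared coarse tagset (an assumption).
source_tags = ["DET", "NOUN", "ADJ", "VERB"]     # word order of the source
literal_tags = ["DET", "NOUN", "ADJ", "VERB"]    # translation that copies it
reordered_tags = ["DET", "ADJ", "NOUN", "VERB"]  # translation that reorders

print("literal:", shared_bigram_share(source_tags, literal_tags))
print("reordered:", shared_bigram_share(source_tags, reordered_tags))
```

A translation that mirrors the source's word order shares all of its tag pairs with the source, while one that follows the target language's own order shares fewer.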
The overall conclusions are clear: post-editing may resolve some of the issues involved in machine translation, but the resulting text still exhibits features that distinguish it from human translation with no machine input. This will inevitably affect the subjective quality of the translated output; the question is whether this is an acceptable trade-off given the greater speed of PEMT and its relatively good score when it comes to errors. The answer will naturally depend on the context and application, but it is important that clients understand the nature of the translation options available to them.
© 2021 All Rights Reserved