Microsoft trained an AI to translate Chinese as well as humans

An artificial intelligence that can translate Chinese to English as well as a human expert could have a huge impact on breaking down language barriers, experts at Microsoft say. The team, split between Microsoft Research's Asia and US facilities, has been working on machine translation using the same sort of techniques people use to learn languages themselves, only applied to AIs.

"Much of our research is really inspired by how we humans do things," Tie-Yan Liu, a principal research manager with Microsoft Research Asia in Beijing, said of the approach. Rather than train the AI in a single way, a number of different methods were combined to get more effective, faster results. For example, "dual learning" saw the AI not only convert Chinese to English, but then back to Chinese upon which the meaning was checked to see if it had altered significantly through the process.

In another method, "deliberation networks," the AI was taught to go over its translation multiple times. That, the Microsoft researchers say, allowed the machine intelligence to progressively refine its results with each pass. The researchers also developed two new techniques.

In one, dubbed "joint training," the English-to-Chinese translation system was used to create new sentence pairs. Those pairs could then be added to the training dataset for the Chinese-to-English system. The same process was also run in reverse, with the researchers saying that as the two systems converge, overall performance improves.
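The data-augmentation step in joint training might look roughly like the sketch below. It assumes a toy lookup table in place of the English-to-Chinese model, and the names `add_synthetic_pairs` and `toy_en_to_zh` are invented for illustration only.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (Chinese sentence, English sentence)

# Hypothetical stand-in for a trained English-to-Chinese model.
TOY_EN_TO_ZH: Dict[str, str] = {"good morning": "早上好", "thank you": "谢谢"}


def toy_en_to_zh(sentence: str) -> str:
    """Look the sentence up in the toy table, marking unknown input."""
    return TOY_EN_TO_ZH.get(sentence, "<unk>")


def add_synthetic_pairs(parallel: List[Pair],
                        english_only: List[str],
                        en_to_zh: Callable[[str], str]) -> List[Pair]:
    """Extend Chinese-to-English training data with machine-made pairs."""
    synthetic = [(en_to_zh(english), english) for english in english_only]
    return parallel + synthetic


seed_pairs = [("你好", "hello")]
training_set = add_synthetic_pairs(
    seed_pairs, ["good morning", "thank you"], toy_en_to_zh
)
print(training_set)
```

Swapping the roles of the two languages gives the mirrored step, in which the Chinese-to-English model produces extra pairs for the English-to-Chinese one.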

Finally, there's "agreement regularization." In that, the system produces a translation by reading from left to right and another by reading from right to left: if the two translations match, the result, along with the processes behind it, is deemed more trustworthy. That, it's suggested, encourages a consensus translation.
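As a rough illustration of the agreement idea, the sketch below compares the output of a left-to-right pass with that of a right-to-left pass. It is a simplification under stated assumptions: the function `agreement` and the sample token lists are hypothetical, and the researchers' method is described as shaping training, not as the after-the-fact token comparison shown here.

```python
from typing import List


def agreement(left_to_right: List[str], right_to_left: List[str]) -> float:
    """Share of positions where the two passes produced the same token.

    The right-to-left pass is assumed to emit its tokens last word first,
    so it is reversed before the position-by-position comparison.
    """
    in_reading_order = list(reversed(right_to_left))
    matches = sum(1 for a, b in zip(left_to_right, in_reading_order) if a == b)
    return matches / max(len(left_to_right), len(in_reading_order), 1)


# Hypothetical outputs from the two passes over the same source sentence.
same = agreement(["the", "cat", "sat"], ["sat", "cat", "the"])
differs = agreement(["the", "cat", "sat"], ["slept", "cat", "the"])
print(same, differs)
```

In this toy version, a translation on which both passes fully agree would be treated as more trustworthy than one where they diverge.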

Although machine translation isn't new, translation that captures the nuances traditionally only humans could identify is far rarer. To test its system, Microsoft challenged it with around 2,000 sentences sourced from online newspapers, each of which had a professional translation counterpart. Microsoft then brought in bilingual language consultants to judge how well the AI was doing compared to human translators.

The result was surprisingly good, with the judges deeming the AI's output similar in accuracy and quality to a human's. Reaching that parity is a particularly complex matter, as it requires not only translating specific words but also preserving the context in which they were used. Although there is no single "correct" answer, the system was nonetheless able to produce text that read as cleanly as the human translators' versions.

"Hitting human parity in a machine translation task is a dream that all of us have had," Xuedong Huang, technical fellow in charge of Microsoft's speech, natural language and machine translation efforts, said of the findings. "We just didn't realize we'd be able to hit it so soon."

The goal is to use the technology in Microsoft's translation tools, with a view to expanding it to handle more languages. In addition, the team will be working on speeding up the training process and increasing support for more complicated or niche vocabulary. Eventually, the researchers hope to be able to use such a system for real-time translation.