Google are apparently developing a live, speech-based translation system that would allow multilingual phone conversations to be interpreted in real time. The technology – which Franz Och, Google’s head of translation services, reckons will “work reasonably well in a few years’ time” – combines voice recognition, machine translation and voice synthesis, all of which are available separately from the search giant but have not yet been combined.
“Clearly, for it to work smoothly, you need a combination of high-accuracy machine translation and high-accuracy voice recognition, and that’s what we’re working on. If you look at the progress in machine translation and corresponding advances in voice recognition, there has been huge progress recently. Everyone has a different voice, accent and pitch, but recognition should be effective with mobile phones because by nature they are personal to you. The phone should get a feel for your voice from past voice search queries, for example.” Franz Och, head of translation services, Google
The system would also improve its recognition accuracy over time, both through repeated use by a single user and through Google aggregating data across all users. The company already believes it can use digitized documents and websites to better train its speech-to-speech engine on previously tricky areas, such as grammar.
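
For the curious, the chain described above (recognise the speech, translate the text, speak the result) can be sketched in a few lines of Python. This is only an illustration of the idea, not Google's implementation: the `SpeechTranslator` class, the three stage functions and the tiny `PHRASEBOOK` are all hypothetical stand-ins, and the "audio" here is just UTF-8 bytes so the example can run without any speech libraries.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures; none of these are real Google APIs.
Recognizer = Callable[[bytes], str]    # source-language audio -> text
Translator = Callable[[str], str]      # source text -> target text
Synthesizer = Callable[[str], bytes]   # target text -> audio


@dataclass
class SpeechTranslator:
    """Chains the three stages into one speech-to-speech call."""
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer

    def interpret(self, audio: bytes) -> bytes:
        text = self.recognize(audio)        # voice recognition
        translated = self.translate(text)   # machine translation
        return self.synthesize(translated)  # voice synthesis


# Toy stand-ins so the sketch runs: "audio" is simply UTF-8 text.
PHRASEBOOK = {"hello": "hola", "thank you": "gracias"}


def toy_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_translate(text: str) -> str:
    # Unknown phrases pass through untranslated.
    return PHRASEBOOK.get(text.lower(), text)


def toy_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = SpeechTranslator(toy_recognize, toy_translate, toy_synthesize)
```

Because each stage is passed in as a plain function, any one of them could be swapped for a better model, or one tuned to a particular speaker, without touching the other two.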