Google Translate may once have delivered funny results, but it has steadily improved and is now approaching human-level quality in its translations.
The artificial intelligence behind the service has developed an internal model of its own, which lets it translate between language pairs it was never explicitly trained to handle. Google calls this zero-shot translation.
Artificial intelligences are trained for specific tasks, which usually involves feeding in large amounts of data; in general, the more data is fed into the system, the more accurate it gets. For example, to train the AI to translate between English and Spanish, English sentences and their acceptable Spanish translations are fed into the machine.
Suppose the artificial intelligence is trained to translate between English and Spanish, and then between Italian and Spanish. The machine can then translate between English and Italian fairly accurately on its own, even though it has never seen an English-Italian example. This may seem like a trivial task for humans, but for machines it is particularly challenging. According to Google, this is the first time machines have demonstrated the ability to handle translation tasks they were not previously trained for.
In September, Google switched from Phrase-Based Machine Translation (PBMT) to Google Neural Machine Translation (GNMT) for handling translations between Chinese and English. The Chinese and English language pair has historically been difficult for machines to translate, and Google managed to get its system close to human levels of translation by using bilingual people to train the system.
The developments were announced by Google in a blog post, and were accompanied by a research paper published on arXiv.
In November, Google Translate got its biggest-ever update, with GNMT rolled out for eight language pairs. French, German, Japanese, Korean, Portuguese, Spanish and Turkish joined Chinese as languages paired with English, making GNMT available for 35 percent of the translation requests on Google. Google planned to extend GNMT to all 103 languages in Google Translate. With a separate model for each pair, that would mean feeding in data for 103^2 language pairs, and the artificial intelligence would have to handle 10,609 models (or 10,506 if a language is not paired with itself).
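The arithmetic behind that figure can be checked with a short sketch. The variable names are illustrative; the only number taken from the article is the count of 103 languages, and the one-model-per-pair setup is the assumption being costed.

```python
NUM_LANGUAGES = 103  # languages supported by Google Translate, per the article

# Every (source, target) combination, including a language paired with itself.
all_combinations = NUM_LANGUAGES ** 2

# Ordered pairs of two different languages: English->Spanish and
# Spanish->English are counted as separate translation directions.
directed_pairs = NUM_LANGUAGES * (NUM_LANGUAGES - 1)

# Unordered pairs, if one model could serve both directions of a pair.
undirected_pairs = directed_pairs // 2

print(all_combinations, directed_pairs, undirected_pairs)
```

However the pairs are counted, the number of models grows roughly with the square of the number of languages, which is why a model per pair does not scale.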
Google tackled this problem by allowing a single system to translate between multiple languages. The knowledge required for translation is shared across language pairs, which forces the system to make better use of its modelling power. Once the translation knowledge was shared, curious Google engineers checked whether the AI could translate between language pairs it had never been explicitly trained on. This was the first time machine-based translation had successfully translated sentences using knowledge gained from training to translate other languages.
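Google's research paper describes one way a single system can serve many language pairs: an artificial token naming the target language is added to the start of the input sentence. The sketch below only illustrates that data preparation step, using the English, Spanish and Italian example from earlier. The helper names, the exact token spelling and the sample sentences are assumptions made for illustration, and no translation model is involved.

```python
from typing import NamedTuple


class Example(NamedTuple):
    source_lang: str
    target_lang: str
    model_input: str  # source sentence with the target-language token prepended
    reference: str    # acceptable translation in the target language


def tag_for_target(sentence: str, target_lang: str) -> str:
    """Prepend a token telling the shared model which language to produce."""
    return f"<2{target_lang}> {sentence}"


def make_example(src_lang: str, tgt_lang: str, src_text: str, tgt_text: str) -> Example:
    return Example(src_lang, tgt_lang, tag_for_target(src_text, tgt_lang), tgt_text)


# Hypothetical training data: only English<->Spanish and Italian<->Spanish.
training_data = [
    make_example("en", "es", "Good morning", "Buenos días"),
    make_example("es", "en", "Buenos días", "Good morning"),
    make_example("it", "es", "Buongiorno", "Buenos días"),
    make_example("es", "it", "Buenos días", "Buongiorno"),
]

# Translation directions the shared model has actually seen during training.
trained_directions = {(ex.source_lang, ex.target_lang) for ex in training_data}


def is_zero_shot(src_lang: str, tgt_lang: str) -> bool:
    """A request is zero-shot if that exact direction never appeared in training."""
    return (src_lang, tgt_lang) not in trained_directions


# An English->Italian request reuses the same token scheme at inference time.
request = tag_for_target("Good morning", "it")
print(request)
print(is_zero_shot("en", "it"))
```

Because every direction goes through the same model and differs only in the leading token, an unseen direction such as English to Italian needs no new model, only a combination of source language and target token that never occurred together in training.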
The engineers were curious about how the AI was managing to do this. Using a plotting tool integrated into TensorFlow, they peeked into the mind of the artificial intelligence to see what was going on.
The coloured dots in the image are phrases, with those of similar meaning bunched together. In the zoomed-in view, phrases from three different languages are grouped together. This is evidence that the machine is storing information about the meaning of the phrases, instead of just phrase-to-phrase translations. The models created by the machine show what Google calls a “universal interlingua representation.”
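The idea of phrases with the same meaning sitting close together can be illustrated with a toy sketch. The three-dimensional vectors below are made up purely for illustration and are not taken from Google's model, whose internal representations have far more dimensions; cosine similarity is used here as one common way to measure closeness, not necessarily the measure behind Google's plot.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for the model's internal phrase representations.
toy_embeddings = {
    ("en", "Good morning"): [0.90, 0.10, 0.20],
    ("es", "Buenos días"): [0.80, 0.20, 0.10],
    ("it", "Buongiorno"): [0.85, 0.15, 0.15],
    ("en", "Where is the station?"): [0.10, 0.90, 0.30],
}


def nearest(key: tuple[str, str]) -> tuple[str, str]:
    """Return the other phrase whose vector is most similar to the given one."""
    query = toy_embeddings[key]
    others = [k for k in toy_embeddings if k != key]
    return max(others, key=lambda k: cosine_similarity(query, toy_embeddings[k]))


greeting = ("en", "Good morning")
for other in toy_embeddings:
    if other != greeting:
        score = cosine_similarity(toy_embeddings[greeting], toy_embeddings[other])
        print(other, round(score, 3))
```

In this toy setup the greetings in Spanish and Italian score much higher against the English greeting than the unrelated English question does, which mirrors the grouping by meaning rather than by language described above.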
Google celebrated ten years of machine-based translation earlier this year. Over one billion words are translated every day.