A new AI model from Meta can translate 200 languages, allowing more people to use technology.
Language is our link to the outside world. However, because hundreds of languages lack high-quality translation technology, billions of people today cannot access digital material or take full part in online discussions and communities in their preferred or native languages. This is especially problematic for the hundreds of millions of people who speak the many languages of Asia and Africa, and Meta says it has a solution.
The web is dominated by a few languages, including English, Mandarin, Spanish, and Arabic. Native speakers of these widely spoken languages may take for granted the significance of reading something in their mother tongue. NLLB from Meta will enable more people to read content in their native language, rather than always relying on an intermediary language, which frequently distorts the sentiment or content.
Meta AI researchers created No Language Left Behind (NLLB), an effort to develop high-quality machine translation capabilities for most of the world’s languages. Meta has built a single AI model called NLLB-200, which translates 200 different languages with results far more accurate than what previous technology could accomplish.
On average, the translation quality of NLLB-200 was 44% better than that of earlier AI research. For some African and Indian languages, its translations were more than 70% more accurate than those of recent translation systems.
To effectively assess and improve NLLB-200, Meta created FLORES-200, a dataset that lets researchers gauge the model's performance in 40,000 different language directions.
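To see where a figure of that size comes from, here is a minimal sketch in Python. It assumes that a "language direction" means an ordered pair of distinct languages (for example, Swahili to Oromo counts separately from Oromo to Swahili), and that the 40,000 quoted above is a rounded number. The function name and the short language list are illustrative only and are not part of FLORES-200.

```python
from itertools import permutations


def count_directions(num_languages: int) -> int:
    """Count ordered (source, target) pairs of distinct languages."""
    return num_languages * (num_languages - 1)


# Illustrative subset: four languages mentioned in this article.
languages = ["Javanese", "Uzbek", "Swahili", "Oromo"]

# Enumerate every translation direction among the four languages.
directions = list(permutations(languages, 2))

for source, target in directions:
    print(f"{source} -> {target}")

print(len(directions))        # directions among 4 languages
print(count_directions(200))  # directions among 200 languages
```

Under this assumption, 200 languages give 200 × 199 = 39,800 directions, which is close to the 40,000 figure.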
Furthermore, Meta is awarding grants of up to $200,000 to academics and nonprofit organisations working on projects that support the UN Sustainable Development Goals, in areas such as sustainability, food security, gender-based violence, and education. Researchers in linguistics, machine translation, and language technology, as well as nonprofit organisations interested in using NLLB-200 to translate two or more African languages, are encouraged to apply.
This work can also help advance other technologies, such as developing assistants that work well in languages like Javanese and Uzbek, or developing systems that take Bollywood movies and add accurate Swahili or Oromo subtitles.
As the metaverse takes shape, the ability to create technologies that work well in a wider range of languages will help to democratise access to immersive virtual world experiences.