Between 7,000 and 8,000 languages are spoken across the world today, but only a fraction of them benefit from modern language technologies such as voice-to-text transcription, automatic captioning, instantaneous translation and voice recognition. Researchers at Carnegie Mellon University want to expand the number of languages that have automatic speech recognition tools from about 200 to as many as 2,000.

Xinjian Li, a PhD student at the School of Computer Science’s Language Technologies Institute, says, “Developing technology and a good language model for all people is one of the research goals.” 

Most speech recognition models need two types of data sets: text and audio. Text data exists for many languages, while audio data often does not. The research team intends to eliminate the need for audio data by concentrating on linguistic elements that are common across languages.

Speech recognition technology has traditionally focused on the phonemes of a language. Phonemes are the distinct sounds that distinguish one word from another, such as the "d" that separates "dog" from "log", and they are specific to each language. Languages also have phones, which describe how a word physically sounds, and several phones may correspond to a single phoneme. Hence, separate languages may have different phonemes while their underlying phones are the same.
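The difference between phonemes and phones can be made concrete with a small lookup table. The sketch below is only an illustration: the table name `PHONEME_TO_PHONES`, the helper functions and the tiny inventories are assumptions chosen for the example, not data or code from the CMU team. It shows that English and Hindi organise the sounds [p] and [pʰ] into different phonemes while still sharing the same phones.

```python
# Toy phoneme -> phone tables (hypothetical, heavily simplified inventories).
# English treats [p] and [pʰ] as variants of one phoneme /p/ ("spin" vs "pin");
# Hindi treats them as two separate phonemes.
PHONEME_TO_PHONES = {
    "english": {"/p/": {"p", "pʰ"}, "/t/": {"t", "tʰ", "ɾ"}},
    "hindi": {"/p/": {"p"}, "/pʰ/": {"pʰ"}},
    "spanish": {"/p/": {"p"}, "/ɾ/": {"ɾ"}},
}


def phones(language):
    """Return every phone that appears under any phoneme of the language."""
    return set().union(*PHONEME_TO_PHONES[language].values())


def shared_phones(lang_a, lang_b):
    """Return the phones that two languages have in common."""
    return phones(lang_a) & phones(lang_b)


if __name__ == "__main__":
    # The phoneme sets differ, but the underlying phones overlap.
    print(sorted(PHONEME_TO_PHONES["english"]))
    print(sorted(PHONEME_TO_PHONES["hindi"]))
    print(sorted(shared_phones("english", "hindi")))
```

A recogniser built around the shared phone set could, in principle, reuse what it has learned from one language when it meets the same physical sound in another.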

Instead of focusing on phonemes, the LTI team is developing a speech recognition model that draws on information about how phones are shared between languages, which reduces the need to build a separate model for each language. The team pairs the model with a phylogenetic tree, a diagram that maps the relationships between languages, to help with pronunciation rules. Using the model and the tree structure, the team can approximate the speech model for thousands of languages without audio data.
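One way to picture how a tree of language relationships could stand in for missing audio is to borrow phone inventories from a language's closest relatives. The sketch below is a simplified illustration under stated assumptions, not the researchers' actual method: the tree in `PARENT`, the inventories in `KNOWN_PHONES`, and the nearest-relative voting rule are all hypothetical.

```python
from collections import Counter

# Hypothetical toy tree stored as child -> parent. Not real classification data.
PARENT = {
    "lang_a": "branch_1",
    "lang_b": "branch_1",
    "target": "branch_1",
    "lang_c": "branch_2",
    "branch_1": "root",
    "branch_2": "root",
}

# Hypothetical phone inventories for languages that do have data.
KNOWN_PHONES = {
    "lang_a": {"p", "t", "k", "a", "i"},
    "lang_b": {"p", "t", "a", "i", "u"},
    "lang_c": {"b", "d", "a", "o"},
}


def path_to_root(node):
    """List the nodes from `node` up to the root of the tree."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def tree_distance(a, b):
    """Count the edges between two nodes via their lowest common ancestor."""
    depth_in_b = {node: i for i, node in enumerate(path_to_root(b))}
    for i, node in enumerate(path_to_root(a)):
        if node in depth_in_b:
            return i + depth_in_b[node]
    raise ValueError("nodes are not in the same tree")


def approximate_phones(target, k=2, min_votes=2):
    """Guess a phone inventory from the k nearest languages with known data."""
    nearest = sorted(KNOWN_PHONES, key=lambda lang: tree_distance(target, lang))[:k]
    votes = Counter(ph for lang in nearest for ph in KNOWN_PHONES[lang])
    return {ph for ph, n in votes.items() if n >= min_votes}


if __name__ == "__main__":
    print(sorted(approximate_phones("target")))
```

Here the language called `target` has no inventory of its own, so the guess keeps only the phones that both of its two closest relatives agree on. A real system would need far richer tree data and a learned model rather than a simple vote.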

For Li, the work is about cultural preservation as well as the global availability of language technologies. Li reminds us that language is a prominent part of culture and that each language has its own story. Developing tools such as a speech recognition system is a significant step toward preserving those languages so that their stories are not lost.

DISCLAIMER

The information provided on this page has been procured through secondary sources. In case you would like to suggest any update, please write to us at support.ai@mail.nasscom.in