There was a time when it was customary to pick up an Encyclopaedia Britannica to learn more about the world around us. Today, we have Google and Wikipedia, the ubiquitous ways to find any information we may need, be it the state of financial markets, the latest smartphone launch, real estate trends or even what your local politicians are up to. Technology has gone a step further, offering granular assistance in formats beyond text. We truly live in the information age: we can pick from videos, podcasts, blogs, websites, social media and much more.

But what about a farmer in a village? Or a truck driver who spends most of his time travelling dreary roads and highways? India has around half a billion smartphone users, and 304 million of them are in rural India. Despite the surge in smartphone users and internet penetration, most users in this bracket have not been able to capitalise on the opportunity. How can they access information in a way that makes the most sense to them? This was the very question that came to the mind of Ananth Nagaraj. A career techie, Nagaraj once met a farmer who had no information about the market prices of crops. It confounded Nagaraj: how was this man going to sell his crops with so little access to market information that was already available on the Internet? Thus began the journey of Gnani.ai.

Gnani.ai, cofounded by Ananth Nagaraj and Ganesh Gopalan, was born with the vision to bridge the technology divide in India. “This incident with the farmer moved me deeply. The cognizance of Indians not being able to leverage the power of the internet dawned on me. As we started discussing how to bridge this divide, it struck us how conversations are a natural form of communication. We envisaged the benefits of people being able to interact with machines,” says Nagaraj.

But India is a land of languages. With 22 official languages and many other regional languages and dialects, it is not easy to build a generic speech engine. Gopalan and Nagaraj decided to take on the Herculean task of empowering Indians to converse with machines in their preferred language. Today, Gnani.ai has speech engines for 12+ major Indian languages, with conversational bots that act as the touchpoints in the democratization of the Internet. 

As a deep tech company specialising in conversational AI, Gnani.ai has proprietary ASR (automatic speech recognition) and NLP engines. Among its more recent developments are on-device voice models. “The exciting thing about this tech stack is that it allows for voice assistants to be deployed locally, negating the need for an Internet connection. The typical use case for this includes IoT devices and the automobile industry. This product is a low footprint all neural speech recognition module with an extensive vocabulary, making our speech models less than 75 MB in size without compromising on accuracy,” explains Nagaraj.

Gnani.ai’s conversational AI platform offers customer automation and omnichannel analytics solutions to customers across BFSI, FMCG and healthcare, among other sectors, in 20+ languages globally.

Building India’s Foremost NLP Engines & Its Challenges:

NLP, or natural language processing, is the ability of a machine to understand conversations the way we do. Written and spoken language differ widely, and more so in vernacular languages, so a robust NLP module is essential alongside a speech engine. “We have come across several instances where the transcription of an utterance did not make much sense but meant something in spoken language. In such scenarios, the meaning needs to be extracted contextually. Due to these intricacies, building a robust NLP model required a lot of trial and error,” says Nagaraj. The startup has a large team of speech scientists, NLP engineers, and linguists working together to make machines understand the nuances of spoken language.

The challenge lies not only in collecting data but in labelling it. For example, when users want to know their account balance, no two of them ask for it in the same way. Extrapolate that to the entire population speaking the same language, and you get a thousand variations that share the same intent but are worded differently.
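To make the labelling problem concrete, here is a minimal Python sketch of what labelled intent data can look like. It is purely illustrative and is not Gnani.ai's method: the utterances, the intent names (`check_balance`, `list_transactions`) and the word-overlap matcher are all assumptions made for this example, and a production system would use a trained model rather than simple token overlap.

```python
# Hypothetical labelled data: several phrasings, each tagged with its intent.
LABELLED_UTTERANCES = [
    ("what is my account balance", "check_balance"),
    ("how much money do i have", "check_balance"),
    ("tell me the balance in my account", "check_balance"),
    ("show my last five transactions", "list_transactions"),
    ("what did i spend recently", "list_transactions"),
]


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word overlap between two token sets, from 0.0 (none) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict_intent(utterance):
    """Return the intent of the labelled example most similar to the utterance."""
    tokens = tokenize(utterance)
    best_text, best_intent = max(
        LABELLED_UTTERANCES,
        key=lambda pair: jaccard(tokens, tokenize(pair[0])),
    )
    return best_intent


if __name__ == "__main__":
    print(predict_intent("how much balance do i have"))
    print(predict_intent("show recent transactions"))
```

Even in this toy version, a new phrasing is only handled when it resembles something already labelled, which is why coverage of the many ways people ask the same question matters so much.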

The core of every voice application is a speech engine. However, there is no one-size-fits-all speech model. Every sector has terminology exclusive to it, and Gnani.ai offers domain-specific models for each industry. For instance, the e-commerce industry uses several terms not generally associated with, say, the telecom industry. For sector-specific terms and jargon to be recognized accurately by speech modules, minor changes are made to the voice applications.

Helping Samsung Reach 126Mn Indian Users Through Voice:

Samsung Ventures has made a strategic investment in Gnani.ai, whose vernacular language capabilities power Samsung’s virtual assistant Bixby for the Indian smartphone market, in which Samsung has a 26% share. As a speech start-up with speech-to-text and text-to-speech engines for 12+ major Indian languages, Gnani.ai is bringing Samsung Bixby in Indian languages to nearly 156 million users through voice.

DISCLAIMER

The information provided on this page has been procured through secondary sources. In case you would like to suggest any update, please write to us at support.ai@mail.nasscom.in