Mr. Anand Raman takes time off from his busy schedule to share some of the cutting-edge work that Microsoft is doing right out of India. He speaks on ways in which speech and machine translation are empowering people and integrating the nation despite its linguistic diversity. Every idea is punctuated with the need for scalability and how these powerful use cases seen in India can be mirrored across the world. He also takes us through a very interesting personal journey – his foray into AI almost a decade back.

Q. Please share your thoughts on Microsoft's approach to AI, as well as where we are heading. Microsoft has made commendable efforts and driven initiatives and R&D in NLP and accessibility, so we would like you to cover that as well, especially since NLP is huge from an Indian market standpoint. Please also touch upon leadership and diversity.

I have just relocated back to India to lead the products in what we call the Research Technology Center, India. There are two aspects. I lead the relevance and ranking as well as the trustworthy aspect in computing, and we do so in great depth. And on the cloud side, Azure, we look at speech and NLP, which includes the language model as well as the translations. We have also started to invest in computer vision. We partner with our teams at the US headquarters to deliver on these areas, address the gaps, and, of course, always remain focused on the larger India market.

Q. What got you into AI, in the first place? You had a remarkable 20-year career in Redmond which only a few can aspire for. What was your inspiration?

In the initial stage of my career, I worked on developer tools, Visual Studio, and subsequently on Windows 7 and so on. Thereafter, in 2005, I started a database unit. One of the biggest issues we faced was that our products took a disproportionate amount of time to ship. We were graduating from on-premises to the cloud environment, which was a paradigm shift. I got the opportunity to lead the engineering systems at the database unit, and essentially to look at how to componentize products into tiny fragments so that they could be shipped as microservices.

I spent a good eight years in the database unit and I saw this trend emerging. Around that time, Microsoft brought in a new leader from Amazon. We wondered how someone with a retail background would fit into Microsoft. Internally, he started a conference on Machine Learning & AI in which I participated as a volunteer. Incidentally, it’s still running and is in its tenth year. I got hooked on AI, and by the second year, I started to work with him.

I reported to Joseph as a technical advisor and Chief of Staff. I also got the opportunity to work on the broader strategy while driving the learning behind machine learning. The experience was ground-up, and I saw these disciplines (AI, ML) evolve rapidly, along with the diverse ways in which they could improve and expand. They were also used in our collaboration with Microsoft Research in many ways. We built a set of foundational models for each of these areas. A Stanford paper featured these foundational models, and I would be happy to share it with you. So essentially the models for computer vision, speech, and language have become the core foundations. And I think they are almost becoming a commodity: every vendor out there is reaching human parity on these foundations. So, to answer your question, the growth of data and the use of that data to build these foundational models got my interest going, and then I started working in this area. It's only been five or six years. I still feel I do not know a lot. There's so much to learn, and the industry is moving so fast. It's very interesting when I look back.

Q. Microsoft started focusing early on the user and the cloud, which enabled it to build these capabilities, as you pointed out. In your current role in India, please tell us about the key initiatives you are working on.

Oh, absolutely. I think one of the largest initiatives for us is about driving empowerment. How do you do this? Especially in India, what attracted me a lot is the language and the diversity we have. Europe also poses a similar challenge with 24 languages. However, unlike Europe, the interesting challenge we have is code-mixing: bilingual and trilingual elements can appear in the same sentence.

One of the trends I saw was how the speech market in India has taken off. If you look at highly resourced languages – the ones with many speakers – there’s much more data available to train the models. And there are more investments as well to train these models in English, Chinese, Spanish, Arabic, right? These are the highly resourced languages. Indian languages are not as highly resourced, even though a lot of people speak them. And when I looked at the trends in the data, to my surprise, speech recognition here is the number two market after English.

For our products, the number two market is in India. Voice-enabled technology and speech recognition are a growing phenomenon. We have some of the largest contact centers, and the widespread adoption never ceases to amaze me. We are essentially focused on bilingual, code-mixed technologies for enterprise customers. We focus specifically on Indian languages; currently, we are very good in the top 13 languages in India, and we continue to invest. We also focus on machine translation.

Microsoft has a machine translation product that covers over a hundred languages, and we are also looking at the Indian aspects of it. How do we deal with these challenges in real-time translation? For instance, how do we teach the machines to handle a language switch? That's the trend that machine translation and language models have to adapt to. And a huge market is playing out. Keeping this in mind, we have to invest accordingly while also deploying AI in other areas. For instance, there is a big market as well for document intelligence. So the three trends I'm seeing here are speech recognition, speech-wise machine translation, and document intelligence.

All three have NLP at the core. Speaking about empowerment, the language translation system is very important to uplift the bottom of the pyramid. There are people who can use these smart devices, but they don't always have the literacy background to use them in English. For India, this is one of the biggest challenges: how do we make Indian languages as easy as English for people to use when communicating with intelligent devices? And how do you overcome this challenge with the data assets?

Universalization of models is what we are working towards. Our Head of Speech Technology is very experienced and his team is equally qualified. We have invested in techniques of universalization and created a corpus of data for low-resource languages. It's a challenge, but I think that's essentially where the innovative techniques come into great use.

Q. Apart from the language technology, what do you think will be crucial going forward and from an empowerment standpoint? Which are the other areas where AI and machine learning can make an impact?

These technologies also help in preserving heritage through language translation. You should see some of the work we are doing in MSR, Bangalore. Removing the language barrier through technology is a massive step forward in the empowerment and integration of different people. People can now be brought a lot closer.

We talked about the foundational models. You can build on top, and we are investing in different vertical segments such as finance, healthcare, agriculture, etc.

Another focus area is AI for Good. We are using AI to help people who are physically impaired (sight, hearing, etc.) or young people who have dyslexia (learning enabled through immersive tools), helping them lead lives without friction to the extent possible. These technologies keep getting better and, in the process, the inputs are also of higher quality.

Tech has many applications – retail, agriculture (soil study), healthcare, manufacturing (computer vision for safety), voice-enabled technology, and more. Different scenarios, different applications – they are all positively impacting human lives through empowerment and enabling greater accessibility. And all of this is being done at scale.

Q. I have one last question. You worked for 20 years in the US, and now you head the India Development Center. So what is the difference in approach?

What fascinated me after I came here was seeing that India is super rich in digital infrastructure. The India Stack in particular, and how it enables mobile payments, is just one example. In the US, people still carry cash and credit cards.

The scenarios are different, though. In India, it’s about empowering a billion people. And it’s being done in so many areas – how do you use this as a reference picture and take it to other countries? That is a big challenge in front of us. So that's the area I'm fascinated by: how do our products fit into these things? The goal is always about scaling and impacting billions of lives.

Jibu Elias, NASSCOM: Thank you very much for speaking with us. It was so very insightful. I wish you all the best in this journey of empowering the next billion.
