Vikas Joshi is a machine learning researcher with more than a decade of experience across research and industry at companies such as Microsoft, Amazon, and IBM India Research Labs (IRL). He has substantial expertise in developing production-ready, large-scale automatic speech recognition (ASR) systems using both traditional and end-to-end neural models. Vikas also developed the language model for Alexa's India launch.

INDIAai spoke with Vikas, a Principal Applied Researcher at Microsoft, to learn about his views on artificial intelligence.

It's great to hear that you earned your undergraduate degree in electronics and communication engineering. What motivated a student of electronics and communication engineering to pursue a career in machine learning?

My journey in machine learning started at IIT Madras, somewhat accidentally and not by choice. As an undergraduate electronics and communication engineering student, I was more interested in signal processing than machine learning. This was back in 2010, when machine learning was not that popular. My advisor, Prof. Umesh Srinivasan, introduced me to machine learning and advised me to take a course on pattern recognition taught by Prof. C. Chandra Sekhar at IIT Madras. The course was taught so well that I decided to conduct my research in this field. As my research progressed, I saw some success, with papers published at reputable conferences like Interspeech and ICASSP. By the end of my stint at IIT Madras, I was clear about and determined to pursue a career in machine learning.

What were the earliest phases of your research like? Could you describe any of the difficulties you encountered during that time?

My research journey started on a high note, with four papers accepted at top-tier conferences in the first two years of my research. Let me share some advice from my uncle that helped me in the initial phases. My uncle, who holds a doctorate from Penn State University, advised me to spend a lot of time with senior PhD scholars in the first months of my research. I helped my seniors with their experiments and proactively involved myself in their technical discussions. This experience helped me ramp up faster and identify the niche problems to be solved. I also worked towards establishing a good rapport with my advisor, senior PhD scholars, and my peers. Moreover, my advisor was very supportive and created a conducive research environment in our lab. With all these aspects coming together, I had the best possible start to my research journey.

Have you ever felt low in your 9+ years of research and industry experience as a machine learning researcher? How did you get through it?

My research journey has been a roller coaster ride. There were times during my PhD when nothing seemed to work. My ideas and experiments did not yield the expected results, and all the papers I submitted to conferences and journals during that time were rejected. While I was disappointed with the rejections, I enjoyed the journey. I liked trying new ideas, experimenting with different modelling approaches, and having long discussions and debates with my peers. I reflected on the reviewers' feedback and acted upon it. During the low times, I think it is important to keep enjoying the journey and, at the same time, to reflect on the failure and improve upon it. It is easy to get demotivated during tough times, and that can lead to a downward spiral, because it is hard to succeed when you are demotivated.

Is it necessary to know programming to conduct AI research? If that's the case, what programming language is needed? 

It is necessary to know high-level programming languages such as Python and deep learning frameworks such as PyTorch and TensorFlow. Most organizations also look for a decent understanding of data structures and algorithms.  

Who is your role model in AI? 

I have been inspired by different people at different stages of my career. At the start of my research career, I looked up to my advisor, Prof. Umesh, who is an expert in statistical speech recognition and humble at the same time. Later, I was particularly impressed with the work of Alex Graves. He proposed recurrent neural network transducer models for speech recognition, which started a new speech modelling paradigm now popularly known as end-to-end or on-device speech recognition models.

What do you actually do in your organization as a Principal Applied Researcher? 

I work on the Microsoft Speech team, and we focus on improving speech recognition accuracy for Indian languages. In addition, I lead the acoustic modelling and on-device speech modelling efforts. As a Principal Applied Researcher, I am involved in defining projects, generating ideas, and executing them. We also contribute to intellectual property by filing patents and publishing our research at top-tier speech conferences.

Can you briefly describe your research area? What will you focus on in the future?

My research focuses on advancing the state of the art in bilingual and code-mixed speech recognition. Recently, I have also been exploring unsupervised learning for speech recognition.

What advice would you provide to people interested in pursuing a career in artificial intelligence? 

A career in artificial intelligence is rewarding on many fronts, and roles like data scientist or applied scientist are excellent career choices. However, like any other domain, it requires a specific set of skills. To begin with, become proficient in a high-level programming language like Python and one of the deep learning frameworks, such as PyTorch or TensorFlow. Then, depending on the job profile, you need to learn general machine learning concepts and some advanced topics like natural language processing, speech recognition, or image processing. It is also essential to know the fundamental concepts of linear algebra and probability theory, which are often ignored. NPTEL offers courses on linear algebra and probability theory and is a great source to learn from.

While theoretical understanding is necessary, gaining hands-on experience is of paramount importance too. To do so:

  1. Pick a real-world scenario and explore different machine learning models on its dataset.
  2. Get a feel for how these models behave in practice, and relate their behaviour on this real-world dataset to your theoretical understanding of each model.
  3. If a model does not behave as expected, dig deeper until you can explain the behaviour.

In machine learning and artificial intelligence, the urge to go deep and the ability to explain your model's behaviour are essential for long-term success.
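
As an illustration of the three steps above, here is a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the interview: the synthetic two-cluster dataset stands in for a real-world one, and the two models (a majority-class baseline and a 1-nearest-neighbour classifier) and all function names are hypothetical choices. It uses only the standard library, so it runs without installing a framework.

```python
import random


def make_dataset(n_per_class=50, seed=0):
    """Synthetic 2-D points (an assumption): class 0 near (0, 0), class 1 near (3, 3)."""
    rng = random.Random(seed)
    data = []
    for label, centre in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            point = (rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0))
            data.append((point, label))
    rng.shuffle(data)
    return data


def majority_baseline(train):
    """Model 1: always predict the most frequent training label."""
    labels = [label for _, label in train]
    majority = max(set(labels), key=labels.count)
    return lambda point: majority


def nearest_neighbour(train):
    """Model 2: predict the label of the closest training point."""
    def predict(point):
        closest = min(
            train,
            key=lambda item: (item[0][0] - point[0]) ** 2
            + (item[0][1] - point[1]) ** 2,
        )
        return closest[1]
    return predict


def accuracy(predict, test):
    """Fraction of held-out points whose label is predicted correctly."""
    correct = sum(predict(point) == label for point, label in test)
    return correct / len(test)


# Step 1: take a dataset and try different models on it.
data = make_dataset()
train, test = data[:70], data[70:]

# Step 2: compare observed behaviour with what theory suggests.
results = {
    "majority_baseline": accuracy(majority_baseline(train), test),
    "nearest_neighbour": accuracy(nearest_neighbour(train), test),
}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

Because the classes are balanced and the clusters are well separated, theory suggests the baseline should score around 50% and the nearest-neighbour model much higher. If the printed numbers disagreed with that expectation, step 3 would apply: inspect the split, the distance function, and the data until the behaviour can be explained.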

Can you recommend some AI-related papers and books to us? 

There are numerous papers and books published on AI. I used Pattern Recognition and Machine Learning by Christopher Bishop to learn fundamental machine learning concepts. Numerous exciting papers are published every year, and it is hard to select a few because the choice depends on your area of interest. There are now forums where users rank papers, and they are excellent places to look for trending and state-of-the-art research.
