Deep learning is changing how we interact with the world. It is revolutionising areas such as UAVs (unmanned aerial vehicles), self-driving cars, and speech recognition. There are fundamentally three types of networks within deep learning: the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and the Artificial Neural Network (ANN).

CNNs are mainly used for image data and tasks such as computer vision, facial recognition, and text digitisation; ANNs are the general-purpose building block; and RNNs are used primarily for NLP, although they have several other use cases as well.

Recurrent neural networks (RNNs) are among the most promising algorithms for sequential data; they power features such as Apple's Siri and Google's voice search. They were among the first algorithms able to remember their inputs with the help of an internal memory, which makes them well suited to predictions over sequences.

This makes them a good fit for machine learning problems involving sequential data such as time series, speech, text, financial data, audio, video, weather, and much more.

Let's get started and understand how an RNN works and what its application areas are in NLP and a few other domains. For that, we must first touch upon the basics of ANNs (Artificial Neural Networks). An ANN usually involves a large number of processing units operating in parallel and arranged in tiers: the first tier receives the raw input, each subsequent tier takes as its input the output of the preceding tier, and the last tier produces the output of the system.
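To make the tiered structure concrete, here is a minimal sketch of a small feed-forward ANN in NumPy. The layer sizes, random weights, and activation are illustrative assumptions rather than anything from this article; the point is only that each tier's output becomes the next tier's input.

```python
import numpy as np

def relu(x):
    # Simple non-linearity applied within the hidden tier
    return np.maximum(0, x)

# Illustrative sizes: 4 raw input features -> 8 hidden units -> 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)   # first tier (receives the raw input)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # last tier (produces the output)

def forward(x):
    h = relu(x @ W1 + b1)   # output of tier 1 becomes the input of tier 2
    y = h @ W2 + b2         # final tier produces the output of the system
    return y

print(forward(rng.normal(size=(1, 4))))  # one example input with 4 features
```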

To get a better idea, we shall understand some real-life use cases where sequence models or RNN are useful. 

  • We all use Google, and by now we are used to its autocomplete feature. Google embeds RNNs to power this auto-complete.
  • Another use case is translation. Google Translate can translate sentences from one language to another using RNNs at the backend.
  • The next use case is Named Entity Recognition: a subtask of information extraction that locates named entities mentioned in unstructured text and classifies them into pre-defined categories such as person names, organisations, locations, medical codes, time expressions, quantities, monetary values, and percentages.
  • Sentiment analysis is another beautiful use case, where a given piece of text is used to extract the emotion, tone, or sentiment it conveys; star ratings, for example, can be decided from the sentiment of a review. These are called sequence modelling problems, since the sequence of words matters in human communication (a minimal sketch of such a model follows this list).
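As an illustration of the sentiment-analysis case, below is a minimal sketch of an RNN-based classifier using the Keras API from TensorFlow. The vocabulary size, sequence length, and layer sizes are assumptions chosen for the example, and the model is untrained; it only shows how a sequence of word indices flows through an embedding, a recurrent layer, and a sigmoid output.

```python
import numpy as np
import tensorflow as tf

VOCAB_SIZE = 10_000   # assumed vocabulary size
MAX_LEN = 50          # assumed (padded) review length in tokens

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 32),       # word indices -> dense vectors
    tf.keras.layers.SimpleRNN(32),                    # reads the review token by token, keeping a hidden state
    tf.keras.layers.Dense(1, activation="sigmoid"),   # probability that the sentiment is positive
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# A fake batch of two tokenised reviews, just to show the expected shapes
fake_reviews = np.random.randint(0, VOCAB_SIZE, size=(2, MAX_LEN))
print(model.predict(fake_reviews))  # two untrained sentiment scores in [0, 1]
```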

In an RNN, the information cycles through a loop: when the network makes a decision, it considers the current input as well as what it has learned from the inputs it received previously. This lets an RNN model a sequence of data in which each sample can be assumed to depend on the previous ones.
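The loop can be written down in a few lines. Below is a minimal sketch of the standard recurrence for a vanilla RNN cell, with made-up sizes and random weights: at every time step the same weights combine the current input with the previous hidden state, so the hidden state acts as the network's memory.

```python
import numpy as np

rng = np.random.default_rng(1)
INPUT_SIZE, HIDDEN_SIZE = 3, 5            # illustrative sizes

W_xh = rng.normal(size=(INPUT_SIZE, HIDDEN_SIZE))   # input -> hidden weights
W_hh = rng.normal(size=(HIDDEN_SIZE, HIDDEN_SIZE))  # hidden -> hidden (the "loop")
b_h = np.zeros(HIDDEN_SIZE)

def rnn_step(x_t, h_prev):
    # h_t depends on the current input AND on everything seen so far via h_prev
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

h = np.zeros(HIDDEN_SIZE)                    # empty memory before the sequence starts
sequence = rng.normal(size=(4, INPUT_SIZE))  # a toy sequence of 4 time steps
for x_t in sequence:
    h = rnn_step(x_t, h)                     # the same weights are reused at every step
print(h)
```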

A question that often comes up is: why not simply use an ANN for sequence problems in the first place?

The answer lies in a few issues that arise when an ANN is used for sequence problems (the sketch after the list below illustrates the first and third of these):

  • Variable size of input/output
  • Too much computation
  • No parameter sharing
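To see the first and third problems concretely, here is a small illustrative comparison in NumPy (the sizes are assumptions). A plain feed-forward layer needs a fixed-size input and learns a separate weight for every input position, whereas the recurrent step from earlier reuses the same weights however long the sequence is.

```python
import numpy as np

rng = np.random.default_rng(2)
FEATURES, HIDDEN = 3, 5

# Feed-forward (ANN) layer: the weight matrix is tied to ONE fixed sequence length.
SEQ_LEN = 4
W_dense = rng.normal(size=(SEQ_LEN * FEATURES, HIDDEN))  # 4*3 = 12 input weights per unit
# A 7-step sequence simply does not fit: 7*3 inputs cannot multiply a (12, 5) matrix.

# Recurrent layer: one shared set of weights, applied at every time step.
W_xh = rng.normal(size=(FEATURES, HIDDEN))
W_hh = rng.normal(size=(HIDDEN, HIDDEN))

def rnn_encode(sequence):
    h = np.zeros(HIDDEN)
    for x_t in sequence:                    # works for any number of time steps
        h = np.tanh(x_t @ W_xh + h @ W_hh)  # same parameters shared across steps
    return h

print(rnn_encode(rng.normal(size=(4, FEATURES))).shape)  # (5,)
print(rnn_encode(rng.normal(size=(7, FEATURES))).shape)  # (5,) -- longer sequence, same weights
```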

However, RNNs come with issues of their own: they are challenging to train, and they suffer from vanishing and exploding gradients.
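A rough intuition for why this happens: backpropagating through a long sequence multiplies many per-step factors together, so the overall gradient either shrinks towards zero or blows up. The numbers below are purely illustrative stand-ins for those per-step factors.

```python
# Toy illustration: multiplying 50 per-step factors together.
steps = 50
small_factor, large_factor = 0.9, 1.1   # stand-ins for per-step gradient contributions

vanishing = small_factor ** steps
exploding = large_factor ** steps
print(f"0.9^{steps} = {vanishing:.6f}   (gradient vanishes)")
print(f"1.1^{steps} = {exploding:.2f}   (gradient explodes)")
```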

The good news is that both of these issues can be addressed with the LSTM (Long Short-Term Memory) architecture, which we shall talk about soon.
