Natural language processing (NLP) is one of the most discussed topics in data science, and its adoption has risen sharply over the past few years. Enterprises increasingly use it to analyze data and gather insights that give them a competitive advantage, while data enthusiasts explore its applications to help businesses succeed. The best-known business applications of NLP are chatbots and virtual assistants, auto-correct, machine translation, text summarization, speech recognition and sentiment analysis. However, there are many other applications of NLP that businesses can leverage to improve their products or services.

In recent years, filings of AI-related patents have increased across India, and NLP heads the list. Machine translation is itself being used to translate the many global patents that are published annually, and NLP is making that translation both faster and more accurate. For a better understanding of NLP patents, let us consider a few in detail.

Techniques for facilitating optical character recognition (OCR) in documents

Patent filed by: Vijay Yellapragada, Peijun Chiang, Sreeneel K. Maddika

Date: 3rd July 2018

OCR technology captures data from physical documents such as letters and invoices, and from images, that are part of many daily business activities. OCR tools create digital copies of such documents and extract their data into structured formats, which makes the data easier to sort, search and edit. Of late, OCR has become a top priority for many businesses because it automates manual processing, reduces costs and makes information readily available. Along these lines, Intuit Technologies has obtained a patent on OCR in structured documents. The inventors are credited with designing a method that identifies one or more regions in an electronic document on which to perform OCR. For example, a method for identifying information in an electronic document includes obtaining a set of training documents for each template of a plurality of templates for the electronic document, extracting spatial attributes for at least a first label region and at least a first corresponding value region from the set, and training a classifier model based on the extracted spatial attributes, wherein the classifier model is used to identify the information in the electronic document.
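The idea of training a classifier on spatial attributes of label and value regions can be illustrated with a small sketch. This is not the patented method: a nearest-centroid classifier stands in for whatever model the inventors use, and the `Box` type, the template names `value_right` and `value_below`, and all coordinates are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Box:
    """A region on the page; all values are fractions of the page size."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def spatial_attributes(label: Box, value: Box) -> list:
    """Describe a label/value pair by position, size and relative offset."""
    return [label.x, label.y, label.w, label.h,
            value.x - label.x, value.y - label.y, value.w, value.h]


def train(training_set: dict) -> dict:
    """Average the attribute vectors of each template into one centroid."""
    model = {}
    for template, pairs in training_set.items():
        vectors = [spatial_attributes(label, value) for label, value in pairs]
        model[template] = [mean(column) for column in zip(*vectors)]
    return model


def classify(model: dict, label: Box, value: Box) -> str:
    """Return the template whose centroid is closest to the candidate pair."""
    features = spatial_attributes(label, value)

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda template: squared_distance(model[template]))


# Hypothetical training documents: two annotated pairs per template.
training_set = {
    "value_right": [  # value region sits to the right of its label
        (Box(0.10, 0.20, 0.15, 0.03), Box(0.30, 0.20, 0.20, 0.03)),
        (Box(0.12, 0.22, 0.15, 0.03), Box(0.32, 0.22, 0.20, 0.03)),
    ],
    "value_below": [  # value region sits underneath its label
        (Box(0.60, 0.10, 0.15, 0.03), Box(0.60, 0.14, 0.20, 0.03)),
        (Box(0.62, 0.11, 0.15, 0.03), Box(0.62, 0.15, 0.20, 0.03)),
    ],
}

model = train(training_set)
predicted = classify(model,
                     Box(0.11, 0.21, 0.15, 0.03),
                     Box(0.31, 0.21, 0.20, 0.03))
print(predicted)
```

Once a template is recognised, its expected label-to-value offset could be used to decide where on the page OCR should be run, which is the step the patent summary describes as identifying regions for OCR.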

Extracting searchable information from a digitized document

Patent filed by: Prakash Ghatage, Nirav Sampat, Kumar Viswanathan, Suvendu Kumar Mahapatra, Srikanth Narayanan, Rekha Mani, Aravind Krishnan, Rahul Kotnala, Kameshkumar Lakshminarayanan, Ashish Jain

Date: 11th June 2019

A similar patent on extracting data from documents has been filed by Accenture Global Solutions. It discloses data extraction and automatic validation for digitized documents in non-editable formats. The inventors describe processes through which paper documents are digitized, that is, converted into formats suitable for storage on computers or other digital devices. Each digitized document is then classified into one of a plurality of document types, and document processing rules are selected for that type. These rules drive the data extraction and automatic validation. Next, the positions and values of the data fields in the digitized document are obtained using machine learning techniques. The data field values are automatically validated and assigned confidence scores, and fields with low scores are flagged for manual review.
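The sequence described above (classify the document, select rules for its type, extract fields, validate them, score them and flag low scores) can be sketched as follows. This is only an illustration under stated assumptions: regular expressions and a keyword check stand in for the machine learning techniques mentioned in the patent, and the `RULES` table, the confidence values and `REVIEW_THRESHOLD` are hypothetical.

```python
import re

# Hypothetical rules per document type:
# field name -> (extraction pattern, validation pattern).
RULES = {
    "invoice": {
        "invoice_no": (r"Invoice No[:\s]+(\S+)", r"[A-Z]{3}-\d{4}"),
        "total": (r"Total[:\s]+(\S+)", r"\d+\.\d{2}"),
    },
    "receipt": {
        "amount": (r"Amount[:\s]+(\S+)", r"\d+\.\d{2}"),
    },
}

REVIEW_THRESHOLD = 0.5  # assumed cut-off below which a human checks the field


def classify_document(text):
    """Toy keyword classifier standing in for a trained model."""
    return "invoice" if "invoice" in text.lower() else "receipt"


def process(text):
    """Classify the text, then extract, validate and score each field."""
    doc_type = classify_document(text)
    results = {}
    for field, (extract_re, valid_re) in RULES[doc_type].items():
        match = re.search(extract_re, text)
        value = match.group(1) if match else None
        if value is None:
            confidence = 0.0  # field not found at all
        elif re.fullmatch(valid_re, value):
            confidence = 0.9  # found and well-formed
        else:
            confidence = 0.3  # found but malformed, e.g. an OCR error
        results[field] = {
            "value": value,
            "position": match.start(1) if match else None,
            "confidence": confidence,
            "needs_review": confidence < REVIEW_THRESHOLD,
        }
    return doc_type, results


# The total contains the letter "O" instead of a zero, as OCR might produce.
sample = "Invoice No: INV-0042\nTotal: 12O.50"
doc_type, results = process(sample)
for field, info in results.items():
    print(field, info)
```

In this sample the invoice number passes validation, while the malformed total receives a low score and is flagged for manual review.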

Extracting action of interest from natural language sentences

Patent filed by: Arindam Chatterjee, Kartik Subodh Ballal

Date: 13th October 2020

Identifying user intent, that is, what the user wants, is important when processing any query. Intelligent systems that use NLP and natural language understanding break a user query down into a more granular form that combines an action (a verb) and a centre of interest on which the action needs to be performed. Correct identification of the centre of interest is important, as an error might lead to misinterpretation of the intent. The inventors from Wipro Technologies have devised a method for extracting the Action of Interest (AOI) from natural language sentences. The method includes creating, by an AOI processing device, an input vector comprising a plurality of parameters for each target word in a sentence inputted by a user, wherein the plurality of parameters for each target word comprise a Part of Speech (POS) vector associated with the target word and at least two words preceding the target word, a word embedding for the target word, a word embedding for a headword of the target word in the dependency parse tree of the sentence, and a dependency label for the target word.
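To make the structure of that input vector concrete, here is a minimal sketch of how the four parts might be assembled for each target word. Everything specific in it is an assumption: the sentence is annotated by hand rather than by a parser, the tag sets are tiny, `toy_embedding` is a deterministic placeholder for a trained word embedding, and the root word is treated as its own head.

```python
from typing import NamedTuple

POS_TAGS = ["PAD", "VERB", "DET", "NOUN"]  # "PAD" marks a missing preceding word
DEP_LABELS = ["ROOT", "det", "dobj"]
EMBED_DIM = 4


class Token(NamedTuple):
    text: str
    pos: str   # part-of-speech tag
    head: int  # index of the headword in the dependency parse tree
    dep: str   # dependency label


def one_hot(item, vocabulary):
    """Encode an item as a one-hot list over the given vocabulary."""
    return [1.0 if item == entry else 0.0 for entry in vocabulary]


def toy_embedding(word, dim=EMBED_DIM):
    """Deterministic placeholder for a trained word embedding."""
    return [sum(ord(ch) for ch in word[i::dim]) / 1000.0 for i in range(dim)]


def build_input_vector(tokens, index):
    """Concatenate the four parts of the input vector for one target word."""
    vector = []
    # 1. POS vector: two preceding words, then the target word itself.
    for offset in (2, 1, 0):
        position = index - offset
        tag = tokens[position].pos if position >= 0 else "PAD"
        vector.extend(one_hot(tag, POS_TAGS))
    target = tokens[index]
    # 2. Word embedding for the target word.
    vector.extend(toy_embedding(target.text))
    # 3. Word embedding for the headword of the target word.
    vector.extend(toy_embedding(tokens[target.head].text))
    # 4. Dependency label of the target word.
    vector.extend(one_hot(target.dep, DEP_LABELS))
    return vector


# Hand-annotated parse of "book a flight" (the root points to itself).
sentence = [
    Token("book", "VERB", 0, "ROOT"),
    Token("a", "DET", 2, "det"),
    Token("flight", "NOUN", 0, "dobj"),
]

vectors = [build_input_vector(sentence, i) for i in range(len(sentence))]
print(len(vectors), len(vectors[0]))
```

Each vector here has 3 × 4 POS entries, two 4-dimensional embeddings and 3 dependency-label entries. In a real system these vectors would be fed to a trained model that labels each word as part of the action of interest or not.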

DISCLAIMER

The information provided on this page has been procured through secondary sources. In case you would like to suggest any update, please write to us at support.ai@mail.nasscom.in