Computers compute, naturally, but can they see? Computer vision, a field of artificial intelligence, makes it possible. The area is developing at a rapid pace (even by AI standards), and the mesmerizing results on display today are a far cry from the days when it took a million images to train a machine to distinguish between four-legged mammals.

The rudimentary idea has been around since the 1950s, but the renaissance in computer vision is due to a convergence of powerful technological advances: built-in cameras in mobile devices, brute computing power, hardware designed for computer vision and analysis, and convolutional neural networks. The impact has been astounding. Even a decade ago, the success rate of identification was 50%; today, 94 or 95% has become normal in many use cases, and even 99% isn’t a rarity.

Programmers feed the system a million images of cats, and the algorithms learn to distinguish between different features. The next time it is shown a cat and asked to identify it, it pieces the answer together like a jigsaw puzzle, going back to each component it learned. And voilà, there’s a cat! The deeper one goes into neural networks, the more one is reminded of that marvellous organ of which we supposedly use only 10%: the brain. I wonder why this cat example comes up every time you google computer vision. It is never tortoises; have you not wondered why?

A pixel is the smallest portion of an image that a computer can display or print: a microscopic dot. In a black-and-white (grayscale) image, each pixel’s brightness is represented by a single 8-bit number ranging from 0 (black) to 255 (white), and the image is stored in the computer’s memory as a grid of these numbers, one per pixel position. Computers usually read colour as a series of three values, red, green, and blue (RGB), on that same 0 to 255 scale. Each pixel then has three values for the computer to store in addition to its relative position. That’s an awful lot of memory consumed just to store one image. Again, it’s possible at scale due to massive computing power and the plummeting cost of storage.
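To make the storage arithmetic concrete, here is a minimal sketch in Python. It assumes uncompressed storage at 8 bits (one byte) per channel; the function name `raw_image_bytes` and the 1920x1080 example resolution are illustrative choices, not taken from any particular library. Real formats such as JPEG or PNG compress the data, so files on disk are usually much smaller.

```python
def raw_image_bytes(width: int, height: int, channels: int) -> int:
    """Bytes needed to hold an uncompressed image at 1 byte per channel."""
    return width * height * channels

# A tiny 2x2 grayscale image: one brightness value (0-255) per pixel.
gray = [
    [0, 255],
    [128, 64],
]

# A 2x2 colour image: each pixel is a (red, green, blue) triple.
rgb = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]

# Hypothetical example resolution: 1920 x 1080 pixels.
gray_bytes = raw_image_bytes(1920, 1080, channels=1)
rgb_bytes = raw_image_bytes(1920, 1080, channels=3)
print(gray_bytes)  # 2073600
print(rgb_bytes)   # 6220800
```

Under these assumptions, a single colour frame takes about 6.2 million bytes, three times the grayscale figure, which is why storage adds up quickly when a training set runs to a million images.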

At a simplistic level, face recognition is essentially a four-step procedure:

  1. Detect/identify faces in an image (using a face detection model).
  2. Predict face poses/landmarks (for the faces identified in step 1).
  3. Using data from step 2 and the actual image, calculate face encodings (numbers that describe the face).
  4. Compare the face encodings of known faces with those from test images to tell who is in the picture.
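Step 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encodings are treated as plain lists of numbers, "closeness" is measured with Euclidean distance, and the `identify` function, the 0.6 threshold, and the names and values in `known_faces` are all made up for the example. Steps 1 to 3, which need trained models, are not shown.

```python
import math

# Hypothetical threshold: encodings closer than this count as the same person.
MATCH_THRESHOLD = 0.6

def identify(test_encoding, known_faces, threshold=MATCH_THRESHOLD):
    """Return the name of the closest known face, or None if none is close enough."""
    best_name, best_distance = None, float("inf")
    for name, encoding in known_faces.items():
        distance = math.dist(test_encoding, encoding)  # Euclidean distance
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= threshold else None

# Toy 3-number encodings; a real model would produce much longer vectors.
known_faces = {
    "asha": [0.10, 0.80, 0.30],
    "ravi": [0.90, 0.20, 0.50],
}

print(identify([0.12, 0.79, 0.33], known_faces))  # asha
print(identify([0.50, 0.50, 0.95], known_faces))  # None
```

The threshold is the design choice that matters: set it too loosely and strangers are matched to known faces, too tightly and genuine matches are rejected.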

The Telangana government has implemented RTDAI (Realtime Digital Authentication of Identity) to authenticate pensioners without any special hardware at the user’s end and without fingerprints or iris images. GT Venkateshwar Rao, Managing Director, Telangana State Technology Services (TSTS), says, “All you need to have is a smartphone to remotely authenticate yourself. Take a photo and upload it in the exclusive app launched for the purpose. The software will verify the photo and demographic details in the pension database, avoiding the need to authenticate people physically.” The app is available in T App Folio, the umbrella app for various e-governance offerings from the Telangana Government.

Two crucial questions must be answered, and both can be settled within a minute: is the pensioner alive, and is he/she the legitimate one?

The Pensioners Life Certificate Authentication using a Selfie (PLCS) method deploys three levels of authentication: demographics (name, father’s name and address), photo, and whether the individual is alive. The system uses an AI-based liveness check solution (developed by a Bengaluru start-up) and an ML-based demographic comparison solution. After the one-time registration (consent) and authentication, the user can authenticate from anywhere, at any time. Out of the 13,760+ applications received, 12,760+ have been successfully authenticated with almost 98% accuracy. As with all ML-based tools, the system keeps getting better with more inputs.
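The idea of three levels that must all pass can be sketched as follows. This is not the PLCS implementation, whose internals are not described here: plain string similarity from Python's standard `difflib` module stands in for the ML-based demographic comparison, the photo and liveness results are passed in as ready-made booleans, and every function name, field name, threshold, and sample record is hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Rough string similarity in [0, 1], ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def demographics_match(submitted: dict, record: dict, min_score: float = 0.85) -> bool:
    """Every demographic field must be similar enough to the database record."""
    fields = ("name", "father_name", "address")
    return all(similarity(submitted[f], record[f]) >= min_score for f in fields)

def authenticate(submitted: dict, record: dict, photo_matches: bool, is_live: bool) -> bool:
    """Pass only if the demographic, photo, and liveness checks all succeed."""
    return demographics_match(submitted, record) and photo_matches and is_live

# Made-up database record and a submission that differs only in case and spacing.
record = {"name": "K. Lakshmi", "father_name": "K. Narayana", "address": "12 Main Road, Warangal"}
submitted = {"name": "k. lakshmi ", "father_name": "K. NARAYANA", "address": "12 Main Road, Warangal"}

print(authenticate(submitted, record, photo_matches=True, is_live=True))   # True
print(authenticate(submitted, record, photo_matches=True, is_live=False))  # False
```

Requiring all three checks to pass means a matching photo alone is not enough, which is the point of adding a liveness test to a selfie-based system.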

Buoyed by the success of this technology, the Andhra Pradesh government is also working on an AI-based Realtime Beneficiary Identification System (RBIS) for the authentication of pensioners. This will replace the present PDO authentication (human intervention). The new system is expected to remove the existing discrepancies that have crept in. 

Computer vision as a technology has immense potential, as we have seen. But it also comes with deep ethical considerations. It can be used to stalk people, learn more about them, and monetize that data without the individual’s active consent. There is also the matter of human bias, with its racial underpinnings, finding its way into algorithms. The consequences may be unintended in some cases and intended in others, and unfairness in algorithms can affect human lives in a big way. A camera can be spun around a room to capture images and determine who is sad or who has diabetes. But is that something we would want to reveal about ourselves?

The ethical framework of AI is evolving and will continue to expand its boundaries as newer use cases come into the mainstream and their impacts are realized.

Sources of Article

Image by Maria Francisca Mayorga from Pixabay 
