Researchers are training AI to recognise similar materials in photos: a machine-learning model can identify all the pixels in an image that correspond to a specific material.

"I visualise a time when we will be to robots what dogs are to humans, and I'm rooting for the machines." - Claude Shannon.

The method, developed at MIT, could one day be used in computer vision systems that help robots interact with real-world objects. The image is an artist's interpretation of the new system.

For example, a robot handling items in a kitchen would benefit from knowing which of them are made of the same material. With this information, the robot would learn to use the same amount of force whether picking up a small pat of butter from a shady corner of the kitchen or an entire stick from the brightly lit fridge. However, identifying the items in a scene that are made of the same material, known as material selection, is a challenging task for machines because the appearance of a material can vary dramatically with the shape of the object or the lighting conditions.

DINO - self-supervised vision transformer 

Scientists at MIT and Adobe Research have made progress on this problem. They devised a method for identifying all the pixels in an image that show the same material as a pixel chosen by the user. The method is accurate even when objects vary in shape and size, and the machine-learning model they created is not fooled by shadows or lighting conditions that can make the same material appear different.

Although the scientists trained their model using only "synthetic" data, generated by a computer that modifies 3D scenes to produce many different images, the system works effectively on real indoor and outdoor scenes it has never seen before. The method can also be applied to videos: once the user selects a pixel in the first frame, the model can identify objects made of the same material throughout the rest of the video.

Material selection

Existing material selection techniques have difficulty accurately identifying all the pixels that belong to the same material. Some approaches focus on whole objects, but a single object can be made of several materials, such as a chair with wooden arms and a leather seat. Other methods rely on a predetermined set of materials, but these often carry broad labels like "wood," even though there are thousands of different kinds of wood.

The researchers had to overcome a few obstacles to create an AI approach that could learn to select similar materials. First, no existing dataset labelled materials finely enough to train their machine-learning model. The researchers therefore rendered their own synthetic dataset of indoor scenes, comprising 50,000 images with more than 16,000 materials randomly applied to the objects.

Similarity problem

The researchers' approach converts generic, pre-trained visual features into material-specific features in a way that is robust to object shape and changing lighting conditions. When a user clicks a pixel, the model determines how similar every other pixel appears to that query and generates a map in which each pixel receives a similarity score between 0 and 1. Because every pixel has a score, the user can fine-tune the result by setting a similarity threshold, such as 90 per cent, and receive a map of the image with the matching regions highlighted.
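The query-and-threshold step can be illustrated with a minimal sketch. It is not the researchers' model: it assumes per-pixel feature vectors are already available as a NumPy array, and it uses cosine similarity rescaled to the 0 to 1 range as a stand-in for the learned similarity score. The function names `similarity_map` and `select_material` are hypothetical.

```python
import numpy as np


def similarity_map(features, query_yx):
    """Score every pixel against the clicked pixel.

    features: (H, W, D) array of per-pixel feature vectors (assumed given).
    query_yx: (row, column) of the pixel the user clicked.
    Returns an (H, W) array of scores in [0, 1].
    Assumption: cosine similarity stands in for the model's learned score.
    """
    feats = features.astype(np.float64)
    norms = np.linalg.norm(feats, axis=-1, keepdims=True)
    unit = feats / np.maximum(norms, 1e-12)  # avoid division by zero
    query = unit[query_yx[0], query_yx[1]]   # (D,) unit vector
    cosine = unit @ query                    # (H, W), values in [-1, 1]
    return (cosine + 1.0) / 2.0              # rescale to [0, 1]


def select_material(scores, threshold=0.9):
    """Boolean mask of pixels whose score meets the user's threshold."""
    return scores >= threshold


# Toy 2x2 "image" with 2-dimensional features.
features = np.array([[[1.0, 0.0], [2.0, 0.0]],
                     [[0.0, 1.0], [-1.0, 0.0]]])
scores = similarity_map(features, (0, 0))
mask = select_material(scores, threshold=0.9)
```

Raising the threshold shrinks the highlighted region to pixels that match the query more closely; lowering it admits looser matches.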

The method also works for cross-image selection, meaning the user can select a pixel in one image and find the same material in another. During testing, the researchers found that their model was more accurate than other techniques at predicting which parts of an image contained the same material. When the scientists compared the prediction to the ground truth, that is, the parts of the image actually made of the same material, the two matched with roughly 92 per cent accuracy.
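As a rough illustration of cross-image selection and of comparing a prediction with the ground truth, the sketch below takes the query feature from one image and scores the pixels of another. It rests on several assumptions: per-pixel features are given, rescaled cosine similarity stands in for the learned score, and plain pixel accuracy stands in for the evaluation metric, which the article does not specify. The names `cross_image_scores` and `pixel_accuracy` are hypothetical.

```python
import numpy as np


def _unit(vectors):
    """Normalise vectors along the last axis to unit length."""
    norms = np.linalg.norm(vectors, axis=-1, keepdims=True)
    return vectors / np.maximum(norms, 1e-12)


def cross_image_scores(query_features, query_yx, target_features):
    """Score each pixel of the target image against a pixel of the query image.

    query_features:  (H1, W1, D) features of the image the user clicked in.
    target_features: (H2, W2, D) features of the image to search.
    Returns an (H2, W2) array of scores in [0, 1].
    """
    query = _unit(query_features[query_yx[0], query_yx[1]].astype(np.float64))
    target = _unit(target_features.astype(np.float64))
    return (target @ query + 1.0) / 2.0


def pixel_accuracy(predicted_mask, truth_mask):
    """Fraction of pixels where the predicted mask agrees with the ground truth."""
    return float(np.mean(predicted_mask == truth_mask))


# Toy data: a 1x1 query image and a 2x2 target image.
query_features = np.array([[[0.0, 3.0]]])
target_features = np.array([[[0.0, 1.0], [1.0, 0.0]],
                            [[0.0, 2.0], [0.0, -1.0]]])
scores = cross_image_scores(query_features, (0, 0), target_features)
predicted = scores >= 0.9
truth = np.array([[True, False], [False, False]])
accuracy = pixel_accuracy(predicted, truth)
```

Because the query is reduced to a single feature vector, the same function applies unchanged to later frames of a video.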

Conclusion

Separating an image into its underlying components is an essential first step in editing and understanding images. The researchers describe a method for identifying the parts of a photograph that share the same material as an artist-selected area. Their proposed method is robust to shading, specular highlights, and cast shadows, allowing for selection in real-world photos.

Sources of Article

Image source: Unsplash
