
Coinciding with the kick-off of the 2020 TensorFlow Developer Summit, Google published Objectron, a pipeline that detects objects in 2D images and videos and estimates their 3D poses and sizes using an Artificial Intelligence (AI) model. The innovation could impact several industries, including robotics, self-driving vehicles, augmented reality, and image retrieval. For example, it could help self-driving vehicles avoid obstacles in real time.

Tracking 3D objects is extremely difficult, in part because it is power- and memory-intensive; such programs previously could not run on smartphone systems-on-chip. The difficulty increases when the pipeline has to rely on 2D video alone, because 2D data lacks the diversity of object appearances and shapes needed to train a 3D model. The Objectron team at Google developed a toolset to work around this problem: it lets annotators label 3D rectangular boxes, known as bounding boxes, for objects using a split-screen view that displays 2D video frames alongside a 3D view with point clouds, camera positions, and detected planes. Annotators drew 3D bounding boxes in the 3D view and verified their placement by reviewing the projections in the 2D video frames. For static objects, they only had to annotate the target object in a single frame; the tool then propagated the object's location to all frames using ground-truth camera pose information from AR session data.
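The propagation step described above rests on a standard idea: once a 3D box is annotated in world coordinates, its corners can be projected into any frame whose camera pose is known. The sketch below illustrates this with a basic pinhole camera model; the function names and the example numbers are illustrative assumptions, not Google's actual tooling.

```python
# Hypothetical sketch (not Google's actual code): project the 8 corners
# of a world-space 3D bounding box into a 2D frame, given that frame's
# camera pose -- the idea behind propagating one annotation to all
# frames of a static-object sequence.
import numpy as np

def box_corners(center, size):
    """Return the 8 corners of an axis-aligned 3D box in world coordinates."""
    offsets = np.array([[x, y, z]
                        for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
    return np.asarray(center) + offsets * (np.asarray(size) / 2.0)

def project_to_frame(points_world, R, t, K):
    """Project world-space points to pixel coordinates.

    R, t: camera extrinsics (world -> camera); K: 3x3 intrinsics.
    """
    pts_cam = points_world @ R.T + t         # world frame -> camera frame
    pts_img = pts_cam @ K.T                  # camera frame -> image plane
    return pts_img[:, :2] / pts_img[:, 2:3]  # perspective divide

# Example: a 0.2 m cube 2 m in front of an identity-pose camera,
# with an assumed 640x480 camera (focal length 500 px).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
corners = box_corners([0.0, 0.0, 2.0], [0.2, 0.2, 0.2])
uv = project_to_frame(corners, np.eye(3), np.zeros(3), K)
```

With a known per-frame pose (as recorded in an AR session), calling `project_to_frame` once per frame yields the 2D box overlay for every frame from a single 3D annotation.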

To boost the accuracy of the AI model's predictions, the Objectron team also developed an engine that places virtual objects into scenes containing Augmented Reality data, using camera poses, detected planar surfaces, and estimated lighting to match the rendered objects to the real scene. The result was high-quality synthetic objects that fit seamlessly into real backgrounds.
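A core step in such an engine is computing a plausible pose for the virtual object from a detected plane: the object is anchored at a point on the plane with its up axis aligned to the plane normal. The sketch below shows one way to build that pose; the function name, the choice of +y as the object's up axis, and the example values are assumptions for illustration, not details of Google's engine.

```python
# Hypothetical sketch: derive an object-to-world pose that rests a
# virtual object on a detected plane, as an AR synthetic-data engine
# might. Assumes the object's up axis is +y.
import numpy as np

def pose_on_plane(anchor, normal):
    """Return a 4x4 object-to-world transform anchored on a plane."""
    up = np.asarray(normal, dtype=float)
    up /= np.linalg.norm(up)
    # Pick any reference direction not parallel to the normal,
    # then build an orthonormal right-handed basis around `up`.
    ref = np.array([1.0, 0.0, 0.0]) if abs(up[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    x = np.cross(ref, up)
    x /= np.linalg.norm(x)
    z = np.cross(x, up)
    T = np.eye(4)
    T[:3, :3] = np.column_stack([x, up, z])  # rotation: object -> world
    T[:3, 3] = anchor                        # translation: anchor point
    return T

# Example: drop an object onto a horizontal floor plane at the origin.
T = pose_on_plane([0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Rendering the object with this transform, under the session's estimated lighting, is what lets the synthetic object blend into the real background.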
