On June 16, 2022, Meta AI unveiled new scientific research aimed at helping AI understand the physical world more flexibly and efficiently. Meta AI researchers argue that AI systems must learn to navigate the complexity of the physical world, just as people will create and interact with immersive new experiences in the metaverse, where they can move through virtual realms and, via augmented reality, the physical world.
For example, AR glasses that show us where we left our keys require new technologies that let AI understand the layout and dimensions of unfamiliar, ever-changing settings without relying on compute-heavy resources such as pre-loaded maps. People, after all, don't need to know the exact location or dimensions of a coffee table to walk around it without banging into its corners (most of the time).
Research outcome
All of this work advances visual navigation for embodied AI, a field of research focused on training AI systems through interactions in 3D simulations rather than traditional 2D datasets.
Navigating without GPS
Researchers from Ukrainian Catholic University, Georgia Institute of Technology, and Meta AI have developed new ways to improve visual odometry, the ability of an AI to determine where it is based only on what it sees. Their new data-augmentation method trains simple but effective neural models without adding human annotations to the data. As a result, robust visual odometry turns out to be all that is needed to move the state of the art on the Realistic PointNav task, performed without GPS or compass data, from 71.7 per cent success to 94 per cent success, even when the action dynamics are noisy.
Even though their method does not solve this task entirely, the research shows that explicit mapping may not be necessary for navigation, even in real-world settings.
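To illustrate how extra training pairs can be generated without human labels, here is a minimal Python sketch of two label-preserving augmentations commonly used for visual odometry: mirroring a frame pair left-right and swapping its temporal order. The (dx, dz, dyaw) ego-motion parameterization and the function names are assumptions for illustration, not Meta AI's actual implementation.

```python
import numpy as np

def flip_augment(rgb_t, rgb_t1, ego_motion):
    """Mirror a pair of consecutive frames left-right and adjust the
    ego-motion label to match (assumed parameterization: lateral
    translation dx, forward translation dz, heading change dyaw).
    Reflection negates lateral motion and rotation, so the augmented
    pair gets a valid label with no human annotation."""
    flipped_t = np.flip(rgb_t, axis=1).copy()    # flip along the width axis
    flipped_t1 = np.flip(rgb_t1, axis=1).copy()
    dx, dz, dyaw = ego_motion
    return flipped_t, flipped_t1, (-dx, dz, -dyaw)


def swap_augment(rgb_t, rgb_t1, ego_motion):
    """Reverse the temporal order of the two frames and invert the
    planar ego-motion label, yielding another 'free' training example."""
    dx, dz, dyaw = ego_motion
    # Inverse of a planar rigid motion: rotate the translation into the
    # second frame's coordinates and negate it, then negate the rotation.
    c, s = np.cos(-dyaw), np.sin(-dyaw)
    inv_dx = -(c * dx - s * dz)
    inv_dz = -(s * dx + c * dz)
    return rgb_t1, rgb_t, (inv_dx, inv_dz, -dyaw)


# Example: each recorded pair yields additional labeled pairs for free.
frame_a = np.zeros((180, 320, 3), dtype=np.uint8)
frame_b = np.zeros((180, 320, 3), dtype=np.uint8)
print(flip_augment(frame_a, frame_b, (0.1, 0.25, 0.3))[2])  # (-0.1, 0.25, -0.3)
print(swap_augment(frame_a, frame_b, (0.1, 0.25, 0.3))[2])
```

Because each augmented pair inherits a label derived from the original one, the training set grows without any extra annotation effort, which is the core of the approach described above.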
Zero-shot navigation learning
Most embodied AI systems perform well on discrete, well-defined tasks, whether defined by objective type (e.g., "identify an object," "navigate to a room") or by the modality used to specify the target (e.g., text, audio). In the real world, however, agents must be able to adapt their skills on the fly, without resource-intensive maps or lengthy retraining.
Image source: Meta AI
In a first-of-its-kind zero-shot experience learning (ZSEL) framework, researchers from The University of Texas at Austin and Meta AI have created a model that captures the essential skills of semantic visual navigation and then applies them to different target tasks in a 3D environment without additional retraining.
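To make the transfer idea concrete, here is a minimal PyTorch sketch of one way such zero-shot reuse can be structured: a navigation backbone trained once is frozen, and only a small adapter that maps each new task's goal (an object category, a text query, an audio clip) into the backbone's goal-embedding space is fit. The class, module names, and dimensions are illustrative assumptions, not the ZSEL authors' actual architecture.

```python
import torch
import torch.nn as nn

class ZeroShotTransferPolicy(nn.Module):
    """Sketch of zero-shot reuse: a pretrained semantic-navigation
    backbone stays frozen, and only a small adapter maps each new
    task's goal into the goal-embedding space the backbone already
    understands. Names and sizes are illustrative assumptions."""

    def __init__(self, pretrained_backbone: nn.Module, goal_dim: int, embed_dim: int = 512):
        super().__init__()
        self.backbone = pretrained_backbone          # trained once on image-goal navigation
        for p in self.backbone.parameters():         # reused as-is: no retraining
            p.requires_grad = False
        # Only this lightweight adapter is fit for the new target task.
        self.goal_adapter = nn.Sequential(
            nn.Linear(goal_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, observation: torch.Tensor, task_goal: torch.Tensor) -> torch.Tensor:
        goal_embedding = self.goal_adapter(task_goal)
        # The frozen backbone consumes an observation plus a goal embedding
        # and returns action scores (assumed interface).
        return self.backbone(observation, goal_embedding)


# Stand-in backbone, purely so the sketch runs end to end.
class _DummyBackbone(nn.Module):
    def forward(self, obs, goal_emb):
        return torch.zeros(obs.shape[0], 4)          # e.g., logits over 4 discrete actions

policy = ZeroShotTransferPolicy(_DummyBackbone(), goal_dim=300)
print(policy(torch.randn(2, 128), torch.randn(2, 300)).shape)  # torch.Size([2, 4])
```

The design choice here is that all task-specific learning is confined to the small adapter, so the bulk of the navigation skill is reused across goal types without retraining the backbone.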
Conclusion
Researchers examined the question, "Can a self-guided agent navigate a new environment without building a map?" in (simulated) real-world settings. They first showed that, given ground-truth localization (GPS+Compass), map-less agents can overcome actuation and sensor noise and learn to navigate with near-perfect performance, which revealed that localization is the limiting factor.
Soon, researchers plan to apply these navigation breakthroughs to mobile manipulation to create agents that perform specific tasks, such as "find my wallet and bring it back to me." They also outlined a variety of new and intriguing challenges: How does this simulation work translate to physical robots? How can an embodied agent self-supervise its learning without human intervention in reward engineering, demonstrations, or 3D annotations? And how can simulation be scaled to the next level of simulation speed and learning speed?
Image source: Unsplash