Here is a handpicked compilation of the most significant recent advances in artificial intelligence and data science, arranged chronologically, each with a link to the full paper.

GANs in creating 3D models

The GANverse3D model takes a single image as input and generates a customizable, animated 3D figure.

Differentiable rendering has made it possible to train neural networks to perform "inverse graphics" tasks, such as inferring 3D geometry from a single view of a scene. Unfortunately, most current methods for training high-performing models depend on multi-view images, which are not always easy to obtain. Recent Generative Adversarial Networks (GANs) that generate images, on the other hand, appear to learn 3D knowledge implicitly: for example, you can change an object's viewpoint by manipulating its latent codes. However, these latent codes often lack a physical meaning, so GANs cannot easily be used for explicit 3D reasoning.

The researchers use differentiable renderers to extract and disentangle the 3D knowledge learned by generative models. The key to their method is using a GAN as a multi-view data generator to train an inverse graphics network with an off-the-shelf differentiable renderer, and then using the trained inverse graphics network as a teacher to map the GAN's latent code to interpretable 3D properties. Both quantitatively and through user studies, the researchers show that their method outperforms existing inverse graphics networks trained on existing datasets. They also show how the disentangled GAN can serve as a 3D "neural renderer" that works alongside traditional graphics renderers.
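
As a rough illustration of this training scheme, here is a minimal sketch using toy stand-in modules. All names, shapes, and modules below are assumptions made for illustration, not the paper's code; the actual system builds on a pretrained StyleGAN and a DIB-R-style differentiable renderer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins: the GAN acts as a multi-view data generator, the inverse
# graphics network predicts explicit 3D properties, and a reconstruction
# loss through the (frozen) differentiable renderer supervises it.

class ToyGAN(nn.Module):
    """Stand-in image generator: latent code -> 64x64 RGB image."""
    def __init__(self, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 3 * 64 * 64), nn.Tanh())

    def forward(self, z):
        return self.net(z).view(-1, 3, 64, 64)

class ToyInverseGraphics(nn.Module):
    """Stand-in inverse graphics net: image -> (geometry, camera) codes."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256))
        self.geometry_head = nn.Linear(256, 128)
        self.camera_head = nn.Linear(256, 6)

    def forward(self, image):
        h = self.encoder(image)
        return self.geometry_head(h), self.camera_head(h)

class ToyRenderer(nn.Module):
    """Stand-in for a fixed, off-the-shelf differentiable renderer."""
    def __init__(self):
        super().__init__()
        self.decoder = nn.Sequential(nn.Linear(128 + 6, 3 * 64 * 64), nn.Tanh())

    def forward(self, geometry, camera):
        x = torch.cat([geometry, camera], dim=-1)
        return self.decoder(x).view(-1, 3, 64, 64)

gan, inverse_net, renderer = ToyGAN(), ToyInverseGraphics(), ToyRenderer()
optimizer = torch.optim.Adam(inverse_net.parameters(), lr=1e-4)

for step in range(100):
    # Use the GAN as a multi-view data generator: keep the "content" part
    # of the latent fixed and vary the "viewpoint" part across 8 samples.
    z_content = torch.randn(1, 96).expand(8, -1)
    z_view = torch.randn(8, 32)
    with torch.no_grad():
        views = gan(torch.cat([z_content, z_view], dim=-1))

    # Predict explicit 3D properties, re-render them through the renderer,
    # and penalize the reconstruction error; gradients flow through the
    # differentiable renderer back into the inverse graphics network.
    geometry, camera = inverse_net(views)
    rendered = renderer(geometry, camera)
    loss = F.l1_loss(rendered, views)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```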

Paper: Image GANs Meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering

Deep nets in computer vision

This is an opinion paper about what Deep Nets do well for vision and where they fall short. Deep Nets are at the centre of the vast progress made in artificial intelligence in recent years and are becoming increasingly important in cognitive science and neuroscience. Despite their many successes, they have notable limitations, and little is understood about how they work internally. At present, Deep Nets perform very well on benchmark datasets for specific visual tasks, but they are far less flexible and adaptive than the human visual system.

The researchers argue that Deep Nets in their current form are unlikely to solve the fundamental problem of computer vision: how to cope with the combinatorial explosion caused by the enormous complexity of natural images while achieving the same deep understanding of visual scenes that humans have. They argue that this "combinatorial explosion" means that "big data is not enough", and that the methods for benchmarking performance and evaluating vision algorithms need to be rethought.
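
To make this argument concrete, a quick back-of-the-envelope calculation shows how fast the space of scene configurations grows. Every number below is an arbitrary assumption chosen for illustration, not a figure from the paper.

```python
# Back-of-the-envelope illustration of the combinatorial explosion.
num_object_categories = 100   # distinct object types a scene may contain
objects_per_scene = 5         # objects composed into a single scene
poses_per_object = 50         # coarsely discretized positions/orientations
lighting_conditions = 10      # coarse illumination settings

scenes = (num_object_categories * poses_per_object) ** objects_per_scene
scenes *= lighting_conditions
print(f"distinct scene configurations: {scenes:.2e}")   # ~3.1e+19

# Even a billion-image dataset covers a vanishing fraction of this space,
# which is the sense in which "big data is not enough".
print(f"coverage by a 1e9-image dataset: {1e9 / scenes:.1e}")  # ~3.2e-11
```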

The researchers also note that performance evaluation is not merely an academic exercise: because vision algorithms are increasingly deployed in real-world applications, how they are evaluated has real-world consequences. Since surveying the entire Deep Net literature would be impractical, the researchers focus on a few topics and references that serve as "entry points" into the literature.

Paper: Deep nets: What have they ever done for vision?

Perpetual view generation

The next phase in view synthesis is Perpetual View Generation, where the objective is to fly into an image and explore the surrounding terrain.

The researchers address the problem of perpetual view generation: synthesizing novel views corresponding to an arbitrarily long camera trajectory, given only a single image. This is a challenging problem beyond the capabilities of current view synthesis methods, which quickly break down under large camera motion. Video generation methods, meanwhile, cannot produce long sequences well and often ignore scene geometry. The researchers instead take a hybrid approach that combines geometry and image synthesis in an iterative "render, refine, and repeat" framework, allowing long-range generation that covers large distances over hundreds of frames.
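
A minimal sketch of this loop, under toy assumptions, might look like the following. The warp and refiner below are placeholders: in the actual method, "render" warps an RGB-D frame into the next camera pose with a differentiable renderer, and "refine" is a learned image-to-image network that inpaints disocclusions and adds detail.

```python
import torch
import torch.nn as nn

class ToyRefiner(nn.Module):
    """Stand-in refinement network (the real one is a deep generator)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(4, 4, kernel_size=3, padding=1)

    def forward(self, x):
        return self.net(x)

def render_next_view(rgbd, shift=2):
    """Placeholder geometric warp: re-project the current RGB-D frame
    toward the next camera pose; a lateral shift stands in for the warp."""
    return torch.roll(rgbd, shifts=shift, dims=-1)

refiner = ToyRefiner()
frame = torch.rand(1, 4, 64, 64)  # RGB + disparity of the single input image
frames = [frame]

with torch.no_grad():
    for _ in range(200):  # one synthesized frame per iteration
        warped = render_next_view(frame)   # render
        frame = refiner(warped)            # refine
        frames.append(frame)               # repeat: output feeds next step
```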

Moreover, their method can be trained from a set of monocular video sequences; the researchers introduce a dataset of aerial footage of coastal scenes for this purpose. They compare their method with recent view synthesis and conditional video generation baselines, showing that it can generate plausible scenes over much longer time horizons and larger camera trajectories than existing methods.

Paper: Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image

Code and a Colab demo are also available.

