The most important recent research articles are listed below: a hand-curated, chronological collection of AI and data science breakthroughs, each with links to the paper and supporting resources.

Image Synthesis and Editing

Guided image synthesis makes it easy for anyone to create and edit photo-realistic images. The key challenge is balancing faithfulness to the user's inputs (such as hand-drawn coloured strokes) with the realism of the synthesized images. Existing GAN-based methods attempt this balance using either conditional GANs or GAN inversions, which are complex and often require additional training data or loss functions for each application.

To address these problems, the researchers devised a new image synthesis and editing method called Stochastic Differential Editing (SDEdit). The method is based on a diffusion model that generates realistic images by iteratively removing noise through a stochastic differential equation (SDE). Given a user input whose guide is specified by manipulating RGB pixels, SDEdit first adds noise to the input and then progressively removes it by simulating the SDE, so the result looks natural. As a result, SDEdit needs no task-specific training or inversions and can naturally balance realism with faithfulness to the input. In a human perception study, SDEdit beats state-of-the-art GAN-based methods by up to 98.09 per cent on realism and 91.72 per cent on overall satisfaction across multiple tasks, including stroke-based image synthesis and editing as well as image compositing.
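
To make the idea concrete, the snippet below is a minimal, illustrative sketch of the "noise then denoise" procedure described above, not the authors' implementation. The `denoiser(x, sigma)` network and the simplified sampler are assumptions for illustration; the actual SDEdit solves a reverse-time SDE with a pretrained score-based model. The essential point it captures is the starting condition: the reverse process begins from a noised version of the user's guide image rather than from pure noise.

```python
import torch

def sdedit(guide_image, denoiser, num_steps=500, sigma_t0=0.5):
    """Illustrative SDEdit-style procedure (a sketch, not the authors' code):
    perturb the user's guide image with Gaussian noise up to an intermediate
    noise level, then iteratively denoise it with a pretrained diffusion model.
    `denoiser(x, sigma)` is a hypothetical network that predicts the clean
    image from a noisy input at noise level sigma."""
    # Decreasing noise schedule from sigma_t0 down to (near) zero.
    sigmas = torch.linspace(sigma_t0, 0.0, num_steps)

    # Step 1: start the reverse process from a noised version of the guide
    # image instead of pure noise, which ties the output to the user's edits.
    x = guide_image + sigmas[0] * torch.randn_like(guide_image)

    # Step 2: reverse diffusion from t0 to 0, progressively removing noise.
    for i in range(num_steps - 1):
        x0_hat = denoiser(x, sigmas[i])   # model's estimate of the clean image
        # Re-noise the estimate to the next (smaller) level and continue.
        x = x0_hat + sigmas[i + 1] * torch.randn_like(x0_hat)
    return x
```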

Paper: SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

Click here for the code

Colab demo

GAN Sketching

Can a user create a deep generative model by sketching a single example? Traditionally, building a GAN model has required many examples from a large dataset and deep learning expertise. Sketching, on the other hand, may be the most accessible way for anyone to communicate visually. In this paper, the researchers describe GAN Sketching, a method for rewriting GANs with one or more sketches, making GAN training easier for novice users. Specifically, the researchers change the weights of an original GAN model based on user sketches, encouraging the model's outputs to match the sketches through a cross-domain adversarial loss.

Furthermore, the researchers explore different regularization methods to preserve the diversity and image quality of the original model. Experiments show that their approach can shape GANs to match the shapes and poses in sketches while keeping the outputs realistic and diverse. Finally, the researchers demonstrate applications of the customized GAN, such as latent space interpolation and image editing.
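
For intuition, here is a hedged sketch of what one fine-tuning step might look like under such a scheme. The module names (`photo_to_sketch`, `D_sketch`) and the non-saturating loss form are illustrative placeholders rather than the released training code: a fixed photo-to-sketch network maps the generator's RGB outputs into the sketch domain, where a discriminator compares them against the user's sketches.

```python
import torch.nn.functional as F

def gan_sketching_step(G, D_sketch, photo_to_sketch, z_batch,
                       user_sketches, opt_G, opt_D):
    """One hypothetical fine-tuning step in the spirit of GAN Sketching:
    update the weights of a pretrained generator G so that its outputs,
    translated to the sketch domain, match the user's sketches
    (a cross-domain adversarial loss). Names are placeholders."""
    fake_images = G(z_batch)                      # samples from the tuned GAN
    fake_sketches = photo_to_sketch(fake_images)  # map RGB outputs to sketches

    # Discriminator step: real user sketches vs. sketches of generated images.
    d_real = D_sketch(user_sketches)
    d_fake = D_sketch(fake_sketches.detach())
    loss_D = F.softplus(-d_real).mean() + F.softplus(d_fake).mean()
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # Generator step: nudge G's weights so its outputs "look like" the sketches.
    loss_G = F.softplus(-D_sketch(photo_to_sketch(G(z_batch)))).mean()
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```

In the paper's full setup, additional regularization (mentioned above) is applied alongside this adversarial objective to keep the fine-tuned model close to the original in diversity and image quality.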

Paper: Sketch Your Own GAN

Click here for the code

StyleCLIP: Text-Driven Manipulation

Inspired by StyleGAN's ability to generate realistic images across many domains, much recent work has examined how to use StyleGAN's latent spaces to manipulate both generated and real images. However, discovering semantically meaningful latent manipulations usually requires careful human examination of the many degrees of freedom, or an annotated collection of images for each desired manipulation.

In this work, the researchers explore how to leverage the power of the recently introduced Contrastive Language-Image Pre-training (CLIP) models to build a text-based interface for StyleGAN image manipulation that requires far less manual effort. The researchers first present an optimization scheme that uses a CLIP-based loss to modify a latent input vector in response to a user-provided text prompt. They then describe a latent mapper that infers a text-guided latent manipulation step for a given input image, making text-based manipulation faster and more stable.
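
As an illustration of the optimization scheme, here is a minimal, hypothetical sketch of CLIP-guided latent optimization. The generator interface `G(w)`, the hyperparameters, and the omission of CLIP's usual image preprocessing are simplifying assumptions, not the paper's exact setup: the latent code is pushed toward higher CLIP similarity with the prompt while an L2 term keeps it close to its starting point.

```python
import torch
import torch.nn.functional as F
import clip  # OpenAI's CLIP package

def clip_guided_latent_optimization(G, w_init, prompt,
                                    steps=200, lr=0.05, lambda_l2=0.005):
    """Sketch of CLIP-guided latent optimization (not the authors' code).
    `G` is assumed to be a differentiable StyleGAN-like generator that maps
    a latent code w to an RGB image; names/hyperparameters are placeholders."""
    device = w_init.device
    clip_model, _ = clip.load("ViT-B/32", device=device)
    with torch.no_grad():
        text_feat = clip_model.encode_text(clip.tokenize([prompt]).to(device))
        text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

    w = w_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = G(w)                                           # image from latent
        img = F.interpolate(img, size=224, mode="bilinear")  # CLIP input size
        img_feat = clip_model.encode_image(img)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        clip_loss = 1.0 - (img_feat * text_feat).sum(dim=-1).mean()  # cosine dist.
        loss = clip_loss + lambda_l2 * ((w - w_init) ** 2).sum()     # stay near w_init
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```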

Lastly, the researchers show how to map text prompts to input-independent directions in StyleGAN's style space, enabling interactive text-driven image manipulation. Extensive results and comparisons demonstrate the effectiveness of these approaches.

Paper: StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery

Click here for the code

Colab demo

