This is a curated list of the most noteworthy recent scientific publications in artificial intelligence and data science, organized chronologically, with a link to a more in-depth article for each paper.

Swin Transformer

Will Transformers be able to take over from CNNs in computer vision?

This article introduces a novel vision Transformer, the Swin Transformer, as a general-purpose backbone for computer vision. The difficulties in adapting Transformers from language to vision stem from differences between the two domains, such as the large variation in the scale of visual entities and the much higher resolution of pixels in images compared with words in text.

To resolve these discrepancies, the researchers propose a hierarchical Transformer whose representation is computed with shifted windows. The shifted windowing scheme improves efficiency by limiting self-attention computation to non-overlapping local windows while still allowing cross-window connections. The resulting hierarchical architecture can model at various scales and has computational complexity that is linear in image size.
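To make the windowing idea concrete, here is a minimal PyTorch sketch of window-partitioned self-attention with a cyclic shift. It follows the paper's description rather than the official implementation: the window size, the toy attention without learned query/key/value projections, and the omission of the attention mask the real model applies at wrapped borders are all simplifying assumptions.

```python
import torch

def window_partition(x, window):
    # Split a (B, H, W, C) feature map into non-overlapping windows,
    # returning (num_windows * B, window * window, C).
    B, H, W, C = x.shape
    x = x.view(B, H // window, window, W // window, window, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window * window, C)

def window_attention(x):
    # Toy scaled dot-product self-attention within each window (learned
    # projections omitted). The cost per window is constant, so the total
    # cost grows linearly with the number of windows, i.e. with image size.
    scale = x.shape[-1] ** -0.5
    attn = torch.softmax((x @ x.transpose(-2, -1)) * scale, dim=-1)
    return attn @ x

def swin_block(x, window, shift):
    # A cyclic shift offsets the window grid so that successive layers
    # exchange information across the previous layer's window boundaries.
    if shift:
        x = torch.roll(x, shifts=(-window // 2, -window // 2), dims=(1, 2))
    B, H, W, C = x.shape
    out = window_attention(window_partition(x, window))
    out = out.view(B, H // window, W // window, window, window, C)
    out = out.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, C)
    if shift:
        out = torch.roll(out, shifts=(window // 2, window // 2), dims=(1, 2))
    return out

x = torch.randn(1, 8, 8, 32)                      # toy (B, H, W, C) features
y = swin_block(swin_block(x, 4, False), 4, True)  # regular, then shifted
```

Alternating regular and shifted blocks is what lets a purely local attention pattern build up global context layer by layer.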

These characteristics make the Swin Transformer suitable for a wide variety of vision applications, including image classification and dense prediction tasks such as object detection and semantic segmentation.

Paper: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Click here for the code.

Making SGD parameter-free

Stochastic Gradient Descent (SGD) is a simple yet efficient optimization approach for finding the parameter values that minimize a cost function. It is commonly used, for instance, to discriminatively train linear classifiers under convex loss functions, as in SVMs and logistic regression.
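As a refresher, the sketch below trains a logistic-regression classifier with plain SGD on toy data; the fixed step size lr is an arbitrary hand-picked choice, and picking it well is precisely the tuning burden that a parameter-free method removes.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                    # toy feature matrix
y = (X @ rng.normal(size=5) > 0).astype(float)   # toy binary labels

w = np.zeros(5)
lr = 0.1                                          # hand-tuned step size
for t in range(1000):
    i = rng.integers(len(X))                      # draw one example at random
    p = 1.0 / (1.0 + np.exp(-X[i] @ w))           # sigmoid prediction
    w -= lr * (p - y[i]) * X[i]                   # stochastic log-loss gradient
```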

The researchers here design a parameter-free method for stochastic convex optimization (SCO) whose rate of convergence is only a double-logarithmic factor larger than the optimal rate in the corresponding known-parameter setting. In contrast, the best previously known rates for parameter-free SCO are based on online parameter-free regret bounds, which carry unavoidable excess logarithmic terms compared to their known-parameter counterparts.

Their technique is conceptually simple, comes with high-probability guarantees, and is partially adaptive to unknown gradient norms, smoothness, and strong convexity. Their results rest on a novel parameter-free certificate for SGD step-size selection and on a time-uniform concentration result that assumes no a priori bounds on the SGD iterates.
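The certificate-based algorithm itself is beyond a short sketch, but the naive baseline it improves on is easy to show: run SGD once per candidate step size over a doubling grid and keep the best iterate, which multiplies the total work by the grid size. Everything below (the toy objective, the grid, the step count) is illustrative and is not the paper's method.

```python
import numpy as np

def sgd(grad, w0, lr, steps, rng):
    # Plain SGD with a fixed step size.
    w = w0.copy()
    for _ in range(steps):
        w -= lr * grad(w, rng)
    return w

# Toy stochastic convex objective: f(w) = 0.5 * ||w - 1||^2, noisy gradients.
def noisy_grad(w, rng):
    return (w - 1.0) + 0.1 * rng.normal(size=w.shape)

def loss(w):
    return 0.5 * np.sum((w - 1.0) ** 2)

rng = np.random.default_rng(0)
# Grid search over step sizes 1, 1/2, 1/4, ...: simple, but it pays an extra
# multiplicative factor of the grid size that parameter-free methods avoid.
best = min((sgd(noisy_grad, np.zeros(3), 2.0 ** -k, 500, rng) for k in range(10)),
           key=loss)
```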

Paper: Making SGD Parameter-Free

Generative adversarial transformers

The researchers introduce and investigate the GANformer, a novel and efficient type of transformer for visual generative modelling. The network has a bipartite structure that facilitates long-range interactions across the image while maintaining linear computational efficiency, allowing it to scale readily to high-resolution synthesis. It propagates information iteratively from a set of latent variables to the evolving visual features and vice versa, so each is refined in light of the other, encouraging the emergence of compositional representations of objects and scenes.
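A toy sketch of the bipartite round trip may help: a small, fixed set of latents attends to the image feature grid, and the grid then attends back to the updated latents, so the cost of each direction grows linearly with the number of image positions. The dimensions, the single attention head, and the bare residual updates below are assumptions, not the authors' architecture.

```python
import torch
import torch.nn.functional as F

def cross_attention(queries, keys_values):
    # Scaled dot-product attention from one set of vectors to another.
    scale = queries.shape[-1] ** -0.5
    attn = F.softmax((queries @ keys_values.transpose(-2, -1)) * scale, dim=-1)
    return attn @ keys_values

latents = torch.randn(1, 16, 64)         # (batch, num_latents, dim), small, fixed
features = torch.randn(1, 32 * 32, 64)   # (batch, H*W image positions, dim)

# Bipartite round trip: latents gather global context from the image,
# then every image position is refined from the updated latents. Each
# direction costs O(num_latents * H * W), i.e. linear in image size.
latents = latents + cross_attention(latents, features)
features = features + cross_attention(features, latents)
```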

In contrast to the classic transformer architecture, it uses multiplicative integration, which allows flexible region-based modulation, and it can thus be seen as a multi-latent generalization of the successful StyleGAN network. The researchers demonstrate the model's strength and robustness by evaluating it on a range of datasets, and additional qualitative and quantitative experiments offer insight into the approach's benefits and efficacy.

Paper: Generative Adversarial Transformers

Click here for the code.

