These are some of the most intriguing AI research papers published this year, spanning developments in artificial intelligence (AI) and data science. The list is organised chronologically, and each entry links to a longer article.

Combining Vision and Language Representations for Patch-based Identification of Lexico-Semantic Relations

Multimodal natural language processing has found many applications, but only a few studies have focused on multimodal relational lexical semantics. This study proposes a first attempt to use visual cues to identify lexico-semantic relations, which represent language phenomena such as synonymy, co-hyponymy, and hypernymy. The researchers hypothesise that visual information can augment textual information, drawing on the apperception subcomponent of semiotic textology (a linguistic theory), whereas conventional approaches rely on the paradigmatic approach and/or the distributional hypothesis.

To do this, the researchers automatically enrich two gold-standard datasets with visual information and design several fusion strategies that combine the textual and visual modalities following a patch-based approach. Experimental results on the multimodal datasets show that visual information can reliably boost performance by filling semantic gaps in the textual encodings.

DrawMon: A Distributed System for Detection of Atypical Sketch Content in Concurrent Pictionary Games

The famous sketch-based guessing game Pictionary lets us study cooperative gameplay towards a shared goal when communication is limited. Sometimes, however, players draw atypical content. Though occasionally essential to the game, such content can also break the rules and spoil the fun. To handle these problems in a timely and scalable way, the researchers developed DrawMon, a new distributed system for automatically detecting atypical sketch content across concurrently running Pictionary game sessions.

The researchers built specialised online tools to collect game-session data and annotate atypical sketch content, producing AtyPict, the first dataset of atypical sketch content. Using AtyPict, they train CanvasNet, a deep neural network that detects unusual content and forms a key component of DrawMon. Their analysis of post-deployment game-session data shows that DrawMon scales well and reliably flags atypical sketch content. Beyond Pictionary, their work can serve as a guide for building atypical-content response systems for shared, interactive whiteboards.
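The monitoring idea described above can be sketched as a simple loop: sessions send canvas snapshots, a detector scores each one, and high scores raise alerts. The scoring rule below is a toy stand-in; the real system uses CanvasNet, whose details are in the paper and not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Snapshot:
    """One canvas snapshot from a running game session."""
    session_id: str
    pixels: List[float]  # flattened canvas; placeholder representation

def monitor(snapshots: List[Snapshot],
            score_fn: Callable[[Snapshot], float],
            threshold: float = 0.5) -> List[str]:
    """Return the ids of sessions whose snapshots look atypical."""
    return [s.session_id for s in snapshots if score_fn(s) > threshold]

# Assumed scoring rule for the demo: fraction of inked pixels on the canvas.
demo_score = lambda s: sum(s.pixels) / len(s.pixels)

snaps = [Snapshot("game-1", [0, 0, 1, 0]),
         Snapshot("game-2", [1, 1, 1, 0])]
print(monitor(snaps, demo_score))  # ['game-2']
```

In the actual system this loop is distributed across many concurrent sessions, with the neural detector replacing the toy scoring function.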

Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors

In this study, the researchers explore an interesting question: what can be learned from an 8×8 pixel video sequence? The answer turns out to be far more than one might expect. They show that, given the right audio and image priors, these 8×8 videos can be processed into full-length 256×256 video. Using their novel audio-visual upsampling network, the researchers achieve this 32× scaling from an extremely low-resolution input. The audio prior helps recover basic facial features and the precise shape of the lips, while a single high-resolution image of the target identity provides rich information about the person's appearance.

Their approach is an end-to-end multi-stage framework. The first stage produces a coarse intermediate output video, which is then used to animate a single target identity image and generate realistic, accurate, high-quality outputs. The method is simple and substantially outperforms other super-resolution approaches (an 8x improvement in FID score). The researchers also apply their model to talking-face video compression, achieving a 3.5-fold improvement over the previous state of the art in bits/pixel. The network's results are carefully examined through extensive ablation tests in the paper and supplementary materials.
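To illustrate the scale factor involved: going from an 8×8 frame to 256×256 means 32× magnification along each spatial axis. The naive nearest-neighbour baseline below recovers no detail at all, which is exactly why the paper's learned audio and identity priors are needed to supply the missing facial structure. This snippet is only a sketch of the resolution arithmetic, not the paper's network.

```python
import numpy as np

def nearest_upsample(frame: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling: repeat each pixel `factor` times per axis."""
    return np.repeat(np.repeat(frame, factor, axis=0), factor, axis=1)

low_res = np.zeros((8, 8))            # one extremely low-resolution frame
high_res = nearest_upsample(low_res, 32)
print(high_res.shape)  # (256, 256)
```

A learned model replaces this blocky interpolation with outputs conditioned on the audio track and a high-resolution identity image.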



DISCLAIMER

The information provided on this page has been procured through secondary sources. In case you would like to suggest any update, please write to us at support.ai@mail.nasscom.in