These are some of the most intriguing AI research papers published this year, combining advances in artificial intelligence (AI) and data science. The list is organized chronologically, and each summary is drawn from the full paper.

A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation

In recent years, research on multilingual and cross-lingual transfer, in which supervision is moved from high-resource languages (HRLs) to low-resource languages (LRLs), has advanced rapidly. However, the quality of cross-lingual transfer varies from one language to another, especially in the zero-shot setting. A promising research direction is learning structures that can be shared across several tasks using only a modest quantity of labelled data.

In this paper, the researchers propose a meta-learning framework, Meta-XNLG, that uses meta-learning and language clustering to learn shareable structures from typologically different languages. It is a step toward uniform cross-lingual transfer to unseen languages. First, they cluster the languages based on their typological features and identify the centroid language of each cluster. A meta-learning system is then trained on all the centroid languages and evaluated zero-shot on the remaining languages. The researchers show how well this modelling works on

  • two NLG tasks (Abstractive Text Summarization and Question Generation), 
  • five popular datasets, and 
  • 30 languages 

These languages differ widely in typology and writing system. Nevertheless, consistent improvements over strong baselines show that the proposed framework works. Moreover, because of careful model design, this end-to-end NLG setup is less prone to accidental translation, a major concern in zero-shot cross-lingual NLG tasks.
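The clustering-then-centroid step described above can be sketched as follows. This is a minimal, hypothetical illustration: the feature vectors are made up (the paper would rely on real typological features), and `cluster_and_pick_centroids` is an assumed helper name, not the authors' code. The idea is simply to run k-means over language feature vectors and pick, for each cluster, the language closest to the cluster mean as its centroid language.

```python
import numpy as np

# Hypothetical typological feature vectors for a handful of languages.
# Real work would derive these from linguistic databases; these numbers
# are invented purely for illustration.
LANG_FEATURES = {
    "hi": np.array([1.0, 0.9, 0.1]),
    "bn": np.array([0.9, 1.0, 0.2]),
    "mr": np.array([0.95, 0.85, 0.15]),
    "fr": np.array([0.1, 0.2, 1.0]),
    "es": np.array([0.2, 0.1, 0.9]),
}

def cluster_and_pick_centroids(features, k=2, iters=20, seed=0):
    """Plain k-means over language vectors, then return for each
    cluster the language nearest its mean (the 'centroid language')."""
    rng = np.random.default_rng(seed)
    langs = list(features)
    X = np.stack([features[l] for l in langs])
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each language to its nearest cluster center
        assign = np.argmin(
            np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = X[assign == c].mean(axis=0)
    centroids = {}
    for c in range(k):
        members = [i for i in range(len(langs)) if assign[i] == c]
        if not members:
            continue
        best = min(members, key=lambda i: np.linalg.norm(X[i] - centers[c]))
        centroids[c] = (langs[best], [langs[i] for i in members])
    return centroids
```

In the Meta-XNLG setup, only the centroid languages would be used for meta-training, and the remaining cluster members would be held out for zero-shot evaluation.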

Overlap-based Vocabulary Generation Improves Cross-lingual Transfer Among Related Languages

Pretrained multilingual language models, such as mBERT and XLM-R, have shown much promise for zero-shot cross-lingual transfer to low web-resource languages (LRLs). However, given limited model capacity, the large gap between high web-resource languages (HRLs) and LRLs leaves too little room for co-embedding the LRLs with the HRLs, which hurts downstream LRL performance.

In this paper, the researchers argue that the lexical overlap between related languages in the same family could be exploited to compensate for the scarcity of LRL corpora. They propose Overlap BPE (OBPE), a simple but effective modification to the BPE vocabulary-generation algorithm that increases the overlap between related languages. Through extensive experiments on different NLP tasks and datasets, they find that OBPE creates a vocabulary that improves the representation of LRLs by favouring tokens shared with HRLs. It improves zero-shot transfer from related HRLs to LRLs without lowering the accuracy or representation of the HRLs.

In contrast to earlier studies that dismissed token overlap as insignificant, the researchers show that it is essential in low-resource language settings: cutting the overlap down to zero can reduce zero-shot transfer accuracy by a factor of four.
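The overlap-favouring idea can be sketched with a toy merge-scoring rule. This is a hedged simplification, not the paper's exact formulation: `bpe_pairs` and `obpe_score` are hypothetical helper names, and the generalized mean with a negative exponent is just one plausible way to reward symbol pairs that occur in *all* languages (overlap) over pairs frequent in only one.

```python
from collections import Counter

def bpe_pairs(words):
    """Count adjacent symbol pairs, weighted by word frequency.
    `words` maps a space-separated symbol sequence to its corpus count."""
    pairs = Counter()
    for word, freq in words.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def obpe_score(pair_counts_per_lang, pair, p=-1.0):
    """Score a candidate merge across languages with a generalized mean.
    A negative exponent (here the harmonic mean, p=-1) heavily favours
    pairs present in every language, i.e. overlapping tokens. Pairs
    absent from any language score zero. Illustrative only."""
    freqs = [counts.get(pair, 0) for counts in pair_counts_per_lang]
    if 0 in freqs:
        return 0.0
    n = len(freqs)
    return (sum(f ** p for f in freqs) / n) ** (1.0 / p)
```

Under such a rule, a pair like `('l', 'o')` that appears in both an HRL and a related LRL corpus outscores an equally frequent pair that appears in only one of them, so the final vocabulary ends up with more shared tokens.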

Towards Fair Evaluation of Dialogue State Tracking by Flexible Incorporation of Turn-level Performances

Dialogue State Tracking (DST) is primarily evaluated with Joint Goal Accuracy (JGA), the proportion of turns in which the predicted dialogue state exactly matches the ground truth. Typically, in DST, the dialogue or belief state for a given turn accumulates the user's previous intentions, so a single misprediction makes every subsequent cumulative state harder to get exactly right. Consequently, although JGA is a helpful metric, it can be harsh and underestimate the true capability of a DST model. Furthermore, optimizing for JGA may degrade turn-level (non-cumulative) belief state prediction due to inconsistent annotations. Therefore, employing JGA as the sole criterion for model selection may not always be optimal.

In this article, the researchers examine the evaluation measures commonly used for DST and their drawbacks. To address the difficulties mentioned above, they propose a new evaluation metric dubbed Flexible Goal Accuracy (FGA). FGA is a generalization of JGA. Unlike JGA, however, it assigns only a deferred, discounted penalty to locally correct mispredictions, i.e., predictions whose error originates in a previous turn. As a result, FGA assesses both cumulative and turn-level prediction performance flexibly and provides better insight than earlier metrics. Additionally, the researchers demonstrate that FGA is a superior predictor of DST model success.
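The contrast between JGA and an FGA-style metric can be sketched as below. This is a hypothetical simplification, not the paper's exact formula: the function names, the `lam` decay rate, and the exponential discount are all assumptions. The idea is that a turn whose cumulative state is wrong only because of an error introduced in an earlier turn receives a reduced penalty, growing milder as the distance from the source error increases.

```python
import math

def jga(gold_states, pred_states):
    """Joint Goal Accuracy: fraction of turns whose full predicted
    dialogue state exactly matches the gold state."""
    correct = sum(g == p for g, p in zip(gold_states, pred_states))
    return correct / len(gold_states)

def fga(gold_states, pred_states, gold_turn, pred_turn, lam=0.5):
    """FGA-like score (illustrative). A wrong cumulative state earns
    partial credit 1 - exp(-lam * d), where d is the distance from the
    turn that actually introduced the error; the error turn itself
    (d = 0) gets no credit, exactly like JGA."""
    score, last_error = 0.0, None
    for t, (g, p) in enumerate(zip(gold_states, pred_states)):
        if g == p:
            score += 1.0
            last_error = None  # state recovered; no pending error
            continue
        if pred_turn[t] != gold_turn[t] or last_error is None:
            last_error = t  # this turn introduced the error
        d = t - last_error
        score += 1.0 - math.exp(-lam * d)
    return score / len(gold_states)
```

For example, if the model misreads a slot in turn 1 but predicts every later turn-level update correctly, JGA scores the whole dialogue as wrong from that point on, while this FGA-style score recovers partial credit for the locally correct turns.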


DISCLAIMER

The information provided on this page has been procured through secondary sources. In case you would like to suggest any update, please write to us at support.ai@mail.nasscom.in