Researchers from the University of Pittsburgh and Microsoft Research’s Future Social Experiences (FUSE) lab have developed an AI-based model for better text summarization. The flexible system pays close attention to the beginning of a document, which improved performance in their experiments.

The team says that automatic text summarization is challenging for textual data such as news or discussions, so summarization techniques for those forms of text should be modelled around the structure and nature of the data. In a paper co-authored by Ahmed Magooda of the University of Pittsburgh and Cezary Marcjan of the Microsoft Research FUSE lab, the researchers suggest placing extra emphasis on the beginning of a document to improve the performance of extractive summarization models.
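To make the idea of attending to a document's opening more concrete, here is a minimal sketch in plain Python. It is not the architecture from the paper, whose details are not given here. It assumes each sentence is already encoded as a small vector, and it borrows a generic two-direction attention pattern: every sentence looks at the opening sentences, and the opening sentences in turn weight every sentence. The function names (`attend_to_lead`, `weighted_sum`) and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Combine vectors into one vector using the given weights."""
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]


def attend_to_lead(sentences, lead):
    """Illustrative two-direction attention between all sentences and the lead.

    sentences: one vector (list of floats) per sentence in the document
    lead:      vectors for the opening sentences (assumed already encoded)
    """
    # Similarity of every sentence to every lead sentence.
    sim = [[dot(s, l) for l in lead] for s in sentences]

    # Direction 1: each sentence attends over the lead sentences.
    lead_context = [weighted_sum(softmax(row), lead) for row in sim]

    # Direction 2: the lead attends over all sentences, weighting each
    # sentence by its strongest match with any lead sentence.
    sentence_weights = softmax([max(row) for row in sim])
    document_context = weighted_sum(sentence_weights, sentences)

    # Each sentence representation is enriched with both contexts.
    enriched = [list(s) + ctx + document_context
                for s, ctx in zip(sentences, lead_context)]
    return enriched, sentence_weights


# Toy example: three 2-dimensional sentence vectors, first one is the lead.
toy_sentences = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
enriched, weights = attend_to_lead(toy_sentences, toy_sentences[:1])
```

In a real extractive summarizer, the enriched vectors would feed a classifier that decides which sentences to keep. The sketch only shows how information from the opening can be mixed into every sentence representation.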

The team states that using a bidirectional attention mechanism to attend to the beginning of a document (the initial comment or post in a discussion thread) introduces a consistent boost in ROUGE scores. ROUGE is a set of metrics, and a software package, used for evaluating automatic summarization and machine translation in natural language processing. The approach also sets new state-of-the-art (SOTA) ROUGE scores on the forum discussions dataset. The team then extended the hypothesis to other, more generic forms of textual data by attending to the first few sentences of each document.

The team used two extractive summarization datasets. The first came from TripAdvisor forum discussions and consisted of 700 threads. The second contained 532 Microsoft Word documents across a range of subjects. The results show that attending to introductory sentences using bidirectional attention improves the performance of extractive summarization models even when applied to a more generic form of textual data.
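For readers unfamiliar with ROUGE, the following is a simplified sketch of ROUGE-N, which counts how many n-grams a candidate summary shares with a reference summary. It assumes plain whitespace tokenization and lowercasing, and the function name `rouge_n` is illustrative. The official ROUGE package applies further processing (such as optional stemming), so scores from this sketch will not match published numbers exactly.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Simplified ROUGE-N based on clipped n-gram overlap."""

    def ngrams(text):
        tokens = text.lower().split()  # assumption: whitespace tokenization
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)

    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())

    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_n("the hotel was clean and quiet", "the hotel was very clean")
# 4 of the 5 reference words appear in the candidate, so recall is 0.8.
```

Recall rewards a summary for covering the reference, while precision penalizes extra material. This is why reported ROUGE results usually include the F1 score alongside them.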

Microsoft Research earlier published a study detailing a “flexible” AI system capable of reasoning about relationships in “weakly structured” text. This work follows that study.

The team is planning to include more generic data sets in the next training and testing phases to verify their approach.



