Lie detection is the process of determining the veracity of a given communication. When producing deceptive narratives, liars employ verbal strategies to create false beliefs in their interaction partners and thus enter a specific, temporary psychological and emotional state. For this reason, the Undeutsch hypothesis suggests that deceptive narratives differ in form and content from truthful ones.

This topic has long been under investigation in cognitive psychology, given its significant and promising applications in forensic and legal settings. Its potentially pivotal role in assessing the honesty of witnesses and suspects during investigations and legal proceedings affects both the investigative information-gathering process and the final decision-making stage.

Decades of research have focused on identifying verbal cues to deception and developing effective methods to differentiate truthful from deceptive narratives. Such verbal cues are subtle at best, and both naive and expert judges typically perform only slightly above chance. One explanation from social psychology for this unsatisfactory human performance is the truth bias: the cognitive heuristic of presuming honesty, which leads people to assume that an interaction partner is truthful unless they have reason to believe otherwise.

More recently, verbal lie detection has also been tackled with computational techniques such as stylometry. Stylometry refers to a set of methodologies and tools from computational linguistics and artificial intelligence for the quantitative analysis of linguistic features in written texts, uncovering distinctive patterns that characterise authorship or other stylistic attributes.
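
To make this concrete, the sketch below computes a few generic stylometric features in Python (lexical diversity, average sentence length, first-person pronoun rate). The features are illustrative choices made here, not the feature set of any particular study.

```python
import re
from collections import Counter

def stylometric_features(text: str) -> dict:
    """Compute a few illustrative stylometric features of a text."""
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(tokens)
    # First-person pronouns: deception research often examines self-references.
    first_person = sum(counts[p] for p in ("i", "me", "my", "mine", "we", "our"))
    return {
        "n_tokens": len(tokens),
        "type_token_ratio": len(counts) / max(len(tokens), 1),
        "avg_sentence_length": len(tokens) / max(len(sentences), 1),
        "first_person_rate": first_person / max(len(tokens), 1),
    }

print(stylometric_features("I went home. Then I saw my friend near the old bridge."))
```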

LLMs and verbal lie detection

LLMs are Transformer-based models trained on large text corpora that can generate coherent natural-language text and are highly flexible across a wide range of NLP tasks. These models can also be fine-tuned for specific tasks on smaller task-specific datasets, achieving state-of-the-art results.

In the study "Verbal lie detection using Large Language Models", the researchers tested the ability of a fine-tuned LLM (FLAN-T5) on lie-detection tasks. First, given the flexibility of LLMs, they tested whether fine-tuning an LLM is a valid procedure for detecting deception from raw text above chance level and for outperforming classical machine-learning and deep-learning approaches. Second, they investigated whether fine-tuning an LLM on deceptive narratives also enables the model to detect new types of deceptive narratives. Third, they examined whether an LLM can be successfully fine-tuned on a multiple-context dataset.
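
One plausible way to frame such fine-tuning, sketched below as an assumption rather than the authors' actual code, is to cast lie detection as text-to-text classification with FLAN-T5: the model reads a statement and generates a label word. The prompt wording, label strings, learning rate, and toy examples are all illustrative.

```python
# A minimal sketch (not the authors' released code): lie detection as
# text-to-text classification with FLAN-T5 via Hugging Face transformers.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Toy training pairs: statement -> textual label (illustrative data).
examples = [
    ("I spent last summer volunteering at the animal shelter.", "truthful"),
    ("I have never met that person in my entire life.", "deceptive"),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for statement, label in examples:
    inputs = tokenizer(f"Is this statement truthful or deceptive? {statement}",
                       return_tensors="pt", truncation=True)
    targets = tokenizer(label, return_tensors="pt")
    loss = model(**inputs, labels=targets.input_ids).loss  # seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Inference: the fine-tuned model generates the label for a new statement.
model.eval()
query = tokenizer("Is this statement truthful or deceptive? I worked late that night.",
                  return_tensors="pt")
output = model.generate(**query, max_new_tokens=4)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```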

Furthermore, they hypothesised that model performance may depend on model size, since a larger model forms a richer inner representation of language.

Their experiments introduced the DeCLaRatiVE stylometry technique, a new theory-based stylometric approach that investigates deception in texts through four psychological frameworks: Distancing, Cognitive Load, Reality Monitoring, and the Verifiability approach.
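
As a rough illustration of what theory-motivated cues could look like in code, the sketch below counts simplified proxy words for each of the four frameworks. The word lists are invented here for demonstration and are not the actual DeCLaRatiVE feature definitions from the paper.

```python
import re

# Hypothetical proxy-word lists, one per psychological framework.
CUES = {
    "distancing": {"he", "she", "they", "that", "those"},          # fewer self-references
    "cognitive_load": {"because", "but", "except", "without"},     # causal/exclusive words
    "reality_monitoring": {"saw", "heard", "felt", "smelled"},     # perceptual details
    "verifiability": {"receipt", "camera", "witness", "record"},   # checkable details
}

def cue_counts(text: str) -> dict:
    """Count occurrences of each framework's proxy words in a statement."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {name: sum(t in words for t in tokens) for name, words in CUES.items()}

print(cue_counts("She said that, but I heard nothing and there was no witness."))
```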

Given the results, the researchers highlight the importance of a diversified dataset for achieving good, generalised performance. They also considered the trade-off between dataset diversity and LLM size, suggesting that the more diverse the dataset, the larger the model required to reach high accuracy. According to the team, the main advantage of their approach is its applicability to raw text without the need for extensive training or handcrafted features.

Understanding the limitations

Though the study was relevant and successful, it had certain limitations. The first notable limitation is that it concentrated on lie detection within only three contexts: personal opinions, autobiographical memories, and future intentions.

This restricted scope limits the possibility of accurately classifying deceptive texts from other domains. A second limitation is that the research relied on datasets developed in experimental set-ups designed to collect genuine and completely fabricated narratives, whereas real-world deceptive statements often embed lies within otherwise truthful accounts.

Sources of Article

https://www.nature.com/articles/s41598-023-50214-0

