MIT researchers have created a user-friendly system called SymGen that makes it easier for people to verify an LLM’s responses. Large language models sometimes “hallucinate,” generating incorrect or unsupported information in response to a query.
Due to this hallucination problem, human fact-checkers often verify an LLM’s responses, especially if a model is deployed in a high-stakes setting like health care or finance. However, validation processes typically require people to read through long documents cited by the model, a task so onerous and error-prone it may prevent some users from deploying generative AI models in the first place.
SymGen was built to ease that burden on human validators. With SymGen, an LLM generates responses with citations that point directly to a place in a source document, such as a given cell in a database.
Users can hover over highlighted portions of the text response to see the data the model used to generate a specific word or phrase. At the same time, the unhighlighted portions show users which phrases still need additional attention and verification.
“We give people the ability to selectively focus on parts of the text they need to be more worried about. In the end, SymGen can give people higher confidence in a model’s responses because they can easily take a closer look to ensure that the information is verified,” says Shannon Shen, an electrical engineering and computer science graduate student and co-lead author of a paper on SymGen.
Through a user study, Shen and his collaborators found that SymGen sped up verification by about 20 percent compared to manual procedures. By making it faster and easier for humans to validate model outputs, SymGen could help people identify errors in LLMs deployed in a variety of real-world situations, from generating clinical notes to summarizing financial market reports.
A SymGen user first provides the LLM with data it can reference in its response, such as a table that contains statistics from a basketball game. Then, rather than immediately asking the model to complete a task, like generating a game summary from those data, the researchers perform an intermediate step. They prompt the model to generate its response in a symbolic form.
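As a rough illustration of that setup (the table layout, cell IDs, and prompt wording below are assumptions made for this sketch, not SymGen’s published format), the input might look like this:

```python
# Hypothetical source table: box-score statistics from a basketball game.
# Cell IDs like "R1C1" are invented for this sketch; SymGen's real schema
# may differ.
game_stats = {
    "R1C1": "Portland Trailblazers",
    "R1C2": "105",  # points scored by the Trailblazers
    "R2C1": "Memphis Grizzlies",
    "R2C2": "98",   # points scored by the Grizzlies
}

# Prompt the model to answer symbolically: cite cells, don't copy values.
prompt = (
    "Summarize the game using the table below. Whenever you use a value "
    "from the table, emit a placeholder of the form {{cell_id}} instead "
    "of copying the text.\n\n"
    f"Table: {game_stats}"
)
```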
With this prompt, every time the model wants to cite words in its response, it must instead write the name of the specific cell in the data table that contains the information it is referencing. For instance, if the model wants to cite the phrase “Portland Trailblazers” in its response, it would replace that text with the name of the cell in the data table that contains those words.
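Under that scheme, a symbolic response might look like the following (the double-brace placeholder syntax is assumed here for illustration):

```python
# Illustrative symbolic output: every factual span is a cell reference,
# not copied text, so each claim is traceable to one table cell.
symbolic_response = "The {{R1C1}} defeated the {{R2C1}} {{R1C2}}-{{R2C2}}."
```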
SymGen then resolves each reference using a rule-based tool that copies the corresponding text from the data table into the model’s response.
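A minimal sketch of such a rule-based resolver, continuing the example above (the placeholder pattern and error handling are assumptions, not SymGen’s actual implementation):

```python
import re

def resolve_references(symbolic_text: str, table: dict) -> str:
    """Replace each {{cell_id}} placeholder with the verbatim cell text."""
    def lookup(match: re.Match) -> str:
        cell_id = match.group(1)
        # Fail loudly on a dangling reference rather than guessing:
        # a citation to a nonexistent cell should surface as an error.
        if cell_id not in table:
            raise KeyError(f"Model cited unknown cell: {cell_id}")
        return table[cell_id]

    return re.sub(r"\{\{(\w+)\}\}", lookup, symbolic_text)

print(resolve_references(symbolic_response, game_stats))
# -> "The Portland Trailblazers defeated the Memphis Grizzlies 105-98."
```

Because the resolved text is copied verbatim from the table, each substituted span can be traced back to exactly one source cell, which is what lets the interface highlight it for quick spot-checking.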
In the user study, most participants said SymGen made it easier to verify LLM-generated text; they could validate the model’s responses about 20 percent faster than with standard manual methods.
However, SymGen is limited by the quality of the source data. The LLM could cite an incorrect variable, and a human verifier may be none the wiser.