Researchers from Carnegie Mellon University and MIT Lincoln Laboratory have found that training an AI model with mathematically "diverse" teammates improves its ability to collaborate with AI agents it has never encountered before.

As AI masters tasks that were once exclusively human, such as driving cars, many see teaming intelligence as the next frontier. In this future, humans and AI will work together on high-stakes jobs such as complex surgery or missile defence. But before teaming intelligence can take off, researchers must solve a problem that undermines collaboration: people often neither like nor trust their AI partners.

Is diversity the key to collaboration?

New research now points to diversity as one of the essential ingredients for making AI a better team player. Researchers at MIT Lincoln Laboratory found that training an AI model with mathematically "diverse" teammates improves its ability to cooperate, in the card game Hanabi, with other AI it has never played with before. Around the same time, both Facebook and Google's DeepMind published work that injected diversity into training to improve outcomes in games where humans and AI collaborate.

Adapting to different behaviours

Many researchers use Hanabi as a test bed for developing cooperative AI. Hanabi requires players to work together to stack cards in the correct order, yet each player can see only their teammates' cards, not their own, and may offer only a limited number of hints about which cards the others hold.
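Hanabi's partial-information setup is easy to see in code. Below is a minimal sketch using DeepMind's open-source Hanabi Learning Environment, a common research test bed; the article does not say which implementation the lab used, so treat this as illustrative, and note that the random policy is only a stand-in for a trained agent.

```python
import random

from hanabi_learning_environment import rl_env

# Build a full two-player Hanabi game and play one episode with random
# (but legal) moves. A trained cooperative agent would replace the
# random choice below with a learned policy.
env = rl_env.make(environment_name="Hanabi-Full", num_players=2)
observations = env.reset()

done = False
while not done:
    current = observations["current_player"]
    obs = observations["player_observations"][current]
    # Each player's observation exposes only what Hanabi allows them to
    # see: teammates' hands, hint tokens, and the legal moves available.
    action = random.choice(obs["legal_moves"])
    observations, reward, done, _ = env.step(action)
```

Because every player sees everyone's hand but their own, a useful policy must reason about what its hints and plays reveal to teammates, which is exactly what makes the game a good cooperation benchmark.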

In a previous experiment, Lincoln Laboratory researchers had humans evaluate one of the world's best-performing Hanabi AI models. They were startled to discover that the humans despised playing with this model, describing it as a perplexing and unreliable teammate. "We're missing something about human preference," says Ross Allen, a researcher at the laboratory, "and we're not yet competent at developing models that could function in the real world."

Why reinforcement learning?

The researchers wondered whether cooperative AI needs to be trained differently. Reinforcement learning, the type of AI in question, traditionally learns to succeed at complex tasks by discovering which behaviours yield the greatest reward, and it is typically trained and tested against models similar to itself. This approach has produced unrivalled AI players in competitive games like Go and StarCraft.
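To make "discovering which behaviours yield the greatest reward" concrete, here is a hedged sketch of tabular Q-learning, one of the simplest reinforcement-learning algorithms. The environment interface (reset, step, actions) is a hypothetical stand-in, not anything from the research described here, which trains far larger models.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Learn action values for a toy environment with reset/step/actions."""
    q = defaultdict(float)  # (state, action) -> estimated long-run reward
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy: usually exploit the best-known action,
            # occasionally explore in search of higher-reward behaviour.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: q[(state, a)])
            next_state, reward, done = env.step(action)
            # Nudge the estimate toward reward + discounted future value.
            best_next = max(q[(next_state, a)] for a in env.actions)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q
```

Nothing in this loop cares who the teammate is: the agent simply chases reward, which is why self-play-trained models can excel at the game while remaining baffling partners.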

However, for AI to be a good collaborator, it may need to be concerned not only with maximizing reward while working with other AI agents but also with something more fundamental: understanding and adapting to the abilities and preferences of others. In other words, it must learn from, and adapt to, diversity.
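One way to operationalize "learning from diversity" is to treat diversity itself as an intrinsic reward: condition the agent on a randomly sampled latent "playstyle" and reward behaviour that a discriminator network can trace back to that style. The sketch below is loosely in the spirit of such diversity-training methods; the names, shapes, and the mixing weight beta are illustrative assumptions, not the lab's actual code.

```python
import torch
import torch.nn.functional as F

NUM_STYLES = 8  # size of the latent playstyle space (assumed)

def intrinsic_reward(discriminator, state, z):
    """Reward the agent when its behaviour reveals its assigned style.

    discriminator(state) returns logits over the NUM_STYLES latent
    styles; log p(z | state) is high only when the style is identifiable
    from behaviour, pushing the agent toward mutually distinguishable
    (i.e. diverse) ways of playing.
    """
    logits = discriminator(state)
    log_probs = F.log_softmax(logits, dim=-1)
    return log_probs[..., z]

# During training, the game's reward and the diversity bonus are mixed,
# so the agent still plays to win while spanning many distinct styles:
#   total_reward = env_reward + beta * intrinsic_reward(disc, state, z)
```

An agent trained against the resulting population of stylistically distinct partners cannot overfit to one conventional strategy, which is the intuition behind its better zero-shot coordination with unfamiliar teammates.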

Conclusion

This research did not include human testing of Any-Play, the lab's diversity-training algorithm. However, at the same time as the lab's work, DeepMind published a study that used a similar diversity-training strategy to produce an AI agent capable of playing the collaborative game Overcooked with humans. Allen explains that this outcome leads the team to believe their technique, which they consider far more generalizable, would also work well with humans. Facebook likewise used diversity in training to foster collaboration among Hanabi AI agents, but its method was more complex and required modifications to the Hanabi game rules to be tractable.

Whether inter-algorithm cross-play scores are reliable indicators of human preference remains unproven. To bring the human perspective back into the process, the researchers plan to study how a person's feelings toward an AI, such as mistrust or confusion, correlate with the objectives used to train that AI. Uncovering these links could accelerate progress in the field.
