DeepMind has launched AlphaProof, a new reinforcement-learning-based system for formal mathematical reasoning, and AlphaGeometry 2, an improved version of its geometry-solving system. According to the company, the two systems together solved four out of six problems from this year’s International Mathematical Olympiad (IMO), the first time an AI system has achieved the same level as a silver medalist in the competition.

The IMO is the oldest and most prestigious competition for young mathematicians, held annually since 1959. Each year, elite pre-college mathematicians train, sometimes for thousands of hours, to solve six exceptionally difficult problems in algebra, combinatorics, geometry and number theory. The competition has also become widely recognised as a grand challenge in machine learning and an aspirational benchmark for measuring an AI system’s advanced mathematical reasoning capabilities.

This year, DeepMind applied its combined AI systems to the competition problems provided by the IMO organizers. First, the problems were manually translated into formal mathematical language so the systems could work on them. AlphaProof solved two algebra problems and one number theory problem by determining the answer and proving it was correct, while AlphaGeometry 2 solved the geometry problem. In the official competition, students submit answers in two sessions of 4.5 hours each; according to DeepMind, its systems solved one problem within minutes but took up to three days on others. The solutions were scored according to the IMO’s point-awarding rules by two prominent mathematicians: Prof Sir Timothy Gowers, an IMO gold medalist and Fields Medal winner, and Dr Joseph Myers, a two-time IMO gold medalist and Chair of the IMO 2024 Problem Selection Committee.

In total, the systems earned 28 out of 42 points, reaching the same level as a silver medalist and finishing one point short of the gold-medal threshold.

Meet AlphaProof

AlphaProof is a system that trains itself to prove mathematical statements in the formal language Lean. It couples a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously taught itself how to master the games of chess, shogi and Go.
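
To give a flavour of what “formal language” means here, below is a minimal illustrative example of a statement and proof written in Lean 4. The theorem is a textbook triviality chosen for clarity; it is not taken from AlphaProof’s training data or output.

```lean
-- A toy Lean 4 statement and proof (illustrative, not AlphaProof output):
-- "every natural number has a strictly larger one".
-- Lean's kernel mechanically checks the proof term, which is why a
-- verified proof can be trusted as a training signal.
theorem no_largest_nat : ∀ n : Nat, ∃ m : Nat, n < m :=
  fun n => ⟨n + 1, Nat.lt_succ_self n⟩
```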

When presented with a problem, AlphaProof generates solution candidates and then proves or disproves them by searching for possible proof steps in Lean. Each proof that is found and verified reinforces AlphaProof’s language model, enhancing its ability to solve subsequent, more challenging problems.
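
As a rough, self-contained illustration of this loop (not DeepMind’s code), the toy Python sketch below replaces the language model with a weight table over candidate answers and replaces Lean verification with a direct arithmetic check; all names here are hypothetical stand-ins. Only verified candidates feed the “reinforcement” step, mirroring the description above.

```python
import random

# Toy stand-ins for the components described above (illustrative only):
# the "model" is a weight table over candidate answers, "verification"
# is a direct arithmetic check standing in for Lean's proof checker,
# and the "problems" are trivial: find x with x * x == target.

def propose_candidates(weights, n=4):
    """Stand-in for the language model: sample candidate answers,
    biased toward answers reinforced in earlier iterations."""
    population = list(range(11))
    bias = [1 + weights.get(x, 0) for x in population]
    return random.choices(population, weights=bias, k=n)

def is_verified(target, candidate):
    """Stand-in for Lean proof search: only a checked 'proof' (here,
    an exact arithmetic identity) counts as a solution."""
    return candidate * candidate == target

def reinforce(weights, candidate):
    """Stand-in for the AlphaZero-style update: verified candidates
    are upweighted so future proposals improve."""
    weights[candidate] = weights.get(candidate, 0) + 1

weights = {}
for _ in range(2000):
    target = random.choice([1, 4, 9, 16, 25])
    for c in propose_candidates(weights):
        if is_verified(target, c):  # only verified answers train the "model"
            reinforce(weights, c)

print(weights)  # the correct square roots accumulate weight over time
```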

The company trained AlphaProof for the IMO by having it prove or disprove millions of problems, covering a wide range of difficulties and mathematical topic areas, in the weeks leading up to the competition. The training loop was also applied during the contest itself, reinforcing proofs of self-generated variations of the contest problems until a full solution could be found.

AlphaGeometry 2 is a significantly improved version of AlphaGeometry. It is a neuro-symbolic hybrid system whose language model is based on Gemini and was trained from scratch on an order of magnitude more synthetic data than its predecessor. This helped the model tackle much more challenging geometry problems, including problems about movements of objects and equations of angles, ratios, or distances.
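
In the original AlphaGeometry, the division of labour was that a symbolic deduction engine derives consequences of the known facts, and the language model suggests auxiliary constructions when the engine gets stuck. The toy, runnable Python sketch below illustrates that alternation; the names `deduce_closure` and `propose_auxiliary`, and the trivial “parallel lines” rule, are illustrative assumptions, not DeepMind’s implementation.

```python
# Toy neuro-symbolic alternation in the spirit of AlphaGeometry:
# the symbolic "engine" closes a set of parallel(a, b) facts under
# transitivity; when the goal is still out of reach, a stand-in for
# the language model proposes one auxiliary fact that unlocks
# further deduction.

def deduce_closure(facts):
    """Symbolic step: transitive closure of parallel(a, b) facts."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed

def propose_auxiliary(facts, goal):
    """Neural step (stand-in): guess a bridging fact via a new
    auxiliary line "M", loosely analogous to an auxiliary point."""
    a, d = goal
    return (a, "M") if (a, "M") not in facts else ("M", d)

facts = {("AB", "CD")}   # given: AB parallel to CD
goal = ("AB", "EF")      # wanted: AB parallel to EF
for _ in range(5):
    facts = deduce_closure(facts)
    if goal in facts:
        print("proved:", goal)
        break
    facts.add(propose_auxiliary(facts, goal))
```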

Before this year’s competition, AlphaGeometry 2 could solve 83% of all historical IMO geometry problems from the past 25 years, compared with the 53% rate achieved by its predecessor. For IMO 2024, AlphaGeometry 2 solved Problem 4 within 19 seconds of receiving its formalization.

New frontiers

According to DeepMind, it also experimented with a natural-language reasoning system, built on Gemini, that doesn’t require problems to be translated into a formal language and can be combined with other AI systems. The company tested this approach on this year’s IMO problems, and the results showed great promise. DeepMind believes that in the future, mathematicians could use AI tools to explore hypotheses, try bold new approaches to long-standing problems, and quickly complete time-consuming elements of proofs.


Disclaimer: INDIAai has not tested the platform. This story is entirely based on the information shared by the company. 
