In collaboration with scientists from the University of Oxford and the University of British Columbia, Sakana AI has developed The AI Scientist, an artificial intelligence (AI) system that can conduct end-to-end scientific research autonomously. In the report introducing the system, the team proposed and ran a fully AI-driven pipeline for automated scientific discovery, applied to machine learning research. The AI Scientist automates the entire research lifecycle: generating novel research ideas, writing any necessary code, executing experiments, summarizing and visualizing the results, and presenting the findings in a full scientific manuscript.

The team also introduced an automated peer review process that evaluates generated papers, writes feedback, and helps improve results. According to the team, this automated reviewer evaluates generated papers with near-human accuracy. The discovery process is repeated to develop ideas iteratively in an open-ended fashion and add them to a growing archive of knowledge, thus imitating the human scientific community.

About The AI Scientist

The AI Scientist is designed to be compute-efficient. Each idea is implemented and developed into a full paper at a cost of approximately $15 per paper. While the documents produced by this first version still contain occasional flaws, this cost and the promise the system has shown so far illustrate the potential of AI scientists to democratize research and significantly accelerate scientific progress.

According to Sakana AI, this work signifies the beginning of a new era in scientific discovery: bringing the transformative benefits of AI agents to the entire research process, including that of AI itself. The team believes that the AI Scientist takes us closer to a world where endless affordable creativity and innovation can be unleashed on the world’s most challenging problems.

The AI Scientist is a fully automated pipeline for end-to-end paper generation, enabled by recent advances in foundation models. Given a broad research direction and a simple initial codebase to start from, such as an open-source codebase from prior research available on GitHub, The AI Scientist can perform idea generation, literature search, experiment planning, experiment iterations, figure generation, manuscript writing, and reviewing to produce insightful papers. Furthermore, The AI Scientist can run in an open-ended loop, using its previous ideas and feedback to improve the next generation of ideas, thus emulating the human scientific community.

The AI Scientist first “brainstorms” a diverse set of novel research directions. In the second phase, given an idea and a template, it executes the proposed experiments, then collects the results and produces plots to visualize them. Finally, it writes a concise and informative summary of its progress in LaTeX, in the style of a standard machine learning conference paper. A key aspect of this work is the development of an automated LLM-powered reviewer capable of evaluating generated papers with near-human accuracy.
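The phases described above can be pictured as a simple loop. The following is a minimal Python sketch under stated assumptions, not Sakana AI's implementation: every name in it (`run_pipeline`, `Archive`, `Paper`, and the `stub_*` stages) is hypothetical, the stages are placeholders that return canned values instead of calling foundation models, and steps such as literature search and figure generation are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Paper:
    idea: str
    results: Dict[str, float]
    manuscript: str
    score: float
    feedback: str


@dataclass
class Archive:
    """Growing record of finished papers that later generations can read."""
    papers: List[Paper] = field(default_factory=list)


def run_pipeline(
    generate_ideas: Callable[[Archive, int], List[str]],
    run_experiments: Callable[[str], Dict[str, float]],
    write_paper: Callable[[str, Dict[str, float]], str],
    review: Callable[[str], Tuple[float, str]],
    generations: int,
    ideas_per_generation: int,
) -> Archive:
    """Hypothetical open-ended loop: ideas -> experiments -> paper -> review."""
    archive = Archive()
    for _ in range(generations):
        # Idea generation receives the archive, so earlier papers and their
        # reviews can shape the next round of ideas.
        for idea in generate_ideas(archive, ideas_per_generation):
            results = run_experiments(idea)
            manuscript = write_paper(idea, results)
            score, feedback = review(manuscript)
            archive.papers.append(
                Paper(idea, results, manuscript, score, feedback)
            )
    return archive


# Placeholder stages so the sketch runs end to end. A real system would
# call foundation models and execute experiment code at these points.
def stub_generate_ideas(archive: Archive, n: int) -> List[str]:
    start = len(archive.papers)
    return [f"idea-{start + i}" for i in range(n)]


def stub_run_experiments(idea: str) -> Dict[str, float]:
    return {"metric": float(len(idea))}


def stub_write_paper(idea: str, results: Dict[str, float]) -> str:
    return f"Title: {idea}\nResult: {results['metric']}"


def stub_review(manuscript: str) -> Tuple[float, str]:
    return 5.0, "placeholder feedback"


if __name__ == "__main__":
    archive = run_pipeline(
        stub_generate_ideas,
        stub_run_experiments,
        stub_write_paper,
        stub_review,
        generations=2,
        ideas_per_generation=3,
    )
    print(len(archive.papers))  # 2 generations x 3 ideas = 6 papers
```

Passing the stages in as functions keeps the loop itself independent of any particular model or experiment runner, which is a design choice of this sketch rather than something the report specifies.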

Overcoming limitations

In its current form, The AI Scientist has several shortcomings. It currently lacks vision capabilities, so it cannot read plots or fix visual issues with the paper. For example, the generated plots are sometimes unreadable, tables sometimes exceed the width of the page, and the page layout is often suboptimal. Adding multi-modal foundation models could fix this.

The AI Scientist can also implement its ideas incorrectly or compare against baselines unfairly, leading to misleading results, and it occasionally makes critical errors when writing up and evaluating results. The researchers expect all of these shortcomings to improve, likely dramatically, in future versions, with the inclusion of multi-modal models and as the underlying foundation models The AI Scientist uses continue to radically improve in capability and affordability.
