A team of AI researchers, biologists and evolutionary specialists has designed and built an AI model capable of generating the code needed to synthesize novel proteins. In their paper published in the journal Science, the group describes how they developed the model, which they call ESM3, and how they used it to synthesize a previously unknown bright fluorescent protein.

Prior research has shown that synthesizing proteins can provide unique insights into the structure and function of natural proteins. To date, most synthesized proteins have been copies of those found in nature. For this new study, the researchers used an AI model to simulate an evolutionary process and produce a protein that never existed naturally.

Generating artificial proteins opens new avenues of research, both for better understanding proteins and their uses and for developing novel applications. The research team used data about existing proteins as a basis for generating new ones.

ESM3 is a multimodal generative language model, which means that, like its chatbot cousins, it learns patterns from massive amounts of training data, and that it draws on more than one kind of data. In this case, the model was trained on 771 billion tokens generated from 3.15 billion protein sequences, 236 million protein structures and 539 million protein annotations.
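To make the idea of "tokens generated from protein sequences" concrete, here is a minimal sketch of how a protein sequence could be turned into integer tokens, with some positions masked so that a model can be asked to fill them in. The vocabulary, the special tokens and the function names are illustrative assumptions, not ESM3's actual tokenizer.

```python
# Toy protein tokenizer. The vocabulary and special tokens below are
# assumptions made for illustration; they are not ESM3's real vocabulary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
SPECIAL_TOKENS = ["<pad>", "<mask>", "<unk>"]
VOCAB = {tok: i for i, tok in enumerate(SPECIAL_TOKENS + list(AMINO_ACIDS))}


def tokenize(sequence):
    """Map each residue letter to an integer id; unknown letters map to <unk>."""
    unk = VOCAB["<unk>"]
    return [VOCAB.get(residue, unk) for residue in sequence.upper()]


def mask_positions(token_ids, positions):
    """Return a copy of token_ids with the given positions replaced by <mask>."""
    masked = list(token_ids)
    for pos in positions:
        masked[pos] = VOCAB["<mask>"]
    return masked


if __name__ == "__main__":
    ids = tokenize("MSKGEELFTG")
    print(ids)
    print(mask_positions(ids, [2, 5]))
```

A generative model trained on such token streams learns to predict the masked entries from their context, which is what allows it to propose new sequences.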

According to the researchers, this was like giving the model 500 million years of evolutionary knowledge, which allowed it to start with a basic sequence and evolve it over virtual time into a modern virtual protein. The virtual protein was then converted to a real-world artificial protein using standard protein synthesis techniques. The result was a protein with an amino acid sequence that was different from those of other known proteins.
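How different one protein is from another is commonly expressed as percent sequence identity. The sketch below computes it for two sequences that are assumed to be already aligned and of equal length; the function name and the example sequences are made up for illustration, and this is not the comparison procedure used in the paper.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned sequences match.

    Assumes both sequences are already aligned and have equal, non-zero
    length; real comparisons would run an alignment step first.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Two made-up five-residue fragments differing at one position.
    print(percent_identity("MKTAY", "MKSAY"))
```

The lower the identity to every known protein, the further the generated sequence sits from anything found in nature.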

The research team specifically asked their model to generate a new green fluorescent protein. Other such proteins, which fluoresce under ultraviolet light, are often used as markers. The team named the new protein esmGFP. They suggest that their model and others like it could be used to create new proteins for medicine, environmental research, and a wide variety of other applications.
