MIT researchers have developed a computational model that may be able to identify antibody drugs targeting a variety of infectious diseases. This work could enable researchers to sift through millions of possible antibodies to identify those that could be used to treat SARS-CoV-2 infection and other infectious diseases.
“Our method allows us to scale, whereas others do not, to the point where we can actually find a few needles in the haystack,” says Bonnie Berger, the Simons Professor of Mathematics, the head of the Computation and Biology group at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and one of the senior authors of the new study. “If we could help to stop drug companies from going into clinical trials with the wrong thing, it would really save a lot of money.”
The technique, which focuses on modelling the hypervariable regions of antibodies, also holds the potential for analyzing entire antibody repertoires from individual people. This could be useful for studying the immune response of people who are super responders to diseases such as HIV to help figure out why their antibodies fend off the virus so effectively.
Hypervariable regions vary in length but usually contain fewer than 40 amino acids. It has been estimated that the human immune system can produce up to 1 quintillion different antibodies by changing the sequence of these amino acids, helping to ensure that the body can respond to a huge variety of potential antigens. Those sequences aren’t evolutionarily constrained in the same way that other protein sequences are, so it’s difficult for large language models to learn to predict their structures accurately.
To model those hypervariable regions, the researchers created two modules that build on existing protein language models. One of these modules was trained on hypervariable sequences from about 3,000 antibody structures found in the Protein Data Bank (PDB), allowing it to learn which sequences tend to generate similar structures. The other module was trained on data that correlates about 3,700 antibody sequences to how strongly they bind three different antigens.
The resulting computational model, known as AbMap, can predict antibody structures and binding strength based on their amino acid sequences. To demonstrate the usefulness of this model, the researchers used it to predict antibody structures that would strongly neutralize the spike protein of the SARS-CoV-2 virus.
The researchers started with a set of antibodies that had been predicted to bind to this target, then generated millions of variants by changing the hypervariable regions. Their model was able to identify antibody structures that would be the most successful, much more accurately than traditional protein-structure models based on large language models.
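The variant-generation step described above can be illustrated with a minimal sketch. It is not the study's pipeline: the parent sequence, the position of the hypervariable span, and the scoring function are all invented for illustration, and the placeholder score simply stands in for whatever a trained model would predict.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def mutate_region(sequence, start, end, n_mutations, rng):
    """Substitute n_mutations distinct positions inside sequence[start:end]."""
    residues = list(sequence)
    for pos in rng.sample(range(start, end), n_mutations):
        # Always pick a residue different from the current one.
        alternatives = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(alternatives)
    return "".join(residues)


def generate_variants(sequence, start, end, n_variants, n_mutations=2, seed=0):
    """Collect n_variants unique variants that differ only in the chosen region."""
    rng = random.Random(seed)
    variants = set()
    while len(variants) < n_variants:
        variants.add(mutate_region(sequence, start, end, n_mutations, rng))
    return sorted(variants)


def toy_score(sequence):
    """Placeholder score (count of aromatic residues); NOT a binding predictor."""
    return sum(sequence.count(aa) for aa in "FWY")


def top_candidates(variants, score_fn, k):
    """Return the k highest-scoring variants."""
    return sorted(variants, key=score_fn, reverse=True)[:k]


parent = "QVQLVESGGGARDYYGSSYFDYWGQG"  # invented sequence, for illustration only
region = (10, 22)                      # hypothetical hypervariable span
variants = generate_variants(parent, *region, n_variants=50)
shortlist = top_candidates(variants, toy_score, k=5)
```

In a real screen, the placeholder `toy_score` would be replaced by a model's predicted binding strength, and the number of variants would run into the millions rather than 50.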
Then, the researchers took the additional step of clustering the antibodies into groups that had similar structures. They chose antibodies from each of these clusters to test experimentally, working with researchers at Sanofi. Those experiments found that 82 per cent of these antibodies had better binding strength than the original antibodies that went into the model.
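The idea of grouping candidates by structural similarity and then testing representatives from each group can be sketched as follows. The article does not say which clustering algorithm the researchers used; this example swaps in a simple greedy "leader" clustering over made-up two-dimensional embedding vectors, with hypothetical scores, purely to show the shape of the step.

```python
import math


def leader_cluster(vectors, threshold):
    """Greedy clustering: join the first cluster whose leader is within threshold."""
    leaders = []  # index of the first member of each cluster
    labels = []
    for i, vec in enumerate(vectors):
        for cluster_id, leader in enumerate(leaders):
            if math.dist(vec, vectors[leader]) <= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing leader is close enough, so start a new cluster.
            leaders.append(i)
            labels.append(len(leaders) - 1)
    return labels


def pick_per_cluster(labels, scores):
    """Return the index of the best-scoring member of each cluster."""
    best = {}
    for i, (label, score) in enumerate(zip(labels, scores)):
        if label not in best or score > scores[best[label]]:
            best[label] = i
    return [best[label] for label in sorted(best)]


# Hypothetical structure embeddings and predicted scores for five candidates.
embeddings = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.2)]
scores = [0.2, 0.9, 0.4, 0.3, 0.5]

labels = leader_cluster(embeddings, threshold=1.0)
chosen = pick_per_cluster(labels, scores)
```

Picking one candidate per cluster, rather than simply the top scorers overall, keeps the experimental shortlist structurally diverse.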
Identifying a variety of good candidates early in the development process could help drug companies avoid spending a lot of money on testing candidates that end up failing later on, the researchers say.
Using this technique, researchers could also try to answer some longstanding questions about why different people respond to infection differently. For example, why do some people develop much more severe forms of COVID-19, and why do some people who are exposed to HIV never become infected?
Scientists have been trying to answer those questions by performing single-cell RNA sequencing of immune cells from individuals and comparing them, a process known as antibody repertoire analysis. Previous work has shown that antibody repertoires from two different people may overlap by as little as 10 per cent. However, sequencing doesn’t offer as comprehensive a picture of antibody performance as structural information because two antibodies that have different sequences may have similar structures and functions.
The new model can help to solve that problem by quickly generating structures for all of the antibodies found in an individual. In this study, the researchers showed that when the structure is taken into account, there is much more overlap between individuals than the 10 per cent seen in sequence comparisons. They now plan to further investigate how these structures may contribute to the body’s overall immune response against a particular pathogen.
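To see why structural comparison can reveal more overlap than sequence comparison, consider the toy sketch below. The sequences, embedding vectors, distance threshold, and both overlap definitions are assumptions made for illustration; they are not the measures used in the study.

```python
import math


def sequence_overlap(rep_a, rep_b):
    """Fraction of sequences in rep_a that appear verbatim in rep_b."""
    shared = sum(1 for seq in rep_a if seq in rep_b)
    return shared / len(rep_a)


def structure_overlap(rep_a, rep_b, threshold):
    """Fraction of rep_a whose embedding lies within threshold of any rep_b embedding."""
    matched = sum(
        1
        for emb_a in rep_a.values()
        if any(math.dist(emb_a, emb_b) <= threshold for emb_b in rep_b.values())
    )
    return matched / len(rep_a)


# Invented repertoires: sequence -> hypothetical structure embedding.
person_a = {"ARDYYG": (0.0, 0.0), "ARGWSS": (1.0, 1.0), "TKDFYG": (4.0, 4.0)}
person_b = {"ARDYYG": (0.0, 0.0), "VRGWTS": (1.1, 0.9), "QQQQQQ": (9.0, 9.0)}

seq_frac = sequence_overlap(person_a, person_b)
struct_frac = structure_overlap(person_a, person_b, threshold=0.5)
```

Here only one of the three sequences is shared exactly, but two of the three have a close structural counterpart, because "ARGWSS" and "VRGWTS" differ in sequence while sitting near each other in the assumed embedding space.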