The structure of a human protein modelled by the AlphaFold computer program. Its database of proteins could “fundamentally change biological research”
KAREN ARNOTT/GETTY IMAGES

British scientists’ proteins breakthrough promises a medical revolution

 

Thursday, July 22, 2021

By Tom Whipple, Science Editor

Reprinted from The Times (London)

 

British researchers have opened up a “new vista of science” by creating a database of proteins that promises to power a revolution in biology and drug discovery.

A team from DeepMind, the AI company, has used a computer program called AlphaFold to predict the structures of 350,000 proteins, the building blocks of life, in one day more than doubling the number understood by science.

The tool promises a “revolution” in the life sciences, said Edith Heard, director-general of the European Molecular Biology Laboratory, which worked with DeepMind on making the predictions accessible.

“Proteins represent the fundamental building blocks that living organisms are made of,” she said.

“Accurately predicting their structures has a huge range of scientific applications from developing new drugs and treatments for disease, right through to designing future crops that can withstand climate change or enzymes that can degrade plastics. So the applications are actually limited only by our imagination.”

Proteins are the molecular machines that drive processes in our cells and throughout nature. Understanding the structures they make is key to understanding much of basic biology. Having reliable predictions for what they look like “will be transformative for our understanding of how life works”, said Heard.

Among the researchers given early access to the library were a team who work on enzymes that digest plastic. Professor John McGeehan, from the University of Portsmouth, said that it helped them “jump at least a year ahead”.

“What took us months and years to do, AlphaFold was able to do in a weekend.”

The newly released library, described in the journal Nature, includes 98.5 per cent of human proteins by length, where previously just 17 per cent had a known structure. For 36 per cent of these, the company believes it has accuracy equivalent to experimental methods. For 58 per cent, it thinks the accuracy is good enough to be useful to biology.

Despite the importance of proteins, determining their structure is extremely difficult, and in some cases impossible. Although it is now easy to find out their chemical composition, working out the resulting three-dimensional shape requires complex experimental work that can take months, for instance by crystallising the protein and then using x-rays to see the position of each atom.

Last year DeepMind, a U.K. subsidiary of Google’s parent company Alphabet, announced that it had designed a computer program that could take the chemical composition of a protein and in most cases accurately determine its shape.

The program, from the company that had previously created the world’s best chess and Go programs, used the structure of known proteins to teach itself the “rules” for how they folded. In the same way, it also taught itself to attach a confidence level to its predictions.

The latest work is the first large-scale output of the program. DeepMind plans to produce millions more predictions, and improve the ones it has already made.

Sir Paul Nurse, the Nobel Prize-winning biologist and head of the Francis Crick Institute in London, also had early access to the library.

“This is a very major contribution, very major,” he said. Many researchers would still want experimental confirmation of structures, at least initially, he said, but the library allowed them to test hypotheses on molecular pathways very quickly.

“What this provides is a much simpler and quicker way of getting that information”. Now, he said, he wanted to see what the community does with it. “There’s a huge amount of data here and what interests me is how you turn that data into knowledge.”

Although structural experiments will remain the gold standard, the ease with which many protein structures can now be accessed could entirely change the way the field operates, other researchers said.

“I know there’s going to be thousands of scientists tomorrow delightfully clicking through and looking at different structures and immediately having ideas, ideas about how that works and ideas about the next experiment,” said Ewan Birney, deputy director-general at EMBL.

“I keep pinching myself a bit about it,” he added. “A whole vista of this science is opening up.”

Understanding proteins is key to basic biology

What is a protein?
Proteins are the cogs of biology. They are the structures that form the moving parts of the fundamental processes that drive the natural world at the smallest level.

Why do we not know what they are like?
The chemical composition of a protein is precisely determined — DNA provides the code for the amino acids that build each protein. The problem is, the function of the protein is not determined by the chemicals in it so much as the three-dimensional shape they form. That is very hard to predict.

How do we predict it currently?
The gold standard is experimental work in which, for instance, the proteins are crystallised and then examined with x-rays. Such experiments are extremely fiddly, a little like trying to work out the shape of a skyscraper from its shadow.

What did DeepMind do?
The company’s computer program, AlphaFold, used machine learning to look through the known structures and spot patterns in how the different atoms wove and folded onto each other. Then it tested its pattern-spotting ability on newly analysed proteins whose structures had not yet been published. In an international competition, it came closer than any program before it to predicting the shape of novel proteins to near-atomic accuracy.

What can we do with this knowledge?
A lot of work is yet to be done. Scientists will want to test its predictions themselves before they trust them, even those the program itself rates as highly accurate. They will also want to apply the program to protein complexes, because some of the really interesting biology occurs when proteins interact with each other. But, ultimately, understanding and manipulating proteins is key to basic biology.

Particularly for scientists in less well-resourced fields, such as neglected tropical diseases, a free tool for determining protein structure could be transformative.