• A “deep learning” software program from Google-owned lab DeepMind showed great progress in solving one of biology’s greatest challenges – understanding protein folding.

  • Protein folding is the process by which a protein takes its shape from a string of building blocks to its final three-dimensional structure, which determines its function.

  • By better predicting how proteins take their structure, or “fold,” scientists can more quickly develop drugs that, for example, block the action of crucial viral proteins.

Solving what biologists call “the protein-folding problem” is a big deal. Proteins are the workhorses of cells and are present in all living organisms. They are made up of long chains of amino acids and are vital for the structure of cells and communication between them, as well as for regulating all of the chemistry in the body.

This week, the Google-owned artificial intelligence company DeepMind demonstrated a deep-learning program called AlphaFold2, which experts are calling a breakthrough toward solving the grand challenge of protein folding.

Proteins are long chains of amino acids linked together like beads on a string. But for a protein to do its job in the cell, it must “fold” – a process of twisting and bending that transforms the molecule into a complex three-dimensional structure that can interact with its target in the cell. If the folding is disrupted, then the protein won’t form the correct shape – and it won’t be able to perform its job inside the body. This can lead to disease – as is the case in a common disease like Alzheimer’s, and rare ones like cystic fibrosis.

Deep learning is a computational technique that uses the often hidden information contained in vast datasets to answer questions of interest. It has been used widely in fields such as games, speech and voice recognition, autonomous cars, science and medicine.

I believe that tools like AlphaFold2 will help scientists to design new kinds of proteins, ones that can, for example, help break down plastics and fight future viral pandemics and disease.

I’m a computational chemist and author of the book The State of Science. My students and I study the structure and properties of fluorescent proteins using protein-folding computer programs based on classical physics.

After decades of study by thousands of research groups, these protein-folding prediction programs are very good at calculating structural changes that occur when we make small alterations to known molecules.

But they haven’t adequately managed to predict how proteins fold from scratch. Before deep learning came along, the protein-folding problem seemed impossibly hard, and it seemed poised to frustrate computational chemists for many decades to come.

A chain of amino acids goes through several folding steps, which occur through hydrogen bonds between amino acids in different regions of the protein, before arriving at the final structure. The example shown here is hemoglobin, a protein in red blood cells that transports oxygen to body tissues.
Anatomy & Physiology, Connexions website, CC BY

Protein folding

The sequence of the amino acids – which is encoded in DNA – defines the protein’s 3D shape. The shape determines its function. If the structure of the protein changes, it is unable to perform its function. Correctly predicting protein folds based on the amino acid sequence could revolutionize drug design, and explain the causes of new and old diseases.

All proteins with the same sequence of amino acid building blocks fold into the same three-dimensional form, which optimizes the interactions between the amino acids. They do this within milliseconds, even though they have an astronomical number of possible configurations available to them – about 10 to the power of 300. This massive number is what makes it hard to predict how a protein folds even when scientists know the full sequence of amino acids that go into making it. Previously, predicting the structure of a protein from its amino acid sequence was impossible. Protein structures were determined experimentally, a time-consuming and expensive endeavor.
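To get a feel for why a figure like 10 to the power of 300 rules out brute-force search, here is a rough back-of-the-envelope sketch in Python. The chain length (300 amino acids), the number of shapes per amino acid (10) and the sampling rate (10 trillion configurations per second) are illustrative assumptions chosen to reproduce the order of magnitude quoted above; they are not figures from DeepMind or from any experiment.

```python
import math

def log10_configurations(n_residues: int, states_per_residue: int) -> float:
    """log10 of states_per_residue ** n_residues, computed without overflow."""
    return n_residues * math.log10(states_per_residue)

def log10_years_to_enumerate(log10_configs: float, samples_per_second: float) -> float:
    """log10 of the years needed to visit every configuration exactly once."""
    seconds_per_year = 365.25 * 24 * 3600
    return log10_configs - math.log10(samples_per_second) - math.log10(seconds_per_year)

# Assumed, illustrative inputs (not taken from the article's sources).
N_RESIDUES = 300            # hypothetical chain length
STATES_PER_RESIDUE = 10     # hypothetical number of shapes per amino acid
SAMPLES_PER_SECOND = 1e13   # hypothetical search speed

LOG10_CONFIGS = log10_configurations(N_RESIDUES, STATES_PER_RESIDUE)
LOG10_YEARS = log10_years_to_enumerate(LOG10_CONFIGS, SAMPLES_PER_SECOND)

print(f"about 10^{LOG10_CONFIGS:.0f} configurations")
print(f"about 10^{LOG10_YEARS:.1f} years to try them all")
```

Under these assumptions an exhaustive search would take vastly longer than the age of the universe (roughly 10 to the power of 10 years), while a real protein folds in milliseconds. Working in logarithms keeps the numbers within floating-point range.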

Once researchers can better predict how proteins fold, they’ll be able to better understand how cells function and how misfolded proteins cause disease. Better protein prediction tools can also help us design drugs that target a specific topological region of a protein where chemical reactions take place.

What’s your move?
style-photography/Getty Images

AlphaFold is born from deep-learning chess, Go and poker games

The success of DeepMind’s protein-folding prediction program, called AlphaFold, isn’t unexpected. Other deep-learning programs written by DeepMind have demolished the world’s best chess, Go and poker players.

In 2016 Stockfish-8, an open-source chess engine, was the world’s computer chess champion. It evaluated 70 million chess positions per second and had centuries of accumulated human chess strategies and decades of computer experience to draw upon. It played efficiently and brutally, mercilessly beating all its human challengers without an ounce of finesse. Enter deep learning.

On Dec. 7, 2017, Google’s deep-learning chess program AlphaZero thrashed Stockfish-8. The chess engines played 100 games, with AlphaZero winning 28 and drawing 72. It didn’t lose a single game. AlphaZero did only 80,000 calculations per second, versus Stockfish-8’s 70 million, and it took just four hours to learn chess from scratch by playing against itself a few million times and optimizing its neural networks as it learned from its experience.
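The match figures above can be turned into a quick tally. The short Python sketch below uses the standard chess scoring convention (1 point for a win, half a point for a draw) and the per-second figures quoted in this article; the variable names are purely illustrative.

```python
# Match result as reported above: AlphaZero vs. Stockfish-8 over 100 games.
WINS, DRAWS, LOSSES = 28, 72, 0

# Per-second search figures as quoted in the article.
STOCKFISH_PER_SECOND = 70_000_000
ALPHAZERO_PER_SECOND = 80_000

games = WINS + DRAWS + LOSSES

# Standard chess scoring: 1 point per win, 0.5 per draw, 0 per loss.
alphazero_points = WINS * 1.0 + DRAWS * 0.5
stockfish_points = LOSSES * 1.0 + DRAWS * 0.5

# How many times more positions Stockfish-8 examined each second.
search_ratio = STOCKFISH_PER_SECOND / ALPHAZERO_PER_SECOND

print(f"AlphaZero {alphazero_points} : {stockfish_points} Stockfish-8 over {games} games")
print(f"Stockfish-8 searched {search_ratio:.0f} times more positions per second")
```

By this tally AlphaZero scored 64 points to 36 while examining 875 times fewer positions per second than its opponent.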

AlphaZero didn’t learn anything from humans or from chess games played by humans. It taught itself and, in the process, derived strategies never seen before. In a commentary in Science magazine, former world chess champion Garry Kasparov wrote that by learning from playing itself, AlphaZero developed strategies that “reflect the truth” of chess rather than reflecting “the priorities and prejudices” of the programmers. “It’s the embodiment of the cliché ‘work smarter, not harder.’”

How do proteins fold?

CASP – the Olympics for molecular modelers

Every two years, the world’s top computational chemists test the abilities of their programs to predict the folding of proteins and compete in the Critical Assessment of Structure Prediction (CASP) competition.

In the competition, teams are given the linear sequence of amino acids for about 100 proteins for which the 3D shape is known but hasn’t yet been published; they then have to compute how these sequences would fold. In 2018 AlphaFold, the deep-learning rookie at the competition, beat all the traditional programs – but barely.

Two years later, on Monday, it was announced that AlphaFold2 had won the 2020 competition by a healthy margin. It whipped its competitors, and its predictions were comparable to the existing experimental results determined through gold standard techniques like X-ray diffraction crystallography and cryo-electron microscopy. Soon I expect AlphaFold2 and its progeny will be the methods of choice to determine protein structures before resorting to experimental techniques that require painstaking, laborious work on expensive instrumentation.

One of the reasons for AlphaFold2’s success is that it could use the Protein Data Bank, which has over 170,000 experimentally determined 3D structures, to train itself to calculate the correctly folded structures of proteins.

The potential impact of AlphaFold can be appreciated if one compares the number of all published protein structures – roughly 170,000 – with the 180 million DNA and protein sequences deposited in the Universal Protein Database. AlphaFold will help us sort through treasure troves of DNA sequences, searching for new proteins with unique structures and functions.
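The gap between known sequences and solved structures is easier to appreciate as a ratio. The sketch below uses only the two rounded counts quoted above; the variable names are illustrative.

```python
# Rounded counts as quoted in the article.
KNOWN_STRUCTURES = 170_000        # experimentally determined 3D structures
KNOWN_SEQUENCES = 180_000_000     # deposited DNA and protein sequences

# Sequences on file for every solved structure.
sequences_per_structure = KNOWN_SEQUENCES / KNOWN_STRUCTURES

# Share of sequences that could have a matching solved structure, at most.
structure_share_percent = 100 * KNOWN_STRUCTURES / KNOWN_SEQUENCES

print(f"about {sequences_per_structure:,.0f} sequences per solved structure")
print(f"at most {structure_share_percent:.2f}% of sequences have a solved structure")
```

On these figures there are roughly a thousand sequences for every experimentally solved structure, which is the backlog that a reliable prediction tool could help work through.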

Has AlphaFold made me, a molecular modeler, redundant?

As with the chess and Go programs – AlphaZero and AlphaGo – we don’t know exactly what the AlphaFold2 algorithm is doing and why it uses certain correlations, but we do know that it works.

Besides helping us predict the structures of important proteins, understanding AlphaFold’s “thinking” will also help us gain new insights into the mechanism of protein folding.

One of the most common fears expressed about AI is that it will lead to large-scale unemployment. AlphaFold still has a significant way to go before it can consistently and successfully predict protein folding.

However, once it has matured and the program can simulate protein folding, computational chemists will be integrally involved in improving the programs, trying to understand the underlying correlations used, and applying the program to solve important problems such as the protein misfolding associated with many diseases, including Alzheimer’s, Parkinson’s, cystic fibrosis and Huntington’s disease.

AlphaFold and its offspring will certainly change the way computational chemists work, but it won’t make them redundant. Other areas won’t be as fortunate. In the past, robots were able to replace humans doing manual labor; with AI, our cognitive skills are also being challenged.

This article was originally published at theconversation.com