24 Facts About AlphaFold


AlphaFold is an artificial intelligence program developed by DeepMind, a subsidiary of Alphabet, which performs predictions of protein structure.

FactSnippet No. 1,582,758

A team that used AlphaFold 2 repeated the first-place finish in the CASP competition in November 2020.

FactSnippet No. 1,582,759

AlphaFold first competed in CASP in 2018, using a deep learning technique.

FactSnippet No. 1,582,760

AlphaFold 1 built on work by various teams in the 2010s that examined the large databanks of related DNA sequences now available from many different organisms, looking for changes at different residues that appeared to be correlated even though the residues were not consecutive in the main chain.

FactSnippet No. 1,582,761
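
The idea of correlated changes at non-consecutive residues can be illustrated with a small sketch. It computes the mutual information between two columns of a multiple sequence alignment. The alignment below is invented toy data, and mutual information is only one simple covariation statistic, chosen here for illustration; it is not claimed to be the measure used by AlphaFold 1 or the earlier teams.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)              # residue counts in column i
    count_j = Counter(col_j)              # residue counts in column j
    count_ij = Counter(zip(col_i, col_j))  # joint counts of residue pairs
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 3 always change together.
toy_msa = [
    "AKLE",
    "AKLE",
    "DKLR",
    "DKLR",
    "AKIE",
    "DKIR",
]

covarying = mutual_information(toy_msa, 0, 3)   # columns that change together
conserved = mutual_information(toy_msa, 0, 1)   # column 1 never changes
```

In this toy data, columns 0 and 3 score high because they vary in lockstep, while a fully conserved column carries no covariation signal at all.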

Central to AlphaFold is a distance map predictor implemented as a very deep residual neural network with 220 residual blocks, processing a representation of dimensionality 64×64×128 that corresponds to input features calculated from two 64-amino-acid fragments.

FactSnippet No. 1,582,762
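
To make the 64×64 crop concrete, the following sketch lists the top-left corners of 64×64 windows that tile the residue-by-residue pair map of a protein. Each window pairs one 64-residue fragment (rows) with another (columns). The function name, the stride, and the way the final window is shifted to cover the chain end are assumptions made for illustration, not details taken from the AlphaFold authors.

```python
def crop_windows(seq_len, crop=64, stride=64):
    """Return (row_start, col_start) corners of crop x crop windows on an L x L pair map.

    Assumption: windows are placed on a regular grid, and one extra start is
    added so that the last residues of the chain are always covered.
    """
    starts = list(range(0, max(seq_len - crop, 0) + 1, stride))
    if seq_len > crop and starts[-1] != seq_len - crop:
        starts.append(seq_len - crop)  # shifted window covering the chain end
    return [(i, j) for i in starts for j in starts]


# Hypothetical 150-residue protein: fragment starts are 0, 64 and 86.
windows = crop_windows(150)
first_rows = (windows[0][0], windows[0][0] + 64)  # residue range of fragment 1
```

A chain shorter than the crop size yields a single window starting at (0, 0), which in practice would need padding; how that case is handled is left open here.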

Alongside a distance map in the form of a very finely grained histogram of distances, AlphaFold predicts Φ and Ψ angles for each residue, which are used to create the initial predicted 3D structure.

FactSnippet No. 1,582,763
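
A histogram of distances for one residue pair can be reduced to simpler quantities, as in the sketch below. It computes an expected distance and the probability that the pair is in contact. The bin edges and probabilities are made-up toy values, far coarser than a finely grained histogram, and the 8 Å contact cutoff is a common convention assumed here rather than a figure from the text.

```python
def summarize_distogram(probs, bin_edges, contact_cutoff=8.0):
    """Return (expected_distance, contact_probability) for one residue pair.

    probs[k] is the predicted probability that the distance falls between
    bin_edges[k] and bin_edges[k + 1] (distances in angstroms).
    """
    if len(bin_edges) != len(probs) + 1:
        raise ValueError("need exactly one more bin edge than probabilities")
    total = sum(probs)  # renormalise in case the histogram does not sum to 1
    uppers = bin_edges[1:]
    centers = [(lo + hi) / 2 for lo, hi in zip(bin_edges, uppers)]
    expected = sum(p * c for p, c in zip(probs, centers)) / total
    # Probability mass in bins lying entirely below the contact cutoff.
    contact = sum(p for p, hi in zip(probs, uppers) if hi <= contact_cutoff) / total
    return expected, contact


# Hypothetical five-bin histogram for a single residue pair.
toy_edges = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
toy_probs = [0.1, 0.2, 0.4, 0.2, 0.1]
expected_distance, contact_probability = summarize_distogram(toy_probs, toy_edges)
```

Keeping the full histogram instead of a single number preserves how confident the prediction is, which is the richness of information that a bare contact map lacks.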

The AlphaFold authors concluded that the depth of the model, its large crop size, the large training set of roughly 29,000 proteins, modern deep learning techniques, and the richness of information from the predicted histogram of distances helped AlphaFold achieve high precision in contact map prediction.

FactSnippet No. 1,582,764

The software design used in AlphaFold 1 contained a number of modules, each trained separately, that were used to produce the guide potential, which was then combined with the physics-based energy potential.

FactSnippet No. 1,582,765

AlphaFold 2 replaced this with a system of sub-networks coupled together into a single differentiable end-to-end model, based entirely on pattern recognition and trained as one integrated structure.

FactSnippet No. 1,582,766

The AlphaFold team stated in November 2020 that they believed AlphaFold could be developed further, with room for improvements in accuracy.

FactSnippet No. 1,582,767

The October 2021 update, named AlphaFold-Multimer, included protein complexes in its training data.

FactSnippet No. 1,582,768

In December 2018, DeepMind's AlphaFold placed first in the overall rankings of the 13th Critical Assessment of Techniques for Protein Structure Prediction.

FactSnippet No. 1,582,769

AlphaFold gave the best prediction for 25 out of 43 protein targets in this class, achieving a median score of 58.

FactSnippet No. 1,582,770

As of 5 March 2021, DeepMind had not announced plans to make the AlphaFold code publicly available.

FactSnippet No. 1,582,771

In 2018, AlphaFold 1 had reached this level of accuracy in only two of its predictions.

FactSnippet No. 1,582,772

AlphaFold 2 achieved an accuracy in modelling surface side chains described as "really really extraordinary".

FactSnippet No. 1,582,773

Of the three structures that AlphaFold 2 had the least success in predicting, two had been obtained by protein NMR methods, which define protein structure directly in aqueous solution, whereas AlphaFold was mostly trained on protein structures in crystals.

FactSnippet No. 1,582,774

AlphaFold 2 scoring more than 90 in CASP's global distance test is considered a significant achievement in computational biology and great progress towards a decades-old grand challenge of biology.

FactSnippet No. 1,582,775
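
For readers unfamiliar with the global distance test, the sketch below shows how a GDT_TS-style score on a 0 to 100 scale can be computed: it averages the fractions of residues whose predicted position lies within 1, 2, 4 and 8 Å of the experimental position. This is a simplification. The real procedure searches over many superpositions of the two structures, whereas this sketch assumes the per-residue deviations have already been measured after a single fixed superposition. The deviation values are invented.

```python
def gdt_ts(deviations, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS from per-residue deviations in angstroms.

    Assumption: the deviations come from one fixed superposition of the
    predicted and experimental structures.
    """
    if not deviations:
        raise ValueError("need at least one residue")
    n = len(deviations)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= cutoff for d in deviations) / n for cutoff in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical deviations for an eight-residue toy structure.
toy_deviations = [0.5, 0.9, 1.5, 3.0, 6.0, 10.0, 0.7, 2.5]
score = gdt_ts(toy_deviations)
```

Under this definition, a score above 90 requires nearly all residues to fall within the tighter cutoffs, not just the 8 Å one.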

AlphaFold 2's predictions tended to be least refined and least reliable for proteins that interact strongly either with other copies of themselves or with other structures.

FactSnippet No. 1,582,776

However, the authors highlighted that many AlphaFold models were accurate enough to allow for the introduction of post-predictional modifications.

FactSnippet No. 1,582,777

At launch, the database contained AlphaFold-predicted models of protein structures for nearly the full UniProt proteome of humans and 20 model organisms, amounting to over 365,000 proteins.

FactSnippet No. 1,582,778

AlphaFold planned to add more sequences to the collection, the initial goal being to cover most of the UniRef90 set of more than 100 million proteins.

FactSnippet No. 1,582,779

AlphaFold DB uses a monomeric model similar to the CASP14 version.

FactSnippet No. 1,582,780

AlphaFold has been used to predict structures of proteins of SARS-CoV-2, the causative agent of COVID-19.

FactSnippet No. 1,582,781