AlphaFold 3: The Turns of the Amino
[Paper: Accurate structure prediction of biomolecular interactions with AlphaFold 3]
Summary by Adrian Wilkins-Caruana
In 2021, Google DeepMind released AlphaFold 2, a deep learning–based algorithm for protein structure prediction. Protein structure prediction (often loosely called protein folding) is the task of predicting the 3D coordinates of the heavy atoms in a given protein from basic information about that protein, such as its primary amino acid sequence. The methods that preceded AlphaFold 2, including the original AlphaFold, were OK, but AlphaFold 2 blew them out of the water, achieving error rates 3x smaller than the next best method. Now AlphaFold 3, which Google DeepMind developed in collaboration with Isomorphic Labs, goes beyond proteins to model a broad spectrum of biomolecules.
Given an input list of molecules, AlphaFold 3 can predict their joint 3D structure to reveal how they all fit together. In addition to proteins, AlphaFold 3 can model other large molecules like DNA and RNA, as well as smaller ones known as ligands. For example, the figure below shows the predicted structure of a protein (blue) bound to a double helix of DNA (pink), and how this prediction compares to the ground-truth, experimentally measured structure (gray).
Architecturally, AlphaFold 3 is very similar to AlphaFold 2. Each has two main components: one that generates representations of the molecules, and another that predicts their structure. AlphaFold 3’s representation module, called Pairformer, is a simpler version of AlphaFold 2’s Evoformer. Both work like a transformer: the attention mechanism operates on the chemical structure of the biomolecule, and a gating mechanism generates representations for pairs of atoms in the molecule. This pair representation has one entry for every pair of atoms, so it has the same square shape as the causal attention mask in a transformer. The figure below shows an example of the pair representation, and how its elements correspond to atoms in a graph representation of a molecule.
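To make the idea of a pair representation a bit more concrete, here is a minimal sketch of attention whose logits are shifted by one number per pair of tokens. It's a toy rather than AlphaFold 3's actual code: the function name `attention_with_pair_bias`, the single attention head, and the hand-written bias matrix are all assumptions made for illustration. In a real model the bias would be computed from a learned pair representation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def attention_with_pair_bias(queries, keys, values, pair_bias):
    """Single-head attention over n tokens (illustrative, not AlphaFold 3's code).

    pair_bias[i][j] is added to the logit for token i attending to token j.
    Here it is a hand-written stand-in for a learned pair representation.
    """
    n = len(queries)
    dim = len(queries[0])
    outputs = []
    for i in range(n):
        logits = [
            sum(q * k for q, k in zip(queries[i], keys[j])) / math.sqrt(dim)
            + pair_bias[i][j]
            for j in range(n)
        ]
        weights = softmax(logits)
        outputs.append([
            sum(weights[j] * values[j][c] for j in range(n))
            for c in range(len(values[0]))
        ])
    return outputs

# Three tokens with identical queries and keys, so only the bias can
# change who attends to whom.
x = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Token 0 is strongly "paired" with token 2; every other pair is neutral.
bias = [[0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
out = attention_with_pair_bias(x, x, values, bias)
```

With this setup, token 0 puts nearly all of its attention on token 2, while token 1 (which has no bias) spreads its attention evenly. The pairwise matrix decides which tokens interact, independently of the per-token features.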
AlphaFold 3’s next step is to determine the actual 3D coordinates of each atom in the joint biomolecular structure. Unlike AlphaFold 2, which used a complicated structure-prediction module that needed carefully tuned parameters to ensure that its predictions were plausible, AlphaFold 3 uses a diffusion model to predict the 3D coordinates directly. This works much like a text-conditioned, image-generating diffusion model, except that AlphaFold 3 uses the pairwise representations to condition the denoising of the atoms’ coordinates.
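The loop below is a rough sketch of what "conditioned denoising of coordinates" can look like: start from random atom positions, then repeatedly denoise while injecting less and less noise. Everything here is an assumption for illustration. In particular, `toy_denoiser` is a hand-written rule that stands in for the learned network, the conditioning signal is a plain matrix of target distances rather than a learned pair representation, and the noise schedule is arbitrary. It shows the shape of the sampling loop, not AlphaFold 3's diffusion module.

```python
import math
import random

def pairwise_distances(coords):
    """Return the full matrix of Euclidean distances between atoms."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def distance_error(coords, target):
    """Sum of absolute differences between current and target distances."""
    current = pairwise_distances(coords)
    n = len(coords)
    return sum(abs(current[i][j] - target[i][j])
               for i in range(n) for j in range(i + 1, n))

def toy_denoiser(coords, target, rate=0.1):
    """Hypothetical stand-in for a learned network: nudge each atom so that
    its distances to the other atoms move toward the conditioning target."""
    updated = []
    for i, atom in enumerate(coords):
        shift = [0.0, 0.0, 0.0]
        for j, other in enumerate(coords):
            dist = math.dist(atom, other)
            if i == j or dist == 0.0:
                continue
            # Positive when the pair is too far apart, negative when too close.
            pull = rate * (dist - target[i][j]) / dist
            for axis in range(3):
                shift[axis] += pull * (other[axis] - atom[axis])
        updated.append([atom[axis] + shift[axis] for axis in range(3)])
    return updated

def sample(target, n_steps=200, seed=0):
    """Start from pure noise, then alternate denoising with shrinking noise."""
    rng = random.Random(seed)
    coords = [[rng.gauss(0.0, 5.0) for _ in range(3)] for _ in range(len(target))]
    initial = coords
    for step in range(n_steps):
        coords = toy_denoiser(coords, target)
        sigma = 0.5 * (1.0 - (step + 1) / n_steps)  # reaches 0 on the last step
        coords = [[x + rng.gauss(0.0, sigma) for x in atom] for atom in coords]
    return initial, coords

# Conditioning signal: target distances taken from a known 4-atom arrangement.
reference = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0], [0.0, 0.0, 1.5]]
target = pairwise_distances(reference)
initial, final = sample(target)
start_error = distance_error(initial, target)
end_error = distance_error(final, target)
```

Comparing `start_error` with `end_error` shows how far the sample has moved from noise toward the conditioned geometry. Note that the result is only pinned down up to a rigid motion (or mirror image) of the reference, since the conditioning here contains distances only.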
While a diffusion model offers many benefits, like not needing to enforce global rotational and translational invariance during generation, it also has drawbacks. For example, the researchers found that the model would hallucinate plausible chemical structures where they shouldn’t exist. To counteract this, they enriched AlphaFold 3’s training data with structures predicted by AlphaFold-Multimer v2.3, an extension of AlphaFold 2 for predicting the structure of protein complexes. This effectively taught AlphaFold 3 to mimic the non-hallucination behavior of its predecessor.
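For readers wondering what rotational and translational invariance means in practice, the sketch below applies a random rigid transform to a handful of atoms and checks that the molecule's internal geometry is untouched. One common way to get by without building invariance into a network is to show it randomly rotated and shifted copies of each training example. That is stated here as a general technique and an assumption, not as a detail taken from this summary. The helper names and the four-atom example are made up for illustration.

```python
import math
import random

def random_rotation(rng):
    """Random 3D rotation matrix built from a random unit quaternion."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(v * v for v in q))
    w, x, y, z = (v / norm for v in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]

def random_rigid_transform(coords, rng, max_shift=10.0):
    """Rotate all atoms by one random rotation, then shift them together."""
    rot = random_rotation(rng)
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return [
        [sum(rot[r][c] * atom[c] for c in range(3)) + shift[r] for r in range(3)]
        for atom in coords
    ]

def pairwise_distances(coords):
    """Return the full matrix of Euclidean distances between atoms."""
    return [[math.dist(a, b) for b in coords] for a in coords]

rng = random.Random(0)
# Hypothetical 4-atom "molecule" used only for this demonstration.
atoms = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.2, 0.0], [0.3, 0.8, 2.0]]
moved = random_rigid_transform(atoms, rng)

before = pairwise_distances(atoms)
after = pairwise_distances(moved)
# Largest change in any interatomic distance caused by the transform.
max_change = max(abs(before[i][j] - after[i][j])
                 for i in range(len(atoms)) for j in range(len(atoms)))
```

The raw coordinates in `moved` differ from those in `atoms`, but `max_change` is zero up to floating-point rounding: the two coordinate sets describe the same structure in different poses.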
This new AlphaFold model is a tremendous leap forward in predictive accuracy, and it’s also a one-stop shop for many biomolecular modeling tasks. For example, AlphaFold 3 achieves much higher accuracy on protein–nucleic acid interactions than nucleic acid–specific predictors. It’s a similar story for protein-ligand interactions and antibody-antigen prediction. The work also demonstrates that deep learning is highly effective at modeling a variety of biomolecular interactions, and it will help us better understand how the most complex processes in our bodies work, like drug interactions, hormone production, and the health-preserving process of DNA repair.