Antibody-antigen Affinity Prediction Task Overview

Definition: Antibodies recognize pathogen antigens and destroy them. Their activity is measured by binding affinity. The task is to predict this affinity from the amino acid sequences of both the antibody and the antigen.

Impact: Compared with small-molecule drugs, antibodies have several desirable properties, such as minimal adverse effects, and they can bind many "undruggable" targets because they act through different biochemical mechanisms. In addition, a reliable affinity predictor can accelerate antibody development by reducing the number of wet-lab experiments required.

Generalization: The models are expected to extrapolate to unseen classes of antigen and antibody pairs.

Product: Antibody, immunotherapy.

Pipeline: Activity.


Dataset Description: Antibody-antigen affinity measures the efficacy of the antibody against the antigen. The dataset is processed from SAbDab [1]; only protein and peptide antigens are used, for sequence compatibility.

Task Description: Regression. Given the amino acid sequences of an antibody and an antigen, predict their binding affinity.
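As a rough illustration of how a sequence pair could be turned into regression inputs, the sketch below computes amino acid composition for each sequence and concatenates the two vectors. This is only one simple baseline featurization, not a method prescribed by the dataset; the function names `composition` and `pair_features` and the toy sequences are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in seq (20 floats)."""
    counts = Counter(seq.upper())
    # Non-standard characters (e.g. X or separators) are ignored.
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_features(antibody_seq, antigen_seq):
    """Concatenate antibody and antigen compositions into one 40-dim vector."""
    return composition(antibody_seq) + composition(antigen_seq)


# Toy sequences for illustration only; they are not taken from the dataset.
features = pair_features("EVQLVESGG", "MKTAYIAK")
```

Composition features discard residue order, so they are best treated as a sanity-check baseline before trying sequence models.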

Dataset Statistics: 493 pairs, 431 antibodies and 401 antigens.

Note: In the antibody column, each antibody is ordered from left to right as heavy chain, then light chain. We also strongly suggest taking the log of the affinity when making predictions.
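The log transform suggested in the note can be sketched as follows. This assumes the raw affinity values are strictly positive and span several orders of magnitude; base 10 is an arbitrary choice here, and the helper names and toy values are hypothetical.

```python
import math


def to_log_affinity(affinity):
    """Map a strictly positive affinity value to log10 scale."""
    if affinity <= 0:
        raise ValueError("affinity must be positive to take a logarithm")
    return math.log10(affinity)


def from_log_affinity(log_affinity):
    """Invert the transform to recover the affinity on its original scale."""
    return 10.0 ** log_affinity


# Toy values for illustration only; they are not taken from the dataset.
raw = [1e-9, 5e-8, 2e-6]
log_targets = [to_log_affinity(y) for y in raw]
```

Training on `log_targets` keeps a few very large or very small values from dominating a squared-error loss; predictions can be mapped back with `from_log_affinity` if the original scale is needed.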

Dataset Split: Random Split

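from tdc.multi_pred import AntibodyAff

# Load the SAbDab-derived antibody-antigen affinity dataset.
data = AntibodyAff(name = 'Protein_SAbDab')
# Returns a dict with 'train', 'valid' and 'test' splits.
split = data.get_split()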
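To make the idea of a random split concrete, here is a minimal standalone sketch that shuffles item indices and cuts them into three parts. It is not the library's implementation; the function name `random_split` is hypothetical, and the 0.7/0.1/0.2 fractions and the seed of 42 are assumptions chosen for illustration.

```python
import random


def random_split(items, frac=(0.7, 0.1, 0.2), seed=42):
    """Shuffle items and cut them into train/valid/test lists by fraction."""
    if abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)
    n_train = int(frac[0] * len(shuffled))
    n_valid = int(frac[1] * len(shuffled))
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],  # remainder goes to test
    }


# 493 matches the number of pairs listed under Dataset Statistics.
parts = random_split(range(493))
```

With these assumed fractions, 493 pairs are divided into 345 training, 49 validation and 99 test items. Because the assignment is random, antibodies or antigens in the test part may also appear in the training part.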


[1] Dunbar, James, et al. “SAbDab: the structural antibody database.” Nucleic acids research 42.D1 (2014): D1140-D1146.

Dataset License: CC BY 3.0.