Overview of TDC Datasets
At its core, TDC collects machine learning tasks and their associated datasets spread across various therapeutic domains. These tasks and datasets have the following traits:
- Instrumenting disease treatment from bench to bedside with AI/ML: TDC covers a variety of learning tasks going from wet-lab target identification all the way to biomedical product manufacturing.
- Learning tasks and datasets built on the latest biotechnological platforms: TDC is regularly updated to add novel datasets and learning tasks, such as those on antibody therapeutics and gene editing.
- Machine-learning-ready datasets: TDC provides rich representations of the biomedical entities in each dataset. The feature information is carefully curated and processed.

Learning Tasks in TDC
TDC learning tasks cover a range of therapeutic products and pipelines. They span both small molecules and biologics; the latter include antibodies, peptides, microRNAs, and gene-editing therapies. Further, the tasks map to the following drug discovery and development pipelines:
- Target discovery pipelines aim to identify candidate druggable targets.
- Activity modeling pipelines aim to screen and generate, de novo, individual or combinatorial candidate hits with high binding activity towards the targets.
- Efficacy and safety pipelines aim to optimize pharmaceutical profiles of the hits so that products can be delivered to the site of action safely and efficaciously.
- Manufacturing pipelines aim to synthesize therapeutics.
Below is a summary table of TDC learning tasks from the perspective of therapeutic products and development pipelines. To explore the datasets, click the tag of the task of interest.
Learning Task | Product: Small-Molecule | Product: Macro-Molecule | Product: Cell & Gene Therapy | Pipeline: Target Discovery | Pipeline: Activity Modeling | Pipeline: Efficacy & Safety | Pipeline: Manufacturing
---|---|---|---|---|---|---|---
ADME | | | | | | |
Tox | | | | | | |
HTS | | | | | | |
QM | | | | | | |
Yields | | | | | | |
Paratope | | | | | | |
Epitope | | | | | | |
Develop | | | | | | |
CRISPROutcome | | | | | | |
DTI | | | | | | |
DDI | | | | | | |
PPI | | | | | | |
GDA | | | | | | |
DrugRes | | | | | | |
DrugSyn | | | | | | |
PeptideMHC | | | | | | |
AntibodyAff | | | | | | |
MTI | | | | | | |
Catalyst | | | | | | |
MolGen | | | | | | |
RetroSyn | | | | | | |
Reaction | | | | | | |
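
As a rough illustration of how the task names in the table can be reached programmatically, the sketch below maps each task to a submodule of the PyTDC Python package. The grouping into `tdc.single_pred`, `tdc.multi_pred`, and `tdc.generation` is an assumption based on the package layout rather than something stated in this overview, and the helper names `TASK_GROUPS`, `module_for`, and `load_task_class` are hypothetical.

```python
import importlib

# Assumed grouping of the task names from the table into PyTDC submodules.
# This mapping is illustrative; check it against the installed package.
TASK_GROUPS = {
    "single_pred": ["ADME", "Tox", "HTS", "QM", "Yields", "Paratope",
                    "Epitope", "Develop", "CRISPROutcome"],
    "multi_pred": ["DTI", "DDI", "PPI", "GDA", "DrugRes", "DrugSyn",
                   "PeptideMHC", "AntibodyAff", "MTI", "Catalyst"],
    "generation": ["MolGen", "RetroSyn", "Reaction"],
}


def module_for(task: str) -> str:
    """Return the dotted module path assumed to hold the given task."""
    for group, tasks in TASK_GROUPS.items():
        if task in tasks:
            return f"tdc.{group}"
    raise KeyError(f"unknown TDC task: {task}")


def load_task_class(task: str):
    """Import and return the loader class for a task (needs PyTDC installed)."""
    module = importlib.import_module(module_for(task))
    return getattr(module, task)


if __name__ == "__main__":
    # Only resolves module paths; nothing is imported or downloaded here.
    for name in ("ADME", "DTI", "RetroSyn"):
        print(name, "->", module_for(name))
```

With PyTDC installed, something like `load_task_class("ADME")(name="Caco2_Wang").get_split()` should download one ADME dataset and return its train, validation, and test splits; `Caco2_Wang` is used here only as an example dataset name.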
We also maintain a page with pointers to external resources that are relevant to TDC.
Start Exploring TDC Datasets