Overview of TDC Datasets
At its core, TDC collects ML tasks and associated datasets across therapeutic modalities and stages of discovery. These tasks and datasets have the following properties:
- Instrumenting disease treatment from bench to bedside with AI/ML: TDC covers a variety of learning tasks, ranging from wet-lab target identification to biomedical product manufacturing.
- Building off the latest biotechnological platforms: TDC is regularly updated with novel datasets and tasks, such as antibody therapeutics and gene editing.
- Providing AI/ML-ready datasets: TDC datasets provide rich information on biomedical entities. This information is carefully curated, processed, and readily available in TDC.

Machine Learning Tasks in TDC
ML tasks cover a range of therapeutic modalities, including small molecules and biologics such as antibodies, peptides, miRNAs, and gene editing therapies. They also map to the stages of the drug discovery and development pipeline:
- Target discovery: Tasks to identify candidate drug targets.
- Activity modeling: Tasks to screen and generate individual or combinatorial candidates with high binding activity towards targets.
- Efficacy and safety: Tasks to optimize therapeutic signatures indicative of drug safety and efficacy.
- Manufacturing: Tasks in support of synthesis and manufacturing of therapeutics.

| ML Tasks | Therapeutic Modalities | | | Stages of Discovery and Development | | | |
|---|---|---|---|---|---|---|---|
| | Small Molecules | Macromolecules | Cell & Gene Therapy | Target Discovery | Activity Modeling | Efficacy & Safety | Manufacturing |
| ADME | | | | | | | |
| Tox | | | | | | | |
| HTS | | | | | | | |
| QM | | | | | | | |
| Yields | | | | | | | |
| Epitope | | | | | | | |
| Develop | | | | | | | |
| CRISPROutcome | | | | | | | |
| DTI | | | | | | | |
| DDI | | | | | | | |
| PPI | | | | | | | |
| GDA | | | | | | | |
| DrugRes | | | | | | | |
| DrugSyn | | | | | | | |
| PeptideMHC | | | | | | | |
| AntibodyAff | | | | | | | |
| MTI | | | | | | | |
| Catalyst | | | | | | | |
| MolGen | | | | | | | |
| RetroSyn | | | | | | | |
| Reaction | | | | | | | |

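
The table above tags each ML task with one or more therapeutic modalities and pipeline stages. As a minimal sketch of how such a tagging could be represented and queried in code, the Python below defines the three modalities and four stages named on this page and a small lookup helper. The names `Stage`, `Modality`, `TaskTag`, and `tasks_for_stage` are hypothetical and are not part of TDC, and the two example entries are placeholders rather than actual rows of the table.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    """Therapeutic modality columns, as named in the table."""
    SMALL_MOLECULES = "Small Molecules"
    MACROMOLECULES = "Macromolecules"
    CELL_AND_GENE_THERAPY = "Cell & Gene Therapy"


class Stage(Enum):
    """Stages of discovery and development, as named in the table."""
    TARGET_DISCOVERY = "Target Discovery"
    ACTIVITY_MODELING = "Activity Modeling"
    EFFICACY_AND_SAFETY = "Efficacy & Safety"
    MANUFACTURING = "Manufacturing"


@dataclass(frozen=True)
class TaskTag:
    """Hypothetical record: one ML task and the columns it is marked under."""
    name: str
    modalities: frozenset
    stages: frozenset


def tasks_for_stage(tags, stage):
    """Return the names of all tasks tagged with the given stage, sorted."""
    return sorted(tag.name for tag in tags if stage in tag.stages)


# Placeholder entries for illustration only; they are NOT taken from the table.
EXAMPLE_TAGS = [
    TaskTag(
        name="ExampleTaskA",
        modalities=frozenset({Modality.SMALL_MOLECULES}),
        stages=frozenset({Stage.EFFICACY_AND_SAFETY}),
    ),
    TaskTag(
        name="ExampleTaskB",
        modalities=frozenset({Modality.MACROMOLECULES}),
        stages=frozenset({Stage.ACTIVITY_MODELING, Stage.EFFICACY_AND_SAFETY}),
    ),
]

if __name__ == "__main__":
    for stage in Stage:
        print(stage.value, "->", tasks_for_stage(EXAMPLE_TAGS, stage))
```

Sets are used for modalities and stages because a single task may be marked under several columns at once.
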
TDC also maintains a list of external data resources in therapeutic science.
Explore TDC Datasets