Overview of TDC Datasets

At its core, TDC collects ML tasks and associated datasets spread across therapeutic domains. These tasks and datasets have the following properties:

  • Instrumenting disease treatment from bench to bedside with AI/ML: TDC covers learning tasks ranging from wet-lab target identification to biomedical product manufacturing.
  • Building off the latest biotechnological platforms: TDC is regularly updated with novel datasets and tasks, such as those about antibody therapeutics and gene editing.
  • Providing AI/ML-ready datasets: TDC datasets provide rich representations of biomedical entities. The feature information is carefully curated and processed.

Learning Tasks in TDC

TDC tasks cover a range of therapeutic products and pipelines. They span small molecules and biologics, where the latter group includes antibodies, peptides, microRNAs, and gene editing.

Further, TDC tasks map to the following drug discovery and development pipelines:

  • Target discovery: Tasks aiming to identify candidate drug targets.
  • Activity modeling: Tasks aiming to screen and generate, de novo, individual or combinatorial candidate hits with high binding activity towards targets.
  • Efficacy and safety: Tasks aiming to optimize pharmaceutical profiles of hits so that drugs can be delivered to the appropriate site safely and effectively.
  • Manufacturing: Tasks aiming to synthesize safe and efficacious therapeutics.

Below is a summary table of TDC learning tasks. To explore datasets, click the tag of the task of interest.

Table columns:

  • Learning Tasks
  • Therapeutic Products: Small-Molecule, Macro-Molecule, Cell & Gene Therapy
  • Development Pipelines: Target Discovery, Activity Modeling, Efficacy & Safety, Manufacturing

Learning tasks (table rows):
ADME
Tox
HTS
QM
Yields
Paratope
Epitope
Develop
CRISPROutcome
DTI
DDI
PPI
GDA
DrugRes
DrugSyn
PeptideMHC
AntibodyAff
MTI
Catalyst
MolGen
RetroSyn
Reaction
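
As a rough illustration of how the task names above relate to code, the sketch below groups them by the module of the PyTDC Python package that is assumed to expose each loader class (tdc.single_pred, tdc.multi_pred, tdc.generation). The grouping and the helper names module_for and load_task_class are assumptions made for this example and are not stated on this page; the dataset name "Caco2_Wang" in the trailing comment is likewise only an example.

```python
import importlib

# Assumed grouping of the learning tasks listed above by PyTDC module.
# This mapping is an illustration, not something stated on this page.
TASK_MODULES = {
    "single_pred": ["ADME", "Tox", "HTS", "QM", "Yields", "Paratope",
                    "Epitope", "Develop", "CRISPROutcome"],
    "multi_pred": ["DTI", "DDI", "PPI", "GDA", "DrugRes", "DrugSyn",
                   "PeptideMHC", "AntibodyAff", "MTI", "Catalyst"],
    "generation": ["MolGen", "RetroSyn", "Reaction"],
}


def module_for(task: str) -> str:
    """Return the dotted module name assumed to hold the loader for `task`."""
    for module, tasks in TASK_MODULES.items():
        if task in tasks:
            return f"tdc.{module}"
    raise KeyError(f"unknown TDC task: {task!r}")


def load_task_class(task: str):
    """Import and return the loader class for `task`.

    Hypothetical helper: it needs the PyTDC package to be installed and
    is not called anywhere in this sketch.
    """
    return getattr(importlib.import_module(module_for(task)), task)


if __name__ == "__main__":
    # Pure lookups: no PyTDC installation or network access is needed here.
    for task in ("ADME", "DTI", "RetroSyn"):
        print(task, "->", module_for(task))

    # With PyTDC installed and network access, loading might look like:
    #   data = load_task_class("ADME")(name="Caco2_Wang")
    #   split = data.get_split()
```

Keeping the lookup separate from the import means the mapping can be inspected without installing anything. The actual download only happens when a loader class is instantiated with a dataset name.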

TDC also maintains a page with pointers to external resources that are relevant to therapeutics.

