Overview of TDC Datasets

At its core, TDC collects ML tasks and associated datasets spread across therapeutic domains. These tasks and datasets have the following properties:

  • Instrumenting disease treatment from bench to bedside with AI/ML: TDC covers a variety of learning tasks, ranging from wet-lab target identification to biomedical product manufacturing.
  • Building off the latest biotechnological platforms: TDC is regularly updated with novel datasets and tasks, such as antibody therapeutics and gene editing.
  • Providing AI/ML-ready datasets: TDC datasets provide rich representations of biomedical entities. The feature information is carefully curated and processed.

Learning Tasks in TDC

TDC tasks cover a range of therapeutic products and pipelines. They span small molecules and biologics, the latter including antibodies, peptides, microRNAs, and gene editing.

Further, TDC tasks map to the following drug discovery and development pipelines:

  • Target discovery: Tasks to identify candidate drug targets.
  • Activity modeling: Tasks to screen and generate individual or combinatorial candidates with high binding activity towards targets.
  • Efficacy and safety: Tasks to optimize therapeutic signatures indicative of drug safety and efficacy.
  • Manufacturing: Tasks to synthesize therapeutics.

Below is a summary table of TDC learning tasks. To explore datasets, click the tag of the task of interest.

The table has one row per learning task and two groups of columns:

  • Therapeutic Products: Small-Molecule, Macro-Molecule, Cell & Gene Therapy
  • Development Pipelines: Target Discovery, Activity Modeling, Efficacy & Safety, Manufacturing

Learning tasks: ADME, Tox, HTS, QM, Yields, Paratope, Epitope, Develop, CRISPROutcome, DTI, DDI, PPI, GDA, DrugRes, DrugSyn, PeptideMHC, AntibodyAff, MTI, Catalyst, MolGen, RetroSyn, Reaction
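
For readers who prefer to reach these tasks from code rather than from the web page, the following is a minimal sketch of how a task tag might be resolved to a data loader. It assumes the PyTDC Python package (`tdc`) is installed and that each tag is exposed as a loader class of the same name under `tdc.single_pred`, `tdc.multi_pred`, or `tdc.generation`. That grouping, the helper names `module_for` and `load_task`, and the dataset name in the usage comment are illustrative assumptions that do not come from this page, so check them against the installed package.

```python
import importlib

# Assumption: PyTDC groups its loader classes into these three modules and
# names each class after the task tag. Verify against your installed version.
TASK_MODULES = {
    "tdc.single_pred": [
        "ADME", "Tox", "HTS", "QM", "Yields",
        "Paratope", "Epitope", "Develop", "CRISPROutcome",
    ],
    "tdc.multi_pred": [
        "DTI", "DDI", "PPI", "GDA", "DrugRes",
        "DrugSyn", "PeptideMHC", "AntibodyAff", "MTI", "Catalyst",
    ],
    "tdc.generation": ["MolGen", "RetroSyn", "Reaction"],
}


def module_for(task: str) -> str:
    """Return the (assumed) module path that holds the loader for a task tag."""
    for module, tasks in TASK_MODULES.items():
        if task in tasks:
            return module
    raise KeyError(f"unknown task tag: {task!r}")


def load_task(task: str, dataset: str):
    """Hypothetical helper: import the loader class lazily and instantiate it.

    Requires the `tdc` package; the loader may download data on first use.
    """
    module = importlib.import_module(module_for(task))
    loader_cls = getattr(module, task)
    return loader_cls(name=dataset)


# Usage (not executed here; needs `tdc` and network access):
#   data = load_task("ADME", "Caco2_Wang")   # dataset name is an example
#   split = data.get_split()                 # dict of train/valid/test frames
```

Importing lazily keeps the lookup table usable even where `tdc` is not installed; only `load_task` touches the package.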

TDC also maintains a page with pointers to external resources that are relevant to therapeutics.

