Overview of TDC Datasets

At its core, TDC collects machine learning tasks and their associated datasets spread across various therapeutic domains. These tasks and datasets have the following traits:

  • Instrumenting disease treatment from bench to bedside with AI/ML: TDC covers a variety of learning tasks, ranging from wet-lab target identification to biomedical product manufacturing.
  • Learning tasks and datasets built on the latest biotechnological platforms: TDC is regularly updated with novel datasets and learning tasks, such as those on antibody therapeutics and gene editing.
  • Machine-learning ready datasets: TDC provides rich representations of biomedical entities in each dataset. The feature information is carefully curated and processed.
TDC is an open-source effort. We welcome contributions from the research community. If you want to get involved, join our Slack Workspace!

Learning Tasks in TDC

TDC learning tasks cover a range of therapeutic products and pipelines. They span both small molecules and biologics, the latter including antibodies, peptides, microRNAs, and gene editing. Further, the tasks map to the following drug discovery and development pipelines:

  • Target discovery pipelines aim to identify candidate druggable targets.
  • Activity modeling pipelines aim to screen and generate, de novo, individual or combinatorial candidate hits with high binding activity towards the targets.
  • Efficacy and safety pipelines aim to optimize pharmaceutical profiles of the hits so that products can be delivered to the site of action safely and efficaciously.
  • Manufacturing pipelines aim to synthesize therapeutics.

Below is a summary table of TDC learning tasks from the perspective of therapeutic products and pipelines. To explore the datasets, click the tag of the task of interest.

Learning Tasks | Therapeutic Products: Small-Molecule, Macro-Molecule, Cell & Gene Therapy | Development Pipelines: Target Discovery, Activity Modeling, Efficacy & Safety, Manufacturing
ADME
Tox
HTS
QM
Yields
Paratope
Epitope
Develop
CRISPROutcome
DTI
DDI
PPI
GDA
DrugRes
DrugSyn
PeptideMHC
AntibodyAff
MTI
Catalyst
MolGen
RetroSyn
Reaction
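
The task tags listed above can also be handled programmatically. The following is a minimal sketch, not an official TDC utility. It assumes that the tags group into three problem types matching the module names of the PyTDC package (single_pred, multi_pred, generation); that grouping is not stated on this page and should be checked against the PyTDC documentation. The names TASKS_BY_PROBLEM, problem_of, and load_task are hypothetical and introduced only for illustration.

```python
import importlib

# Assumed grouping of the task tags by problem type. The keys are taken
# to match PyTDC module names (tdc.single_pred, tdc.multi_pred,
# tdc.generation); verify against the PyTDC documentation.
TASKS_BY_PROBLEM = {
    "single_pred": [
        "ADME", "Tox", "HTS", "QM", "Yields",
        "Paratope", "Epitope", "Develop", "CRISPROutcome",
    ],
    "multi_pred": [
        "DTI", "DDI", "PPI", "GDA", "DrugRes", "DrugSyn",
        "PeptideMHC", "AntibodyAff", "MTI", "Catalyst",
    ],
    "generation": ["MolGen", "RetroSyn", "Reaction"],
}


def problem_of(task):
    """Return the assumed problem type for a task tag."""
    for problem, tasks in TASKS_BY_PROBLEM.items():
        if task in tasks:
            return problem
    raise KeyError(f"Unknown task tag: {task!r}")


def load_task(task, name):
    """Hypothetical helper: build a loader for one dataset of a task.

    Requires the PyTDC package to be installed and, on first use,
    network access to download the dataset. The import happens lazily
    so the lookup above works without PyTDC.
    """
    module = importlib.import_module(f"tdc.{problem_of(task)}")
    loader_cls = getattr(module, task)
    return loader_cls(name=name)


if __name__ == "__main__":
    # Pure lookups; no download is triggered here.
    for tag in ("ADME", "DTI", "RetroSyn"):
        print(tag, "->", problem_of(tag))
```

With PyTDC installed, a call such as load_task("ADME", "Caco2_Wang").get_split() would be expected to download that dataset and return its train, validation, and test splits. The dataset name here is an example and should be replaced with one listed on the task's page.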

We also maintain a page with pointers to external resources that are relevant to TDC. Click here to visit.


Start Exploring TDC Datasets