Therapeutics Data Commons

Artificial intelligence foundation for therapeutic science

Artificial intelligence is poised to enable breakthroughs and discoveries in therapeutic science. Therapeutics Data Commons is a coordinated initiative to access and evaluate artificial intelligence capability across therapeutic modalities and stages of discovery. The Commons is a resource with AI-solvable tasks, AI-ready datasets, and curated benchmarks, providing an ecosystem of tools, libraries, leaderboards, and community resources, including data functions, strategies for systematic model evaluation, meaningful data splits, data processors, and molecule generation oracles. All resources are integrated via an open Python library.
Therapeutic science is a field with substantial opportunities for expansion, innovation, and impact. Curated AI-ready datasets, machine learning tasks, and benchmarks in the Commons serve as a meeting point between biochemical, biomedical, and machine learning scientists. Therapeutics Data Commons is a resource for accessing and evaluating AI methods and supporting their development, with an emphasis on establishing which AI methods are most suitable for drug discovery applications and why. It can facilitate algorithmic and scientific advances and accelerate the development, validation, and transition of AI methods into biomedical and clinical implementation.
TDC at a glance
Key presentations and publications of the Commons

Intuitive Interface

TDC software is minimally dependent on external packages. Any TDC dataset can be retrieved with just 3 lines of code.
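
The three-line retrieval pattern can be sketched as follows. This is a minimal illustration, assuming the PyTDC package is installed (for example via `pip install PyTDC`) and that the `ADME` loader in `tdc.single_pred` and the `Caco2_Wang` dataset name are available in the installed version. The wrapper name `load_adme_split` is hypothetical and exists only to keep the download out of import time.

```python
def load_adme_split(name="Caco2_Wang"):
    """Retrieve one TDC dataset and return its default split.

    Assumes PyTDC is installed and network access is available,
    since the dataset is downloaded on first use.
    """
    # The three lines referred to in the text: import, load, split.
    from tdc.single_pred import ADME
    data = ADME(name=name)
    return data.get_split()
```

Calling `split = load_adme_split()` should download the dataset on first use. In the versions we are aware of, the result is a dictionary of train, validation, and test tables, though the exact keys and columns may differ by release.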

From Bench to Bedside

TDC covers a wide range of learning tasks, including target discovery, activity screening, efficacy, safety, and manufacturing across biomedical products, including small molecules, antibodies, and vaccines.

Numerous Data Functions

TDC provides extensive data functions, including data evaluators, meaningful data splits, data processors, and molecule generation oracles.
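
To make "meaningful data splits" concrete, here is a small pure-Python sketch of one such strategy: a group-held-out split, in which every entity (here, a drug) appears in exactly one fold, so the test set contains only unseen drugs. It is an illustration under stated assumptions, not TDC's own implementation; the function name `group_split`, the record fields, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def group_split(records, group_key, frac=(0.7, 0.1, 0.2), seed=42):
    """Split records so that each group lands in exactly one fold."""
    # Bucket records by the grouping field (e.g. drug identifier).
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    # Shuffle group keys deterministically, then cut by fraction of groups.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_train = int(frac[0] * len(keys))
    n_valid = int(frac[1] * len(keys))
    folds = {
        "train": keys[:n_train],
        "valid": keys[n_train:n_train + n_valid],
        "test": keys[n_train + n_valid:],
    }
    # Expand each fold's group keys back into the underlying records.
    return {name: [r for k in ks for r in groups[k]] for name, ks in folds.items()}


# Hypothetical toy data: 50 drug-target pairs over 10 distinct drugs.
records = [{"drug": f"D{i % 10}", "target": f"T{i}", "y": i % 2} for i in range(50)]
split = group_split(records, group_key="drug")
```

Because folds are cut by group rather than by row, no drug in `split["test"]` also appears in `split["train"]`. A plain random split over rows does not give that guarantee, which is what makes this kind of evaluation closer to predicting for genuinely new compounds.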