Therapeutics Data Commons
Machine Learning Datasets and Tasks for Therapeutics
Therapeutics Data Commons (TDC) is the first unifying framework to systematically access and evaluate machine learning across the entire range of therapeutics.
Therapeutics machine learning is an exciting field with incredible opportunities for expansion, innovation, and impact. The collection of curated datasets, learning tasks, and benchmarks in TDC serves as a meeting point for domain and machine learning scientists.
We envision that TDC can considerably accelerate machine-learning model development, validation, and transition into biomedical and clinical implementation.
3 Lines of Code
TDC is minimally dependent on external packages. Any TDC dataset can be retrieved using only 3 lines of code.
From Bench to Bedside
TDC covers a wide range of learning tasks, from target discovery and activity screening through efficacy and safety to manufacturing, across biomedical products such as small molecules, antibodies, and vaccines.
Numerous Data Functions
TDC provides extensive data functions, including data evaluators, meaningful data splits, data processors, and molecule generation oracles.
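To make two of these function types concrete, here is a minimal standard-library sketch of what a data split and a data evaluator do conceptually. It is not TDC's implementation: the names random_split and mean_absolute_error are hypothetical, and the 70/10/20 fractions and the seed are assumptions chosen only for illustration.

```python
import random


def random_split(records, frac=(0.7, 0.1, 0.2), seed=42):
    """Shuffle records with a fixed seed and cut them into train/valid/test.

    Hypothetical helper for illustration; not part of the TDC API.
    """
    if abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(records)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * frac[0])
    n_valid = round(len(shuffled) * frac[1])
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        # The test set takes whatever remains, so no record is dropped.
        "test": shuffled[n_train + n_valid:],
    }


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    split = random_split(range(100))
    print({name: len(part) for name, part in split.items()})
    print(mean_absolute_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0]))
```

A purely random split like this one is the simplest case. Splits that group related molecules together follow the same train/valid/test shape but choose the partition differently.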
Loading a dataset in TDC:
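from tdc.single_pred import ADME
data = ADME(name='Caco2_Wang')  # fetches the dataset on first use
df = data.get_data()            # the full dataset as a pandas DataFrame
split = data.get_split()        # train, validation, and test partitions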