PerturbOutcome Prediction Benchmark Group
We define a task for predicting the gene expression responses of single cells to chemical and genetic
perturbations, with the aim of measuring how well models generalize across cell lines and perturbation types. Understanding
cellular responses to genetic perturbation is central to numerous biomedical applications,
from identifying genetic interactions involved in cancer to developing methods for regenerative
medicine.
Furthermore, counterfactual prediction of drug-based perturbations at single-cell
resolution supports the development of cell-type-specific drugs and treatments, facilitating precision medicine. The
predictive, non-generative task is formalized as a function that takes a cell, with corresponding attributes
such as cell line, disease, and tissue, and a perturbation, such as a drug type or a CRISPR-based
perturbation, and outputs a gene expression count for the cell after the input perturbation.
In TDC-2, we use the scPerturb datasets to build benchmarks for this task. More details to be announced.
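As a rough illustration of the task formalization described above, the sketch below writes the predictor as a typed function of a cell and a perturbation. The names `Cell`, `Perturbation`, `Predictor`, and `make_mean_baseline` are hypothetical and are not part of the TDC API. The mean baseline is only a placeholder to show the expected input and output shapes, not a recommended model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Cell:
    # Hypothetical container for the cell attributes named in the text.
    cell_line: str
    disease: str
    tissue: str


@dataclass(frozen=True)
class Perturbation:
    # kind is e.g. "drug" or "crispr"; target is a drug name or gene symbol.
    kind: str
    target: str


# A predictor maps (cell, perturbation) to one expression value per gene.
Predictor = Callable[[Cell, Perturbation], List[float]]


def make_mean_baseline(train_expression: Sequence[Sequence[float]]) -> Predictor:
    """Build a predictor that ignores its inputs and returns the per-gene
    mean of the training expression matrix (rows = cells, columns = genes)."""
    if not train_expression:
        raise ValueError("train_expression must contain at least one cell")
    n_cells = len(train_expression)
    n_genes = len(train_expression[0])
    mean_profile = [
        sum(row[g] for row in train_expression) / n_cells for g in range(n_genes)
    ]

    def predict(cell: Cell, perturbation: Perturbation) -> List[float]:
        # Return a copy so callers cannot mutate the shared baseline profile.
        return list(mean_profile)

    return predict
```

A learned model would replace `predict` with a function that actually conditions on the cell attributes and the perturbation; the signature stays the same.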
To access a benchmark in the group, use the following code:
from tdc.benchmark_group import counterfactual_group
group = counterfactual_group.CounterfactualGroup() # GenePerturbGroup for genetic perturbations
train, val = group.get_train_valid_split()
test = group.get_test()
## --- train your model --- ##
predictions = model.predict(test) # modify as per your model code and test output
out = group.evaluate(predictions)
Follow the instructions on how to use the BenchmarkGroup
class to obtain training, validation, and test sets, and on how to submit your model to the leaderboard.
The evaluation metric is R-squared. More details to be announced.
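For reference, a minimal sketch of R-squared in its usual coefficient-of-determination form, 1 - SS_res / SS_tot, is shown below. This is an assumption about the definition: the source does not state how `group.evaluate` computes the metric or how it aggregates across genes and perturbations, so the helper `r_squared` is illustrative only.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")

    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0.0:
        raise ValueError("R-squared is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot
```

Under this definition a perfect prediction scores 1.0, predicting the mean of the observed values scores 0.0, and worse-than-mean predictions score below zero.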