Drug Combination Benchmark Group

Drug combination screening offers significant potential for expanding the use of existing drugs and improving their efficacy. For instance, modulating multiple targets simultaneously can address common mechanisms of drug resistance seen in cancer treatment. However, experimentally exploring the entire space of possible drug combinations is infeasible, so computational models that predict synergistic combinations are valuable.

We measure drug combination response using two complementary metrics: sensitivity and synergy. The drug combination sensitivity score (CSS) is derived from the relative IC50 values of the compounds and the area under their dose-response curves. Synergy measures how far an observed combination response deviates from the response expected if the drugs did not interact. We use synergy scores derived from four reference models: Bliss independence, Highest Single Agent (HSA), Loewe additivity and Zero Interaction Potency (ZIP).
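To make "deviation from the expected effect of non-interaction" concrete, here is a minimal sketch for a single dose pair under two of the four reference models. It assumes responses are expressed as fractional inhibition in [0, 1]. The function names and example values are hypothetical. The scores in DrugComb are computed over full dose-response matrices, so this shows only the idea, not the dataset's pipeline.

```python
def bliss_expected(effect_a: float, effect_b: float) -> float:
    """Expected combined inhibition if the two drugs act independently (Bliss)."""
    return effect_a + effect_b - effect_a * effect_b


def hsa_expected(effect_a: float, effect_b: float) -> float:
    """Expected combined inhibition under Highest Single Agent: the better drug alone."""
    return max(effect_a, effect_b)


def synergy_excess(observed: float, expected: float) -> float:
    """Positive values suggest synergy, negative values suggest antagonism."""
    return observed - expected


# Hypothetical single-agent and combination responses (fractional inhibition).
effect_a, effect_b, observed = 0.30, 0.40, 0.70

bliss_score = synergy_excess(observed, bliss_expected(effect_a, effect_b))
hsa_score = synergy_excess(observed, hsa_expected(effect_a, effect_b))
print(f"Bliss excess: {bliss_score:.2f}, HSA excess: {hsa_score:.2f}")
```

The same observed response yields different scores under different reference models, which is one reason the benchmark reports each model separately.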

In this benchmark group, we use the DrugComb dataset. To access it, run:

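from tdc import BenchmarkGroup

# Load the benchmark group, using data/ as the data directory
group = BenchmarkGroup(name='DrugCombo_Group', path='data/', file_format='pkl')

predictions = {}

for benchmark in group:
    name = benchmark['name']
    train_val, test = benchmark['train_val'], benchmark['test']

    ## --- train your model on train_val --- ##
    # y_pred_test is a placeholder for your model's predictions on `test`
    predictions[name] = y_pred_test

# Evaluate the predictions for every benchmark in the group
out = group.evaluate(predictions)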

Note that the output also includes per-tissue evaluations:

{'drugcomb_css': {'mae': 22.963}, 'drugcomb_css_kidney': {'mae': 21.811}, 'drugcomb_css_lung': {'mae': 21.367},
'drugcomb_css_breast': {'mae': 18.25}, 'drugcomb_css_hematopoietic_lymphoid': {'mae': 40.379}, 'drugcomb_css_colon': {'mae': 25.001},
'drugcomb_css_prostate': {'mae': 21.874}, 'drugcomb_css_ovary': {'mae': 19.468}, 'drugcomb_css_skin': {'mae': 18.683}, 
'drugcomb_css_brain': {'mae': 21.89}, 'drugcomb_hsa': {'mae': 4.562}, 'drugcomb_loewe': {'mae': 11.109}, 'drugcomb_bliss': {'mae': 4.686}, 
'drugcomb_zip': {'mae': 4.545}}
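When comparing runs, it can help to separate the overall scores from the tissue-level CSS scores. The sketch below does this by key prefix. The helper name `split_results` is hypothetical, and it assumes the key naming seen in the output above (`drugcomb_<label>` and `drugcomb_css_<tissue>`). The values are a subset copied from that output.

```python
def split_results(out, metric="mae"):
    """Separate overall scores from tissue-level CSS scores by key prefix."""
    tissue_prefix = "drugcomb_css_"
    overall, by_tissue = {}, {}
    for key, scores in out.items():
        if key.startswith(tissue_prefix):
            # e.g. 'drugcomb_css_kidney' -> 'kidney'
            by_tissue[key[len(tissue_prefix):]] = scores[metric]
        else:
            # e.g. 'drugcomb_hsa' -> 'hsa'
            overall[key.replace("drugcomb_", "", 1)] = scores[metric]
    return overall, by_tissue


# Subset of the evaluation output shown above.
out = {
    "drugcomb_css": {"mae": 22.963},
    "drugcomb_css_kidney": {"mae": 21.811},
    "drugcomb_css_breast": {"mae": 18.25},
    "drugcomb_hsa": {"mae": 4.562},
    "drugcomb_zip": {"mae": 4.545},
}

overall, by_tissue = split_results(out)
print(overall)
print(by_tissue)
```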

See the instructions on how to use the BenchmarkGroup class for other useful functions that facilitate model building.

For every dataset, we use a drug combination split and hold out 20% of the data as a test set. The evaluation metric is mean absolute error (MAE).
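For intuition, the sketch below shows one way a combination split and the MAE metric could be implemented. It assumes that "combination split" means no drug pair appears in both the training and test sets. The field names (`drug1`, `drug2`, `cell_line`, `y`) and the function names are hypothetical. Here the 20% fraction is applied to unique pairs rather than rows. The benchmark already ships its own fixed `train_val` and `test` sets, so this is not needed to participate.

```python
import itertools
import random


def combination_split(rows, test_frac=0.2, seed=0):
    """Hold out a fraction of unique drug pairs so no pair spans both sets."""
    def pair_key(row):
        # Order-insensitive key: (A, B) and (B, A) are the same combination.
        return tuple(sorted((row["drug1"], row["drug2"])))

    pairs = sorted({pair_key(r) for r in rows})
    random.Random(seed).shuffle(pairs)
    n_test = max(1, round(test_frac * len(pairs)))
    test_pairs = set(pairs[:n_test])
    train = [r for r in rows if pair_key(r) not in test_pairs]
    test = [r for r in rows if pair_key(r) in test_pairs]
    return train, test


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: 10 unique pairs from 5 drugs, each measured in 2 cell lines.
rows = [
    {"drug1": a, "drug2": b, "cell_line": c, "y": 0.0}
    for a, b in itertools.combinations(["D0", "D1", "D2", "D3", "D4"], 2)
    for c in ("cell_A", "cell_B")
]

train, test = combination_split(rows)
print(len(train), len(test))
print(mean_absolute_error([1.0, 2.0, 3.0], [2.0, 2.0, 5.0]))
```

Splitting by pair rather than by row means the test set measures generalization to unseen combinations, not to new cell lines for combinations already seen in training.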

We encourage submissions that report results for all five scores (CSS and the four synergy scores). Note that the per-tissue results are calculated automatically from the test set predictions for the CSS label.
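As a rough picture of how tissue-level scores can be derived from a single set of CSS test predictions, the sketch below groups absolute errors by tissue. The function name and the toy values are hypothetical, and it assumes each test row can be mapped to a tissue through its cell line. The benchmark's `evaluate` call performs this step for you.

```python
from collections import defaultdict


def mae_by_tissue(tissues, y_true, y_pred):
    """Group absolute errors by tissue and average within each group."""
    errors = defaultdict(list)
    for tissue, truth, pred in zip(tissues, y_true, y_pred):
        errors[tissue].append(abs(truth - pred))
    return {tissue: sum(errs) / len(errs) for tissue, errs in errors.items()}


# Illustrative values only, not benchmark data.
tissues = ["kidney", "kidney", "lung", "lung"]
y_true = [10.0, 20.0, 30.0, 50.0]
y_pred = [12.0, 16.0, 30.0, 40.0]

tissue_mae = mae_by_tissue(tissues, y_true, y_pred)
print(tissue_mae)
```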


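| Label | Dataset Size | Task | Metric | Split |
|-------|--------------|------|--------|-------|
| CSS | 297,098 | Regression | MAE | Combination |
| HSA | 297,098 | Regression | MAE | Combination |
| Loewe | 297,098 | Regression | MAE | Combination |
| Bliss | 297,098 | Regression | MAE | Combination |
| Zip | 297,098 | Regression | MAE | Combination |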

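| Tissue | Label | # of Cell Lines | Test Set Size | Metric |
|--------|-------|-----------------|---------------|--------|
| Kidney | CSS | 8 | 8,096 | MAE |
| Lung | CSS | 9 | 9,108 | MAE |
| Breast | CSS | 5 | 20,551 | MAE |
| Hema | CSS | 6 | 6,072 | MAE |
| Colon | CSS | 7 | 7,084 | MAE |
| Prostate | CSS | 2 | 2,024 | MAE |
| Ovary | CSS | 7 | 7,084 | MAE |
| Skin | CSS | 9 | 9,108 | MAE |
| Brain | CSS | 6 | 6,072 | MAE |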

Leaderboard on Combination Response Prediction

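| Rank | Model | Contact | Link | #Params | CSS | HSA | Loewe | Bliss | Zip |
|------|-------|---------|------|---------|-----|-----|-------|-------|-----|
| 1 | MLP | Yusuf Roohani | GitHub, Paper | 7,141,297 | 16.858 ± 0.005 | 4.453 ± 0.002 | 9.184 ± 0.001 | 4.560 ± 0.000 | 4.027 ± 0.003 |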


Leaderboard on Combination Sensitivity Scores Across Tissues

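| Rank | Model | Contact | Link | #Params | Kidney | Lung | Breast | Hema | Colon | Prost. | Ovary | Skin | Brain |
|------|-------|---------|------|---------|--------|------|--------|------|-------|--------|-------|------|-------|
| 1 | MLP | Yusuf Roohani | GitHub, Paper | 7,141,297 | 14.570 ± 0.003 | 15.653 ± 0.017 | 13.432 ± 0.049 | 28.764 ± 0.201 | 17.729 ± 0.042 | 15.692 ± 0.005 | 15.263 ± 0.041 | 15.663 ± 0.065 | 15.694 ± 0.006 |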
