TrialOutcome Prediction Benchmark Group

Clinical trial outcome prediction is a machine learning task that aims to forecast the outcome of a clinical trial, such as the approval rate of a drug or treatment. It uses several clinical trial features, including the drug's molecular structure, the disease code representing the medical condition, and the eligibility criteria that specify how participants are selected. The task is formulated as a binary classification problem: the model predicts whether a clinical trial will have a positive or negative outcome.
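To make the binary formulation concrete, here is a minimal sketch of what one example might look like. The TrialRecord class, its field names, the predict_outcome helper, and the 0.5 threshold are all hypothetical and are not part of the benchmark's API; the sample values are placeholders, not data from TOP.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TrialRecord:
    """Hypothetical container for one trial; field names are illustrative only."""
    drug_smiles: List[str]     # molecular structures of the drugs, as SMILES strings
    disease_codes: List[str]   # codes identifying the targeted medical conditions
    eligibility_criteria: str  # free-text inclusion and exclusion criteria
    label: int                 # 1 = positive outcome, 0 = negative outcome


def predict_outcome(probability: float, threshold: float = 0.5) -> int:
    """Turn a predicted success probability into a binary outcome (assumed threshold)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return int(probability >= threshold)


# Placeholder record, made up for illustration.
trial = TrialRecord(
    drug_smiles=["CC(=O)OC1=CC=CC=C1C(=O)O"],
    disease_codes=["I21"],
    eligibility_criteria="Inclusion: adults aged 18 or older. Exclusion: pregnancy.",
    label=1,
)
```

A model for this task would map the three feature fields to a probability, and the label is the target it is trained against.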

Our benchmark uses the Trial Outcome Prediction (TOP) dataset. TOP consists of 17,538 clinical trials with 13,880 small-molecule drugs and 5,335 diseases.

To access a benchmark in the group, use the following code:

from tdc.benchmark_group import trialoutcome_group

# Load the benchmark group from the imported module
group = trialoutcome_group.TrialOutcomeGroup()
# Get the training and validation splits, then the held-out test set
train, val = group.get_train_valid_split()
test = group.get_test()

## --- train your model --- ##

predictions = model.predict(test)  # modify as per your model code and test output
out = group.evaluate(predictions)

Follow the instructions on how to use the BenchmarkGroup class and obtain training, validation, and test sets, and how to submit your model to the leaderboard.

The evaluation metric is the area under the precision-recall curve (AUPRC).
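For intuition about the metric, the sketch below computes average precision, one common step-wise estimate of the area under the precision-recall curve. It is a standalone illustration: the function name average_precision and the toy labels and scores are made up here, and whether group.evaluate uses this exact estimator is an assumption rather than something the benchmark description states.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision: mean of the precision values at each positive example.

    Assumes binary labels (0 or 1) and no tied scores; ties would need
    grouped handling that this sketch omits.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank examples from highest to lowest predicted score.
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Toy example: two positive and two negative trials.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, scores)
```

Because the metric depends only on how the scores rank the trials, predictions should be continuous scores or probabilities rather than hard 0/1 labels.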