Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CONS(=O)(=O)c1cc(-c2c3c(=O)n(C)c(=O)n(CC4CC4)c3nn2Cc2ccnc3ccc(Cl)cc23)n(C)c1. The target protein (P56068) has sequence MKIGVFDSGVGGFSVLKSLLKAQLFDEIIYYGDSARVPYGTKDPTTIKQFGLEALDFFKPHQIKLLIVACNTASALALEEMQKHSKIPVVGVIEPSILAIKRQVKDKNAPILVLGTKATIQSNAYDNALKQQGYLNVSHLATSLFVPLIEESILEGELLETCMRYYFTPLEILPEVVILGCTHFPLIAQKIEGYFMEHFALSTPPLLIHSGDAIVEYLQQNYALKKNACAFPKVEFHASGDVVWLEKQAKEWLKL. The pIC50 is 7.3. (2) The compound is CN1CCN(c2ccc(C(=O)Nc3n[nH]c4cc(OCCOc5ccccc5)ccc34)cc2)CC1. The target protein sequence is TYKYLQKPMYEVQWKVVEEINGNNYVYIDPTQLPYDHKWEFPRNRLSFGKTLGAGAFGKVVEATAYGLIKSDAAMTVAVKMLKPSAHLTEREALMSELKVLSYLGNHMNIVNLLGACTIGGPTLVITEYCCYGDLLNFLRRKRDSFICSKQEDHAEAALYKNLLHSKESSCSDSTNEYMDMKPGVSYVVPTKADKRRSVRIGSYIERDVTPAIMEDDELALDLEDLLSFSYQVAKGMAFLASKNCIHRDLAARNILLTHGRITKICDFGLARDIKNDSNYVVKGNARLPVKWMAPESIFNCVYTFESDVWSYGIFLWELFSLGSSPYPGMPVDSKFYKMIKEGFRMLSPEHAPAEMYDIMKTCWDADPLKRPTFKQIVQLIEKQISESTNHIYSNLANCSPNRQKPVVDHSVRINSVGSTASSSQPLLVHDDV. The pIC50 is 6.0. (3) The small molecule is CCOC(=O)COc1cc(O)c2c(=O)c(O)c(-c3ccc4c(c3)OC(c3ccc(OCC(=O)OCC)c(OC)c3)C(CO)O4)oc2c1. The target protein (P22985) has sequence MTADELVFFVNGKKVVEKNADPETTLLVYLRRKLGLCGTKLGCGEGGCGACTVMISKYDRLQNKIVHFSVNACLAPICSLHHVAVTTVEGIGNTQKLHPVQERIARSHGSQCGFCTPGIVMSMYTLLRNQPEPTVEEIENAFQGNLCRCTGYRPILQGFRTFAKDGGCCGGSGNNPNCCMNQTKDQTVSLSPSLFNPEDFKPLDPTQEPIFPPELLRLKDTPQKKLRFEGERVTWIQASTMEELLDLKAQHPDAKLVVGNTEIGIEMKFKNMLFPLIVCPAWIPELNSVVHGPEGISFGASCPLSLVESVLAEEIAKLPEQKTEVFRGVMEQLRWFAGKQVKSVASIGGNIITASPISDLNPVFMASGAKLTLVSRGTRRTVRMDHTFFPGYRKTLLRPEEILLSIEIPYSKEGEFFSAFKQASRREDDIAKVTSGMRVLFKPGTIEVQELSLCFGGMADRTISALKTTPKQLSKSWNEELLQSVCAGLAEELQLAPDAP.... The pIC50 is 4.2. 
(4) The compound is CCCCC/C=C\C/C=C\C/C=C\CCCCCCC(=O)O. The target protein (P00761) has sequence FPTDDDDKIVGGYTCAANSIPYQVSLNSGSHFCGGSLINSQWVVSAAHCYKSRIQVRLGEHNIDVLEGNEQFINAAKIITHPNFNGNTLDNDIMLIKLSSPATLNSRVATVSLPRSCAAAGTECLISGWGNTKSSGSSYPSLLQCLKAPVLSDSSCKSSYPGQITGNMICVGFLEGGKDSCQGDSGGPVVCNGQLQGIVSWGYGCAQKNKPGVYTKVCNYVNWIQQTIAAN. The pIC50 is 3.7. (5) The drug is Cc1c(C(=O)c2cccc3ccccc23)c2ccccc2n1CCN1CCOC(O)C1. The target protein (O09114) has sequence MAALRMLWMGLVLLGLLGFPQTPAQGHDTVQPNFQQDKFLGRWYSAGLASNSSWFREKKAVLYMCKTVVAPSTEGGLNLTSTFLRKNQCETKIMVLQPAGAPGHYTYSSPHSGSIHSVSVVEANYDEYALLFSRGTKGPGQDFRMATLYSRTQTLKDELKEKFTTFSKAQGLTEEDIVFLPQPDKCIQE. The pIC50 is 6.2.
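The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as below. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the sketch assumes the IC50 is supplied in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50).

    Hypothetical helper for illustration; assumes the input is in molar units.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Labels taken from the examples above: 7.3, 6.0, 4.2, 3.7, 6.2.
    for label in (7.3, 6.0, 4.2, 3.7, 6.2):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, the label 6.0 in example (2) corresponds to an IC50 of about 1 µM, and the label 7.3 in example (1) to about 50 nM, which is why a higher pIC50 means a more potent compound.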