Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. From a dataset of Drug-target binding data from BindingDB using IC50 measurements. (1) The drug is O=C(Nc1ccc2[nH]nc(-c3ccncc3)c2c1)[C@H]1CCN(CC(=O)N2CCN(c3ccc(-c4ncccn4)cc3)CC2)C1. The target protein sequence is MSRSKRDNNFYSVEIGDSTFTVLKRYQNLKPIGSGAQGIVCAAYDAILERNVAIKKLSRPFQNQTHAKRAYRELVLMKCVNHKNIIGLLNVFTPQKSLEEFQDVYIVMELMDANLCQVIQMELDHERMSYLLYQMLCGIKHLHSAGIIHRDLKPSNIVVKSDCTLKILDFGLARTAGTSFMMTPYVVTRYYRAPEVILGMGYKENVDLWSVGCIMGEMVCHKILFPGRDYIDQWNKVIEQLGTPCPEFMKKLQPTVRTYVENRPKYAGYSFEKLFPDVLFPADSEHNKLKASQARDLLSKMLVIDASKRISVDEALQHPYINVWYDPSEAEAPPPKIPDKQLDEREHTIEEWKELIYKEVMDL. The pIC50 is 6.0. (2) The target protein (P09487) has sequence MISPFLLLAIGTCFASSLVPEKEKDPKYWRDQAQQTLKNALRLQTLNTNVAKNVIMFLGDGMGVSTVTAARILKGQLHHSPGEETKLEMDKFPYVALSKTYNTNAQVPDSAGTATAYLCGVKANEGTVGVSAATQRSQCNTTQGNEVTSILRWAKDAGKSVGIVTTTRVNHATPSASYAHSADRDWYSDNEMPPEALSQGCKDIAYQLMHNIKDIEVIMGGGRKYMFPKNRTDVEYELDEKARGTRLDGLNLIDIWKSFKPKHKHSHYVWNRTDLLALDPHSVDYLLGLFEPGDMQYELNRNNATDPSLSEMVEMAIRILNKNPKGFFLLVEGGRIDHGHHEGKAKQALHEAVEMDQAIGQAGAMTSVEDTLTVVTADHSHVFTFGGYTPRGNSIFGLAPMVSDTDKKPFTAILYGNGPGYKVVGGERENVSMVDYAHNNYQAQSAVPLRHETHGGEDVAVFAKGPMAHLLHGVHEQNYIPHVMAYAACIGANRDHCASA.... The pIC50 is 6.6. The drug is CN(C)c1ccc2c3c(cccc13)S(=O)(=O)NC2c1cc(Cl)cc(Cl)c1O. (3) The compound is COc1ccc2nc(N[C@@H](c3ccc(O)cc3)P(=O)(OC)OC)sc2c1. The target protein (Q99714) has sequence MAAACRSVKGLVAVITGGASGLGLATAERLVGQGASAVLLDLPNSGGEAQAKKLGNNCVFAPADVTSEKDVQTALALAKGKFGRVDVAVNCAGIAVASKTYNLKKGQTHTLEDFQRVLDVNLMGTFNVIRLVAGEMGQNEPDQGGQRGVIINTASVAAFEGQVGQAAYSASKGGIVGMTLPIARDLAPIGIRVMTIAPGLFGTPLLTSLPEKVCNFLASQVPFPSRLGDPAEYAHLVQAIIENPFLNGEVIRLDGAIRMQP. The pIC50 is 3.6. (4) The drug is COC(=O)[C@@H](NC(=S)[C@@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)OC(C)(C)C)C(C)C)C(C)C. 
The target protein sequence is PQVTLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF. The pIC50 is 5.3. (5) The compound is Nc1ccc2ccccc2n1. The target protein sequence is MEEKEILWNEAKAFIAACYQELGKAAEVKDRLADIKSEIDLTGSYVHTKEELEHGAKMAWRNSNRCIGRLFWNSLNVIDRRDVRTKEEVRDALFHHIETATNNGKIRPTITIFPPEEKGEKQVEIWNHQLIRYAGYESDGERIGDPASCSLTAACEELGWRGERTDFDLLPLIFRMKGDEQPVWYELPRSLVIEVPITHPDIEAFSDLELKWYGVPIISDMKLEVGGIHYNAAPFNGWYMGTEIGARNLADEKRYDKLKKVASVIGIAADYNTDLWKDQALVELNKAVLHSYKKQGVSIVDHHTAASQFKRFEEQAEEAGRKLTGDWTWLIPPISPAATHIFHRSYDNSIVKPNYFYQDKPYE. The pIC50 is 4.6. (6) The compound is COc1ccc(C(=O)Nc2c(Cl)cncc2Cl)cc1OC1CCCC1. The target protein sequence is SAAEEETRELQSLAAAVVPSAQTLKITDFSFSDFELSDLETALCTIRMFTDLNLVQNFQMKHEVLCRWILSVKKNYRKNVAYHNWRHAFNTAQCMFAALKAGKIQNKLTDLEILALLIAALSHDLDHRGVNNSYIQRSEHPLAQLYCHSIMEHHHFDQCLMILNSPGNQILSGLSIEEYKTTLKIIKQAILATDLALYIKRRGEFFELIRKNQFNLEDPHQKELFLAMLMTACDLSAITKPWPIQQRIAELVATEFFDQGDRERKELNIEPTDLMNREKKNKIPSMQVGFIDAICLQLYEALTHVSEDCFPLLDGCRKNRQKWQALAEQQEKMLINGESGQAKRN. The pIC50 is 5.5. (7) The drug is CC(=O)O[C@H](COP(=O)(O)O)[C@@H](OC(C)=O)C(=O)NO. The target protein (Q5NGP7) has sequence MEISMTSHINNAVETFRLEIETLEKLKNSIDENFEKACEIILENNRDKSRVIITGMGKSGHIGKKMAATFASTGTPAFFVHPGEAGHGDFGMITKNDVLIAISNSGTSSEIMGLLPMIKHLDIPIIAITSNPKSILARNSNVTLNLHVDKEACPLNLAPTSSTTATLVLGDALAIALLKAKNFSEKDFAFSHPNGALGRKLILKVENIMRKGNEIPIVKPTDNIRKAILEISDKGVGNTLVAENNTLLGIFTDGDLRRMFEAESFNSQRAISEVMTKNPKSISKEEMAITALEKMEKYEITSLAVVDNGHNILGIVTMHDLIKLELR. The pIC50 is 5.2. (8) The pIC50 is 3.5. The drug is COc1ccc(C(=O)[O-])cc1. The target protein (P51580) has sequence MDGTRTSLDIEEYSDTEVQKNQVLTLEEWQDKWVNGKTAFHQEQGHQLLKKHLDTFLKGKSGLRVFFPLCGKAVEMKWFADRGHSVVGVEISELGIQEFFTEQNLSYSEEPITEIPGTKVFKSSSGNISLYCCSIFDLPRTNIGKFDMIWDRGALVAINPGDRKCYADTMFSLLGKKFQYLLCVLSYDPTKHPGPPFYVPHAEIERLFGKICNIRCLEKVDAFEERHKSWGIDCLFEKLYLLTEK. (9) The small molecule is C[C@@H](NC(=O)Cc1ccc(C2CC2)cc1)c1ccc(OCC(F)(F)F)cn1. 
The target protein sequence is MARFGDEMPARYGGGGSGAAAGVVVGSGGGRGAGGSRQGGQPGAQRMYKQSMAQRARTMALYNPIPVRQNCLTVNRSLFLFSEDNVVRKYAKKITEWPPFEYMILATIIANCIVLALEQHLPDDDKTPMSERLDDTEPYFIGIFCFEAGIKIIALGFAFHKGSYLRNGWNVMDFVVVLTGILATVGTEFDLRTLRAVRVLRPLKLVSGIPSLQVVLKSIMKAMIPLLQIGLLLFFAILIFAIIGLEFYMGKFHTTCFEEGTDDIQGESPAPCGTEEPARTCPNGTKCQPYWEGPNNGITQFDNILFAVLTVFQCITMEGWTDLLYNSNDASGNTWNWLYFIPLIIIGSFFMLNLVLGVLSGEFAKERERVENRRAFLKLRRQQQIERELNGYMEWISKAEEVILAEDETDGEQRHPFDGALRRTTIKKSKTDLLNPEEAEDQLADIASVGSPFARASIKSAKLENSTFFHKKERRMRFYIRRMVKTQAFYWTVLSLVALN.... The pIC50 is 4.5. (10) The compound is COc1ccc(NC(=O)C2CCN(S(=O)(=O)c3cn(C(C)C)cn3)CC2)cc1. The target protein (P9WHH9) has sequence MTHYDVVVLGAGPGGYVAAIRAAQLGLSTAIVEPKYWGGVCLNVGCIPSKALLRNAELVHIFTKDAKAFGISGEVTFDYGIAYDRSRKVAEGRVAGVHFLMKKNKITEIHGYGTFADANTLLVDLNDGGTESVTFDNAIIATGSSTRLVPGTSLSANVVTYEEQILSRELPKSIIIAGAGAIGMEFGYVLKNYGVDVTIVEFLPRALPNEDADVSKEIEKQFKKLGVTILTATKVESIADGGSQVTVTVTKDGVAQELKAEKVLQAIGFAPNVEGYGLDKAGVALTDRKAIGVDDYMRTNVGHIYAIGDVNGLLQLAHVAEAQGVVAAETIAGAETLTLGDHRMLPRATFCQPNVASFGLTEQQARNEGYDVVVAKFPFTANAKAHGVGDPSGFVKLVADAKHGELLGGHLVGHDVAELLPELTLAQRWDLTASELARNVHTHPTMSEALQECFHGLVGHMINF. The pIC50 is 5.6.
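The definition used throughout these examples, pIC50 = -log10(IC50 in M), can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling. The function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the sketch assumes the IC50 has already been converted to molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # A pIC50 of 6.0, as in example (1), corresponds to an IC50 of about 1 micromolar.
    print(round(pic50_to_ic50_nm(6.0), 3))
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(ic50_to_pic50(1e-9) > ic50_to_pic50(1e-6))
```

Because the scale is logarithmic, a difference of one pIC50 unit between two examples corresponds to a tenfold difference in IC50.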