From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CS/C(Nc1cccc(C(C)C)c1)=C(/C#N)S(=O)(=O)c1ccccc1. The target protein (P9WMK9) has sequence MSDEDRTDRATEDHTIFDRGVGQRDQLQRLWTPYRMNYLAEAPVKRDPNSSASPAQPFTEIPQLSDEEGLVVARGKLVYAVLNLYPYNPGHLMVVPYRRVSELEDLTDLESAELMAFTQKAIRVIKNVSRPHGFNVGLNLGTSAGGSLAEHLHVHVVPRWGGDANFITIIGGSKVIPQLLRDTRRLLATEWARQP. The pIC50 is 5.8. (2) The drug is CO[C@H]1/C=C/O[C@@]2(C)Oc3c(C)c(O)c4c(O)c(cc(O)c4c3C2=O)NC(=O)/C(C)=C\C=C\[C@H](C)[C@H](O)[C@@H](C)[C@@H](O)[C@@H](C)[C@H](OC(C)=O)[C@@H]1C. The target protein sequence is MVYSYTEKKRIRKDFGKRPQVLDVPYLLSIQLDSFQKFIEQDPEGQYGLEAAFRSVFPIQSYSGNSELQYVSYRLGEPVFDVQECQIRGVTYSAPLRVKLRLVIYEREAPEGTVKDIKEQEVYMGEIPLMTDNGTFVINGTERVIVSQLHRSPGVFFDSDKGKTHSSGKVLYNARIIPYRGSWLDFEFDPKDNLFVRIDRRRKLPATIILRALNYTTEQILDLFFEKVIFEIRDNKLQMELVPERLRGETASFDIEANGKVYVEKGRRITARHIRQLEKDDVKLIEVPVEYIAGKVVAKDYIDESTGELICAANMELSLDLLAKLSQSGHKRIETLFTNDLDHGPYISETLRVDPTNDRLSALVEIYRMMRPGEPPTREAAESLFENLFFSEDRYDLSAVGRMKFNRSLLREEIEGSGILSKDDIIDVMKKLIDIRNGKGEVDDIDHLGNRRIRSVGEMAENQFRVGLVRVERAVKERLSLGDLDTLMPQDMINAKPISA.... The pIC50 is 4.0. (3) The drug is C[C@]12CC[C@@H](O)[C@@](C)(CO)[C@@H]1CC[C@H]1C[C@@H]3C[C@@]12CC[C@]3(O)CO. The target protein (P08546) has sequence MFFNPYLSGGVTGGAVAGGRRQRSQPGSAQGSGKRPPQKQFLQIVPRGVMFDGQTGLIKHKTGRLPLMFYREIKHLLSHDMVWPCPWRETLVGRVVGPIRFHTYDQTDAVLFFDSPENVSPRYRQHLVPSGNVLRFFGATEHGYSICVNVFGQRSYFYCEYSDTDRLREVIASVGELVPEPRTPYAVSVTPATKTSIYGYGTRPVPDLQCVSISNWTMARKIGEYLLEQGFPVYEVRVDPLTRLVIDRRITTFGWCSVNRYDWRQQGRASTCDIEVDCDVSDLVAVPDDSSWPRYRCLSFDIECMSGEGGFPCAEKSDDIVIQISCVCYETGGNTAVDQGIPNGNDGRGCTSEGVIFGHSGLHLFTIGTCGQVGPDVDVYEFPSEYELLLGFMLFFQRYAPAFVTGYNINSFDLKYILTRLEYLYKVDSQRFCKLPTAQGGRFFLHSPAVGFKRQYAAAFPSASHNNPASTAATKVYIAGSVVIDMYPVCMAKTNSPNYK.... The pIC50 is 6.3. 
(4) The compound is CCCCNc1ccc(C=NNC(N)=S)nc1. The target protein (P31350) has sequence MLSLRVPLAPITDPQQLQLSPLKGLSLVDKENTPPALSGTRVLASKTARRIFQEPTEPKTKAAAPGVEDEPLLRENPRRFVIFPIEYHDIWQMYKKAEASFWTAEEVDLSKDIQHWESLKPEERYFISHVLAFFAASDGIVNENLVERFSQEVQITEARCFYGFQIAMENIHSEMYSLLIDTYIKDPKEREFLFNAIETMPCVKKKADWALRWIGDKEATYGERVVAFAAVEGIFFSGSFASIFWLKKRGLMPGLTFSNELISRDEGLHCDFACLMFKHLVHKPSEERVREIIINAVRIEQEFLTEALPVKLIGMNCTLMKQYIEFVADRLMLELGFSKVFRVENPFDFMENISLEGKTNFFEKRVGEYQRMGVMSSPTENSFTLDADF. The pIC50 is 5.6. (5) The drug is CCC(C)(C)Cc1cnc(CCc2ccc(-c3ccccn3)cc2)[nH]1. The target protein (Q8K418) has sequence MSQRQPQSPNQTLISITNDTETSSSAVSNDTTPKGWTGDNSPGIEALCAIYITYAVIISVGILGNAILIKVFFKTKSMQTVPNIFITSLAFGDLLLLLTCVPVDATHYLAEGWLFGKVGCKVLSFIRLTSVGVSVFTLTILSADRYKAVVKPLERQPSNAILKTCAKAGGIWIMAMIFALPEAIFSNVYTFQDPNRNVTFESCNSYPISERLLQEIHSLLCFLVFYIIPLSIISVYYSLIARTLYKSTLNIPTEEQSHARKQIESRKRIAKTVLVLVALFALCWLPNHLLYLYHSFTYESYAEPSDVPFVVTIFSRVLAFSNSCVNPFALYWLSKTFQKHFKAQLCCFKAEQPEPPLGDTPLNNLTVMGRVPATGSAHVSEISVTLFSGSTAKKGEDKV. The pIC50 is 8.6. (6) The small molecule is CCCCCCCCCCCCc1cccc(O)c1C(=O)O. The target protein (P22513) has sequence MPIKVGINGFGRIGRMVFQALCEDGLLGTEIDVVAVVDMNTDAEYFAYQMRYDTVHGKFKYEVTTTKSSPSVAKDDTLVVNGHRILCVKAQRNPADLPWGKLGVEYVIESTGLFTAKAAAEGHLRGGARKVVISAPASGGAKTLVMGVNHHEYNPSEHHVVSNASCTTNCLAPIVHVLVKEGFGVQTGLMTTIHSYTATQKTVDGVSVKDWRGGRAAAVNIIPSTTGAAKAVGMVIPSTQGKLTGMSFRVPTPDVSVVDLTFTAARDTSIQEIDAALKRASKTYMKGILGYTDEELVSADFINDNRSSIYDSKATLQNNLPKERRFFKIVSWYDNEWGYSHRVVDLVRHMASKDRSARL. The pIC50 is 4.3. (7) The small molecule is CC(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)[C@@H](O)CC(=O)N[C@@H](CCC(N)=O)C(N)=O. 
The target protein sequence is MNNYFLRKENFFILFCFVFVSIFFVSNVTIIKCNNVENKIDNVGKKIENVGKKIGDMENKNDNVENKNDNVGNKNDNVKNASSDLYKYKLYGDIDEYAYYFLDIDIGKPSQRISLILDTGSSSLSFPCNGCKDCGIHMEKPYNLNYSKTSSILYCNKSNCPYGLKCVGNKCEYLQSYCEGSQIYGFYFSDIVTLPSYNNKNKISFEKLMGCHMHEESLFLHQQATGVLGFSLTKPNGVPTFVDLLFKHTPSLKPIYSICVSEHGGELIIGGYEPDYFLSNQKEKQKMDKSDNNSSNKGNVSIKLKNNDKNDDEENNSKDVIVSNNVEDIVWQAITRKYYYYIKIYGLDLYGTNIMDKKELDMLVDSGSTFTHIPENIYNQINYYLDILCIHDMTNIYEINKRLKLTNESLNKPLVYFEDFKTALKNIIQNENLCIKIVDGVQCWKSLENLPNLYITLSNNYKMIWKPSSYLYKKESFWCKGLEKQVNNKPILGLTFFKNK.... The pIC50 is 6.2. (8) The compound is Oc1ccc(O)c(/C=N/Nc2cnc3ccccc3n2)c1. The target protein sequence is MFLAQEIIRKKRDGHALSDEEIRFFINGIRDNTISEGQIAALAMTIFFHDMTMPERVSLTMAMRDSGTVLDWKSLHLNGPIVDKHSTGGVGDVTSLMLGPMVAACGGYIPMISGRGLGHTGGTLDKLESIPGFDIFPDDNRFREIIKDVGVAIIGQTSSLAPADKRFYATRDITATVDSIPLITASILAKKLAEGLDALVMDVKVGSGAFMPTYELSEALAEAIVGVANGAGVRTTALLTDMNQVLASSAGNAVEVREAVQFLTGEYRNPRLFDVTMALCVEMLISGKLAKDDAEARAKLQAVLDNGKAAEVFGRMVAAQKGPTDFVENYAKYLPTAMLTKAVYADTEGFVSEMDTRALGMAVVAMGGGRRQASDTIDYSVGFTDMARLGDQVDGQRPLAVIHAKDENNWQEAAKAVKAAIKLADKAPESTPTVYRRISE. The pIC50 is 4.5.
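The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset: the helper names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the choice of nanomolar as the raw IC50 unit is an assumption; change the constant if your measurements use a different unit.

```python
import math

# Assumption: raw IC50 values are given in nanomolar (nM).
# 1 M = 1e9 nM, so -log10(IC50_nM / 1e9) = 9 - log10(IC50_nM).
LOG10_NM_PER_M = 9.0


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in M). (Hypothetical helper.)"""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return LOG10_NM_PER_M - math.log10(ic50_nm)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover the IC50 in nM from a pIC50 value. (Hypothetical helper.)"""
    return 10.0 ** (LOG10_NM_PER_M - pic50)


if __name__ == "__main__":
    # Labels taken from examples (2) and (5) in the record above.
    for label in (4.0, 8.6):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, the label 8.6 in example (5) corresponds to an IC50 of roughly 2.5 nM, while the label 4.0 in example (2) corresponds to 100 µM, which is why higher values mean more potent binding.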