From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Clc1nss/c1=N\c1nncs1. The target protein (O15305) has sequence MAAPGPALCLFDVDGTLTAPRQKITKEMDDFLQKLRQKIKIGVVGGSDFEKVQEQLGNDVVEKYDYVFPENGLVAYKDGKLLCRQNIQSHLGEALIQDLINYCLSYIAKIKLPKKRGTFIEFRNGMLNVSPIGRSCSQEERIEFYELDKKENIRQKFVADLRKEFAGKGLTFSIGGQISFDVFPDGWDKRYCLRHVENDGYKTIYFFGDKTMPGGNDHEIFTDPRTMGYSVTAPEDTRRICELLFS. The pIC50 is 4.0. (2) The compound is C=C1/C(=C\C=C2/CCC[C@@]3(C)[C@H]2CC[C@@H]3[C@H](C)CCCC(C)(C)O)C[C@@H](O)C[C@@H]1O. The target protein (P11473) has sequence MEAMAASTSLPDPGDFDRNVPRICGVCGDRATGFHFNAMTCEGCKGFFRRSMKRKALFTCPFNGDCRITKDNRRHCQACRLKRCVDIGMMKEFILTDEEVQRKREMILKRKEEEALKDSLRPKLSEEQQRIIAILLDAHHKTYDPTYSDFCQFRPPVRVNDGGGSHPSRPNSRHTPSFSGDSSSSCSDHCITSSDMMDSSSFSNLDLSEEDSDDPSVTLELSQLSMLPHLADLVSYSIQKVIGFAKMIPGFRDLTSEDQIVLLKSSAIEVIMLRSNESFTMDDMSWTCGNQDYKYRVSDVTKAGHSLELIEPLIKFQVGLKKLNLHEEEHVLLMAICIVSPDRPGVQDAALIEAIQDRLSNTLQTYIRCRHPPPGSHLLYAKMIQKLADLRSLNEEHSKQYRCLSFQPECSMKLTPLVLEVFGNEIS. The pIC50 is 8.9. (3) The compound is O=C(O)Cc1ccccc1Nc1c(Cl)cccc1Cl. The target protein (P35354) has sequence MLARALLLCAVLALSHTANPCCSHPCQNRGVCMSVGFDQYKCDCTRTGFYGENCSTPEFLTRIKLFLKPTPNTVHYILTHFKGFWNVVNNIPFLRNAIMSYVLTSRSHLIDSPPTYNADYGYKSWEAFSNLSYYTRALPPVPDDCPTPLGVKGKKQLPDSNEIVEKLLLRRKFIPDPQGSNMMFAFFAQHFTHQFFKTDHKRGPAFTNGLGHGVDLNHIYGETLARQRKLRLFKDGKMKYQIIDGEMYPPTVKDTQAEMIYPPQVPEHLRFAVGQEVFGLVPGLMMYATIWLREHNRVCDVLKQEHPEWGDEQLFQTSRLILIGETIKIVIEDYVQHLSGYHFKLKFDPELLFNKQFQYQNRIAAEFNTLYHWHPLLPDTFQIHDQKYNYQQFIYNNSILLEHGITQFVESFTRQIAGRVAGGRNVPPAVQKVSQASIDQSRQMKYQSFNEYRKRFMLKPYESFEELTGEKEMSAELEALYGDIDAVELYPALLVEKPRP.... The pIC50 is 8.3. (4) The small molecule is Cc1nc(C)c(-c2ccc3cc(-c4c(C5CCCCC5)c5ccc6cc5n4CC(=O)NCC/C=C\CCNC6=O)ccc3n2)s1. 
The target protein (O92972) has sequence MSTNPKPQRKTKRNTNRRPQDVKFPGGGQIVGGVYLLPRRGPRLGVRATRKASERSQPRGRRQPIPKARRPEGRAWAQPGYPWPLYGNEGLGWAGWLLSPRGSRPSWGPTDPRRRSRNLGKVIDTLTCGFADLMGYIPLVGAPLGGAARALAHGVRVLEDGVNYATGNLPGCSFSIFLLALLSCLTIPASAYEVRNVSGIYHVTNDCSNSSIVYEAADVIMHTPGCVPCVREGNSSRCWVALTPTLAARNASVPTTTIRRHVDLLVGTAAFCSAMYVGDLCGSIFLVSQLFTFSPRRHETVQDCNCSIYPGHVSGHRMAWDMMMNWSPTTALVVSQLLRIPQAVVDMVAGAHWGVLAGLAYYSMVGNWAKVLIVALLFAGVDGETHTTGRVAGHTTSGFTSLFSSGASQKIQLVNTNGSWHINRTALNCNDSLQTGFFAALFYAHKFNSSGCPERMASCRPIDWFAQGWGPITYTKPNSSDQRPYCWHYAPRPCGVVPAS.... The pIC50 is 5.6. (5) The drug is Cc1cc(C(=O)O)ccc1-c1ccc(N2C(=O)N(c3cc(=O)[nH]cn3)C3(CCN(Cc4ncccc4C)CC3)C2=O)cn1. The target protein (Q96KS0) has sequence MDSPCQPQPLSQALPQLPGSSSEPLEPEPGRARMGVESYLPCPLLPSYHCPGVPSEASAGSGTPRATATSTTASPLRDGFGGQDGGELRPLQSEGAAALVTKGCQRLAAQGARPEAPKRKWAEDGGDAPSPSKRPWARQENQEAEREGGMSCSCSSGSGEASAGLMEEALPSAPERLALDYIVPCMRYYGICVKDSFLGAALGGRVLAEVEALKRGGRLRDGQLVSQRAIPPRSIRGDQIAWVEGHEPGCRSIGALMAHVDAVIRHCAGRLGSYVINGRTKAMVACYPGNGLGYVRHVDNPHGDGRCITCIYYLNQNWDVKVHGGLLQIFPEGRPVVANIEPLFDRLLIFWSDRRNPHEVKPAYATRYAITVWYFDAKERAAAKDKYQLASGQKGVQVPVSQPPTPT. The pIC50 is 9.2. (6) The drug is O=C(Oc1ccc(Cl)cc1C(=O)c1cccc(Cl)c1Cl)c1ccccc1. The target protein (P59264) has sequence HLLQFRKMIKKMTGKEPIVSYAFYGCYCGKGGRGKPKDATDRCCFVHDCCYEKVTGCDPKWSYYTYSLEDGDIVCEGDPYCTKVKCECDKKAAICFRDNLKTYKNRYMTFPDIFCTDPTEGC. The pIC50 is 4.3. (7) The compound is O=C(N[C@H]1CCOC1=O)c1ccc(Br)cc1. The target protein sequence is MHDEREGYLEILSRITTEEEFFSLVLEICGNYGFEFFSFGARAPFPLTAPKYHFLSNYPGEWKSRYISEDYTSIDPIVRHGLLEYTPLIWNGEDFQENRFFWEEALHHGIRHGWSIPVRGKYGLISMLSLVRSSESIAATEILEKESFLLWITSMLQATFGDLLAPRIVPESNVRLTARETEMLKWTAVGKTYGEIGLILSIDQRTVKFHIVNAMRKLNSSNKAEATMKAYAIGLLN. The pIC50 is 6.7. (8) The drug is Cc1ccc(NC(=O)Nc2cc(C(F)(F)F)ccc2F)cc1Nc1ccc2c(c1)NC(=O)/C2=C\c1ccc[nH]1. 
The target protein (P35546) has sequence MAKATSGAAGLGLKLILLLPLLGEAPLGLYFSRDAYWERLYVDQPAGTPLLYVHALRDAPGEVPSFRLGQHLYGVYRTRLHENDWIRINETTGLLYLNQSLDHSSWEQLSIRNGGFPLLTIFLQVFLGSTAQREGECHWPGCTRVYFSFINDTFPNCSSFKAQDLCIPETAVSFRVRENRPPGTFYHFHMLPVQFLCPNISVKYSLLGGDSLPFRCDPDCLEVSTRWALDRELREKYVLEALCIVAGPGANKETVTLSFPVTVYDEDDSAPTFSGGVGTASAVVEFKRKEGTVVATLQVFDADVVPASGELVRRYTNTLLSGDSWAQQTFRVEHSPIETLVQVNNNSVRATMHNYKLILNRSLSISESRVLQLAVLVNDSDFQGPGAGGILVLHFNVSVLPVTLNLPRAYSFPVNKRARRYAQIGKVCVENCQEFSGVSIQYKLQPSSINCTALGVVTSPEDTSGTLFVNDTEALRRPECTKLQYTVVATDRQTRRQTQA.... The pIC50 is 5.3. (9) The drug is C[C@@H]1O[C@H](C)CN2c3c(cc4c(N5C(=O)OC[C@@H]5CF)noc4c3F)CC3(C(=O)NC(=O)NC3=O)[C@@H]12. The target protein (P0AES4) has sequence MSDLAREITPVNIEEELKSSYLDYAMSVIVGRALPDVRDGLKPVHRRVLYAMNVLGNDWNKAYKKSARVVGDVIGKYHPHGDSAVYDTIVRMAQPFSLRYMLVDGQGNFGSIDGDSAAAMRYTEIRLAKIAHELMADLEKETVDFVDNYDGTEKIPDVMPTKIPNLLVNGSSGIAVGMATNIPPHNLTEVINGCLAYIDDEDISIEGLMEHIPGPDFPTAAIINGRRGIEEAYRTGRGKVYIRARAEVEVDAKTGRETIIVHEIPYQVNKARLIEKIAELVKEKRVEGISALRDESDKDGMRIVIEVKRDAVGEVVLNNLYSQTQLQVSFGINMVALHHGQPKIMNLKDIIAAFVRHRREVVTRRTIFELRKARDRAHILEALAVALANIDPIIELIRHAPTPAEAKTALVANPWQLGNVAAMLERAGDDAARPEWLEPEFGVRDGLYYLTEQQAQAILDLRLQKLTGLEHEKLLDEYKELLDQIAELLRILGSADRLME.... The pIC50 is 6.9. (10) The drug is O=C([O-])C1=C(/C=C/c2cccc[n+]2[O-])CS(=O)(=O)[C@@H]2/C(=C\c3ccccn3)C(=O)N12. The target protein sequence is MMKKSLCCALLLGISCSALATPVSEKQLAEVVANTVTPLMKAQSVPGMAVAVIYQGKPHYYTFGKADIAANKPVTPQTLFELGSISKTFTGVLGGDAIARGEISLDDPVTRYWPQLTGKQWQGIRMLDLATYTAGGLPLQVPDEVTDNASLLRFYQNWQPQWKPGTTRLYANASIGLFGALAVKPSGMPYEQAMTTRVLKPLKLDHTWINVPKAEEAHYAWGYRDGKAVRAVRVSPGMLDAQAYGVKTNVQDMANWVMANMAPENVADASLKQGIALAQSRYWRIGSMYQGLGWEMLNWPVEANTVVEGSDSKVALAPLPVAEVNPPAPPVKASWVHKTGSTGGFGSYVAFIPEKQIGIVMLANTSYPNPARVEAAYHILEALQ. The pIC50 is 7.0.
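
The pIC50 definition stated at the top of this record, pIC50 = -log10(IC50 in M), can be sketched in a few lines of Python. This is a minimal illustration only: the function names `pic50_from_ic50` and `ic50_from_pic50` are hypothetical helpers and are not part of BindingDB or of any dataset tooling, and the input is assumed to already be expressed in molar units.

```python
import math


def pic50_from_ic50(ic50_molar: float) -> float:
    """Convert an IC50 given in molar (M) to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def ic50_from_pic50(pic50: float) -> float:
    """Invert the transform: IC50 in molar (M) from a pIC50 value."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Hypothetical helper constant: 1 nM expressed in M
    NANOMOLAR = 1e-9

    # A 100 uM (1e-4 M) inhibitor corresponds to a pIC50 of about 4
    print(round(pic50_from_ic50(1e-4), 3))

    # A pIC50 of 9 corresponds to an IC50 of about 1 nM
    print(ic50_from_pic50(9.0) / NANOMOLAR)
```

Because the scale is logarithmic, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent compounds.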