This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC[C@H]1O[C@@H](NC(=S)NN=Cc2ccncc2)[C@H](O)[C@@H](O)[C@@H]1O. The target protein (P00489) has sequence MSRPLSDQEKRKQISVRGLAGVENVTELKKNFNRHLHFTLVKDRNVATPRDYYFALAHTVRDHLVGRWIRTQQHYYEKDPKRIYYLSLEFYMGRTLQNTMVNLALENACDEATYQLGLDMEELEEIEEDAGLGNGGLGRLAACFLDSMATLGLAAYGYGIRYEFGIFNQKICGGWQMEEADDWLRYGNPWEKARPEFTLPVHFYGRVEHTSQGAKWVDTQVVLAMPYDTPVPGYRNNVVNTMRLWSAKAPNDFNLKDFNVGGYIQAVLDRNLAENISRVLYPNDNFFEGKELRLKQEYFVVAATLQDIIRRFKSSKFGCRDPVRTNFDAFPDKVAIQLNDTHPSLAIPELMRVLVDLERLDWDKAWEVTVKTCAYTNHTVLPEALERWPVHLLETLLPRHLQIIYEINQRFLNRVAAAFPGDVDRLRRMSLVEEGAVKRINMAHLCIAGSHAVNGVARIHSEILKKTIFKDFYELEPHKFQNKTNGITPRRWLVLCNPGL.... The pIC50 is 3.7. (2) The small molecule is O=C1c2ccccc2C(=O)C12OC(c1ccccc1)c1c2c(O)n(-c2ccc(Cl)c(C(F)(F)F)c2)c1O. The target protein (Q8P1K1) has sequence MDLQAQLEELKTKTLETLQSLTGNHTKELQDLRVAVLGKKGSLTELLKGLKDLSNDLRPVVGKQVNEVRDLLTKAFEEQAKIVEAAKIQAQLDAESIDVTLPGRQMTLGHRHVLTQTSEEIEDIFLGMGFQIVDGFEVEKDYYNFERMNLPKDHPARDMQDTFYITEEILLRTHTSPVQARTLDQHDFSKGPLKMVSPGRVFRRDTDDATHSHQFHQIEGLVVGKNISMRDLKGTLEMIIKKMFGEERSIRLRPSYFPFTEPSVEVDVSCFKCGGKGCNVCKKTGWIEILGAGMVHPSVLEMSGVDAKEYSGFAFGLGQERIAMLRYGINDIRGFYQGDQRFSEQFN. The pIC50 is 4.5. (3) The compound is NCCCNCCCCNCCC(=O)NCCCCCNC(=O)[C@H](CC(N)=O)NC(=O)Cc1c[nH]c2cccc(O)c12. 
The target protein (P22756) has sequence MERSTVLIQPGLWTRDTSWTLLYFLCYILPQTSPQVLRIGGIFETVENEPVNVEELAFKFAVTSINRNRTLMPNTTLTYDIQRINLFDSFEASRRACDQLALGVAALFGPSHSSSVSAVQSICNALEVPHIQTRWKHPSVDSRDLFYINLYPDYAAISRAVLDLVLYYNWKTVTVVYEDSTGLIRLQELIKAPSRYNIKIKIRQLPPANKDAKPLLKEMKKSKEFYVIFDCSHETAAEILKQILFMGMMTEYYHYFFTTLDLFALDLELYRYSGVNMTGFRKLNIDNPHVSSIIEKWSMERLQAPPRPETGLLDGMMTTEAALMYDAVYMVAIASHRASQLTVSSLQCHRHKPCALGPRFMNLIKEARWDGLTGRITFNKTDGLRKDFDLDIISLKEEGTEKASGEVSKHLYKVWKKIGIWNSNSGLNMTDGNRDRSNNITDSLANRTLIVTTILEEPYVMYRKSDKPLYGNDRFEAYCLDLLKELSNILGFLYDVKLVP.... The pIC50 is 9.5. (4) The target protein (P47824) has sequence MARRLQDELSAFFFEYDTPRMVLVRNKKVGVIFRLIQLVVLVYVIGWVFVYEKGYQTSSDLISSVSVKLKGLAVTQLQGLGPQVWDVADYVFPAHGDSSFVVMTNFIVTPQQTQGHCAENPEGGICQDDSGCTPGKAERKAQGIRTGNCVPFNGTVKTCEIFGWCPVEVDDKIPSPALLREAENFTLFIKNSISFPRFKVNRRNLVEEVNGTYMKKCLYHKIQHPLCPVFNLGYVVRESGQDFRSLAEKGGVVGITIDWKCDLDWHVRHCKPIYQFHGLYGEKNLSPGFNFRFARHFVQNGTNRRHLFKVFGIHFDILVDGKAGKFDIIPTMTTIGSGIGIFGVATVLCDLLLLHILPKRHYYKQKKFKYAEDMGPGEGEHDPVATSSTLGLQENMRTS. The compound is Cc1nc(N=Nc2cc(S(=O)(=O)O)ccc2S(=O)(=O)O)c(COP(=O)(O)O)c(C=O)c1O. The pIC50 is 7.4. (5) The target protein (Q9P0G3) has sequence MSLRVLGSGTWPSAPKMFLLLTALQVLAIAMTQSQEDENKIIGGHTCTRSSQPWQAALLAGPRRRFLCGGALLSGQWVITAAHCGRPILQVALGKHNLRRWEATQQVLRVVRQVTHPNYNSRTHDNDLMLLQLQQPARIGRAVRPIEVTQACASPGTSCRVSGWGTISSPIARYPASLQCVNINISPDEVCQKAYPRTITPGMVCAGVPQGGKDSCQGDSGGPLVCRGQLQGLVSWGMERCALPGYPGVYTNLCKYRSWIEETMRDK. The drug is COc1ccc(NC(=O)/C(C#N)=C\c2ccc(O)c(OC)c2)cc1. The pIC50 is 4.0. (6) The drug is O=C(O)c1cc(-c2ccc(Br)cc2)nc2c(C(F)(F)F)cc(Br)cc12. 
The target protein (P78600) has sequence MIIIKRFLHIKTVPKSYGNQLSKFKYSKQIPTHEVLTKLGYITYPRAGLVNWSKMGLLIQNKISQIIRQRMDEIQFEEVSLSLISHKELWKLTNRWDQEEIFKLVGDEYLLVPTAEEEITNYVKKQFLESYKNFPLALYQINPKFRNEKRPRGGLLRGKEFLMKDAYSFDLNESEAMKTYEKVVGAYHKIFQDLGIPYVKAEADSGDIGGSLSHEWHYLNSSGEDTVFECNECHNVSNMEKALSYPKEIDETIEVSVIYFTTEDKSTLICAYYPSNRVLEPKFIQNEIPDIDLDSINDLSEFNHDISTRIVRIMDSRLSSRSKFPDFPISNFINRSLITTLTDIPIVLAQEGEICGHCEEGKLSASSAIEVGHTFYLGDKYSKPLDLEVDVPTSNNSIEKQRIMMGCYGIGISRIIAAIAEINRDEKGLKWPRSIAPWEVTVVEVSKQKQLKNVNDNNHHNNPQDNFQEIYNILNQANIDYRLDNRSDSMGKKLKQSDLL.... The pIC50 is 6.7. (7) The small molecule is OC[C@H]1N/C(=N\C2CCCCC2)[C@H](O)[C@@H](O)[C@H]1O. The target protein (Q58D55) has sequence MPGVVRLLALLLVPLLLGSARGLHNATQRTFQIDYRRNRFLKDGQPFRYISGSIHYFRVPRFYWKDRLLKMKMAGLNAIQTYVAWNFHELQPGRYNFSGDHDVEHFIQLAHELGLLVILRPGPYICAEWDMGGLPAWLLEKKSIVLRSSDPDYLAAVDKWLGVLLPKMRPLLYKNGGPIITVQVENEYGSYLSCDYDYLRFLQKRFHDHLGEDVLLFTTDGVNERLLQCGALQGLYATVDFSPGTNLTAAFMLQRKFEPTGPLVNSEFYTGWLDHWGQRHSTVSSKAVAFTLHDMLALGANVNMYMFIGGTNFAYWNGANIPYQPQPTSYDYDAPLSEAGDLTEKYFALRDIIQKFAKVPEGPIPPSTPKFAYGKVALNKLKTVEDALNILCPSGPIKSVYPLTFIDVKQYFGFVLYRTMLPEDCSDPTPLSSPLSGVHDRAYVSVNGVAQGILERESVITLNITGKAGATLDLLVENMGRVNYGSSINDFKGLVSNLTL.... The pIC50 is 6.0. (8) The small molecule is NS(=O)(=O)c1c(C(F)(F)F)ccc([C@H]2CC[C@H](NC3CCNCC3)CC2)c1-c1nnn[nH]1. The target protein sequence is MLKVISSLLVYMTASVMAVASPLAHSGEPSGEYPTVNEIPVGEVRLYQIADGVWSHIATQSFDGAVYPSNGLIVRDGDELLLIDTAWGAKNTAALLAEIEKQIGLPVTRAVSTHFHDDRVGGVDVLRAAGVATYASPSTRRLAEAEGNEIPTHSLEGLSSSGDAVRFGPVELFYPGAAHSTDNLVVYVPSANVLYGGCAVHELSSTSAGNVADADLAEWPTSVERIQKHYPEAEVVIPGHGLPGGLDLLQHTANVVKAHKNRSVAE. The pIC50 is 8.2.
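
The pIC50 definition used in these records (pIC50 = -log10(IC50 in M)) can be sketched in code to make the scale concrete. This is a minimal illustration, not part of the dataset tooling; the function names `pic50_from_ic50_molar` and `ic50_molar_from_pic50` are hypothetical, and the only assumption is that IC50 is expressed in mol/L, as the definition states.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper name; not part of any BindingDB tooling.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 1 micromolar (1e-6 M) corresponds to a pIC50 of 6.
    print(pic50_from_ic50_molar(1e-6))

    # Labels taken from the records above; lower IC50 means higher pIC50.
    for label in (3.7, 6.0, 9.5):
        ic50 = ic50_molar_from_pic50(label)
        print(f"pIC50 {label} corresponds to an IC50 of {ic50:.3e} M")
```

On this scale, each unit of pIC50 is a tenfold change in IC50, so the pIC50 of 9.5 in record (3) reflects an IC50 roughly six orders of magnitude lower than the pIC50 of 3.7 in record (1).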