This drug-target binding data is from BindingDB and uses IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The prediction target is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)C[C@H](C=O)NC(=O)[C@H](NS(=O)(=O)c1ccc(F)cc1)C(C)C. The target protein (P97571) has sequence MAEELITPVYCTGVSAQVQKQRDKELGLGRHENAIKYLGQDYENLRARCLQNGVLFQDDAFPPVSHSLGFKELGPNSSKTYGIKWKRPTELLSNPQFIVDGATRTDICQGALGDCWLLAAIASLTLNETILHRVVPYGQSFQEGYAGIFHFQLWQFGEWVDVVVDDLLPTKDGKLVFVHSAQGNEFWSALLEKAYAKVNGSYEALSGGCTSEAFEDFTGGVTEWYDLQKAPSDLYQIILKALERGSLLGCSINISDIRDLEAITFKNLVRGHAYSVTDAKQVTYQGQRVNLIRMRNPWGEVEWKGPWSDNSYEWNKVDPYEREQLRVKMEDGEFWMSFRDFIREFTKLEICNLTPDALKSRTLRNWNTTFYEGTWRRGSTAGGCRNYPATFWVNPQFKIRLEEVDDADDYDSRESGCSFLLALMQKHRRRERRFGRDMETIGFAVYQVPRELAGQPVHLKRDFFLANASRAQSEHFINLREVSNRIRLPPGEYIVVPSTF.... The pIC50 is 3.0. (2) The drug is NCCCC(=O)O. The target protein (P23574) has sequence MGSGKVFLFSPSLLWSQTRGVRLIFLLLTLHLGNCIDKADDEDDEDLTMNKTWVLAPKIHEGDITQILNSLLQGYDNKLRPDIGVRPTVIETDVYVNSIGPVDPINMEYTIDIIFAQTWFDSRLKFNSTMKVLMLNSNMVGKIWIPDTFFRNSRKSDAHWITTPNRLLRIWSDGRVLYTLRLTINAECYLQLHNFPMDEHSCPLEFSSYGYPKNEIEYKWKKPSVEVADPKYWRLYQFAFVGLRNSTEISHTISGDYIIMTIFFDLSRRMGYFTIQTYIPCILTVVLSWVSFWINKDAVPARTSLGITTVLTMTTLSTIARKSLPKVSYVTAMDLFVSVCFIFVFAALMEYGTLHYFTSNNKGKTTRDRKLKSKTSVSPGLHAGSTLIPMNNISMPQGEDDYGYQCLEGKDCATFFCCFEDCRTGSWREGRIHIRIAKIDSYSRIFFPTAFALFNLVYWVGYLYL. The pIC50 is 7.5. (3) The small molecule is N#Cc1c(N)[nH]c2ccc(Oc3ccc(NC(=O)CN)cc3)cc12. 
The target protein (Q16875) has sequence MPLELTQSRVQKIWVPVDHRPSLPRSCGPKLTNSPTVIVMVGLPARGKTYISKKLTRYLNWIGVPTKVFNVGEYRREAVKQYSSYNFFRPDNEEAMKVRKQCALAALRDVKSYLAKEGGQIAVFDATNTTRERRHMILHFAKENDFKAFFIESVCDDPTVVASNIMEVKISSPDYKDCNSAEAMDDFMKRISCYEASYQPLDPDKCDRDLSLIKVIDVGRRFLVNRVQDHIQSRIVYYLMNIHVQPRTIYLCRHGENEHNLQGRIGGDSGLSSRGKKFASALSKFVEEQNLKDLRVWTSQLKSTIQTAEALRLPYEQWKALNEIDAGVCEELTYEEIRDTYPEEYALREQDKYYYRYPTGESYQDLVQRLEPVIMELERQENVLVICHQAVLRCLLAYFLDKSAEEMPYLKCPLHTVLKLTPVAYGCRVESIYLNVESVCTHRERSEDAKKGPNPLMRRNSVTPLASPEPTKKPRINSFEEHVASTSAALPSCLPPEVPT.... The pIC50 is 5.6. (4) The drug is CC(=O)N[C@@H](Cc1ccc(OP(=O)(O)O)cc1)C(=O)N[C@@H](C)c1nc(Cc2ccc(I)cc2)no1. The target protein (P43403) has sequence MPDPAAHLPFFYGSISRAEAEEHLKLAGMADGLFLLRQCLRSLGGYVLSLVHDVRFHHFPIERQLNGTYAIAGGKAHCGPAELCEFYSRDPDGLPCNLRKPCNRPSGLEPQPGVFDCLRDAMVRDYVRQTWKLEGEALEQAIISQAPQVEKLIATTAHERMPWYHSSLTREEAERKLYSGAQTDGKFLLRPRKEQGTYALSLIYGKTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYCLKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYTPEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNLLIADIELGCGNFGSVRQGVYRMRKKQIDVAIKVLKQGTEKADTEEMMREAQIMHQLDNPYIVRLIGVCQAEALMLVMEMAGGGPLHKFLVGKREEIPVSNVAELLHQVSMGMKYLEEKNFVHRDLAARNVLLVNRHYAKISDFGLSKALGADDSYYTARSAGK.... The pIC50 is 5.3. (5) The compound is O=C(Nc1cc[nH]c(=O)c1)c1cc(F)ccc1Oc1ccccc1. The target protein (Q9Y5Y9) has sequence MEFPIGSLETNNFRRFTPESLVEIEKQIAAKQGTKKAREKHREQKDQEEKPRPQLDLKACNQLPKFYGELPAELIGEPLEDLDPFYSTHRTFMVLNKGRTISRFSATRALWLFSPFNLIRRTAIKVSVHSWFSLFITVTILVNCVCMTRTDLPEKIEYVFTVIYTFEALIKILARGFCLNEFTYLRDPWNWLDFSVITLAYVGTAIDLRGISGLRTFRVLRALKTVSVIPGLKVIVGALIHSVKKLADVTILTIFCLSVFALVGLQLFKGNLKNKCVKNDMAVNETTNYSSHRKPDIYINKRGTSDPLLCGNGSDSGHCPDGYICLKTSDNPDFNYTSFDSFAWAFLSLFRLMTQDSWERLYQQTLRTSGKIYMIFFVLVIFLGSFYLVNLILAVVTMAYEEQNQATTDEIEAKEKKFQEALEMLRKEQEVLAALGIDTTSLHSHNGSPLTSKNASERRHRIKPRVSEGSTEDNKSPRSDPYNQRRMSFLGLASGKRRAS.... The pIC50 is 5.5.
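The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration only: the function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical and are not part of BindingDB or of the bindingdb_ic50 dataset tooling, and the input is assumed to be already expressed in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units (M) to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from the examples in this record.
    for label in (3.0, 5.3, 5.5, 5.6, 7.5):
        ic50 = pic50_to_ic50(label)
        print(f"pIC50 {label} corresponds to IC50 = {ic50:.3e} M")
```

Under this definition, each unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean more potent binding.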