This dataset contains drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50.
(1) The target protein (P10619) has sequence MIRAAPPPLFLLLLLLLLLVSWASRGEAAPDQDEIQRLPGLAKQPSFRQYSGYLKGSGSKHLHYWFVESQKDPENSPVVLWLNGGPGCSSLDGLLTEHGPFLVQPDGVTLEYNPYSWNLIANVLYLESPAGVGFSYSDDKFYATNDTEVAQSNFEALQDFFRLFPEYKNNKLFLTGESYAGIYIPTLAVLVMQDPSMNLQGLAVGNGLSSYEQNDNSLVYFAYYHGLLGNRLWSSLQTHCCSQNKCNFYDNKDLECVTNLQEVARIVGNSGLNIYNLYAPCAGGVPSHFRYEKDTVVVQDLGNIFTRLPLKRMWHQALLRSGDKVRMDPPCTNTTAASTYLNNPYVRKALNIPEQLPQWDMCNFLVNLQYRRLYRSMNSQYLKLLSSQKYQILLYNGDVDMACNFMGDEWFVDSLNQKMEVQRRPWLVKYGDSGEQIAGFVKEFSHIAFLTIKGAGHMVPTDKPLAAFTMFSRFLNKQPY. The pIC50 is 5.0. The drug is Cc1ccccc1[C@H](CC(=O)O)NC(=O)c1cccnc1.
(2) The compound is CNS(=O)(=O)NC(=O)CCCc1c(-c2ccc(F)cc2)[nH]c2ccc(C#N)cc12. The target protein (P35344) has sequence MQEFTWENYSYEDFFGDFSNYSYSTDLPPTLLDSAPCRSESLETNSYVVLITYILVFLLSLLGNSLVMLVILYSRSTCSVTDVYLLNLAIADLLFATTLPIWAASKVHGWTFGTPLCKVVSLVKEVNFYSGILLLACISVDRYLAIVHATRTMIQKRHLVKFICLSMWGVSLILSLPILLFRNAIFPPNSSPVCYEDMGNSTAKWRMVLRILPQTFGFILPLLVMLFCYVFTLRTLFQAHMGQKHRAMRVIFAVVLIFLLCWLPYNLVLLTDTLMRTHVIQETCERRNDIDRALDATEILGFLHSCLNPIIYAFIGQKFRYGLLKILAAHGLISKEFLAKESRPSFVASSSGNTSTTL. The pIC50 is 5.9.
(3) The compound is O=C(N[C@@H](Cc1ccccc1)P(=O)(O)CC(Cc1ccccc1)C(=O)NC(Cc1ccccc1)C(=O)O)OCc1ccccc1. The target protein (P08473) has sequence MGKSESQMDITDINTPKPKKKQRWTPLEISLSVLVLLLTIIAVTMIALYATYDDGICKSSDCIKSAARLIQNMDATTEPCTDFFKYACGGWLKRNVIPETSSRYGNFDILRDELEVVLKDVLQEPKTEDIVAVQKAKALYRSCINESAIDSRGGEPLLKLLPDIYGWPVATENWEQKYGASWTAEKAIAQLNSKYGKKVLINLFVGTDDKNSVNHVIHIDQPRLGLPSRDYYECTGIYKEACTAYVDFMISVARLIRQEERLPIDENQLALEMNKVMELEKEIANATAKPEDRNDPMLLYNKMTLAQIQNNFSLEINGKPFSWLNFTNEIMSTVNISITNEEDVVVYAPEYLTKLKPILTKYSARDLQNLMSWRFIMDLVSSLSRTYKESRNAFRKALYGTTSETATWRRCANYVNGNMENAVGRLYVEAAFAGESKHVVEDLIAQIREVFIQTLDDLTWMDAETKKRAEEKALAIKERIGYPDDIVSNDNKLNNEYLEL.... The pIC50 is 6.1.
(4) The compound is CCN(Cc1nc2c(c(=O)[nH]1)COCC2)C(=O)CCc1ccccc1. The target protein sequence is SPDDKEFQSVEEEMQSTVREHRDGGHAGGIFNRYNILKIQKVCNKKLWERYTHRRKEVSEENHNHANERMLFHGSPFVNAIIHKGFDERHAYIGGMFGAGIYFAENSSKSNQYVYGIGGGTGCPVHKDRSCYICHRQLLFCRVTLGKSFLQFSAMKMAHSPPGHHSVTGRPSVNGLALAEYVIYRGEQAYPEYLITYQIMRPEGMV. The pIC50 is 5.3.
(5) The compound is CCCC[C@H](NS(=O)(=O)c1ccc(F)cc1)C(=O)N[C@H](C=O)CC(C)C. The target protein (P97571) has sequence MAEELITPVYCTGVSAQVQKQRDKELGLGRHENAIKYLGQDYENLRARCLQNGVLFQDDAFPPVSHSLGFKELGPNSSKTYGIKWKRPTELLSNPQFIVDGATRTDICQGALGDCWLLAAIASLTLNETILHRVVPYGQSFQEGYAGIFHFQLWQFGEWVDVVVDDLLPTKDGKLVFVHSAQGNEFWSALLEKAYAKVNGSYEALSGGCTSEAFEDFTGGVTEWYDLQKAPSDLYQIILKALERGSLLGCSINISDIRDLEAITFKNLVRGHAYSVTDAKQVTYQGQRVNLIRMRNPWGEVEWKGPWSDNSYEWNKVDPYEREQLRVKMEDGEFWMSFRDFIREFTKLEICNLTPDALKSRTLRNWNTTFYEGTWRRGSTAGGCRNYPATFWVNPQFKIRLEEVDDADDYDSRESGCSFLLALMQKHRRRERRFGRDMETIGFAVYQVPRELAGQPVHLKRDFFLANASRAQSEHFINLREVSNRIRLPPGEYIVVPSTF.... The pIC50 is 6.6.
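
The pIC50 definition used in the records above, pIC50 = -log10(IC50 in M), can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset or any BindingDB tooling: the function names `ic50_to_pic50` and `pic50_to_ic50`, the unit table, and the default unit of nM are all assumptions made for the example.

```python
import math

# Hypothetical unit table (an assumption for this sketch): factor that
# converts a concentration in the given unit to mol/L.
_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ic50_to_pic50(ic50, unit="nM"):
    """Return pIC50 = -log10(IC50 in M) for an IC50 given in `unit`."""
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50 * _TO_MOLAR[unit])


def pic50_to_ic50(pic50, unit="nM"):
    """Invert the transform: return the IC50 in `unit` for a given pIC50."""
    return 10.0 ** (-pic50) / _TO_MOLAR[unit]


if __name__ == "__main__":
    # An IC50 of 10 uM is 1e-5 M, which corresponds to a pIC50 of 5.
    print(round(ic50_to_pic50(10.0, unit="uM"), 3))
    # A pIC50 of 6.6 corresponds to an IC50 of roughly 251 nM.
    print(round(pic50_to_ic50(6.6, unit="nM"), 1))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent binders.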