This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCCCCCCCCCCCCCCC(=O)N(C)[C@H](CO)C(=O)N[C@H](C)C(=O)NCC(=O)N(C)[C@@H]1C(=O)N[C@@H](C)C(=O)N[C@H](C(=O)O)Cc2ccc(O)c(c2)-c2cc1ccc2O. The target protein (P0A070) has sequence MKKELLEWIISIAVAFVILFIVGKFIVTPYTIKGESMDPTLKDGERVAVNIIGYKTGGLEKGNVVVFHANKNDDYVKRVIGVPGDKVEYKNDTLYVNGKKQDEPYLNYNLKHKQGDYITGTFQVKDLPNANPKSNVIPKGKYLVLGDNREVSKDSRAFGLIDEDQIVGKVSFRFWPFSEFKHNFNPENTKN. The pIC50 is 5.7. (2) The drug is COc1ccc2cc3[n+](cc2c1OCCCOc1ccc(Cl)cc1)CCc1cc2c(cc1-3)OCO2. The target protein (P0A031) has sequence MLEFEQGFNHLATLKVIGVGGGGNNAVNRMIDHGMNNVEFIAINTDGQALNLSKAESKIQIGEKLTRGLGAGANPEIGKKAAEESREQIEDAIQGADMVFVTSGMGGGTGTGAAPVVAKIAKEMGALTVGVVTRPFSFEGRKRQTQAAAGVEAMKAAVDTLIVIPNDRLLDIVDKSTPMMEAFKEADNVLRQGVQGISDLIAVSGEVNLDFADVKTIMSNQGSALMGIGVSSGENRAVEAAKKAISSPLLETSIVGAQGVLMNITGGESLSLFEAQEAADIVQDAADEDVNMIFGTVINPELQDEIVVTVIATGFDDKPTSHGRKSGSTGFGTSVNTSSNATSKDESFTSNSSNAQATDSVSERTHTTKEDDIPSFIRNREERRSRRTRR. The pIC50 is 4.1. (3) The target protein (Q64640) has sequence MAAADEPKPKKLKVEAPEALSENVLFGMGNPLLDISAVVDKDFLDKYSLKPNDQILAEDKHKELFDELVKKFKVEYHAGGSTQNSMKVAQWMIQEPHRAATFFGCIGIDKFGEILKSKAADAHVDAHYYEQNEQPTGTCAACITGGNRSLVANLAAANCYKKEKHLDLENNWMLVEKARVYYIAGFFLTVSPESVLKVARYAAENNRTFTLNLSAPFISQFFKEALMEVMPYVDILFGNETEAATFAREQGFETKDIKEIARKTQALPKVNSKRQRTVIFTQGRDDTIVATGNDVTAFPVLDQNQEEIVDTNGAGDAFVGGFLSQLVSNKPLTECIRAGHYAASVIIRRTGCTFPEKPDFH. The pIC50 is 7.1. The drug is CN(C)c1ccc(-c2cc(-c3cccc(C(N)=O)c3)c3c(N)ncnc3n2)cc1.
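
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers introduced here, and the nanomolar conversion is an assumed convenience for reading the labels.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 labels of the three examples above; a higher value means a lower IC50.
    for label in (5.7, 4.1, 7.1):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition the three labels correspond to IC50 values of roughly 2 µM (5.7), 79 µM (4.1), and 79 nM (7.1), so example (3) is the most potent pair and example (2) the least.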