Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P11802) has sequence MATSRYEPVAEIGVGAYGTVYKARDPHSGHFVALKSVRVPNGGGGGGGLPISTVREVALLRRLEAFEHPNVVRLMDVCATSRTDREIKVTLVFEHVDQDLRTYLDKAPPPGLPAETIKDLMRQFLRGLDFLHANCIVHRDLKPENILVTSGGTVKLADFGLARIYSYQMALTPVVVTLWYRAPEVLLQSTYATPVDMWSVGCIFAEMFRRKPLFCGNSEADQLGKIFDLIGLPPEDDWPRDVSLPRGAFPPRGPRPVQSVVPEMEESGAQLLLEMLTFNPHKRISAFRALQHSYLHKDEGNPE. The small molecule is COCc1cccc2c(C3=C(c4coc5ccccc45)C(=O)NC3=O)cn(C)c12. The pIC50 is 5.1. (2) The small molecule is NC(Cc1ccccc1)C(=O)NC1CCCCNC(=O)CCNC(=O)C(Cc2ccc(O)cc2)NC1=O. The target protein sequence is MGSPWNGSDGPEDAREPPWAALPPCDERRCSPFPLGTLVPVTAVCLGLFAVGVSGNVVTVLLIGRYRDMRTTTNLYLGSMAVSDLLILLGLPFDLYRLWRSRPWVFGQLLCRLSLYVGEGCTYASLLHMTALSVERYLAICRPLRARVLVTRRRVRALIAALWAVALLSAGPFFFLVGVEQDPAVFAAPDRNGTVPLDPSSPAPASPPSGPGAEAAALFSRECRPSRAQLGLLRVMLWVTTAYFFLPFLCLSILYGLIARQLWRGRGPLRGPAATGRERGHRQTVRVLLVVVLAFIVCWLPFHVGRIIYINTQDSRMMYFSQYFNIVALQLFYLSASINPILYNLISKKYRAAARRLLRESRAGPSGVCGSRGPEQDVAGDTGGDTAGCTETSANTKTAA. The pIC50 is 4.4. (3) The compound is Cc1nsc(Nc2ccn3nccc3n2)c1C(=O)Nc1ccc(F)c(F)c1. The pIC50 is 4.7. The target protein sequence is DAAIAEDPPDAIAGLQAEWMQMSSLGTVDAPNFIVGNPWDDKLIFKLLSGLSKPVSSYPNTFEWQCKLPAIKPKTEFQLGSKLVYVHHLLGEGAFAQVYEATQGDLNDAKNKQKFVLKVQKPANPWEFYIGTQLMERLKPSMQHMFMKFYSAHLFQNGSVLVGELYSYGTLLNAINLYKNTPEKVMPQGLVISFAMRMLYMIEQVHDCEIIHGDIKPDNFILGNGFLEQDDEDDLSAGLALIDLGQSIDMKLFPKGTIFTAKCETSGFQCVEMLSNKPWNYQIDYFGVAATVYCMLFGTYMKVKNEGGECKPEGLFRRLPHLDMWNEFFHVMLNIPDCHHLPSLDLLRQKLKKVFQQHYTNKIRALRNRLIVLLLECKRSRK. (4) The drug is NC(N)=NN=Cc1ccc2ccccc2c1. 
The target protein (P03374) has sequence MPNHQSGSPTGSSDLLLSGKKQRPHLALRRKRRREMRKINRKVRRMNLAPIKEKTAWQHLQALISEAEEVLKTSQTPQNSLTLFLALLSVLGPPPVTGESYWAYLPKPPILHPVGWGSTDPIRVLTNQTMYLGGSPDFHGFRNMSGNVHFEGKSDTLPICFSFSFSTPTGCFQVDKQVFLSDTPTVDNNKPGGKGDKRRMWELWLHTLGNSGANTKLVPIKKKLPPKYPHCQIAFKKDAFWEGDESAPPRWLPCAFPDKGVSFSPKGALGLLWDFSLPSPSVDQSDQIKSKKDLFGNYTPPVNKEVHRWYEAGWVEPTWFWENSPKDPNDRDFTALVPHTELFRLVAASRHLILKRPGFQEHEMIPTSACVTYPYAILLGLPQLIDIEKRGSTFHISCSSCRLTNCLDSSAYDYAAIIVKRPPYVLLPVDIGDEPWFDDSAIQTFRYATDLIRAKRFVAAIILGISALIAIITSFAVATTALVKEMQTATFVNNLHRNVT.... The pIC50 is 5.0. (5) The small molecule is CN(C=O)[C@H]1CC[C@H]2[C@@H]3CC=C4C[C@@H](O)CC[C@]4(C)[C@H]3CC[C@@]21C. The target protein (P05093) has sequence MWELVALLLLTLAYLFWPKRRCPGAKYPKSLLSLPLVGSLPFLPRHGHMHNNFFKLQKKYGPIYSVRMGTKTTVIVGHHQLAKEVLIKKGKDFSGRPQMATLDIASNNRKGIAFADSGAHWQLHRRLAMATFALFKDGDQKLEKIICQEISTLCDMLATHNGQSIDISFPVFVAVTNVISLICFNTSYKNGDPELNVIQNYNEGIIDNLSKDSLVDLVPWLKIFPNKTLEKLKSHVKIRNDLLNKILENYKEKFRSDSITNMLDTLMQAKMNSDNGNAGPDQDSELLSDNHILTTIGDIFGAGVETTTSVVKWTLAFLLHNPQVKKKLYEEIDQNVGFSRTPTISDRNRLLLLEATIREVLRLRPVAPMLIPHKANVDSSIGEFAVDKGTEVIINLWALHHNEKEWHQPDQFMPERFLNPAGTQLISPSVSYLPFGAGPRSCIGEILARQELFLIMAWLLQRFDLEVPDDGQLPSLEGIPKVVFLIDSFKVKIKVRQAWR.... The pIC50 is 6.5.
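The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the assumption that the raw IC50 is supplied in nanomolar is ours, since the record itself only gives pIC50 values.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition: pIC50 -> IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # Label values taken from the examples in this record.
    for label in (5.1, 4.4, 4.7, 5.0, 6.5):
        print(f"pIC50 {label} ~ IC50 {pic50_to_ic50_nm(label):.0f} nM")
```

Under this definition a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.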