Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50 (drug-target binding data from BindingDB using IC50 measurements). (1) The compound is COc1ccccc1NS(=O)(=O)c1cc(NC(=O)c2ccc(NC(C)=O)cc2)ccc1N1CCOCC1. The target protein (Q13285) has sequence MDYSYDEDLDELCPVCGDKVSGYHYGLLTCESCKGFFKRTVQNNKHYTCTESQSCKIDKTQRKRCPFCRFQKCLTVGMRLEAVRADRMRGGRNKFGPMYKRDRALKQQKKAQIRANGFKLETGPPMGVPPPPPPAPDYVLPPSLHGPEPKGLAAGPPAGPLGDFGAPALPMAVPGAHGPLAGYLYPAFPGRAIKSEYPEPYASPPQPGLPYGYPEPFSGGPNVPELILQLLQLEPDEDQVRARILGCLQEPTKSRPDQPAAFGLLCRMADQTFISIVDWARRCMVFKELEVADQMTLLQNCWSELLVFDHIYRQVQHGKEGSILLVTGQEVELTTVATQAGSLLHSLVLRAQELVLQLLALQLDRQEFVCLKFIILFSLDLKFLNNHILVKDAQEKANAALLDYTLCHYPHCGDKFQQLLLCLVEVRALSMQAKEYLYHKHLGNEMPRNNLLIEMLQAKQT. The pIC50 is 4.2. (2) The target protein (P03409) has sequence MAHFPGFGQSLLFGYPVYVFGDCVQGDWCPISGGLCSARLHRHALLATCPEHQITWDPIDGRVIGSALQFLIPRLPSFPTQRTSKTLKVLTPPITHTTPNIPPSFLQAMRKYSPFRNGYMEPTLGQHLPTLSFPDPGLRPQNLYTLWGGSVVCMYLYQLSPPITWPLLPHVIFCHPGQLGAFLTNVPYKRIEELLYKISLTTGALIILPEDCLPTTLFQPARAPVTLTAWQNGLLPFHSTLTTPGLIWTFTDGTPMISGPCPKDGQPSLVLQSSSFIFHKFQTKAYHPSFLLSHGLIQYSSFHSLHLLFEEYTNIPISLLFNEKEADDNDHEPQISPGGLEPPSEKHFRETEV. The compound is COc1c(F)cccc1C(=O)N1CCc2c(C)c(-c3cc(F)c4c(c3C)CCCO4)c([C@H](OC(C)(C)C)C(=O)O)c(C)c2CC1. The pIC50 is 8.3.
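The pIC50 definition used in the task statement, pIC50 = -log10(IC50 in M), can be sketched in code as a pair of conversions. This is a minimal illustration and not part of the dataset: the function names `ic50_molar_to_pic50` and `pic50_to_ic50_molar` are hypothetical, and the sketch assumes the IC50 is already expressed in mol/L.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # The two labels that appear in the examples above.
    for pic50 in (4.2, 8.3):
        ic50 = pic50_to_ic50_molar(pic50)
        print(f"pIC50 {pic50} corresponds to IC50 = {ic50:.3e} M")
```

Under this definition a pIC50 of 4.2 corresponds to an IC50 of roughly 63 micromolar and a pIC50 of 8.3 to roughly 5 nanomolar, which is why the higher value indicates the more potent compound.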