This drug-target binding data is from BindingDB and uses IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The prediction target is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC[C@H]1NC[C@H](O)[C@@H]1O. The target protein (Q14697) has sequence MAAVAAVAARRRRSWASLVLAFLGVCLGITLAVDRSNFKTCEESSFCKRQRSIRPGLSPYRALLDSLQLGPDSLTVHLIHEVTKVLLVLELQGLQKNMTRFRIDELEPRRPRYRVPDVLVADPPIARLSVSGRDENSVELTMAEGPYKIILTARPFRLDLLEDRSLLLSVNARGLLEFEHQRAPRVSQGSKDPAEGDGAQPEETPRDGDKPEETQGKAEKDEPGAWEETFKTHSDSKPYGPMSVGLDFSLPGMEHVYGIPEHADNLRLKVTEGGEPYRLYNLDVFQYELYNPMALYGSVPVLLAHNPHRDLGIFWLNAAETWVDISSNTAGKTLFGKMMDYLQGSGETPQTDVRWMSETGIIDVFLLLGPSISDVFRQYASLTGTQALPPLFSLGYHQSRWNYRDEADVLEVDQGFDDHNLPCDVIWLDIEHADGKRYFTWDPSRFPQPRTMLERLASKRRKLVAIVDPHIKVDSGYRVHEELRNLGLYVKTRDGSDYEG.... The pIC50 is 3.1. (2) The small molecule is CC(C)(C)c1ccc(C(=O)N/N=C/c2cc(Br)c(O)c(Br)c2O)cc1. The target protein (O25928) has sequence MEQSHQNLQSQFFIEHILQILPHRYPMLLVDRIIELQANKKIVAYKNITFNEDVFNGHFPNKPIFPGVLIVEGMAQTGGFLAFTSLWGFDPEIAKTKIVYFMTIDKVKFRIPVTPGDRLEYHLEVLKHKGMIWQVGGTAQVDGKVVAEAELKAMIAERD. The pIC50 is 5.8. (3) The small molecule is CN(c1c(N)n(Cc2ccccc2)c(=O)[nH]c1=O)S(=O)(=O)c1ccccc1. The target protein sequence is MKRKGIILAGGSGTRLHPATLAISKQLLPVYDKPMIYYPLSTLMLAGIREILIISTPQDTPRFQQLLGDGSNWGLDLQYAVQPSPDGLAQAFLIGESFIGNDLSALVLGDNLYYGHDFHELLGSASQRQTGASVFAYHVLDPERYGVVEFDQGGKAISLEEKPLEPKSNYAVTGLYFYDQQVVDIARDLKPSPRGELEITDVNRAYLERGQLSVEIMGRGYAWLDTGTHDSLLEAGQFIATLENRQGLKVACPEEIAYRQKWIDAAQLEKLAAPLAKNGYGQYLKRLLTETVY. The pIC50 is 7.1. (4) The drug is CN1C(=O)c2cc(Nc3ccccc3)c(Nc3ccccc3)cc2C1=O. 
The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAQLDQFERIKTLGTGSFGRVMLSKHKETGNHYAMKILDKQKVVKLKQIEHTLNVKRILQAVNFPFLVKLEFSFKENSNLYMVMEYVPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIQQTDFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAADYPPFFADQPIQIYEKIVYGKVRFPSHFSSDLKDLLRNLQQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 3.3. (5) The small molecule is CC[C@H](C)[C@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)CNC(=O)CNC(=O)[C@H](CC1CCCCC1)NC(=O)[C@H](C)N)[C@@H](C)CC)C(=O)NCC(N)=O. The target protein (P17342) has sequence MPSLLVLTFSPCVLLGWALLAGGTGGGGVGGGGGGAGIGGGRQEREALPPQKIEVLVLLPQDDSYLFSLTRVRPAIEYALRSVEGNGTGRRLLPPGTRFQVAYEDSDCGNRALFSLVDRVAAARGAKPDLILGPVCEYAAAPVARLASHWDLPMLSAGALAAGFQHKDSEYSHLTRVAPAYAKMGEMMLALFRHHHWSRAALVYSDDKLERNCYFTLEGVHEVFQEEGLHTSIYSFDETKDLDLEDIVRNIQASERVVIMCASSDTIRSIMLVAHRHGMTSGDYAFFNIELFNSSSYGDGSWKRGDKHDFEAKQAYSSLQTVTLLRTVKPEFEKFSMEVKSSVEKQGLNMEDYVNMFVEGFHDAILLYVLALHEVLRAGYSKKDGGKIIQQTWNRTFEGIAGQVSIDANGDRYGDFSVIAMTDVEAGTQEVIGDYFGKEGRFEMRPNVKYPWGPLKLRIDENRIVEHTNSSPCKSSGGLEESAVTGIVVGALLGAGLLMA.... The pIC50 is 7.9. (6) The small molecule is CCOC(=O)CCCC/[N+]([O-])=C(/c1cccnc1)c1cccc(COC(=O)N2CCc3c(sc4c3C(c3ccccc3Cl)=NC(C)c3nnc(C)n3-4)C2)c1. The target protein (P24557) has sequence MEALGFLKLEVNGPMVTVALSVALLALLKWYSTSAFSRLEKLGLRHPKPSPFIGNLTFFRQGFWESQMELRKLYGPLCGYYLGRRMFIVISEPDMIKQVLVENFSNFTNRMASGLEFKSVADSVLFLRDKRWEEVRGALMSAFSPEKLNEMVPLISQACDLLLAHLKRYAESGDAFDIQRCYCNYTTDVVASVAFGTPVDSWQAPEDPFVKHCKRFFEFCIPRPILVLLLSFPSIMVPLARILPNKNRDELNGFFNKLIRNVIALRDQQAAEERRRDFLQMVLDARHSASPMGVQDFDIVRDVFSSTGCKPNPSRQHQPSPMARPLTVDEIVGQAFIFLIAGYEIITNTLSFATYLLATNPDCQEKLLREVDVFKEKHMAPEFCSLEEGLPYLDMVIAETLRMYPPAFRFTREAAQDCEVLGQRIPAGAVLEMAVGALHHDPEHWPSPETFNPERFTAEARQQHRPFTYLPFGAGPRSCLGVRLGLLEVKLTLLHVLHKF.... The pIC50 is 7.2.
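
The pIC50 definition stated at the top of this record can be illustrated with a short sketch. The function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of any dataset tooling. The sketch assumes IC50 values are already expressed in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper; assumes the input is already in molar units.
    """
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 labels taken from the six examples in this record.
    for label in (3.1, 5.8, 7.1, 3.3, 7.9, 7.2):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3e} M")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50. For instance, the label 3.1 in example (1) corresponds to an IC50 of roughly 0.8 mM, while the label 7.9 in example (5) corresponds to an IC50 in the low nanomolar range.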