This drug-target binding data comes from BindingDB and uses IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COCCOc1cc2ncc3c(N)nc(N(C)C)cc3c2cc1OC. The target protein sequence is GPMDGTAAEPRPGAGSLQHAQPPPQPRKKRPEDFKFGKILGEGSFSTVVLARELATSREYAIKILEKRHIIKENKVPYVTRERDVMSRLDHPFFVKLYFTFQDDEKLYFGLSYAKNGELLKYIRKIGSFDETCTRFYTAEIVSALEYLHGKGIIHRDLKPENILLNEDMHIQITDFGTAKVLSPESKQARANSFVGTAQYVSPELLTEKSACKSSDLWALGCIIYQLVAGLPPFRAGNEYLIFQKIIKLEYDFPEKFFPKARDLVEKLLVLDATKRLGCEEMEGYGPLKAHPFFESVTWENLHQQTPPKLT. The pIC50 is 6.1. (2) The compound is C[C@]12CC[C@H](O)C[C@H]1CC[C@@H]1[C@@H]2CC[C@]2(C)[C@@H](/C=C/C3=CC(=O)OC3)CC[C@]12O. The target protein (P50997) has sequence MGKGVGRDKYEPAAVSEHGDKKKAKKERDMDELKKEVSMDDHKLSLDELHRKYGTDLSRGLTTARAAEILARDGPNALTPPPTTPEWVKFCRQLFGGFSMLLWIGAILCFLAYGIQAATEEEPQNDNLYLGVVLSAVVIITGCFSYYQEAKSSKIMESFKNMVPQQALVIRNGEKMSINAEEVVIGDLVEVKGGDRIPADLRIISANGCKVDNSSLTGESEPQTRSPDFTNENPLETRNIAFFSTNCVKGTARGIVVYTGDRTVMGRIATLASGLEGGQTPIAAEIEHFIHIITGVAVFLGVSFFILSLILEYTWLEAVIFLIGIIVANVPEGLLATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIHEADTTENQSGVSFDKSSATWLALSRIAGLCNRAVFQANQENLPILKRAVAGDASESALLKCIELCCGSVKEMRDRYAKIVEIPFNSTNKYQLSIHKNPNTSEPR.... The pIC50 is 5.4. (3) The small molecule is COc1ccc(S(=O)(=O)NCc2c(-c3ccccc3)noc2C)c(OC)c1. 
The target protein (P03951) has sequence MIFLYQVVHFILFTSVSGECVTQLLKDTCFEGGDITTVFTPSAKYCQVVCTYHPRCLLFTFTAESPSEDPTRWFTCVLKDSVTETLPRVNRTAAISGYSFKQCSHQISACNKDIYVDLDMKGINYNSSVAKSAQECQERCTDDVHCHFFTYATRQFPSLEHRNICLLKHTQTGTPTRITKLDKVVSGFSLKSCALSNLACIRDIFPNTVFADSNIDSVMAPDAFVCGRICTHHPGCLFFTFFSQEWPKESQRNLCLLKTSESGLPSTRIKKSKALSGFSLQSCRHSIPVFCHSSFYHDTDFLGEELDIVAAKSHEACQKLCTNAVRCQFFTYTPAQASCNEGKGKCYLKLSSNGSPTKILHGRGGISGYTLRLCKMDNECTTKIKPRIVGGTASVRGEWPWQVTLHTTSPTQRHLCGGSIIGNQWILTAAHCFYGVESPKILRVYSGILNQSEIKEDTSFFGVQEIIIHDQYKMAESGYDIALLKLETTVNYTDSQRPIC.... The pIC50 is 4.3. (4) The small molecule is COc1cc(Br)c(S(=O)(=O)Cc2ccc(Cl)c(O[C@@H]3CCN(C)C3)c2)cc1OC. The target protein (Q8VIH9) has sequence MALSLESTSFPMLAVSRSTASELPGGFNVSHNSSWTGPTDPSSLQDLVATGVIGAVLSTMGVVGVVGNVYTLVVMCRFLRASASMYVYVVNLALADLLYLLSIPFIVATYVTKDWHFGDVGCRVLFSLDFLTMHASIFTLTIMSSERYAAVLRPLDTVQRSKGYRKLLALGTWLLALLLTLPMMLAIRLVRRGSKSLCLPAWGPRAHRTYLTLLFGTSIVGPGLVIGLLYIRLARAYWLSQQASFKQTRRLPNPRVLYLILGIVLLFWACFLPFWLWQLLAQYHQAMPLTPETARIINYLTACLTYGNSCINPFLYTLLTKNYREYLRGRQRSLGSSCRGPGSAGSFLSSRVHLQQDSGRSLSSNSQQATETLVLSPVPPNGAFV. The pIC50 is 5.3. (5) The small molecule is CSCC[C@H](NC(=O)[C@H](Cc1ccc(O)c(C)c1C)NC(=O)[C@@H](NC(=O)[C@H](CS)NC=O)C(C)C)C(=O)O. The target protein (P29702) has sequence MAAADGVGEAAQGGDPGQPEPPPPPQPHPPPPPPQPPQEEAAAASPIDDGFLSLDSPTYVLYRDRPEWADIDPVPQNDGPNPVVQIIYSEKFQDVYDYFRAVLQRDERSERAFKLTRDAIELNAANYTVWHFRRVLLKSLQKDLHEEMNYISAIIEEQPKNYQVWHHRRVLVEWLRDPSQELEFIADILTQDAKNYHAWQHRQWVIQEFKLWDNELQYVDQLLKEDVRNNSVWNQRYFVISNTTGYNDRAILEREVQYTLEMIKLVPHNESAWNYLKGILQDRGLSKYPNLLNQLLDLQPSHSSPYLIAFLVDIYEDMLENQCDNKEDILNKALELCEILAKEKDTIRKEYWRYIGRSLQSKHSTESDPPTNVQQ. The pIC50 is 5.0. (6) The small molecule is CC[C@H](NC(=O)[C@@H]1C[C@@H](S(=O)(=O)c2ccc(Br)cc2C(F)(F)F)CN1c1cc(C)nn1C1CCC1)C(=O)C(=O)NC1CC1. 
The target protein (P07711) has sequence MNPTLILAAFCLGIASATLTFDHSLEAQWTKWKAMHNRLYGMNEEGWRRAVWEKNMKMIELHNQEYREGKHSFTMAMNAFGDMTSEEFRQVMNGFQNRKPRKGKVFQEPLFYEAPRSVDWREKGYVTPVKNQGQCGSCWAFSATGALEGQMFRKTGRLISLSEQNLVDCSGPQGNEGCNGGLMDYAFQYVQDNGGLDSEESYPYEATEESCKYNPKYSVANDTGFVDIPKQEKALMKAVATVGPISVAIDAGHESFLFYKEGIYFEPDCSSEDMDHGVLVVGYGFESTESDNNKYWLVKNSWGEEWGMGGYVKMAKDRRNHCGIASAASYPTV. The pIC50 is 6.7. (7) The compound is O=c1oc(Cl)c(Cl)c2ccccc12. The target protein (P09391) has sequence MLMITSFANPRVAQAFVDYMATQGVILTIQQHNQSDVWLADESQAERVRAELARFLENPADPRYLAASWQAGHTGSGLHYRRYPFFAALRERAGPVTWVMMIACVVVFIAMQILGDQEVMLWLAWPFDPTLKFEFWRYFTHALMHFSLMHILFNLLWWWYLGGAVEKRLGSGKLIVITLISALLSGYVQQKFSGPWFGGLSGVVYALMGYVWLRGERDPQSGIYLQRGLIIFALIWIVAGWFDLFGMSMANGAHIAGLAVGLAMAFVDSLNARKRK. The pIC50 is 4.7.
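
As a minimal sketch of the pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)), the conversion in both directions could look like the following. The function names `ic50_molar_to_pic50` and `pic50_to_ic50_nanomolar` are hypothetical and are not part of the bindingdb_ic50 dataset or any BindingDB tooling.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition and report the IC50 in nM (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # An IC50 of 1e-6 M (1 micromolar) corresponds to a pIC50 of about 6.
    print(ic50_molar_to_pic50(1e-6))
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(pic50_to_ic50_nanomolar(6.7) < pic50_to_ic50_nanomolar(4.7))
```

Because the scale is logarithmic, a difference of one pIC50 unit between two records corresponds to a tenfold difference in IC50.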