Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1c(Cl)cccc1CCCCOc1ccc(C#Cc2ccc(F)c3c(CCCC(=O)O)c(C)n(CCCC(=O)O)c23)cc1. The target protein sequence is MEPNNSSSRNCMIQESFKKEFYPVTYLVIFVWGALGNGLSIYVFLQTYKKSTSANVFMLNLAMSDLLFISTLPFRAHYYLNNSNWIFGDVPCRIMSYSLYVNMYTSIYFLTVLSVVRFLATVHPFRLLHVTSFRSAWILCGIIWIFTMASAAVLLMHGSEPKNSITTCLELDIRKVGKLKVMNHIALVVGFLLPFFTLSICYLLVIRVLLKVEIPESTLRASHRKALITIIIALITFLLCFLPYHTLRTLHLITWNKDSCGNGLHKAVVITLALAAANSCVNPFLYYFAGENFKDKLKAVFIKDHPQKAKCSFPICL. The pIC50 is 7.5. (2) The pIC50 is 6.6. The target protein (Q16873) has sequence MKDEVALLAAVTLLGVLLQAYFSLQVISARRAFRVSPPLTTGPPEFERVYRAQVNCSEYFPLFLATLWVAGIFFHEGAAALCGLVYLFARLRYFQGYARSAQLRLAPLYASARALWLLVALAALGLLAHFLPAALRAALLGRLRTLLPWA. The small molecule is COCC(C1CC1)N(c1ccc(Cl)cc1)c1cnc(C(=O)C2CC2C(=O)O)c(OC)c1. (3) The drug is O=C(O)Cc1cccn1-c1ccccc1F. The target protein (Q8VD48) has sequence MLLWVLALLFLCAFLWNYKGQLKIADIADKYIFITGCDSGFGNLAARTFDRKGFRVIAACLTESGSEALKAKTSERLHTVLLDVTNPENVKETAQWVKSHVGEKGLWGLINNAGVLGVLAPTDWLTVDDYREPIEVNLFGLINVTLNMLPLVKKARGRVINVSSIGGRLAFGGGGYTPSKYAVEGFNDSLRRDMKAFGVHVSCIEPGLFKTGLADPIKTTEKKLAIWKHLSPDIKQQYGEGYIEKSLHRLKSSTSSVNLDLSLVVECMDHALTSLFPKTRYTAGKDAKTFWIPLSHMPAALQDFLLLKEKVELANPQAV. The pIC50 is 4.7. (4) The drug is C[C@@H]1CC(=O)NN=C1c1ccc(NC2=C(Cc3cccc(Br)c3)C(=O)CCC2)cc1. The pIC50 is 9.4. 
The target protein sequence is MGAFSGSCRPKINPLTPFPGFYPCSEIEDPAEKGDRKLNKGLNRNSLPTPQLRRSSGTSGLLPVEQSSRWDRNNGKRPHQEFGISSQGCYLNGPFNSNLLTIPKQRSSSVSLTHHVGLRRAGVLSSLSPVNSSNHGPVSTGSLTNRSPIEFPDTADFLNKPSVILQRSLGNAPNTPDFYQQLRNSDSNLCNSCGHQMLKYVSTSESDGTDCCSGKSGEEENIFSKESFKLMETQQEEETEKKDSRKLFQEGDKWLTEEAQSEQQTNIEQEVSLDLILVEEYDSLIEKMSNWNFPIFELVEKMGEKSGRILSQVMYTLFQDTGLLEIFKIPTQQFMNYFRALENGYRDIPYHNRIHATDVLHAVWYLTTRPVPGLQQIHNGCGTGNETDSDGRINHGRIAYISSKSCSNPDESYGCLSSNIPALELMALYVAAAMHDYDHPGRTNAFLVATNAPQAVLYNDRSVLENHHAASAWNLYLSRPEYNFLLHLDHVEFKRFRFLV.... (5) The drug is COC(=O)c1ccc(C(=O)N2CC[C@](c3ccc(C(OCc4c(F)cccc4F)(C(F)(F)F)C(F)(F)F)cc3)(S(=O)(=O)c3ccc(F)cc3)C2)cc1. The target protein sequence is APYASLTEIEHLVQSVCKSYRETCQLRLEDLLRQRSNIFSREEVTGYQRKSMWEMWERCAHHLTEAIQYVVEFAKRLSGFMELCQNDQIVLLKAGAMEVVLVRMCRAYNADNRTVFFEGKYGGMELFRALGCSELISSIFDFSHSLSALHFSEDEIALYTALVLINAHRPGLQEKRKVEQLQYNLELAFHHHLCKTHRQSILAKLPPKGKLRSLCSQHVERLQIFQHLHPIVVQAAFPPLYKELFS. The pIC50 is 6.7. (6) The drug is CCOC(=O)[C@H]1O[C@@H]1C(=O)N[C@H](C(=O)NCCc1ccc(O)cc1)C(C)CC. The target protein (P10605) has sequence MWWSLILLSCLLALTSAHDKPSFHPLSDDLINYINKQNTTWQAGRNFYNVDISYLKKLCGTVLGGPKLPGRVAFGEDIDLPETFDAREQWSNCPTIGQIRDQGSCGSCWAFGAVEAISDRTCIHTNGRVNVEVSAEDLLTCCGIQCGDGCNGGYPSGAWSFWTKKGLVSGGVYNSHVGCLPYTIPPCEHHVNGSRPPCTGEGDTPRCNKSCEAGYSPSYKEDKHFGYTSYSVSNSVKEIMAEIYKNGPVEGAFTVFSDFLTYKSGVYKHEAGDMMGGHAIRILGWGVENGVPYWLAANSWNLDWGDNGFFKILRGENHCGIESEIVAGIPRTDQYWGRF. The pIC50 is 6.1. (7) The compound is CC(O[C@@H]1CC[C@@H](O)C[C@@H]1O)(P(=O)(O)O)P(=O)(O)O. The target protein (P20456) has sequence MADPWQECMDYAVTLAGQAGEVVREALKNEMNIMVKSSPADLVTATDQKVEKMLITSIKEKYPSHSFIGEESVAAGEKSILTDNPTWIIDPIDGTTNFVHGFPFVAVSIGFVVNKKMEFGIVYSCLEDKMYTGRKGKGAFCNGQKLQVSHQEDITKSLLVTELGSSRTPETVRIILSNIERLLCLPIHGIRGVGTAALNMCLVAAGAADAYYEMGIHCWDVAGAGIIVTEAGGVLLDVTGGPFDLMSRRVIASSNKTLAERIAKEIQIIPLQRDDED. The pIC50 is 5.4. (8) The drug is O=C(N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N1CCC[C@H]1C(=O)O)[C@@H](CS)Cc1ccccc1. 
The target protein (P12822) has sequence MGAAPGRRGPRLLRPPPPLLLLLLLLRPPPAALTLDPGLLPGDFAADEAGARLFASSYNSSAEQVLFRSTAASWAHDTNITAENARRQEEEALLSQEFAEAWGKKAKELYDPVWQNFTDPELRRIIGAVRTLGPANLPLAKRQQYNSLLSNMSQIYSTGKVCFPNKTASCWSLDPDLNNILASSRSYAMLLFAWEGWHNAVGIPLKPLYQEFTALSNEAYRQDGFSDTGAYWRSWYDSPTFEEDLERIYHQLEPLYLNLHAYVRRVLHRRYGDRYINLRGPIPAHLLGNMWAQSWESIYDMVVPFPDKPNLDVTSTMVQKGWNATHMFRVAEEFFTSLGLLPMPPEFWAESMLEKPEDGREVVCHASAWDFYNRKDFRIKQCTQVTMDQLSTVHHEMGHVQYYLQYKDQPVSLRRANPGFHEAIGDVLALSVSTPAHLHKIGLLDHVTNDTESDINYLLKMALEKIAFLPFGYLVDQWRWGVFSGRTPSSRYNFDWWYLR.... The pIC50 is 8.0. (9) The drug is Cc1cccc([C@H](C(N)=O)[C@@H](CCC(F)(F)F)C(=O)N[C@H]2N=C(c3ccccc3)c3ccccc3N(C)C2=O)c1. The target protein (Q9UM47) has sequence MGPGARGRRRRRRPMSPPPPPPPVRALPLLLLLAGPGAAAPPCLDGSPCANGGRCTQLPSREAACLCPPGWVGERCQLEDPCHSGPCAGRGVCQSSVVAGTARFSCRCPRGFRGPDCSLPDPCLSSPCAHGARCSVGPDGRFLCSCPPGYQGRSCRSDVDECRVGEPCRHGGTCLNTPGSFRCQCPAGYTGPLCENPAVPCAPSPCRNGGTCRQSGDLTYDCACLPGFEGQNCEVNVDDCPGHRCLNGGTCVDGVNTYNCQCPPEWTGQFCTEDVDECQLQPNACHNGGTCFNTLGGHSCVCVNGWTGESCSQNIDDCATAVCFHGATCHDRVASFYCACPMGKTGLLCHLDDACVSNPCHEDAICDTNPVNGRAICTCPPGFTGGACDQDVDECSIGANPCEHLGRCVNTQGSFLCQCGRGYTGPRCETDVNECLSGPCRNQATCLDRIGQFTCICMAGFTGTYCEVDIDECQSSPCVNGGVCKDRVNGFSCTCPSGFS.... The pIC50 is 9.0.
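The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and chosen only for this example.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # Labels taken from the examples in this record.
    for label in (9.0, 7.5, 4.7):
        print(f"pIC50 {label} corresponds to about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition a pIC50 of 9.0 corresponds to an IC50 of 1 nM, and each unit increase in pIC50 is a tenfold increase in potency.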