This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCOC(=O)C(Cc1ccc(NS(=O)(=O)[O-])cc1)(NC(=O)OCc1ccccc1)C(=O)OC. The target protein (P10586) has sequence MAPEPAPGRTMVPLVPALVMLGLVAGAHGDSKPVFIKVPEDQTGLSGGVASFVCQATGEPKPRITWMKKGKKVSSQRFEVIEFDDGAGSVLRIQPLRVQRDEAIYECTATNSLGEINTSAKLSVLEEEQLPPGFPSIDMGPQLKVVEKARTATMLCAAGGNPDPEISWFKDFLPVDPATSNGRIKQLRSGALQIESSEESDQGKYECVATNSAGTRYSAPANLYVRVRRVAPRFSIPPSSQEVMPGGSVNLTCVAVGAPMPYVKWMMGAEELTKEDEMPVGRNVLELSNVVRSANYTCVAISSLGMIEATAQVTVKALPKPPIDLVVTETTATSVTLTWDSGNSEPVTYYGIQYRAAGTEGPFQEVDGVATTRYSIGGLSPFSEYAFRVLAVNSIGRGPPSEAVRARTGEQAPSSPPRRVQARMLSASTMLVQWEPPEEPNGLVRGYRVYYTPDSRRPPNAWHKHNTDAGLLTTVGSLLPGITYSLRVLAFTAVGDGPPS.... The pIC50 is 3.6. (2) The small molecule is Cc1ccc(C(=O)Nc2ccc(CN3CCN(C)CC3)c(C(F)(F)F)c2)cc1CCc1cnc2cccnn12. The target protein (P00519) has sequence MLEICLKLVGCKSKKGLSSSSSCYLEEALQRPVASDFEPQGLSEAARWNSKENLLAGPSENDPNLFVALYDFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAFETMFQES.... The pIC50 is 7.7. (3) The small molecule is CCN1CCN(c2cc3cc[nH]c(=O)c3c(Nc3ccc(N4CCOCC4)cc3)n2)CC1. 
The target protein (P43403) has sequence MPDPAAHLPFFYGSISRAEAEEHLKLAGMADGLFLLRQCLRSLGGYVLSLVHDVRFHHFPIERQLNGTYAIAGGKAHCGPAELCEFYSRDPDGLPCNLRKPCNRPSGLEPQPGVFDCLRDAMVRDYVRQTWKLEGEALEQAIISQAPQVEKLIATTAHERMPWYHSSLTREEAERKLYSGAQTDGKFLLRPRKEQGTYALSLIYGKTVYHYLISQDKAGKYCIPEGTKFDTLWQLVEYLKLKADGLIYCLKEACPNSSASNASGAAAPTLPAHPSTLTHPQRRIDTLNSDGYTPEPARITSPDKPRPMPMDTSVYESPYSDPEELKDKKLFLKRDNLLIADIELGCGNFGSVRQGVYRMRKKQIDVAIKVLKQGTEKADTEEMMREAQIMHQLDNPYIVRLIGVCQAEALMLVMEMAGGGPLHKFLVGKREEIPVSNVAELLHQVSMGMKYLEEKNFVHRDLAARNVLLVNRHYAKISDFGLSKALGADDSYYTARSAGK.... The pIC50 is 8.0. (4) The small molecule is O=C(C=Cc1ccc(Sc2cccc(OCc3cnc[nH]3)c2)c(C(F)(F)F)c1C(F)(F)F)N1CCOCC1. The pIC50 is 7.2. The target protein (P20701) has sequence MKDSCITVMAMALLSGFFFFAPASSYNLDVRGARSFSPPRAGRHFGYRVLQVGNGVIVGAPGEGNSTGSLYQCQSGTGHCLPVTLRGSNYTSKYLGMTLATDPTDGSILACDPGLSRTCDQNTYLSGLCYLFRQNLQGPMLQGRPGFQECIKGNVDLVFLFDGSMSLQPDEFQKILDFMKDVMKKLSNTSYQFAAVQFSTSYKTEFDFSDYVKRKDPDALLKHVKHMLLLTNTFGAINYVATEVFREELGARPDATKVLIIITDGEATDSGNIDAAKDIIRYIIGIGKHFQTKESQETLHKFASKPASEFVKILDTFEKLKDLFTELQKKIYVIEGTSKQDLTSFNMELSSSGISADLSRGHAVVGAVGAKDWAGGFLDLKADLQDDTFIGNEPLTPEVRAGYLGYTVTWLPSRQKTSLLASGAPRYQHMGRVLLFQEPQGGGHWSQVQTIHGTQIGSYFGGELCGVDVDQDGETELLLIGAPLFYGEQRGGRVFIYQRR.... (5) The small molecule is CCCCC(=O)NC[C@@H](O)[C@@H](O)[C@@H]1OC(C(=O)O)=C[C@H](NC(=N)N)[C@H]1NC(C)=O.O=C(O)C(F)(F)F. The target protein sequence is MNPNQKIITIGSICMVVGIISLILQIGNIISIWISHSIQTGNQNHTGICNQGSITYKVVAGQDSTSVILTGNSSLCPIRGWAIHSKDNGIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSRGTFKDRSPYRALMSCPVGEAPSPYNSRFESVAWSASACHDGMGWLTIGISGPDDGAVAVLKYNGIITETIKSWRKNILRTQESECTCVNGSCFTIMTDGPSDGLASYKIFKIEKGKVTKSIELNAPNSHYEECSCYPDTGKVMCVCRDNWHGSNRPWVSFDQNLDYKIGYICSGVFGDNPRPKDGTGSCGPVSADGANGVKGFSYKYGNGVWIGRTKSDSSRHGFEMIWDPNGWTETDSRFSMRQDVVAMTDRSGYSGSFVQHPELTGLDCMRPCFWVELIRGLPEENAIWTSGSIISFCGVNSDTVDWSWPDGAELPFTIDK. The pIC50 is 6.1. 
(6) The target protein (P21730) has sequence MDSFNYTTPDYGHYDDKDTLDLNTPVDKTSNTLRVPDILALVIFAVVFLVGVLGNALVVWVTAFEAKRTINAIWFLNLAVADFLSCLALPILFTSIVQHHHWPFGGAACSILPSLILLNMYASILLLATISADRFLLVFKPIWCQNFRGAGLAWIACAVAWGLALLLTIPSFLYRVVREEYFPPKVLCGVDYSHDKRRERAVAIVRLVLGFLWPLLTLTICYTFILLRTWSRRATRSTKTLKVVVAVVASFFIFWLPYQVTGIMMSFLEPSSPTFLLLKKLDSLCVSFAYINCCINPIIYVVAGQGFQGRLRKSLPSLLRNVLTEESVVRESKSFTRSTVDTMAQKTQAV. The pIC50 is 6.7. The small molecule is O=C(c1csc2ccccc12)N(CCc1ccc(Cl)cc1)[C@H]1CC[C@@]2(CCCO2)CC1. (7) The small molecule is O=C(c1ccccc1)c1ccccc1OCC(O)CN1CCN(c2cccc3[nH]c(=O)oc23)CC1. The target protein (Q62205) has sequence MAMLPPPGPQSFVHFTKQSLALIEQRISEEKAKGHKDEKKDDEEEGPKPSSDLEAGKQLPFIYGDIPPGMVSEPLEDLDPYYADKKTFIVLNKGKAIFRFNATPALYMLSPFSPLRRISIKILVHSLFSMLIMCTILTNCIFMTMSNPPDWTKNVEYTFTGIYTFESLIKILARGFCVGEFTFLRDPWNWLDFVVIVFAYLTEFVNLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLKHKCFRKDLEQNETLESIMSTAESEEELKRYFYYLEGSKDALLCGFSTDSGQCPEGYECVTAGRNPDYGYTSFDTFGWAFLALFRLMTQDYWENLYQQTLRAAGKTYMIFFVVVIFLGSFYLINLILAVVAMAYEEQNQANIEEAKQKELEFQQMLDRLKKEQEEAEAIAAAAAEYTSLGRSRIMGLSESSSETSRLSSKSAKERRNRRKKKKQKLSSGEEKGDDEKLS.... The pIC50 is 4.5. (8) The drug is CNc1ccc(C(C)(C)C(=O)N[C@H]2C3C[C@@H]4C[C@H]2C[C@@](CC(N)=O)(C3)C4)cn1. The target protein (P80365) has sequence MERWPWPSGGAWLLVAARALLQLLRSDLRLGRPLLAALALLAALDWLCQRLLPPPAALAVLAAAGWIALSRLARPQRLPVATRAVLITGCDSGFGKETAKKLDSMGFTVLATVLELNSPGAIELRTCCSPRLRLLQMDLTKPGDISRVLEFTKAHTTSTGLWGLVNNAGHNEVVADAELSPVATFRSCMEVNFFGALELTKGLLPLLRSSRGRIVTVGSPAGDMPYPCLGAYGTSKAAVALLMDTFSCELLPWGVKVSIIQPGCFKTESVRNVGQWEKRKQLLLANLPQELLQAYGKDYIEHLHGQFLHSLRLAMSDLTPVVDAITDALLAARPRRRYYPGQGLGLMYFIHYYLPEGLRRRFLQAFFISHCLPRALQPGQPGTTPPQDAAQDPNLSPGPSPAVAR. The pIC50 is 4.0.
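The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and reporting the inverse in nanomolar is an assumption made for readability.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and return the IC50 in nanomolar (hypothetical helper)."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # Labels taken from examples (1) and (2) above.
    for label in (3.6, 7.7):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit is a tenfold difference in IC50. The label 7.7 in example (2) corresponds to an IC50 of roughly 20 nM, while 3.6 in example (1) corresponds to roughly 250 µM.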