This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is O=C([O-])c1ccc(-c2ccc(O[C@H]3O[C@H](CO)[C@@H](O)[C@H](O)[C@@H]3O)cc2Cl)cc1. The target protein sequence is MKRVITLFAVLLMGWSVNAWSFACKTANGTAIPIGGGSANVYVNLAPAVNVGQNLVVDLSTQIFCHNDYPETITDYVTLQRGAAYGGVLSSFSGTVKYNGSSYPFPTTSETPRVVYNSRTDKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQFVWNIYANNDVVVPTGGCDVSARDVTVTLPDYPGSVPIPLTVYCAKSQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVSLGLTANYARTGGQVTAGNVQSIIGVTFVYQ. The pIC50 is 4.8. (2) The target protein sequence is MTGDTPINIFGRNILTALGMSLNLPVARIEPIKITLKPGKDGPRLKQWPLTKEKVEALKEICEKMEKEGQLEEAPPTNPYNTPTFAIKKKDKNKWRMLIDFRELNRVTQDFTEIQLGIPHPAGLAKKKRITVLDVGDAYFSIPLYEDFRPYTAFTLPSVNNVEPGKRYIYKVLPQGWKGSPAIFQYTMRQILEPFRKANPDVILIQYMDDILIASDRTGLEHDKVVLQLKELLNGLGFSTPEEKFQKDPPFQWMGYELWPTKWKLQKIQLPQKETWTVNDIQKLVGILNWAAQIYPGIKTKHLCRLIRGKMTLTEEVQWTELAEAELEENRIILDQEQEGHYYQEEKELEATIQKSQDNQWTYKIHQEEKILKVGKYAKIKNTHTNGVRLLAQVVQKIGKEALVIWGRIPKFHLPVERETWEQWWDNYWQVTWIPEWDFVSTPPLVRLTFNLVGDPIPGTETFYTDGSCNRQSKEGKAGYVTDRGRDKVRVLEQTTNQQA.... The compound is CC1=C(CCC(=O)O)c2cc3nc(cc4[nH]c(cc5[nH]c(cc1n2)c(C)c5C(COC(=O)c1ccc2ccccc2c1)OC(=O)c1ccc2ccccc2c1)c(C)c4C(COC(=O)c1ccc2ccccc2c1)OC(=O)c1ccc2ccccc2c1)C(C)=C3CCC(=O)O. The pIC50 is 4.6. (3) The drug is C1=CCN(Cc2ccccc2)C1. The target is XTSFAESXKPVQQPSAFGS. The pIC50 is 4.0. (4) The small molecule is Cn1cccc1CC(=O)NN=C/C=C\c1ccco1. 
The target protein (O75475) has sequence MTRDFKPGDLIFAKMKGYPHWPARVDEVPDGAVKPPTNKLPIFFFGTHETAFLGPKDIFPYSENKEKYGKPNKRKGFNEGLWEIDNNPKVKFSSQQAATKQSNASSDVEVEEKETSVSKEDTDHEEKASNEDVTKAVDITTPKAARRGRKRKAEKQVETEEAGVVTTATASVNLKVSPKRGRPAATEVKIPKPRGRPKMVKQPCPSESDIITEEDKSKKKGQEEKQPKKQPKKDEEGQKEEDKPRKEPDKKEGKKEVESKRKNLAKTGVTSTSDSEEEGDDQEGEKKRKGGRNFQTAHRRNMLKGQHEKEAADRKRKQEEQMETEQQNKDEGKKPEVKKVEKKRETSMDSRLQRIHAEIKNSLKIDNLDVNRCIEALDELASLQVTMQQAQKHTEMITTLKKIRRFKVSQVIMEKSTMLYNKFKNMFLVGEGDSVITQVLNKSLAEQRQHEEANKTKDQGKKGPNKKLEKEQTGSKTLNGGSDAQDGNQPQHNGESNEDS.... The pIC50 is 4.4. (5) The drug is CN(CC(=O)c1c(-c2ccccc2)[nH]c2ccccc12)CC(=O)N1CCOCC1. The target protein (P48442) has sequence MASYSCCLALLALAWHSSAYGPDQRAQKKGDIILGGLFPIHFGVAAKDQDLKSRPESVECIRYNFRGFRWLQAMIFAIEEINSSPSLLPNMTLGYRIFDTCNTVSKALEATLSFVAQNKIDSLNLDEFCNCSEHIPSTIAVVGATGSGVSTAVANLLGLFYIPQVSYASSSRLLSNKNQYKSFLRTIPNDEHQATAMADIIEYFRWNWVGTIAADDDYGRPGIEKFREEAEERDICIDFSELISQYSDEEEIQQVVEVIQNSTAKVIVVFSSGPDLEPLIKEIVRRNITGRIWLASEAWASSSLIAMPEYFHVVGGTIGFGLKAGQIPGFREFLQKVHPRKSVHNGFAKEFWEETFNCHLQEGAKGPLPVDTFVRSHEEGGNRLLNSSTAFRPLCTGDENINSVETPYMDYEHLRISYNVYLAVYSIAHALQDIYTCLPGRGLFTNGSCADIKKVEAWQVLKHLRHLNFTNNMGEQVTFDECGDLVGNYSIINWHLSPED.... The pIC50 is 3.5. (6) The compound is O=C(/C=C/c1cccc2ccccc12)c1ccc2ccccc2c1. The target protein sequence is MSDPLHVTFVCTGNICRSPMAEKMFAQQLRHRGLGDAVRVTSAGTGNWHVGSCADERAAGVLRAHGYPTDHRAAQVGTEHLAADLLVALDRNHARLLRQLGVEAARVRMLRSFDPRSGTHALDVEDPYYGDHSDFEEVFAVIESALPGLHDWVDERLARNGPS. The pIC50 is 4.0. (7) The drug is O=c1cc(-c2ccc(O)cc2)[nH]c(=S)[nH]1. 
The target protein (P49959) has sequence MSTADALDDENTFKILVATDIHLGFMEKDAVRGNDTFVTLDEILRLAQENEVDFILLGGDLFHENKPSRKTLHTCLELLRKYCMGDRPVQFEILSDQSVNFGFSKFPWVNYQDGNLNISIPVFSIHGNHDDPTGADALCALDILSCAGFVNHFGRSMSVEKIDISPVLLQKGSTKIALYGLGSIPDERLYRMFVNKKVTMLRPKEDENSWFNLFVIHQNRSKHGSTNFIPEQFLDDFIDLVIWGHEHECKIAPTKNEQQLFYISQPGSSVVTSLSPGEAVKKHVGLLRIKGRKMNMHKIPLHTVRQFFMEDIVLANHPDIFNPDNPKVTQAIQSFCLEKIEEMLENAERERLGNSHQPEKPLVRLRVDYSGGFEPFSVLRFSQKFVDRVANPKDIIHFFRHREQKEKTGEEINFGKLITKPSEGTTLRVEDLVKQYFQTAEKNVQLSLLTERGMGEAVQEFVDKEEKDAIEELVKYQLEKTQRFLKERHIDALEDKIDEE.... The pIC50 is 4.9. (8) The compound is N#Cc1ccccc1Cn1c(=O)n(C[C@H]2CC[C@H](C(=O)NCc3ccccc3)CC2)c(=O)c2ccccc21. The target protein (P21552) has sequence MNVPLGGIWLWLPLLLTWLTPEVSSSWWYMRATGGSSRVMCDNVPGLVSRQRQLCHRHPDVMRAIGLGVAEWTAECQHQFRQHRWNCNTLDRDHSLFGRVLLRSSRESAFVYAISSAGVVFAITRACSQGELKSCSCDPKKKGSAKDSKGTFDWGGCSDNIDYGIKFARAFVDAKERKGKDARALMNLHNNRAGRKAVKRFLKQECKCHGVSGSCTLRTCWLAMADFRKTGDYLWRKYNGAIQVVMNQDGTGFTVANKRFKKPTKNDLVYFENSPDYCIRDREAGSLGTAGRVCNLTSRGMDSCEVMCCGRGYDTSHVTRMTKCECKFHWCCAVRCQDCLEALDVHTCKAPKSADWATPT. The pIC50 is 7.6. (9) The small molecule is O=C(O)CNC(=O)c1nc(Cc2ccccc2)c2ccccc2c1O. The target protein sequence is MASESETLNPSARIMTFYPTMEEFRNFSRYIAYIESQGAHRAGLAKVVPPKEWKPRASYDDIDDLVIPAPIQQLVTGQSGLFTQYNIQKKAMTVREFRKIANSDKYCTPRYSEFEELERKYWKNLTFNPPIYGADVNGTLYEKHVDEWNIGRLRTILDLVEKESGITIEGVNTPYLYFGMWKTSFAWHTEDMDLYSINYLHFGEPKSWYSVPPEHGKRLERLAKGFFPGSAQSCEAFLRHKMTLISPLMLKKYGIPFDKVTQEAGEFMITFPYGYHAGFNHGFNCAESTNFATRRWIEYGKQAVLCSCRKDMVKISMDVFVRKFQPERYKLWKAGKDNTVIDHTLPTPEAAEFLKESELPPRAGNEEECPEEDMEGVEDGEEGDLKTSLAKHRIGTKRHRVCLEIPQEVSQSELFPKEDLSSEQYEMTECPAALAPVRPTHSSVRQVEDGLTFPDYSDSTEVKFEELKNVKLEEEDEEEEQAAAALDLSVNPASVGGRLV.... The pIC50 is 4.0.
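The pIC50 values above follow the stated definition, pIC50 = -log10(IC50 in M). The sketch below only illustrates that arithmetic. The function names are hypothetical and are not part of BindingDB or any dataset tooling, and the nanomolar helper assumes the raw IC50 is reported in nM, which this record does not state.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 given in nM (assumed unit) to pIC50."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A lower IC50 (more potent compound) gives a higher pIC50.
    for ic50_nm in (10.0, 1000.0, 100000.0):
        print(f"IC50 = {ic50_nm:g} nM, pIC50 = {pic50_from_ic50_nanomolar(ic50_nm):.2f}")
```

Under this definition, a pIC50 of 4.0 corresponds to an IC50 of 100 micromolar and a pIC50 of 7.6 to roughly 25 nanomolar, which is why higher values indicate more potent binding.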