This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCCCC(Oc1cc(O)c(C(=O)O)cc1C#Cc1ccccc1OC(F)(F)F)C(=O)NC1CCCCC1. The target protein (O95278) has sequence MRFRFGVVVPPAVAGARPELLVVGSRPELGRWEPRGAVRLRPAGTAAGDGALALQEPGLWLGEVELAAEEAAQDGAEPGRVDTFWYKFLKREPGGELSWEGNGPHHDRCCTYNENNLVDGVYCLPIGHWIEATGHTNEMKHTTDFYFNIAGHQAMHYSRILPNIWLGSCPRQVEHVTIKLKHELGITAVMNFQTEWDIVQNSSGCNRYPEPMTPDTMIKLYREEGLAYIWMPTPDMSTEGRVQMLPQAVCLLHALLEKGHIVYVHCNAGVGRSTAAVCGWLQYVMGWNLRKVQYFLMAKRPAVYIDEEALARAQEDFFQKFGKVRSSVCSL. The pIC50 is 5.0. (2) The small molecule is CCCC[C@]1(CC)CS(=O)(=O)c2cc(CNCCS(=O)(=O)O)c(OC)cc2[C@@H](c2ccccc2)N1. The target protein (P70172) has sequence MDNSSVCPPNATVCEGDSCVVPESNFNAILNTVMSTVLTILLAMVMFSMGCNVEVHKFLGHIKRPWGIFVGFLCQFGIMPLTGFILSVASGILPVQAVVVLIMGCCPGGTGSNILAYWIDGDMDLSVSMTTCSTLLALGMMPLCLFVYTKMWVDSGTIVIPYDSIGISLVALVIPVSFGMFVNHKWPQKAKIILKIGSITGVILIVLIAVIGGILYQSAWIIEPKLWIIGTIFPIAGYSLGFFLARLAGQPWYRCRTVALETGMQNTQLCSTIVQLSFSPEDLNLVFTFPLIYTVFQLVFAAVILGIYVTYRKCYGKNDAEFLEKTDNEMDSRPSFDETNKGFQPDEK. The pIC50 is 9.0. (3) The small molecule is O=C(NCCc1ccccc1)c1cc2cccc([N+](=O)[O-])c2[nH]1. The target protein (P16885) has sequence MSTTVNVDSLAEYEKSQIKRALELGTVMTVFSFRKSTPERRTVQVIMETRQVAWSKTADKIEGFLDIMEIKEIRPGKNSKDFERAKAVRQKEDCCFTILYGTQFVLSTLSLAADSKEDAVNWLSGLKILHQEAMNASTPTIIESWLRKQIYSVDQTRRNSISLRELKTILPLINFKVSSAKFLKDKFVEIGAHKDELSFEQFHLFYKKLMFEQQKSILDEFKKDSSVFILGNTDRPDASAVYLHDFQRFLIHEQQEHWAQDLNKVRERMTKFIDDTMRETAEPFLFVDEFLTYLFSRENSIWDEKYDAVDMQDMNNPLSHYWISSSHNTYLTGDQLRSESSPEAYIRCLRMGCRCIELDCWDGPDGKPVIYHGWTRTTKIKFDDVVQAIKDHAFVTSSFPVILSIEEHCSVEQQRHMAKAFKEVFGDLLLTKPTEASADQLPSPSQLREKIIIKHKKLGPRGDVDVNMEDKKDEHKQQGELYMWDSIDQKWTRHYCAIAD.... The pIC50 is 6.0. 
(4) The small molecule is CC(CCc1cc(-c2ccccc2F)on1)(C(=O)NO)S(C)(=O)=O. The target protein sequence is TVEHLLSAMAGLGIDNAYVELSASEVPIMDGSAGPFVFLIQSAGLQEQEAAKKFIRIKREVSVEEGDKRAVFVPFDGFKVSFEIDFDHPVFRGRTQQASVDFSSTSFVKEVSRARTFGFMRDIEYLRSQNLALGGSVENAIVVDENRVLNEDGLRYEDEFVKHKILDAIGDLYLLGNSLIGEFRGFKSGHALNNQLL. The pIC50 is 8.1. (5) The small molecule is CCCCn1c(=O)c2ccccc2c2cc(C(O)(C(F)(F)F)C(F)(F)F)cc(OC)c21. The target protein (P35398) has sequence MESAPAAPDPAASEPGSSGADAAAGSRETPLNQESARKSEPPAPVRRQSYSSTSRGISVTKKTHTSQIEIIPCKICGDKSSGIHYGVITCEGCKGFFRRSQQSNATYSCPRQKNCLIDRTSRNRCQHCRLQKCLAVGMSRDAVKFGRMSKKQRDSLYAEVQKHRMQQQQRDHQQQPGEAEPLTPTYNISANGLTELHDDLSNYIDGHTPEGSKADSAVSSFYLDIQPSPDQSGLDINGIKPEPICDYTPASGFFPYCSFTNGETSPTVSMAELEHLAQNISKSHLETCQYLREELQQITWQTFLQEEIENYQNKQREVMWQLCAIKITEAIQYVVEFAKRIDGFMELCQNDQIVLLKAGSLEVVFIRMCRAFDSQNNTVYFDGKYASPDVFKSLGCEDFISFVFEFGKSLCSMHLTEDEIALFSAFVLMSADRSWLQEKVKIEKLQQKIQLALQHVLQKNHREDGILTKLICKVSTLRALCGRHTEKLMAFKAIYPDIVR.... The pIC50 is 5.0. (6) The small molecule is Cc1ccc(NC(=O)c2[nH]c(C3(C)CCCCC3)nc2CCC23CC4CC(CC(C4)C2)C3)cc1C(=O)O. The target protein (P30552) has sequence MELLKLNRSAQGSGAGPGASLCRAGGALLNSSGAGNLSCEPPRLRGAGTRELELAIRVTLYAVIFLMSVGGNVLIIVVLGLSRRLRTVTNAFLLSLAVSDLLLAVACMPFTLLPNLMGTFIFGTVVCKAVSYLMGVSVSVSTLSLVAIALERYSAICRPLQARVWQTRSHAARVIIATWMLSGLLMVPYPVYTAVQPAGGARALQCVHRWPSARVRQTWSVLLLLLLFFVPGVVMAVAYGLISRELYLGLRFDEDSDSESRVRSQGGLRGGAGPGPAPPNGSCRPEGGLAGEDGDGCYVQLPRSRQTLELSALTAPTPGPGGGPRPYQAKLLAKKRVVRMLLVIVVLFFLCWLPLYSANTWRAFDSSGAHRALSGAPISFIHLLSYASACVNPLVYCFMHRRFRQACLETCARCCPRPPRARPRPLPDEDPPTPSIASLSRLSYTTISTLGPG. The pIC50 is 8.0. (7) The compound is O=C(CCCCCCc1ccc([N+](=O)[O-])cc1)c1ncc(-c2ccccn2)o1. 
The target protein sequence is MWLFDLVLTSLATSMAWGYPSLPPVVDTVQGKVLGKYVSLEGFAQPVAVFLGVPFAKPPLGPLRFAPPQAAEPWNFVKNTTSYPPMCSQDAVGGQVLSELFTNRKDNIPLKFSEDCLYLNIYTPADLTKNSRLPVMVWIHGGGLVVGGASTYDGLALSAHENVVVVTIQYRLGIWGFFSTGDEHGRGNWGHLDQLAALRWVQENIANFGGNPGSVTIFGESAGGESVSVLVLSPLAKNLFHRAISESGVALTAALVKKDMKDTAQQIAVFAGCKSTTSAVLVHCLRQKTEDELLEVSLKLKFFTLDLLGDPRESYPFLPTVVDGVLLPKMPQEILAEKKFNSVPYIIGINKQEFGWLLPMMMGYPLSEDKLDQKTASSLLWKSYPIANIPEELTPLASEKYLGGTDDPVKKKALFLDMLGDVVFGVPSVTVARHHRDAGAPTYMYEFQYHPSFSSDMKPQTVVGDHGDELFSVFGAPFLKGGASEEEIRLSKMMMKLWAN.... The pIC50 is 4.3. (8) The compound is C(=C/c1ccc2c(ccn2CCCCn2ccnc2)c1)\c1ccccc1. The target protein (O35084) has sequence MTQAVKLASRVFHRIHLPLQLDASLGSRGSESVLRSLSDIPGPSTLSFLAELFCKGGLSRLHELQVHGAARYGPIWSGSFGTLRTVYVADPTLVEQLLRQESHCPERCSFSSWAEHRRRHQRACGLLTADGEEWQRLRSLLAPLLLRPQAAAGYAGTLDNVVRDLVRRLRRQRGRGSGLPGLVLDVAGEFYKFGLESIGAVLLGSRLGCLEAEVPPDTETFIHAVGSVFVSTLLTMAMPNWLHHLIPGPWARLCRDWDQMFAFAQRHVELREGEAAMRNQGKPEEDMPSGHHLTHFLFREKVSVQSIVGNVTELLLAGVDTVSNTLSWTLYELSRHPDVQTALHSEITAGTRGSCAHPHGTALSQLPLLKAVIKEVLRLYPVVPGNSRVPDRDIRVGNYVIPQDTLVSLCHYATSRDPTQFPDPNSFNPARWLGEGPTPHPFASLPFGFGKRSCIGRRLAELELQMALSQILTHFEVLPEPGALPIKPMTRTVLVPERSI.... The pIC50 is 6.8. (9) The small molecule is C[C@H]1CCN1c1nc(N2C[C@H]3[C@H](CC(=O)O)[C@H]3C2)cc(C(F)(F)F)n1. The target protein (P50053) has sequence MEEKQILCVGLVVLDVISLVDKYPKEDSEIRCLSQRWQRGGNASNSCTVLSLLGAPCAFMGSMAPGHVADFLVADFRRRGVDVSQVAWQSKGDTPSSCCIINNSNGNRTIVLHDTSLPDVSATDFEKVDLTQFKWIHIEGRNASEQVKMLQRIDAHNTRQPPEQKIRVSVEVEKPREELFQLFGYGDVVFVSKDVAKHLGFQSAEEALRGLYGRVRKGAVLVCAWAEEGADALGPDGKLLHSDAFPPPRVVDTLGAGDTFNASVIFSLSQGRSVQEALRFGCQVAGKKCGLQGFDGIV. The pIC50 is 7.4.
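The pIC50 definition given at the top, pIC50 = -log10(IC50 in M), can be sketched in a few lines of Python. This is a minimal illustration only, not part of the dataset tooling; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the inverse assumes the result is wanted in nanomolar.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # An IC50 of 1 nM (1e-9 M) maps to pIC50 of about 9; 10 uM (1e-5 M) maps to about 5.
    print(ic50_to_pic50(1e-9))
    print(ic50_to_pic50(1e-5))
    print(pic50_to_ic50_nm(7.4))
```

Under this definition, a record labelled pIC50 9.0 corresponds to an IC50 of about 1 nM, and a record labelled 5.0 to about 10 µM, which is why higher values mean more potent binding.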