This is drug-target binding data from BindingDB, based on Ki measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is CCCCC/C=C\C/C=C\CCCCCCCC(=O)[O-]. The target protein (P04799) has sequence MAFSQYISLAPELLLATAIFCLVFWVLRGTRTQVPKGLKSPPGPWGLPFIGHMLTLGKNPHLSLTKLSQQYGDVLQIRIGSTPVVVLSGLNTIKQALVKQGDDFKGRPDLYSFTLITNGKSMTFNPDSGPVWAARRRLAQDALKSFSIASDPTSVSSCYLEEHVSKEANHLISKFQKLMAEVGHFEPVNQVVESVANVIGAMCFGKNFPRKSEEMLNLVKSSKDFVENVTSGNAVDFFPVLRYLPNPALKRFKNFNDNFVLFLQKTVQEHYQDFNKNSIQDITGALFKHSENYKDNGGLIPQEKIVNIVNDIFGAGFETVTTAIFWSILLLVTEPKVQRKIHEELDTVIGRDRQPRLSDRPQLPYLEAFILEIYRYTSFVPFTIPHSTTRDTSLNGFHIPKECCIFINQWQVNHDEKQWKDPFVFRPERFLTNDNTAIDKTLSEKVMLFGLGKRRCIGEIPAKWEVFLFLAILLHQLEFTVPPGVKVDLTPSYGLTMKPR.... The pKi is 4.9. (2) The drug is NCCc1ccc(O)c(O)c1. The target is MLLARMKPQVQPELGGADQ. The pKi is 6.4. (3) The target protein (P17936) has sequence MQRARPTLWAAALTLLVLLRGPPVARAGASSAGLGPVVRCEPCDARALAQCAPPPAVCAELVREPGCGCCLTCALSEGQPCGIYTERCGSGLRCQPSPDEARPLQALLDGRGLCVNASAVSRLRAYLLPAPPAPGNASESEEDRSAGSVESPSVSSTHRVSDPKFHPLHSKIIIIKKGHAKDSQRYKVDYESQSTDTQNFSSESKRETEYGPCRREMEDTLNHLKFLNVLSPRGVHIPNCDKKGFYKKKQCRPSKGRKRGFCWCVDKYGQPLPGYTTKGKEDVHCYSMQSK. The pKi is 7.6. The compound is O=C(O)c1cc2cc(O)c(O)cc2c(C(O)c2ccccc2Cl)n1. (4) The compound is COc1ccccc1-c1cn2c(-c3ccccc3Cl)c(CN)c(C)nc2n1. 
The target protein (P27487) has sequence MKTPWKVLLGLLGAAALVTIITVPVVLLNKGTDDATADSRKTYTLTDYLKNTYRLKLYSLRWISDHEYLYKQENNILVFNAEYGNSSVFLENSTFDEFGHSINDYSISPDGQFILLEYNYVKQWRHSYTASYDIYDLNKRQLITEERIPNNTQWVTWSPVGHKLAYVWNNDIYVKIEPNLPSYRITWTGKEDIIYNGITDWVYEEEVFSAYSALWWSPNGTFLAYAQFNDTEVPLIEYSFYSDESLQYPKTVRVPYPKAGAVNPTVKFFVVNTDSLSSVTNATSIQITAPASMLIGDHYLCDVTWATQERISLQWLRRIQNYSVMDICDYDESSGRWNCLVARQHIEMSTTGWVGRFRPSEPHFTLDGNSFYKIISNEEGYRHICYFQIDKKDCTFITKGTWEVIGIEALTSDYLYYISNEYKGMPGGRNLYKIQLSDYTKVTCLSCELNPERCQYYSVSFSKEAKYYQLRCSGPGLPLYTLHSSVNDKGLRVLEDNSAL.... The pKi is 8.0. (5) The drug is c1cncc(OC[C@@H]2CCN2)c1. The target protein (Q13572) has sequence MQTFLKGKRVGYWLSEKKIKKLNFQAFAELCRKRGMEVVQLNLSRPIEEQGPLDVIIHKLTDVILEADQNDSQSLELVHRFQEYIDAHPETIVLDPLPAIRTLLDRSKSYELIRKIEAYMEDDRICSPPFMELTSLCGDDTMRLLEKNGLTFPFICKTRVAHGTNSHEMAIVFNQEGLNAIQPPCVVQNFINHNAVLYKVFVVGESYTVVQRPSLKNFSAGTSDRESIFFNSHNVSKPESSSVLTELDKIEGVFERPSDEVIRELSRALRQALGVSLFGIDIIINNQTGQHAVIDINAFPGYEGVSEFFTDLLNHIATVLQGQSTAMAATGDVALLRHSKLLAEPAGGLVGERTCSASPGCCGSMMGQDAPWKAEADAGGTAKLPHQRLGCNAGVSPSFQQHCVASLATKASSQ. The pKi is 5.0. (6) The drug is O=P([O-])([O-])OP(=O)([O-])[O-]. The target protein sequence is MSSIRSYKGIVPKLGEGVYIDSSAVLVGDIELGDDASIWPLVAARGDVNHIRIGKRTNIQDGSVLHVTHKNAENPNGYPLCIGDDVTIGHKVMLHGCTIHDRVLVGMGSIVLDGAVIENDVMIGAGSLVPPGKRLESGFLYMGSPVKQARPLNDKERAFLVKSSSNYVQSKNDYLNDVKTVRE. The pKi is 2.1. (7) The drug is CN(C)CC1CCC(CN=C(N)c2ccc(CC(NC(=O)C(Cc3ccccc3)NS(=O)(=O)c3ccc4ccccc4c3)C(=O)N3CCCC3)cc2)CC1. The target protein (P21555) has sequence MNSTLFSRVENYSVHYNVSENSPFLAFENDDCHLPLAVIFTLALAYGAVIILGVSGNLALIIIILKQKEMRNVTNILIVNLSFSDLLVAVMCLPFTFVYTLMDHWVFGETMCKLNPFVQCVSITVSIFSLVLIAVERHQLIINPRGWRPNNRHAYIGITVIWVLAVASSLPFVIYQILTDEPFQNVSLAAFKDKYVCFDKFPSDSHRLSYTTLLLVLQYFGPLCFIFICYFKIYIRLKRRNNMMDKIRDSKYRSSETKRINVMLLSIVVAFAVCWLPLTIFNTVFDWNHQIIATCNHNLLFLLCHLTAMISTCVNPIFYGFLNKNFQRDLQFFFNFCDFRSRDDDYETIAMSTMHTDVSKTSLKQASPVAFKKISMNDNEKI. The pKi is 8.8. (8) The small molecule is CC1=C(C(=O)O)N2C(=O)[C@@H](NC(=O)[C@H](N)c3ccccc3)[C@H]2SC1. 
The target protein (Q63424) has sequence MNPFQKNESKETLFSPVSTEEMLPRPPSPPKKSPPKIFGSSYPVSIAFIVVNEFCERFSYYGMKAVLTLYFLYFLHWNEDTSTSVYHAFSSLCYFTPILGAAIADSWLGKFKTIIYLSLVYVLGHVFKSLGAIPILGGKMLHTILSLVGLSLIALGTGGIKPCVAAFGGDQFEEEHAEARTRYFSVFYLAINAGSLISTFITPMLRGDVKCFGQDCYALAFGVPGLLMVLALVVFAMGSKMYRKPPPEGNIVAQVIKCIWFALCNRFRNRSGDLPKRQHWLDWAAEKYPKHLIADVKALTRVLFLYIPLPMFWALLDQQGSRWTLQANKMNGDLGFFVLQPDQMQVLNPFLVLIFIPLFDLVIYRLISKCRINFSSLRKMAVGMILACLAFAVAALVETKINGMIHPQPASQEIFLQVLNLADGDVKVTVLGSRNNSLLVESVSSFQNTTHYSKLHLEAKSQDLHFHLKYNSLSVHNDHSVEEKNCYQLLIHQDGESISS.... The pKi is 4.1. (9) The drug is Nc1ncnc2nc(-c3ccc(N4CCOCC4)nc3)cc(-c3cccc(Br)c3)c12. The target protein (P47937) has sequence MASVPTGENWTDGTAGVGSHTGNLSAALGITEWLALQAGNFSSALGLPVTSQAPSQVRANLTNQFVQPSWRIALWSLAYGLVVAVAVFGNLIVIWIILAHKRMRTVTNYFLVNLAFSDASVAAFNTLVNFIYGVHSEWYFGANYCRFQNFFPITAVFASIYSMTAIAVDRYMAIIDPLKPRLSATATKIVIGSIWILAFLLAFPQCLYSKIKVMPGRTLCYVQWPEGPKQHFTYHIIVIILVYCFPLLIMGVTYTIVGITLWGGEIPGDTCDKYHEQLKAKRKVVKMMIIVVVTFAICWLPYHVYFILTAIYQQLNRWKYIQQVYLASFWLAMSSTMYNPIIYCCLNKRFRAGFKRAFRWCPFIQVSSYDELELKTTRFHPTRQSSLYTVSRMESVTVLYDPSEGDPAKSSRKKRAVPRDPSANGCSHREFKSASTTSSFISSPYTSVDEYS. The pKi is 5.0. (10) The small molecule is C#CC[C@H](C[C@H](O)[C@H](CC1CCCCC1)NC(=O)[C@H](CSC)NC(=O)c1nc2ccccc2[nH]1)C(=O)NCCN1CCOCC1. The target protein (P07267) has sequence MFSLKALLPLALLLVSANQVAAKVHKAKIYKHELSDEMKEVTFEQHLAHLGQKYLTQFEKANPEVVFSREHPFFTEGGHDVPLTNYLNAQYYTDITLGTPPQNFKVILDTGSSNLWVPSNECGSLACFLHSKYDHEASSSYKANGTEFAIQYGTGSLEGYISQDTLSIGDLTIPKQDFAEATSEPGLTFAFGKFDGILGLGYDTISVDKVVPPFYNAIQQDLLDEKRFAFYLGDTSKDTENGGEATFGGIDESKFKGDITWLPVRRKAYWEVKFEGIGLGDEYAELESHGAAIDTGTSLITLPSGLAEMINAEIGAKKGWTGQYTLDCNTRDNLPDLIFNFNGYNFTIGPYDYTLEVSGSCISAITPMDFPEPVGPLAIVGDAFLRKYYSIYDLGNNAVGLAKAI. The pKi is 6.4.
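
The pKi values above follow the stated definition, pKi = -log10(Ki in M). A minimal Python sketch of that conversion is given below. The function names (ki_to_pki, ki_nm_to_pki, pki_to_ki) are hypothetical helpers introduced only for illustration, and the nanomolar helper assumes the raw Ki is reported in nM, which this record does not state.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in M")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nanomolar: float) -> float:
    """Same conversion for a Ki given in nM (assumed unit, not from the record)."""
    return ki_to_pki(ki_nanomolar * 1e-9)


def pki_to_ki(pki: float) -> float:
    """Invert the definition: Ki in mol/L = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A Ki of 10 nM is 1e-8 M, which corresponds to a pKi of about 8.
    print(ki_nm_to_pki(10.0))
    # A pKi of 4.9, as in example (1), corresponds to a Ki of roughly 1.26e-5 M.
    print(pki_to_ki(4.9))
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki, so the pKi 8.8 entry in example (7) reflects a Ki several orders of magnitude lower than the pKi 2.1 entry in example (6).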