This is drug-target binding data from BindingDB, based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The drug is CCCC[C@@H](N)CC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@H](C(=O)N[C@H]1Cc2ccccc2CN(CC(=O)N[C@@H](Cc2ccccc2)C(=O)O)C1=O)[C@@H](C)CC. The target protein (Q9UIQ6) has sequence MEPFTNDRLQLPRNMIENSMFEEEPDVVDLAKEPCLHPLEPDEVEYEPRGSRLLVRGLGEHEMEEDEEDYESSAKLLGMSFMNRSSGLRNSATGYRQSPDGACSVPSARTMVVCAFVIVVAVSVIMVIYLLPRCTFTKEGCHKKNQSIGLIQPFATNGKLFPWAQIRLPTAVVPLRYELSLHPNLTSMTFRGSVTISVQALQVTWNIILHSTGHNISRVTFMSAVSSQEKQAEILEYAYHGQIAIVAPEALLAGHNYTLKIEYSANISSSYYGFYGFSYTDESNEKKYFAATQFEPLAARSAFPCFDEPAFKATFIIKIIRDEQYTALSNMPKKSSVVLDDGLVQDEFSESVKMSTYLVAFIVGEMKNLSQDVNGTLVSIYAVPEKIGQVHYALETTVKLLEFFQNYFEIQYPLKKLDLVAIPDFEAGAMENWGLLTFREETLLYDSNTSSMADRKLVTKIIAHELAHQWFGNLVTMKWWNDLWLNEGFATFMEYFSLEK.... The pKi is 5.3. (2) The small molecule is COc1ccccc1N1CCN(CCCCn2ncc(=O)n(C)c2=O)CC1. The target protein sequence is MRPALKGAILSLLGHYKWEKFVYLYDTERGFSILQAIMEAAVQNNWQVTARSVGNIKDVQEFRRIIEEMDRRQEKRYLIDCEVERINTILEQVVILGKHSRGYHYMLANLGFTDILLERVMHGGANITGFQIVNNENPMVQQFIQRWVRLDEREFPEAKNAPLKYTSALTHDAILVIAEAFRYLRRQRVDVSRRGSAGDCLANPAVPWSQGIDIERALKMVQVQGMTGNIQFDTYGRRTNYTIDVYEMKVTGSRKAGYWNEYERFVPFSDQQISNDSASAENRTIVVTTILESPYVMYKKNHEQLEGNERYEGYCVDLAYEIAKHVRIKYKLSIVGDGKYGARDPETKIWNGMVGELVYGRADIAVAPLTITLVREEVIDFSKPFMSLGISIMIKKPQKSKPGVFSFLDPLAYEIWMCIVFAYIGVSVVLFLVSRFSPYEWHLEDNNEEPRDPQSPPDPPNEFGIFNSLWFSLGAFMQQGCDISPRSLSGRIVGGVWWFF.... The pKi is 5.0. (3) The pKi is 5.2. The compound is C[C@]12CC[C@@H]3c4ccc(O)cc4CC[C@H]3[C@@H]1CC[C@@H]2O. 
The target protein sequence is MSTEKKKEPCCSKLKMFLAAMCFVFFAKAFQGSYMKSSVTQIERRFDVPSSLIGFIDGSFEIGNLFVIAFVSYFGAKLHRPRLIAAGCLVMSAGSFITAMPHFFQGQYKYESTISHFSASVNGTENVLPCLTNASLAQDSEIPTVESQAECEKASSSSLWLFVFLGNMLRGIGETPVMPLGLSYLDDFSREENTAFYLALIQTVGIMGPMFGFMLGSFCAKLYVDIGTVDLDSITINYKDSRWVGAWWLGFLVTGGVMLLAGIPFWFLPKSLTRQGEPESEKKPGAPEGGEQERFIPDNNKHNPPASKPAPVTMSALAKDFLPSLKKLFSNTIYVLLVCTGLIQVSGFIGMITFKPKFMEQVYGQSASRAIFLIGIMNLPAVALGIVTGGFIMKRFKVNVLGAAKICIVASVLAFCSMLIQYFLQCDNSQVAGLTVTYQGAPEVSYQTETLISQCNIGCSCSLKHWDPICASNGVTYTSPCLAGCQTSTGIGKEMVFHNC.... (4) The small molecule is NC(=O)C1CCCN1C(=O)C(Cc1cnc[nH]1)NC(=O)C1CCC(=O)C1. The target protein sequence is MMFLWWLLLLGTAISHKVHSQEQPLLEEDTAPADNLDVLEKAKGILIRSFLEGFQEGQQINRDLPDAMEMIYKRQHPGKRFQEEIEKRQHPGKRDLEDLQLSKRQHPGRRYLEDMEKRQHPGKREEGDWSRGYLTDDSGYLDLFSDVSKRQHPGKRVPDPFFIKRQHPGKRGIEEEDDTEFENSKEVGKRQHPGKRYDPCEGPNAYNCNSGNLQLDSVEEGWAA. The pKi is 5.0. (5) The pKi is 7.3. The target protein (P47871) has sequence MPPCQPQRPLLLLLLLLACQPQVPSAQVMDFLFEKWKLYGDQCHHNLSLLPPPTELVCNRTFDKYSCWPDTPANTTANISCPWYLPWHHKVQHRFVFKRCGPDGQWVRGPRGQPWRDASQCQMDGEEIEVQKEVAKMYSSFQVMYTVGYSLSLGALLLALAILGGLSKLHCTRNAIHANLFASFVLKASSVLVIDGLLRTRYSQKIGDDLSVSTWLSDGAVAGCRVAAVFMQYGIVANYCWLLVEGLYLHNLLGLATLPERSFFSLYLGIGWGAPMLFVVPWAVVKCLFENVQCWTSNDNMGFWWILRFPVFLAILINFFIFVRIVQLLVAKLRARQMHHTDYKFRLAKSTLTLIPLLGVHEVVFAFVTDEHAQGTLRSAKLFFDLFLSSFQGLLVAVLYCFLNKEVQSELRRRWHRWRLGKVLWEERNTSNHRASSSPGHGPPSKELQFGRGGGSQDSSAETPLAGGLPRLAESPF. The drug is CC(C)(C)c1ccc(C(Cc2ccc(C(=O)NCCC(=O)O)cc2)C(=O)Nc2ccc(OC(F)(F)F)cc2)cc1. (6) The drug is O=Cc1ccc(-c2cn([C@@H]3O[C@H](COP(=O)(O)OP(=O)(O)O[C@H]4O[C@H](CO)[C@H](O)[C@H](O)[C@H]4O)[C@@H](O)[C@H]3O)c(=O)[nH]c2=O)s1. 
The target protein (P14769) has sequence MNVKGKVILSMLVVSTVIVVFWEYIHSPEGSLFWINPSRNPEVGGSSIQKGWWLPRWFNNGYHEEDGDINEEKEQRNEDESKLKLSDWFNPFKRPEVVTMTKWKAPVVWEGTYNRAVLDNYYAKQKITVGLTVFAVGRYIEHYLEEFLTSANKHFMVGHPVIFYIMVDDVSRMPLIELGPLRSFKVFKIKPEKRWQDISMMRMKTIGEHIVAHIQHEVDFLFCMDVDQVFQDKFGVETLGESVAQLQAWWYKADPNDFTYERRKESAAYIPFGEGDFYYHAAIFGGTPTQVLNITQECFKGILKDKKNDIEAQWHDESHLNKYFLLNKPTKILSPEYCWDYHIGLPADIKLVKMSWQTKEYNVVRNNV. The pKi is 5.0. (7) The compound is CO[C@@H]1O[C@H](CO)[C@@H](O[C@@H]2O[C@H](CO)[C@@H](S[C@@H]3O[C@H](CO)[C@@H](O[C@@H]4O[C@H](CO)[C@@H](O)[C@H](O)[C@H]4O)[C@H](O)[C@H]3O)[C@H](O)[C@H]2O)[C@H](O)[C@H]1O. The target protein (P43316) has sequence ADGRSTRYWDCCKPSCGWAKKAPVNQPVFSCNANFQRITDFDAKSGCEPGGVAYSCADQTPWAVNDDFALGFAATSIAGSNEAGWCCACYELTFTSGPVAGKKMVVQSTSTGGDLGSNHFDLNIPGGGVGIFDGCTPQFGGLPGQRYGGISSRNECDRFPDALKPGCYWRFDWFKNADNPSFSFRQVQCPAELVARTGCRRNDDGNFPAVQIP. The pKi is 2.4. (8) The drug is N=C(N)c1ccc2cc(C(=O)Nc3ccc(CN)cc3)ccc2c1. The target protein sequence is IIGGEFTTIENQPWFAAIYRRHRGGSVTYVCGGSLISPCWVISATHCFIDYPKKEDYIVYLGRSRLNSNTQGEMKFEVENLILHKDYSADTLAHHNDIALLKIRSKEGRCAQPSRTIQTIALPSMYNDPQFGTSCEITGFGKEQSTDYLYPEQLKMTVVKLISHRECQQPHYYGSEVTTKMLCAADPQWKTDSCQGDSGGPLVCSLQGRMTLTGIVSWGRGCALKDKPGVYTRVSHFLPWIRSHTK. The pKi is 7.4.
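
For reference, the pKi definition given in the task description (pKi = -log10(Ki in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the nanomolar helper assumes the raw Ki value is expressed in nM, which should be checked against the actual source units.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki value in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convert a Ki value in nM to pKi (assumes the input unit is nM)."""
    return ki_to_pki(ki_nm * 1e-9)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: return Ki in mol/L for a given pKi."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A pKi of 5.0 corresponds to Ki = 1e-5 M (10 micromolar).
    print(pki_to_ki(5.0))
    # A Ki of 1 nM corresponds to a pKi of about 9.
    print(ki_nm_to_pki(1.0))
```

Under this definition, a pKi of 7.4 corresponds to a Ki of roughly 40 nM, while a pKi of 5.0 corresponds to 10 µM, so each unit of pKi is a tenfold change in Ki.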