Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. From a dataset of Drug-target binding data from BindingDB using Ki measurements. (1) The small molecule is CCCCCCCCc1ccc(Oc2ccccc2)c(O)c1. The target protein sequence is MGFLAGKKILITGLLSNKSIAYGIAKAMHREGAELAFTYVGQFKDRVEKLCAEFNPAAVLPCDVTSDQEIKDLFVELGKVWDGLDAIVHSIAFAPRDQLEGNFIDCVTREGFSIAHDISAYSFAALAKEGRSMMKNRNASMVALTYIGAEKAMPSYNTMGIAKASLEATVRYTALALGEDGIKVNAVSAGPIKTLAASGISNFKKMLDYNAMVSPLKKNVDIMEVGNTVAFLCSDMATGITGEVVHVDAGYHCVSMGNVL. The pKi is 7.6. (2) The pKi is 9.0. The small molecule is CC[C@H](C)CN(C[C@@H](O)[C@H](Cc1ccccc1)NC(=O)c1cccc(O)c1)S(=O)(=O)c1ccc2ncsc2c1. The target protein sequence is PQITLWQRPLVTVKIGGQLREALLDTGADNTVLEDINLPGKWKPKMIGGIGGFIKVKQYEQVLIEICGKKAIGTVLVGPTPVNIIGRDMLTQIGCTLNF. (3) The target protein sequence is MASLPVAANWTNGGTAGADIGNLSALEPGAAAGKAETEWLQLLVQAGNLSASFSPLGLAAASPAPSQSRSNITNQFVQPSWRIALWSLAYGMVVAVAVFGNLIVIWIILAHKRMRTVTNYFLVNLAFSDASMAAFNTLVNFIYALHSEWYFGANYCRFQNFFPITAVFASIYSMTAISVDRYMAIIDPLKPRLSATATKIVIGGIWILAFLLALPQCLYSKTKVMPGRTLCYVQWPEGPKQHFTYHIIVIILVYCFPLLVMGITYTIVGITLWGGEIPGDTCDKYHEQLKAKRKVVKMMIIVVVTFAICWLPYHIYFILTAIYQQLNRWKYIQQVYLASFWLAMSSTMYNPIIYCCLNKRFRAGFKRAFRWCPFIEVSSYDELELKTTRFHPTRQSSLYTVTRMESMTVVFDPSDADNTRSSRKKRVPPRDPSFNGCSRQNSKSGSTTSSFISSPYTSVDEYS. The pKi is 5.0. The drug is COc1ccccc1N1CCN(CCCCn2ncc(=O)n(C)c2=O)CC1. (4) The target protein sequence is MNLTNYTTEASVAVKPKTVTEKMLICMTLVIITTLTMLLNSAVIMAICTTRKLHQPANYLICSLAVTDLLVAVLVMPLSVMYIVMDNWRLGYFICEVWLSVDMTCCTCSILHLCVIALDRYWAITKAIEYARKRTARRAGLMILTVWTISIFISMPPLFWRSHRQVSPPPSQCTIQHDHVIYTIYSTLGAFYIPLTLILILYYRIYHAAKSLYQKRGSSRHLSNRSTDSQNSFASCKLTQTFCVSDFSTSDPTTEFEKIHTSIRIPPFDNDLDQPGERQQISSTRERKAARILGLILGAFILSWLPFFIKELIVGLSIYTVSSEVGDFLTWLGYVNSLINPLLYTSFNEDFKLAFKKLIRCREHT. The pKi is 8.2. The compound is NCCc1c[nH]c2ccc(O)cc12. 
(5) The compound is CC(=O)NCC1Cc2ccccc2N(Cc2ccccc2)C1. The target protein (P48039) has sequence MQGNGSALPNASQPVLRGDGARPSWLASALACVLIFTIVVDILGNLLVILSVYRNKKLRNAGNIFVVSLAVADLVVAIYPYPLVLMSIFNNGWNLGYLHCQVSGFLMGLSVIGSIFNITGIAINRYCYICHSLKYDKLYSSKNSLCYVLLIWLLTLAAVLPNLRAGTLQYDPRIYSCTFAQSVSSAYTIAVVVFHFLVPMIIVIFCYLRIWILVLQVRQRVKPDRKPKLKPQDFRNFVTMFVVFVLFAICWAPLNFIGLAVASDPASMVPRIPEWLFVASYYMAYFNSCLNAIIYGLLNQNFRKEYRRIIVSLCTARVFFVDSSNDVADRVKWKPSPLMTNNNVVKVDSV. The pKi is 7.2. (6) The compound is COC(=O)Nc1ccc(-c2ccnc([C@H](Cc3ccccc3)NC(=O)[C@H]3CC[C@H](CN)CC3)c2)cc1. The target protein (P04070) has sequence MWQLTSLLLFVATWGISGTPAPLDSVFSSSERAHQVLRIRKRANSFLEELRHSSLERECIEEICDFEEAKEIFQNVDDTLAFWSKHVDGDQCLVLPLEHPCASLCCGHGTCIDGIGSFSCDCRSGWEGRFCQREVSFLNCSLDNGGCTHYCLEEVGWRRCSCAPGYKLGDDLLQCHPAVKFPCGRPWKRMEKKRSHLKRDTEDQEDQVDPRLIDGKMTRRGDSPWQVVLLDSKKKLACGAVLIHPSWVLTAAHCMDESKKLLVRLGEYDLRRWEKWELDLDIKEVFVHPNYSKSTTDNDIALLHLAQPATLSQTIVPICLPDSGLAERELNQAGQETLVTGWGYHSSREKEAKRNRTFVLNFIKIPVVPHNECSEVMSNMVSENMLCAGILGDRQDACEGDSGGPMVASFHGTWFLVGLVSWGEGCGLLHNYGVYTKVSRYLDWIHGHIRDKEAPQKSWAP. The pKi is 4.5. (7) The drug is CC(C)(C)c1ccc(NC(=O)N2CCN(c3ncccc3Cl)CC2)cc1. The target protein (Q02294) has sequence MVRFGDELGGRYGGTGGGERARGGGAGGAGGPGQGGLPPGQRVLYKQSIAQRARTMALYNPIPVKQNCFTVNRSLFVFSEDNVVRKYAKRITEWPPFEYMILATIIANCIVLALEQHLPDGDKTPMSERLDDTEPYFIGIFCFEAGIKIIALGFVFHKGSYLRNGWNVMDFVVVLTEILATAGTDFDLRTLRAVRVLRPLKLVSGIPSLQVVLKSIMKAMVPLLQIGLLLFFAILMFAIIGLEFYMGKFHKACFPNSTDAEPVGDFPCGKEAPARLCDSDTECREYWPGPNFGITNFDNILFAILTVFQCITMEGWTDILYNTNDAAGNTWNWLYFIPLIIIGSFFMLNLVLGVLSGEFAKERERVENRRAFLKLRRQQQIERELNGYLEWIFKAEEVMLAEEDKNAEEKSPLDAVLKRAATKKSRNDLIHAEEGEDRFVDLCAAGSPFARASLKSGKTESSSYFRRKEKMFRFLIRRMVKAQSFYWVVLCVVALNTLCV.... The pKi is 5.0. (8) The compound is CCCN(CCc1ccc(C)cc1)[C@@H]1CCc2c(O)cccc2C1. 
The target protein (P21728) has sequence MRTLNTSAMDGTGLVVERDFSVRILTACFLSLLILSTLLGNTLVCAAVIRFRHLRSKVTNFFVISLAVSDLLVAVLVMPWKAVAEIAGFWPFGSFCNIWVAFDIMCSTASILNLCVISVDRYWAISSPFRYERKMTPKAAFILISVAWTLSVLISFIPVQLSWHKAKPTSPSDGNATSLAETIDNCDSSLSRTYAISSSVISFYIPVAIMIVTYTRIYRIAQKQIRRIAALERAAVHAKNCQTTTGNGKPVECSQPESSFKMSFKRETKVLKTLSVIMGVFVCCWLPFFILNCILPFCGSGETQPFCIDSNTFDVFVWFGWANSSLNPIIYAFNADFRKAFSTLLGCYRLCPATNNAIETVSINNNGAAMFSSHHEPRGSISKECNLVYLIPHAVGSSEDLKKEEAAGIARPLEKLSPALSVILDYDTDVSLEKIQPITQNGQHPT. The pKi is 5.3. (9) The small molecule is CCNc1cc(C#Cc2ccc(Cl)s2)nc2c1ncn2[C@H]1[C@H](O)[C@H](O)[C@]2(C(=O)NC)C[C@H]12. The target protein (Q28309) has sequence MAVNGTALLLANVTYITVEILIGLCAIVGNVLVIWVVKLNPSLQTTTFYFIVSLALADIAVGVLVMPLAIVISLGITIQFYNCLFMTCLLLIFTHASIMSLLAIAVDRYLRVKLTVRYRRVTTQRRIWLALGLCWLVSFLVGLTPMFGWNMKLTSEHQRNVTFLSCQFSSVMRMDYMVYFSFFTWILIPLVVMCAIYLDIFYVIRNKLNQNFSSSKETGAFYGREFKTAKSLFLVLFLFAFSWLPLSIINCITYFHGEVPQIILYLGILLSHANSMMNPIVYAYKIKKFKETYLLIFKTYMICQSSDSLDSSTE. The pKi is 7.3. (10) The drug is CCNC(=O)[C@H]1O[C@@H](n2cnc3c(=O)[nH]cnc32)[C@H](O)[C@@H]1O. The target protein sequence is MEPLLLLSLALFSDAMVMDEKVKSGVELDTASAICNYDAHYKDHTKYWCRGYFRDSCNIIAFTPNSSNRVALKDTGDQLIITVSCLVKEDTGWYWCGIQRDFARDDMDFTKLIVTDNREDRANGLSPGTSGNRTRSCKTSKAVQKAEGSRMSILIVCVLISGLGIIFLISHMSRGRRSQRNRGVTGKSINRNPQASQAPSMVSIPLTVLPKVPRQNGQQKALQWTGNATKTG. The pKi is 5.3.
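
For reference, the pKi labels above follow the stated definition pKi = -log10(Ki in M). The following is a minimal sketch of that conversion in both directions; the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical helpers introduced here for illustration and are not part of the dataset.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki, expressed in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # pKi labels taken from examples (2) and (3) above.
    for pki in (9.0, 5.0):
        print(f"pKi {pki} corresponds to Ki of about {pki_to_ki_nm(pki):.3g} nM")
```

Under this definition, a pKi of 9.0 corresponds to a Ki of roughly 1 nM and a pKi of 5.0 to roughly 10 µM, so each unit increase in pKi means a tenfold lower Ki, consistent with "higher means stronger inhibition".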