Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The drug is CC(C)(C)NC(=O)[C@@H]1CN(Cc2cccnc2)CCN1C[C@@H](O)C[C@@H](Cc1ccccc1)C(=O)N[C@H]1c2ccccc2C[C@H]1O. The target protein sequence is PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTPFNIIGRNLLTQIGCTLNF. The pKi is 7.9. (2) The small molecule is CC(C)(C)c1cc(NC(=O)Nc2ccc(Oc3ccc4cccnc4c3)cc2)n(-c2ccc3ncccc3c2)n1. The target protein sequence is MLEICLKLVGCKSKKGLSSSSSCYLEEALQRPVASDFEPQGLSEAARWNSKENLLAGPSENDPNLFVALYDFVASGDNTLSITKGEKLRVLGYNHNGEWCEAQTKNGQGWVPSNYITPVNSLEKHSWYHGPVSRNAAEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQFGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIITEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAFETMFQES.... The pKi is 8.6. (3) The small molecule is CCCCCCCCCCCCCCCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1ccccc1)C(=O)O. The pKi is 4.1. The target protein (P30656) has sequence MQAIADSFSVPNRLVKELQYDNEQNLESDFVTGASQFQRLAPSLTVPPIASPQQFLRAHTDDSRNPDCKIKIAHGTTTLAFRFQGGIIVAVDSRATAGNWVASQTVKKVIEINPFLLGTMAGGAADCQFWETWLGSQCRLHELREKERISVAAASKILSNLVYQYKGAGLSMGTMICGYTRKEGPTIYYVDSDGTRLKGDIFCVGSGQTFAYGVLDSNYKWDLSVEDALYLGKRSILAAAHRDAYSGGSVNLYHVTEDGWIYHGNHDVGELFWKVKEEEGSFNNVIG. (4) The drug is CNCCC(Oc1ccccc1C)c1ccccc1. 
The target protein (P04774) has sequence MEQTVLVPPGPDSFNFFTRESLAAIERRIAEEKAKNPKPDKKDDDENGPKPNSDLEAGKNLPFIYGDIPPEMVSEPLEDLDPYYINKKTFIVLNKGKAIFRFSATSALYILTPFNPLRKIAIKILVHSLFSMLIMCTILTNCVFMTMSNPPDWTKNVEYTFTGIYTFESLIKIIARGFCLEDFTFLRDPWNWLDFTVITFAYVTEFVDLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLRNKCVQWPPTNASLEEHSIEKNVTTDYNGTLVNETVFEFDWKSYIQDSRYHYFLEGVLDALLCGNSSDAGQCPEGYMCVKAGRNPNYGYTSFDTFSWAFLSLFRLMTQDFWENLYQLTLRAAGKTYMIFFVLVIFLGSFYLINLILAVVAMAYEEQNQATLEEAEQKEAEFQQMLEQLKKQQEAAQQAAAATASEHSREPSAAGRLSDSSSEASKLSSKSAKERRNR.... The pKi is 6.0. (5) The compound is Oc1ccccc1. The target protein sequence is MNWLVAALAVCVLVPSANCASDSVAWCYHQPSCNDTTWPTIAAKYCNGTRQSPINIVSASAEPNANLTEFTFQNYGDTSILKKILNTGKTVQVSLGSGVSISGGDLSEAYDSLQFHLHWGKGSSIPGSDGKRYPMELHIVNSKSTFNGNTTLAVKDSTGLAALGFFIEETSGNETQQPASWNTLTSYLANITNSGDSVSIAPGISLDDLLVGVDRTKYYRYLGSLTTPQLQEAVVWTVFKDSIKVSKDLIDLFSTTVHVSNTSSPLMTNVFRNVQPAQPVTTQAASSSATSKTCYSLGLMALSLALGRS. The pKi is 4.9. (6) The small molecule is CCNC(=O)[C@H]1CCCN1C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CO)N(C)C(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@@H]1CCC(=O)N1. The target protein (P16235) has sequence MGRRVPALRQLLVLAVLLLKPSQLQSRELSGSRCPEPCDCAPDGALRCPGPRAGLARLSLTYLPVKVIPSQAFRGLNEVVKIEISQSDSLERIEANAFDNLLNLSELLIQNTKNLLYIEPGAFTNLPRLKYLSICNTGIRTLPDVTKISSSEFNFILEICDNLHITTIPGNAFQGMNNESVTLKLYGNGFEEVQSHAFNGTTLISLELKENIYLEKMHSGAFQGATGPSILDISSTKLQALPSHGLESIQTLIALSSYSLKTLPSKEKFTSLLVATLTYPSHCCAFRNLPKKEQNFSFSIFENFSKQCESTVRKADNETLYSAIFEENELSGWDYDYGFCSPKTLQCAPEPDAFNPCEDIMGYAFLRVLIWLINILAIFGNLTVLFVLLTSRYKLTVPRFLMCNLSFADFCMGLYLLLIASVDSQTKGQYYNHAIDWQTGSGCGAAGFFTVFASELSVYTLTVITLERWHTITYAVQLDQKLRLRHAIPIMLGGWLFSTL.... The pKi is 9.8. (7) The pKi is 6.0. The drug is CCCCC1CCN(CCCC(=O)c2ccccc2C)CC1. 
The target protein (Q13639) has sequence MDKLDANVSSEEGFGSVEKVVLLTFLSTVILMAILGNLLVMVAVCWDRQLRKIKTNYFIVSLAFADLLVSVLVMPFGAIELVQDIWIYGEVFCLVRTSLDVLLTTASIFHLCCISLDRYYAICCQPLVYRNKMTPLRIALMLGGCWVIPTFISFLPIMQGWNNIGIIDLIEKRKFNQNSNSTYCVFMVNKPYAITCSVVAFYIPFLLMVLAYYRIYVTAKEHAHQIQMLQRAGASSESRPQSADQHSTHRMRTETKAAKTLCIIMGCFCLCWAPFFVTNIVDPFIDYTVPGQVWTAFLWLGYINSGLNPFLYAFLNKSFRRAFLIILCCDDERYRRPSILGQTVPCSTTTINGSTHVLRDAVECGGQWESQCHPPATSPLVAAQPSDT. (8) The compound is Clc1cccc(OC[C@@H]2CN(CCN3CCc4ccccc43)CCO2)c1. The target protein (P97717) has sequence MNPDLDTGHNTSAPAHWGELKDANFTGPNQTSSNSTLPQLDVTRAISVGCLGAFILFAIVGNILVILSVACNRHLRTPTNYFIVNLAIADLLLSFTDLPFSATLEVLGYWVLGRIFCDIWAAVDVLCCTASILSLCAISIDRYIGVRYSLQYPTLVTRRKAILALLSVWVLSTVISIGPLLGWKEPAPNDDKECGVTEEPFYALFSSLGSFYIPLAVILVMYCRVYIVAKRTTKNLEAGVMKEMSNSKELTLRIHSKNFHEDTLSSTKAKGHNPRSSIAVKLFKFSREKKAAKTLGIVVGMFILCWLPFFIALPLGSLFSTLKPPDAVFKVVFWLGYFNSCLNPIIYPCSSKEFKRAFMRILGCQCRGGRRRRRRRRLGACAYTYRPWTRGGSLERSQSRKDSLDDSGSCMSGSQRTLPSASPSPGYLGRGTQPPVELCAFPEWKPGALLSLPEPPGRRGRLDSGPLFTFKLLGEPESPGTEGDASNGGCDTTTDLANGQ.... The pKi is 5.2. (9) The compound is CN1C[C@H](C(=O)N[C@]2(C)O[C@@]3(O)[C@@H]4CCCN4C(=O)[C@H](Cc4ccccc4)N3C2=O)C=C2c3cccc4[nH]cc(c34)C[C@H]21. The target protein (P30940) has sequence MDFLNSSDQNLTSEELLNRMPSKILVSLTLSGLALMTTTINCLVITAIIVTRKLHHPANYLICSLAVTDFLVAVLVMPFSIVYIVRESWIMGQGLCDLWLSVDIICCTCSILHLSAIALDRYRAITDAVEYARKRTPRHAGITITTVWVISVFISVPPLFWRHQGNSRDDQCIIKHDHIVSTIYSTFGAFYIPLVLILILYYKIYRAARTLYHKRQASRMIKEELNGQVLLESGEKSIKLVSTSYMLEKSLSDPSTDFDRIHSTVKSPRSELKHEKSWRRQKISGTRERKAATTLGLILGAFVICWLPFFVKELVVNICEKCKISEEMSNFLAWLGYLNSLINPLIYTIFNEDFKKAFQKLVRCRN. The pKi is 7.3.
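
The pKi definition stated in the task description (pKi = -log10(Ki in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical, and the code assumes Ki is supplied in mol/L.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (assumed to be in mol/L) to pKi."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pki)


if __name__ == "__main__":
    # A Ki of 1 micromolar (1e-6 M), chosen here only as an illustration.
    print(ki_to_pki(1e-6))
    # The reverse direction, reported in nM.
    print(pki_to_ki_nm(6.0))
```

Under this definition, a pKi of 6.0 (as in example 4) corresponds to a Ki of about 1 µM, and each one-unit increase in pKi means a tenfold lower Ki, that is, tighter binding.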