Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is COc1cc(Cc2cnc(N)nc2N)c2cc(Cc3c(C(=O)N(C)C)[nH]c4ccc(Cl)cc34)oc2c1OC. The target protein sequence is MTLSILVAHDLQRVIGFENQLPWHLPNDLKHVKKLSTGHTLVMGRKTFESIGKPLPNRRNVVLTSDTSFNVEGVDVIHSIEDIYQLPGHVFIFGGQTLYEEMIDKVDDMYITVIEGKFRGDTFFPPYTFEDWEVASSVEGKLDEKNTIPHTFLHLIRKK. The pKi is 7.0. (2) The compound is CN1C[C@@H](COc2ccc(S(=O)(=O)Nc3cccc(CC(=O)O)c3)cc2)Oc2ccccc21. The target protein (P70263) has sequence MNESYRCQTSTWVERGSSATMGAVLFGAGLLGNLLALVLLARSGLGSCRPGPLHPPPSVFYVLVCGLTVTDLLGKCLISPMVLAAYAQNQSLKELLPASGNQLCETFAFLMSFFGLASTLQLLAMAVECWLSLGHPFFYQRHVTLRRGVLVAPVVAAFCLAFCALPFAGFGKFVQYCPGTWCFIQMIHKERSFSVIGFSVLYSSLMALLVLATVVCNLGAMYNLYDMHRRQRHYPHRCSRDRAQSGSDYRHGSLHPLEELDHFVLLALMTVLFTMCSLPLIYRAYYGAFKLENKAEGDSEDLQALRFLSVISIVDPWIFIIFRTSVFRMLFHKVFTRPLIYRNWSSHSQQSNVESTL. The pKi is 6.4. (3) The compound is CSCC[C@H](NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H]1CCCN1C(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CS)NC(=O)[C@H](CO)NC(=O)[C@H](C)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CCCC[C@@H]1SC[C@@H]2NC(=O)N[C@H]12)[C@@H](C)O)C(C)C)C(=O)O. The target protein sequence is NENEPREADKSHPEQRELRPRLCTMKKGPSGYGFNLHSDKSKPGQFIRSVDPDSPAEASGLRAQDRIVEVNGVCMEGKQHGDVVSAIRAGGDETKLLVVDRETDEFFKKCRVIPSQEHLNGPLPVPFTNGEIQKENSREALAEAALESPRPALVRSASSDTSEELNSQ. The pKi is 5.2. (4) The drug is CC1(C)S[C@@H]2[C@H](NC(=O)[C@H](N)c3ccccc3)C(=O)N2[C@H]1C(=O)O. 
The target protein (Q16348) has sequence MNPFQKNESKETLFSPVSIEEVPPRPPSPPKKPSPTICGSNYPLSIAFIVVNEFCERFSYYGMKAVLILYFLYFLHWNEDTSTSIYHAFSSLCYFTPILGAAIADSWLGKFKTIIYLSLVYVLGHVIKSLGALPILGGQVVHTVLSLIGLSLIALGTGGIKPCVAAFGGDQFEEKHAEERTRYFSVFYLSINAGSLISTFITPMLRGDVQCFGEDCYALAFGVPGLLMVIALVVFAMGSKIYNKPPPEGNIVAQVFKCIWFAISNRFKNRSGDIPKRQHWLDWAAEKYPKQLIMDVKALTRVLFLYIPLPMFWALLDQQGSRWTLQAIRMNRNLGFFVLQPDQMQVLNPLLVLIFIPLFDFVIYRLVSKCGINFSSLRKMAVGMILACLAFAVAAAVEIKINEMAPAQPGPQEVFLQVLNLADDEVKVTVVGNENNSLLIESIKSFQKTPHYSKLHLKTKSQDFHFHLKYHNLSLYTEHSVQEKNWYSLVIREDGNSISS.... The pKi is 2.9. (5) The compound is O=C(O)[C@H]1N[C@@H]1C(=O)O. The target protein sequence is ASQDSFRIEYDTFGELKVPNDKYYGAQTVRSTMNFKIGGVTERMPIPVLKAFGILKRAAAEVNQDYGLDPKIANAIMKAADEVAEGKLNDHFPLVVWQTGSGTQTNMNVNEVISNRAIEMLGGELGSKKPVHPNDHVNKSQSSNDTFPTAMHIAAAVEVHEALLPGLQKLHDALDAKSREFAQIIKIGRTHTQDAVPLTLGQEFSGYVQQVKYAITRIKAAMPRIYELAAGGTAVGTGLNTRIGFAEKVAAKVAALTGLPFVTAPNNFEALAAHDALVEHSGAMNTTACSLMKIANDIRFLGSGPRSGLGELILPENEPGSSIMPGKVNPTQCEALTMVAAQVMGNHVAVTVGGSNGHFELNVFKPMMIKNVLHSARLLGDAAVSFTENCVVGIQANTERINKLMNESLMLVTALNPHIGYDKAAKIAKTAHKNGSTLKATAVELGYLTAEQFDEWVKPRDMLGPK. The pKi is 7.1. (6) The small molecule is CC(C)Oc1ccccc1N1CCN(Cc2nc(CN3CCCCC3=O)cs2)CC1. The target protein (P35348) has sequence MVFLSGNASDSSNCTQPPAPVNISKAILLGVILGGLILFGVLGNILVILSVACHRHLHSVTHYYIVNLAVADLLLTSTVLPFSAIFEVLGYWAFGRVFCNIWAAVDVLCCTASIMGLCIISIDRYIGVSYPLRYPTIVTQRRGLMALLCVWALSLVISIGPLFGWRQPAPEDETICQINEEPGYVLFSALGSFYLPLAIILVMYCRVYVVAKRESRGLKSGLKTDKSDSEQVTLRIHRKNAPAGGSGMASAKTKTHFSVRLLKFSREKKAAKTLGIVVGCFVLCWLPFFLVMPIGSFFPDFKPSETVFKIVFWLGYLNSCINPIIYPCSSQEFKKAFQNVLRIQCLCRKQSSKHALGYTLHPPSQAVEGQHKDMVRIPVGSRETFYRISKTDGVCEWKFFSSMPRGSARITVSKDQSSCTTARVRSKSFLQVCCCVGPSTPSLDKNHQVPTIKVHTISLSENGEEV. The pKi is 8.9. (7) The small molecule is COc1ccc(N2CCN(CC[C@@H]3OCCc4cc(C(N)=O)ccc43)CC2)cc1. 
The target protein (Q9N2B7) has sequence MEEPGAQCAPPPAGSETWVPQANLSSAPSQNCSAKDYIYQDSIALPWKVLLVMLLALITLATTLSNAFVIATVYRTRKLHTPANYLIASLAVTDLLVSILVMPISTMYTVTGRWTLGQVVCDFWLSSDITCCTASILHLCVIALDRYWAITDAVEYSAKRTPKRAAVMIALVWVFSISISLPPFFWRQAKAEEEVSECVVNTDHILYTVYSTVGAFYFPTLLLIALYGRIYVEARSRILKQTPNRTGKRLTRAQLITDSPGSTSSVTSINSRVPDVPSESGSPVYVNQVKVRVSDALLEKKKLMAARERKATKTLGIILGAFIVCWLPFFIISLVMPICKDACWFHLAIFDFFTWLGYLNSLINPIIYTMSNEDFKQAFHKLIRFKCTS. The pKi is 5.5. (8) The small molecule is COc1cc(CCC(C)(C)N2Cc3cccc(N4CCNCC4)c3C2)ccc1O. The target protein (Q5BJF2) has sequence MGAPATRRCVEWLLGLYFLSHIPITLFMDLQAVLPRELYPVEFRNLLKWYAKEFKDPLLQEPPAWFKSFLFCELVFQLPFFPIATYAFLKGSCKWIRTPAIIYSVHTMTTLIPILSTFLFEDFSKASGFKGQRPETLHERLTLVSVYAPYLLIPFILLIFMLRSPYYKYEEKRKKK. The pKi is 5.0. (9) The small molecule is CCOC(=O)c1ncn2c1CN(C)C(=O)c1cc(N=[N+]=[N-])ccc1-2. The target protein (P34903) has sequence MIITQTSHCYMTSLGILFLINILPGTTGQGESRRQEPGDFVKQDIGGLSPKHAPDIPDDSTDNITIFTRILDRLLDGYDNRLRPGLGDAVTEVKTDIYVTSFGPVSDTDMEYTIDVFFRQTWHDERLKFDGPMKILPLNNLLASKIWTPDTFFHNGKKSVAHNMTTPNKLLRLVDNGTLLYTMRLTIHAECPMHLEDFPMDVHACPLKFGSYAYTTAEVVYSWTLGKNKSVEVAQDGSRLNQYDLLGHVVGTEIIRSSTGEYVVMTTHFHLKRKIGYFVIQTYLPCIMTVILSQVSFWLNRESVPARTVFGVTTVLTMTTLSISARNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRSWAWEGKKVPEALEMKKKTPAAPAKKTSTTFNIVGTTYPINLAKDTEFSTISKGAAPSASSTPTIIASPKATYVQDSPTETKTYNSVSKVDKISRIIFPVLFAIFNLVYWATYVNRESAIKGMIRKQ. The pKi is 8.1. (10) The drug is CC(N)(CCCNCCCN)C(=O)O. The target protein (P27117) has sequence MNSFSNEEFDCHFLDEGFTAKDILDQKINEVSYSDDKDAFYVADLGDILKKHLRWLKALPRVTPFYAVKCNDSRTIVKTLAAIGTGFDCASKTEIQLVQSLGVPPERIIYANPCKQVSQIKYAANNGVQMMTFDSEVELMKVARAHPKAKLVLRIATDDSKAVCRLSVKFGATLKTSRLLLERAKELDIDVIGVSFHVGSGCTDPETFVQAISDARCVFDMGAEVGFNMYLLDIGGGFPGSEDVKLKFEEITSVINPALDKYFPSDSGVRIIAEPGRYYVASAFTLAVNIIAKKLVLKEQTGSDDEEESTDRTFMYYVNDGVYGSFNCILYDHAHVKPLLQKRPKPDEKYYSSSIWGPTCDGLDRIVERCNLPEMHVGDWMLFENMGAYTVAAASTFNGFQRPTIYYVMSGPTWQLMQQIRTQDFPPGVEEPDVGPLPVSCAWESGMKRHSAACASTRINV. The pKi is 2.4.
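
The pKi definition used above (pKi = -log10(Ki in M)) can be sketched in a few lines of Python. This is a minimal illustration, not the dataset's own preprocessing code: the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical, and supplying raw Ki values in nanomolar is an assumption about the input, so the unit factor should be adjusted if the measurements use another unit.

```python
import math

NM_PER_M = 1e9  # nanomolar per molar


def ki_to_pki(ki_nm: float) -> float:
    """Convert a Ki given in nM (assumed unit) to pKi = -log10(Ki in M)."""
    if ki_nm <= 0:
        raise ValueError("Ki must be a positive concentration")
    ki_molar = ki_nm / NM_PER_M  # nM -> M
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the transform: recover Ki in nM from a pKi value."""
    return (10.0 ** (-pki)) * NM_PER_M


if __name__ == "__main__":
    # pKi values copied from examples (1), (6) and (10) above.
    for pki in (7.0, 8.9, 2.4):
        print(f"pKi {pki:.1f} corresponds to Ki of about {pki_to_ki_nm(pki):.3g} nM")
```

Because the scale is logarithmic, each unit of pKi is a tenfold change in Ki: a pKi of 7.0 corresponds to a Ki of about 100 nM, and a pKi of 8.0 to about 10 nM.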