Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. From a dataset of Drug-target binding data from BindingDB using Ki measurements. (1) The small molecule is CN1CCC(=O)N(CC(=O)NCc2cc3cc(C(=O)N4CCC(N5C(=O)OCc6ccccc65)CC4)ccc3o2)C1=O. The target protein (P30559) has sequence MEGALAANWSAEAANASAAPPGAEGNRTAGPPRRNEALARVEVAVLCLILLLALSGNACVLLALRTTRQKHSRLFFFMKHLSIADLVVAVFQVLPQLLWDITFRFYGPDLLCRLVKYLQVVGMFASTYLLLLMSLDRCLAICQPLRSLRRRTDRLAVLATWLGCLVASAPQVHIFSLREVADGVFDCWAVFIQPWGPKAYITWITLAVYIVPVIVLAACYGLISFKIWQNLRLKTAAAAAAEAPEGAAAGDGGRVALARVSSVKLISKAKIRTVKMTFIIVLAFIVCWTPFFFVQMWSVWDANAPKEASAFIIVMLLASLNSCCNPWIYMLFTGHLFHELVQRFLCCSASYLKGRRLGETSASKKSNSSSFVLSHRSSSQRSCSQPSTA. The pKi is 8.4. (2) The compound is O=C(O)C1=CCNCC1. The target protein (A8MPY1) has sequence MVLAFQLVSFTYIWIILKPNVCAASNIKMTHQRCSSSMKQTCKQETRMKKDDSTKARPQKYEQLLHIEDNDFAMRPGFGGSPVPVGIDVHVESIDSISETNMDFTMTFYLRHYWKDERLSFPSTANKSMTFDHRLTRKIWVPDIFFVHSKRSFIHDTTMENIMLRVHPDGNVLLSLRITVSAMCFMDFSRFPLDTQNCSLELESYAYNEDDLMLYWKHGNKSLNTEEHMSLSQFFIEDFSASSGLAFYSSTGWYNRLFINFVLRRHVFFFVLQTYFPAILMVMLSWVSFWIDRRAVPARVSLGITTVLTMSTIITAVSASMPQVSYLKAVDVYLWVSSLFVFLSVIEYAAVNYLTTVEERKQFKKTGKISRMYNIDAVQAMAFDGCYHDSEIDMDQTSLSLNSEDFMRRKSICSPSTDSSRIKRRKSLGGHVGRIILENNHVIDTYSRILFPIVYILFNLFYWGVYV. The pKi is 5.0. (3) The small molecule is N[C@H](CCS(=O)(=O)O)C(=O)O. 
The target protein (P35368) has sequence MNPDLDTGHNTSAPAHWGELKNANFTGPNQTSSNSTLPQLDITRAISVGLVLGAFILFAIVGNILVILSVACNRHLRTPTNYFIVNLAMADLLLSFTVLPFSAALEVLGYWVLGRIFCDIWAAVDVLCCTASILSLCAISIDRYIGVRYSLQYPTLVTRRKAILALLSVWVLSTVISIGPLLGWKEPAPNDDKECGVTEEPFYALFSSLGSFYIPLAVILVMYCRVYIVAKRTTKNLEAGVMKEMSNSKELTLRIHSKNFHEDTLSSTKAKGHNPRSSIAVKLFKFSREKKAAKTLGIVVGMFILCWLPFFIALPLGSLFSTLKPPDAVFKVVFWLGYFNSCLNPIIYPCSSKEFKRAFVRILGCQCRGRGRRRRRRRRRLGGCAYTYRPWTRGGSLERSQSRKDSLDDSGSCLSGSQRTLPSASPSPGYLGRGAPPPVELCAFPEWKAPGALLSLPAPEPPGRRGRHDSGPLFTFKLLTEPESPGTDGGASNGGCEAAA.... The pKi is 5.0. (4) The small molecule is COc1cc(OC)c(C(=O)CCCCN2CCC3(CC2)NC(=O)NC3=O)cc1NS(=O)(=O)c1ccc(C(F)(F)F)cc1. The target protein (P23979) has sequence MRLCIPQVLLALFLSMLTAPGEGSRRRATQEDTTQPALLRLSDHLLANYKKGVRPVRDWRKPTTVSIDVIMYAILNVDEKNQVLTTYIWYRQYWTDEFLQWTPEDFDNVTKLSIPTDSIWVPDILINEFVDVGKSPNIPYVYVHHRGEVQNYKPLQLVTACSLDIYNFPFDVQNCSLTFTSWLHTIQDINITLWRSPEEVRSDKSIFINQGEWELLEVFPQFKEFSIDISNSYAEMKFYVIIRRRPLFYAVSLLLPSIFLMVVDIVGFCLPPDSGERVSFKITLLLGYSVFLIIVSDTLPATIGTPLIGVYFVVCMALLVISLAETIFIVRLVHKQDLQRPVPDWLRHLVLDRIAWILCLGEQPMAHRPPATFQANKTDDCSGSDLLPAMGNHCSHVGGPQDLEKTPRGRGSPLPPPREASLAVRGLLQELSSIRHFLEKRDEMREVARDWLRVGYVLDRLLFRIYLLAVLAYSITLVTLWSIWHYS. The pKi is 5.0. (5) The compound is Cc1cn(Cc2cn([C@@H]3C(O)[C@H](n4cnc5c(NC(=O)c6ccccc6)ncnc54)O[C@@H]3CO)nn2)c(=O)[nH]c1=O. The target protein (P61823) has sequence MALKSLVLLSLLVLVLLLVRVQPSLGKETAAAKFERQHMDSSTSAASSSNYCNQMMKSRNLTKDRCKPVNTFVHESLADVQAVCSQKNVACKNGQTNCYQSYSTMSITDCRETGSSKYPNCAYKTTQANKHIIVACEGNPYVPVHFDASV. The pKi is 3.8. 
(6) The target protein (P48066) has sequence MTAEKALPLGNGKAAEEARESEAPGGGCSSGGAAPARHPRVKRDKAVHERGHWNNKVEFVLSVAGEIIGLGNVWRFPYLCYKNGGGAFLIPYVVFFICCGIPVFFLETALGQFTSEGGITCWRKVCPLFEGIGYATQVIEAHLNVYYIIILAWAIFYLSNCFTTELPWATCGHEWNTENCVEFQKLNVSNYSHVSLQNATSPVMEFWEHRVLAISDGIEHIGNLRWELALCLLAAWTICYFCIWKGTKSTGKVVYVTATFPYIMLLILLIRGVTLPGASEGIKFYLYPDLSRLSDPQVWVDAGTQIFFSYAICLGCLTALGSYNNYNNNCYRDCIMLCCLNSGTSFVAGFAIFSVLGFMAYEQGVPIAEVAESGPGLAFIAYPKAVTMMPLSPLWATLFFMMLIFLGLDSQFVCVESLVTAVVDMYPKVFRRGYRRELLILALSVISYFLGLVMLTEGGMYIFQLFDSYAASGMCLLFVAIFECICIGWVYGSNRFYDNI.... The small molecule is O=C(NC1CCN(Cc2ccccc2)CC1)c1ccc2ccccc2c1. The pKi is 4.0. (7) The small molecule is N=c1nc(N)nc(N)c2c1N(COCCO)CN2. The target protein (Q9Y2T3) has sequence MCAAQMPPLAHIFRGTFVHSTWTCPMEVLRDHLLGVSDSGKIVFLEEASQQEKLAKEWCFKPCEIRELSHHEFFMPGLVDTHIHASQYSFAGSSIDLPLLEWLTKYTFPAEHRFQNIDFAEEVYTRVVRRTLKNGTTTACYFATIHTDSSLLLADITDKFGQRAFVGKVCMDLNDTFPEYKETTEESIKETERFVSEMLQKNYSRVKPIVTPRFSLSCSETLMGELGNIAKTRDLHIQSHISENRDEVEAVKNLYPSYKNYTSVYDKNNLLTNKTVMAHGCYLSAEELNVFHERGASIAHCPNSNLSLSSGFLNVLEVLKHEVKIGLGTDVAGGYSYSMLDAIRRAVMVSNILLINKVNEKSLTLKEVFRLATLGGSQALGLDGEIGNFEVGKEFDAILINPKASDSPIDLFYGDFFGDISEAVIQKFLYLGDDRNIEEVYVGGKQVVPFSSSV. The pKi is 4.5. (8) The small molecule is O[C@@H]1CCCC[C@H]1N1CCC(c2ccncc2)CC1. The target protein (Q62666) has sequence MEPTAPTGQARAAATKLSEAVGAALQEPQRQRRLVLVIVCVALLLDNMLYMVIVPIVPDYIAHMRGGSEGPTLVSEVWEPTLPPPTLANASAYLANTSASPTAAGSARSILRPRYPTESEDVKIGVLFASKAILQLLVNPLSGPFIDRMSYDVPLLIGLGVMFASTVMFAFAEDYATLFAARSLQGLGSAFADTSGIAMIADKYPEEPERSRALGVALAFISFGSLVAPPFGGILYEFAGKRVPFLVLAAVSLFDALLLLAVAKPFSAAARARANLPVGTPIHRLMLDPYIAVVAGALTTCNIPLAFLEPTIATWMKHTMAASEWEMGMVWLPAFVPHVLGVYLTVRLAARYPHLQWLYGALGLAVIGVSSCVVPACRSFAPLVVSLCGLCFGIALVDTALLPTLAFLVDVRHVSVYGSVYAIADISYSVAYALGPIVAGHIVHSLGFEQLSLGMGLANLLYAPVLLLLRNVGLLTRSRSERDVLLDEPPQGLYDAVRLR.... The pKi is 5.5. (9) The pKi is 5.0. 
The target protein sequence is MVRLLLLFFPAVFLEMSLFPRGPGGKVLLAGASSQRSVARMDGDVIIGALFSVHHQPPAEKVPERKCGEIREQYGIQRVEAMFHTLDKINADPVLLPNITLGSEIRDSCWHSSVALEQSIEFIRDSLISIRDEKDGLNRCLPDGQTLPPGRTKKPIAGVIGPGSSSVAIQVQNLLQLFDIPQIAYSATSIDLSDKTLYKYFLRVVPSDTLQARAMLDIVKRYNWTYVSAVHTEGNYGESGMDAFKELAAQEGLCIAHSDKIYSNAGEKSFDRLLRKLRERLPKARVVVCFCEGMTVRGLLSAMRRLGVVGEFSLIGSDGWADRDEVIEGYEVEANGGITIKLQSPEVRSFDDYFLKLRLDTNTRNPWFPEFWQHRFQCRLPGHILENPNFKRICTGNESLEENYVQDSKMGFVINAIYAMAHGLQNMHHALCPGHVGLCDAMKPIDGSKLLDFLIKSTFIGVSGEEVWFDEKGDAPGRYDIMNLQYTEANRYDYVHVGTW.... The compound is N[C@@]1(C(=O)O)CC[C@H](C(=O)O)C1.
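The pKi definition stated in the task description (pKi = -log10(Ki in M)) can be sketched in a few lines of Python. This is a minimal illustration only: the function names ki_to_pki and pki_to_ki_nanomolar are hypothetical helpers, not part of BindingDB or any dataset tooling, and the sketch assumes Ki is given in mol/L.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nanomolar(pki: float) -> float:
    """Invert the definition: Ki = 10**(-pKi) mol/L, returned in nmol/L."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # pKi values taken from the examples above: 8.4, 5.0, and 3.8.
    for pki in (8.4, 5.0, 3.8):
        print(f"pKi {pki} -> Ki of about {pki_to_ki_nanomolar(pki):.3g} nM")
```

Under this definition, the pKi of 8.4 in example (1) corresponds to a Ki of roughly 4 nM, while a pKi of 5.0 corresponds to 10 µM, which is why a higher pKi means stronger inhibition.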