Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki, drug-target binding data from BindingDB based on Ki measurements. (1) The target protein sequence is MTTIKENEFLCDEEIYKSFVHLKDKICEERKKKELVNNNIDNVNFNDDDDNNYDDDGNSYSSYIKEMKKLLKVVLLKYKALKFGEFILKSKRKSNYFFSSGVLNNIVSSNIICFLLSELILKNKLSFDYLLGASYKGIPMVSLTSHFLFESKKYSNIFYLYDRKEKKEYGDKNVIVGNLDDDDKDILNLKKKTKNNQDEEKKNIIIIDDVFTCGTALTEILAKLKTYEHLKVVAFIVLLNRNEYEINENNQKIYFKDIFEKRVGIPLYSILSYKDDIQSMI. The drug is O=c1[nH]c(=O)n(CC[N@H+]2CC(O)[C@@H](CO)C2)cc1Br. The pKi is 5.9. (2) The compound is CC[C@H](C)[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccccc1)C(C)C)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(=O)O)[C@@H](C)O. The target protein (P27114) has sequence MVSRKAVAALLLVHVTAMLASQTEAFVPIFTYSELQRMQERERNRGHKKSLSVQQRSDAAAAPRPAEPTLEEENGRMQLTAPVEIGMRMNSRQLEKYRAALEAAERAVHPDAPSRPCWPAGGESGWSGEPSPT. The pKi is 8.4. (3) The target is MLLARMKPQVQPELGGADQ. The pKi is 8.5. The small molecule is CNCCC(Oc1ccccc1OC)c1ccccc1. (4) The compound is CC(C)(C)c1ccc(-n2nc(-c3cccnc3)cc2CCCCC(=O)N[C@@H](Cc2ccc(O)cc2)C(N)=O)cc1.
The target protein (P22888) has sequence MKQRFSALQLLKLLLLLQPPLPRALREALCPEPCNCVPDGALRCPGPTAGLTRLSLAYLPVKVIPSQAFRGLNEVIKIEISQIDSLERIEANAFDNLLNLSEILIQNTKNLRYIEPGAFINLPRLKYLSICNTGIRKFPDVTKVFSSESNFILEICDNLHITTIPGNAFQGMNNESVTLKLYGNGFEEVQSHAFNGTTLTSLELKENVHLEKMHNGAFRGATGPKTLDISSTKLQALPSYGLESIQRLIATSSYSLKKLPSRETFVNLLEATLTYPSHCCAFRNLPTKEQNFSHSISENFSKQCESTVRKVNNKTLYSSMLAESELSGWDYEYGFCLPKTPRCAPEPDAFNPCEDIMGYDFLRVLIWLINILAIMGNMTVLFVLLTSRYKLTVPRFLMCNLSFADFCMGLYLLLIASVDSQTKGQYYNHAIDWQTGSGCSTAGFFTVFASELSVYTLTVITLERWHTITYAIHLDQKLRLRHAILIMLGGWLFSSLIAML.... The pKi is 6.5. (5) The drug is Nc1ccc(S(=O)(=O)N(CCCNCCCN(Cc2ccccc2)S(=O)(=O)c2ccc(N)cc2)Cc2ccccc2)cc1. The target protein (P00791) has sequence MKWLLLLSLVVLSECLVKVPLVRKKSLRQNLIKNGKLKDFLKTHKHNPASKYFPEAAALIGDEPLENYLDTEYFGTIGIGTPAQDFTVIFDTGSSNLWVPSVYCSSLACSDHNQFNPDDSSTFEATSQELSITYGTGSMTGILGYDTVQVGGISDTNQIFGLSETEPGSFLYYAPFDGILGLAYPSISASGATPVFDNLWDQGLVSQDLFSVYLSSNDDSGSVVLLGGIDSSYYTGSLNWVPVSVEGYWQITLDSITMDGETIACSGGCQAIVDTGTSLLTGPTSAIANIQSDIGASENSDGEMVISCSSIDSLPDIVFTINGVQYPLSPSAYILQDDDSCTSGFEGMDVPTSSGELWILGDVFIRQYYTVFDRANNKVGLAPVA. The pKi is 5.4. (6) The compound is CCCCNC(=S)N1CC(O)[C@@H](O)[C@@H](O)C1CO. The target protein sequence is LRNATQRMFEIDYSRDSFLKDGQPFRYTSGSIHYSRVPRFYWKDRLLKMKMAGLNAIQTYVPWNFHEPWPGQYQFSEDHDVEYFLRLAHELGLLVILRPGPYICAEWEMGGLPAWLLEKESILLRSSDPDYLAAVDKWLGVLLPKMKPLLYQNGGPVITVQVENEYGSYFACDFDYLRFLQKRFRHHLGDDVVLFTTDGAHKTFLKCGALQGLYTTVDFGTGSNITDAFLSQRKCEPKGPLINSEFYTGWLDHWGQPHSTIKTEAVASSLYDILARGASVNLYMFIGGTNFAYWNGANSPYAAQPTSYDYDAPLSEAGDLTEKYFALRNIIQKFEKVPEGPIPPSTPKFAYGKVTLEKLKTVGAALDILCPSGPIKSLYPLTFIQVKQHYGFVLYRTTLPQDCSNPAPLSSPLNGVHDRAYVAVDGIPQGVLERNNVITLNITGKAGATLDLLVENMGRVNYGAYINDFKGLVSNLTLSSNILTDWTIFPLDTEDAVRSH.... The pKi is 3.5. 
(7) The target protein (Q03521) has sequence MLEQVILFTILMGFLISVLLSPILIPFLRRLKFGQSIREEGPKSHQKKSGTPTMGGVMIILSIIVTTIVMTQKFSEISPEMVLLLFVTLGYGLLGFLDDYIKVVMKRNLGLTSKQKLIGQIIIAVVFYAVYHYYNFATDIRIPGTDLSFDLGWAYFILVLFMLVGGSNAVNLTDGLDGLLSGTAAIAFGAFAILAWNQSQYDVAIFSVAVVGAVLGFLVFNAHPAKVFMGDTGSLALGGAIVTIAILTKLEILLVIIGGVFVIETLSVILQVISFKTTGKRIFKMSPLHHHYELVGWSEWRVVVTFWAAGLLLAVLGIYIEVWL. The pKi is 7.3. The compound is CC(C)C[C@@H](NC(=O)[C@@H](NC(=O)N[C@H](C(=O)O)C(C)C)[C@@H]1CCN=C(N)N1)C(=O)NCCCN[C@H](C(=O)O)[C@H](O[C@@H]1O[C@H](CN)[C@@H](O)[C@H]1O)[C@H]1O[C@@H](n2ccc(=O)[nH]c2=O)[C@H](O)[C@@H]1O. (8) The small molecule is CCCCCCCCNC(=O)Oc1cccc(OC(=O)c2ccccc2)c1. The target protein (P00602) has sequence NLYQFKNMIHCTVPSRPWWHFADYGCYCGRGGKGTAVDDLDRCCQVHDNCYGEAEKLGCWPYLTLYKYECSQGKLTCSGGNNKCEAAVCNCDLVAANCFAGAPYIDANYNVNLKERCQ. The pKi is 4.0. (9) The drug is COC1Oc2ccc(Cl)cc2C(=O)/C1=C\Nc1ccc(S(N)(=O)=O)cc1. The target protein (P21589) has sequence MCPRAARAPATLLLALGAVLWPAAGAWELTILHTNDVHSRLEQTSEDSSKCVNASRCMGGVARLFTKVQQIRRAEPNVLLLDAGDQYQGTIWFTVYKGAEVAHFMNALRYDAMALGNHEFDNGVEGLIEPLLKEAKFPILSANIKAKGPLASQISGLYLPYKVLPVGDEVVGIVGYTSKETPFLSNPGTNLVFEDEITALQPEVDKLKTLNVNKIIALGHSGFEMDKLIAQKVRGVDVVVGGHSNTFLYTGNPPSKEVPAGKYPFIVTSDDGRKVPVVQAYAFGKYLGYLKIEFDERGNVISSHGNPILLNSSIPEDPSIKADINKWRIKLDNYSTQELGKTIVYLDGSSQSCRFRECNMGNLICDAMINNNLRHTDEMFWNHVSMCILNGGGIRSPIDERNNGTITWENLAAVLPFGGTFDLVQLKGSTLKKAFEHSVHRYGQSTGEFLQVGGIHVVYDLSRKPGDRVVKLDVLCTKCRVPSYDPLKMDEVYKVILPNF.... The pKi is 7.1.
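Note on the label scale. The pKi values listed above follow the definition given at the top, pKi = -log10(Ki in M). As a minimal sketch of that conversion (the helper names ki_to_pki and pki_to_ki are hypothetical and are not part of the dataset or of any BindingDB tooling), the mapping between the two scales can be written as:

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: recover Ki in mol/L from a pKi value."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A Ki of 1 nM (1e-9 M) sits at pKi 9; each unit of pKi is a tenfold change in Ki.
    print(ki_to_pki(1e-9))
    # Example (1) above reports pKi 5.9, i.e. a Ki in the low micromolar range.
    print(pki_to_ki(5.9))
```

Because the scale is logarithmic, a difference of 1.0 in pKi corresponds to a tenfold difference in Ki, so the gap between example (1) at 5.9 and example (2) at 8.4 is a factor of roughly 300 in Ki.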