From a dataset of Drug-target binding data from BindingDB using Ki measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is CNCCc1c[nH]c2ccc(O)cc12. The target protein (P31388) has sequence MVPEPGPVNSSTPAWGPGPPPAPGGSGWVAAALCVVIVLTAAANSLLIVLICTQPALRNTSNFFLVSLFTSDLMVGLVVMPPAMLNALYGRWVLARGLCLLWTAFDVMCCSASILNLCLISLDRYLLILSPLRYKLRMTAPRALALILGAWSLAALASFLPLLLGWHELGKARTPAPGQCRLLASLPFVLVASGVTFFLPSGAICFTYCRILLAARKQAVQVASLTTGTAGQALETLQVPRTPRPGMESADSRRLATKHSRKALKASLTLGILLGMFFVTWLPFFVANIAQAVCDCISPGLFDVLTWLGYCNSTMNPIIYPLFMRDFKRALGRFLPCVHCPPEHRPALPPPPCGPLTAVPDQASACSRCCLCLCRQTQIQTPLQGAPRACSSQPSFCCLERPPGTPRHPPGPPLWSTSLSQTLWSLRYGRIHSVPP. The pKi is 7.0. (2) The compound is O=P(O)(O)OP(=O)(O)OP(=O)(O)OC[C@H]1O[C@@H](n2ccc3ccccc32)C[C@@H]1O. The target protein (P06710) has sequence MSYQVLARKWRPQTFADVVGQEHVLTALANGLSLGRIHHAYLFSGTRGVGKTSIARLLAKGLNCETGITATPCGVCDNCREIEQGRFVDLIEIDAASRTKVEDTRDLLDNVQYAPARGRFKVYLIDEVHMLSRHSFNALLKTLEEPPEHVKFLLATTDPQKLPVTILSRCLQFHLKALDVEQIRHQLEHILNEEHIAHEPRALQLLARAAEGSLRDALSLTDQAIASGDGQVSTQAVSAMLGTLDDDQALSLVEAMVEANGERVMALINEAAARGIEWEALLVEMLGLLHRIAMVQLSPAALGNDMAAIELRMRELARTIPPTDIQLYYQTLLIGRKELPYAPDRRMGVEMTLLRALAFHPRMPLPEPEVPRQSFAPVAPTAVMTPTQVPPQPQSAPQQAPTVPLPETTSQVLAARQQLQRVQGATKAKKSEPAAATRARPVNNAALERLASVTDRVQARPVPSALEKAPAKKEAYRWKATTPVMQQKEVVATPKALKKA.... The pKi is 4.2. (3) The compound is COc1ccc(C#CCN(C)C)cc1OC. The pKi is 5.5. 
The target protein (O08590) has sequence MTQKTTLVLLALAVITIFALVCVLLAGRSGDGGRLSQPLHCPSVLPSVQPQTHPGQSQPFADLSPEELTAVMSFLIKHLGPGLVDAAQARPSDNCVFSVELQLPAKAAALAHLDRGGPPPVREALAIIFFGGQPKPNVSELVVGPLPHPSYMRDVTVERHGGPLPYYRRPVLTREYQDIQEMIFHRELPQASGLLHHCCFYKRQGHNLLKMTTAPRGLQSGDRATWFGIYYNLSGAGFYPHPIGLELLVDHKALDPALWTIQKVFYQGRYYESLTQLEDMFEAGLVNVVLVPDNGTGGSWSLKSSVPPGRAPPLQFHPEGPRFSVQGSQVRSSLWAFSFGLGAFSGPRIFDIRFQGERVAYEISVQEAIALYGGNSPASMSTCYMDGSFGIGKYSTPLTRGVDCPYLATYVDWHFLLESQTPKTLRDAFCVFEQNQGLPLRRHHSDFYSHYFGGVVETVLVVRSVATLLNYDYVWDMVFHSNGAIEVKFHATGYITSAFF.... (4) The small molecule is CN[C@@H](C)C(=O)N[C@H](C(=O)N1CCC[C@H]1c1nc(-c2ccc(F)c3ccccc23)cs1)C1CCOCC1. The target protein (Q96CA5) has sequence MGPKDSAKCLHRGPQPSHWAAGDGPTQERCGPRSLGSPVLGLDTCRAWDHVDGQILGQLRPLTEEEEEEGAGATLSRGPAFPGMGSEELRLASFYDWPLTAEVPPELLAAAGFFHTGHQDKVRCFFCYGGLQSWKRGDDPWTEHAKWFPSCQFLLRSKGRDFVHSVQETHSQLLGSWDPWEEPEDAAPVAPSVPASGYPELPTPRREVQSESAQEPGGVSPAEAQRAWWVLEPPGARDVEAQLRRLQEERTCKVCLDRAVSIVFVPCGHLVCAECAPGLQLCPICRAPVRSRVRTFLS. The pKi is 4.4. (5) The compound is O=C([O-])[C@]1(O)C=C(OCc2cccs2)[C@@H](O)[C@H](O)C1. The target protein (Q48255) has sequence MKILVIQGPNLNMLGHRDPRLYGMVTLDQIHEIMQTFVKQGNLDVELEFFQTNFEGEIIDKIQESVGSDYEGIIINPGAFSHTSIAIADAIMLAGKPVIEVHLTNIQAREEFRKNSYTGAACGGVIMGFGPLGYNMALMAMVNILAEMKAFQEAQKNNPNNPINNQK. The pKi is 6.0. (6) The small molecule is N=C(N)c1ccc2oc(Cc3cc4cc(C(=N)N)ccc4o3)cc2c1. The target protein (P00735) has sequence MARVRGPRLPGCLALAALFSLVHSQHVFLAHQQASSLLQRARRANKGFLEEVRKGNLERECLEEPCSREEAFEALESLSATDAFWAKYTACESARNPREKLNECLEGNCAEGVGMNYRGNVSVTRSGIECQLWRSRYPHKPEINSTTHPGADLRENFCRNPDGSITGPWCYTTSPTLRREECSVPVCGQDRVTVEVIPRSGGSTTSQSPLLETCVPDRGREYRGRLAVTTSGSRCLAWSSEQAKALSKDQDFNPAVPLAENFCRNPDGDEEGAWCYVADQPGDFEYCDLNYCEEPVDGDLGDRLGEDPDPDAAIEGRTSEDHFQPFFNEKTFGAGEADCGLRPLFEKKQVQDQTEKELFESYIEGRIVEGQDAEVGLSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTVDDLLVRIGKHSRTRYERKVEKISMLDKIYIHPRYNWKENLDRDIALLKLKRPIELSDYIHPVCLPDKQTAAKLLHAGF.... The pKi is 5.0. 
(7) The compound is CC[C@H](C)[C@H](NC(=O)CNC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](CCSC)NC(=O)[C@H](CO)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(=O)N[C@H]1CSSC[C@@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](Cc2cnc[nH]2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(=O)N[C@@H](Cc2ccc(O)cc2)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](Cc2ccccc2)C(=O)N[C@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(=O)N[C@@H](C)C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(=O)N[C@@H](CO)C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](Cc2ccc(O)cc2)C(N)=O)[C@@H](C)CC)C(C)C)[C@@H](C)O)[C@@H](C)CC)C(C)C)[C@@H](C)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)CNC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](CCCNC(=N)N)NC1=O. The target protein (Q867C0) has sequence MARGLRGLPRRGLWLLLVNHLFLATACQDTDHAALLRKYCLPQFQVDMEAIGKALWCDWDKTIGSYKDLSDCTRLVAQRLDCFWPNAAVDKFFLGVHQQYFRNCPVSGRALQDPPSSVLCPFIVVPILATLLMTALVVWRSKRPEGIV. The pKi is 7.7. (8) The compound is CNCCC(Oc1ccccc1C)c1ccccc1. The target protein (Q9WVR3) has sequence MASVCGAPSPGGALGSQAPAWYHRDLSRAAAEELLARAGRDGSFLVRDSESVAGAFALCVLYQKHVHTYRILPDGEDFLAVQTSQGVPVRRFQTLGELIGLYAQPNQGLVCALLLPVEGEREPDPPDDRDASDVEDEKPPLPPRSGSTSISVPAGPSSPLPAPETPTTPAAESTPNGLSTVSHEYLKGSYGLDLEAVRGGASNLPHLTRTLVTSCRRLHSEVDKVLSGLEILSKVFDQQSSPMVTRLLQQQSLPQTGEQELESLVLKLSVLKDFLSGIQKKALKALQDMSSTAPPAPLQPSIRKAKTIPVQAFEVKLDVTLGDLTKIGKSQKFTLSVDVEGGRLVLLRRQRDSQEDWTTFTHDRIRQLIKSQRVQNKLGVVFEKEKDRTQRKDFIFVSARKREAFCQLLQLMKNKHSKQDEPDMISVFIGTWNMGSVPPPKNVTSWFTSKGLGKALDEVTVTIPHDIYVFGTQENSVGDREWLDLLRGGLKELTDLDYRP.... The pKi is 6.0. (9) The compound is NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)CCC(=O)Nc1ccc(O[C@@H]2O[C@H](CO)[C@@H](O)[C@H](O)[C@H]2O)cc1. 
The target protein sequence is MKKTWWKEGVAYQIYPRSFMDANGDGIGDLRGIIEKLDYLVELGVDIVWICPIYRSPNADNGYDISDYYAIMDEFGTMDDFDELLAQAHRRGLKIILDLVINHTSDEHPWFIESRSSRDNPKRDWYIWRDGKDGREPNNWESIFGGSAWQYDERTGQYYLHLFDVKQPDLNWENSEVRQALYDMINWWLDKGIDGFRIDAISHIKKKPGLPDLPNPKGLKYVPSFAAHMNQPGIMEYLRELKEQTFARYDIMTVGEANGVTVDEAEQWVGEENGVFHMIFQFEHLGLWKRKADGSIDVRRLKRTLTKWQKGLENRGWNALFLENHDLPRSVSTWGNDREYWAESAKALGALYFFMQGTPFIYQGQEIGMTNVQFSDIRDYRDVAALRLYELERANGRTHEEVMKIIWKTGRDNSRTPMQWSDAPNAGFTTGTPWIKVNENYRTINVEAERRDPNSVWSFYRQMIQLRKANELFVYGAYDLLLENHPSIYAYTRTLGRDRA.... The pKi is 4.0.
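
The pKi definition given at the top (pKi = -log10(Ki in M)) can be illustrated with a minimal sketch. The function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical helpers introduced here for illustration, not part of the dataset or of any BindingDB tooling. Reporting the inverse conversion in nanomolar is likewise an assumption made for readability.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the definition: return Ki in nanomolar for a given pKi."""
    ki_molar = 10.0 ** (-pki)
    return ki_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # A Ki of 100 nM (1e-7 M) corresponds to a pKi of about 7.
    print(round(ki_to_pki(100e-9), 2))
    # A pKi of 4.0 corresponds to a Ki of about 100 000 nM (100 uM).
    print(round(pki_to_ki_nm(4.0)))
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki, which is why higher values mean stronger inhibition.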