Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. From a dataset of Drug-target binding data from BindingDB using Ki measurements. (1) The pKi is 6.5. The drug is O=C(Nc1ccc(Cl)c(C(F)(F)F)c1)[C@H]1CC=C[C@H]2CCN(Cc3ccccc3)C(=O)[C@@H]12. The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSLQFPDDDYSWWDLFMKICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSFYFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. (2) The drug is CCCCOC(=O)NS(=O)(=O)c1cc(CC(C)C)ccc1-c1ccc(C(=O)N(CC)CC)cc1. The target protein (P30555) has sequence MILNSSTEDSIKRIQDDCPKAGRHNYIFVMIPTLYSIIFVVGIFGNSLVVIVIYFYMKLKTVASVFLLNLALADLCFLLTLPLWAVYTAMEYRWPFGNYLCKIASASVSFNLYASVFLLTCLSIDRYLAIVHPMKSRLRRTMLVAKVTCIIIWLLAGLASLPTIIHRNVFFIENTNITVCAFHYESQNSTLPVGLGLTKNILGFLFPFLIILTSYTLIWKALKKAYEIQKNKPRNDDIFKIIMAIVLFFFFSWVPHQIFTFLDVLIQLGIIHDCKIADIVDTAMPITICLAYFNNCLNPLFYGFLGKKFKKYFLQLLKYIPPKAKSHSSLSTKMSTLSYRPSENGSSSTKKSAPCTEVE. The pKi is 6.9. (3) The compound is Nc1ccn(C2O[C@H](COP(=O)([O-])N[C@H](c3ccccc3)P(=O)([O-])[O-])[C@@H](O)[C@H]2O)c(=O)n1. The target protein (Q64686) has sequence MACILKRKPALAVSFIALCILLLAMRLANDVTFPLLLNCFGQPKTKWIPLSYTLRQPLQTHYGYINVRTQEPLQLNCNHCAVVSNSGQMVGQKVGEEIDRASCIWRMNNAPTKGFEEDVGYMTMVRVVSHTSVPLLLKNPDYFFKEASTTIYVIWGPFRNMRKDGNGIVYNMLKKTVDAYPDAQIYVTTEQRMTYCDGVFKDETGKDRVQSGSYLSTGWFTFILAMDACYSIHVYGMINETYCTTEGYRKVPYHYYEQGKDECNEYLLHEHAPYGGHRFITEKKVFAKWAKKHRIVFTHPNWTVS. The pKi is 3.9. (4) The drug is COC(=O)[C@@H]1C[C@H](OC(C)=O)C(=O)[C@H]2[C@@]1(C)CC[C@H]1C(=O)O[C@H](c3ccoc3)C[C@]21C. 
The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSLQFPDDDYSWWDLFMKICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSFYFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. The pKi is 7.7. (5) The compound is FC(F)(F)c1ccc(CNc2ccc([C@H]3CNCCO3)cc2)cc1. The target protein (Q923X8) has sequence MATDDDRFPWDQDSILSRDLLSASSMQLCYEKLNRSCVRSPYSPGPRLILYAVFGFGAVLAVCGNLLVMTSILHFRQLHSPANFLVASLACADFLVGLTVMPFSMVRSVEGCWYFGDIYCKFHSSFDGSFCYSSIFHLCFISADRYIAVSDPLIYPTRFTASVSGKCITFSWLLSIIYSFSLFYTGVNEAGLEDLVSALTCVGGCQIAVNQSWVFINFLLFLVPALVMMTVYSKIFLIAKQQAQNIEKMGKQTARASESYKDRVAKRERKAAKTLGIAVAAFLLSWLPYFIDSIIDAFLGFVTPTYVYEILVWIGYYNSAMNPLIYAFFYPWFRKAIKLIVTGKILRENSSATNLFPE. The pKi is 6.8. (6) The pKi is 9.6. The compound is COc1ccc(S(=O)(=O)N(CC(C)C)C[C@@H](O)[C@H](Cc2ccccc2)NC(=O)O[C@H]2CCOCOC2)cc1. The target protein (P03369) has sequence MGARASVLSGGELDKWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIDVKDTKEALEKIEEEQNKSKKKAQQAAAAAGTGNSSQVSQNYPIVQNLQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQDVKNWMTETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNPANIMMQRGNFRNQRKTVKCFNCGKEGHIAKNCRAPRKKGCWRCGREGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTRRELQVWGGENNSLSEAGADRQGTVSFNFPQITLWQRPL.... (7) The drug is NCCc1ccc(S(N)(=O)=O)cc1. The target protein (P40881) has sequence MMFNKQIFTILILSLSLALAGSGCISEGAEDNVAQEITVDEFSNIRENPVTPWNPEPSAPVIDPTAYIDPQASVIGEVTIGANVMVSPMASIRSDEGMPIFVGDRSNVQDGVVLHALETINEEGEPIEDNIVEVDGKEYAVYIGNNVSLAHQSQVHGPAAVGDDTFIGMQAFVFKSKVGNNCVLEPRSAAIGVTIPDGRYIPAGMVVTSQAEADKLPEVTDDYAYSHTNEAVVYVNVHLAEGYKETS. The pKi is 6.6. (8) The pKi is 7.7. The drug is Fc1ccc([C@@H]2CCNC[C@H]2COc2ccc3c(c2)OCO3)cc1. 
The target is MLLARMKPQVQPELGGADQ. (9) The drug is CC(C)(C)OC(=O)N[C@H](C(=O)NCC#N)c1ccccc1. The target protein (P25975) has sequence MNPSFFLTVLCLGVASAAPKLDPNLDAHWHQWKATHRRLYGMNEEEWRRAVWEKNKKIIDLHNQEYSEGKHGFRMAMNAFGDMTNEEFRQVMNGFQNQKHKKGKLFHEPLLVDVPKSVDWTKKGYVTPVKNQGQCGSCWAFSATGALEGQMFRKTGKLVSLSEQNLVDCSRAQGNQGCNGGLMDNAFQYIKDNGGLDSEESYPYLATDTNSCNYKPECSAANDTGFVDIPQREKALMKAVATVGPISVAIDAGHTSFQFYKSGIYYDPDCSSKDLDHGVLVVGYGFEGTDSNNNKFWIVKNSWGPEWGWNGYVKMAKDQNNHCGIATAASYPTV. The pKi is 4.5. (10) The compound is NCCCC[C@H](NC(=O)c1ccc(F)c(F)c1)C(=O)c1noc(Cc2ccc(OCCc3ccc(Cl)c(Cl)c3)cc2)n1. The target protein (Q02844) has sequence MLKLLLLTLPLLSSLVHAAPGPAMTREGIVGGQEAHGNKWPWQVSLRANDTYWMHFCGGSLIHPQWVLTAAHCVGPDVADPNKVRVQLRKQYLYYHDHLMTVSQIITHPDFYIVQDGADIALLKLTNPVNISDYVHPVPLPPASETFPSGTLCWVTGWGNIDNGVNLPPPFPLKEVQVPIIENHLCDLKYHKGLITGDNVHIVRDDMLCAGNEGHDSCQGDSGGPLVCKVEDTWLQAGVVSWGEGCAQPNRPGIYTRVTYYLDWIHHYVPKDF. The pKi is 6.4.
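
For reference, the pKi definition in the task statement (pKi = -log10(Ki in M)) can be sketched as a pair of small conversion helpers. This is an illustrative sketch only: the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical and not part of the bindingdb_ki dataset or of any BindingDB tooling. It assumes Ki is given in mol/L for the forward conversion and reports nM for the inverse.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki, expressed in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # A Ki of 1 nM (1e-9 M), converted to the pKi scale.
    print(ki_to_pki(1e-9))
    # The pKi reported for example (1), converted back to Ki in nM.
    print(pki_to_ki_nm(6.5))
```

Under this definition, the pKi of 6.5 in example (1) corresponds to a Ki of roughly 316 nM, and each additional pKi unit means a tenfold smaller Ki, which is why higher values indicate stronger inhibition.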