Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd, drug-target binding data from BindingDB using Kd measurements. (1) The small molecule is Cc1nc(Nc2ncc(C(=O)Nc3c(C)cccc3Cl)s2)cc(N2CCN(CCO)CC2)n1. The target is PFCDPK1(Pfalciparum). The pKd is 6.2. (2) The compound is O=C([O-])CC1Sc2nnc(C3CC3)n2N=C1c1ccccc1. The target protein (Q48481) has sequence MNNKNIMIVGAGFSGVVIARQLAEQGYTVKIIDRRDHIGGNSYDTRDPQTDVMVHVYGPHIFHTDNETVWNYVNQYAEMMPYVNRVKATVNGQVFSLPINLHTINQFFAKTCSPDEARALISEKGDSSIVEPQTFEEQALRFIGKELYEAFFKGYTIKQWGMEPSELPASILKRLPVRFNYDDNYFNHKFQGMPKLGYTRMIEAIADHENISIELQREFLPEEREDYAHVFYSGPLDAFYSYQYGRLGYRTLDFEKFTYQGDYQGCAVMNYCSIDVPYTRITEHKYFSPWESHEGSVCYKEYSRACGENDIPYYPIRQMGEMALLEKYLSLAESEKNITFVGRLGTYRYLDMDVTIAEALKTADEFLSSVANQEEMPVFTVPVR. The pKd is 4.1. (3) The drug is CC[C@H](C)[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@H](Cc1c[nH]cn1)NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](N)CSC1CC(=O)N(CCOCCOCCOCc2cn(CCO[C@@H]3O[C@H](CO)[C@@H](O[C@@H]4O[C@H](CO[C@]5(C(=O)O)C[C@H](O)[C@@H](NC(C)=O)[C@H]([C@H](O)[C@H](O)CO)O5)[C@@H](O)[C@H](O)[C@H]4O)[C@H](O)[C@H]3O)nn2)C1=O)[C@@H](C)O)[C@@H](C)CC)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(=O)N[C@@H](Cc1ccccc1)C(=O)O. 
The target protein sequence is FLGFLGAAGSTMGAASITLTVQARTLLSGIVQQQNNLLRAIEAQQHLLQLTVWGIKQLQARVLAVERYLQDQQLLGIWGCSGKHICTTTVPWNSSWSNKSLEEIWQNMTWMEWEREIDNYTGLI. The pKd is 6.6. (4) The pKd is 7.3. The target protein sequence is MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK.... The drug is N#Cc1cnc2cc(OC[C@@H](O)CO)c(NC(=O)CC3CSSC3)cc2c1Nc1ccc(OCc2cccc(F)c2)c(Cl)c1. (5) The small molecule is CC(C)C[C@H](N)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@H](C(=O)NCC(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CC(C)C)C(N)=O)[C@@H](C)O. The target protein (Q14145) has sequence MQPDPRPSGAGACCRFLPLQSQCPEGAGDAVMYASTECKAEVTPSQHGNRTFSYTLEDHTKQAFGIMNELRLSQQLCDVTLQVKYQDAPAAQFMAHKVVLASSSPVFKAMFTNGLREQGMEVVSIEGIHPKVMERLIEFAYTASISMGEKCVLHVMNGAVMYQIDSVVRACSDFLVQQLDPSNAIGIANFAEQIGCVELHQRAREYIYMHFGEVAKQEEFFNLSHCQLVTLISRDDLNVRCESEVFHACINWVKYDCEQRRFYVQALLRAVRCHSLTPNFLQMQLQKCEILQSDSRCKDYLVKIFEELTLHKPTQVMPCRAPKVGRLIYTAGGYFRQSLSYLEAYNPSDGTWLRLADLQVPRSGLAGCVVGGLLYAVGGRNNSPDGNTDSSALDCYNPMTNQWSPCAPMSVPRNRIGVGVIDGHIYAVGGSHGCIHHNSVERYEPERDEWHLVAPMLTRRIGVGVAVLNRLLYAVGGFDGTNRLNSAECYYPERNEWRMI.... The pKd is 6.5. (6) The compound is C[N+]1(C)[C@H]2CC(OC(=O)[C@H](CO)c3ccccc3)C[C@@H]1[C@H]1O[C@@H]21. 
The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKTFRTTFKCLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKd is 9.2. (7) The small molecule is C#Cc1cccc(Nc2ncnc3cc(OCCOC)c(OCCOC)cc23)c1. The target is PFCDPK1(Pfalciparum). The pKd is 5.0.
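The task statement defines pKd = -log10(Kd in M). As a minimal sketch of that relationship, the snippet below converts between Kd and pKd. The function names `kd_to_pkd` and `pkd_to_kd` are hypothetical helpers introduced here for illustration; they are not part of the dataset or of any BindingDB tooling.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the definition: Kd in mol/L = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # pKd labels taken from examples (2) and (6) above.
    for pkd in (4.1, 9.2):
        print(f"pKd {pkd} -> Kd = {pkd_to_kd(pkd):.3g} M")
```

Under this definition, the pKd of 9.2 in example (6) corresponds to a Kd of roughly 0.63 nM, while the pKd of 4.1 in example (2) corresponds to roughly 79 µM, which is why a higher pKd means stronger binding.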