Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The compound is NC(=O)c1ccc[n+]([C@@H]2O[C@H](COP(=O)([O-])OP(=O)(O)OC[C@H]3O[C@@H](n4cnc5c(N)ncnc54)[C@H](O)[C@@H]3O)[C@@H](O)[C@H]2O)c1. The target protein (Q13363) has sequence MGSSHLLNKGLPLGVRPPIMNGPLHPRPLVALLDGRDCTVEMPILKDVATVAFCDAQSTQEIHEKVLNEAVGALMYHTITLTREDLEKFKALRIIVRIGSGFDNIDIKSAGDLGIAVCNVPAASVEETADSTLCHILNLYRRATWLHQALREGTRVQSVEQIREVASGAARIRGETLGIIGLGRVGQAVALRAKAFGFNVLFYDPYLSDGVERALGLQRVSTLQDLLFHSDCVTLHCGLNEHNHHLINDFTVKQMRQGAFLVNTARGGLVDEKALAQALKEGRIRGAALDVHESEPFSFSQGPLKDAPNLICTPHAAWYSEQASIEMREEAAREIRRAITGRIPDSLKNCVNKDHLTAATHWASMDPAVVHPELNGAAYRYPPGVVGVAPTGIPAAVEGIVPSAMSLSHGLPPVAHPPHAPSPGQTVKPEADRDHASDQL. The pKd is 6.4. (2) The drug is CNC(=O)c1cn(C2C(O)C(CO)OC(SC)C2O)nn1. The target protein (P16110) has sequence MADSFSLNDALAGSGNPNPQGYPGAWGNQPGAGGYPGAAYPGAYPGQAPPGAYPGQAPPGAYPGQAPPSAYPGPTAPGAYPGPTAPGAYPGQPAPGAFPGQPGAPGAYPQCSGGYPAAGPYGVPAGPLTVPYDLPLPGGVMPRMLITIMGTVKPNANRIVLDFRRGNDVAFHFNPRFNENNRRVIVCNTKQDNNWGKEERQSAFPFESGKPFKIQVLVEADHFKVAVNDAHLLQYNHRMKNLREISQLGISGDITLTSANHAMI. The pKd is 3.6. (3) The drug is CCCCCO[C@H]1O[C@H](COS(=O)(=O)[O-])[C@@H](O[C@@H]2O[C@@H](C(=O)[O-])[C@@H](O[C@H]3O[C@H](COS(=O)(=O)[O-])[C@@H](O[C@@H]4O[C@@H](C(=O)[O-])[C@@H](O[C@H]5O[C@H](COS(=O)(=O)[O-])[C@@H](O[C@@H]6O[C@@H](C(=O)[O-])[C@@H](O)[C@H](O)[C@H]6OS(=O)(=O)[O-])[C@H](O)[C@H]5NC(C)=O)[C@H](O)[C@H]4OS(=O)(=O)[O-])[C@H](O)[C@H]3NS(=O)(=O)[O-])[C@H](O)[C@H]2O)[C@H](O)[C@H]1NC(C)=O.[Na+].[Na+].[Na+].[Na+].[Na+].[Na+].[Na+].[Na+].[Na+]. 
The target protein (P22004) has sequence MPGLGRRAQWLCWWWGLLCSCCGPPPLRPPLPAAAAAAAGGQLLGDGGSPGRTEQPPPSPQSSSGFLYRRLKTQEKREMQKEILSVLGLPHRPRPLHGLQQPQPPALRQQEEQQQQQQLPRGEPPPGRLKSAPLFMLDLYNALSADNDEDGASEGERQQSWPHEAASSSQRRQPPPGAAHPLNRKSLLAPGSGSGGASPLTSAQDSAFLNDADMVMSFVNLVEYDKEFSPRQRHHKEFKFNLSQIPEGEVVTAAEFRIYKDCVMGSFKNQTFLISIYQVLQEHQHRDSDLFLLDTRVVWASEEGWLEFDITATSNLWVVTPQHNMGLQLSVVTRDGVHVHPRAAGLVGRDGPYDKQPFMVAFFKVSEVHVRTTRSASSRRRQQSRNRSTQSQDVARVSSASDYNSSELKTACRKHELYVSFQDLGWQDWIIAPKGYAANYCDGECSFPLNAHMNATNHAIVQTLVHLMNPEYVPKPCCAPTKLNAISVLYFDDNSNVILK.... The pKd is 5.0. (4) The small molecule is CO[C@@]1(NC(=O)Cc2cccs2)C(=O)N2C(C(=O)O)=C(COC(N)=O)CS[C@@H]21. The target protein sequence is MQLSHRPAETGDLETVAGFPQDRDELFYCYPKAIWPFSVAQLAAAIAERRGSTVAVHDGQVLGFANFYQWQHGDFCALGNMMVAPAARGLGVARYLIGVMENLAREQYKARLMKISCFNANAAGLLLYTQLGYQPRAIAERHDPDGRRVALIQMDKPLEP. The pKd is 4.3. (5) The compound is C[N+]1(C)[C@H]2CC(OC(=O)[C@H](CO)c3ccccc3)C[C@@H]1[C@H]1O[C@@H]21. The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKCTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKTFRTTFKTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKd is 9.2. (6) The compound is Clc1ccccc1C(c1ccccc1)(c1ccccc1)n1ccnc1. The target protein (P9WPP7) has sequence MTATVLLEVPFSARGDRIPDAVAELRTREPIRKVRTITGAEAWLVSSYALCTQVLEDRRFSMKETAAAGAPRLNALTVPPEVVNNMGNIADAGLRKAVMKAITPKAPGLEQFLRDTANSLLDNLITEGAPADLRNDFADPLATALHCKVLGIPQEDGPKLFRSLSIAFMSSADPIPAAKINWDRDIEYMAGILENPNITTGLMGELSRLRKDPAYSHVSDELFATIGVTFFGAGVISTGSFLTTALISLIQRPQLRNLLHEKPELIPAGVEELLRINLSFADGLPRLATADIQVGDVLVRKGELVLVLLEGANFDPEHFPNPGSIELDRPNPTSHLAFGRGQHFCPGSALGRRHAQIGIEALLKKMPGVDLAVPIDQLVWRTRFQRRIPERLPVLW. The pKd is 7.1. 
(7) The compound is C[N+]1(C)[C@H]2CC(OC(=O)[C@H](CO)c3ccccc3)C[C@@H]1[C@H]1O[C@@H]21. The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKTFRTTFKTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKd is 9.2.
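
The pKd definition stated in the task description (pKd = -log10(Kd in M)) can be illustrated with a minimal Python sketch. The function names `kd_to_pkd` and `pkd_to_kd` are hypothetical helpers introduced here for illustration and are not part of the dataset or any BindingDB tooling; the sketch assumes Kd is already expressed in molar units.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # pKd values taken from the examples above; higher means stronger binding.
    for pkd in (3.6, 6.4, 9.2):
        kd = pkd_to_kd(pkd)
        print(f"pKd {pkd:.1f} corresponds to Kd of about {kd:.3e} M")
```

Because the scale is logarithmic, a difference of one pKd unit corresponds to a tenfold change in Kd, so the pKd 9.2 examples bind several orders of magnitude more tightly than the pKd 3.6 example.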