Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The compound is COc1cc2c(N3CCN(C(=O)Nc4ccc(OC(C)C)cc4)CC3)ncnc2cc1OCCCN1CCCCC1. The target protein (Q02763) has sequence MDSLASLVLCGVSLLLSGTVEGAMDLILINSLPLVSDAETSLTCIASGWRPHEPITIGRDFEALMNQHQDPLEVTQDVTREWAKKVVWKREKASKINGAYFCEGRVRGEAIRIRTMKMRQQASFLPATLTMTVDKGDNVNISFKKVLIKEEDAVIYKNGSFIHSVPRHEVPDILEVHLPHAQPQDAGVYSARYIGGNLFTSAFTRLIVRRCEAQKWGPECNHLCTACMNNGVCHEDTGECICPPGFMGRTCEKACELHTFGRTCKERCSGQEGCKSYVFCLPDPYGCSCATGWKGLQCNEACHPGFYGPDCKLRCSCNNGEMCDRFQGCLCSPGWQGLQCEREGIQRMTPKIVDLPDHIEVNSGKFNPICKASGWPLPTNEEMTLVKPDGTVLHPKDFNHTDHFSVAIFTIHRILPPDSGVWVCSVNTVAGMVEKPFNISVKVLPKPLNAPNVIDTGHNFAVINISSEPYFGDGPIKSKKLLYKPVNHYEAWQHIQVTNE.... The pKd is 5.0. (2) The compound is CCCCCCC(C)(C)c1ccc([C@@H]2C[C@H](O)CC[C@H]2CCCO)c(O)c1. The target protein sequence is MKSILDGLADTTFRTITTDLLYVGSNDIQYEDIKGDMASKLGYFPQKFPLTSFRGSPFQEKMTAGDNPQLVPADQVNITEFYNKSLSSFKENEENIQCGENFMDIECFMVLNPSQQLAIAVLSLTLGTFTVLENLLVLCVILHSRSLRCRPSYHFIGSLAVADLLGSVIFVYSFIDFHVFHRKDSRNVFLFKLGGVTASFTASVGSLFLTAIDRYISIHRPLAYKRIVTRPKAVVAFCLMWTIAIVIAVLPLLGWNCEKLQSVCSDIFPHIDETYLMFWIGVTSVLLLFIVYAYMYILWKAHSHAVRMIQRGTQKSIIIHTSEDGKVQVTRPDQARMAIRLAKTLVLILVVLIICWGPLLAIMVYDVFGKMNKLIKTVFAFCSMLCLLNSTVNPIIYALRSKDLRHAFRSMFPSCEGTAQPLDNSMGDSDCLHKHANNAASVHRAAESCIKSTVKIAKVTMSVSTDTSAEAL. The pKd is 8.8. (3) The small molecule is CN1CC[C@H](c2c(O)cc(O)c3c(=O)cc(-c4ccccc4Cl)oc23)[C@H](O)C1. 
The target protein (Q8N752) has sequence MTNNSGSKAELVVGGKYKLVRKIGSGSFGDVYLGITTTNGEDVAVKLESQKVKHPQLLYESKLYTILQGGVGIPHMHWYGQEKDNNVLVMDLLGPSLEDLFNFCSRRFTMKTVLMLADQMISRIEYVHTKNFLHRDIKPDNFLMGTGRHCNKLFLIDFGLAKKYRDNRTRQHIPYREDKHLIGTVRYASINAHLGIEQSRRDDMESLGYVFMYFNRTSLPWQGLRAMTKKQKYEKISEKKMSTPVEVLCKGFPAEFAMYLNYCRGLRFEEVPDYMYLRQLFRILFRTLNHQYDYTFDWTMLKQKAAQQAASSSGQGQQAQTQTGKQTEKNKNNVKDN. The pKd is 5.6. (4) The drug is Cc1cncc(-c2cnc(N[C@@H]3CCNC[C@H]3OCC3CCCCC3)c3[nH]c(=O)c(C)cc23)c1. The target protein (Q12830) has sequence MRGRRGRPPKQPAAPAAERCAPAPPPPPPPPTSGPIGGLRSRHRGSSRGRWAAAQAEVAPKTRLSSPRGGSSSRRKPPPPPPAPPSTSAPGRGGRGGGGGRTGGGGGGGHLARTTAARRAVNKVVYDDHESEEEEEEEDMVSEEEEEEDGDAEETQDSEDDEEDEMEEDDDDSDYPEEMEDDDDDASYCTESSFRSHSTYSSTPGRRKPRVHRPRSPILEEKDIPPLEFPKSSEDLMVPNEHIMNVIAIYEVLRNFGTVLRLSPFRFEDFCAALVSQEQCTLMAEMHVVLLKAVLREEDTSNTTFGPADLKDSVNSTLYFIDGMTWPEVLRVYCESDKEYHHVLPYQEAEDYPYGPVENKIKVLQFLVDQFLTTNIAREELMSEGVIQYDDHCRVCHKLGDLLCCETCSAVYHLECVKPPLEEVPEDEWQCEVCVAHKVPGVTDCVAEIQKNKPYIRHEPIGYDRSRRKYWFLNRRLIIEEDTENENEKKIWYYSTKVQL.... The pKd is 4.0. (5) The target protein (P97696) has sequence MDEGGGGEGGSVPEDLSLEEREELLDIRRRKKELIDDIERLKYEIAEVMTEIDNLTSVEESKTTQRNKQIAMGRKKFNMDPKKGIQFLIENDLLQSSPEDVAQFLYKGEGLNKTVIGDYLGERDDFNIKVLQAFVELHEFADLNLVQALRQFLWSFRLPGEAQKIDRMMEAFASRYCLCNPGVFQSTDTCYVLSFAIIMLNTSLHNHNVRDKPTAERFITMNRGINEGGDLPEELLRNLYESIKNEPFKIPEDDGNDLTHTFFNPDREGWLLKLGGGRVKTWKRRWFILTDNCLYYFEYTTDKEPRGIIPLENLSIREVEDPRKPNCFELYNPSHKGQVIKACKTEADGRVVEGNHVVYRISAPSPEEKEEWMKSIKASISRDPFYDMLATRKRRIANKK. The pKd is 6.0. The compound is O=C(CCCC[C@@H]1SC[C@@H]2NC(=O)N[C@H]12)NCCCCCCOP(=O)(O)O[C@H]1[C@@H](O)[C@@H](O)[C@H](OP(=O)(O)O)[C@@H](OP(=O)(O)O)[C@@H]1O. (6) The small molecule is O=NC1CCc2cc(-c3cn(CCO)nc3-c3ccncc3)ccc21. The pKd is 5.0. 
The target protein (Q9UEW8) has sequence MAEPSGSPVHVQLPQQAAPVTAAAAAAPAAATAAPAPAAPAAPAPAPAPAAQAVGWPICRDAYELQEVIGSGATAVVQAALCKPRQERVAIKRINLEKCQTSMDELLKEIQAMSQCSHPNVVTYYTSFVVKDELWLVMKLLSGGSMLDIIKYIVNRGEHKNGVLEEAIIATILKEVLEGLDYLHRNGQIHRDLKAGNILLGEDGSVQIADFGVSAFLATGGDVTRNKVRKTFVGTPCWMAPEVMEQVRGYDFKADMWSFGITAIELATGAAPYHKYPPMKVLMLTLQNDPPTLETGVEDKEMMKKYGKSFRKLLSLCLQKDPSKRPTAAELLKCKFFQKAKNREYLIEKLLTRTPDIAQRAKKVRRVPGSSGHLHKTEDGDWEWSDDEMDEKSEEGKAAFSQEKSRRVKEENPEIAVSASTIPEQIQSLSVHDSQGPPNANEDYREASSCAVNLVLRLRNSRKELNDIRFEFTPGRDTADGVSQELFSAGLVDGHDVVIV....
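
The pKd definition used in these records (pKd = -log10(Kd in M)) can be illustrated with a minimal sketch. The function names below are hypothetical and not part of the dataset. The nanomolar helper assumes the raw Kd is given in nM, which this record does not state.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Same conversion for a Kd given in nM (assumed unit, 1 nM = 1e-9 M)."""
    return kd_to_pkd(kd_nanomolar * 1e-9)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: return Kd in mol/L for a given pKd."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # pKd values quoted in the records above, converted back to Kd in mol/L.
    for pkd in (5.0, 8.8, 5.6, 4.0, 6.0):
        print(f"pKd {pkd:.1f} -> Kd {pkd_to_kd(pkd):.3e} M")
```

Under this definition a pKd of 5.0 corresponds to a Kd of 10 micromolar and a pKd of 8.8 to roughly 1.6 nanomolar, so each unit increase in pKd is a tenfold tighter binder.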