Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The small molecule is NCCCC(=O)N[C@@H](Cc1c[nH]c2ccc(O)cc12)C(=O)N[C@@H](Cc1ccc(Cl)cc1)C(=O)N[C@@H](Cc1ccncc1)C(=O)NCCc1c[nH]c2ccccc12. The target protein sequence is NEVTLLDSRSVQGELGWIASPLEGGWEEVSIMDEKNTPIRTYQVCNVMEPSQNNWLRTDWITREGAQRVYIEIKFTLRDCNSLPGVMGTCKETFNLYYYESDNDKERFIRENQFVKIDTIAADESFTQVDIGDRIMKLNTEIRDVGPLSKKGFYLAFQDVGACIALVSVRVFYKKCPLTVR. The pKd is 6.0. (2) The compound is Oc1cccc2ccccc12. The target protein (O74036) has sequence MAGEEVKEIDEFEELGFEPATEETPKKKKKEKIIRSIEDLPGVGPATAEKLREAGYDTLEAIAVASPIELKEVAGISEGTALKIIQAARKAANLGTFMRADEYLKKRATIGRISTGSKSLDKLLGGGIETQAITEVFGEFGSGKTQLAHTLAVMVQLPPEEGGLNGSVIWIDTENTFRPERIREIAQNRGLDPDEVLKHIYVARAFNSNHQMLLVQQAEDKIKELLNTDRPVKLLIVDSLTSHFRSEYIGRGALAERQQKLAKHLADLHRLANLYDIAVFVTNQVQARPDAFFGDPTRPIGGHILAHSATLRVYLRKGKGGKRIARLIDAPHLPEGEAVFSITEKGIED. The pKd is 3.4. (3) The compound is CCCCCCC(C)(C)c1ccc([C@@H]2C[C@H](O)CC[C@H]2CCCO)c(O)c1. The target protein sequence is MKSILDGLADTTFRTITTDLLYVGSNDIQYEDIKGDMASKLGYFPQKFPLTSFRGSPFQEKMTAGDNPQLVPADQVNITEFYNKSLSSFKENEENIQCGENFMDIECFMVLNPSQQLAIAVLSLTLGTFTVLENLLVLCVILHSRSLRCRPSYHFIGSLAVADLLGSVIFVYSFIDFHVFHRKDSRNVFLFKLGGVTASFTASVGSLFLTAIDRYISIHRPLAYKRIVTRPKAVVAFCLMWTIAIVIAVLPLLGWNCEKLQSVCSDIFPHIDETYLMFWIGVTSVLLLFIVYAYMYILWKAHSHAVRMIQRGTQKSIIIHTSEDGKVQVTRPDQARMDIRLAKALVLILVVLIICWGPLLAIMVYDVFGKMNKLIKTVFAFCSMLCLLNSTVNPIIYALRSKDLRHAFRSMFPSCEGTAQPLDNSMGDSDCLHKHANNAASVHRAAESCIKSTVKIAKVTMSVSTDTSAEAL. The pKd is 8.5. (4) The drug is Oc1ccccc1I. 
The target protein sequence is MKQKPAFIPYAGAQFEPEEMLSKSAEYYQFMDHRRTVREFSNRAIPLEVIENIVMTASTAPSGAHKQPWTFVVVSDPQIKAKIRQAAEKEEFESYNGRMSNEWLEDLQPFGTDWHKPFLEIAPYLIVVFRKAYDVLPDGTQRKNYYVQESVGIACGFLLAAIHQAGLVALTHTPSPMNFLQKILQRPENERPFLLVPVGYPAEGAMVPDLQRKDKAAVMVVYHHHHHH. The pKd is 4.2.
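
The pKd definition above (pKd = -log10(Kd in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset itself: the function names `kd_to_pkd` and `pkd_to_kd` are hypothetical, and the code assumes Kd is already expressed in molar units as the definition states.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in molar units to pKd."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in M")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the definition: recover Kd in molar units from pKd."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # pKd values copied from examples (1)-(4) above.
    example_pkd = {1: 6.0, 2: 3.4, 3: 8.5, 4: 4.2}
    for index, pkd in example_pkd.items():
        kd = pkd_to_kd(pkd)
        print(f"example ({index}): pKd = {pkd}, Kd = {kd:.3e} M")
```

Because the scale is logarithmic, each unit of pKd is a tenfold change in Kd: the pKd of 6.0 in example (1) corresponds to a Kd of 1e-6 M (1 µM), and higher pKd values correspond to smaller Kd, that is, stronger binding.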