Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is Cc1ccc2nc(NCCN)c3ncc(C)n3c2c1. The target protein sequence is MNKMKNFKRRFSLSVPRTETIEESLAEFTEQFNQLHNRRNENLQLGPLGRDPPQECSTFSPTDSGEEPGQLSPGVQFQRRQNQRRFSMEDVSKRLSLPMDIRLPQEFLQKLQMESPDLPKPLSRMSRRASLSDIGFGKLETYVKLDKLGEGTYATVFKGRSKLTENLVALKEIRLEHEEGAPCTAIREVSLLKNLKHANIVTLHDLIHTDRSLTLVFEYLDSDLKQYLDHCGNLMSMHNVKIFMFQLLRGLAYCHHRKILHRDLKPQNLLINERGELKLADFGLARAKSVPTKTYSNEVVTLWYRPPDVLLGSTEYSTPIDMWGVGCIHYEMATGRPLFPGSTVKEELHLIFRLLGTPTEETWPGVTAFSEFRTYSFPCYLPQPLINHAPRLDTDGIHLLSSLLLYESKSRMSAEAALSHSYFRSLGERVHQLEDTASIFSLKEIQLQKDPGYRGLAFQQPGRGKNRRQSIF. The pKd is 5.0. (2) The compound is CC(C)(COP(=O)([O-])OP(=O)([O-])OC[C@H]1O[C@@H](n2cnc3c(N)ncnc32)[C@H](O)[C@@H]1OP(=O)([O-])[O-])[C@@H](O)C(=O)NCCC(=O)NCCS. The target protein sequence is MQLSHRPAETGDLETVAGFPQDRDELFYCYPKAIWPFSVAQLAAAIAERRGSTVAVHDGQVLGFANFYQWQHGDFCALGNMMVAPAARGLGVARYLIGVMENLAREQYKARLMKISCFNANAAGLLLYTQLGYQPRAIAERHDPDGRRVALIQMDKPLEP. The pKd is 5.4. (3) The small molecule is CC[C@H](C)[C@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(C)C)NC(=O)[C@H](CCSC)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](NC(=O)CCNC(=S)Nc1ccc(-c2c3ccc(=O)cc-3oc3cc(O)ccc23)c(C(=O)O)c1)[C@@H](C)O)C(=O)N1CCC[C@H]1C(N)=O. The target protein (P21645) has sequence MPSIRLADLAQQLDAELHGDGDIVITGVASMQSAQTGHITFMVNPKYREHLGLCQASAVVMTQDDLPFAKSAALVVKNPYLTYARMAQILDTTPQPAQNIAPSAVIDATAKLGNNVSIGANAVIESGVELGDNVIIGAGCFVGKNSKIGAGSRLWANVTIYHEIQIGQNCLIQSGTVVGADGFGYANDRGNWVKIPQIGRVIIGDRVEIGACTTIDRGALDDTIIGNGVIIDNQCQIAHNVVIGDNTAVAGGVIMAGSLKIGRYCMIGGASVINGHMEICDKVTVTGMGMVMRPITEPGVYSSGIPLQPNKVWRKTAALVMNIDDMSKRLKSLERKVNQQD. The pKd is 6.2. 
(4) The compound is CO[C@]12CC[C@@]3(C[C@@H]1C(C)(C)O)[C@H]1Cc4ccc(O)c5c4[C@@]3(CCN1CC1CC1)[C@H]2O5. The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSLQFPDDDYSWWDLFMAICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSYYFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. The pKd is 9.2. (5) The drug is Cc1ccc(NC(=O)c2ccc(CN3CCN(C)CC3)cc2)cc1Nc1nccc(-c2cccnc2)n1. The target protein sequence is HHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIINEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAFETMFQES. The pKd is 5.0. (6) The small molecule is COCCOCOc1ccc2c(c1)[C@H](NC(=O)c1ccc(-c3ccccc3)cc1)[C@@H](O)[C@H](CC(=O)O)N2C(=O)OCC[Si](C)(C)C. The target protein (Q61337) has sequence MGTPKQPSLAPAHALGLRKSDPGIRSLGSDAGGRRWRPAAQSMFQIPEFEPSEQEDASATDRGLGPSLTEDQPGPYLAPGLLGSNIHQQGRAATNSHHGGAGAMETRSRHSSYPAGTEEDEGMEEELSPFRGRSRSAPPNLWAAQRYGRELRRMSDEFEGSFKGLPRPKSAGTATQMRQSAGWTRIIQSWWDRNLGKGGSTPSQ. The pKd is 4.0.
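The pKd values in the records above follow the stated definition, pKd = -log10(Kd in M). A minimal sketch of that conversion is given below. The function names are hypothetical, and the nanomolar helper assumes the raw Kd is in nM, which the text above does not state.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Assumption: the raw Kd is given in nM, so scale to M before converting."""
    return kd_to_pkd(kd_nanomolar * 1e-9)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd in mol/L for a given pKd."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # A Kd of 10 micromolar (1e-5 M) corresponds to pKd 5.0, as in record (1).
    print(round(kd_to_pkd(1e-5), 3))
    # A pKd of 9.2, as in record (4), corresponds to a sub-nanomolar Kd.
    print(pkd_to_kd(9.2))
```

Because the scale is logarithmic, a difference of one pKd unit is a tenfold change in Kd: record (4) at 9.2 binds more than four orders of magnitude more tightly than record (1) at 5.0.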