Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The small molecule is CCOc1ccc(C[C@@H](NC(=O)CC(C)(C)C)C(=O)N[C@@H](Cc2ccccc2)C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)NCCCC(=O)N2CCC[C@@H]2C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCCNC(=N)N)C(N)=O)C(C)C)cc1. The target protein (Q00788) has sequence MLLVSTVSAVPGLFSPPSSPSNSSQEELLDDRDPLLVRAELALLSTIFVAVALSNGLVLGALIRRGRRGRWAPMHVFISHLCLADLAVALFQVLPQLAWDATDRFHGPDALCRAVKYLQMVGMYASSYMILAMTLDRHRAICRPMLAYRHGGGARWNRPVLVAWAFSLLLSLPQLFIFAQRDVGNGSGVFDCWARFAEPWGLRAYVTWIALMVFVAPALGIAACQVLIFREIHASLVPGPSERAGRRRRGRRTGSPSEGAHVSAAMAKTVRMTLVIVIVYVLCWAPFFLVQLWAAWDPEAPLERPPFVLLMLLASLNSCTNPWIYASFSSSVSSELRSLLCCAQRHTTHSLGPQDESCATASSSLMKDTPS. The pKd is 7.9. (2) The compound is Cc1sc2c(c1C)C(c1ccc(Cl)cc1)=N[C@@H](CC(=O)OC(C)(C)C)c1nnc(C)n1-2. The target protein sequence is NPPPPETSNPNKPKRQTNQLQYLLRVVLKTLWKHQFAWPFQQPVDAVKLNAPDYYKIIKTPMDMGTIKKRLENNYYWNAQECIQDFNTMFTNCYIYNKPGDDIVLMAEALEKLFLQKINELPT. The pKd is 8.7. (3) The compound is Cc1ccc2c(c1)C(=O)N(C1CCN(CCCn3ccnc3-c3ccccc3)CC1)C(c1ccccc1F)N2. The target protein (O43924) has sequence MSAKDERAREILRGFKLNWMNLRDAETGKILWQGTEDLSVPGVEHEARVPKKILKCKAVSRELNFSSTEQMEKFRLEQKVYFKGQCLEEWFFEFGFVIPNSTNTWQSLIEAAPESQMMPASVLTGNVIIETKFFDDDLLVSTSRVRLFYV. The pKd is 7.2. (4) The drug is NCCCC[C@@H](N)C(=O)O. 
The target protein sequence is MGSKTKTTEAPALRRELKARHLTMIAIGGSIGTGLFVASGATISQAGPGGALFSYILIGLMVYFLMTSLGELAAYMPVSGSFATYGQNYVEEGFGFALGWNYWYNWAVTIAVDLVAAQLVMGWWFPDTPGWIWSALFLCVIFLLNYISVRGFGEAEYWFSLIKVATVIIFIIVGVAMIIGIFKGVEPVGWSNWTTGDAPFAGGFAAMIGVAMIVGFSFQGTELIGIAAGESENPEKNIPRAVRQVFWRILLFYVFAILIISLIIPYTDPNLLRNDVKDISVSPFTLVFQHAGLLSAAAIMNAVILTAVLSAGNSGMYASTRMLYTLACDGKAPRIFAKLSRGGVPRNALYATTVIAGLCFLTSMFGNQTVYLWLLNTSGMTGFIAWLGIAISHYRFRRGYVLQGYDVNDLPYRSGFFPLGPIFAFVLCLIITLGQNYEAFLKDTIDWGGVAATYIGIPLFLLIWFGYKLIKGTHFVRYSEMHFPERVKK. The pKd is 2.7. (5) The compound is CCN(CCO)CCCOc1ccc2c(Nc3cc(CC(=O)Nc4cccc(F)c4)n[nH]3)ncnc2c1. The target protein (P78356) has sequence MSSNCTSTTAVAVAPLSASKTKTKKKHFVCQKVKLFRASEPILSVLMWGVNHTINELSNVPVPVMLMPDDFKAYSKIKVDNHLFNKENLPSRFKFKEYCPMVFRNLRERFGIDDQDYQNSVTRSAPINSDSQGRCGTRFLTTYDRRFVIKTVSSEDVAEMHNILKKYHQFIVECHGNTLLPQFLGMYRLTVDGVETYMVVTRNVFSHRLTVHRKYDLKGSTVAREASDKEKAKDLPTFKDNDFLNEGQKLHVGEESKKNFLEKLKRDVEFLAQLKIMDYSLLVGIHDVDRAEQEEMEVEERAEDEECENDGVGGNLLCSYGTPPDSPGNLLSFPRFFGPGEFDPSVDVYAMKSHESSPKKEVYFMAIIDILTPYDTKKKAAHAAKTVKHGAGAEISTVNPEQYSKRFNEFMSNILT. The pKd is 5.0. (6) The compound is Nc1c(S(=O)(=O)[O-])cc(Nc2ccc(-c3ccc(Nc4cc(S(=O)(=O)[O-])c(N)c5c4C(=O)c4ccccc4C5=O)cc3)cc2)c2c1C(=O)c1ccccc1C2=O. The target protein (P14210) has sequence MWVTKLLPALLLQHVLLHLLLLPIAIPYAEGQRKRRNTIHEFKKSAKTTLIKIDPALKIKTKKVNTADQCANRCTRNKGLPFTCKAFVFDKARKQCLWFPFNSMSSGVKKEFGHEFDLYENKDYIRNCIIGKGRSYKGTVSITKSGIKCQPWSSMIPHEHSFLPSSYRGKDLQENYCRNPRGEEGGPWCFTSNPEVRYEVCDIPQCSEVECMTCNGESYRGLMDHTESGKICQRWDHQTPHRHKFLPERYPDKGFDDNYCRNPDGQPRPWCYTLDPHTRWEYCAIKTCADNTMNDTDVPLETTECIQGQGEGYRGTVNTIWNGIPCQRWDSQYPHEHDMTPENFKCKDLRENYCRNPDGSESPWCFTTDPNIRVGYCSQIPNCDMSHGQDCYRGNGKNYMGNLSQTRSGLTCSMWDKNMEDLHRHIFWEPDASKLNENYCRNPDDDAHGPWCYTGNPLIPWDYCPISRCEGDTTPTIVNLDHPVISCAKTKQLRVVNGIP.... The pKd is 5.7. (7) The small molecule is Oc1ccccc1I. 
The target protein sequence is MGEPRTRAEARPWVDEDLKDSSDLHQAEEDADEWQESEENVEHIPFSHNHYPEKEMVKRSQEFYELLNKRRSVRFISNEQVPMEVIDNVIRTAGTAPSGAHTEPWTFVVVKDPDVKHKIRKIIEEEEEINYMKRMGHRWVTDLKKLRTNWIKEYLDTAPILILIFKQVHGFAANGKKKVHYYNEISVSIACGILLAALQNAGLVTVTTTPLNCGPRLRVLLGRPAHEKLLMLLPVGYPSKEATVPDLKRKPLDQIMVTV. The pKd is 2.9. (8) The compound is CCn1c(-c2nonc2N)nc2c(C#CC(C)(C)O)ncc(OC[C@H]3CCCNC3)c21. The target is PFCDPK1(Pfalciparum). The pKd is 5.0. (9) The small molecule is CC[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)c1ccc(Br)cc1)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCC[N+](C)(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCCNC(=S)Nc1ccc(-c2c3ccc(=O)cc-3oc3cc(O)ccc23)c(C(=O)O)c1)C(N)=O. The target protein (O00257) has sequence MELPAVGEHVFAVESIEKKRIRKGRVEYLVKWRGWSPKYNTWEPEENILDPRLLIAFQNRERQEQLMGYRKRGPKPKPLVVQVPTFARRSNVLTGLQDSSTDNRAKLDLGAQGKGQGHQYELNSKKHHQYQPHSKERAGKPPPPGKSGKYYYQLNSKKHHPYQPDPKMYDLQYQGGHKEAPSPTCPDLGAKSHPPDKWAQGAGAKGYLGAVKPLAGAAGAPGKGSEKGPPNGMMPAPKEAVTGNGIGGKMKIVKNKNKNGRIVIVMSKYMENGMQAVKIKSGEVAEGEARSPSHKKRAADERHPPADRTFKKAAGAEEKKVEAPPKRREEEVSGVSDPQPQDAGSRKLSPTKEAFGEQPLQLTTKPDLLAWDPARNTHPPSHHPHPHPHHHHHHHHHHHHAVGLNLSHVRKRCLSETHGEREPCKKRLTARSISTPTCLGGSPAAERPADLPPAAALPQPEVILLDSDLDEPIDLRCVKTRSEAGEPPSSLQVKPETPAS.... The pKd is 6.1. (10) The drug is NC(=O)C[C@@H]1NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc2ccc(O)cc2)NC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@@H]2CCCCNC(=O)CC[C@H](NC1=O)C(=O)N[C@H](C(N)=O)CSCC(=O)N2. The target protein (Q14451) has sequence MELDLSPPHLSSSPEDLCPAPGTPPGTPRPPDTPLPEEVKRSQPLLIPTTGRKLREEERRATSLPSIPNPFPELCSPPSQSPILGGPSSARGLLPRDASRPHVVKVYSEDGACRSVEVAAGATARHVCEMLVQRAHALSDETWGLVECHPHLALERGLEDHESVVEVQAAWPVGGDSRFVFRKNFAKYELFKSSPHSLFPEKMVSSCLDAHTGISHEDLIQNFLNAGSFPEIQGFLQLRGSGRKLWKRFFCFLRRSGLYYSTKGTSKDPRHLQYVADVNESNVYVVTQGRKLYGMPTDFGFCVKPNKLRNGHKGLRIFCSEDEQSRTCWLAAFRLFKYGVQLYKNYQQAQSRHLHPSCLGSPPLRSASDNTLVAMDFSGHAGRVIENPREALSVALEEAQAWRKKTNHRLSLPMPASGTSLSAAIHRTQLWFHGRISREESQRLIGQQGLVDGLFLVRESQRNPQGFVLSLCHLQKVKHYLILPSEEEGRLYFSMDDGQT.... 
The pKd is 6.0.
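
The pKd definition used for this dataset (pKd = -log10(Kd in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset tooling: the function names are hypothetical, and the nanomolar helper assumes that raw Kd values are supplied in nM, which the text above does not state.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Convert a Kd given in nM (an assumed input unit) to pKd."""
    return kd_to_pkd(kd_nanomolar * 1e-9)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # Higher pKd means a smaller Kd, i.e. stronger binding.
    for pkd in (2.7, 6.0, 8.7):
        print(f"pKd {pkd} corresponds to Kd = {pkd_to_kd(pkd):.3e} M")
```

Under this definition a pKd of 6.0 corresponds to a Kd of 1 µM, and each additional pKd unit is a tenfold decrease in Kd.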