This data is from Drug-target binding data from BindingDB using Kd measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is Cc1cncc(-c2cnc(N[C@@H]3CCN(C)C[C@H]3OCC3CCS(=O)(=O)CC3)c3[nH]c(=O)c(C)cc23)c1. The target protein (Q9UIG0) has sequence MAPLLGRKPFPLVKPLPGEEPLFTIPHTQEAFRTREEYEARLERYSERIWTCKSTGSSQLTHKEAWEEEQEVAELLKEEFPAWYEKLVLEMVHHNTASLEKLVDTAWLEIMTKYAVGEECDFEVGKEKMLKVKIVKIHPLEKVDEEATEKKSDGACDSPSSDKENSSQIAQDHQKKETVVKEDEGRRESINDRARRSPRKLPTSLKKGERKWAPPKFLPHKYDVKLQNEDKIISNVPADSLIRTERPPNKEIVRYFIRHNALRAGTGENAPWVVEDELVKKYSLPSKFSDFLLDPYKYMTLNPSTKRKNTGSPDRKPSKKSKTDNSSLSSPLNPKLWCHVHLKKSLSGSPLKVKNSKNSKSPEEHLEEMMKMMSPNKLHTNFHIPKKGPPAKKPGKHSDKPLKAKGRSKGILNGQKSTGNSKSPKKGLKTPKTKMKQMTLLDMAKGTQKMTRAPRNSGGTPRTSSKPHKHLPPAALHLIAYYKENKDREDKRSALSCVIS.... The pKd is 4.0. (2) The drug is COc1cc2c(N3CCN(C(=O)Nc4ccc(OC(C)C)cc4)CC3)ncnc2cc1OCCCN1CCCCC1. The target protein sequence is MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKT.... The pKd is 8.3. (3) The drug is O=C(O)CC(O)(CC(=O)O)C(=O)O. The target protein (P15474) has sequence MPRSLANAPIMILNGPNLNLLGQRQPEIYGSDTLADVEALCVKAAAAHGGTVDFRQSNHEGELVDWIHEARLNHCGIVINPAAYSHTSVAILDALNTCDGLPVVEVHISNIHQREPFRHHSYVSQRADGVVAGCGVQGYVFGVERIAALAGAGSARA. The pKd is 5.1. (4) The small molecule is Cc1sc2c(c1C)C(c1ccc(Cl)cc1)=N[C@H](CC(=O)OC(C)(C)C)c1nnc(C)n1-2. 
The target protein sequence is GKLSEHLRYCDSILREMLSKKHAAYAWPFYKPVDAEALELHDYHDIIKHPMDLSTVKRKMDGREYPDAQGFAADVRLMFSNCYKYNPPDHEVVAMARKLQDVFEMRFAKMP. The pKd is 7.7. (5) The drug is CS(=O)(=O)CCNCc1ccc(-c2ccc3ncnc(Nc4ccc(OCc5cccc(F)c5)c(Cl)c4)c3c2)o1. The target protein (Q9Y2U5) has sequence MDDQQALNSIMQDLAVLHKASRPALSLQETRKAKSSSPKKQNDVRVKFEHRGEKRILQFPRPVKLEDLRSKAKIAFGQSMDLHYTNNELVIPLTTQDDLDKAVELLDRSIHMKSLKILLVINGSTQATNLEPLPSLEDLDNTVFGAERKKRLSIIGPTSRDRSSPPPGYIPDELHQVARNGSFTSINSEGEFIPESMDQMLDPLSLSSPENSGSGSCPSLDSPLDGESYPKSRMPRAQSYPDNHQEFSDYDNPIFEKFGKGGTYPRRYHVSYHHQEYNDGRKTFPRARRTQGTSLRSPVSFSPTDHSLSTSSGSSIFTPEYDDSRIRRRGSDIDNPTLTVMDISPPSRSPRAPTNWRLGKLLGQGAFGRVYLCYDVDTGRELAVKQVQFDPDSPETSKEVNALECEIQLLKNLLHERIVQYYGCLRDPQEKTLSIFMEYMPGGSIKDQLKAYGALTENVTRKYTRQILEGVHYLHSNMIVHRDIKGANILRDSTGNVKLG.... The pKd is 5.0. (6) The pKd is 5.0. The target protein (P36507) has sequence MLARRKPVLPALTINPTIAEGPSPTSEGASEANLVDLQKKLEELELDEQQKKRLEAFLTQKAKVGELKDDDFERISELGAGNGGVVTKVQHRPSGLIMARKLIHLEIKPAIRNQIIRELQVLHECNSPYIVGFYGAFYSDGEISICMEHMDGGSLDQVLKEAKRIPEEILGKVSIAVLRGLAYLREKHQIMHRDVKPSNILVNSRGEIKLCDFGVSGQLIDSMANSFVGTRSYMAPERLQGTHYSVQSDIWSMGLSLVELAVGRYPIPPPDAKELEAIFGRPVVDGEEGEPHSISPRPRPPGRPVSGHGMDSRPAMAIFELLDYIVNEPPPKLPNGVFTPDFQEFVNKCLIKNPAERADLKMLTNHTFIKRSEVEEVDFAGWLCKTLRLNQPGTPTRTAV. The compound is COc1ccc(COc2ccc(Cc3cnc(N)nc3N)cc2OC)cc1. (7) The compound is CN[C@H](CC(C)C)C(=O)N[C@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(=O)N[C@@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)O)c1ccc(O)cc1)c1ccc(O)cc1. The target protein (Q8RN04) has sequence MSEDDPRPLHIRRQGLDPADELLAAGALTRVTIGSGADAETHWMATAHAVVRQVMGDHQQFSTRRRWDPRDEIGGKGIFRPRELVGNLMDYDPPEHTRLRRKLTPGFTLRKMQRMAPYIEQIVNDRLDEMERAGSPADLIAFVADKVPGAVLCELVGVPRDDRDMFMKLCHGHLDASLSQKRRAALGDKFSRYLLAMIARERKEPGEGMIGAVVAEYGDDATDEELRGFCVQVMLAGDDNISGMIGLGVLAMLRHPEQIDAFRGDEQSAQRAVDELIRYLTVPYSPTPRIAREDLTLAGQEIKKGDSVICSLPAANRDPALAPDVDRLDVTREPIPHVAFGHGVHHCLGAALARLELRTVFTELWRRFPALRLADPAQDTEFRLTTPAYGLTELMVAW. The pKd is 4.8.
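The pKd values quoted in the examples above follow the stated definition pKd = -log10(Kd in M). A minimal sketch of that conversion is given below; the function names (kd_to_pkd, pkd_to_kd, kd_nm_to_pkd) are hypothetical helpers introduced here for illustration and are not part of the dataset or of BindingDB. The nanomolar helper assumes the raw Kd is reported in nM, which may not hold for every source record.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in M")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd in mol/L = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


def kd_nm_to_pkd(kd_nm: float) -> float:
    """Convenience wrapper, assuming the raw Kd is given in nanomolar."""
    return kd_to_pkd(kd_nm * 1e-9)


if __name__ == "__main__":
    # Label values taken from examples (1) and (2) above.
    for pkd in (4.0, 8.3):
        print(f"pKd {pkd} -> Kd {pkd_to_kd(pkd):.3e} M")
```

Under this definition, a pKd of 4.0 corresponds to a Kd of 100 µM and a pKd of 8.3 to roughly 5 nM, so each unit increase in pKd is a tenfold tighter binder.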