Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd, drug-target binding data from BindingDB using Kd measurements. (1) The compound is C[C@@H](Oc1cc(-n2cnc3ccc(CN4CCN(C)CC4)cc32)sc1C(N)=O)c1ccccc1C(F)(F)F. The target protein (Q16816) has sequence MTRDEALPDSHSAQDFYENYEPKEILGRGVSSVVRRCIHKPTSQEYAVKVIDVTGGGSFSPEEVRELREATLKEVDILRKVSGHPNIIQLKDTYETNTFFFLVFDLMKRGELFDYLTEKVTLSEKETRKIMRALLEVICTLHKLNIVHRDLKPENILLDDNMNIKLTDFGFSCQLEPGERLREVCGTPSYLAPEIIECSMNEDHPGYGKEVDMWSTGVIMYTLLAGSPPFWHRKQMLMLRMIMSGNYQFGSPEWDDYSDTVKDLVSRFLVVQPQNRYTAEEALAHPFFQQYLVEEVRHFSPRGKFKVIALTVLASVRIYYQYRRVKPVTREIVIRDPYALRPLRRLIDAYAFRIYGHWVKKGQQQNRAALFENTPKAVLLSLAEEDY. The pKd is 5.0. (2) The pKd is 5.7. The target protein (P36888) has sequence MPALARDGGQLPLLVVFSAMIFGTITNQDLPVIKCVLINHKNNDSSVGKSSSYPMVSESPEDLGCALRPQSSGTVYEAAAVEVDVSASITLQVLVDAPGNISCLWVFKHSSLNCQPHFDLQNRGVVSMVILKMTETQAGEYLLFIQSEATNYTILFTVSIRNTLLYTLRRPYFRKMENQDALVCISESVPEPIVEWVLCDSQGESCKEESPAVVKKEEKVLHELFGTDIRCCARNELGRECTRLFTIDLNQTPQTTLPQLFLKVGEPLWIRCKAVHVNHGFGLTWELENKALEEGNYFEMSTYSTNRTMIRILFAFVSSVARNDTGYYTCSSSKHPSQSALVTIVEKGFINATNSSEDYEIDQYEEFCFSVRFKAYPQIRCTWTFSRKSFPCEQKGLDNGYSISKFCNHKHQPGEYIFHAENDDAQFTKMFTLNIRRKPQVLAEASASQASCFSDGYPLPSWTWKKCSDKSPNCTEEITEGVWNRKANRKVFGQWVSSST.... The drug is Cc1n[nH]c2ccc(-c3cncc(OC[C@@H](N)Cc4ccccc4)c3)cc12. (3) The drug is Cc1cc2c(cc1C(=O)c1cccc(C(=O)O)c1)C(C)(C)CCC2(C)C.
The target protein (P28704) has sequence MSWATRPPFLPPRHAAGQCGPVGVRKEMHCGVASRWRRRRPWLDPAAAAAAAGEQQALEPEPGEAGRDGMGDSGRDSRSPDSSSPNPLSQGIRPSSPPGPPLTPSAPPPPMPPPPLGSPFPVISSSMGSPGLPPPAPPGFSGPVSSPQINSTVSLPGGGSGPPEDVKPPVLGVRGLHCPPPPGGPGAGKRLCAICGDRSSGKHYGVYSCEGCKGFFKRTIRKDLTYSCRDNKDCTVDKRQRNRCQYCRYQKCLATGMKREAVQEERQRGKDKDGDGDGAGGAPEEMPVDRILEAELAVEQKSDQGVEGPGATGGGGSSPNDPVTNICQAADKQLFTLVEWAKRIPHFSSLPLDDQVILLRAGWNELLIASFSHRSIDVRDGILLATGLHVHRNSAHSAGVGAIFDRVLTELVSKMRDMRMDKTELGCLRAIILFNPDAKGLSNPGEVEILREKVYASLETYCKQKYPEQQGRFAKLLLRLPALRSIGLKCLEHLFFFKLI.... The pKd is 6.0. (4) The compound is C=C/C(C)=C/[C@@]1(C)SC(=O)C(C(=O)C(F)(F)F)C1=O. The target protein (P9WQD9) has sequence MSQPSTANGGFPSVVVTAVTATTSISPDIESTWKGLLAGESGIHALEDEFVTKWDLAVKIGGHLKDPVDSHMGRLDMRRMSYVQRMGKLLGGQLWESAGSPEVDPDRFAVVVGTGLGGAERIVESYDLMNAGGPRKVSPLAVQMIMPNGAAAVIGLQLGARAGVMTPVSACSSGSEAIAHAWRQIVMGDADVAVCGGVEGPIEALPIAAFSMMRAMSTRNDEPERASRPFDKDRDGFVFGEAGALMLIETEEHAKARGAKPLARLLGAGITSDAFHMVAPAADGVRAGRAMTRSLELAGLSPADIDHVNAHGTATPIGDAAEANAIRVAGCDQAAVYAPKSALGHSIGAVGALESVLTVLTLRDGVIPPTLNYETPDPEIDLDVVAGEPRYGDYRYAVNNSFGFGGHNVALAFGRY. The pKd is 4.7.
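
The task description defines the label as pKd = -log10(Kd in M). As a minimal sketch of that conversion, the Python below turns a Kd value into pKd and back. The function names `pkd_from_kd` and `kd_molar_from_pkd`, the `unit` argument, and the unit table are hypothetical helpers introduced for illustration only; they are not part of the dataset or of any BindingDB tooling.

```python
import math

# Assumed unit table (illustrative): factor that converts each unit to mol/L.
_UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def pkd_from_kd(kd_value, unit="M"):
    """Return pKd = -log10(Kd in M) for a positive Kd given in `unit`."""
    if kd_value <= 0:
        raise ValueError("Kd must be positive to take a logarithm")
    kd_molar = kd_value * _UNIT_TO_MOLAR[unit]
    return -math.log10(kd_molar)


def kd_molar_from_pkd(pkd):
    """Invert the definition: Kd in M = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # A Kd of 10 uM gives the pKd reported in example (1).
    print(round(pkd_from_kd(10, unit="uM"), 3))
    # A pKd of 6.0, as in example (3), corresponds to a Kd of about 1 uM.
    print(kd_molar_from_pkd(6.0))
```

Because the scale is logarithmic, a difference of one pKd unit corresponds to a tenfold difference in Kd, so the pKd of 6.0 in example (3) indicates roughly twentyfold tighter binding than the 4.7 in example (4).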