Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=C(O[C@@H]1Cc2c(O)cc(O)cc2O[C@@H]1c1cc(O)c(O)c(O)c1)c1cc(O)c(O)c(O)c1. The target protein (P52020) has sequence MWTFLGIATFTYFYKKCGDVTLANKELLLCVLVFLSLGLVLSYRCRHRNGGLLGRHQSGSQFAAFSDILSALPLIGFFWAKSPPESEKKEQLESKRRRKEVNLSETTLTGAATSVSTSSVTDPEVIIIGSGVLGSALATVLSRDGRTVTVIERDLKEPDRILGECLQPGGYRVLRELGLGDTVESLNAHHIHGYVIHDCESRSEVQIPYPVSENNQVQSGVAFHHGKFIMSLRKAAMAEPNVKFIEGVVLRLLEEDDAVIGVQYKDKETGDTKELHAPLTVVADGLFSKFRKNLISNKVSVSSHFVGFIMKDAPQFKANFAELVLVDPSPVLIYQISPSETRVLVDIRGELPRNLREYMTEQIYPQIPDHLKESFLEACQNARLRTMPASFLPPSSVNKRGVLLLGDAYNLRHPLTGGGMTVALKDIKIWRQLLKDIPDLYDDAAIFQAKKSFFWSRKRSHSFVVNVLAQALYELFSATDDSLRQLRKACFLYFKLGGEC.... The pIC50 is 6.2. (2) The small molecule is CC(C)(O)c1cnc(OCc2cn(-c3ccc(F)c(C(F)F)c3)nn2)nc1. The target protein (P19793) has sequence MDTKHFLPLDFSTQVNSSLTSPTGRGSMAAPSLHPSLGPGIGSPGQLHSPISTLSSPINGMGPPFSVISSPMGPHSMSVPTTPTLGFSTGSPQLSSPMNPVSSSEDIKPPLGLNGVLKVPAHPSGNMASFTKHICAICGDRSSGKHYGVYSCEGCKGFFKRTVRKDLTYTCRDNKDCLIDKRQRNRCQYCRYQKCLAMGMKREAVQEERQRGKDRNENEVESTSSANEDMPVERILEAELAVEPKTETYVEANMGLNPSSPNDPVTNICQAADKQLFTLVEWAKRIPHFSELPLDDQVILLRAGWNELLIASFSHRSIAVKDGILLATGLHVHRNSAHSAGVGAIFDRVLTELVSKMRDMQMDKTELGCLRAIVLFNPDSKGLSNPAEVEALREKVYASLEAYCKHKYPEQPGRFAKLLLRLPALRSIGLKCLEHLFFFKLIGDTPIDTFLMEMLEAPHQMT. The pIC50 is 7.8. (3) The small molecule is CN(CCCc1ccc(-c2ccccc2)cc1)Cc1cccc2cc[nH]c12.Cl. 
The target protein (O07855) has sequence MKIAVIGAGVTGLAAAARIASQGHEVTIFEKNNNVGGRMNQLKKDGFTFDMGPTIVMMPDVYKDVFTACGKNYEDYIELRQLRYIYDVYFDHDDRITVPTDLAELQQMLESIEPGSTHGFMSFLTDVYKKYEIARRYFLERTYRKPSDFYNMTSLVQGAKLKTLNHADQLIEHYIDNEKIQKLLAFQTLYIGIDPKRGPSLYSIIPMIEMMFGVHFIKGGMYGMAQGLAQLNKDLGVNIELNAEIEQIIIDPKFKRADAIKVNGDIRKFDKILCTADFPSVAESLMPDFAPIKKYPPHKIADLDYSCSAFLMYIGIDIDVTDQVRLHNVIFSDDFRGNIEEIFEGRLSYDPSIYVYVPAVADKSLAPEGKTGIYVLMPTPELKTGSGIDWSDEALTQQIKEIIYRKLATIEVFEDIKSHIVSETIFTPNDFEQTYHAKFGSAFGLMPTLAQSNYYRPQNVSRDYKDLYFAGASTHPGAGVPIVLTSAKITVDEMIKDIER.... The pIC50 is 6.0. (4) The small molecule is CC(N)C(=O)Nc1cc(Cl)cc(NC(=O)c2ccccn2)n1. The target protein sequence is SDAVSSDRNFPNSTNLPRNPSMADYEARIFTFGTWIYSVNKEQLARAGFYALGEGDKVKCFHCGGGLTDWKPSEDPWEQHAKWYPGCKYLLEQKGQEYINNIHLTHSLEECLVRTT. The pIC50 is 4.8. (5) The drug is O=Cc1ccc(O)c(O)c1O. The target protein (P54300) has sequence MKKALLFSLISMVGFSPASQATQVLNGYWGYQEFLDEFPEQRNLTNALSEAVRAQPVPLSKPTQRPIKISVVYPGQQVSDYWVRNIASFEKRLYKLNINYQLNQVFTRPNADIKQQSLSLMEALKSKSDYLIFTLDTTRHRKFVEHVLDSTNTKLILQNITTPVREWDKHQPFLYVGFDHAEGSRELATEFGKFFPKHTYYSVLYFSEGYISDVRGDTFIHQVNRDNNFELQSAYYTKATKQSGYDAAKASLAKHPDVDFIYACSTDVALGAVDALAELGREDIMINGWGGGSAELDAIQKGDLDITVMRMNDDTGIAMAEAIKWDLEDKPVPTVYSGDFEIVTKADSPERIEALKKRAFRYSDN. The pIC50 is 5.4.
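The pIC50 definition used above (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50_molar` are hypothetical, and the nanomolar helper assumes raw IC50 values are supplied in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nanomolar (assumed input unit)."""
    # 1 nM = 1e-9 M, so the value is rescaled to molar first.
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 6.2, as in example (1), back-converted to molar units.
    print(pic50_to_ic50_molar(6.2))
    # A 1 uM (1000 nM) inhibitor expressed as pIC50.
    print(ic50_nm_to_pic50(1000.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50. The label 6.2 in example (1) therefore corresponds to an IC50 of roughly 0.63 µM, and the label 4.8 in example (4) to roughly 16 µM.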