From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=Nc1c(C2C(=O)Nc3c2cccc3C(F)(F)F)[nH]c2ccc(C(=O)O)cc12. The target protein (O19175) has sequence GEEVAVKLESQKARHPQLLYESKLYKILQGGVGIPHIRWYGQEKDYNVLVMDLLGPSLEDLFNFCSRRFTMKTVLMLADQMISRIEYVHTKNFIHRDIKPDNFLMGIGRHCNKLFLIDFGLAKKY. The pIC50 is 5.4. (2) The compound is COCCNC(=O)C(c1ccncc1)N(C(=O)Cn1nnc2ccccc21)c1ccc(OC)cc1. The target protein (P61981) has sequence MVDREQLVQKARLAEQAERYDDMAAAMKNVTELNEPLSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTSADGNEKKIEMVRAYREKIEKELEAVCQDVLSLLDNYLIKNCSETQYESKVFYLKMKGDYYRYLAEVATGEKRATVVESSEKAYSEAHEISKEHMQPTHPIRLGLALNYSVFYYEIQNAPEQACHLAKTAFDDAIAELDTLNEDSYKDSTLIMQLLRDNLTLWTSDQQDDDGGEGNN. The pIC50 is 4.6. (3) The compound is COc1ccc(CCNC(=O)CSc2nnc(CNc3ccc(Cl)cc3)o2)cc1OC. The target protein sequence is MATLEKLMKAFESLKSFQQQQQQQQQQQQQQQQQQQQQQQPPPPPPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPPGPAVAEEPLHRPKKELSATKKDRVNHCLTICENIVAQSVRNSPEFQKLLGIAMELFLLCSDDAESDVRMVADECLNKVIKALMDSNLPRLQLELYKEIKKNGAPRSLRAALWRFAELAHLVRPQKCRPYLVNLLPCLTRTSKRPEESVQETLAAAVPKIMASFGNFANDNEIKVLLKAFIANLKSSSPTIRRTAAGSAVSICQHSRRTQYFYSWLLNVLLGLLVPVEDEHSTLLILGVLLTLRYLVPLLQQQVKDTSLKGSFGVTRKEMEVSPSAEQLVQVYELTLHHTQHQDHNVVTGALELLQQLFRTPPPELLQTLTAVGGIGQLTAAKEESGGRSRSGSIVELIAGGGSSCSPVLSRKQKGKVLLGEEEALEDDSESRSDVSSSALTASVKDEISGELAASSGVSTPGSAGHDIITEQ.... The pIC50 is 5.1. (4) The pIC50 is 8.7. 
The target protein (P23097) has sequence ATFFLLSWTHCWSLPLPYGDDDDDDLSEEDLEFAEHYLKSYYHPVTLAGILKKSTVTSTVDRLREMQSFFGLDVTGKLDDPTLDIMRKPRCGVPDVGVYNVFPRTLKWSQTNLTYRIVNYTPDISHSEVEKAFRKAFKVWSDVTPLNFTRIHDGTADIMISFGTKEHGDFYPFDGPSGLLAHAFPPGPNLGGDAHFDDDETWTSSSKGYNLFIVAAHELGHSLGLDHSKDPGALMFPIYTYTGKSHFMLPDDDVQGIQSLYGPGDEDPNPKHPKTPEKCDPALSLDAITSLRGETMIFKDRFFWRLHPQQVEPELFLTKSFWPELPNHVDAAYEHPSRDLMFIFRGRKFWALNGYDIMEGYPRKISDLGFPKEVKRLSAAVHFEDTGKTLFFSGNHVWSYDDANQTMDKDYPRLIEEEFPGIGDKVDAVYEKNGYIYFFNGPIQFEYSIWSNRIVRVMPTNSLLWC. The compound is COc1ccc(S(=O)(=O)N2CCN(C(C)C)C(=O)C[C@@H]2C(=O)NO)cc1. (5) The drug is O=C(Nc1ccccc1)c1ccc(S(=O)(=O)NCCc2c(-c3cc4ccccc4o3)[nH]c3ccccc23)cc1. The target protein (P12530) has sequence MGVYRVCVSTGASIYAGSKNKVELWLVGQHGEVELGSCLRPTRNKEEEFKVNVSKYLGSLLFVRLRKKHFLKEDAWFCNWISVQALGAAEDKYWFPCYRWVVGDGVQSLPVGTGCTTVGDPQGLFQKHREQELEERRKLYQWGSWKEGLILNVAGSKLTDLPVDERFLEDKKIDFEASLAWGLAELALKNSLNILAPWKTLDDFNRIFWCGRSKLARRVRDSWQEDSLFGYQFLNGANPMLLRRSVQLPARLVFPPGMEELQAQLEKELKAGTLFEADFALLDNIKANVILYCQQYLAAPLVMLKLQPDGKLMPMVIQLHLPKIGSSPPPLFLPTDPPMVWLLAKCWVRSSDFQVHELNSHLLRGHLMAEVFTVATMRCLPSIHPVFKLIVPHLRYTLEINVRARNGLVSDFGIFDQIMSTGGGGHVQLLQQAGAFLTYRSFCPPDDLADRGLLGVESSFYAQDALRLWEIISRYVQGIMGLYYKTDEAVRDDLELQSWC.... The pIC50 is 7.8. (6) The small molecule is CCCCCCCCCCCCCCC(CCCCCCCCCCCCCC)C(=O)N[C@@H](COC1OC(C)C(O)C(O)C1O)C(=O)N[C@@H](CCC(=O)O)C(=O)NC. The target protein (P18337) has sequence MVFPWRCEGTYWGSRNILKLWVWTLLCCDFLIHHGTHCWTYHYSEKPMNWENARKFCKQNYTDLVAIQNKREIEYLENTLPKSPYYYWIGIRKIGKMWTWVGTNKTLTKEAENWGAGEPNNKKSKEDCVEIYIKRERDSGKWNDDACHKRKAALCYTASCQPGSCNGRGECVETINNHTCICDAGYYGPQCQYVVQCEPLEAPELGTMDCIHPLGNFSFQSKCAFNCSEGRELLGTAETQCGASGNWSSPEPICQVVQCEPLEAPELGTMDCIHPLGNFSFQSKCAFNCSEGRELLGTAETQCGASGNWSSPEPICQETNRSFSKIKEGDYNPLFIPVAVMVTAFSGLAFLIWLARRLKKGKKSQERMDDPY. The pIC50 is 4.7.
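The pIC50 definition given above (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is only an illustration of the stated formula, not the dataset's own preprocessing code; the function names `ic50_to_pic50`, `pic50_to_ic50`, and `nanomolar_to_molar` are hypothetical helpers introduced here, and the nanomolar helper assumes raw IC50 values might be reported in nM, which the record itself does not state.

```python
import math


def nanomolar_to_molar(ic50_nm: float) -> float:
    """Convert a concentration from nM to M (assumed input unit, see lead-in)."""
    return ic50_nm * 1e-9


def ic50_to_pic50(ic50_molar: float) -> float:
    """Apply pIC50 = -log10(IC50 in M); higher values mean more potent."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (2) and (4) in the record above.
    for label in (4.6, 8.7):
        print(f"pIC50 {label} corresponds to IC50 = {pic50_to_ic50(label):.3g} M")
```

Under this definition, the label 8.7 in example (4) corresponds to an IC50 of roughly 2 nM, while the label 4.6 in example (2) corresponds to roughly 25 µM, so a difference of one pIC50 unit is a tenfold difference in IC50.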