From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is C[C@H](CCC(=O)NCC(=O)O)C1CCC2C3C(CC[C@@]21C)[C@@]1(C)Cc2cn(Cc4cccc(O)c4)nc2CC1C[C@H]3O. The target protein (Q14973) has sequence MEAHNASAPFNFTLPPNFGKRPTDLALSVILVFMLFFIMLSLGCTMEFSKIKAHLWKPKGLAIALVAQYGIMPLTAFVLGKVFRLKNIEALAILVCGCSPGGNLSNVFSLAMKGDMNLSIVMTTCSTFCALGMMPLLLYIYSRGIYDGDLKDKVPYKGIVISLVLVLIPCTIGIVLKSKRPQYMRYVIKGGMIIILLCSVAVTVLSAINVGKSIMFAMTPLLIATSSLMPFIGFLLGYVLSALFCLNGRCRRTVSMETGCQNVQLCSTILNVAFPPEVIGPLFFFPLLYMIFQLGEGLLLIAIFWCYEKFKTPKDKTKMIYTAATTEETIPGALGNGTYKGEDCSPCTA. The pIC50 is 6.5. (2) The drug is O=C(c1ccc(-c2cccc(NS(=O)(=O)c3ccccc3OC(F)(F)F)c2)s1)c1c(F)ccc(O)c1F. The target protein (Q62730) has sequence MNPFSSESAWLCLTATAVLGGMLLCKAWSSGQLRSQVVCLAGLWGGACLLSLSLLCSLFLLSVSCFFLLYVSSSDQDLLPVDQKAVLVTGADSGFGHALAKHLDKLGFTVFAGVLDKEGPGAEELRKNCSERLSVLQMDVTKPEQIKDVHSEVAEKIQDKGLWAVVNNAGVLHFPIDGELIPMTVYRKCMAVNFFGAVEVTKVFLPLLRKSKGRLVNVSSMGAMIPFQMVAAYASTKAAISMFSAVIRQELAKWGVKVVTIHPGGFQTNIVGSQDSWDKMEKEILDHFSKEIQENYGQEYVHTQKLALPVMREMSNPDITPVLRDIQHAICAKNPSSFYCSGRMTYLWICFAAYSPISLLDYILKNYFTPKLMPRALRTAS. The pIC50 is 7.4. (3) The compound is O=c1oc2c(O)c(O)cc3c(=O)oc4c(O)c(O)cc1c4c23. The target protein (Q6P6U0) has sequence MGCVFCKKLEPAPKEDVGLEGDFRSQGAEERYYPDPTQGRSSSISPQPISPAFLNVGNIRSVSGTGVTIFVALYDYEARTGDDLTFTKGEKFHILNNTEYDWWEARSLSSGRTGYVPSNYVAPVDSIQAEEWYFGKISRKDAERQLLSDGNPQGAFLIRESETTKGAYSLSIRDWDQNRGDHIKHYKIRKLDMGGYYITTRAQFESVQDLVRHYMEVNDGLCYLLTAPCMVMKPQTLGLAKDAWEIDRNSIALDRRLGTGCFGDVWLGTWNCSTKVAVKTLKPGTMSPKAFLEEAQIMKLLRHDKLVQLYAVVSEEPIYIVTEFMCYGSLLDFLKDRKGHNLMLPNLVDMAAQVAEGMAYMERMNYIHRDLRAANILVGEHLICKIADFGLARLIVDDEYNPQQGTKFPIKWTAPEAALFGRFTVKSDVWSFGILLTELITKGRVPYPGMNNREVLEQVEHGYHMPCPPGCPVSLYEVMEQTWRLDPEERPTFEYLQSFL.... The pIC50 is 5.0. 
(4) The compound is CNC(=O)c1nc2cccnn2c1-c1cc(F)c(Cl)cc1C. The target protein sequence is ATFPGHSQRREEFLYRSDSDYDLSPKAMSRNSSLPSEQHGDDLIVTPFAQVLASLRSVRNNFTILTNLHGTSNKRSPAASQPPVSRVNPQEESYQKLAMETLEELDWCLDQLETIQTYRSVSEMASNKFKRMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPSPTQKDREKKKKQQLMTQISGVKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWGLNIFNVAGYSHNRPLTCIMYAIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNSLHAADVAQSTHVLLSTPALDAVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLLQEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLLADLKTMVETKKVTSSGVLLLDNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEEFFQQGDKERERGMEISPMCDKHTASVEKSQVGFIDY.... The pIC50 is 6.8. (5) The compound is O=C(NCc1ccc(Cl)cc1)c1ccc2ccccc2c1. The target protein (P28857) has sequence MDSVSFFNPYLEANRLKKKSRSSYIRILPRGIMHDGAAGLIKDVCDSEPRMFYRDRQYLLSKEMTWPSLDIARSKDYDHMRMKFHIYDAVETLMFTDSIENLPFQYRHFVIPSGTVIRMFGRTEDGEKICVNVFGQEQYFYCECVDGRSLKATINNLMLTGEVKMSCSFVIEPADKLSLYGYNANTVVNLFKVSFGNFYVSQRIGKILQNEGFVVYEIDVDVLTRFFVDNGFLSFGWYNVKKYIPQDMGKGSNLEVEINCHVSDLVSLEDVNWPLYGCWSFDIECLGQNGNFPDAENLGDIVIQISVISFDTEGDRDERHLFTLGTCEKIDGVHIYEFASEFELLLGFFIFLRIESPEFITGYNINNFDLKYLCIRMDKIYHYDIGCFSKLKNGKIGISVPHEQYRKGFLQAQTKVFTSGVLYLDMYPVYSSKITAQNYKLDTIAKICLQQEKEQLSYKEIPKKFISGPSGRAVVGKYCLQDSVLVVRLFKQINYHFEVA.... The pIC50 is 4.0. (6) The small molecule is CCc1ccccc1CC1CCc2cc(OC)ccc2C1=O. The target protein (Q02318) has sequence MAALGCARLRWALRGAGRGLCPHGARAKAAIPAALPSDKATGAPGAGPGVRRRQRSLEEIPRLGQLRFFFQLFVQGYALQLHQLQVLYKAKYGPMWMSYLGPQMHVNLASAPLLEQVMRQEGKYPVRNDMELWKEHRDQHDLTYGPFTTEGHHWYQLRQALNQRLLKPAEAALYTDAFNEVIDDFMTRLDQLRAESASGNQVSDMAQLFYYFALEAICYILFEKRIGCLQRSIPEDTVTFVRSIGLMFQNSLYATFLPKWTRPVLPFWKRYLDGWNAIFSFGKKLIDEKLEDMEAQLQAAGPDGIQVSGYLHFLLASGQLSPREAMGSLPELLMAGVDTTSNTLTWALYHLSKDPEIQEALHEEVVGVVPAGQVPQHKDFAHMPLLKAVLKETLRLYPVVPTNSRIIEKEIEVDGFLFPKNTQFVFCHYVVSRDPTAFSEPESFQPHRWLRNSQPATPRIQHPFGSVPFGYGVRACLGRRIAELEMQLLLARLIQKYKVV.... The pIC50 is 5.1. (7) The drug is Cc1ccc(Cn2nc3c(C(F)(F)F)cccc3c2-c2ccc(F)cc2)c(C)c1. The pIC50 is 8.0. 
The target protein sequence is MREQCVLSEEQIRKKKIRKQQQQESQSQSQSPVGPQGSSSSASGPGASPGGSEAGSQGSGEGEGVQLTAAQELMIQQLVAAQLQCNKRSFSDQPKVTPWPLGADPQSRDARQQRFAHFTELAIISVQEIVDFAKQVPGFLQLGREDQIALLKASTIEIMLLETARRYNHETECITFLKDFTYSKDDFHRAGLQVEFINPIFEFSRAMRRLGLDDAEYALLIAINIFSADRPNVQEPGRVEALQQPYVEALLSYTRIKRPQDQLRFPRMLMKLVSLRTLSSVHSEQVFALRLQDKKLPPLLSEIWDVHE. (8) The small molecule is CCC(CC)O[C@@H]1C=C(C(=O)O)C[C@H](N)[C@H]1NC(C)=O. The target protein sequence is MNPNQKIITIGSICMVVGIVSLMLQIGNMISIWVSHSIQTGNQHQAEPIRNTNFLTENAVASVTLAGNSSLCPIRGWAVHSKDNSIRIGSKGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPHRTLMSCPVGEAPSPYNSRFESVAWSASACHDGTSWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSNGQASYKIFKMEKGKVVKSVELNAPNYHYEECSCYPDAGEIICVCRDNWHGSNRPWVSFNQNLEYQIGYICSGVFGDNPRPNDGTGSCGPVSPNGAYGIKGFSFKYGNGVWIGRTKSTNSRSGFEMIWDPNGWTGTDSNFSMKQDIVAITDWSGYSGSFVQHPELTGLDCIRPCFWVELIRGRPKESTIWTSGSSISFCGVNSDTVSWSWPDGAELPFTIDK. The pIC50 is 9.7.
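The pIC50 definition stated in the header (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion utility. This is only an illustration: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and not part of the dataset, and the nanomolar helper assumes the usual 1 M = 1e9 nM conversion.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (assumes 1 M = 1e9 nM)."""
    return 10 ** (9 - pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6.
    print(round(ic50_to_pic50(1e-6), 3))
    # Example (1) reports pIC50 = 6.5, i.e. an IC50 of roughly 316 nM.
    print(round(pic50_to_ic50_nm(6.5), 1))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent binding.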