Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50, drug-target binding data from BindingDB based on IC50 measurements. (1) The drug is O=C(NCC(CNC(=O)c1ccc2ccccc2c1)O[C@H]1O[C@H](CO)[C@@H](O)[C@H](O)[C@@H]1O)c1ccc2ccccc2c1. The target protein (Q9NNX6) has sequence MSDSKEPRLQQLGLLEEEQLRGLGFRQTRGYKSLAGCLGHGPLVLQLLSFTLLAGLLVQVSKVPSSISQEQSRQDAIYQNLTQLKAAVGELSEKSKLQEIYQELTQLKAAVGELPEKSKLQEIYQELTRLKAAVGELPEKSKLQEIYQELTWLKAAVGELPEKSKMQEIYQELTRLKAAVGELPEKSKQQEIYQELTRLKAAVGELPEKSKQQEIYQELTRLKAAVGELPEKSKQQEIYQELTQLKAAVERLCHPCPWEWTFFQGNCYFMSNSQRNWHDSITACKEVGAQLVVIKSAEEQNFLQLQSSRSNRFTWMGLSDLNQEGTWQWVDGSPLLPSFKQYWNRGEPNNVGEEDCAEFSGNGWNDDKCNLAKFWICKKSAASCSRDEEQFLSPAPATPNPPPA. The pIC50 is 2.9. (2) The small molecule is COc1ccc2c(c1)c(CC(=O)O)c(C)n2C(=O)c1ccc(Cl)cc1. The target protein sequence is ANPCCSNPCQNRGECMSTGFDQYKCDCTRTGFYGENCTTPEFLTRIKLLLKPTPNTVHYILTHFKGVWNIVNNIPFLRSLIMKYVLTSRSYLIDSPPTYNVHYGYKSWEAFSNLSYYTRALPPVADDCPTPMGVKGNKELPDSKEVLEKVLLRREFIPDPQGSNMMFAFFAQHFTHQFFKTDHKRGPGFTRGLGHGVDLNHIYGETLDRQHKLRLFKDGKLKYQVIGGEVYPPTVKDTQVEMIYPPHIPENLQFAVGQEVFGLVPGLMMYATIWLREHNRVCDILKQEHPEWGDEQLFQTSRLILIGETIKIVIEDYVQHLSGYHFKLKFDPELLFNQQFQYQNRIASEFNTLYHWHPLLPDTFNIEDQEYSFKQFLYNNSILLEHGLTQFVESFTRQIAGRVAGGRNVPIAVQAVAKASIDQSREMKYQSLNEYRKRFSLKPYTSFEELTGEKEMAAELKALYSDIDVMELYPALLVEKPRPDAIFGETMVELGAPFAL.... The pIC50 is 6.7. (3) The small molecule is O=C1Nc2cc(Nc3cccc(NC(=O)c4cccc(C(F)(F)F)c4)c3)ccc2/C1=C/c1ccc[nH]1. 
The target protein (Q61851) has sequence MVVPACVLVFCVAVVAGATSEPPGPEQRVVRRAAEVPGPEPSQQEQVAFGSGDTVELSCHPPGGAPTGPTVWAKDGTGLVASHRILVGPQRLQVLNASHEDAGVYSCQHRLTRRVLCHFSVRVTDAPSSGDDEDGEDVAEDTGAPYWTRPERMDKKLLAVPAANTVRFRCPAAGNPTPSISWLKNGKEFRGEHRIGGIKLRHQQWSLVMESVVPSDRGNYTCVVENKFGSIRQTYTLDVLERSPHRPILQAGLPANQTAILGSDVEFHCKVYSDAQPHIQWLKHVEVNGSKVGPDGTPYVTVLKTAGANTTDKELEVLSLHNVTFEDAGEYTCLAGNSIGFSHHSAWLVVLPAEEELMETDEAGSVYAGVLSYGVVFFLFILVVAAVILCRLRSPPKKGLGSPTVHKVSRFPLKRQVSLESNSSMNSNTPLVRIARLSSGEGPVLANVSELELPADPKWELSRTRLTLGKPLGEGCFGQVVMAEAIGIDKDRTAKPVTVA.... The pIC50 is 5.2. (4) The compound is CC(C)NCCCn1c(Sc2ccc(Cl)cc2Cl)nc2c(N)nccc21. The target protein (Q12931) has sequence MARELRALLLWGRRLRPLLRAPALAAVPGGKPILCPRRTTAQLGPRRNPAWSLQAGRLFSTQTAEDKEEPLHSIISSTESVQGSTSKHEFQAETKKLLDIVARSLYSEKEVFIRELISNASDALEKLRHKLVSDGQALPEMEIHLQTNAEKGTITIQDTGIGMTQEELVSNLGTIARSGSKAFLDALQNQAEASSKIIGQFGVGFYSAFMVADRVEVYSRSAAPGSLGYQWLSDGSGVFEIAEASGVRTGTKIIIHLKSDCKEFSSEARVRDVVTKYSNFVSFPLYLNGRRMNTLQAIWMMDPKDVREWQHEEFYRYVAQAHDKPRYTLHYKTDAPLNIRSIFYVPDMKPSMFDVSRELGSSVALYSRKVLIQTKATDILPKWLRFIRGVVDSEDIPLNLSRELLQESALIRKLRDVLQQRLIKFFIDQSKKDAEKYAKFFEDYGLFMREGIVTATEQEVKEDIAKLLRYESSALPSGQLTSLSEYASRMRAGTRNIYYL.... The pIC50 is 4.8. (5) The pIC50 is 5.8. The compound is COc1ccccc1-n1c(C)nnc1-c1ccc(-c2ccccc2)cc1. The target protein (P58295) has sequence MDCSAPKEMNKPPTNILEATVPGHRDSPRAPRTSPEQDLPAAAPAAAVQPPRVPRSASTGAQTFQSADARACEAQRPGVGFCKLSSPQAQATSAALRDLSEGHSAQANPPSGAAGAGNALHCKIPALRGPEEDENVSVGKGTLEHNNTPAVGWVNMSQSTVVLGTDGIASVLPGSVATTTIPEDEQGDENKARGNWSSKLDFILSMVGYAVGLGNVWRFPYLAFQNGGGAFLIPYLMMLALAGLPIFFLEVSLGQFASQGPVSVWKAIPALQGCGIAMLIISVLIAIYYNVIICYTLFYLFASFVSVLPWGSCNNPWNTPECKDKTKLLLDSCVIGDHPKIQIKNSTFCMTAYPNLTMVNFTSQANKTFVSGSEEYFKYFVLKISAGIEYPGEIRWPLAFCLFLAWVIVYASLAKGIKTSGKVVYFTATFPYVVLVILLIRGVTLPGAGAGIWYFITPKWEKLTDATVWKDAATQIFFSLSAAWGGLITLSSYNKFHNNC.... (6) The drug is COc1ccc([C@H](Cc2c(Cl)c[n+]([O-])cc2Cl)OC(=O)c2ccc(CNC(C(=O)OC[C@@H]3CCCCN3C)c3ccccc3)s2)cc1OC. 
The target protein sequence is MKEHGGTFSSTGISGGSGDSAMDSLQPLQPNYMPVCLFAEESYQKLAMETLEELDWCLDQLETIQTYRSVSEMASNKFKRMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPSPTQKDREKKKKQQLMTQISGVKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWGLNIFNVAGYSHNRPLTCIMYAIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNSLHAADVAQSTHVLLSTPALDAVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLLQEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLLADLKTMVETKKVTSSGVLLLDNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEEFFQQGDKERERGMEISPMCDKHTASVEKSQVGFIDYIVHPLWETWADLVQPDAQDILDTLEDNRNWYQSMIPQSPSPPLDEQNRDCQ.... The pIC50 is 9.0. (7) The target protein (P40313) has sequence MLLLSLTLSLVLLGSSWGCGIPAIKPALSFSQRIVNGENAVLGSWPWQVSLQDSSGFHFCGGSLISQSWVVTAAHCNVSPGRHFVVLGEYDRSSNAEPLQVLSVSRAITHPSWNSTTMNNDVTLLKLASPAQYTTRISPVCLASSNEALTEGLTCVTTGWGRLSGVGNVTPAHLQQVALPLVTVNQCRQYWGSSITDSMICAGGAGASSCQGDSGGPLVCQKGNTWVLIGIVSWGTKNCNVRAPAVYTRVSKFSTWINQVIAYN. The pIC50 is 7.9. The compound is CCNC(=O)C(=O)C(CC(C)C)NC(=O)C(CCCN=C(N)N[N+](=O)[O-])NC(=O)C(CCCCCCCCC#N)C1CCCC1. (8) The small molecule is Cc1cc(NC2Cc3cc(Cl)c(Cl)cc3C2)n2ncnc2n1. The target protein (Q08210) has sequence MISKLKPQFMFLPKKHILSYCRKDVLNLFEQKFYYTSKRKESNNMKNESLLRLINYNRYYNKIDSNNYYNGGKILSNDRQYIYSPLCEYKKKINDISSYVSVPFKINIRNLGTSNFVNNKKDVLDNDYIYENIKKEKSKHKKIIFLLFVSLFGLYGFFESYNPEFFLYDIFLKFCLKYIDGEICHDLFLLLGKYNILPYDTSNDSIYACTNIKHLDFINPFGVAAGFDKNGVCIDSILKLGFSFIEIGTITPRGQTGNAKPRIFRDVESRSIINSCGFNNMGCDKVTENLILFRKRQEEDKLLSKHIVGVSIGKNKDTVNIVDDLKYCINKIGRYADYIAINVSSPNTPGLRDNQEAGKLKNIILSVKEEIDNLEKNNIMNDESTYNEDNKIVEKKNNFNKNNSHMMKDAKDNFLWFNTTKKKPLVFVKLAPDLNQEQKKEIADVLLETNIDGMIISNTTTQINDIKSFENKKGGVSGAKLKDISTKFICEMYNYTNKQI.... The pIC50 is 5.9. (9) The small molecule is Cc1noc(C)c1-c1ccc2c(c1)C(c1ccccc1)(N1CCC[C@H](O)C1)C(=O)N2. 
The target protein sequence is MLQNVTPHNKLPGEGNAGLLGLGPEAAAPGKRIRKPSLLYEGFESPTMASVPALQLTPANPPPPEVSNPKKPGRVTNQLQYLHKVVMKALWKHQFAWPFRQPVDAVKLGLPDYHKIIKQPMDMGTIKRRLENNYYWAASECMQDFNTMFTNCYIYNKPTDDIVLMAQTLEKIFLQKVASMPQEEQELVVTIPKNSHKKGAKLAALQGSVTSAHQVPAVSSVSHTALYTPPPEIPTTVLNIPHPSVISSPLLKSLHSAGPPLLAVTAAPPAQPLAKKKGVKRKADTTTPTPTAILAPGSPASPPGSLEPKAARLPPMRRESGRPIKPPRKDLPDSQQQHQSSKKGKLSEQLKHCNGILKELLSKKHAAYAWPFYKPVDASALGLHDYHDIIKHPMDLSTVKRKMENRDYRDAQEFAADVRLMFSNCYKYNPPDHDVVAMARKLQDVFEFRYAKMPDEPLEPGPLPVSTAMPPGL. The pIC50 is 8.0. (10) The small molecule is C=CC(=O)Nc1ccc(S(=O)(=O)N2CCN(C(=O)C34CC5CC(CC(C5)C3)C4)CC2)cn1. The target protein (P00488) has sequence MSETSRTAFGGRRAVPPNNSNAAEDDLPTVELQGVVPRGVNLQEFLNVTSVHLFKERWDTNKVDHHTDKYENNKLIVRRGQSFYVQIDFSRPYDPRRDLFRVEYVIGRYPQENKGTYIPVPIVSELQSGKWGAKIVMREDRSVRLSIQSSPKCIVGKFRMYVAVWTPYGVLRTSRNPETDTYILFNPWCEDDAVYLDNEKEREEYVLNDIGVIFYGEVNDIKTRSWSYGQFEDGILDTCLYVMDRAQMDLSGRGNPIKVSRVGSAMVNAKDDEGVLVGSWDNIYAYGVPPSAWTGSVDILLEYRSSENPVRYGQCWVFAGVFNTFLRCLGIPARIVTNYFSAHDNDANLQMDIFLEEDGNVNSKLTKDSVWNYHCWNEAWMTRPDLPVGFGGWQAVDSTPQENSDGMYRCGPASVQAIKHGHVCFQFDAPFVFAEVNSDLIYITAKKDGTHVVENVDATHIGKLIVTKQIGGDGMMDITDTYKFQEGQEEERLALETALM.... The pIC50 is 5.7.
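
Note on the target scale: the pIC50 values in the examples follow the definition in the task description, pIC50 = -log10(IC50 in M). The short Python sketch below shows how that definition could be applied in both directions. It is only an illustration: the function names are hypothetical and are not part of the bindingdb_ic50 dataset or any BindingDB tooling.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Same inversion, reported in nM (1 M = 1e9 nM)."""
    return pic50_to_ic50_molar(pic50) * 1e9


if __name__ == "__main__":
    # pIC50 values taken from examples (1), (6) and (10) above.
    for value in (2.9, 9.0, 5.7):
        print(f"pIC50 {value} -> IC50 of about {pic50_to_ic50_nanomolar(value):.3g} nM")
```

Under this definition a pIC50 of 9.0, as in example (6), corresponds to an IC50 of 1 nM, and each additional pIC50 unit means a tenfold lower IC50, which is why higher values indicate a more potent compound.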