This data comes from BindingDB drug-target binding records with IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The label is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(=O)N[C@H](Cc1ccc2ccccc2c1)C(=O)N[C@H](Cc1ccccc1)C(=O)NCc1ccc2c(c1)c(O)c(C(=O)NCC(=O)O)c(=O)n2C. The target protein (Q9H6Z9) has sequence MPLGHIMRLDLEKIALEYIVPCLHEVGFCYLDNFLGEVVGDCVLERVKQLHCTGALRDGQLAGPRAGVSKRHLRGDQITWIGGNEEGCEAISFLLSLIDRLVLYCGSRLGKYYVKERSKAMVACYPGNGTGYVRHVDNPNGDGRCITCIYYLNKNWDAKLHGGILRIFPEGKSFIADVEPIFDRLLFFWSDRRNPHEVQPSYATRYAMTVWYFDAEERAEAKKKFRNLTRKTESALTED. The pIC50 is 8.2. (2) The compound is O=C1c2ccccc2C(=O)c2c1ccc1c(=O)n(-c3ccccc3)sc21. The target protein (Q8TCT1) has sequence MSGCFPVSGLRCLSRDGRMAAQGAPRFLLTFDFDETIVDENSDDSIVRAAPGQRLPESLRATYREGFYNEYMQRVFKYLGEQGVRPRDLSAIYEAIPLSPGMSDLLQFVAKQGACFEVILISDANTFGVESSLRAAGHHSLFRRILSNPSGPDARGLLALRPFHTHSCARCPANMCKHKVLSDYLRERAHDGVHFERLFYVGDGANDFCPMGLLAGGDVAFPRRGYPMHRLIQEAQKAEPSSFRASVVPWETAADVRLHLQQVLKSC. The pIC50 is 5.7. (3) The small molecule is Cn1cnc2cc(C(=O)O)ccc21. The target protein sequence is MSVSIQGQFPGRRLRRLRKHDFSRRLVAENQLSVNDLIYPMFILMGKDRREKVDSMPGVERLSIDLMLEEAQYLANLGVPAIALFPVVNQDAKSLCAAEAYNPEGLVQRAVRALKEHVPQMGVITDVALDPFTTHGQDGIIDEQGYVLNDETTEVLVKQALSHAQAGADVVAPSDMMDGRIGRIRQALEEAGYIHTQIMAYSAKYASNYYGPFRDAVGSSANLKGGNKKNYQMDPANSDEALHEVAMDINEGADMVMVKPGMPYLDVVRRVKTELQVPTFAYQVSGEYAMHKAAIMNGWLKERETVFESLLCFKRAGADGILTYFAKEVAEWLAEDSAKAAQFLPKK. The pIC50 is 3.1. (4) The compound is CC(C)C[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)OCC1c2ccccc2-c2ccccc21)C(C)C)C(N)=O. 
The target protein (P04229) has sequence MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVRLLERCIYNQEESVRFDSDVGEYRAVTELGRPDAEYWNSQKDLLEQRRAAVDTYCRHNYGVGESFTVQRRVEPKVTVYPSKTQPLQHHNLLVCSVSGFYPGSIEVRWFRNGQEEKAGVVSTGLIQNGDWTFQTLVMLETVPRSGEVYTCQVEHPSVTSPLTVEWRARSESAQSKMLSGVGGFVLGLLFLGAGLFIYFRNQKGHSGLQPTGFLS. The pIC50 is 4.3. (5) The small molecule is Cc1ccnc2c1NC(=O)c1cccnc1N2C1CC1. The target protein sequence is MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERFAVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALDNIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVEEKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPVHAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRMYSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTILKALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKCFNCGKEGHTARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLREDLAFLQGKAREFSSEQTRANSPTRRELQVWGRDNNSPSEAGADRQGTVSFNFPQVTLWQRPLVT.... The pIC50 is 4.8. (6) The compound is COc1ccc([C@H](Cc2c(Cl)c[n+]([O-])cc2Cl)OC(=O)c2ccc(CNC(C(=O)OC[C@H]3CCN3C)c3ccccc3)s2)cc1OC. The target protein sequence is MKEHGGTFSSTGISGGSGDSAMDSLQPLQPNYMPVCLFAEESYQKLAMETLEELDWCLDQLETIQTYRSVSEMASNKFKRMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPSPTQKDREKKKKQQLMTQISGVKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWGLNIFNVAGYSHNRPLTCIMYAIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNSLHAADVAQSTHVLLSTPALDAVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLLQEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLLADLKTMVETKKVTSSGVLLLDNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEEFFQQGDKERERGMEISPMCDKHTASVEKSQVGFIDYIVHPLWETWADLVQPDAQDILDTLEDNRNWYQSMIPQSPSPPLDEQNRDCQ.... The pIC50 is 8.2. (7) The drug is N#Cc1c(N)nc(C(C#N)C#N)c(C#N)c1-c1ccccc1. 
The target protein sequence is MSGSTQPVAQTWRATEPRYPPHSLSYPVQIARTHTDVGLLEYQHHSRDYASHLSPGSIIQPQRRRPSLLSEFQPGNERSQELHLRPESHSYLPELGKSEMEFIESKRPRLELLPDPLLRPSPLLATGQPAGSEDLTKDRSLTGKLEPVSPPSPPHTDPELELVPPRLSKEELIQNMDRVDREITMVEQQISKLKKKQQQLEEEAAKPPEPEKPVSPPPIESKHRSLVQIIYDENRKKAEAAHRILEGLGPQVELPLYNQPSDTRQYHENIKINQAMRKKLILYFKRRNHARKQWEQKFCQRYDQLMEAWEKKVERIENNPRRRAKESKVREYYEKQFPEIRKQRELQERMQSRVGQRGSGLSMSAARSEHEVSEIIDGLSEQENLEKQMRQLAVIPPMLYDADQQRIKFINMNGLMADPMKVYKDRQVMNMWSEQEKETFREKFMQHPKNFGLIASFLERKTVAECVLYYYLTKKNENYKSLVRRSYRRRGKSQQQQQQQ.... The pIC50 is 5.5. (8) The drug is O=C(NCCCCC1C(=O)C(C(=O)C2CC2)C(=O)N1Cc1ccc(F)cc1)c1ccc(Cl)nc1. The target protein (P32755) has sequence MTTYSNKGPKPERGRFLHFHSVTFWVGNAKQAASFYCNKMGFEPLAYKGLETGSREVVSHVIKQGKIVFVLCSALNPWNKEMGDHLVKHGDGVKDIAFEVEDCEHIVQKARERGAKIVREPWVEEDKFGKVKFAVLQTYGDTTHTLVEKINYTGRFLPGFEAPTYKDTLLPKLPSCNLEIIDHIVGNQPDQEMESASEWYLKNLQFHRFWSVDDTQVHTEYSSLRSIVVANYEESIKMPINEPAPGRKKSQIQEYVDYNGGAGVQHIALRTEDIITTIRHLRERGMEFLAVPSSYYRLLRENLKTSKIQVKENMDVLEELKILVDYDEKGYLLQIFTKPMQDRPTLFLEVIQRHNHQGFGAGNFNSLFKAFEEEQALRGNLTDLETNGVRSGM. The pIC50 is 5.4. (9) The compound is CC(CCc1cc(-c2ccccc2)n[nH]1)(C(=O)NO)S(C)(=O)=O. The target protein sequence is TVEHLLSAMAGLGIDNAYVELSASEVPIMDGSAGPFVFLIQSAGLQEQEAAKKFIRIKREVSVEEGDKRAVFVPFDGFKVSFEIDFDHPVFRGRTQQASVDFSSTSFVKEVSRARTFGFMRDIEYLRSQNLALGGSVENAIVVDENRVLNEDGLRYEDEFVKHKILDAIGDLYLLGNSLIGEFRGFKSGHALNNQLL. The pIC50 is 7.0.
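
The pIC50 definition used in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration only: the function names `pic50_from_ic50_nm` and `ic50_nm_from_pic50` are hypothetical, and the nanomolar input unit is an assumption made for convenience, not something the dataset specifies.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in M), so the value is first converted to molar.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the transform: pIC50 -> IC50 in nanomolar."""
    return 10.0 ** (-pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # A 1 micromolar (1000 nM) IC50 corresponds to a pIC50 of about 6.
    print(round(pic50_from_ic50_nm(1000.0), 3))
    # Example (1) above has pIC50 8.2, i.e. an IC50 in the low nanomolar range.
    print(round(ic50_nm_from_pic50(8.2), 3))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.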