This dataset contains drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C1/C(=C/c2ccc(O)c(O)c2O)Oc2cc(O)ccc21. The target protein (H9TB17) has sequence MEPSTLYFYVNGRRVTEKNVDPETMLLPYLGRNLRLTGTKYGCGGGGCGACTVMVSRYDRGTGQIRHYPACACLTPLCSLHGAAVTTVEGVGSTRTRLHPVQERIAKSHGTQCGFCTPGMVMSLYALLRSHPQPSEEQLLEALAGNLCRCTGYRPILDAGKTFCKTSGCCQSKENGVCCLDQGVNGVQEAEGEQTSQELCSEEEFVPLDPTQELIFPPELMILAQKQPQKSRVFTGDRVTWISPVTLKDLLEAKAKNPRAPVVMGNTSVGPEMKFKGVFHPVIISPDGIEELSVIKQGNEGLTLGAGLSLAQVQDVLADVVQQLPEEKTQTLCALLKQLRTLAGSQIRNMASLGGHIMSRHLDSDLNPVLAAASCTLHVPSQEGDRQIPLDEHFLSRSPSADLRPQEVLLSVTIPYSRKWEFVSAFRQAQRKRSARAIVNVGMRVFFGAGDGVISELCILYGGVGPAIVCATDACRKLVGRHWTEEMLDEACRLVLGEVA.... The pIC50 is 4.2. (2) The small molecule is NC(=O)c1cn(-c2ccc(O)cc2Cl)c2cc(-c3ccncc3Cl)ccc2c1=O. The target protein (P07384) has sequence MSEEIITPVYCTGVSAQVQKQRARELGLGRHENAIKYLGQDYEQLRVRCLQSGTLFRDEAFPPVPQSLGYKDLGPNSSKTYGIKWKRPTELLSNPQFIVDGATRTDICQGALGDCWLLAAIASLTLNDTLLHRVVPHGQSFQNGYAGIFHFQLWQFGEWVDVVVDDLLPIKDGKLVFVHSAEGNEFWSALLEKAYAKVNGSYEALSGGSTSEGFEDFTGGVTEWYELRKAPSDLYQIILKALERGSLLGCSIDISSVLDMEAITFKKLVKGHAYSVTGAKQVNYRGQVVSLIRMRNPWGEVEWTGAWSDSSSEWNNVDPYERDQLRVKMEDGEFWMSFRDFMREFTRLEICNLTPDALKSRTIRKWNTTLYEGTWRRGSTAGGCRNYPATFWVNPQFKIRLDETDDPDDYGDRESGCSFVLALMQKHRRRERRFGRDMETIGFAVYEVPPELVGQPAVHLKRDFFLANASRARSEQFINLREVSTRFRLPPGEYVVVPST.... The pIC50 is 5.7. 
(3) The target protein (Q54873) has sequence MQTKTKKLIVSLSSLVLSGFLLNHYMTIGAEETTTNTIQQSQKEVQYQQRDTKNLVENGDFGQTEDGSSPWTGSKAQGWSAWVDQKNSADASTRVIEAKDGAITISSHEKLRAALHRMVPIEAKKKYKLRFKIKTDNKIGIAKVRIIEESGKDKRLWNSATTSGTKDWQTIEADYSPTLDVDKIKLELFYETGTGTVSFKDIELVEVADQLSEDSQTDKQLEEKIDLPIGKKHVFSLADYTYKVENPDVASVKNGILEPLKEGTTNVIVSKDGKEVKKIPLKILASVKDAYTDRLDDWNGIIAGNQYYDSKNEQMAKLNQELEGKVADSLSSISSQADRTYLWEKFSNYKTSANLTATYRKLEEMAKQVTNPSSRYYQDETVVRTVRDSMEWMHKHVYNSEKSIVGNWWDYEIGTPRAINNTLSLMKEYFSDEEIKKYTDVIEKFVPDPEHFRKTTDNPFKALGGNLVDMGRVKVIAGLLRKDDQEISSTIRSIEQVFKL.... The pIC50 is 2.2. The small molecule is O=C1O[C@H]([C@@H](O)CO)C(O)C1=O. (4) The small molecule is COc1ccc(NC(=O)c2ccc(C(=O)Nc3ccc(OC)cc3OC)cc2)c(OC)c1. The target protein (P03211) has sequence MSDEGPGTGPGNGLGEKGDTSGPEGSGGSGPQRRGGDNHGRGRGRGRGRGGGRPGAPGGSGSGPRHRDGVRRPQKRPSCIGCKGTHGGTGAGAGAGGAGAGGAGAGGGAGAGGGAGGAGGAGGAGAGGGAGAGGGAGGAGGAGAGGGAGAGGGAGGAGAGGGAGGAGGAGAGGGAGAGGGAGGAGAGGGAGGAGGAGAGGGAGAGGAGGAGGAGAGGAGAGGGAGGAGGAGAGGAGAGGAGAGGAGAGGAGGAGAGGAGGAGAGGAGGAGAGGGAGGAGAGGGAGGAGAGGAGGAGAGGAGGAGAGGAGGAGAGGGAGAGGAGAGGGGRGRGGSGGRGRGGSGGRGRGGSGGRRGRGRERARGGSRERARGRGRGRGEKRPRSPSSQSSSSGSPPRRPPPGRRPFFHPVGEADYFEYHQEGGPDGEPDVPPGAIEQGPADDPGEGPSTGPRGQGDGGRRKKGGWFGKHRGQGGSNPKFENIAEGLRALLARSHVERTTDE.... The pIC50 is 4.7. (5) The small molecule is Cc1cn(-c2cc(C(=O)Nc3cccc(Nc4ccc5c(c4)NC(=O)/C5=C\c4ccc[nH]4)c3)cc(C(F)(F)F)c2)cn1. The target protein (Q61851) has sequence MVVPACVLVFCVAVVAGATSEPPGPEQRVVRRAAEVPGPEPSQQEQVAFGSGDTVELSCHPPGGAPTGPTVWAKDGTGLVASHRILVGPQRLQVLNASHEDAGVYSCQHRLTRRVLCHFSVRVTDAPSSGDDEDGEDVAEDTGAPYWTRPERMDKKLLAVPAANTVRFRCPAAGNPTPSISWLKNGKEFRGEHRIGGIKLRHQQWSLVMESVVPSDRGNYTCVVENKFGSIRQTYTLDVLERSPHRPILQAGLPANQTAILGSDVEFHCKVYSDAQPHIQWLKHVEVNGSKVGPDGTPYVTVLKTAGANTTDKELEVLSLHNVTFEDAGEYTCLAGNSIGFSHHSAWLVVLPAEEELMETDEAGSVYAGVLSYGVVFFLFILVVAAVILCRLRSPPKKGLGSPTVHKVSRFPLKRQVSLESNSSMNSNTPLVRIARLSSGEGPVLANVSELELPADPKWELSRTRLTLGKPLGEGCFGQVVMAEAIGIDKDRTAKPVTVA.... The pIC50 is 5.1. 
(6) The compound is Cc1ccc(S(=O)(=O)NNC(=O)c2ccc3oc4ccccc4c3c2)cc1. The target protein (P54687) has sequence MKDCSNGCSAECTGEGGSKEVVGTFKAKDLIVTPATILKEKPDPNNLVFGTVFTDHMLTVEWSSEFGWEKPHIKPLQNLSLHPGSSALHYAVELFEGLKAFRGVDNKIRLFQPNLNMDRMYRSAVRATLPVFDKEELLECIQQLVKLDQEWVPYSTSASLYIRPTFIGTEPSLGVKKPTKALLFVLLSPVGPYFSSGTFNPVSLWANPKYVRAWKGGTGDCKMGGNYGSSLFAQCEAVDNGCQQVLWLYGEDHQITEVGTMNLFLYWINEDGEEELATPPLDGIILPGVTRRCILDLAHQWGEFKVSERYLTMDDLTTALEGNRVREMFGSGTACVVCPVSDILYKGETIHIPTMENGPKLASRILSKLTDIQYGREESDWTIVLS. The pIC50 is 4.3. (7) The small molecule is CCNC(=O)Nc1cc(Nc2cccc(O)c2)c(C(=O)Nc2cccnc2)cn1. The target protein (P20083) has sequence MTQTYNADAIEVLTGLEPVRRRPGMYTDTTRPNHLGQEVIDNSVDEALAGHAKRVDVILHADQSLEVIDDGRGMPVDIHPEEGVPAVELILCRLHAGGKFSNKNYQFSGGLHGVGISVVNALSKRVEVNVRRDGQVYNIAFENGEKVQDLQVVGTCGKRNTGTSVHFWPDETFFDSPRFSVSRLTHVLKAKAVLCPGVEITFKDEINNTEQRWCYQDGLNDYLAEAVNGLPTLPEKPFIGNFAGDTEAVDWALLWLPEGGELLTESYVNLIPTMQGGTHVNGLRQGLLDAMREFCEYRNILPRGVKLSAEDIWDRCAYVLSVKMQDPQFAGQTKERLSSRQCAAFVSGVVKDAFILWLNQNVQAAELLAEMAISSAQRRMRAAKKVVRKKLTSGPALPGKLADCTAQDLNRTELFLVEGDSAGGSAKQARDREYQAIMPLKGKILNTWEVSSDEVLASQEVHDISVAIGIDPDSDDLSQLRYGKICILADADSDGLHIAT.... The pIC50 is 6.1. (8) The compound is N=C(N)SCc1ccccc1C(=O)c1cc([N+](=O)[O-])ccc1CSC(=N)N. The target protein (P49279) has sequence MTGDKGPQRLSGSSYGSISSPTSPTSPGPQQAPPRETYLSEKIPIPDTKPGTFSLRKLWAFTGPGFLMSIAFLDPGNIESDLQAGAVAGFKLLWVLLWATVLGLLCQRLAARLGVVTGKDLGEVCHLYYPKVPRTVLWLTIELAIVGSDMQEVIGTAIAFNLLSAGRIPLWGGVLITIVDTFFFLFLDNYGLRKLEAFFGLLITIMALTFGYEYVVARPEQGALLRGLFLPSCPGCGHPELLQAVGIVGAIIMPHNIYLHSALVKSREIDRARRADIREANMYFLIEATIALSVSFIINLFVMAVFGQAFYQKTNQAAFNICANSSLHDYAKIFPMNNATVAVDIYQGGVILGCLFGPAALYIWAIGLLAAGQSSTMTGTYAGQFVMEGFLRLRWSRFARVLLTRSCAILPTVLVAVFRDLRDLSGLNDLLNVLQSLLLPFAVLPILTFTSMPTLMQEFANGLLNKVVTSSIMVLVCAINLYFVVSYLPSLPHPAYFGLA.... The pIC50 is 5.0.
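The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be made concrete with a short Python sketch. This is only an illustration of the stated formula, not part of the dataset or of any BindingDB tooling; the function names are hypothetical, and the nanomolar helper assumes the raw IC50 was recorded in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Apply the stated definition: pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Same definition for an IC50 given in nM (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (7) above.
    for label in (4.2, 6.1):
        ic50_um = pic50_to_ic50_molar(label) * 1e6
        print(f"pIC50 {label} corresponds to an IC50 of about {ic50_um:.2f} uM")
```

Under this definition, the pIC50 of 4.2 in example (1) corresponds to an IC50 of roughly 63 µM, while the 6.1 in example (7) corresponds to roughly 0.79 µM, which is why a higher pIC50 means a more potent compound.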