Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=P(O)(O)C(O)(NC1CCCCCC1)P(=O)(O)O. The target protein sequence is MAHMERFQKVYEEVQEFLLGDAEKRFEMDVHRKGYLKSMMDTTCLGGKYNRGLCVVDVAEAMAKDTKMDAAAMERVLHDACVCGWMIEMLQAHFLVEDDIMDHSKTRRGKPCWYLHPGVTTQVAINDGLILLAWATQMALHYFADRPFLAEVLRVFHDVDLTTTIGQLYDVTSMVDSAKLDANVAHANTTDYIEYTPFNHRRIVVYKTAYYTYWLPLVMGLLVSGTVEKVDKEATHKVAMVMGEYFQVQDDVMDCFTPPEKLGKIGTDIEDAKCSWLAVTFLTTAPAEKVAEFKANYGSTDPAKVAVIKQLYTEQNLLARFEEYEKAVVAEIEQLIAALEAQNTAFAASVKVLWSKTYKRQK. The pIC50 is 6.6. (2) The compound is O=c1cc(-c2ccc(-c3ccc4oc(-c5ccccc5)cc(=O)c4c3)cc2)oc2ccccc12. The target protein (P04055) has sequence MKLLLLAALLTAGVTAHSISTRAVWQFRNMIKCTIPGSDPLREYNNYGCYCGLGGSGTPVDDLDRCCQTHDHCYNQAKKLESCKFLIDNPYTNTYSYKCSGNVITCSDKNNDCESFICNCDRQAAICFSKVPYNKEYKDLDTKKHC. The pIC50 is 4.0. (3) The drug is Cn1cc(-c2ccc3[nH]c(=O)[nH]c3c2Oc2ccc(F)cc2F)c2cn[nH]c2c1=O. The target protein (P25440) has sequence MLQNVTPHNKLPGEGNAGLLGLGPEAAAPGKRIRKPSLLYEGFESPTMASVPALQLTPANPPPPEVSNPKKPGRVTNQLQYLHKVVMKALWKHQFAWPFRQPVDAVKLGLPDYHKIIKQPMDMGTIKRRLENNYYWAASECMQDFNTMFTNCYIYNKPTDDIVLMAQTLEKIFLQKVASMPQEEQELVVTIPKNSHKKGAKLAALQGSVTSAHQVPAVSSVSHTALYTPPPEIPTTVLNIPHPSVISSPLLKSLHSAGPPLLAVTAAPPAQPLAKKKGVKRKADTTTPTPTAILAPGSPASPPGSLEPKAARLPPMRRESGRPIKPPRKDLPDSQQQHQSSKKGKLSEQLKHCNGILKELLSKKHAAYAWPFYKPVDASALGLHDYHDIIKHPMDLSTVKRKMENRDYRDAQEFAADVRLMFSNCYKYNPPDHDVVAMARKLQDVFEFRYAKMPDEPLEPGPLPVSTAMPPGLAKSSSESSSEESSSESSSEEEEEEDEE.... The pIC50 is 7.0. (4) The small molecule is CCCC[C@@H](O)/C=C(C)/C=C/C=C/C(=O)N1CCCC1=O. 
The target protein (P04415) has sequence MKEFYISIETVGNNIVERYIDENGKERTREVEYLPTMFRHCKEESKYKDIYGKNCAPQKFPSMKDARDWMKRMEDIGLEALGMNDFKLAYISDTYGSEIVYDRKFVRVANCDIEVTGDKFPDPMKAEYEIDAITHYDSIDDRFYVFDLLNSMYGSVSKWDAKLAAKLDCEGGDEVPQEILDRVIYMPFDNERDMLMEYINLWEQKRPAIFTGWNIEGFDVPYIMNRVKMILGERSMKRFSPIGRVKSKLIQNMYGSKEIYSIDGVSILDYLDLYKKFAFTNLPSFSLESVAQHETKKGKLPYDGPINKLRETNHQRYISYNIIDVESVQAIDKIRGFIDLVLSMSYYAKMPFSGVMSPIKTWDAIIFNSLKGEHKVIPQQGSHVKQSFPGAFVFEPKPIARRYIMSFDLTSLYPSIIRQVNISPETIRGQFKVHPIHEYIAGTAPKPSDEYSCSPNGWMYDKHQEGIIPKEIAKVFFQRKDWKKKMFAEEMNAEAIKKII.... The pIC50 is 3.7. (5) The drug is O=C(Nc1ccc(NS(=O)(=O)c2cccc(C(F)(F)F)c2)cc1)Nc1cccc([N+](=O)[O-])c1. The target protein (Q81RP3) has sequence MTLQEQIMKALHVQPVIDPKAEIRKRVDFLKDYVKKTGAKGFVLGISGGQDSTLAGRLAQLAVEEIRNEGGNATFIAVRLPYKVQKDEDDAQLALQFIQADQSVAFDIASTVDAFSNQYENLLDESLTDFNKGNVKARIRMVTQYAIGGQKGLLVIGTDHAAEAVTGFFTKFGDGGADLLPLTGLTKRQGRALLQELGADERLYLKMPTADLLDEKPGQADETELGITYDQLDDYLEGKTVPADVAEKIEKRYTVSEHKRQVPASMFDDWWK. The pIC50 is 3.5. (6) The drug is CC(C)(Cc1ccc2ccccc2c1)NC[C@@H](O)COc1cccc(Cl)c1C#N. The target protein (P48442) has sequence MASYSCCLALLALAWHSSAYGPDQRAQKKGDIILGGLFPIHFGVAAKDQDLKSRPESVECIRYNFRGFRWLQAMIFAIEEINSSPSLLPNMTLGYRIFDTCNTVSKALEATLSFVAQNKIDSLNLDEFCNCSEHIPSTIAVVGATGSGVSTAVANLLGLFYIPQVSYASSSRLLSNKNQYKSFLRTIPNDEHQATAMADIIEYFRWNWVGTIAADDDYGRPGIEKFREEAEERDICIDFSELISQYSDEEEIQQVVEVIQNSTAKVIVVFSSGPDLEPLIKEIVRRNITGRIWLASEAWASSSLIAMPEYFHVVGGTIGFGLKAGQIPGFREFLQKVHPRKSVHNGFAKEFWEETFNCHLQEGAKGPLPVDTFVRSHEEGGNRLLNSSTAFRPLCTGDENINSVETPYMDYEHLRISYNVYLAVYSIAHALQDIYTCLPGRGLFTNGSCADIKKVEAWQVLKHLRHLNFTNNMGEQVTFDECGDLVGNYSIINWHLSPED.... The pIC50 is 6.3. (7) The compound is O=C1C[C@H](O)C[C@@H](/C=C/c2c(Cl)cc(Cl)cc2OCCC23CC4CC(CC(C4)C2)C3)O1. 
The target protein (P51639) has sequence MLSRLFRMHGLFVASHPWEVIVGTVTLTICMMSMNMFTGNNKICGWNYECPKFEEDVLSSDIIILTITRCIAILYIYFQFQNLRQLGSKYILGIAGLFTIFSSFVFSTVVIHFLDKELTGLNEALPFFLLLIDLSRASALAKFALSSNSQDEVRENIARGMAILGPTFTLDALVECLVIGVGTMSGVRQLEIMCCFGCMSVLANYFVFMTFFPACVSLVLELSRESREGRPIWQLSHFARVLEEEENKPNPVTQRVKMIMSLGLVLVHAHSRWIADPSPQNSTAEQSKVSLGLAEDVSKRIEPSVSLWQFYLSKMISMDIEQVITLSLALLLAVKYIFFEQAETESTLSLKNPITSPVVTPKKAQDNCCRREPLLVRRNQKLSSVEEDPGVNQDRKVEVIKPLVAEAETSGRATFVLGASAASPPLALGAQEPGIELPSEPRPNEECLQILESAEKGAKFLSDAEIIQLVNAKHIPAYKLETLMETHERGVSIRRQLLSA.... The pIC50 is 5.5.
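
The pIC50 definition given in the task description, pIC50 = -log10(IC50 in M), can be sketched as a small conversion helper. This is a minimal sketch: the function names and the nanomolar wrapper are illustrative assumptions and are not part of the dataset, and the sketch assumes the raw IC50 is a positive concentration.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations.
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Hypothetical convenience wrapper: take IC50 in nM, convert to M first."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)
```

Under this definition, a pIC50 of 7.0 (as in example 3) corresponds to an IC50 of 1e-7 M, that is 100 nM, and each one-unit increase in pIC50 is a tenfold decrease in IC50.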