This is drug-target binding data from BindingDB, based on IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COc1ccc(S(=O)(=O)N2CN(Cc3ccccc3)C(=O)C[C@@H]2C(=O)NO)cc1. The target protein (P23097) has sequence ATFFLLSWTHCWSLPLPYGDDDDDDLSEEDLEFAEHYLKSYYHPVTLAGILKKSTVTSTVDRLREMQSFFGLDVTGKLDDPTLDIMRKPRCGVPDVGVYNVFPRTLKWSQTNLTYRIVNYTPDISHSEVEKAFRKAFKVWSDVTPLNFTRIHDGTADIMISFGTKEHGDFYPFDGPSGLLAHAFPPGPNLGGDAHFDDDETWTSSSKGYNLFIVAAHELGHSLGLDHSKDPGALMFPIYTYTGKSHFMLPDDDVQGIQSLYGPGDEDPNPKHPKTPEKCDPALSLDAITSLRGETMIFKDRFFWRLHPQQVEPELFLTKSFWPELPNHVDAAYEHPSRDLMFIFRGRKFWALNGYDIMEGYPRKISDLGFPKEVKRLSAAVHFEDTGKTLFFSGNHVWSYDDANQTMDKDYPRLIEEEFPGIGDKVDAVYEKNGYIYFFNGPIQFEYSIWSNRIVRVMPTNSLLWC. The pIC50 is 8.6. (2) The compound is COc1cc(N)ccc1-c1nc2c(C)nn(C3CCCCC3)c(=O)c2[nH]1. The target protein (P70453) has sequence MGITLIWCLALVLIKWITSKRRGAISYDSSDQTALYIRMLGDVRVRSRAGFETERRGSHPYIDFRIFHSQSDIEASVSARNIRRLLSFQRYLRSSRVFRGATVCSSLDILDEDYNGQAKCMLEKVGNWNFDIFLFDRLTNGNSLVSLTFHLFSLHGLIEYFHLDMVKLRRFLVMIQEDYHSQNPYHNAVHAADVTQAMHCYLKEPKLASSVTPWDILLSLIAAATHDLDHPGVNQPFLIKTNHYLATLYKNSSVLENHHWRSAVGLLRESGLFSHLPLESRQEMEAQIGALILATDISRQNEYLSLFRSHLDKGDLHLDDGRHRHLVLQMALKCADICNPCRNWELSKQWSEKVTEEFFHQGDIEKKYHLGVSPLCDRQTESIANIQIGFMTYLVEPLFTEWARFSDTRLSQTMLGHVGLNKASWKGLQRQQPSSEDANAAFELNSQLLTQENRLS. The pIC50 is 7.2.
(3) The target protein (Q12772) has sequence MDDSGELGGLETMETLTELGDELTLGDIDEMLQFVSNQVGEFPDLFSEQLCSSFPGSGGSGSSSGSSGSSSSSSNGRGSSSGAVDPSVQRSFTQVTLPSFSPSAASPQAPTLQVKVSPTSVPTTPRATPILQPRPQPQPQPQTQLQQQTVMITPTFSTTPQTRIIQQPLIYQNAATSFQVLQPQVQSLVTSSQVQPVTIQQQVQTVQAQRVLTQTANGTLQTLAPATVQTVAAPQVQQVPVLVQPQIIKTDSLVLTTLKTDGSPVMAAVQNPALTALTTPIQTAALQVPTLVGSSGTILTTMPVMMGQEKVPIKQVPGGVKQLEPPKEGERRTTHNIIEKRYRSSINDKIIELKDLVMGTDAKMHKSGVLRKAIDYIKYLQQVNHKLRQENMVLKLANQKNKLLKGIDLGSLVDNEVDLKIEDFNQNVLLMSPPASDSGSQAGFSPYSIDSEPGSPLLDDAKVKDEPDSPPVALGMVDRSRILLCVLTFLCLSFNPLTSL.... The pIC50 is 5.3. The compound is CCCc1cc(-c2nc(-c3ccc(NCC4CC4)cc3)cs2)ccn1. (4) The compound is O=C(NCc1ccccc1)c1nc([N+](=O)[O-])c(Sc2c(Cl)cncc2Cl)s1. The target protein (Q93009) has sequence MNHQQQQQQQKAGEQQLSEPEDMEMEAGDTDDPPRITQNPVINGNVALSDGHNTAEEDMEDDTSWRSEATFQFTVERFSRLSESVLSPPCFVRNLPWKIMVMPRFYPDRPHQKSVGFFLQCNAESDSTSWSCHAQAVLKIINYRDDEKSFSRRISHLFFHKENDWGFSNFMAWSEVTDPEKGFIDDDKVTFEVFVQADAPHGVAWDSKKHTGYVGLKNQGATCYMNSLLQTLFFTNQLRKAVYMMPTEGDDSSKSVPLALQRVFYELQHSDKPVGTKKLTKSFGWETLDSFMQHDVQELCRVLLDNVENKMKGTCVEGTIPKLFRGKMVSYIQCKEVDYRSDRREDYYDIQLSIKGKKNIFESFVDYVAVEQLDGDNKYDAGEHGLQEAEKGVKFLTLPPVLHLQLMRFMYDPQTDQNIKINDRFEFPEQLPLDEFLQKTDPKDPANYILHAVLVHSGDNHGGHYVVYLNPKGDGKWCKFDDDVVSRCTKEEAIEHNYGG.... The pIC50 is 5.4.
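The pIC50 definition used for the labels above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_molar_to_pic50`, `ic50_nm_to_pic50`, and `pic50_to_ic50_molar` are hypothetical, and the nanomolar variant assumes that raw IC50 values are given in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Entry (1) reports a pIC50 of 8.6; print the corresponding IC50 in nM.
    print(pic50_to_ic50_molar(8.6) * 1e9)
```

Under this definition, a pIC50 of 8.6 as in entry (1) corresponds to an IC50 of roughly 2.5 nM, while a pIC50 of 5.3 as in entry (3) corresponds to roughly 5 µM, which is why higher values mean more potent compounds.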