Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is Cc1cc(N2C(=O)c3cc(-c4ccn(C)c(=O)c4)n(C)c3C2c2ccc(Cl)cc2)cn(C)c1=O. The target protein sequence is NPPPPEVSNPKKPGRVTNQLQYLHKVVMKALWKHQFAWPFRQPVDAVKLGLPDYHKIIKQPMDMGTIKRRLENNYYWAASECMQDFNTMFTNCYIYNKPTDDIVLMAQTLEKIFLQKVASMPQEEQELVVTIPKNSHKKGAKLAALQGSVTSAHQVPAVSSVSHTALYTPPPEIPTTVLNIPHPSVISSPLLKSLHSAGPPLLAVTAAPPAQPLAKKKGVKRKADTTTPTPTAILAPGSPASPPGSLEPKAARLPPMRRESGRPIKPPRKDLPDSQQQHQSSKKGKLSEQLKHCNGILKELLSKKHAAYAWPFYKPVDASALGLHDYHDIIKHPMDLSTVKRKMENRDYRDAQEFAADVRLMFSNCYKYNPPDHDVVAMARKLQDVFEFRYAKMPDEPLEPGPL. The pIC50 is 7.7. (2) The pIC50 is 6.1. The drug is CS(=O)(=O)N1CCc2c(c(-c3ccc(Br)cc3)nn2CC(O)CN2CCC(N3C(=O)CCc4cc(Cl)ccc43)CC2)C1. The target protein (P04233) has sequence MHRRRSRSCREDQKPVMDDQRDLISNNEQLPMLGRRPGAPESKCSRGALYTGFSILVTLLLAGQATTAYFLYQQQGRLDKLTVTSQNLQLENLRMKLPKPPKPVSKMRMATPLLMQALPMGALPQGPMQNATKYGNMTEDHVMHLLQNADPLKVYPPLKGSFPENLRHLKNTMETIDWKVFESWMHHWLLFEMSRHSLEQKPTDAPPKVLTKCQEEVSHIPAVHPGSFRPKCDENGNYLPLQCYGSIGYCWCVFPNGTEVPNTRSRGHHNCSESLELEDPSSGLGVTKQDLGPVPM. (3) The target protein (O35913) has sequence MGKSEKRVATHGVRCFAKIKMFLLALTCAYVSKSLSGTYMNSMLTQIERQFGIPTSIVGLINGSFEIGNLLLIIFVSYFGTKLHRPIMIGVGCAVMGLGCFLISLPHFLMGQYEYETILPTSNVSSNSFFCVENRSQTLNPTQDPSECVKEMKSLMWIYVLVGNIIRGIGETPIMPLGISYIEDFAKSENSPLYIGILETGMTIGPLIGLLLASSCANIYVDIESVNTDDLTITPTDTRWVGAWWIGFLVCAGVNILTSFPFFFFPKTLPKEGLQENVDGTENAKEKKHRKKAKEEKRGITKDFFVFMKSLSCNPIYMLFILISVLQFNAFINSFTFMPKYLEQQYGKSTAEVVFLMGLYMLPPICLGYLIGGLIMKKFKVTVKKAAHLAFWLCLSEYLLSFLSYVMTCDNFPVAGLTTSYEGVQHQLYVENKVLADCNTRCNCSTNTWDPVCGDNGLAYMSACLAGCEKSVGTGTNMVFQNCSCIQSSGNSSAVLGLCN.... The compound is C[C@H](CCC(=O)NCCS(=O)(=O)O)[C@H]1CC[C@H]2[C@H]3[C@H](C[C@H](O)[C@@]21C)[C@@]1(C)CC[C@@H](O)C[C@H]1C[C@H]3O. 
The pIC50 is 2.8. (4) The drug is NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@@H]1CCCN1/C=C1\N=C(c2ccc(Cl)cc2Cl)OC1=O. The target protein (P56192) has sequence MRLFVSDGVPGCLPVLAAAGRARGRAEVLISTVGPEDCVVPFLTRPKVPVLQLDSGNYLFSTSAICRYFFLLSGWEQDDLTNQWLEWEATELQPALSAALYYLVVQGKKGEDVLGSVRRALTHIDHSLSRQNCPFLAGETESLADIVLWGALYPLLQDPAYLPEELSALHSWFQTLSTQEPCQRAAETVLKQQGVLALRPYLQKQPQPSPAEGRAVTNEPEEEELATLSEEEIAMAVTAWEKGLESLPPLRPQQNPVLPVAGERNVLITSALPYVNNVPHLGNIIGCVLSADVFARYSRLRQWNTLYLCGTDEYGTATETKALEEGLTPQEICDKYHIIHADIYRWFNISFDIFGRTTTPQQTKITQDIFQQLLKRGFVLQDTVEQLRCEHCARFLADRFVEGVCPFCGYEEARGDQCDKCGKLINAVELKKPQCKVCRSCPVVQSSQHLFLDLPKLEKRLEEWLGRTLPGSDWTPNAQFITRSWLRDGLKPRCITRDLK.... The pIC50 is 6.2. (5) The compound is O=C(CC(S)C(F)(F)C(F)(F)F)NC(Cc1ccccc1)C(=O)O. The target protein sequence is PKPKKKQRWTPLEISLEVLVLVLVI. The pIC50 is 5.9.
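The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the code assumes IC50 values are supplied in molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for non-positive concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Labels taken from examples (1) and (3) above
    for label in (7.7, 2.8):
        print(f"pIC50 {label} -> IC50 of {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, the label 7.7 in example (1) corresponds to an IC50 of about 20 nM, while the label 2.8 in example (3) corresponds to roughly 1.6 mM, which illustrates why a higher pIC50 means a more potent compound.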