Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is O=C(Nc1ccc(O)cc1OC[C@@H](O)CN1CCC2(CC1)Cc1cc(Cl)ccc1O2)NC1CC1. The target protein sequence is MEISNITETYPTTTEYDYGDSTPCQKTDVRAFGAGLLPPLYSFVFIIGVVGNILVILVLMQHRRLQSMTSIYLFNLAVSDLVFLFTLPFWIDYKLKDNWVFGDAMCKLLSGFYYLGLYSEIFFIILLTIDRYLAIVHAVFSLRARTVTFGIITSIIIWALAILASIPALCFFKAQWEFTHHTCSPHFPDESLKTWKRFQALKLNLLGLILPLLVMIICYAGIIRILLRRPNEKKAKAVRLIFAITLLFFLLWTPYNLTVFVSAFQDVLFTNQCEQSKQLDLAIQVTEVIAYTHCCVNPIIYVFVGERFRKYLRQLFQRHVAIPLAKWLPFFSVDQLERTSSLTPSTGEHELSGGF. The pIC50 is 7.3. (2) The drug is CS(=O)(=O)c1ccc(C(=O)NCCC(=O)N2CCc3ccccc32)cc1[N+](=O)[O-]. The target protein (P10827) has sequence MEQKPSKVECGSDPEENSARSPDGKRKRKNGQCSLKTSMSGYIPSYLDKDEQCVVCGDKATGYHYRCITCEGCKGFFRRTIQKNLHPTYSCKYDSCCVIDKITRNQCQLCRFKKCIAVGMAMDLVLDDSKRVAKRKLIEQNRERRRKEEMIRSLQQRPEPTPEEWDLIHIATEAHRSTNAQGSHWKQRRKFLPDDIGQSPIVSMPDGDKVDLEAFSEFTKIITPAITRVVDFAKKLPMFSELPCEDQIILLKGCCMEIMSLRAAVRYDPESDTLTLSGEMAVKREQLKNGGLGVVSDAIFELGKSLSAFNLDDTEVALLQAVLLMSTDRSGLLCVDKIEKSQEAYLLAFEHYVNHRKHNIPHFWPKLLMKEREVQSSILYKGAAAEGRPGGSLGVHPEGQQLLGMHVVQGPQVRQLEQQLGEAGSLQGPVLQHQSPKSPQQRLLELLHRSGILHARAVCGEDDSSEADSPSSSEEEPEVCEDLAGNAASP. The pIC50 is 4.2. (3) The drug is N#Cc1cn(-c2c[nH]n(-c3cc(N4CCCCO4)ncn3)c2=O)nn1. 
The target protein (Q9NXG6) has sequence MAAAAVTGQRPETAAAEEASRPQWAPPDHCQAQAAAGLGDGEDAPVRPLCKPRGICSRAYFLVLMVFVHLYLGNVLALLLFVHYSNGDESSDPGPQHRAQGPGPEPTLGPLTRLEGIKVGHERKVQLVTDRDHFIRTLSLKPLLFEIPGFLTDEECRLIIHLAQMKGLQRSQILPTEEYEEAMSTMQVSQLDLFRLLDQNRDGHLQLREVLAQTRLGNGWWMTPESIQEMYAAIKADPDGDGVLSLQEFSNMDLRDFHKYMRSHKAESSELVRNSHHTWLYQGEGAHHIMRAIRQRVLRLTRLSPEIVELSEPLQVVRYGEGGHYHAHVDSGPVYPETICSHTKLVANESVPFETSCRYMTVLFYLNNVTGGGETVFPVADNRTYDEMSLIQDDVDLRDTRRHCDKGNLRVKPQQGTAVFWYNYLPDGQGWVGDVDDYSLHGGCLVTRGTKWIANNWINVDPSRARQALFQQEMARLAREGGTDSQPEWALDRAYRDARV.... The pIC50 is 6.8. (4) The compound is CC[C@@]1(O)C(=O)OCc2c1cc1n(c2=O)Cc2cc3ccccc3nc2-1. The target protein (P00640) has sequence MKELKLKEAKEILKALGLPPQQYNDRSGWVLLALANIKPEDSWKEAKAPLLPTVSIMEFIRTEYGKDYKPNSRETIRRQTLHQFEQARIVDRNRDLPSRATNSKDNNYSLNQVIIDILHNYPNGNWKELIQQFLTHVPSLQELYERALARDRIPIKLLDGTQISLSPGEHNQLHADIVHEFCPRFVGDMGKILYIGDTASSRNEGGKLMVLDSEYLKKLGVPPMSHDKLPDVVVYDEKRKWLFLIEAVTSHGPISPKRWLELEAALSSCTVGKVYVTAFPTRTEFRKNAANIAWETEVWIADNPDHMVHFNGDRFLGPHDKKPELS. The pIC50 is 3.5.
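
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names `ic50_to_pic50` and `pic50_to_ic50` and the nanomolar display unit are assumptions made here for illustration and are not part of the dataset. The four pIC50 values are the ones listed in the examples above.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


# pIC50 labels taken from examples (1) to (4) in the text above.
example_labels = {"(1)": 7.3, "(2)": 4.2, "(3)": 6.8, "(4)": 3.5}

NANOMOLAR = 1e-9  # hypothetical display unit, chosen only for readability

for tag, pic50 in example_labels.items():
    ic50_nm = pic50_to_ic50(pic50) / NANOMOLAR
    print(f"{tag} pIC50 = {pic50:.1f}  ->  IC50 = {ic50_nm:,.1f} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50. Under this conversion, the pIC50 of 7.3 in example (1) corresponds to an IC50 of roughly 50 nM, while the 3.5 in example (4) corresponds to roughly 0.3 mM.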