Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q75Z89) has sequence MDILCEENTSLSSTTNSLMQLHADTRLYSTDFNSGEGNTSNAFNWTVDSENRTNLSCEGCLSPPCFSLLHLQEKNWSALLTAVVIILTIAGNILVIMAVSLEKKLQNATNYFLMSLAIADMLLGFLVMPVSTLTILYGYRWPLPSKLCAVWIYLDVLFSTASIMHLCAISLDRYVAIQNPIHHSRFNSRTKAFLKIIAVWTISVGISMPIPVFGLQDDSKVFKEGSCLLADENFVLIGSFVAFFIPLTIMVITYFLTIKSLQKEATLCVSDPGTRTKLASFSFLPQSSLSSEKLFQRSIHREPGSYGRRTMQSISNEQKACKVLGIVFFLFVVMWCPFFITNIMAVICKESCNRDVIEALLNVFVWIGYLSSAVNPLVYTLFNKTYRSAFSRYIQCQYKENKKPLQLILVNTIPALAYKSSQLQMGPKKNSKKDDKTTDNDCTMVALGKEHPEDAPADSSNTVNEKVSCV. The drug is C=CCN1CCN(c2nc3ccsc3n3cccc23)CC1. The pIC50 is 6.1. (2) The small molecule is CCCCCCCCCC(=O)O[C@H]1[C@H](O)[C@@H](CO)O[C@H]1n1cc(/C=C/Br)c(=O)[nH]c1=O. The target protein (P09250) has sequence MSTDKTDVKMGVLRIYLDGAYGIGKTTAAEEFLHHFAITPNRILLIGEPLSYWRNLAGEDAICGIYGTQTRRLNGDVSPEDAQRLTAHFQSLFCSPHAIMHAKISALMDTSTSDLVQVNKEPYKIMLSDRHPIASTICFPLSRYLVGDMSPAALPGLLFTLPAEPPGTNLVVCTVSLPSHLSRVSKRARPGETVNLPFVMVLRNVYIMLINTIIFLKTNNWHAGWNTLSFCNDVFKQKLQKSECIKLREVPGIEDTLFAVLKLPELCGEFGNILPLWAWGMETLSNCSRSMSPFVLSLEQTPQHAAQELKTLLPQMTPANMSSGAWNILKELVNAVQDNTS. The pIC50 is 3.1. (3) The compound is C/C=C1\NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@@H](COS(=O)(=O)[O-])OC)C(C)C)[C@@H](C)OC(=O)[C@H](C(C)C)NC(=O)[C@H](Cc2ccc(O)cc2)N(C)C(=O)[C@H](Cc2ccccc2)N2C(=O)[C@H](CC[C@H]2O)NC1=O. The target protein (P08246) has sequence MTLGRRLACLFLACVLPALLLGGTALASEIVGGRRARPHAWPFMVSLQLRGGHFCGATLIAPNFVMSAAHCVANVNVRAVRVVLGAHNLSRREPTRQVFAVQRIFENGYDPVNLLNDIVILQLNGSATINANVQVAQLPAQGRRLGNGVQCLAMGWGLLGRNRGIASVLQELNVTVVTSLCRRSNVCTLVRGRQAGVCFGDSGSPLVCNGLIHGIASFVRGGCASGLYPDAFAPVAQFVNWIDSIIQRSEDNPCPHPRDPDPASRTH. The pIC50 is 7.5. 
(4) The compound is C/C1=C\[C@@H](C)[C@@H](C)OC(=O)C[C@H](c2ccc(O)cc2)NC(=O)[C@@H](Cc2c[nH]c3ccccc23)N(C)C(=O)[C@H](C)NC(=O)[C@@H](C)C1. The target protein sequence is MDSEVAALVIDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGIMVGMGQKDSYVGDEAQSKRGILTLRYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPMNPKSNREKMTQIMFETFNVPAFYVSIQAVLSLYSSGRTTGIVLDSGDGVTHVVPIYAGFSLPHAILRIDLAGKDLTDYLMKILSERGYSFSTTAEREIVRDIKEKLCYVALDFEQEMQTAAQSSSIEKSYELPDGQVITIGNERFRAPEALFHPSVLGLESAGIDQTTYNSIMKCDVDVRKELYGNIVMSGGTTMFPGIAERMQKEITALAPSSMKVKIIAPPERKYSVWIGGSILASLTTFQQMWISKQEYDESGPSIVHHKCF. The pIC50 is 4.9. (5) The small molecule is O=C(Cn1nnc(-c2ccccc2)n1)NN=C1C(=O)Nc2ccccc21. The target protein sequence is MATSRAALCAVAVVCVVLAAACAPARAIYVGTPAAALFEEFKRTYRRAYGTLAEEQQRLANFERNLELMREHQARNPHARFGITKFFDLSEAEFAARYLNGAAYFAAAKQHAGQHYRKARADLSAVPDAVDWREKGAVTPVKNQGACGSCWAFSAVGNIESQWARAGHGLVSLSEQQLVSCDDKDNGCNGGLMLQAFEWLLRHMYGIVFTEKSYPYTSGNGDVAECLNSSKLVPGARIDGYVMIPSNETVMAAWLAENGPIAIGVDASSFMSYQSGVLTSCAGDALNHGVLLVGYNTTGGVPYCVIKNSWGEDWGEKGYVRVAMGLNACLLSEYPVSAHVPQSLTPALTASGNFCEACWTVMLHRILSVLKTNGWLLGRRPSARWREDGARGGQ. The pIC50 is 3.9.
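The pIC50 definition stated above (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset's own tooling: the function names are hypothetical, and the nanomolar helper assumes the raw IC50 value is given in nM, which should be checked against the actual source units.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 (assumes the input unit is nM)."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # pIC50 labels of the five examples listed above.
    for label in (6.1, 3.1, 7.5, 4.9, 3.9):
        ic50 = ic50_molar_from_pic50(label)
        print(f"pIC50 {label:.1f} -> IC50 {ic50:.3e} M")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so a higher pIC50 means a lower IC50 and a more potent compound.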