Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCn1c(-c2cccnc2)nc2cc(NC(=O)COC)cc(C(=O)NCc3ccccc3)c21. The target protein (O08689) has sequence MMQKLQMYVYIYLFMLIAAGPVDLNEGSEREENVEKEGLCNACAWRQNTRYSRIEAIKIQILSKLRLETAPNISKDAIRQLLPRAPPLRELIDQYDVQRDDSSDGSLEDDDYHATTETIITMPTESDFLMQADGKPKCCFFKFSSKIQYNKVVKAQLWIYLRPVKTPTTVFVQILRLIKPMKDGTRYTGIRSLKLDMSPGTGIWQSIDVKTVLQNWLKQPESNLGIEIKALDENGHDLAVTFPGPGEDGLNPFLEVKVTDTPKRSRRDFGLDCDEHSTESRCCRYPLTVDFEAFGWDWIIAPKRYKANYCSGECEFVFLQKYPHTHLVHQANPRGSAGPCCTPTKMSPINMLYFNGKEQIIYGKIPAMVVDRCGCS. The pIC50 is 4.3. (2) The small molecule is Nc1nc2nc(C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc3c[nH]c4ccccc34)C(=O)O)cnc2c(=O)[nH]1. The target protein (P02879) has sequence MKPGGNTIVIWMYAVATWLCFGSTSGWSFTLEDNNIFPKQYPIINFTTAGATVQSYTNFIRAVRGRLTTGADVRHEIPVLPNRVGLPINQRFILVELSNHAELSVTLALDVTNAYVVGYRAGNSAYFFHPDNQEDAEAITHLFTDVQNRYTFAFGGNYDRLEQLAGNLRENIELGNGPLEEAISALYYYSTGGTQLPTLARSFIICIQMISEAARFQYIEGEMRTRIRYNRRSAPDPSVITLENSWGRLSTAIQESNQGAFASPIQLQRRNGSKFSVYDVSILIPIIALMVYRCAPPPSSQFSLLIRPVVPNFNADVCMDPEPIVRIVGRNGLCVDVRDGRFHNGNAIQLWPCKSNTDANQLWTLKRDNTIRSNGKCLTTYGYSPGVYVMIYDCNTAATDATRWQIWDNGTIINPRSSLVLAATSGNSGTTLTVQTNIYAVSQGWLPTNNTQPFVTTIVGLYGLCLQANSGQVWIEDCSSEKAEQQWALYADGSIRPQQN.... The pIC50 is 4.0. (3) The pIC50 is 7.0. The target protein (Q3SYC2) has sequence MVEFAPLFMPWERRLQTLAVLQFVFSFLALAEICTVGFIALLFTRFWLLTVLYAAWWYLDRDKPRQGGRHIQAIRCWTIWKYMKDYFPISLVKTAELDPSRNYIAGFHPHGVLAVGAFANLCTESTGFSSIFPGIRPHLMMLTLWFRAPFFRDYIMSAGLVTSEKESAAHILNRKGGGNLLGIIVGGAQEALDARPGSFTLLLRNRKGFVRLALTHGAPLVPIFSFGENDLFDQIPNSSGSWLRYIQNRLQKIMGISLPLFHGRGVFQYSFGLIPYRRPITTVVGKPIEVQKTLHPSEEEVNQLHQRYIKELCNLFEAHKLKFNIPADQHLEFC. The compound is C[C@@H](NS(C)(=O)=O)c1ccc(CN2CCOC(c3cccc(C(F)(F)F)c3)C2)cc1. 
(4) The target protein sequence is KEPRDPDQLYSTLKSILQQVKSHQSAWPFMEPVKRTEAPGYYEVIRFPMDLKTMSERLKNRYYVSKKLFMADLQRVFTNCKEYNPPESEYYKCANILEKFFFSKIKEAGLIDK. The compound is C[C@H](Nc1nn(C)c(=O)c2ccccc12)[C@H](c1ccc(OCc2ccccc2)cc1)N(C)C. The pIC50 is 7.4.
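
The pIC50 definition used in the task description (pIC50 = -log10(IC50 in M)) can be checked with a minimal sketch like the one below. The function names are hypothetical, and the nanomolar input unit is an assumption about how raw IC50 values are typically recorded; it is not something stated in this record.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: pIC50 back to an IC50 in nanomolar."""
    return 10.0 ** (-pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # pIC50 labels taken from the four examples above.
    for label in (4.3, 4.0, 7.0, 7.4):
        print(f"pIC50 {label} ~ IC50 {pic50_to_ic50_nm(label):.1f} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so the label 7.0 in example (3) reflects a compound roughly 1000 times more potent than the label 4.0 in example (2).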