Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P04806) has sequence MVHLGPKKPQARKGSMADVPKELMDEIHQLEDMFTVDSETLRKVVKHFIDELNKGLTKKGGNIPMIPGWVMEFPTGKESGNYLAIDLGGTNLRVVLVKLSGNHTFDTTQSKYKLPHDMRTTKHQEELWSFIADSLKDFMVEQELLNTKDTLPLGFTFSYPASQNKINEGILQRWTKGFDIPNVEGHDVVPLLQNEISKRELPIEIVALINDTVGTLIASYYTDPETKMGVIFGTGVNGAFYDVVSDIEKLEGKLADDIPSNSPMAINCEYGSFDNEHLVLPRTKYDVAVDEQSPRPGQQAFEKMTSGYYLGELLRLVLLELNEKGLMLKDQDLSKLKQPYIMDTSYPARIEDDPFENLEDTDDIFQKDFGVKTTLPERKLIRRLCELIGTRAARLAVCGIAAICQKRGYKTGHIAADGSVYNKYPGFKEAAAKGLRDIYGWTGDASKDPITIVPAEDGSGAGAAVIAALSEKRIAEGKSLGIIGA. The pIC50 is 2.5. The small molecule is O=C(NC1C(O)OC(CO)C(O)C1O)c1ccccc1[N+](=O)[O-]. (2) The target protein (Q92633) has sequence MAAISTSIPVISQPQFTAMNEPQCFYNESIAFFYNRSGKHLATEWNTVSKLVMGLGITVCIFIMLANLLVMVAIYVNRRFHFPIYYLMANLAAADFFAGLAYFYLMFNTGPNTRRLTVSTWLLRQGLIDTSLTASVANLLAIAIERHITVFRMQLHTRMSNRRVVVVIVVIWTMAIVMGAIPSVGWNCICDIENCSNMAPLYSDSYLVFWAIFNLVTFVVMVVLYAHIFGYVRQRTMRMSRHSSGPRRNRDTMMSLLKTVVIVLGAFIICWTPGLVLLLLDVCCPQCDVLAYEKFFLLLAEFNSAMNPIIYSYRDKEMSATFRQILCCQRSENPTGPTEGSDRSASSLNHTILAGVHSNDHSVV. The small molecule is COC(=O)C1(c2ccc(-c3ccc(-c4nnn(C)c4NC(=O)O[C@H](C)c4ccccc4)c(F)c3)cc2)CC1. The pIC50 is 4.5. (3) The drug is CCC(C)CC(C)CCCCCCCCC(=O)N[C@H]1C[C@@H](O)[C@@H](O)NC(=O)[C@@H]2[C@@H](O)CCN2C(=O)[C@H]([C@H](O)CCNC(C)=O)NC(=O)[C@H]([C@H](O)[C@@H](O)c2ccc(O)cc2)NC(=O)[C@@H]2C[C@@H](O)CN2C(=O)[C@H]([C@@H](C)O)NC1=O. 
The target protein sequence is MSYNDNNNHYYDPNQQGGMPPHQGGEGYYQQQYDDMGQQPHQQDYYDPNAQYQQQPYDMDGYQDQANYGGQPMNAQGYNADPEAFSDFSYGGQTPGTPGYDQYGTQYTPSQMSYGGDPRSSGASTPIYGGQGQGYDPTQFNMSSNLPYPAWSADPQAPIKIEHIEDIFIDLTNKFGFQRDSMRNMFDYFMTLLDSRSSRMSPAQALLSLHADYIGGDNANYRKWYFSSQQDLDDSLGFANMTLGKIGRKARKASKKSKKARKAAEEHGQDVDALANELEGDYSLEAAEIRWKAKMNSLTPEERVRDLALYLLIWGEANQVRFTPECLCYIYKSATDYLNSPLCQQRQEPVPEGDYLNRVITPLYRFIRSQVYEIYDGRFVKREKDHNKVIGYDDVNQLFWYPEGISRIIFEDGTRLVDIPQEERFLKLGEVEWKNVFFKTYKEIRTWLHFVTNFNRIWIIHGTIYWMYTAYNSPTLYTKHYVQTINQQPLASSRWAACAI.... The pIC50 is 6.5. (4) The drug is NC(=O)c1ccc2c(c1)nc(-c1cccnc1)n2[C@@H]1CCC[C@H](NC(=O)c2ccc(Br)s2)C1. The target protein (O15382) has sequence MAAAALGQIWARKLLSVPWLLCGPRRYASSSFKAADLQLEMTQKPHKKPGPGEPLVFGKTFTDHMLMVEWNDKGWGQPRIQPFQNLTLHPASSSLHYSLQLFEGMKAFKGKDQQVRLFRPWLNMDRMLRSAMRLCLPSFDKLELLECIRRLIEVDKDWVPDAAGTSLYVRPVLIGNEPSLGVSQPTRALLFVILCPVGAYFPGGSVTPVSLLADPAFIRAWVGGVGNYKLGGNYGPTVLVQQEALKRGCEQVLWLYGPDHQLTEVGTMNIFVYWTHEDGVLELVTPPLNGVILPGVVRQSLLDMAQTWGEFRVVERTITMKQLLRALEEGRVREVFGSGTACQVCPVHRILYKDRNLHIPTMENGPELILRFQKELKEIQYGIRAHEWMFPV. The pIC50 is 6.3. (5) The compound is CC(C)(c1cc2cnccc2o1)N1CCN(C[C@@H](O)C[C@@H](Cc2ccccc2)C(=O)N[C@H]2c3ccccc3OC[C@H]2O)[C@H](C(=O)NCC(F)(F)F)C1. The target protein sequence is PQITLWKRPIVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKIIGGIGGFVKVREYDQIPVEICGHKAIGTVLIGPTPFNVIGRNLMTQLGCTLNF. The pIC50 is 8.2.
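The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset itself: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the choice of nanomolar as the output unit is an assumption made for readability.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (assumed unit)."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # pIC50 labels taken from the five examples above.
    for label in (2.5, 4.5, 6.5, 6.3, 8.2):
        print(f"pIC50 {label} -> IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: the pIC50 of 8.2 in example (5) corresponds to an IC50 of roughly 6.3 nM, while the pIC50 of 2.5 in example (1) corresponds to roughly 3.2 mM.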