Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COc1ccc(NC(=O)Cn2cc(CSc3nnc(-c4ccccc4)c(-c4ccccc4)n3)nn2)cc1. The target protein (P10253) has sequence MGVRHPPCSHRLLAVCALVSLATAALLGHILLHDFLLVPRELSGSSPVLEETHPAHQQGASRPGPRDAQAHPGRPRAVPTQCDVPPNSRFDCAPDKAITQEQCEARGCCYIPAKQGLQGAQMGQPWCFFPPSYPSYKLENLSSSEMGYTATLTRTTPTFFPKDILTLRLDVMMETENRLHFTIKDPANRRYEVPLETPHVHSRAPSPLYSVEFSEEPFGVIVRRQLDGRVLLNTTVAPLFFADQFLQLSTSLPSQYITGLAEHLSPLMLSTSWTRITLWNRDLAPTPGANLYGSHPFYLALEDGGSAHGVFLLNSNAMDVVLQPSPALSWRSTGGILDVYIFLGPEPKSVVQQYLDVVGYPFMPPYWGLGFHLCRWGYSSTAITRQVVENMTRAHFPLDVQWNDLDYMDSRRDFTFNKDGFRDFPAMVQELHQGGRRYMMIVDPAISSSGPAGSYRPYDEGLRRGVFITNETGQPLIGKVWPGSTAFPDFTNPTALAWWE.... The pIC50 is 4.8. (2) The pIC50 is 9.0. The small molecule is CC[C@H](NC(C)=O)c1cc(Cl)ccc1C1CCN(C(=O)[C@H]2CN(C(C)(C)C)C[C@@H]2c2ccc(C(F)(F)F)cc2)CC1. The target protein (P70596) has sequence MNSTHHHGMYTSLHLWNRSSHGLHGNASESLGKGHSDGGCYEQLFVSPEVFVTLGVISLLENILVIVAIAKNKNLHSPMYFFICSLAVADMLVSVSNGSETIVITLLNSTDTDAQSFTVNIDNVIDSVICSSLLASICSLLSIAVDRYFTIFYALQYHNIMTVRRVGIIISCIWAACTVSGVLFIIYSDSSAVIICLITMFFTMLVLMASLYVHMFLMARLHIKRIAVLPGTGTIRQGANMKGAITLTILIGVFVVCWAPFFLHLLFYISCPQNPYCVCFMSHFNLYLILIMCNAVIDPLIYALRSQELRKTFKEIICFYPLGGICELPGRY. (3) The drug is O=Cc1ccccc1OP(=O)(O)OCc1ccccc1. The target protein sequence is SIQAEEWYFGKITRRESERLLLNAENPRGTFLVRESETTKGAYCLSVSDFDNAKGLNVKHYKIRKLDSGGFYITSRTQFNSLQQLVAYYSKHADGLCHRLTTVCPTSK. The pIC50 is 2.9. (4) The compound is N[C@H]1CC[C@H](Nc2nc(Nc3ccc(C(=O)N4CCCCC4)cc3)c3ncn(-c4cccc(C(=O)O)c4)c3n2)CC1. The target is PFCDPK1(Pfalciparum). The pIC50 is 7.2. (5) The drug is CCCCCCCNCC(P(=O)(O)O)P(=O)(O)O. 
The target protein sequence is MLKTGLCRRAAATTTITSTVPSNLLTEDGRPFAMVAREVRMMQQNMAGLVSNSNNAVLNHIAKYVFSVSGKMLRPTLVAMMAHALLPPHVSEQIRAESIGSIDDISSGAIRPFLRLGEITELLHTATLVHDDVMDNSNTRRGQPTVHCLYDTKRAVLAGDFLLARASIWIAALGHSRVVLLMSTALEDLAAGEMMQMDGCFDIESYEQKSYCKTASLIANSLASTAVLAGLPNTAYEEAAAKFGKHLGIAFQIVDDCLDITGDDKNLGKPKMADMAEGIATLPVLLAAREETRVYEAVRRRFKNPGDTEMCMEAVERHGCVAEALEHAGEHCRRGVEALHALHTSPARDCLEAAMGLILTRQV. The pIC50 is 5.8. (6) The compound is CC[C@H](C)[C@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](C)NC(=O)[C@H](Cc1cnc[nH]1)NC(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](CC(=O)O)NC(C)=O)[C@@H](C)O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CCSC)C(=O)N[C@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CC(C)C)C(=O)NCC(N)=O)[C@@H](C)CC. The target protein sequence is MPEEVQTQDQPMETFAVQTFAFQAEIAQLMSLIYESLTDPSKLDSGK. The pIC50 is 3.6. (7) The small molecule is CCCCCCc1ccc(C(=O)CCN2CCOCC2)cc1. The target protein sequence is THKPEPTDEEWELIKTVTEAHVATNAQGSHWKQKRKFLPEDIGQAPIVNAPEGGKVDLEAFSHFTKIITPAITRVVDFAKKLPMFCELPCEDQIILLKGCCMEIMSLRAAVRYDPESETLTLNGEMAVTRGQLKNGGLGVVSDAIFDLGMSLSSFNLDDTEVALLQAVLLMSSDRPGLACVERIEKYQDSFLLAFEHYINYRKHHVTHFWPKLLMKVTDLRMIGACHASRFLHMKVECPTELFPPLFLEVFED. The pIC50 is 5.6. (8) The small molecule is COc1ccc(-c2nc3ccc(NCc4ccc(Cl)c(Cl)c4)cc3n(CCNC(C)=O)c2=O)cc1. The target protein (Q9Z122) has sequence MGKGGNQGEGSTELQAPMPTFRWEEIQKHNLRTDRWLVIDRKVYNVTKWSQRHPGGHRVIGHYSGEDATDAFRAFHLDLDFVGKFLKPLLIGELAPEEPSLDRGKSSQITEDFRALKKTAEDMNLFKTNHLFFFLLLSHIIVMESIAWFILSYFGNGWIPTVITAFVLATSQAQAGWLQHDYGHLSVYKKSIWNHIVHKFVIGHLKGASANWWNHRHFQHHAKPNIFHKDPDIKSLHVFVLGEWQPLEYGKKKLKYLPYNHQHEYFFLIGPPLLIPMYFQYQIIMTMIRRRDWVDLAWAISYYARFFYTYIPFYGILGALVFLNFIRFLESHWFVWVTQMNHIVMEIDLDHYRDWFSSQLAATCNVEQSFFNDWFSGHLNFQIEHHLFPTMPRHNLHKIAPLVKSLCAKHGIEYQEKPLLRALLDIVSSLKKSGELWLDAYLHK. The pIC50 is 7.5. (9) The drug is C[C@H](C(=O)NO)N1CC[C@@](C)(c2ccc(-c3ccccc3)cc2)C1=O. 
The target protein sequence is MLREQFSFDIAEEASKVCLAHLFTYQDFDMGTLGLAYVGSPRANSHGGVCPKAYYSPIGKKNIYLNSGLTSTKNYGKTILTKEADLVTTHELGHNFGAEHDPDGLAECAPNE. The pIC50 is 4.9.
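The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be illustrated with a minimal sketch. The function names below are hypothetical and are not part of BindingDB or any dataset tooling; the sketch assumes the IC50 value has already been converted to molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in molar units to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: recover the IC50 in molar units."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Assumed illustrative inputs: 1 nM and 10 uM expressed in M.
    for ic50 in (1e-9, 1e-5):
        print(f"IC50 = {ic50:.1e} M, pIC50 = {ic50_to_pic50(ic50):.2f}")
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean greater potency.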