Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COc1ccc(CNC(=O)c2cc(=O)c3c(O)cc(OCCc4ccc(NC(=O)C(=O)O)c(C#N)c4)cc3o2)cc1. The target protein (Q5E9B1) has sequence MATLKEKLIAPVAEEETRIPNNKITVVGVGQVGMACAISILGKSLTDELALVDVLEDKLKGEMMDLQHGSLFLQTPKIVADKDYSVTANSKIVVVTAGVRQQEGESRLNLVQRNVNVFKFIIPQIVKYSPDCIIIVVSNPVDILTYVTWKLSGLPKHRVIGSGCNLDSARFRYLMAEKLGIHPSSCHGWILGEHGDSSVAVWSGVNVAGVSLQELNPEMGTDNDSENWKEVHKMVVESAYEVIKLKGYTNWAIGLSVADLIESMLKNLSRIHPVSTMVKGMYGIENEVFLSLPCILNARGLTSVINQKLKDEEVAQLKKSADTLWGIQKDLKDL. The pIC50 is 3.8. (2) The drug is CC[C@]1(O)C[C@@H]2CN(CCc3c([nH]c4ccccc34)[C@@](C(=O)OC)(c3cc4c(cc3OC)N(C)[C@H]3[C@@](O)(C(=O)OC)[C@H](OC(C)=O)[C@]5(CC)C=CCN6CC[C@]43[C@@H]65)C2)C1. The target protein sequence is MREIVSCQAGQCGNQIGSKFWEVIADEHGVDPTGSYQGDSDLQ. The pIC50 is 6.0. (3) The small molecule is Cc1cccc(S(=O)(=O)N(C)c2c(N)n(Cc3ccccc3)c(=O)[nH]c2=O)c1. The target protein sequence is MKRKGIILAGGSGTRLHPATLAISKQLLPVYDKPMIYYPLSTLMLAGIREILIISTPQDTPRFQQLLGDGSNWGLDLQYAVQPSPDGLAQAFLIGESFIGNDLSALVLGDNLYYGHDFHELLGSASQRQTGASVFAYHVLDPERYGVVEFDQGGKAISLEEKPLEPKSNYAVTGLYFYDQQVVDIARDLKPSPRGELEITDVNRAYLERGQLSVEIMGRGYAWLDTGTHDSLLEAGQFIATLENRQGLKVACPEEIAYRQKWIDAAQLEKLAAPLAKNGYGQYLKRLLTETVY. The pIC50 is 6.4. (4) The drug is NC(=O)c1cc([N+](=O)[O-])cc([N+](=O)[O-])c1N(CCCl)CCCl. The target protein sequence is MTPTIELTCGHRSIRHFTDEPISEAQREAIINSARATSSSSFLQCSSIIRITDKALREELVTLTGGQKHVAQAAEFWVFCADFNRHLQICPDAQLGLAEQLLLGVVDTAMMAQNALTAAESLGLGGVYIGGLRNNIEAVTKLLKLPQHVLPLFGLCLGWPADNPDLKPRLPSSILVHENSYQPLDKDALAQYDEQLAEYYLTRGSNNRRDTWSDHIRRTIIKESRPFILDYLHKQGWATR. The pIC50 is 5.1. 
(5) The drug is CC[C@H](C)[C@H](NC(=O)[C@H](C)NC(=O)[C@@H](NC(=O)[C@@H](N)CCCCN)[C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@H](C(=O)N[C@@H](CC(N)=O)C(=O)O)[C@@H](C)O)C(C)C)[C@@H](C)O. The target protein (P26818) has sequence MADLEAVLADVSYLMAMEKSKATPAARASKKIVLPEPSIRSVMQKYLEERHEITFDKIFNQRIGFLLFKDFCLNEINEAVPQVKFYEEIKEYEKLENEEDRLCRSRQIYDTYIMKELLSCSHPFSKQAVEHVQSHLSKKQVTSTLFQPYIEEICESLRGSIFQKFMESDKFTRFCQWKNVELNIHLTMNDFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNERIMLSLVSTGDCPFIVCMTYAFHTPDKLCFILDLMNGGDLHYHLSQHGVFSEKEMRFYATEIILGLEHMHNRFVVYRDLKPANILLDEHGHVRISDLGLACDFSKKKPHASVGTHGYMAPEVLQKGTAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTMNVELPDVFSPELKSLLEGLLQRDVSKRLGCHGGSAQELKTHDFFRGIDWQHVYLQKYPPPLIPPRGEVNAADAFDIGSFDEEDTKGIKLLD.... The pIC50 is 4.9. (6) The drug is CCCCC(Oc1cc(O)c(C(=O)O)cc1C#Cc1ccccc1OC(F)(F)F)C(=O)NC1CCCCC1. The target protein (Q8WVY7) has sequence MALPIIVKWGGQEYSVTTLSEDDTVLDLKQFLKTLTGVLPERQKLLGLKVKGKPAENDVKLGALKLKPNTKIMMMGTREESLEDVLGPPPDNDDVVNDFDIEDEVVEVENREENLLKISRRVKEYKVEILNPPREGKKLLVLDVDYTLFDHRSCAETGVELMRPYLHEFLTSAYEDYDIVIWSATNMKWIEAKMKELGVSTNANYKITFMLDSAAMITVHTPRRGLIDVKPLGVIWGKFSEFYSKKNTIMFDDIGRNFLMNPQNGLKIRPFMKAHLNRDKDKELLKLTQYLKEIAKLDDFLDLNHKYWERYLSKKQGQ. The pIC50 is 6.3.
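
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50`, `pic50_to_ic50`, and `ic50_nm_to_pic50` are hypothetical, and the nanomolar helper assumes the raw IC50 values are given in nM, which this record does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same transform for an IC50 in nanomolar (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


if __name__ == "__main__":
    # Labels taken from examples (1) and (2) above.
    for label in (3.8, 6.0):
        ic50_um = pic50_to_ic50(label) * 1e6  # mol/L -> micromolar
        print(f"pIC50 {label} corresponds to IC50 of about {ic50_um:.3g} uM")
```

Read this way, the label 3.8 in example (1) corresponds to an IC50 of roughly 158 µM (a weak binder), while the label 6.0 in example (2) corresponds to 1 µM, which matches the "higher means more potent" convention.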