Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COC(=O)c1sccc1S(=O)(=O)Nc1ccc(Nc2ccccc2)cc1OC. The target protein (P35396) has sequence MEQPQEETPEAREEEKEEVAMGDGAPELNGGPEHTLPSSSCADLSQNSSPSSLLDQLQMGCDGASGGSLNMECRVCGDKASGFHYGVHACEGCKGFFRRTIRMKLEYEKCDRICKIQKKNRNKCQYCRFQKCLALGMSHNAIRFGRMPEAEKRKLVAGLTASEGCQHNPQLADLKAFSKHIYNAYLKNFNMTKKKARSILTGKSSHNAPFVIHDIETLWQAEKGLVWKQLVNGLPPYNEISVHVFYRCQSTTVETVRELTEFAKNIPNFSSLFLNDQVTLLKYGVHEAIFAMLASIVNKDGLLVANGSGFVTHEFLRSLRKPFSDIIEPKFEFAVKFNALELDDSDLALFIAAIILCGDRPGLMNVPQVEAIQDTILRALEFHLQVNHPDSQYLFPKLLQKMADLRQLVTEHAQMMQWLKKTESETLLHPLLQEIYKDMY. The pIC50 is 7.3. (2) The drug is CO[C@H]1CC[C@]2(C=C(c3cc(-c4ccc(Cl)cc4)ccc3C)C(=O)N2)CC1. The target protein sequence is ANLIPSQEPFPASDNSGETPQRNGEGHTLPKTPSQAEPASHKGPKDAGRRRNSLPPSHQKPPRNPLSSSDAAPSPELQANGTGTQGLEATDTNGLSSSARPQGQQAGSPSKEDKKQANIKRQLMTNFILGSFDDYSSDEDSVAGSSRESTRKGSRASLGALSLEAYLTTGEAETRVPTMRPSMSGLHLVKRGREHKKLDLHRDFTVASPAEFVTRFGGDRVIEKVLIANNGIAAVKCMRSIRRWAYEMFRNERAIRFVVMVTPEDLKANAEYIKMADHYVPVPGGPNNNNYANVELIVDIAKRIPVQAVWAGWGHASENPKLPELLCKNGVAFLGPPSEAMWALGDKIASTVVAQTLQVPTLPWSGSGLTVEWTEDDLQQGKRISVPEDVYDKGCVKDVDEGLEAAERIGFPLMIKASEGGGGKGIRKAESAEDFPILFRQVQSEIPGSPIFLMKLAQHARHLEVQILADQYGNAVSLFGRDCSIQRRHQKIVEEAPATI.... The pIC50 is 6.4. (3) The small molecule is C=C1C(=O)N[C@H](C)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(=O)O)[C@H](C)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](/C=C/C(C)=C/[C@H](C)[C@H](Cc2ccccc2)OC)[C@H](C)C(=O)N[C@@H](C(=O)O)CCC(=O)N1C. 
The target protein (P63151) has sequence MAGAGGGNDIQWCFSQVKGAVDDDVAEADIISTVEFNHSGELLATGDKGGRVVIFQQEQENKIQSHSRGEYNVYSTFQSHEPEFDYLKSLEIEEKINKIRWLPQKNAAQFLLSTNDKTIKLWKISERDKRPEGYNLKEEDGRYRDPTTVTTLRVPVFRPMDLMVEASPRRIFANAHTYHINSISINSDYETYLSADDLRINLWHLEITDRSFNIVDIKPANMEELTEVITAAEFHPNSCNTFVYSSSKGTIRLCDMRASALCDRHSKLFEEPEDPSNRSFFSEIISSISDVKFSHSGRYMMTRDYLSVKIWDLNMENRPVETYQVHEYLRSKLCSLYENDCIFDKFECCWNGSDSVVMTGSYNNFFRMFDRNTKRDITLEASRENNKPRTVLKPRKVCASGKRKKDEISVDSLDFNKKILHTAWHPKENIIAVATTNNLYIFQDKVN. The pIC50 is 9.8. (4) The small molecule is CCCc1c(C/C=C(\C)C/C=C/C(C)O)[nH]c(=O)c(C)c1O. The target protein (P52505) has sequence MAVRVLCACVRRLPTAFAPLPRLPTLAAARPLSTTLFAAETRTRPGAPLPALVLAQVPGRVTQLCRQYSDAPPLTLEGIKDRVLYVLKLYDKIDPEKLSVNSHFMKDLGLDSLDQVEIIMAMEDEFGFEIPDIDAEKLMCPQEIVDYIADKKDVYE. The pIC50 is 4.6. (5) The compound is Cc1cccc(Cn2cnc3cc(C(=O)O)ccc32)c1. The target protein sequence is MMFNFPNTRLRRRRSSKWVRNLTSESALSVNDLIFPLFVHDREETTELVSSLPGMKCYSIDGLVSIAQEAEDLGINAVAIFPVVDSKLKSENAEEAYNSDNLICKAIRAIKLKVPGIGIIADVALDPYTTHGHDGILKSNQIDVENDKTVSILCKQALALAKAGCNIVASSDMMDGRVGRIRKVLDDNNLQDVSILSYAVKYCSSFYAPFRQIVGSCVSSNSIDKSGYQMDYRNAREAICEIEMDLNEGADFIMVKPGMPYLDIIKMASDEFNFPIFAYQVSGEYAMIKAATNNGWLDYDKVIYESLVGFKRAGASAIFTYAALDVAKNLR. The pIC50 is 3.9. (6) The small molecule is CCc1cc(O)c2c(O)c3c(cc2c1C(=O)OC)C(=O)c1cccc(O)c1C3=O. The target protein (P17301) has sequence MGPERTGAAPLPLLLVLALSQGILNCCLAYNVGLPEAKIFSGPSSEQFGYAVQQFINPKGNWLLVGSPWSGFPENRMGDVYKCPVDLSTATCEKLNLQTSTSIPNVTEMKTNMSLGLILTRNMGTGGFLTCGPLWAQQCGNQYYTTGVCSDISPDFQLSASFSPATQPCPSLIDVVVVCDESNSIYPWDAVKNFLEKFVQGLDIGPTKTQVGLIQYANNPRVVFNLNTYKTKEEMIVATSQTSQYGGDLTNTFGAIQYARKYAYSAASGGRRSATKVMVVVTDGESHDGSMLKAVIDQCNHDNILRFGIAVLGYLNRNALDTKNLIKEIKAIASIPTERYFFNVSDEAALLEKAGTLGEQIFSIEGTVQGGDNFQMEMSQVGFSADYSSQNDILMLGAVGAFGWSGTIVQKTSHGHLIFPKQAFDQILQDRNHSSYLGYSVAAISTGESTHFVAGAPRANYTGQIVLYSVNENGNITVIQAHRGDQIGSYFGSVLCSVDV.... The pIC50 is 4.3. (7) The compound is CNc1ncnc2c1ncn2CC(COP(=O)(O)O)COP(=O)(O)O. 
The target protein (P49652) has sequence MTEALISAALNGTQPELLAGGWAAGNASTKCSLTKTGFQFYYLPTVYILVFITGFLGNSVAIWMFVFHMRPWSGISVYMFNLALADFLYVLTLPALIFYYFNKTDWIFGDVMCKLQRFIFHVNLYGSILFLTCISVHRYTGVVHPLKSLGRLKKKNAVYVSSLVWALVVAVIAPILFYSGTGVRRNKTITCYDTTADEYLRSYFVYSMCTTVFMFCIPFIVILGCYGLIVKALIYKDLDNSPLRRKSIYLVIIVLTVFAVSYLPFHVMKTLNLRARLDFQTPQMCAFNDKVYATYQVTRGLASLNSCVDPILYFLAGDTFRRRLSRATRKSSRRSEPNVQSKSEEMTLNILTEYKQNGDTSL. The pIC50 is 6.0. (8) The compound is CO[C@@H]1[C@@H](C)C[C@]2(Cc3ccc(C#N)cc3[C@]23N=C(N)N(C2COC2)C3=O)C[C@H]1C. The target protein sequence is MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLVSIPHGPNVTVRANIAAITESDKFFINGSNWEGILGLAYAEIARPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDEST. The pIC50 is 8.5.
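
The pIC50 definition stated at the top (pIC50 = -log10(IC50 in M)) can be sketched as a small unit conversion. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and it assumes the raw IC50 values are given in nanomolar, which may not match how a particular export is stored.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in M).

    Assumes the input is in nM (hypothetical helper, not a BindingDB API).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: pIC50 -> IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # A pIC50 of 7.3, as in example (1), corresponds to roughly 50 nM.
    print(round(pic50_to_ic50_nm(7.3), 1))
    # Higher pIC50 means a lower IC50, i.e. a more potent compound.
    print(pic50_to_ic50_nm(9.8) < pic50_to_ic50_nm(3.9))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50.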