Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CCCCCCCCCNC(=O)N[C@@H](CC(=O)[O-])C[N+](C)(C)C. The target protein (P32198) has sequence MAEAHQAVAFQFTVTPDGIDLRLSHEALKQICLSGLHSWKKKFIRFKNGIITGVFPANPSSWLIVVVGVISSMHAKVDPSLGMIAKISRTLDTTGRMSSQTKNIVSGVLFGTGLWVAVIMTMRYSLKVLLSYHGWMFAEHGKMSRSTKIWMAMVKVLSGRKPMLYSFQTSLPRLPVPAVKDTVSRYLESVRPLMKEEDFQRMTALAQDFAVNLGPKLQWYLKLKSWWATNYVSDWWEEYIYLRGRGPLMVNSNYYAMEMLYITPTHIQAARAGNTIHAILLYRRTLDREELKPIRLLGSTIPLCSAQWERLFNTSRIPGEETDTIQHIKDSRHIVVYHRGRYFKVWLYHDGRLLRPRELEQQMQQILDDPSEPQPGEAKLAALTAADRVPWAKCRQTYFARGKNKQSLDAVEKAAFFVTLDESEQGYREEDPEASIDSYAKSLLHGRCFDRWFDKSITFVVFKNSKIGINAEHSWADAPVVGHLWEYVMATDVFQLGYSE.... The pIC50 is 3.5. (2) The compound is CCOc1cnn(-c2ccccc2)c(=O)c1Cl. The target protein sequence is MNKQRIYSIVAILLFVVGGVLIGKPFYDGYQAEKKQTENVQAVQKMDYEKHETEFVDASKIDQPDLAEVANASLDKKQVIGRISIPSVSLELPVLKSSTEKNLLSGAATVKENQVMGKGNYALAGHNMSKKGVLFSDIASLKKGDKIYLYDNENEYEYAVTGVSEVTPDKWEVVEDHGKDEITLITCVSVKDNSKRYVVAGDLVGTKAKK. The pIC50 is 6.5. (3) The small molecule is Cc1n[nH]c2cc(-c3csc4c(=O)cc(N5CCOCC5)oc34)ccc12. The target protein sequence is KHAAYAWPFYKPVDVEALGLHDYCDIIKHPMDMSTIKSKLEAREYRDAQEFGADVRLMFSNCYKYNPPDHEVV. The pIC50 is 6.3. (4) The compound is O=C(OC1CCCCC1)c1cc(NC(=O)c2cnc(Cl)nc2C(F)(F)F)cc(C(F)(F)F)c1. The target protein (P18846) has sequence MEDSHKSTTSETAPQPGSAVQGAHISHIAQQVSSLSESEESQDSSDSIGSSQKAHGILARRPSYRKILKDLSSEDTRGRKGDGENSGVSAAVTSMSVPTPIYQTSSGQYIAIAPNGALQLASPGTDGVQGLQTLTMTNSGSTQQGTTILQYAQTSDGQQILVPSNQVVVQTASGDMQTYQIRTTPSATSLPQTVVMTSPVTLTSQTTKTDDPQLKREIRLMKNREAARECRRKKKEYVKCLENRVAVLENQNKTLIEELKTLKDLYSNKSV. The pIC50 is 5.9. (5) The compound is Cc1occc1C(=O)N/N=C/c1cc(Cl)ccc1O. 
The target protein sequence is MMNVILFLTLSNIFVFNSAQHQINLLSEIVQSRCTQWKVEHGATNISCSEIWNSFESILLSTHTKSACVMKSGLFDDFVYQLFELEQQQQQRHHTIQTEQYFHSQVMNIIRGMCKRLGVCRSLETTFPGYLFDELNWCNGSLTGNTKYGTVCGCDYKSNVVHAFWQSASAEYARRASGNIFVVLNGSVKAPFNENKTFGKIELPLLKHPRVQQLTVKLVHSLEDVNNRQTCESWSLQELANKLNSVHIPFRCIDDPLEFRHYQCIENPGKQLCQFSASTRSNVETLLILFPLVICLTFYTSMNHHHHHH. The pIC50 is 7.5. (6) The compound is COc1ccc2c(c1)SC1=NC(c3ccc4ccccc4c3)=C/C(=C\C#N)N12. The target protein (P00690) has sequence MKLFLLLSAFGFCWAQYAPQTQSGRTSIVHLFEWRWVDIALECERYLGPKGFGGVQVSPPNENIVVTNPSRPWWERYQPVSYKLCTRSGNENEFRDMVTRCNNVGVRIYVDAVINHMCGSGAAAGTGTTCGSYCNPGNREFPAVPYSAWDFNDGKCKTASGGIESYNDPYQVRDCQLVGLLDLALEKDYVRSMIADYLNKLIDIGVAGFRIDASKHMWPGDIKAVLDKLHNLNTNWFPAGSRPFIFQEVIDLGGEAIQSSEYFGNGRVTEFKYGAKLGTVVRKWSGEKMSYLKNWGEGWGFMPSDRALVFVDNHDNQRGHGAGGASILTFWDARLYKVAVGFMLAHPYGFTRVMSSYRWARNFVNGQDVNDWIGPPNNNGVIKEVTINADTTCGNDWVCEHRWRQIRNMVWFRNVVDGQPFANWWANGSNQVAFGRGNRGFIVFNNDDWQLSSTLQTGLPGGTYCDVISGDKVGNSCTGIKVYVSSDGTAQFSISNSAED.... The pIC50 is 4.7. (7) The small molecule is C[C@]1(Cn2ccnn2)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The target protein sequence is MVKKSLRQFTLMATATVTLLLGSVPLYAQTADVQQKLAELERQSGGRLGVALINTADNSQILYRADERFAMCSTSKVMAAAAVLKKSESEPNLLNQRVEIKKSDLVNYNPIAEKHVNGTMSLAELSAAALQYSDNVAMNKLIAHVGGPASVTAFARQLGDETFRLDRTEPTLNTAIPGDPRDTTSPRAMAQTLRNLTLGKALGDSQRAQLVTWMKGNTTGAASIQAGLPASWVVGDKTGSGGYGTTNDIAVIWPKDRAPLILVTYFTQPQPKAESRRDVLASAAKIVTDGL. The pIC50 is 8.7.
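The pIC50 definition given in the task description, pIC50 = -log10(IC50 in M), can be illustrated with a minimal Python sketch. The function names below are hypothetical helpers chosen for illustration; they are not part of BindingDB or of any dataset tooling, and the nanomolar variant assumes the raw IC50 is reported in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (assumed unit)."""
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x).
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from examples (1) and (7) above.
    for label in (3.5, 8.7):
        print(f"pIC50 {label} -> IC50 = {pic50_to_ic50(label):.3e} M")
```

Under this definition, the label 3.5 in example (1) corresponds to an IC50 of roughly 316 µM, while the label 8.7 in example (7) corresponds to roughly 2 nM, which is why higher values mean more potent binding.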