This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is ClC1CCCN2CCN=C12.O.O=C(O)/C=C/C(=O)O. The target protein (O97972) has sequence MEGGFTGGDEYQKHFLPRDYLNTYYSFQSGPSPEAEMLKFNLECLHKTFGPGGLQGDTLIDIGSGPTIYQVLAACESFKDITLSDFTDRNREELAKWLKKEPGAYDWTPALKFACELEGNSGRWQEKAEKLRATVKRVLKCDANLSNPLTPVVLPPADCVLTLLAMECACCSLDAYRAALRNLASLLKPGGHLVTTVTLQLSSYMVGEREFSCVALEKEEVEQAVLDAGFDIEQLLYSPQSYSASTAPNRGVCFLVARKKPGS. The pIC50 is 4.3. (2) The small molecule is CC(=O)N1CCC=C(c2nc3ccc(-c4c(C)noc4C)c4c3n2[C@@H](c2ccccn2)CO4)C1. The target protein sequence is PAPEKSSKVSEQLKCCSGILKEMFAKKHAAYAWPFYKPVDVEALGLHDYCDIIKHPMDMSTIKSKLEAREYRDAQEFGADVRLMFSNCYKYNPPDHEVVAMARKLQDVFEMRFAKMPDE. The pIC50 is 7.0. (3) The small molecule is CC(=O)Nc1ccc(-c2ccc(-c3nc4c(C(N)=O)cccc4[nH]3)cc2)cn1. The pIC50 is 4.4. The target protein (Q02127) has sequence MAWRHLKKRAQDAVIILGGGGLLFASYLMATGDERFYAEHLMPTLQGLLDPESAHRLAVRFTSLGLLPRARFQDSDMLEVRVLGHKFRNPVGIAAGFDKHGEAVDGLYKMGFGFVEIGSVTPKPQEGNPRPRVFRLPEDQAVINRYGFNSHGLSVVEHRLRARQQKQAKLTEDGLPLGVNLGKNKTSVDAAEDYAEGVRVLGPLADYLVVNVSSPNTAGLRSLQGKAELRRLLTKVLQERDGLRRVHRPAVLVKIAPDLTSQDKEDIASVVKELGIDGLIVTNTTVSRPAGLQGALRSETGGLSGKPLRDLSTQTIREMYALTQGRVPIIGVGGVSSGQDALEKIRAGASLVQLYTALTFWGPPVVGKVKRELEALLKEQGFGGVTDAIGADHRR. (4) The compound is CCOC(=O)/C(C#N)=C1/C(=O)Nc2ccccc21. The target protein sequence is MENNSTERYIFKPNFLGEGSYGKVYKAYDTILKKEVAIKKMKLNEISNYIDDCGINFVLLREIKIMKEIKHKNIMSALDLYCEKDYINLVMEIMDYDLSKIINRKIFLTDSQKKCILLQILNGLNVLHKYYFMHRDLSPANIFINKKGEVKLADFGLCTKYGYDMYSDKLFRDKYKKNLNLTSKVVTLWYRAPELLLGSNKYNSSIDMWSFGCIFAELLLQKALFPGENEIDQLGKIFFLLGTPNENNWPEALCLPLYTEFTKATKKDFKTYFKIDDDDCIDLLTSFLKLNAHERISAEDAMKHRYFFNDPLPCDISQLPFNDL. The pIC50 is 4.4. 
(5) The drug is CCNc1cc(-c2cc(C)c(=O)n(C)c2)nc(Oc2c(C)cccc2F)n1. The target protein (Q92793) has sequence MAENLLDGPPNPKRAKLSSPGFSANDSTDFGSLFDLENDLPDELIPNGGELGLLNSGNLVPDAASKHKQLSELLRGGSGSSINPGIGNVSASSPVQQGLGGQAQGQPNSANMASLSAMGKSPLSQGDSSAPSLPKQAASTSGPTPAASQALNPQAQKQVGLATSSPATSQTGPGICMNANFNQTHPGLLNSNSGHSLINQASQGQAQVMNGSLGAAGRGRGAGMPYPTPAMQGASSSVLAETLTQVSPQMTGHAGLNTAQAGGMAKMGITGNTSPFGQPFSQAGGQPMGATGVNPQLASKQSMVNSLPTFPTDIKNTSVTNVPNMSQMQTSVGIVPTQAIATGPTADPEKRKLIQQQLVLLLHAHKCQRREQANGEVRACSLPHCRTMKNVLNHMTHCQAGKACQVAHCASSRQIISHWKNCTRHDCPVCLPLKNASDKRNQQTILGSPASGIQNTIGSVGTGQQNATSLSNPNPIDPSSMQRAYAALGLPYMNQPQTQL.... The pIC50 is 6.3. (6) The compound is Cc1oc2c(C)c3oc(=O)c(CCC(=O)O)c(C)c3cc2c1-c1ccccc1. The target protein (Q9HC29) has sequence MGEEGGSASHDEEERASVLLGHSPGCEMCSQEAFQAQRSQLVELLVSGSLEGFESVLDWLLSWEVLSWEDYEGFHLLGQPLSHLARRLLDTVWNKGTWACQKLIAAAQEAQADSQSPKLHGCWDPHSLHPARDLQSHRPAIVRRLHSHVENMLDLAWERGFVSQYECDEIRLPIFTPSQRARRLLDLATVKANGLAAFLLQHVQELPVPLALPLEAATCKKYMAKLRTTVSAQSRFLSTYDGAETLCLEDIYTENVLEVWADVGMAGPPQKSPATLGLEELFSTPGHLNDDADTVLVVGEAGSGKSTLLQRLHLLWAAGQDFQEFLFVFPFSCRQLQCMAKPLSVRTLLFEHCCWPDVGQEDIFQLLLDHPDRVLLTFDGFDEFKFRFTDRERHCSPTDPTSVQTLLFNLLQGNLLKNARKVVTSRPAAVSAFLRKYIRTEFNLKGFSEQGIELYLRKRHHEPGVADRLIRLLQETSALHGLCHLPVFSWMVSKCHQELL.... The pIC50 is 4.8. (7) The compound is Oc1ccccc1/C=C/c1ccc2ccccc2n1. The target protein (P0DPD8) has sequence MASPGAGRAPPELPERNCGYREVEYWDQRYQGAADSAPYDWFGDFSSFRALLEPELRPEDRILVLGCGNSALSYELFLGGFPNVTSVDYSSVVVAAMQARHAHVPQLRWETMDVRKLDFPSASFDVVLEKGTLDALLAGERDPWTVSSEGVHTVDQVLSEVGFQKGTRQLLGSRTQLELVLAGASLLLAALLLGCLVALGVQYHRDPSHSTCLTEACIRVAGKILESLDRGVSPCEDFYQFSCGGWIRRNPLPDGRSRWNTFNSLWDQNQAILKHLLENTTFNSSSEAEQKTQRFYLSCLQVERIEELGAQPLRDLIEKIGGWNITGPWDQDNFMEVLKAVAGTYRATPFFTVYISADSKSSNSNVIQVDQSGLFLPSRDYYLNRTANEKVLTAYLDYMEELGMLLGGRPTSTREQMQQVLELEIQLANITVPQDQRRDEEKIYHKMSISELQALAPSMDWLEFLSFLLSPLELSDSEPVVVYGMDYLQQVSELINRTEP.... The pIC50 is 5.2. (8) The small molecule is Cc1c[nH]c2ncnc(-c3ccc(NC(=O)N(CCO)c4ccccc4Cl)cc3)c12. 
The target protein (P53669) has sequence MRLTLLCCTWREERMGEEGSELPVCASCSQSIYDGQYLQALNADWHADCFRCCECSTSLSHQYYEKDGQLFCKKDYWARYGESCHGCSEHITKGLVMVGGELKYHPECFICLACGNFIGDGDTYTLVEHSKLYCGQCYYQTVVTPVIEQILPDSPGSHLPHTVTLVSIPASAHGKRGLSVSIDPPHGPPGCGTEHSHTVRVQGVDPGCMSPDVKNSIHIGDRILEINGTPIRNVPLDEIDLLIQETSRLLQLTLEHDPHDSLGHGPVSDPSPLASPVHTPSGQAGSSARQKPVLRSCSIDTSPGAGSLVSPASQRKDLGRSESLRVVCRPHRIFRPSDLIHGEVLGKGCFGQAIKVTHRETGEVMVMKELIRFDEETQRTFLKEVKVMRCLEHPNVLKFIGVLYKDKRLNFITEYIKGGTLRGIIKSMDSQYPWSQRVSFAKDIASGMAYLHSMNIIHRDLNSHNCLVRENRNVVVADFGLARLMIDEKGQSEDLRSLKK.... The pIC50 is 6.4. (9) The small molecule is C[C@@H]1CN(C2=C(Cl)C(=O)N(c3ccc(Cl)c(Cl)c3)C2=O)CCN1. The target protein (Q16548) has sequence MTDCEFGYIYRLAQDYLQCVLQIPQPGSGPSKTSRVLQNVAFSVQKEVEKNLKSCLDNVNVVSVDTARTLFNQVMEKEFEDGIINWGRIVTIFAFEGILIKKLLRQQIAPDVDTYKEISYFVAEFIMNNTGEWIRQNGGWENGFVKKFEPKSGWMTFLEVTGKICEMLSLLKQYC. The pIC50 is 6.1.
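The pIC50 definition stated at the top of this record, pIC50 = -log10(IC50 in M), can be sketched in code as follows. This is a minimal illustration only. The function names are hypothetical and are not part of BindingDB or any dataset tooling, and the nanomolar helper assumes the raw IC50 is given in nM, which this record does not state.

```python
import math


def pic50_from_molar(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50), with IC50 expressed in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_nanomolar(ic50_nm: float) -> float:
    """Convenience wrapper, assuming the raw IC50 is reported in nM."""
    return pic50_from_molar(ic50_nm * 1e-9)


def molar_from_pic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)
```

Under this definition, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.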