This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The prediction target is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is C[C@H]1c2ccncc2N(Cc2ccc(C(F)(F)F)o2)C[C@@H](C)N1C(=O)NCc1ccccc1. The target protein (O43497) has sequence MDEEEDGAGAEESGQPRSFMRLNDLSGAGGRPGPGSAEKDPGSADSEAEGLPYPALAPVVFFYLSQDSRPRSWCLRTVCNPWFERISMLVILLNCVTLGMFRPCEDIACDSQRCRILQAFDDFIFAFFAVEMVVKMVALGIFGKKCYLGDTWNRLDFFIVIAGMLEYSLDLQNVSFSAVRTVRVLRPLRAINRVPSMRILVTLLLDTLPMLGNVLLLCFFVFFIFGIVGVQLWAGLLRNRCFLPENFSLPLSVDLERYYQTENEDESPFICSQPRENGMRSCRSVPTLRGDGGGGPPCGLDYEAYNSSSNTTCVNWNQYYTNCSAGEHNPFKGAINFDNIGYAWIAIFQVITLEGWVDIMYFVMDAHSFYNFIYFILLIIVGSFFMINLCLVVIATQFSETKQRESQLMREQRVRFLSNASTLASFSEPGSCYEELLKYLVYILRKAARRLAQVSRAAGVRVGLLSSPAPLGGQETQPSSSCSRSHRRLSVHHLVHHHHH.... The pIC50 is 7.4. (2) The compound is CC(C)(C)C(=O)Nc1cncc(-c2cnc3[nH]nc(-c4nc5c(-c6cccc(F)c6)cncc5[nH]4)c3c2)c1. The target protein (P25054) has sequence MAAASYDQLLKQVEALKMENSNLRQELEDNSNHLTKLETEASNMKEVLKQLQGSIEDEAMASSGQIDLLERLKELNLDSSNFPGVKLRSKMSLRSYGSREGSVSSRSGECSPVPMGSFPRRGFVNGSRESTGYLEELEKERSLLLADLDKEEKEKDWYYAQLQNLTKRIDSLPLTENFSLQTDMTRRQLEYEARQIRVAMEEQLGTCQDMEKRAQRRIARIQQIEKDILRIRQLLQSQATEAERSSQNKHETGSHDAERQNEGQGVGEINMATSGNGQGSTTRMDHETASVLSSSSTHSAPRRLTSHLGTKVEMVYSLLSMLGTHDKDDMSRTLLAMSSSQDSCISMRQSGCLPLLIQLLHGNDKDSVLLGNSRGSKEARARASAALHNIIHSQPDDKRGRREIRVLHLLEQIRAYCETCWEWQEAHEPGMDQDKNPMPAPVEHQICPAVCVLMKLSFDEEHRHAMNELGGLQAIAELLQVDCEMYGLTNDHYSITLRRY.... The pIC50 is 8.2. (3) The compound is Cc1nnc(SCc2cc(=O)c(O)co2)n1N. The target protein (P36025) has sequence MTSLDDSVLTKKNIALLDNATNYIRPAIDYFHFKFNYDSLDVSTTWRLLLKMRKHKLLRLPSCSSENEFDYSIYMARLYHCIWRRWSIKHFNLDEYKIDPLSINWNKEIDVTVLYGPDLVGIHEREQPTPTDFPMGNIKEQGKQLLDVRKEGSASSLLKKGSVFYSKGKWLSQRSISFDDTVRRRDIDKRGRFRESCVLINDVEQFQNYSIVWDESRHRYRRQALPDTYDYEHLYPNGDETPRNTPHDNIIIHQNLHSITEGSYIYIK. 
The pIC50 is 4.0. (4) The compound is OCCN1CCN(c2ccc3ncc(-c4cccc(Cl)c4)n3n2)CC1. The target protein (O35491) has sequence MPHPRRYHSSERGSRGSYHEHYQSRKHKRRRSRSWSSSSDRTRRRRREDSYHVRSRSSYDDHSSDRRLYDRRYCGSYRRNDYSRDRGEAYYDTDFRQSYEYHRENSSYRSQRSSRRKHRRRRRRSRTFSRSSSHSSRRAKSVEDDAEGHLIYHVGDWLQERYEIVSTLGEGTFGRVVQCVDHRRGGTQVALKIIKNVEKYKEAARLEINVLEKINEKDPDNKNLCVQMFDWFDYHGHMCISFELLGLSTFDFLKDNNYLPYPIHQVRHMAFQLCQAVKFLHDNKLTHTDLKPENILFVNSDYELTYNLEKKRDERSVKSTAVRVVDFGSATFDHEHHSTIVSTRHYRAPEVILELGWSQPCDVWSIGCIIFEYYVGFTLFQTHDNREHLAMMERILGPVPSRMIRKTRKQKYFYRGRLDWDENTSAGRYVRENCKPLRRYLTSEAEDHHQLFDLIENMLEYEPAKRLTLGEALQHPFFACLRTEPPNTKLWDSSRDISR. The pIC50 is 5.9. (5) The drug is CN(Cc1cc2c(=O)c(C(=O)NCc3cccc(Cl)c3)cn(C)c2s1)CC(O)c1ccc(F)cc1. The target protein (P08546) has sequence MFFNPYLSGGVTGGAVAGGRRQRSQPGSAQGSGKRPPQKQFLQIVPRGVMFDGQTGLIKHKTGRLPLMFYREIKHLLSHDMVWPCPWRETLVGRVVGPIRFHTYDQTDAVLFFDSPENVSPRYRQHLVPSGNVLRFFGATEHGYSICVNVFGQRSYFYCEYSDTDRLREVIASVGELVPEPRTPYAVSVTPATKTSIYGYGTRPVPDLQCVSISNWTMARKIGEYLLEQGFPVYEVRVDPLTRLVIDRRITTFGWCSVNRYDWRQQGRASTCDIEVDCDVSDLVAVPDDSSWPRYRCLSFDIECMSGEGGFPCAEKSDDIVIQISCVCYETGGNTAVDQGIPNGNDGRGCTSEGVIFGHSGLHLFTIGTCGQVGPDVDVYEFPSEYELLLGFMLFFQRYAPAFVTGYNINSFDLKYILTRLEYLYKVDSQRFCKLPTAQGGRFFLHSPAVGFKRQYAAAFPSASHNNPASTAATKVYIAGSVVIDMYPVCMAKTNSPNYK.... The pIC50 is 6.7. (6) The compound is CCCCCCCCCCCCCCCC[C@@]1(O)C[N+](C)(C)CCO1. The target protein (Q9UKG9) has sequence MENQLAKSTEERTFQYQDSLPSLPVPSLEESLKKYLESVKPFANQEEYKKTEEIVQKFQSGIGEKLHQKLLERAKGKRNWLEEWWLNVAYLDVRIPSQLNVNFAGPAAHFEHYWPPKEGTQLERGSITLWHNLNYWQLLRKEKVPVHKVGNTPLDMNQFRMLFSTCKVPGITRDSIMNYFRTESEGRSPNHIVVLCRGRAFVFDVIHEGCLVTPPELLRQLTYIHKKCHSEPDGPGIAALTSEERTRWAKAREYLIGLDPENLALLEKIQSSLLVYSMEDSSPHVTPEDYSEIIAAILIGDPTVRWGDKSYNLISFSNGVFGCNCDHAPFDAMIMVNISYYVDEKIFQNEGRWKGSEKVRDIPLPEELIFIVDEKVLNDINQAKAQYLREASDLQIAAYAFTSFGKKLTKNKMLHPDTFIQLALQLAYYRLHGHPGCCYETAMTRHFYHGRTETMRSCTVEAVRWCQSMQDPSVNLRERQQKMLQAFAKHNKMMKDCSAG.... The pIC50 is 2.4. 
(7) The target protein (Q8QZV1) has sequence MGAGSSSYRPKAIYLDIDGRIQKVVFSKYCNSSDIMDLFCIATGLPRNTTISLLTTDDAMVSIDPTMPANSERTPYKVRPVAVKQVSEREELVQGVLAQVAEQFSRAFKINELKAEVANHLAMLEKRVELEGLKVVEIEKCKSDIKKMREELAARNNRTNCPCKYSFLDNKKLTPRRDVPTYPKYLLSPETIEALRKPTFDVWLWEPNEMLSCLEHMYHDLGLVRDFSINPITLRRWLLCVHDNYRSNPFHNFRHCFCVTQMMYSMVWLCGLQEKFSQMDILVLMTAAICHDLDHPGYNNTYQINARTELAVRYNDISPLENHHCAIAFQILARPECNIFASVPPEGFRQIRQGMITLILATDMARHAEIMDSFKEKMENFDYSNEEHLTLLKMILIKCCDISNEVRPMEVAEPWVDCLLEEYFMQSDREKSEGLPVAPFMDRDKVTKATAQIGFIKFVLIPMFETVTKLFPIVEETMLRPLWESREHYEELKQLDDAMK.... The drug is COCCCCN(C)[C@H]1CCN(C(=O)c2cc3c(cc2C)[nH]c(=O)c2cnn(C4CCOCC4)c23)C1. The pIC50 is 8.3.
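
The pIC50 values above follow the stated definition, pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion, the Python below maps between IC50 and pIC50. The function names (`ic50_to_pic50`, `pic50_to_ic50`, `ic50_nm_to_pic50`) are hypothetical helpers introduced here for illustration and are not part of the dataset or of BindingDB. The nanomolar variant assumes the input has already been expressed in nM.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (1 nM = 1e-9 M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


if __name__ == "__main__":
    # Entry (1) reports a pIC50 of 7.4; print the corresponding IC50 in nM.
    print(pic50_to_ic50(7.4) * 1e9)
```

Under this definition the pIC50 of 7.4 in entry (1) corresponds to an IC50 of roughly 40 nM, and each unit increase in pIC50 is a tenfold drop in IC50, which is why higher values mean more potent.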