This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCCCC1=NC2(CCCC2)C(=O)N1Cc1ccc(-c2ccccc2C(=O)OC(C)(C)C)cc1. The target protein (P37136) has sequence MRPPWYPLHTPSLASPLLFLLLSLLGGGARAEGREDPQLLVRVRGGQLRGIRLKAPGGPVSAFLGIPFAEPPVGSRRFMPPEPKRPWSGILDATTFQNVCYQYVDTLYPGFEGTEMWNPNRELSEDCLYLNVWTPYPRPTSPTPVLIWIYGGGFYSGASSLDVYDGRFLAQVEGTVLVSMNYRVGTFGFLALPGSREAPGNVGLLDQRLALQWVQENIAAFGGDPMSVTLFGESAGAASVGMHILSLPSRSLFHRAVLQSGTPNGPWATVSAGEARRRATLLARLVGCPPGGAGGNDTELISCLRTRPAQDLVDHEWHVLPQESIFRFSFVPVVDGDFLSDTPDALINTGDFQDLQVLVGVVKDEGSYFLVYGVPGFSKDNESLISRAQFLAGVRIGVPQASDLAAEAVVLHYTDWLHPEDPAHLRDAMSAVVGDHNVVCPVAQLAGRLAAQGARVYAYIFEHRASTLTWPLWMGVPHGYEIEFIFGLPLDPSLNYTVEE.... The pIC50 is 6.6. (2) The drug is COC(=O)c1ccc(-c2ccc(Cl)c(O[C@H]3O[C@H](CO)[C@@H](O)[C@H](O)[C@@H]3O)c2)cc1. The target protein sequence is MKRVITLFAVLLMGWSVNAWSFACKTANGTAIPIGGGSANVYVNLAPAVNVGQNLVVDLSTQIFCHNDYPETITDYVTLQRGAAYGGVLSSFSGTVKYNGSSYPFPTTSETPRVVYNSRTDKPWPVALYLTPVSSAGGVAIKAGSLIAVLILRQTNNYNSDDFQFVWNIYANNDVVVPTGGCDVSARDVTVTLPDYPGSVPIPLTVYCAKSQNLGYYLSGTTADAGNSIFTNTASFSPAQGVGVQLTRNGTIIPANNTVSLGAVGTSAVSLGLTANYARTGGQVTAGNVQSIIGVTFVYQ. The pIC50 is 4.5. (3) The drug is Cc1ccc(OC(=O)c2cccc(Cl)c2)c(C(=O)c2cccc(Cl)c2)c1. The target protein (P59264) has sequence HLLQFRKMIKKMTGKEPIVSYAFYGCYCGKGGRGKPKDATDRCCFVHDCCYEKVTGCDPKWSYYTYSLEDGDIVCEGDPYCTKVKCECDKKAAICFRDNLKTYKNRYMTFPDIFCTDPTEGC. The pIC50 is 5.6. (4) The drug is CN[C@@H](C)C(=O)N[C@H]1CCCC[C@H]2CC[C@@H](C(=O)N[C@@H](c3ccccc3)c3cn(CCCCNC(=O)NCCCCn4cc([C@@H](NC(=O)[C@@H]5CC[C@@H]6CCCC[C@H](NC(=O)[C@H](C)NC)C(=O)N65)c5ccccc5)nn4)nn3)N2C1=O. 
The target protein (Q13490) has sequence MHKTASQRLFPGPSYQNIKSIMEDSTILSDWTNSNKQKMKYDFSCELYRMSTYSTFPAGVPVSERSLARAGFYYTGVNDKVKCFCCGLMLDNWKLGDSPIQKHKQLYPSCSFIQNLVSASLGSTSKNTSPMRNSFAHSLSPTLEHSSLFSGSYSSLSPNPLNSRAVEDISSSRTNPYSYAMSTEEARFLTYHMWPLTFLSPSELARAGFYYIGPGDRVACFACGGKLSNWEPKDDAMSEHRRHFPNCPFLENSLETLRFSISNLSMQTHAARMRTFMYWPSSVPVQPEQLASAGFYYVGRNDDVKCFCCDGGLRCWESGDDPWVEHAKWFPRCEFLIRMKGQEFVDEIQGRYPHLLEQLLSTSDTTGEENADPPIIHFGPGESSSEDAVMMNTPVVKSALEMGFNRDLVKQTVQSKILTTGENYKTVNDIVSALLNAEDEKREEEKEKQAEEMASDDLSLIRKNRMALFQQLTCVLPILDNLLKANVINKQEHDIIKQKT.... The pIC50 is 8.1. (5) The small molecule is CNC(=O)c1c(-c2ccc(F)cc2)oc2cc(N(C)S(C)(=O)=O)c(-c3cc4c(=O)n(Cc5ccc(F)cc5)nnc4cc3C)cc12. The target protein sequence is APITAYSQQTRGLLGCIITSLTGRDKNQVEGEVQVVSTATQSFLATCVNGVCWTVYHGAGSKTLAAPKGPITQMYTNVDQDLVGWPKPPGARSLTPCTCGSSDLYLVTRHADVIPVRRRGDSRGSLLSPRPVSYLKGSSGGPLLCPFGHAVGIFRAAVCTRGVAKAVDFVPVESMETTMRSPVFTDNSSPPAVPQSFQVAHLHAPTGSGKSTKVPAAYAAQGYKVLVLNPSVAATLGFGAYMSKAHGIDPNIRTGVRTITTGAPVTYSTYGKFLADGGCSGGAYDIIICDECHSTDSTTILGIGTVLDQAETAGARLVVLATATPPGSVTVPHPNIEEVALSNTGEIPFYGKAIPIEAIRGGRHLIFCHSKKKCDELAAKLSGLGINAVAYYRGLDVSVIPTIGDVVVVATDALMTGYTGDFDSVIDCNTCVTQTVDFSLDPTFTIETTTVPQDAVSRSQRRGRTGRGRRGIYRFVTPGERPSGMFDSSVLCECYDAGCA.... The pIC50 is 8.2. (6) The drug is CN1CCN(Cc2ccc(-c3c(-c4cccc(OC(F)(F)F)c4)cc4n3C(CCCl)CNC4=O)cc2)CC1. The target protein (P11309) has sequence MLLSKINSLAHLRAAPCNDLHATKLAPGKEKEPLESQYQVGPLLGSGGFGSVYSGIRVSDNLPVAIKHVEKDRISDWGELPNGTRVPMEVVLLKKVSSGFSGVIRLLDWFERPDSFVLILERPEPVQDLFDFITERGALQEELARSFFWQVLEAVRHCHNCGVLHRDIKDENILIDLNRGELKLIDFGSGALLKDTVYTDFDGTRVYSPPEWIRYHRYHGRSAAVWSLGILLYDMVCGDIPFEHDEEIIRGQVFFRQRVSSECQHLIRWCLALRPSDRPTFEEIQNHPWMQDVLLPQETAEIHLHSLSPGPSK. The pIC50 is 7.8. (7) The pIC50 is 6.4. 
The target protein (Q29550) has sequence MWLLPLVLTSLASSATWAGQPASPPVVDTAQGRVLGKYVSLEGLAQPVAVFLGVPFAKPPLGSLRFAPPQPAEPWSFVKNTTSYPPMCCQDPVVEQMTSDLFTNGKERLTLEFSEDCLYLNIYTPADLTKRGRLPVMVWIHGGGLVLGGAPMYDGVVLAAHENVVVVAIQYRLGIWGFFSTGDEHSRGNWGHLDQVAALHWVQENIANFGGDPGSVTIFGESAGGESVSVLVLSPLAKNLFHRAISESGVALTVALVRKDMKAAAKQIAVLAGCKTTTSAVFVHCLRQKSEDELLDLTLKMKFLTLDFHGDQRESHPFLPTVVDGVLLPKMPEEILAEKDFNTVPYIVGINKQEFGWLLPTMMGFPLSEGKLDQKTATSLLWKSYPIANIPEELTPVATDKYLGGTDDPVKKKDLFLDLMGDVVFGVPSVTVARQHRDAGAPTYMYEFQYRPSFSSDKKPKTVIGDHGDEIFSVFGFPLLKGDAPEEEVSLSKTVMKFWA.... The small molecule is O=C(C(=O)c1ccccc1)c1ccccc1.
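The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the choice of nanomolar as the input unit is an assumption about how a raw IC50 might be supplied.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50.

    pIC50 = -log10(IC50 in mol/L), so the value is first scaled from nM to M.
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_nm * 1e-9)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover the IC50 in nanomolar from a pIC50."""
    return (10.0 ** (-pic50)) * 1e9


if __name__ == "__main__":
    # A pIC50 of 6.6, as in example (1), corresponds to an IC50 in the
    # sub-micromolar range; a larger pIC50 means a smaller IC50 (more potent).
    print(round(pic50_to_ic50_nm(6.6), 1))
    print(round(ic50_nm_to_pic50(1000.0), 3))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why regression targets are usually expressed as pIC50 rather than raw concentrations.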