Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 6.1. The target protein (P21396) has sequence MTDLEKPNLAGHMFDVGLIGGGISGLAAAKLLSEYKINVLVLEARDRVGGRTYTVRNEHVKWVDVGGAYVGPTQNRILRLSKELGIETYKVNVNERLVQYVKGKTYPFRGAFPPVWNPLAYLDYNNLWRTMDEMGKEIPVDAPWQARHAQEWDKMTMKDLIDKICWTKTAREFAYLFVNINVTSEPHEVSALWFLWYVRQCGGTARIFSVTNGGQERKFVGGSGQVSEQIMGLLGDKVKLSSPVTYIDQTDDNIIVETLNHEHYECKYVISAIPPILTAKIHFKPELPPERNQLIQRLPMGAVIKCMVYYKEAFWKKKDYCGCMIIEDEEAPIAITLDDTKPDGSLPAIMGFILARKADRQAKLHKDIRKRKICELYAKVLGSQEALYPVHYEEKNWCEEQYSGGCYTAYFPPGIMTQYGRVIRQPVGRIYFAGTETATQWSGYMEGAVEAGERAAREVLNALGKVAKKDIWVEEPESKDVPAIEITHTFLERNLPSVPG.... The compound is C#CC[NH+](C)C/C(=C/F)c1ccccc1. (2) The drug is N#Cc1cccc2c(O[C@H]3CC[C@H](NC(=O)c4ccc(Cl)c(F)c4)CC3)ccnc12. The target protein sequence is MEVQLGLGRVYPRPPSKTYRGAFQNLFQSVREVIQNPGPRHPEAASAAPPGASLLLLQQQQQQQQQQQQQQQQQQQQQETSPRQQQQQQGEDGSPQAHRRGPTGYLVLDEEQQPSQPQSALECHPERGCVPEPGAAVAASKGLPQQLPAPPDEDDSAAPSTLSLLGPTFPGLSSCSADLKDILSEASTMQLLQQQQQEAVSEGSSSGRAREASGAPTSSKDNYLGGTSTISDNAKELCKAVSVSMGLGVEALEHLSPGEQLRGDCMYAPLLGVPPAVRPTPCAPLAECKGSLLDDSAGKSTEDTAEYSPFKGGYTKGLEGESLGCSGSAAAGSSGTLELPSTLSLYKSGALDEAAAYQSRDYYNFPLALAGPPPPPPPPHPHARIKLENPLDYGSAWAAAAAQCRYGDLASLHGAGAAGPGSGSPSAAASSSWHTLFTAEEGQLYGPCGGGGGGGGGGGGGGGGGGGGGGGGEAGAVAPYGYTRPPQGLAGQESDFTAPD.... The pIC50 is 7.2. (3) The compound is Fc1ccc(-n2cc(CCCCN3CCC4(CC3)SCc3ccccc34)c3ccccc32)cc1. The target protein (Q9R0C9) has sequence MPWAVGRRWAWITLFLTIVAVLIQAVWLWLGTQSFVFQREEIAQLARQYAGLDHELAFSRLIVELRRLHPGHVLPDEELQWVFVNAGGWMGAMCLLHASLSEYVLLFGTALGSHGHSGRYWAEISDTIISGTFHQWREGTTKSEVYYPGETVVHGPGEATAVEWGPNTWMVEYGRGVIPSTLAFALSDTIFSTQDFLTLFYTLRAYARGLRLELTTYLFGQDP. The pIC50 is 9.6. (4) The compound is CS/C(Nc1cccc([N+](=O)[O-])c1)=C(/C#N)S(=O)(=O)c1ccccc1. 
The target protein (P9WMK9) has sequence MSDEDRTDRATEDHTIFDRGVGQRDQLQRLWTPYRMNYLAEAPVKRDPNSSASPAQPFTEIPQLSDEEGLVVARGKLVYAVLNLYPYNPGHLMVVPYRRVSELEDLTDLESAELMAFTQKAIRVIKNVSRPHGFNVGLNLGTSAGGSLAEHLHVHVVPRWGGDANFITIIGGSKVIPQLLRDTRRLLATEWARQP. The pIC50 is 5.9. (5) The compound is C#CCN(Cc1ccc2nc(N)[nH]c(=O)c2c1)c1ccc(C(=O)NC(CCC(=O)O)C(=O)O)cc1. The target protein (P04818) has sequence MPVAGSELPRRPLPPAAQERDAEPRPPHGELQYLGQIQHILRCGVRKDDRTGTGTLSVFGMQARYSLRDEFPLLTTKRVFWKGVLEELLWFIKGSTNAKELSSKGVKIWDANGSRDFLDSLGFSTREEGDLGPVYGFQWRHFGAEYRDMESDYSGQGVDQLQRVIDTIKTNPDDRRIIMCAWNPRDLPLMALPPCHALCQFYVVNSELSCQLYQRSGDMGLGVPFNIASYALLTYMIAHITGLKPGDFIHTLGDAHIYLNHIEPLKIQLQREPRPFPKLRILRKVEKIDDFKAEDFQIEGYNPHPTIKMEMAV. The pIC50 is 7.3. (6) The small molecule is CCOC(=O)c1cn(-c2ccc3c(c2)n(CC(F)(F)F)c(=O)n3C)c(=O)n(Cc2cccc(C(F)(F)F)c2C)c1=O. The target protein sequence is MLLPALRLLLFLLGSSAEAGKIIGGTECRPHARPYMAYLEIVTPENHLSACSGFLIRRNFVMTAAHCAGRSITVLLGAHNKKVKEDTWQKLEVEKQFPHPKYDDHLVLNDIMLLKLKEKANLTLGVGTLPISAKSNSIPPGRVCRAVGWGRTNVNEPPSDTLQEVKMRILDPQACKHFEDFHQEPQLCVGNPKKIRNVYKGDSGGPLLCAGIAQGIASYVLRNAKPPSVFTRISHYRPWINKILREN. The pIC50 is 8.3. (7) The drug is CC(C)c1cnc(-c2cc(Cl)ccc2F)cc1Nc1ccncc1C(=O)NCC(N)=O. The target protein (P36896) has sequence MAESAGASSFFPLVVLLLAGSGGSGPRGVQALLCACTSCLQANYTCETDGACMVSIFNLDGMEHHVRTCIPKVELVPAGKPFYCLSSEDLRNTHCCYTDYCNRIDLRVPSGHLKEPEHPSMWGPVELVGIIAGPVFLLFLIIIIVFLVINYHQRVYHNRQRLDMEDPSCEMCLSKDKTLQDLVYDLSTSGSGSGLPLFVQRTVARTIVLQEIIGKGRFGEVWRGRWRGGDVAVKIFSSREERSWFREAEIYQTVMLRHENILGFIAADNKDNGTWTQLWLVSDYHEHGSLFDYLNRYTVTIEGMIKLALSAASGLAHLHMEIVGTQGKPGIAHRDLKSKNILVKKNGMCAIADLGLAVRHDAVTDTIDIAPNQRVGTKRYMAPEVLDETINMKHFDSFKCADIYALGLVYWEIARRCNSGGVHEEYQLPYYDLVPSDPSIEEMRKVVCDQKLRPNIPNWWQSYEALRVMGKMMRECWYANGAARLTALRIKKTLSQLSVQ.... The pIC50 is 7.9.
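
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the code assumes IC50 values are supplied in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 values taken from examples (1) and (3) above.
    for value in (6.1, 9.6):
        print(f"pIC50 {value} -> IC50 of about {pic50_to_ic50_nm(value):.3g} nM")
```

Under this definition each pIC50 unit is a tenfold change in IC50, so the pIC50 of 6.1 in example (1) corresponds to an IC50 of roughly 0.8 µM.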