Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCC(c1ccccc1)c1cc(=O)[nH]c(SCCCCCC(=O)NO)n1. The target protein sequence is MAASGEGVSLPSPAGGEDAHRRRVSYFYEPSIGDYYYGQGHPMKPHRIRMAHSLVVHYGLHRLLELSRPYPASEADIRRFHSDDYVAFLASATGNPGVLDPRAIKRFNVGEDCPVFDGLFPFCQASAGGSIGAAVKLNRGDADITVNWAGGLHHAKKSEASGFCYVNDIVLAILELLKFHRRVLYVDIDVHHGDGVEEAFFTTNRVMTVSFHKYGDFFPGTGHITDVGAAEGKHYALNVPLSDGIDDTTFRGLFQCIIKKVMEVYQPDVVVLQCGADSLAGDRLGCFNLSVKGHADCLRFLRSYNVPMMVLGGGGYTIRNVARCWCYETAVAVGVEPDNKLPYNDYYEYFGPDYTLHIQPKSVENLNTTKDLENIKNMILENLSKIEHVPSTQFHDRPSDPEAPEEKEEDMDKRPPQRSRLWSGGAYDSDTEDPDSLKSEGKDVTANFQMKDEPKDDL. The pIC50 is 7.5. (2) The drug is CN(c1ccncc1)n1ccc2cc(Br)ccc21. The target protein (P37136) has sequence MRPPWYPLHTPSLASPLLFLLLSLLGGGARAEGREDPQLLVRVRGGQLRGIRLKAPGGPVSAFLGIPFAEPPVGSRRFMPPEPKRPWSGILDATTFQNVCYQYVDTLYPGFEGTEMWNPNRELSEDCLYLNVWTPYPRPTSPTPVLIWIYGGGFYSGASSLDVYDGRFLAQVEGTVLVSMNYRVGTFGFLALPGSREAPGNVGLLDQRLALQWVQENIAAFGGDPMSVTLFGESAGAASVGMHILSLPSRSLFHRAVLQSGTPNGPWATVSAGEARRRATLLARLVGCPPGGAGGNDTELISCLRTRPAQDLVDHEWHVLPQESIFRFSFVPVVDGDFLSDTPDALINTGDFQDLQVLVGVVKDEGSYFLVYGVPGFSKDNESLISRAQFLAGVRIGVPQASDLAAEAVVLHYTDWLHPEDPAHLRDAMSAVVGDHNVVCPVAQLAGRLAAQGARVYAYIFEHRASTLTWPLWMGVPHGYEIEFIFGLPLDPSLNYTVEE.... The pIC50 is 4.0. (3) The drug is CCCC[C@H](NC(=O)[C@@H](NC(=O)[C@@H](N)Cc1ccc(O)cc1)C(C)C)C(=O)NCC(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](Cc1ccccc1)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@H](Cc1ccc2ccccc2c1)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N[C@@H](Cc1ccccc1)C(=O)NCC(=O)ON. 
The target protein (P33032) has sequence MNSSFHLHFLDLNLNATEGNLSGPNVKNKSSPCEDMGIAVEVFLTLGVISLLENILVIGAIVKNKNLHSPMYFFVCSLAVADMLVSMSSAWETITIYLLNNKHLVIADAFVRHIDNVFDSMICISVVASMCSLLAIAVDRYVTIFYALRYHHIMTARRSGAIIAGIWAFCTGCGIVFILYSESTYVILCLISMFFAMLFLLVSLYIHMFLLARTHVKRIAALPGASSARQRTSMQGAVTVTMLLGVFTVCWAPFFLHLTLMLSCPQNLYCSRFMSHFNMYLILIMCNSVMDPLIYAFRSQEMRKTFKEIICCRGFRIACSFPRRD. The pIC50 is 8.3. (4) The small molecule is CCCC(=O)c1cnn(-c2ccc(NC(=O)c3cn(CC(=O)N4CCN5CCCC5C4)c4ccc(C)cc34)cc2)c1C. The target protein (Q9EPX4) has sequence MEVPGANATSANTTSIPGTSTLCSRDYKITQVLFPLLYTVLFFAGLITNSLAMRIFFQIRSKSNFIIFLKNTVISDLLMILTFPFKILSDAKLGAGHLRTLVCQVTSVTFYFTMYISISFLGLITIDRYLKTTRPFKTSSPSNLLGAKILSVAIWAFMFLLSLPNMILTNRRPKDKDITKCSFLKSEFGLVWHEIVNYICQVIFWINFLIVIVCYSLITKELYRSYVRTRGSAKAPKKRVNIKVFIIIAVFFICFVPFHFARIPYTLSQTRAVFDCNAENTLFYVKESTLWLTSLNACLDPFIYFFLCKSFRNSLMSMLRCSTSGANKKKGQEGGDPSEETPM. The pIC50 is 6.2. (5) The small molecule is O=C1OCc2cc([C@H]3CN4CCN(C(=O)C5CCc6nc(-n7cnnn7)ccc65)C[C@H]4CO3)ccc21. The target protein (P35560) has sequence MGASERSVFRVLIRALTERMFKHLRRWFITHIFGRSRQRARLVSKEGRCNIEFGNVDAQSRFIFFVDIWTTVLDLKWRYKMTVFITAFLGSWFLFGLLWYVVAYVHKDLPEFYPPDNRTPCVENINGMTSAFLFSLETQVTIGYGFRFVTEQCATAIFLLIFQSILGVIINSFMCGAILAKISRPKKRAKTITFSKNAVISKRGGKLCLLIRVANLRKSLLIGSHIYGKLLKTTITPEGETIILDQTNINFVVDAGNENLFFISPLTIYHIIDHNSPFFHMAAETLSQQDFELVVFLDGTVESTSATCQVRTSYVPEEVLWGYRFVPIVSKTKEGKYRVDFHNFGKTVEVETPHCAMCLYNEKDARARMKRGYDNPNFVLSEVDETDDTQM. The pIC50 is 6.4. (6) The drug is O=C([O-])C/C(=C\c1cccc(C(F)(F)F)c1)c1nc2ccccc2s1. 
The target protein sequence is MLAPGSSRVELFKRQSSKVPFEKDGKVTERVVHSFRLPALVNVDGVMVAIADARYETSNDNSLIDTVAKYSVDDGETWETQIAIKNSRASSVSRVVDPTVIVKGNKLYVLVGSYNSSRSYWTSHGDARDWDILLAVGEVTKSTAGGKITASIKWGSPVSLKEFFPAEMEGMHTNQFLGGAGVAIVASNGNLVYPVQVTNKKKQVFSKIFYSEDEGKTWKFGKGRSAFGCSEPVALEWEGKLIINTRVDYRRRLVYESSDMGNSWLEAVGTLSRVWGPSPKSNQPGSQSSFTAVTIEGMRVMLFTHPLNFKGRWLRDRLNLWLTDNQRIYNVGQVSIGDENSAYSSVLYKDDKLYCLHEINSNEVYSLVFARLVGELRIIKSVLQSWKNWDSHLSSICTPADPAASSSERGCGPAVTTVGLVGFLSHSATKTEWEDAYRCVNASTANAERVPNGLKFAGVGGGALWPVSQQGQNQRYRFANHAFTVVASVTIHEVPSVASP.... The pIC50 is 3.8. (7) The small molecule is C=CCCC/C=C\C/C=C\C/C=C\C/C=C\C/C=C\CC. The target protein (P24095) has sequence MFGIFDKGQKIKGTVVLMPKNVLDFNAITSIGKGGVIDTATGILGQGVSLVGGVIDTATSFLGRNISMQLISATQTDGSGNGKVGKEVYLEKHLPTLPTLGARQDAFSIFFEWDASFGIPGAFYIKNFMTDEFFLVSVKLEDIPNHGTIEFVCNSWVYNFRSYKKNRIFFVNDTYLPSATPAPLLKYRKEELEVLRGDGTGKRKDFDRIYDYDVYNDLGNPDGGDPRPILGGSSIYPYPRRVRTGRERTRTDPNSEKPGEVYVPRDENFGHLKSSDFLTYGIKSLSHDVIPLFKSAIFQLRVTSSEFESFEDVRSLYEGGIKLPTDILSQISPLPALKEIFRTDGENVLQFPPPHVAKVSKSGWMTDEEFAREVIAGVNPNVIRRLQEFPPKSTLDPTLYGDQTSTITKEQLEINMGGVTVEEALSTQRLFILDYQDAFIPYLTRINSLPTAKAYATRTILFLKDDGTLKPLAIELSKPHPDGDNLGPESIVVLPATEGV.... The pIC50 is 5.3. (8) The compound is CC1=CC(C)(C)Nc2cc3c(cc21)-c1cccc(F)c1C3. The pIC50 is 5.8. The target protein (P06401) has sequence MTELKAKGPRAPHVAGGPPSPEVGSPLLCRPAAGPFPGSQTSDTLPEVSAIPISLDGLLFPRPCQGQDPSDEKTQDQQSLSDVEGAYSRAEATRGAGGSSSSPPEKDSGLLDSVLDTLLAPSGPGQSQPSPPACEVTSSWCLFGPELPEDPPAAPATQRVLSPLMSRSGCKVGDSSGTAAAHKVLPRGLSPARQLLLPASESPHWSGAPVKPSPQAAAVEVEEEDGSESEESAGPLLKGKPRALGGAAAGGGAAAVPPGAAAGGVALVPKEDSRFSAPRVALVEQDAPMAPGRSPLATTVMDFIHVPILPLNHALLAARTRQLLEDESYDGGAGAASAFAPPRSSPCASSTPVAVGDFPDCAYPPDAEPKDDAYPLYSDFQPPALKIKEEEEGAEASARSPRSYLVAGANPAAFPDFPLGPPPPLPPRATPSRPGEAAVTAAPASASVSSASSSGSTLECILYKAEGAPPQQGPFAPPPCKAPGASGCLLPRDGLPSTSA....
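As a note on the label scale defined in the task description (pIC50 = -log10(IC50 in M)), the conversion can be sketched as follows. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the nanomolar input unit is an assumption about how the raw IC50 values might be stored.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar (assumed unit) to pIC50."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the transform: pIC50 back to IC50 in nanomolar."""
    return 10.0 ** (-pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # A pIC50 of 7.5 (record 1) corresponds to an IC50 of about 31.6 nM.
    print(round(ic50_nm_from_pic50(7.5), 1))
    # A pIC50 of 4.0 (record 2) corresponds to about 100 micromolar.
    print(round(ic50_nm_from_pic50(4.0) / 1000.0, 1))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, so the compound in record 3 (8.3) is roughly 20,000 times more potent than the one in record 2 (4.0).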