Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CCCCCCOC(=O)[C@]1(O)C[C@@H]2O[C@@]1(C)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)C(=O)NC4. The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAHLDQFERIKTLGTGSFGRVMLVKHMETGNHYAMKILDKQKVVKLKQIEHTLNEKRILQAVNFPFLVKLEFSFKDNSNLYMVMEYVPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIQVADFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAAGYPPFFADQPIQIYEKIVSGKVRFPSHFSSDLKDLLRNLLQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The pIC50 is 7.3. (2) The drug is COc1cc(F)ccc1Oc1cc(Cl)ccc1C(=O)Nc1cc[nH]c(=O)c1. The target protein (Q62968) has sequence MELPFASVGTTNFRRFTPESLAEIEKQIAAHRAAKKARTKHRGQEDKGEKPRPQLDLKACNQLPKFYGELPAELVGEPLEDLDPFYSTHRTFMVLNKSRTISRFSATWALWLFSPFNLIRRTAIKVSVHSWFSIFITITILVNCVCMTRTDLPEKVEYVFTVIYTFEALIKILARGFCLNEFTYLRDPWNWLDFSVITLAYVGAAIDLRGISGLRTFRVLRALKTVSVIPGLKVIVGALIHSVRKLADVTILTVFCLSVFALVGLQLFKGNLKNKCIRNGTDPHKADNLSSEMAEYIFIKPGTTDPLLCGNGSDAGHCPGGYVCLKTPDNPDFNYTSFDSFAWAFLSLFRLMTQDSWERLYQQTLRASGKMYMVFFVLVIFLGSFYLVNLILAVVTMAYEEQSQATIAEIEAKEKKFQEALEVLQKEQEVLAALGIDTTSLQSHSGSPLASKNANERRPRVKSRVSEGSTDDNRSPQSDPYNQRRMSFLGLSSGRRRASH.... The pIC50 is 7.8. (3) The compound is c1ccc(Cn2cc(-c3ccccc3)nn2)cc1. 
The target protein (O88420) has sequence MAARLLAPPGPDSFKPFTPESLANIERRIAESKLKKPPKADGSHREDDEDSKPKPNSDLEAGKSLPFIYGDIPQGLVAVPLEDFDPYYLTQKTFVVLNRGKTLFRFSATPALYILSPFNLIRRIAIKILIHSVFSMIIMCTILTNCVFMTFSNPPEWSKNVEYTFTGIYTFESLVKIIARGFCIDGFTFLRDPWNWLDFSVIMMAYVTEFVDLGNVSALRTFRVLRALKTISVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALIGLQLFMGNLRNKCVVWPINFNESYLENGTRGFDWEEYINNKTNFYMVPGMLEPLLCGNSSDAGQCPEGFQCMKAGRNPNYGYTSFDTFSWAFLALFRLMTQDYWENLYQLTLRAAGKTYMIFFVLVIFVGSFYLVNLILAVVAMAYEEQNQATLEEAEQKEAEFKAMLEQLKKQQEEAQAAAMATSAGTVSEDAIEEEGEDGVGSPRSSSELSKLSSKSAKERRNRRKKRKQ.... The pIC50 is 3.9. (4) The small molecule is O=C(c1ccc(Cl)cc1)N1CCC2(CCN(Cc3ccccc3Oc3ccccc3)CC2)CC1. The target protein (P51685) has sequence MDYTLDLSVTTVTDYYYPDIFSSPCDAELIQTNGKLLLAVFYCLLFVFSLLGNSLVILVLVVCKKLRSITDVYLLNLALSDLLFVFSFPFQTYYLLDQWVFGTVMCKVVSGFYYIGFYSSMFFITLMSVDRYLAVVHAVYALKVRTIRMGTTLCLAVWLTAIMATIPLLVFYQVASEDGVLQCYSFYNQQTLKWKIFTNFKMNILGLLIPFTIFMFCYIKILHQLKRCQNHNKTKAIRLVLIVVIASLLFWVPFNVVLFLTSLHSMHILDGCSISQQLTYATHVTEIISFTHCCVNPVIYAFVGEKFKKHLSEIFQKSCSQIFNYLGRQMPRESCEKSSSCQQHSSRSSSVDYIL. The pIC50 is 7.1. (5) The drug is CC1CCN(C(=O)c2csc(Nc3ccc(Cl)cc3F)n2)CC1. The target protein (Q13507) has sequence MREKGRRQAVRGPAFMFNDRGTSLTAEEERFLDAAEYGNIPVVRKMLEESKTLNVNCVDYMGQNALQLAVGNEHLEVTELLLKKENLARIGDALLLAISKGYVRIVEAILNHPGFAASKRLTLSPCEQELQDDDFYAYDEDGTRFSPDITPIILAAHCQKYEVVHMLLMKGARIERPHDYFCKCGDCMEKQRHDSFSHSRSRINAYKGLASPAYLSLSSEDPVLTALELSNELAKLANIEKEFKNDYRKLSMQCKDFVVGVLDLCRDSEEVEAILNGDLESAEPLEVHRHKASLSRVKLAIKYEVKKFVAHPNCQQQLLTIWYENLSGLREQTIAIKCLVVLVVALGLPFLAIGYWIAPCSRLGKILRSPFMKFVAHAASFIIFLGLLVFNASDRFEGITTLPNITVTDYPKQIFRVKTTQFTWTEMLIMVWVLGMMWSECKELWLEGPREYILQLWNVLDFGMLSIFIAAFTARFLAFLQATKAQQYVDSYVQESDLSE.... The pIC50 is 5.8. (6) The drug is O=c1[nH]c(O)ccc1N=Nc1ccc(S(=O)(=O)O)cc1. 
The target protein (P24941) has sequence MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLADFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL. The pIC50 is 4.3.
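The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the sketch assumes the IC50 has already been converted to molar units.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform and report the IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # pIC50 labels taken from records (1), (3), and (6) above
    for label in (7.3, 3.9, 4.3):
        print(f"pIC50 {label} corresponds to IC50 of {pic50_to_ic50_nm(label):.1f} nM")
```

Under this definition, the label 7.3 in record (1) corresponds to an IC50 of roughly 50 nM, while the label 3.9 in record (3) corresponds to roughly 126 µM, which is why a higher pIC50 means a more potent compound.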