This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is C=CC(=O)N[C@H]1CC[C@@H](n2nc(-c3ccc(Oc4ccccc4)cc3)c3c(N)ncnc32)CC1. The target protein (P51813) has sequence MDTKSILEELLLKRSQQKKKMSPNNYKERLFVLTKTNLSYYEYDKMKRGSRKGSIEIKKIRCVEKVNLEEQTPVERQYPFQIVYKDGLLYVYASNEESRSQWLKALQKEIRGNPHLLVKYHSGFFVDGKFLCCQQSCKAAPGCTLWEAYANLHTAVNEEKHRVPTFPDRVLKIPRAVPVLKMDAPSSSTTLAQYDNESKKNYGSQPPSSSTSLAQYDSNSKKIYGSQPNFNMQYIPREDFPDWWQVRKLKSSSSSEDVASSNQKERNVNHTTSKISWEFPESSSSEEEENLDDYDWFAGNISRSQSEQLLRQKGKEGAFMVRNSSQVGMYTVSLFSKAVNDKKGTVKHYHVHTNAENKLYLAENYCFDSIPKLIHYHQHNSAGMITRLRHPVSTKANKVPDSVSLGNGIWELKREEITLLKELGSGQFGVVQLGKWKGQYDVAVKMIKEGSMSEDEFFQEAQTMMKLSHPKLVKFYGVCSKEYPIYIVTEYISNGCLLNY.... The pIC50 is 8.9. (2) The small molecule is CNC(=O)Oc1cccc(C)c1. The target protein sequence is MARSVRTPISPSSSSSSRSSWSSPSSSSFYSLLSSFKASLTRPSSSSSVAHHLAARNNDICRGLFATLVILLRMSALTSAMTDHLTVQTTSGPVRGRSVTVQGRDVHVFTGIPYAKPPVDDLRFRKPVPAEPWHGVLDATRLPATCVQERYEYFPGFSGEEMWNPNTNVSEDCLFMNIWAPAKARLRHGRGTNGGEHSSKTDQDHLIHSATPQNTTNGLPILIWIYGGGFMTGSATLDIYNAEIMSAVGNVIVASFQYRVGAFGFLHLSPVMPGFEEEAPGNVGLWDQALALRWLKENARAFGGNPEWMTLFGGSSGSSSVNAQLMSPVTRGLVKRGMMQSATMNAPWSHMTSEKAVEIGKALVNDCNCNASLLPENPQAVMACMRQVDAKTISVQQWNSYSGILSYPSAPTIDGAFLPADPMTLLKTADLSGYDILIGNVKDEGAYFLLYDFIDYFDKDDATSLPRDKYLEIMNNIFQKASQAEREAIIFQYTSWEGNP.... The pIC50 is 3.8. (3) The small molecule is C=CC(=O)Nc1cccc(Nc2nc(Nc3ccc(NC4CN(C(C)=O)C4)cc3OC)ncc2C(F)(F)F)c1. 
The target protein sequence is MRRRHIVRKRTLRRLLQERELVEPLTPSGEAPNQALLRILKETEFKKIKVLGSGAFGTVYKGLWIPEGEKVKIPVAIKELREATSPKANKEILDEAYVMASVDNPHVCRLLGICLTSTVQLIMQLMPFGCLLDYVREHKDNIGSQYLLNWCVQIAKGMNYLEDRRLVHRDLAARNVLVKTPQHVKITDFGRAKLLGAEEKEYHAEGGKVPIKWMALESILHRIYTHQSDVWSYGVTVWELMTFGSKPYDGIPASEISSILEKGERLPQPPICTIDVYMIMVKCWMIDADSRPKFRELIIEFSKMARDPQRYLVIQGDERMHLPSPTDSNFYRALMDEEDMDDVVDADEYLIPQQGFFSSPSTSRTPLLSSLSATSNNSTVACIDRNGLQSCPIKEDSFLQRYSSDPTGALTEDSIDDTFLPVPEYINQSVPKRPAGSVQNPVYHNQPLNPAPSRDPHYQDPHSTAVGNPEYLNTVQPTCVNSTFDSPAHWAQKGSHQISL.... The pIC50 is 9.0. (4) The small molecule is Cc1csc2nc([C@H](C)NC(=O)/C=C/c3c(Cl)cccc3Cl)oc(=O)c12. The target protein (P24433) has sequence MSKVWVGGFLCVYGEEPSEECLALPRDTVQKELGSGNIPLPLNINHNEKATIGMVRGLFDLEHGLFCVAQIQSQTFMDIIRNIAGKSKLITAGSVIEPLPPDPEIECLSSSFPGLSLSSKVLQDENLDGKPFFHHVSVCGVGRRPGTIAIFGREISWILDRFSCISESEKRQVLEGVNVYSQGFDENLFSADLYDLLADSLDTSYIRKRFPKLQLDKQLCGLSKCTYIKASEPPVEIIVAATKVAGDQVQLTTEPGSELAVETCDVPVVHGNYDAVESATATTAMSNQNLPNTTPLLSSPPFSDCVFLPKDAFFSLLNVTTGQQPKIVPPVSVHPPVTEQYQMLPYSESAAKIAEHESNRYHSPCQAMYPYWQYSPVPQYPAALHGYRQSKTLKKRHFQSDSEDELSFPGDPEYTKKRRRHRVDNDDDKEMAREKNDLRELVDMIGMLRQEISALKHVRAQSPQRHIVPMETLPTIEEKGAASPKPSILNASLAPETVNR.... The pIC50 is 8.0. (5) The small molecule is CCOC(=O)c1ccc(NC(=O)OC(CN2CCCc3ccccc32)c2ccc(Cl)cc2)cc1. The target protein sequence is PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICAELEEEGKISRIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSIPLDEDFRKYTAFTIPSTNNETPGTRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYVDDLYVGSDLEIGQHRTKVEELRQHLWRWGFYTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQK. The pIC50 is 4.9. (6) The compound is COc1cc2nc(N3CCCC3)nc(NCCCCCN3CCCC3)c2cc1OC. 
The target protein sequence is MKWLGESKIMVVNGRRNGGKLSNDHQQNQSKLQHTGKDTLKAGKNAVERRSNRCNGNSGFEGQSRYVPSSGMSAKELCENDDLATSLVLDPYLGFQTHKMNTSAFPSRSSRHFSKSDSFSHNNPVRFRPIKGRQEELKEVIERFKKDEHLEKAFKCLTSGEWARHYFLNKNKMQEKLFKEHVFIYLRMFATDSGFEILPCNRYSSEQNGAKIVATKEWKRNDKIELLVGCIAELSEIEENMLLRHGENDFSVMYSTRKNCAQLWLGPAAFINHDCRPNCKFVSTGRDTACVKALRDIEPGEEISCYYGDGFFGENNEFCECYTCERRGTGAFKSRVGLPAPAPVINSKYGLRETDKRLNRLKKLGDSSKNSDSQSVSSNTDADTTQEKNNATSNRKSSVGVKKNSKSRTLTRQSMSRIPASSNSTSSKLTHINNSRVPKKLKKPAKPLLSKIKLRNHCKRLEQKNASRKLEMGNLVLKEPKVVLYKNLPIKKDKEPEGPA.... The pIC50 is 4.0. (7) The compound is Cc1c(C(=O)Nc2cccc([N+](=O)[O-])c2)sc(N)c1C(=O)OC(N)=O. The target protein (Q8WTS6) has sequence MDSDDEMVEEAVEGHLDDDGLPHGFCTVTYSSTDRFEGNFVHGEKNGRGKFFFFDGSTLEGYYVDDALQGQGVYTYEDGGVLQGTYVDGELNGPAQEYDTDGRLIFKGQYKDNIRHGVCWIYYPDGGSLVGEVNEDGEMTGEKIAYVYPDERTALYGKFIDGEMIEGKLATLMSTEEGRPHFELMPGNSVYHFDKSTSSCISTNALLPDPYESERVYVAESLISSAGEGLFSKVAVGPNTVMSFYNGVRITHQEVDSRDWALNGNTLSLDEETVIDVPEPYNHVSKYCASLGHKANHSFTPNCIYDMFVHPRFGPIKCIRTLRAVEADEELTVAYGYDHSPPGKSGPEAPEWYQVELKAFQATQQK. The pIC50 is 4.3.
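
The pIC50 definition given above (pIC50 = -log10(IC50 in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the only assumption is that IC50 is expressed in molar units before taking the logarithm.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)  # back to mol/L
    return ic50_molar * 1e9        # mol/L -> nmol/L


if __name__ == "__main__":
    # Labels taken from the records above: higher pIC50 means a lower IC50.
    for label in (9.0, 8.0, 4.0):
        print(f"pIC50 {label} corresponds to IC50 of about {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, a label of 9.0 corresponds to an IC50 of about 1 nM, while a label of 4.0 corresponds to about 100 µM, which is why higher values indicate more potent compounds.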