Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=c1oc2ccccc2n1CCCCCN1CCN(CCn2c(=O)oc3ccccc32)CC1. The target protein (P29477) has sequence MACPWKFLFKVKSYQSDLKEEKDINNNVKKTPCAVLSPTIQDDPKSHQNGSPQLLTGTAQNVPESLDKLHVTSTRPQYVRIKNWGSGEILHDTLHHKATSDFTCKSKSCLGSIMNPKSLTRGPRDKPTPLEELLPHAIEFINQYYGSFKEAKIEEHLARLEAVTKEIETTGTYQLTLDELIFATKMAWRNAPRCIGRIQWSNLQVFDARNCSTAQEMFQHICRHILYATNNGNIRSAITVFPQRSDGKHDFRLWNSQLIRYAGYQMPDGTIRGDAATLEFTQLCIDLGWKPRYGRFDVLPLVLQADGQDPEVFEIPPDLVLEVTMEHPKYEWFQELGLKWYALPAVANMLLEVGGLEFPACPFNGWYMGTEIGVRDFCDTQRYNILEEVGRRMGLETHTLASLWKDRAVTEINVAVLHSFQKQNVTIMDHHTASESFMKHMQNEYRARGGCPADWIWLVPPVSGSITPVFHQEMLNYVLSPFYYYQIEPWKTHIWQNEKL.... The pIC50 is 5.3. (2) The compound is CCCCCCCCC#Cc1cc2cn([C@H]3C[C@H](O)[C@@H](CO)O3)c(=O)nc2o1. The target protein (P09250) has sequence MSTDKTDVKMGVLRIYLDGAYGIGKTTAAEEFLHHFAITPNRILLIGEPLSYWRNLAGEDAICGIYGTQTRRLNGDVSPEDAQRLTAHFQSLFCSPHAIMHAKISALMDTSTSDLVQVNKEPYKIMLSDRHPIASTICFPLSRYLVGDMSPAALPGLLFTLPAEPPGTNLVVCTVSLPSHLSRVSKRARPGETVNLPFVMVLRNVYIMLINTIIFLKTNNWHAGWNTLSFCNDVFKQKLQKSECIKLREVPGIEDTLFAVLKLPELCGEFGNILPLWAWGMETLSNCSRSMSPFVLSLEQTPQHAAQELKTLLPQMTPANMSSGAWNILKELVNAVQDNTS. The pIC50 is 3.7. (3) The compound is Cc1cccnc1NC(=S)Nc1cccc(Cl)c1. 
The target protein (P07374) has sequence MKLSPREVEKLGLHNAGYLAQKRLARGVRLNYTEAVALIASQIMEYARDGEKTVAQLMCLGQHLLGRRQVLPAVPHLLNAVQVEATFPDGTKLVTVHDPISRENGELQEALFGSLLPVPSLDKFAETKEDNRIPGEILCEDECLTLNIGRKAVILKVTSKGDRPIQVGSHYHFIEVNPYLTFDRRKAYGMRLNIAAGTAVRFEPGDCKSVTLVSIEGNKVIRGGNAIADGPVNETNLEAAMHAVRSKGFGHEEEKDASEGFTKEDPNCPFNTFIHRKEYANKYGPTTGDKIRLGDTNLLAEIEKDYALYGDECVFGGGKVIRDGMGQSCGHPPAISLDTVITNAVIIDYTGIIKADIGIKDGLIASIGKAGNPDIMNGVFSNMIIGANTEVIAGEGLIVTAGAIDCHVHYICPQLVYEAISSGITTLVGGGTGPAAGTRATTCTPSPTQMRLMLQSTDDLPLNFGFTGKGSSSKPDELHEIIKAGAMGLKLHEDWGSTPA.... The pIC50 is 4.8. (4) The drug is COc1cccc(OC)c1-c1nn2c(-c3cc(-c4ccccc4)n[nH]3)nnc2s1. The target is TRQARRNRRRRWRERQR. The pIC50 is 4.1. (5) The compound is O=C(Nc1cccc(Cl)c1)c1ccc(OCCCN2CCCCC2)cc1O. The target protein sequence is MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK.... The pIC50 is 7.2. (6) The compound is OC(CN(Cc1cccc(OC(F)(F)C(F)F)c1)c1cccc(Oc2ccc(-c3ccccc3)cc2)c1)C(F)(F)F. The target protein (P22687) has sequence ACPKGASYEAGIVCRITKPALLVLNQETAKVVQTAFQRAGYPDVSGERAVMLLGRVKYGLHNLQISHLSIASSQVELVDAKTIDVAIQNVSVVFKGTLNYSYTSAWGLGINQSVDFEIDSAIDLQINTELTCDAGSVRTNAPDCYLAFHKLLLHLQGEREPGWLKQLFTNFISFTLKLILKRQVCNEINTISNIMADFVQTRAASILSDGDIGVDISVTGAPVITATYLESHHKGHFTHKNVSEAFPLRAFPPGLLGDSRMLYFWFSDQVLNSLARAAFQEGRLVLSLTGDEFKKVLETQGFDTNQEIFQELSRGLPTGQAQVAVHCLKVPKISCQNRGVVVSSSVAVTFRFPRPDGREAVAYRFEEDIITTVQASYSQKKLFLHLLDFQCVPASGRAGSSANLSVALRTEAKAVSNLTESRSESLQSSLRSLIATVGIPEVMSRLEVAFTALMNSKGLDLFEIINPEIITLDGCLLLQMDFGFPKHLLVDFLQSLS. The pIC50 is 5.6. 
(7) The drug is O=C(O)Cn1nnc(-c2cc(N3CCC(Oc4cc(F)ccc4Br)CC3)no2)n1. The target protein (O00767) has sequence MPAHLLQDDISSSYTTTTTITAPPSRVLQNGGDKLETMPLYLEDDIRPDIKDDIYDPTYKDKEGPSPKVEYVWRNIILMSLLHLGALYGITLIPTCKFYTWLWGVFYYFVSALGITAGAHRLWSHRSYKARLPLRLFLIIANTMAFQNDVYEWARDHRAHHKFSETHADPHNSRRGFFFSHVGWLLVRKHPAVKEKGSTLDLSDLEAEKLVMFQRRYYKPGLLMMCFILPTLVPWYFWGETFQNSVFVATFLRYAVVLNATWLVNSAAHLFGYRPYDKNISPRENILVSLGAVGEGFHNYHHSFPYDYSASEYRWHINFTTFFIDCMAALGLAYDRKKVSKAAILARIKRTGDGNYKSG. The pIC50 is 6.0.
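The task description defines pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion, the helper names below (`ic50_to_pic50`, `pic50_to_ic50_nm`) are hypothetical and are not part of the dataset or of any BindingDB tooling. The sketch assumes IC50 is supplied in molar units, as the definition states.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in molar (M) to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        # log10 is undefined for non-positive concentrations
        raise ValueError("IC50 must be a positive concentration in M")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the definition and report the IC50 in nanomolar (nM)."""
    # IC50 in M is 10**(-pIC50); multiplying by 1e9 converts M to nM
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # pIC50 values quoted in the examples above
    for value in (5.3, 3.7, 4.8, 7.2):
        print(f"pIC50 {value} -> IC50 of about {pic50_to_ic50_nm(value):.3g} nM")
```

Under this definition, each unit increase in pIC50 is a tenfold drop in IC50, so the pIC50 of 7.2 in example (5) corresponds to an IC50 of roughly 63 nM, while the pIC50 of 5.3 in example (1) corresponds to roughly 5 µM.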