This data is from Drug-target binding data from BindingDB using IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COc1ccc(NCc2nc(-c3ccncc3)n[nH]2)cc1. The target protein (P25098) has sequence MADLEAVLADVSYLMAMEKSKATPAARASKKILLPEPSIRSVMQKYLEDRGEVTFEKIFSQKLGYLLFRDFCLNHLEEARPLVEFYEEIKKYEKLETEEERVARSREIFDSYIMKELLACSHPFSKSATEHVQGHLGKKQVPPDLFQPYIEEICQNLRGDVFQKFIESDKFTRFCQWKNVELNIHLTMNDFSVHRIIGRGGFGEVYGCRKADTGKMYAMKCLDKKRIKMKQGETLALNERIMLSLVSTGDCPFIVCMSYAFHTPDKLSFILDLMNGGDLHYHLSQHGVFSEADMRFYAAEIILGLEHMHNRFVVYRDLKPANILLDEHGHVRISDLGLACDFSKKKPHASVGTHGYMAPEVLQKGVAYDSSADWFSLGCMLFKLLRGHSPFRQHKTKDKHEIDRMTLTMAVELPDSFSPELRSLLEGLLQRDVNRRLGCLGRGAQEVKESPFFRSLDWQMVFLQKYPPPLIPPRGEVNAADAFDIGSFDEEDTKGIKLLD.... The pIC50 is 6.4. (2) The drug is CC(C)(C)c1ccc(C(=O)C[C@H]2CC[C@@H](O)C3[C@H](O)[C@H](O)CN32)cc1. The target protein (Q24451) has sequence MLRIRRRFALVICSGCLLVFLSLYIILNFAAPAATQIKPNYENIENKLHELENGLQEHGEEMRNLRARLAETSNRDDPIRPPLKVARSPRPGQCQDVVQDVPNVDVQMLELYDRMSFKDIDGGVWKQGWNIKYDPLKYNAHHKLKVFVVPHSHNDPGWIQTFEEYYQHDTKHILSNALRHLHDNPEMKFIWAEISYFARFYHDLGENKKLQMKSIVKNGQLEFVTGGWVMPDEANSHWRNVLLQLTEGQTWLKQFMNVTPTASWAIDPFGHSPTMPYILQKSGFKNMLIQRTHYSVKKELAQQRQLEFLWRQIWDNKGDTALFTHMMPFYSYDIPHTCGPDPKVCCQFDFKRMGSFGLSCPWKVPPRTISDQNVAARSDLLVDQWKKKAELYRTNVLLIPLGDDFRFKQNTEWDVQRVNYERLFEHINSQAHFNVQAQFGTLQEYFDAVHQAERAGQAEFPTLSGDFFTYADRSDNYWSGYYTSRPYHKRMDRVLMHYVR.... The pIC50 is 7.5. (3) The compound is CN1Cc2c(c3c4ccccc4n(C)c3c3c2c2ccccc2n3CCC#N)C1=O. 
The target protein (P00516) has sequence MSELEEDFAKILMLKEERIKELEKRLSEKEEEIQELKRKLHKCQSVLPVPSTHIGPRTTRAQGISAEPQTYRSFHDLRQAFRKFTKSERSKDLIKEAILDNDFMKNLELSQIQEIVDCMYPVEYGKDSCIIKEGDVGSLVYVMEDGKVEVTKEGVKLCTMGPGKVFGELAILYNCTRTATVKTLVNVKLWAIDRQCFQTIMMRTGLIKHTEYMEFLKSVPTFQSLPEEILSKLADVLEETHYENGEYIIRQGARGDTFFIISKGKVNVTREDSPNEDPVFLRTLGKGDWFGEKALQGEDVRTANVIAAEAVTCLVIDRDSFKHLIGGLDDVSNKAYEDAEAKAKYEAEAAFFANLKLSDFNIIDTLGVGGFGRVELVQLKSEESKTFAMKILKKRHIVDTRQQEHIRSEKQIMQGAHSDFIVRLYRTFKDSKYLYMLMEACLGGELWTILRDRGSFEDSTTRFYTACVVEAFAYLHSKGIIYRDLKPENLILDHRGYAKL.... The pIC50 is 4.5. (4) The compound is C[C@H]1[C@H](NC(=O)/C(=N\OC(C)(C)C(=O)O)c2csc(N)n2)C(=O)N1S(=O)(=O)O. The target protein sequence is MQNTLKLLSVITCLAATVQGALAANIDESKIKDTVDDLIQPLMQKNNIPGMSVAVTVNGKNYIYNYGLAAKQPQQPVTENTLFEVGSLSKTFAATLASYAQVSGKLSLDQSVSHYVPELRGSSFDHVSVLNVGTHTSGLQLFMPEDIKNTTQLMAYLKAWKPADAAGTHRVYSNIGTGLLGMIAAKSLGVSYEDAIEKTLLPQLGMHHSYLKVPADQMENYAWGYNKKDEPVHGNMEILGNEAYGIKTTSSDLLRYVQANMGQLKLDANAKMQQALTATHTGYFKSGEITQDLMWEQLPYPVSLPNLLTGNDMAMTKSVATPIVPPLPPQENVWINKTGSTNGFGAYIAFVPAKKMGIVMLANKNYSIDQRVTVAYKILSSLEGNK. The pIC50 is 7.5. (5) The target protein (P46093) has sequence MGNHTWEGCHVDSRVDHLFPPSLYIFVIGVGLPTNCLALWAAYRQVQQRNELGVYLMNLSIADLLYICTLPLWVDYFLHHDNWIHGPGSCKLFGFIFYTNIYISIAFLCCISVDRYLAVAHPLRFARLRRVKTAVAVSSVVWATELGANSAPLFHDELFRDRYNHTFCFEKFPMEGWVAWMNLYRVFVGFLFPWALMLLSYRGILRAVRGSVSTERQEKAKIKRLALSLIAIVLVCFAPYHVLLLSRSAIYLGRPWDCGFEERVFSAYHSSLAFTSLNCVADPILYCLVNEGARSDVAKALHNLLRFLASDKPQEMANASLTLETPLTSKRNSTAKAMTGSWAATPPSQGDQVQLKMLPPAQ. The pIC50 is 4.9. The drug is CCc1cc2c(C)nc(C)nc2n1Cc1ccc(/C=C/CN2CCN(C(C)C)CC2)cc1. (6) The drug is O=c1cc(-c2ccc(OCc3ccc(Cl)nc3)cc2)oc2cc(OCc3ccc(Cl)nc3)cc(O)c12. 
The target protein (P19821) has sequence MRGMLPLFEPKGRVLLVDGHHLAYRTFHALKGLTTSRGEPVQAVYGFAKSLLKALKEDGDAVIVVFDAKAPSFRHEAYGGYKAGRAPTPEDFPRQLALIKELVDLLGLARLEVPGYEADDVLASLAKKAEKEGYEVRILTADKDLYQLLSDRIHVLHPEGYLITPAWLWEKYGLRPDQWADYRALTGDESDNLPGVKGIGEKTARKLLEEWGSLEALLKNLDRLKPAIREKILAHMDDLKLSWDLAKVRTDLPLEVDFAKRREPDRERLRAFLERLEFGSLLHEFGLLESPKALEEAPWPPPEGAFVGFVLSRKEPMWADLLALAAARGGRVHRAPEPYKALRDLKEARGLLAKDLSVLALREGLGLPPGDDPMLLAYLLDPSNTTPEGVARRYGGEWTEEAGERAALSERLFANLWGRLEGEERLLWLYREVERPLSAVLAHMEATGVRLDVAYLRALSLEVAEEIARLEAEVFRLAGHPFNLNSRDQLERVLFDELGL.... The pIC50 is 5.7. (7) The target protein (P54253) has sequence MKSNQERSNECLPPKKREIPATSRSSEEKAPTLPSDNHRVEGTAWLPGNPGGRGHGGGRHGPAGTSVELGLQQGIGLHKALSTGLDYSPPSAPRSVPVATTLPAAYATPQPGTPVSPVQYAHLPHTFQFIGSSQYSGTYASFIPSQLIPPTANPVTSAVASAAGATTPSQRSQLEAYSTLLANMGSLSQTPGHKAEQQQQQQQQQQQQHQHQQQQQQQQQQQQQQHLSRAPGLITPGSPPPAQQNQYVHISSSPQNTGRTASPPAIPVHLHPHQTMIPHTLTLGPPSQVVMQYADSGSHFVPREATKKAESSRLQQAIQAKEVLNGEMEKSRRYGAPSSADLGLGKAGGKSVPHPYESRHVVVHPSPSDYSSRDPSGVRASVMVLPNSNTPAADLEVQQATHREASPSTLNDKSGLHLGKPGHRSYALSPHTVIQTTHSASEPLPVGLPATAFYAGTQPPVIGYLSGQQQAITYAGSLPQHLVIPGTQPLLIPVGSTDME.... The drug is Cc1nnn(Cc2cc(C(F)(F)F)ccc2/C=C/C(=O)N2CCN(CC(=O)OC(C)(C)C)C[C@H]2C)n1. The pIC50 is 8.2. (8) The small molecule is O=C1C(Cl)=C(N2CCN(CCO)CC2)C(=O)N1c1ccc(Cl)c(Cl)c1. The target protein (Q16548) has sequence MTDCEFGYIYRLAQDYLQCVLQIPQPGSGPSKTSRVLQNVAFSVQKEVEKNLKSCLDNVNVVSVDTARTLFNQVMEKEFEDGIINWGRIVTIFAFEGILIKKLLRQQIAPDVDTYKEISYFVAEFIMNNTGEWIRQNGGWENGFVKKFEPKSGWMTFLEVTGKICEMLSLLKQYC. The pIC50 is 6.6. (9) The pIC50 is 6.2. The small molecule is CN(C)CCCn1cc(C2=C(c3c[nH]c4ccccc34)C(=O)NC2=O)c2ccccc21. 
The target protein (P18688) has sequence MRSRSNSGVRLDSYARLVQQTILCHQNPVTGLLPASYDQKDAWVRDNVYSILAVWGLGLAYRKNADRDEDKAKAYELEQSVVKLMRGLLHCMIRQVDKVESFKYSQSTKDSLHAKYNTKTCATVVGDDQWGHLQLDATSVYLLFLAQMTASGLHIIHSLDEVNFIQNLVFYIEAAYKTADFGIWERGDKTNQGISELNASSVGMAKAALEALDELDLFGVKGGPQSVIHVLADEVQHCQSILNSLLPRASTSKEVDASLLSVISFPAFAVEDSKLVEITKQEIITKLQGRYGCCRFLRDGYKTPKEDPNRLYYEPAELKLFENIECEWPLFWTYFILDGVFSGNAEQVQEYREALEAVLIKGKNGVPLLPELYSVPPDKVDEEYQNPHTVDRVPMGKLPHMWGQSLYILGSLMAEGFLAPGEIDPLNRRFSTVPKPDVVVQVSILAETEEIKAILKDKGINVETIAEVYPIRVQPARILSHIYSSLGCNNRMKLSGRPYR.... (10) The small molecule is CC1(C)CC=C(c2nc([C@@H]3CC(C)(C)O[C@](C)(C(=O)O)C3)ccc2NC(=O)c2nc(C#N)c[nH]2)CC1. The target protein sequence is MGLGAPLVLLVATAWHVRGVPVIEPRGPELVVEPGTAVTLRCVGNGSVEWEGPISPHWNLDPDSPSSILSTNNATFLNTGTYRCTEPGSPLGGSATIHIYVKDPVRPWKVLTQEVTVLEGQDALLPCLLTDPALEAGVSLMRVRGRPVLRQTNYSFSPWYGFTIHKAQFTETQGYQCSARVGGRTVTSMGIWLKVQKVIPGPPTLTLKPAELVRIQGEAANIECSASNVDVNFDVFLQHEDTKLTIPQQSDFQGNQYQKVLTLELDHVGFQDAGNYTCVATNVRGISSTSMIFRVVESAYLNLTSEQSLLQEVTVGEKVDLQVKVEAYPSLEGYNWTYLGPFSDQQAKLKFVITKDTYRYTSTLSLPRLKPSEAGRYSFLARNTRGGDSLTFELTLLYPPEVRITWTTVNGSDALLCEASGYPQPNVTWLQCRGHTDRCDEAQALVLEDSYSEVLSQEPFHKVIVHSLLAMGTMEHNMTYECRALNSVGNSSQAFRPIPI.... The pIC50 is 8.7.
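The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration only, not part of the dataset tooling; the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical and chosen for this example.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 expressed in nanomolar (nM)."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # pIC50 values taken from entries (1) and (10) of this record.
    for label, pic50 in [("entry 1", 6.4), ("entry 10", 8.7)]:
        print(f"{label}: pIC50 {pic50} is about {pic50_to_ic50_nm(pic50):.1f} nM")
```

Under this definition, the pIC50 of 6.4 in entry (1) corresponds to an IC50 of roughly 400 nM, and the 8.7 in entry (10) to roughly 2 nM, which is why higher values indicate a more potent compound.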