Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is CC(C)[C@H](NS(=O)(=O)c1ccc2c(c1)oc1ccc(N3CCOC3=O)cc12)C(=O)O. The target protein (Q63341) has sequence MKFLLVLVLLVSLQVSACGAAPMNESEFAEWYLSRFFDYQGDRIPMTKTKTNRNLLEEKLQEMQQFFGLEVTGQLDTSTLKIMHTSRCGVPDVQHLRAVPQRSRWMKRYLTYRIYNYTPDMKRADVDYIFQKAFQVWSDVTPLRFRKIHKGEADITILFAFGDHGDFYDFDGKGGTLAHAFYPGPGIQGDAHFDEAETWTKSFQGTNLFLVAVHELGHSLGLRHSNNPKSIMYPTYRYLHPNTFRLSADDIHSIQSLYGAPVKNPSLTNPGSPPSTVCHQSLSFDAVTTVGDKIFFFKDWFFWWRLPGSPATNITSISSMWPTIPSGIQAAYEIGGRNQLFLFKDEKYWLINNLVPEPHYPRSIHSLGFPASVKKIDAAVFDPLRQKVYFFVDKQYWRYDVRQELMDAAYPKLISTHFPGIRPKIDAVLYFKRHYYIFQGAYQLEYDPLLDRVTKTLSSTSWFGC. The pIC50 is 7.1. (2) The target protein sequence is MGNAAAAKKGSEQESVKEFLAKAKEDFLKKWENPAQNTAQLDQFERIKTLGTGSFGRVMLSKHKETGNHYAMKILDKQKVVKLKQIEHTLNVKRILQAVNFPFLVKLEFSFKENSNLYMVMEYVPGGEMFSHLRRIGRFSEPHARFYAAQIVLTFEYLHSLDLIYRDLKPENLLIDQQGYIQQTDFGFAKRVKGRTWTLCGTPEYLAPEIILSKGYNKAVDWWALGVLIYEMAADYPPFFADQPIQIYEKIVYGKVRFPSHFSSDLKDLLRNLQQVDLTKRFGNLKNGVNDIKNHKWFATTDWIAIYQRKVEAPFIPKFKGPGDTSNFDDYEEEEIRVSINEKCGKEFSEF. The drug is O=C1NC(=O)c2c1c1c3ccccc3[nH]c1c1[nH]c3ccccc3c21. The pIC50 is 5.0. (3) The compound is CCOC(=O)n1ccn(C)c1=S. The target protein (P22079) has sequence MRVLLHLPALLASLILLQAAASTTRAQTTRTSAISDTVSQAKVQVNKAFLDSRTRLKTAMSSETPTSRQLSEYLKHAKGRTRTAIRNGQVWEESLKRLRQKASLTNVTDPSLDLTSLSLEVGCGAPAPVVRCDPCSPYRTITGDCNNRRKPALGAANRALARWLPAEYEDGLSLPFGWTPGKTRNGFPLPLAREVSNKIVGYLNEEGVLDQNRSLLFMQWGQIVDHDLDFAPDTELGSSEYSKAQCDEYCIQGDNCFPIMFPPNDPKAGTQGKCMPFFRAGFVCPTPPYKSLAREQINALTSFLDASFVYSSEPSLASRLRNLSSPLGLMAVNQEVSDHGLPYLPYDSKKPSPCEFINTTARVPCFLAGDSRASEHILLATSHTLFLREHNRLARELKRLNPQWDGEKLYQEARKILGAFVQIITFRDYLPILLGDHMQKWIPPYQGYSESVDPRISNVFTFAFRFGHLEVPSSMFRLDENYQPWGPEPELPLHTLFFNT.... The pIC50 is 5.0. 
(4) The small molecule is CC(C)C[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)OCc1ccccc1)C(=O)N[C@H]1CC(=O)OC1O. The target protein (P29466) has sequence MADKVLKEKRKLFIRSMGEGTINGLLDELLQTRVLNKEEMEKVKRENATVMDKTRALIDSVIPKGAQACQICITYICEEDSYLAGTLGLSADQTSGNYLNMQDSQGVLSSFPAPQAVQDNPAMPTSSGSEGNVKLCSLEEAQRIWKQKSAEIYPIMDKSSRTRLALIICNEEFDSIPRRTGAEVDITGMTMLLQNLGYSVDVKKNLTASDMTTELEAFAHRPEHKTSDSTFLVFMSHGIREGICGKKHSEQVPDILQLNAIFNMLNTKNCPSLKDKPKVIIIQACRGDSPGVVWFKDSVGVSGNLSLPTTEEFEDDAIKKAHIEKDFIAFCSSTPDNVSWRHPTMGSVFIGRLIEHMQEYACSCDVEEIFRKVRFSFEQPDGRAQMPTTERVTLTRCFYLFPGH. The pIC50 is 7.4. (5) The drug is O=C1NC(=O)C(=Nn2c(=O)[nH]c(=O)[nH]c2=O)C(=O)N1. The target protein sequence is MSEIENSTITSSADRMVGMDHAEVRYFTSYDHHGIHEEMLKDDVRTRSYRDSIYQNRHIFKDKVVLDVGCGTGILSMFAAKAGAKHVIGVDMSSIIEKAREIVAVNGLADKITLLQGKMEEVQLPFPSVDIIISEWMGYFLLYESMLDTVLYAQDRYLVPGGKIFPDKATMYLAGIEDGEYKDDKIGFWDNVYGFDYSPMKEIALTEPLVDTVELKALVTDPCPIITFDLYTVTKEDLAFEVPYSLPVKRSDFVHAVIAWFDIEFGACHKPINFSTGPHAKYTHWKQTVFYLRDVLTVEEEESISGVLSNRPNDKNKRDLDINLTYKLETQDQTRFAEGGCFYRM. The pIC50 is 3.8. (6) The compound is COc1cc(OC)c(S(=O)(=O)N2c3ccccc3Oc3ccccc32)cc1NC(=O)CCC(=O)O. The target protein (P43889) has sequence MTKKALSAVILAAGKGTRMYSDLPKVLHTIAGKPMVKHVIDTAHQLGSENIHLIYGHGGDLMRTHLANEQVNWVLQTEQLGTAHAVQQAAPFFKDNENIVVLYGDAPLITKETLEKLIEAKPENGIALLTVNLDNPTGYGRIIRENGNVVAIVEQKDANAEQLNIKEVNTGVMVSDGASFKKWLARVGNNNAQGEYYLTDLIALANQDNCQVVAVQATDVMEVEGANNRLQLAALERYFQNKQASKLLLEGVMIYDPARFDLRGTLEHGKDVEIDVNVIIEGNVKLGDRVKIGTGCVLKNVVIGNDVEIKPYSVLEDSIVGEKAAIGPFSRLRPGAELAAETHVGNFVEIKKSTVGKGSKVNHLTYVGDSEIGSNCNIGAGVITCNYDGANKFKTIIGDDVFVGSDTQLVAPVKVANGATIGAGTTITRDVGENELVITRVAQRHIQGWQRPIKKK. The pIC50 is 7.6. (7) The small molecule is C[C@]1(/C=C\C#N)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. 
The target protein sequence is MMKKSLCCALLLTASFSTFAAAKTEQQIADIVNRTITPLMQEQAIPGMAVAVIYQGKPYYFTWGKADIANNHPVTQQTLFELGSVSKTFNGVLGGDCIARGEIKLSDPVTKYWPELTGKQWQGIRLLHLATYTAGGLPLQIPDDVRDKAALLHFYQNWQPQWTPGAKRLYANSSIGLFGALAVKPSGMSYEEAMTRRVLQPLKLAHTWITVPENEQKDYAWGYREGKPVHVSPGQLDAEAYGVKSSVIDMARWVQANMDASHVQEKTLQQGIALAQSRYWRIGDMYQGLGWEMLNWPLKADSIINGSDSKVALAALPAVEVNPPAPAVKASWVHKTGSTGGFGSYVAFVPEKNLGIVMLANKSYPNPVRVEAAWRILEKLQ. The pIC50 is 6.5. (8) The drug is CC1(C)CC=C(c2nc([C@H]3CC(C)(C)O[C@](C)(CO)C3)ccc2NC(=O)c2nc(C#N)c[nH]2)CC1. The target protein sequence is MGLGAPLVLLVATAWHVRGVPVIEPRGPELVVEPGTAVTLRCVGNGSVEWEGPISPHWNLDPDSPSSILSTNNATFLNTGTYRCTEPGSPLGGSATIHIYVKDPVRPWKVLTQEVTVLEGQDALLPCLLTDPALEAGVSLMRVRGRPVLRQTNYSFSPWYGFTIHKAQFTETQGYQCSARVGGRTVTSMGIWLKVQKVIPGPPTLTLKPAELVRIQGEAANIECSASNVDVNFDVFLQHEDTKLTIPQQSDFQGNQYQKVLTLELDHVGFQDAGNYTCVATNVRGISSTSMIFRVVESAYLNLTSEQSLLQEVTVGEKVDLQVKVEAYPSLEGYNWTYLGPFSDQQAKLKFVITKDTYRYTSTLSLPRLKPSEAGRYSFLARNTRGGDSLTFELTLLYPPEVRITWTTVNGSDALLCEASGYPQPNVTWLQCRGHTDRCDEAQALVLEDSYSEVLSQEPFHKVIVHSLLAMGTMEHNMTYECRALNSVGNSSQAFRPIPI.... The pIC50 is 8.1. (9) The small molecule is CN(C)C(=O)Nc1cccc(-c2ccc(N(C)C)cc2)c1. The target protein (Q96AD5) has sequence MFPREKTWNISFAGCGFLGVYYVGVASCLREHAPFLVANATHIYGASAGALTATALVTGVCLGEAGAKFIEVSKEARKRFLGPLHPSFNLVKIIRSFLLKVLPADSHEHASGRLGISLTRVSDGENVIISHFNSKDELIQANVCSGFIPVYCGLIPPSLQGVRYVDGGISDNLPLYELKNTITVSPFSGESDICPQDSSTNIHELRVTNTSIQFNLRNLYRLSKALFPPEPLVLREMCKQGYRDGLRFLQRNGLLNRPNPLLALPPARPHGPEDKDQAVESAQAEDYSQLPGEDHILEHLPARLNEALLEACVEPTDLLTTLSNMLPVRLATAMMVPYTLPLESALSFTIRLLEWLPDVPEDIRWMKEQTGSICQYLVMRAKRKLGRHLPSRLPEQVELRRVQSLPSVPLSCAAYREALPGWMRNNLSLGDALAKWEECQRQLLLGLFCTNVAFPPEALRMRAPADPAPAPADPASPQHQLAGPAPLLSTPAPEARPVIG.... The pIC50 is 6.2.
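The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50` and the `NANOMOLAR` constant are hypothetical, and the nanomolar input in the usage lines is an assumed example value, not a measurement from BindingDB.

```python
import math

# Hypothetical helper constant: one nanomolar expressed in molar units.
NANOMOLAR = 1e-9


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in molar (M) to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in molar")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in molar for a given pIC50."""
    return 10.0 ** (-pic50)


# Assumed example: an IC50 of 100 nM corresponds to a pIC50 of about 7.
example_pic50 = ic50_to_pic50(100 * NANOMOLAR)

# A pIC50 of 7.1, as in record (1), corresponds to roughly 79 nM.
example_ic50_nm = pic50_to_ic50(7.1) / NANOMOLAR
```

Because the scale is logarithmic, each one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.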