This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 6.8. The target protein (Q06210) has sequence MCGIFAYLNYHVPRTRREILETLIKGLQRLEYRGYDSAGVGFDGGNDKDWEANACKIQLIKKKGKVKALDEEVHKQQDMDLDIEFDVHLGIAHTRWATHGEPSPVNSHPQRSDKNNEFIVIHNGIITNYKDLKKFLESKGYDFESETDTETIAKLVKYMYDNRESQDTSFTTLVERVIQQLEGAFALVFKSVHFPGQAVGTRRGSPLLIGVRSEHKLSTDHIPILYRTARTQIGSKFTRWGSQGERGKDKKGSCNLSRVDSTTCLFPVEEKAVEYYFASDASAVIEHTNRVIFLEDDDVAAVVDGRLSIHRIKRTAGDHPGRAVQTLQMELQQIMKGNFSSFMQKEIFEQPESVVNTMRGRVNFDDYTVNLGGLKDHIKEIQRCRRLILIACGTSYHAGVATRQVLEELTELPVMVELASDFLDRNTPVFRDDVCFFLSQSGETADTLMGLRYCKERGALTVGITNTVGSSISRETDCGVHINAGPEIGVASTKAYTSQF.... The small molecule is N[C@@H](CNC(=O)CBr)C(=O)O. (2) The small molecule is CC[C@H](C)[C@@H]1NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@H](CCCN=C(N)N)NC(=O)CNC(=O)CNC(=O)[C@H](Cc2ccccc2)NC(=O)[C@@H](NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO)CSSC[C@@H](C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)O)NC(=O)CNC(=O)[C@H](CC(C)C)NC(=O)CNC(=O)[C@H](CO)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](C)NC(=O)CNC1=O. The target protein (P20594) has sequence MALPSLLLLVAALAGGVRPPGARNLTLAVVLPEHNLSYAWAWPRVGPAVALAVEALGRALPVDLRFVSSELEGACSEYLAPLSAVDLKLYHDPDLLLGPGCVYPAASVARFASHWRLPLLTAGAVASGFSAKNDHYRTLVRTGPSAPKLGEFVVTLHGHFNWTARAALLYLDARTDDRPHYFTIEGVFEALQGSNLSVQHQVYAREPGGPEQATHFIRANGRIVYICGPLEMLHEILLQAQRENLTNGDYVFFYLDVFGESLRAGPTRATGRPWQDNRTREQAQALREAFQTVLVITYREPPNPEYQEFQNRLLIRAREDFGVELGPSLMNLIAGCFYDGILLYAEVLNETIQEGGTREDGLRIVEKMQGRRYHGVTGLVVMDKNNDRETDFVLWAMGDLDSGDFQPAAHYSGAEKQIWWTGRPIPWVKGAPPSDNPPCAFDLDDPSCDKTPLSTLAIVALGTGITFIMFGVSSFLIFRKLMLEKELASMLWRIRWEELQ.... The pIC50 is 7.4. (3) The compound is COc1cc2nc(N3CCC(N4CCCC(CO)C4)CC3)nc(NC3CCCCCC3)c2cc1OC. 
The target protein (P51680) has sequence MNATEVTDTTQDETVYNSYYFYESMPKPCTKEGIKAFGEVFLPPLYSLVFLLGLFGNSVVVLVLFKYKRLKSMTDVYLLNLAISDLLFVLSLPFWGYYAADQWVFGLGLCKIVSWMYLVGFYSGIFFIMLMSIDRYLAIVHAVFSLKARTLTYGVITSLITWSVAVFASLPGLLFSTCYTEHNHTYCKTQYSVNSTTWKVLSSLEINVLGLLIPLGIMLFCYSMIIRTLQHCKNEKKNRAVRMIFAVVVLFLGFWTPYNVVLFLETLVELEVLQDCTLERYLDYAIQATETLAFIHCCLNPVIYFFLGEKFRKYITQLFRTCRGPLVLCKHCDFLQVYSADMSSSSYTQSTVDHDFRDAL. The pIC50 is 7.7. (4) The drug is COc1cc([C@H]2[C@](NC(=O)c3ccc(NC(=O)OC(C)(C)C)cc3)(C(=O)O)[C@@H](c3ccc(OC(=O)C45CC6CC(CC(C6)C4)C5)c(OC)c3)[C@]2(NC(=O)c2ccc(NC(=O)OC(C)(C)C)cc2)C(=O)O)ccc1OC(=O)C12CC3CC(CC(C3)C1)C2. The target protein (P32301) has sequence MAVTPSLLRLALLLLGAVGRAGPRPQGATVSLSETVQKWREYRHQCQRFLTEAPLLATGLFCNRTFDDYACWPDGPPGSFVNVSCPWYLPWASSVLQGHVYRFCTAEGIWLHKDNSSLPWRDLSECEESKQGERNSPEEQLLSLYIIYTVGYALSFSALVIASAILVSFRHLHCTRNYIHLNLFASFILRALSVFIKDAALKWMYSTAAQQHQWDGLLSYQDSLGCRLVFLLMQYCVAANYYWLLVEGVYLYTLLAFSVFSEQRIFKLYLSIGWGVPLLFVIPWGIVKYLYEDEGCWTRNSNMNYWLIIRLPILFAIGVNFLVFIRVICIVIAKLKANLMCKTDIKCRLAKSTLTLIPLLGTHEVIFAFVMDEHARGTLRFVKLFTELSFTSFQGFMVAVLYCFVNNEVQMEFRKSWERWRLERLNIQRDSSMKPLKCPTSSVSSGATVGSSVYAATCQNSCS. The pIC50 is 5.4. (5) The compound is C/C(Br)=C\CC/C(C)=C/CC(C)(C)/C=C/C(=O)NC(Cc1c[nH]c2ccccc12)C(=O)O. The target protein (Q45614) has sequence MNKVGFFRSIQFKITLIYVLLIIIAMQIIGVYFVNQVEKSLISSYEQSLNQRIDNLSYYIEQEYKSDNDSTVIKDDVSRILNDFTKSDEVREISFVDKSYEVVGSSKPYGEEVAGKQTTDLIFKRIFSTKQSYLRKYYDPKSKIRVLISAKPVMTENQEVVGAIYVVASMEDVFNQMKTINTILASGTGLALVLTALLGIFLARTITHPLSDMRKQAMELAKGNFSRKVKKYGHDEIGQLATTFNHLTRELEDAQAMTEGERRKLASVIAYMTDGVIATNRNGAIILLNSPALELLNVSRETALEMPITSLLGLQENYTFEDLVEQQDSMLLEIERDDELTVLRVNFSVIQREHGKIDGLIAVIYDVTEQEKMDQERREFVANVSHELRTPLTTMRSYLEALAEGAWENKDIAPRFLMVTQNETERMIRLVNDLLQLSKFDSKDYQFNREWIQIVRFMSLIIDRFEMTKEQHVEFIRNLPDRDLYVEIDQDKITQVLDNI.... The pIC50 is 4.4. (6) The small molecule is CC(C)c1c(-c2cc(NCc3ccccc3)c3ncnn3c2)[nH]c2ccc(C3CCNCC3)cc12. 
The target protein (Q9NYK1) has sequence MVFPMWTLKRQILILFNIILISKLLGARWFPKTLPCDVTLDVPKNHVIVDCTDKHLTEIPGGIPTNTTNLTLTINHIPDISPASFHRLDHLVEIDFRCNCVPIPLGSKNNMCIKRLQIKPRSFSGLTYLKSLYLDGNQLLEIPQGLPPSLQLLSLEANNIFSIRKENLTELANIEILYLGQNCYYRNPCYVSYSIEKDAFLNLTKLKVLSLKDNNVTAVPTVLPSTLTELYLYNNMIAKIQEDDFNNLNQLQILDLSGNCPRCYNAPFPCAPCKNNSPLQIPVNAFDALTELKVLRLHSNSLQHVPPRWFKNINKLQELDLSQNFLAKEIGDAKFLHFLPSLIQLDLSFNFELQVYRASMNLSQAFSSLKSLKILRIRGYVFKELKSFNLSPLHNLQNLEVLDLGTNFIKIANLSMFKQFKRLKVIDLSVNKISPSGDSSEVGFCSNARTSVESYEPQVLEQLHYFRYDKYARSCRFKNKEASFMSVNESCYKYGQTLDL.... The pIC50 is 8.2. (7) The small molecule is CN(C)C(=O)[C@]12C[C@H]1[C@@](C)(c1cc(/C=C(\F)c3ccc(Cl)cn3)cnc1F)N=C(N)S2. The target protein (P07339) has sequence MQPSSLLPLALCLLAAPASALVRIPLHKFTSIRRTMSEVGGSVEDLIAKGPVSKYSQAVPAVTEGPIPEVLKNYMDAQYYGEIGIGTPPQCFTVVFDTGSSNLWVPSIHCKLLDIACWIHHKYNSDKSSTYVKNGTSFDIHYGSGSLSGYLSQDTVSVPCQSASSASALGGVKVERQVFGEATKQPGITFIAAKFDGILGMAYPRISVNNVLPVFDNLMQQKLVDQNIFSFYLSRDPDAQPGGELMLGGTDSKYYKGSLSYLNVTRKAYWQVHLDQVEVASGLTLCKEGCEAIVDTGTSLMVGPVDEVRELQKAIGAVPLIQGEYMIPCEKVSTLPAITLKLGGKGYKLSPEDYTLKVSQAGKTLCLSGFMGMDIPPPSGPLWILGDVFIGRYYTVFDRDNNRVGFAEAARL. The pIC50 is 3.2. (8) The target protein (P25104) has sequence MILNSSTEDGIKRIQDDCPKAGRHNYIFIMIPTLYSIIFVVGIFGNSLVVIVIYFYMKLKTVASVFLLNLALADLCFLLTLPLWAVYTAMEYRWPFGNYLCKIASASVSFNLYASVFLLTCLSIDRYLAIVHPMKSRLRRTMLVAKVTCIIIWLLAGLASLPTIIHRNVFFIENTNITVCAFHYESQNSTLPVGLGLTKNILGFLFPFLIILTSYTLIWKTLKKAYEIQKNKPRKDDIFKIILAIVLFFFFSWVPHQIFTFMDVLIQLGLIRDCKIEDIVDTAMPITICLAYFNNCLNPLFYGFLGKKFKKYFLQLLKYIPPKAKSHSNLSTKMSTLSYRPSENGNSSTKKPAPCIEVE. The pIC50 is 7.3. The drug is CCCc1cn2ncc(C(=O)O)c2n1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1. (9) The drug is Cc1ccc(NC(=O)Nc2cc(C(F)(F)F)ccc2F)cc1Nc1ccc2c(c1)NC(=O)/C2=C\c1ccc[nH]1. 
The target protein (P29319) has sequence MDCHLSILVLLGCCVLSCSGELSPQPSNEVNLLDSKTIQGELGWISYPSHGWEEISGVDEHYTPIRTYQVCNVMDHSQNNWLRTNWVPRNSAQKIYVELKFTLRDCNSIPLVLGTCKETFNLYYMESDDHGVKFREHQFTKIDTIAADESFTQMDLGDRILKLNTEIREVGPVNKKGFYLAFQDVGACVALVSVRVYFKKCPFTVKNLAMFPDTVPMDSQSLVEVRGSCVNNSKEEDPPRMYCSTEGEWLVPIGKCTCNAGYEERGFICQACRPGFYKASDGAAKCAKCPPHSSTQEDGSMNCRCENNYFRAEKDPPSMACARPPSAPRNVISNINETSVILDWSWPLDTGGRKDITFNIICKKCGWNVRQCEPCSPNVRFLPRQLGLTNTTVTVTDLLAHTNYTFEIDAVNGVSELSSPPRQYAAVSITTNQAAPSPVMTIKKDRTSRNSISLSWQEPEHPNGIILDYEVKYYQKQEQETSYTILRARGTNVTISSLKP.... The pIC50 is 5.2.
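
The pIC50 definition given at the top (pIC50 = -log10(IC50 in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset. The function names are hypothetical. The nanomolar helper assumes the raw IC50 is reported in nM, which the records above do not state.

```python
import math


def pic50_from_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 (assumes the input unit is nM)."""
    return pic50_from_molar(ic50_nm * 1e-9)


def ic50_nanomolar_from_pic50(pic50: float) -> float:
    """Invert the definition: return the IC50 in nM for a given pIC50."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # Record (1) above reports a pIC50 of 6.8; recover the implied IC50 in nM.
    print(round(ic50_nanomolar_from_pic50(6.8), 1))
    # A 100 nM inhibitor corresponds to 1e-7 M.
    print(round(pic50_from_nanomolar(100.0), 2))
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why higher values mean a more potent compound.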