This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cc1ccc2c(C)cc[n+](CCCS(=O)(=O)[O-])c2c1. The pIC50 is 5.2. The target protein (P9WQA3) has sequence MPIATPEVYAEMLGQAKQNSYAFPAINCTSSETVNAAIKGFADAGSDGIIQFSTGGAEFGSGLGVKDMVTGAVALAEFTHVIAAKYPVNVALHTDHCPKDKLDSYVRPLLAISAQRVSKGGNPLFQSHMWDGSAVPIDENLAIAQELLKAAAAAKIILEIEIGVVGGEEDGVANEINEKLYTSPEDFEKTIEALGAGEHGKYLLAATFGNVHGVYKPGNVKLRPDILAQGQQVAAAKLGLPADAKPFDFVFHGGSGSLKSEIEEALRYGVVKMNVDTDTQYAFTRPIAGHMFTNYDGVLKVDGEVGVKKVYDPRSYLKKAEASMSQRVVQACNDLHCAGKSLTH. (2) The small molecule is C[C@H](CNC(=O)CN(Cc1ccc(OCc2ccccc2)cc1)C(=O)[C@H](Cc1cnc[nH]1)NC(=O)OCc1ccccc1)c1ccccc1. The target protein (P01112) has sequence MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLAARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKLRKLNPPDESGPGCMSCKCVLS. The pIC50 is 6.5. (3) The compound is Nc1ncnc2c1nc(-c1ccc(P(=O)(O)O)o1)n2CCc1ccccc1. The target protein (P00569) has sequence MEEKLKKAKIIFVVGGPGSGKGTQCEKIVHKYGYTHLSTGDLLRAEVSSGSARGKKLSEIMEKGQLVPLETVLDMLRDAMVAKADTSKGFLIDGYPRQVQQGEEFERRIAQPTLLLYVDAGPETMQKRLLKRGETSGRVDDNEETIKKRLETYYKATEPVIAFYEKRGIVRKVNAEGSVDNVFSQVCTHLDALK. The pIC50 is 3.3. (4) The drug is O=c1[nH]c(CCc2ccccc2)cc(NC2CCCC2)c1Br. The target protein sequence is PISPITVPVKLKPGMDGPKVKQWPLTEEKIKALTEICTEMEKEGKIEKIGPENPYNTPVFAIKKKDSTKWRKVVDFRELNKRTQDFWEVQLGIPHPAGLKKKKSVTVLDVGDAYFSVPLDKDFRKYTAFTIPSINNETPGIRYQYNVLPQGWKGSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIEQHRAKIEELRQHLLRWGFTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVN. The pIC50 is 4.4. (5) The compound is C=C1C(=O)O[C@H]2[C@@H]3C(C)=CCC[C@@]3(C)CC[C@H]12. 
The target protein (Q9HVW7) has sequence MDKLIITGGNRLDGEIRISGAKNSALPILAATLLADTPVTVCNLPHLHDITTMIELFGRMGVQPIIDEKLNVEVDASSIKTLVAPYELVKTMRASILVLGPMLARFGEAEVALPGGCAIGSRPVDLHIRGLEAMGAQIEVEGGYIKAKAPAGGLRGGHFFFDTVSVTGTENLMMAAALANGRTVLQNAAREPEVVDLANCLNAMGANVQGAGSDTIVIEGVKRLGGARYDVLPDRIETGTYLVAAAATGGRVKLKDTDPTILEAVLQKLEEAGAHISTGSNWIELDMKGNRPKAVNVRTAPYPAFPTDMQAQFISMNAVAEGTGAVIETVFENRFMHVYEMNRMGAQILVEGNTAIVTGVPKLKGAPVMATDLRASASLVIAGLVAEGDTLIDRIYHIDRGYECIEEKLQLLGAKIRRVPG. The pIC50 is 4.3. (6) The compound is Cc1[nH]c2c(=O)n(C)cc(C#CC(C)(O)C3CC3)c2c1-c1nnc(C(C)CN2CCOCC2)o1. The target protein sequence is QVAFSFILDNIVTQKMMAVPDSWPFHHPVNKKFVPDYYKVIVNPMDLETIRKNISKHKYQSRESFLDDVNLILANSVKYNGPESQYTKTAQEIVNVCYQTLTEYDEHLTQLEKDICTAKEAALEEAELESLD. The pIC50 is 7.4. (7) The small molecule is COC(=O)/C=C/[C@H](CCc1ccccc1)n1cccc(NC(=O)OCc2ccccc2)c1=O. The target protein sequence is MEYSPNEVIKQEREVFVGKEKSGSKFKRKRSIFIVLTVSICFMFALMLFYFTRNENNKTLFTNSLSNNINDDYIINSLLKSESGKKFIVSKLEELISSYDKEKKMRTTGAEENNMNMNGIDDKDNKSVSFVNKKNGNLKVNNNNQVSYSNLFDTKFLMDNLETVNLFYIFLKENNKKYETSEEMQKRFIIFSENYRKIELHNKKTNSLYKRGMNKFGDLSPEEFRSKYLNLKTHGPFKTLSPPVSYEANYEDVIKKYKPADAKLDRIAYDWRLHGGVTPVKDQALCGSCWAFSSVGSVESQYAIRKKALFLFSEQELVDCSVKNNGCYGGYITNAFDDMIDLGGLCSQDDYPYVSNLPETCNLKRCNERYTIKSYVSIPDDKFKEALRYLGPISISIAASDDFAFYRGGFYDGECGAAPNHAVILVGYGMKDIYNEDTGRMEKFYYYIIKNSWGSDWGEGGYINLETDENGYKKTCSIGTEAYVPLLE. The pIC50 is 4.6. (8) The compound is CC(C)CCCCCCCCC[C@@H]1CC(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(=O)O1. The target protein sequence is MTASPRAPHQEHVLGEPTLEGLAHYIREKNVRRILVLVGAGASVAAGIPDFRSPDTGIYANLGKYNLEDPTDAFSLTLLREKPEIFYSIARELNLWPGHFQPTAVHHFIRLLQDEGRLLRCCTQNIDGLEKAAGVSPELLVEAHGSFAAAACIECHTPFSIEQNYLEAMSGTVSRCSTCGGIVKPNVVFFGENLPDAFFDALHHDAPIAELVIIIGTSMQVHPFALLPCVVPKSVPRVVMNRERVGGLLFRFPDDPLNTVHEDAVAKEGRSSSSQSRSPSASPRREEGGTEDSPSSPNEEVEEASTSSSSDGYGQYGDYHAHPDVCRDVLFRGDCQENVVTLAEYLGLSEALAKRMRLSDAAPATAQRAPNET. The pIC50 is 4.5.
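
The pIC50 definition given in the task description (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration rather than part of the dataset tooling; the function names are hypothetical, and the nanomolar helper assumes the raw IC50 was reported in nM, which the text above does not state.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_molar(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


def ic50_nanomolar_to_pic50(ic50_nm: float) -> float:
    """Assumption: the raw value is in nM, so scale to mol/L first."""
    return ic50_molar_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Labels taken from examples (1) and (6) in the listing above.
    for label in (5.2, 7.4):
        print(f"pIC50 {label} corresponds to IC50 = {pic50_to_ic50_molar(label):.3g} M")
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50: the label 5.2 in example (1) is roughly 6.3 micromolar, while 7.4 in example (6) is roughly 40 nanomolar.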