This is drug-target binding data from BindingDB, based on IC50 measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is Cl.NC[C@H]1OB(O)c2c(OCCO)ccc(Cl)c21. The target protein (Q15031) has sequence MASVWQRLGFYASLLKRQLNGGPDVIKWERRVIPGCTRSIYSATGKWTKEYTLQTRKDVEKWWHQRIKEQASKISEADKSKPKFYVLSMFPYPSGKLHMGHVRVYTISDTIARFQKMRGMQVINPMGWDAFGLPAENAAVERNLHPQSWTQSNIKHMRKQLDRLGLCFSWDREITTCLPDYYKWTQYLFIKLYEAGLAYQKEALVNWDPVDQTVLANEQVDEHGCSWRSGAKVEQKYLRQWFIKTTAYAKAMQDALADLPEWYGIKGMQAHWIGDCVGCHLDFTLKVHGQATGEKLTAYTATPEAIYGTSHVAISPSHRLLHGHSSLKEALRMALVPGKDCLTPVMAVNMLTQQEVPVVILAKADLEGSLDSKIGIPSTSSEDTILAQTLGLAYSEVIETLPDGTERLSSSAEFTGMTRQDAFLALTQKARGKRVGGDVTSDKLKDWLISRQRYWGTPIPIVHCPVCGPTPVPLEDLPVTLPNIASFTGKGGPPLAMASE.... The pIC50 is 3.5. (2) The compound is O=C(NN=Cc1cc(Br)c(O)c(Br)c1O)c1ccc2c(c1)Cc1ccccc1-2. The target protein (O25928) has sequence MEQSHQNLQSQFFIEHILQILPHRYPMLLVDRIIELQANKKIVAYKNITFNEDVFNGHFPNKPIFPGVLIVEGMAQTGGFLAFTSLWGFDPEIAKTKIVYFMTIDKVKFRIPVTPGDRLEYHLEVLKHKGMIWQVGGTAQVDGKVVAEAELKAMIAERD. The pIC50 is 5.4. (3) The target protein sequence is MSETRKPWHGVIVATSLPFDDDLSVDFGAYGESVAHLAAQGMHGVAPNGSLGEYQTLTYEERDRVVETAVANAPEGFTVMPGVGAYGGREAERHARFAKDAGCQAVMCLPPNAYRADDRAVLQHFERVASVGLPVTAYNNPIDTKVDLRPDLLAKLHAEGYIVGVKEFSGDVRRCYEISELAPGLDLMIGTDDTVLEVALAGAKGWVAGYPQVFPRACLALYEASVRGDLEAALPLYRQLHPVLRWDSKTEFVQAIKLGQELTGRRGGPCRPPRQPLAPETEAVVRAATQALVDAGVN. The drug is O=NN1CCCCC1C(=O)O. The pIC50 is 2.5. (4) The drug is CC(=O)N1C[C@H](C)N(c2cccc(-c3cc(-c4ccnn4C4CCOCC4)c4c(N)ncnn34)c2)[C@H](C)C1. 
The target protein (O35904) has sequence MPPGVDCPMEFWTKEESQSVVVDFLLPTGVYLNFPVSRNANLSTIKQVLWHRAQYEPLFHMLSDPEAYVFTCVNQTAEQQELEDEQRRLCDIQPFLPVLRLVAREGDRVKKLINSQISLLIGKGLHEFDSLRDPEVNDFRTKMRQFCEEAAAHRQQLGWVEWLQYSFPLQLEPSARGWRAGLLRVSNRALLVNVKFEGSEESFTFQVSTKDMPLALMACALRKKATVFRQPLVEQPEEYALQVNGRHEYLYGNYPLCHFQYICSCLHSGLTPHLTMVHSSSILAMRDEQSNPAPQVQKPRAKPPPIPAKKPSSVSLWSLEQPFSIELIEGRKVNADERMKLVVQAGLFHGNEMLCKTVSSSEVNVCSEPVWKQRLEFDISVCDLPRMARLCFALYAVVEKAKKARSTKKKSKKADCPIAWANLMLFDYKDQLKTGERCLYMWPSVPDEKGELLNPAGTVRGNPNTESAAALVIYLPEVAPHPVYFPALEKILELGRHGER.... The pIC50 is 8.4. (5) The drug is COc1ccc2nc(SC3Cc4ccc(F)cc4C3=NN=C(N)N)[nH]c2c1. The target protein (P26431) has sequence MMLRWSGIWGLYPPRIFPSLLVVVALVGLLPVLRSHGLQLNPTASTIRGSEPPRERSIGDVTTAPSEPLHHPDDRNLTNLYIEHGAKPVRKAFPVLDIDYLHVRTPFEISLWILLACLMKIGFHVIPTISSIVPESCLLIVVGLLVGGLIKGVGETPPFLQSDVFFLFLLPPIILDAGYFLPLRQFTENLGTILIFAVVGTLWNAFFLGGLLYAVCLVGGEQINNIGLLDTLLFGSIISAVDPVAVLAVFEEIHINELLHILVFGESLLNDAVTVVLYHLFEEFASYEYVGISDIFLGFLSFFVVSLGGVFVGVVYGVIAAFTSRFTSHIRVIEPLFVFLYSYMAYLSAELFHLSGIMALIASGVVMRPYVEANISHKSHTTIKYFLKMWSSVSETLIFIFLGVSTVAGSHQWNWTFVISTLLFCLIARVLGVLVLTWFINKFRIVKLTPKDQFIIAYGGLRGAIAFSLGYLLDKKHFPMCDLFLTAIITVIFFTVFVQG.... The pIC50 is 7.6. (6) The small molecule is CC(=O)O[C@H]1CC[C@H]2[C@@H]3CC[C@]45OC4C(=O)C(S(C)(=O)=O)C[C@]5(C)[C@H]3CC[C@]12C. The target protein (P04326) has sequence MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALGISYGRKKRRQRRRAPQGSQTHQVSLSKQPTSQSRGDPTGPKE. The pIC50 is 5.6. (7) The compound is O=C1C2=NCCS(=O)(=O)C2C(=O)c2c(O)cccc21. The target protein sequence is MIDTLRPVPFASEMAISKTVAWLNEQLELGNEQLLLMDCRPQELYESSHIESAINVAIPGIMLRRLQKGNLPVRALFTRCEDRDRFTRRCGTDTVVLYDENSSDWNENTGGESVLGLLLKKLKDEGCRAFYLEGGFSKFQAEFALHCETNLDGSCSSSSPPLPVLGLGGLRISSDSSSDIESDLDRDPNSATDSDGSPLSNSQPSFPVEILPFLYLGCAKDSTNLDVLEEFGIKYILNVTPNLPNLFENAGEFKYKQIPISDHWSQNLSQFFPEAISFIDEARGKNCGVLVHCLAGISRSVTVTVAYLMQKLNLSMNDAYDIVKMKKSNISPNFNFMGQLLDFERTLGLSSPCDNRVPAQQLYFTAPSNQNVYQVDSLQST. The pIC50 is 4.3.
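
The pIC50 definition given in the header (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset: the function names are hypothetical, and the nanomolar helper assumes raw IC50 values are reported in nM, which should be checked against the actual source before use.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nM (assumed unit): 1 nM = 1e-9 M."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return 9.0 - math.log10(ic50_nm)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from the examples above: 3.5 (weak) and 8.4 (potent).
    for label in (3.5, 8.4):
        print(label, ic50_molar_from_pic50(label))
```

Under this definition, the label 3.5 in example (1) corresponds to an IC50 of roughly 3.2e-4 M (about 316 µM), while the label 8.4 in example (4) corresponds to roughly 4.0e-9 M (about 4 nM), which is why a higher pIC50 means a more potent compound.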