This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted quantity is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is N[C@@H](Cc1ccc(Oc2ccc(O)c(I)c2)c(I)c1)C(=O)O. The target protein (P12004) has sequence MFEARLVQGSILKKVLEALKDLINEACWDISSSGVNLQSMDSSHVSLVQLTLRSEGFDTYRCDRNLAMGVNLTSMSKILKCAGNEDIITLRAEDNADTLALVFEAPNQEKVSDYEMKLMDLDVEQLGIPEQEYSCVVKMPSGEFARICRDLSHIGDAVVISCAKDGVKFSASGELGNGNIKLSQTSNVDKEEEAVTIEMNEPVQLTFALRYLNFFTKATPLSSTVTLSMSADVPLVVEYKIADMGHLKYYLAPKIEDEEGS. The pIC50 is 4.3. (2) The target protein (P28065) has sequence MLRAGAPTGDLPRAGEVHTGTTIMAVEFDGGVVMGSDSRVSAGEAVVNRVFDKLSPLHERIYCALSGSAADAQAVADMAAYQLELHGIELEEPPLVLAAANVVRNISYKYREDLSAHLMVAGWDQREGGQVYGTLGGMLTRQPFAIGGSGSTFIYGYVDAAYKPGMSPEECRRFTTDAIALAMSRDGSSGGVIYLVTITAAGVDHRVILGNELPKFYDE. The compound is CCCC[C@H](NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)NC(=O)CN=[N+]=[N-])C(=O)N[C@@H](Cc1ccccc1)C(=O)[C@@]1(C)CO1. The pIC50 is 7.2. (3) The target protein sequence is MLQNVTPHNKLPGEGNAGLLGLGPEAAAPGKRIRKPSLLYEGFESPTMASVPALQLTPANPPPPEVSNPKKPGRVTNQLQYLHKVVMKALWKHQFAWPFRQPVDAVKLGLPDYHKIIKQPMDMGTIKRRLENNYYWAASECMQDFNTMFTNCYIYNKPTDDIVLMAQTLEKIFLQKVASMPQEEQELVVTIPKNSHKKGAKLAALQGSVTSAHQVPAVSSVSHTALYTPPPEIPTTVLNIPHPSVISSPLLKSLHSAGPPLLAVTAAPPAQPLAKKKGVKRKADTTTPTPTAILAPGSPASPPGSLEPKAARLPPMRRESGRPIKPPRKDLPDSQQQHQSSKKGKLSEQLKHCNGILKELLSKKHAAYAWPFYKPVDASALGLHDYHDIIKHPMDLSTVKRKMENRDYRDAQEFAADVRLMFSNCYKYNPPDHDVVAMARKLQDVFEFRYAKMPDEPLEPGPLPVSTAMPPGL. The compound is Cc1noc(C)c1-c1ccc2c(c1)C(O)(C1CCOCC1)C(=O)N2. The pIC50 is 6.4. (4) The drug is CN[C@@H](C)C(=O)N[C@H](C(=O)N1c2ncccc2C[C@H]1CNS(C)(=O)=O)C(C)C. The target protein sequence is MRHHHHHHRDHFALDRPSETHADYLLRTGQVVDISDTIYPRNPAMYSEEARLKSFQNWPDYAHLTPRELASAGLYYTGIGDQVQCFACGGKLKNWEPGDRAWSEHRRHEPNCFFVLGRNLNIRSE. The pIC50 is 8.9. 
(5) The compound is Nc1ncc(C#Cc2cccc(C(=O)Nc3cccc(C(F)(F)F)c3)c2)cn1. The target protein sequence is LKLVERLGAGQFGEVWMGYYNGHTKVAVKSLKQGSMSPDAFLAEANLMKQLQHQRLVRLYAVVTQEPIYIITEYMENGSLVDFLKTPSGIKLTINKLLDMAAQIAEGMAFIEERNYIHRDLRAANILVSDTLSCKIADFGLARLIEDNEYTAREGAKFPIKWTAPEAINYGTFTIKSDVWSFGILLTEIVTHGRIPYPGMTNPEVIQNLERGYRMVRPDNCPEELYQLMRLCWKERPEDRPTFDYLRSVLEDFF. The pIC50 is 7.7. (6) The drug is CCN(CC)CCCCCOC(=O)C(C)(c1ccccc1)C1CCCCC1. The target protein (P00689) has sequence MKFVLLLSLIGFCWAQYDPHTADGRTAIVHLFEWRWADIAKECERYLAPKGFGGVQVSPPNENIIINNPSRPWWERYQPISYKICSRSGNENEFKDMVTRCNNVGVRIYVDAVINHMCGSGNSAGTHSTCGSYFNPNNREFSAVPYSAWYFNDNKCNGEINNYNDANQVRNCRLSGLLDLALDKDYVRTKVADYMNNLIDIGVAGFRLDAAKHMWPGDIKAVLDKLHNLNTKWFSQGSRPFIFQEVIDLGGEAIKGSEYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFVPTDRALVFVDNHDNQRGHGAGGASILTFWDARMYKMAVGFMLAHPYGFTRVMSSYRRTRNFQNGKDVNDWIGPPNNNGVTKEVTINPDTTCGNDWVCEHRWRQIRNMVAFRNVVNGQPFANWWDNGSNQVAFSRGNRGFIVFNNDDWALSSTLQTGLPAGTYCDVISGDKVNGNCTGLKVNVGSDGKAHFSISNSAEDPFI.... The pIC50 is 5.6. (7) The compound is C[C@@H]1[NH2+][C@H](COP(=O)([O-])OP(=O)([O-])OC[C@H]2O[C@@H](n3cnc4c(=O)[nH]c([NH3+])nc43)[C@H](O)[C@@H]2O)[C@@H](O)[C@@H]1O. The target protein (Q11128) has sequence MDPLGPAKPQWLWRRCLAGLLFQLLVAVCFFSYLRVSRDDATGSPRPGLMAVEPVTGAPNGSRCQDSMATPAHPTLLILLWTWPFNTPVALPRCSEMVPGAADCNITADSSVYPQADAVIVHHWDIMYNPSANLPPPTRPQGQRWIWFSMESPSNCRHLEALDGYFNLTMSYRSDSDIFTPYGWLEPWSGQPAHPPLNLSAKTELVAWAVSNWKPDSARVRYYQSLQAHLKVDVYGRSHKPLPKGTMMETLSRYKFYLAFENSLHPDYITEKLWRNALEAWAVPVVLGPSRSNYERFLPPDAFIHVDDFQSPKDLARYLQELDKDHARYLSYFRWRETLRPRSFSWALAFCKACWKLQQESRYQTVRSIAAWFT. The pIC50 is 4.1. (8) The drug is CCn1cc(Cc2cncnc2)c(=O)nc1SCCCCCCCC(=O)c1ccc(Cl)cc1. 
The target protein (Q13093) has sequence MVPPKLHVLFCLCGCLAVVYPFDWQYINPVAHMKSSAWVNKIQVLMAAASFGQTKIPRGNGPYSVGCTDLMFDHTNKGTFLRLYYPSQDNDRLDTLWIPNKEYFWGLSKFLGTHWLMGNILRLLFGSMTTPANWNSPLRPGEKYPLVVFSHGLGAFRTLYSAIGIDLASHGFIVAAVEHRDRSASATYYFKDQSAAEIGDKSWLYLRTLKQEEETHIRNEQVRQRAKECSQALSLILDIDHGKPVKNALDLKFDMEQLKDSIDREKIAVIGHSFGGATVIQTLSEDQRFRCGIALDAWMFPLGDEVYSRIPQPLFFINSEYFQYPANIIKMKKCYSPDKERKMITIRGSVHQNFADFTFATGKIIGHMLKLKGDIDSNVAIDLSNKASLAFLQKHLGLHKDFDQWDCLIEGDDENLIPGTNINTTNQHIMLQNSSGIEKYN. The pIC50 is 7.4. (9) The small molecule is O=C([O-])C1=CS[C@@H]2/C(=C\c3cnc4n3CCNC4)C(=O)N12. The target protein sequence is MSLNVKPSRIAILFSSCLVSISFFSQANTKGIDEIKDLETDFNGRIGVYALDTGSGKSFSYKANERFPLCSSFKGFLAAAVLKGSQDNQLNLNQIVNYNTRSLEFHSPITTKYKDNGMSLGDMAAAALQYSDNGATNIILERYIGGPEGMTKFMRSIGDKDFRLDRWELDLNTAIPGDERDTSTPAAVAKSLKTLALGNILNEREKETYQTWLKGNTTGAARIRASVPSDWVVGDKTGSCGAYGTANDYAVVWPKNRAPLIISVYTTKNEKEAKHEDKVIAEASRIAIDNLK. The pIC50 is 8.3. (10) The drug is CC1(C)CC[C@]2(C(=O)O)CC[C@]3(C)C(=CC[C@@H]4[C@@]5(C)C[C@@H](O)[C@H](O)C(C)(C)[C@@H]5CC[C@]43C)[C@@H]2C1. The target protein (P09811) has sequence MAKPLTDQEKRRQISIRGIVGVENVAELKKGFNRHLHFTLVKDRNVATPRDYYFALAHTVRDHLVGRWIRTQQHYYDKCPKRVYYLSLEFYMGRTLQNTMINLGLQNACDEAIYQLGLDMEELEEIEEDAGLGNGGLGRLAACFLDSMATLGLAAYGYGIRYEYGIFNQKIREGWQVEEADDWLRHGNPWEKARPEFMLPVHFYGRVEHTQAGTKWVDTQVVLALPYDTPVPGYMNNTVNTMRLWSARAPNDFNLQDFNVGDYIQAVLDRNLAENISRVLYPNDNFFEGKELRLKQEYFVVAATLQDVIRRFKASKFGSKDGVGTVFDAFPDQVAIQLNDTHPALAIPELMRIFVDIEKLPWSKAWEITKKTFAYTNHTVLPEALERWPVDLVEKLLPRHLQIIYEINQKHLDRIVALFPKDIDRMRRMSLIEEEGGKRINMAHLCIVGCHAVNGVAKIHSDIVKTQVFKDFSELEPDKFQNKTNGITPRRWLLLCNPGL.... The pIC50 is 4.0.
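
The pIC50 values above follow the definition given at the top of the record, pIC50 = -log10(IC50 in M). As a minimal sketch of that conversion in both directions (the function names `ic50_to_pic50` and `pic50_to_ic50_nm` are hypothetical helpers for illustration, not part of BindingDB or of this dataset):

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        # log10 is undefined for non-positive concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 expressed in nanomolar."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # 1 M = 1e9 nM


if __name__ == "__main__":
    # An IC50 of 100 nM is 1e-7 M.
    print(ic50_to_pic50(1e-7))
    # A pIC50 of 6.0 corresponds to roughly 1000 nM (1 uM).
    print(pic50_to_ic50_nm(6.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit between two examples corresponds to a tenfold difference in IC50.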