This data comes from BindingDB drug-target binding records based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is O=C(NCCCCNc1c2c(nc3ccccc13)CCCC2)c1cccc(F)c1. The target protein sequence is MQSWGTIICIRILLRFLLLWVLIGNSHTEEDIIITTKNGKVRGMNLPVLGGTVTAFLGIPYAQPPLGRLRFKKPQSLTKWSNIWNATKYANSCYQNTDQSFPGFLGSEMWNPNTELSEDCLYLNVWIPAPKPKNATVMIWIYGGGFQTGTSSLPVYDGKFLARVERVIVVSMNYRVGALGFLALSENPEAPGNMGLFDQQLALQWVQKNIAAFGGNPRSVTLFGESAGAASVSLHLLSPRSQPLFTRAILQSGSSNAPWAVTSLYEARNRTLTLAKRMGCSRDNETEMIKCLRDKDPQEILLNEVFVVPYDTLLSVNFGPTVDGDFLTDMPDTLLQLGQFKRTQILVGVNKDEGTAFLVYGAPGFSKDNNSIITRKEFQEGLKIFFPRVSEFGRESILFHYMDWLDDQRAENYREALDDVVGDYNIICPALEFTKKFSELGNDAFFYYFEHRSTKLPWPEWMGVMHGYEIEFVFGLPLERRVNYTKAEEILSRSIMKRWA.... The pIC50 is 8.9. (2) The drug is OCCN1CCN(CCCN2c3ccccc3Sc3ccc(Cl)cc32)CC1. The target protein (Q9Z0U5) has sequence MDPPQLLFYVNGQKVVENNVDPEMMLLPYLRKNLRLTGTKYGCGGGGCGACTVMISRYNPSTKSIRHHPVNACLTPICSLYGTAVTTVEGIGNTRTRLHPVQERIAKCHSTQCGFCTPGMVMSMYALLRNHPEPSLDQLTDALGGNLCRCTGYRPIIDACKTFCRASGCCESKENGVCCLDQGINGSAEFQEGDETSPELFSEKEFQPLDPTQELIFPPELMRIAEKQPPKTRVFYSNRMTWISPVTLEELVEAKFKYPGAPIVMGYTSVGPEVKFKGVFHPIIISPDRIEELSIINQTGDGLTLGAGLSLDQVKDILTDVVQKLPEETTQTYRALLKHLRTLAGSQIRNMASLGGHIVSRHLDSDLNPLLAVGNCTLNLLSKDGKRQIPLSEQFLRKCPDSDLKPQEVLVSVNIPCSRKWEFVSAFRQAQRQQNALAIVNSGMRVLFREGGGVIKELSILYGGVGPTTIGAKNSCQKLIGRPWNEEMLDTACRLVLDEV.... The pIC50 is 6.1. (3) The small molecule is O=C(Nc1ccc(S(=O)(=O)N2CCCC2)cc1)c1ccc(CN2CCc3ccccc3C2)cc1. The target protein (P01375) has sequence MSTESMIRDVELAEEALPKKTGGPQGSRRCLFLSLFSFLIVAGATTLFCLLHFGVIGPQREEFPRDLSLISPLAQAVRSSSRTPSDKPVAHVVANPQAEGQLQWLNRRANALLANGVELRDNQLVVPSEGLYLIYSQVLFKGQGCPSTHVLLTHTISRIAVSYQTKVNLLSAIKSPCQRETPEGAEAKPWYEPIYLGGVFQLEKGDRLSAEINRPDYLDFAESGQVYFGIIAL. The pIC50 is 6.2. 
(4) The small molecule is CO[C@H]1CC[C@]2(CC1)Cc1ccc(-c3ccc(C#N)s3)cc1C21N=C(N)N(C(C)C)C1=O. The target protein sequence is MAQALPWLLLWMGAGVLPAHGTQHGIRLPLRSGLGGAPLGLRLPRETDEEPEEPGRRGSFVEMVDNLRGKSGQGYYVEMTVGSPPQTLNILVDTGSSNFAVGAAPHPFLHRYYQRQLSSTYRDLRKGVYVPYTQGKWEGELGTDLPDDSLEPFFDSLVKQTHVPNLFSLQLCGAGFPLNQSEVLASVGGSMIIGGIDHSLYTGSLWYTPIRREWYYEVIIVRVEINGQDLKMDCKEYNYDKSIVDSGTTNLRLPKKVFEAAVKSIKAASSTEKFPDGFWLGEQLVCWQAGTTPWNIFPVISLYLMGEVTNQSFRITILPQQYLRPVEDVATSQDDCYKFAISQSSTGTVMGAVIMEGFYVVFDRARKRIGFAVSACHVHDEFRTAAVEGPFVTLDMEDCGYNIPQTDESTLMTIAYVMAAICALFMLPLCLMVCQWCCLRCLRQQHDDFADDISLLK. The pIC50 is 6.3. (5) The drug is COc1ccc(S(=O)(=O)N(CC(C)C)C[C@@H](O)[C@H](Cc2ccccc2)n2cc(COC(=O)N[C@H]3c4ccccc4C[C@H]3O)nn2)cc1. The target protein sequence is PQITLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMNLPGRWKPKMIGGIGGFIKVRQYDQILIEICGKKAIGTVLVGPTPVNIIGRNLLTQIGCTLNF. The pIC50 is 7.3. (6) The drug is CC(=O)C(C#N)C(=O)Nc1cc(Br)ccc1Br. The target protein (P70032) has sequence MAQVAGKKLTVAPEAAKPPGIPGSSSAVKEIPEILVDPRTRRRYLRGRFLGKGGFAKCYEITDLESREVFAGKIVPKTMLLKPHQKDKMTMEIAIQRSLDHRHVVGFHGFFEDNDFVYVVLELCRRRSLLELHKRRKAVTEPEARYYLKQTISGCQYLHSNRVIHRDLKLGNLFLNDEMEVKIGDFGLATKVEYDGERKKTLCGTPNYIAPEVLGKKGHSFEVDIWSIGCIMYTLLVGKPPFETSCLKETYMRIKKNEYSIPKHINPVAAALIQKMLRSDPTSRPTIDDLLNDEFFTSGYIPSRLPTTCLTVPPRFSIAPSTIDQSLRKPLTAINKGQDSPLVEKQVAPAKEEEMQQPEFTEPADCYLSEMLQQLTCLNAVKPSERALIRQEEAEDPASIPIFWISKWVDYSDKYGLGYQLCDNSVGVLFNDSTRLIMYNDGDSLQYIERNNTESYLNVRSYPTTLTKKITLLKYFRNYMSEHLLKAGANTTPREGDELA.... The pIC50 is 5.0. (7) The compound is CC#CC#CC/C=C\CCC/C=C/C=C/C(=O)NCC(C)C. The target protein (O35678) has sequence MPEASSPRRTPQNVPYQDLPHLVNADGQYLFCRYWKPSGTPKALIFVSHGAGEHCGRYDELAHMLKGLDMLVFAHDHVGHGQSEGERMVVSDFQVFVRDVLQHVDTIQKDYPDVPIFLLGHSMGGAISILVAAERPTYFSGMVLISPLVLANPESASTLKVLAAKLLNFVLPNMTLGRIDSSVLSRNKSEVDLYNSDPLVCRAGLKVCFGIQLLNAVARVERAMPRLTLPFLLLQGSADRLCDSKGAYLLMESSRSQDKTLKMYEGAYHVLHRELPEVTNSVLHEVNSWVSHRIAAAGAGCPP. The pIC50 is 4.0.
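
The pIC50 definition used in this record, pIC50 = -log10(IC50 in M), can be sketched in code. This is a minimal illustration, not part of the dataset tooling: the function names `ic50_nm_to_pic50` and `pic50_to_ic50_nm` are hypothetical, and the assumption that raw IC50 values arrive in nanomolar is ours, not stated in the record.

```python
import math

# Assumption: raw IC50 values are given in nanomolar (nM).
NM_TO_M = 1e-9  # 1 nM = 1e-9 M


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 = -log10(IC50 in M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_nm * NM_TO_M)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Invert the transform: recover the IC50 in nM from a pIC50 value."""
    return 10.0 ** (-pic50) / NM_TO_M


if __name__ == "__main__":
    # pIC50 labels taken from examples (1) and (7) of this record.
    for label in (8.9, 4.0):
        print(f"pIC50 {label} corresponds to IC50 = {pic50_to_ic50_nm(label):.3g} nM")
```

Under this definition, the pIC50 of 8.9 in example (1) corresponds to an IC50 of about 1.26 nM, while the pIC50 of 4.0 in example (7) corresponds to 100 µM. Each additional pIC50 unit therefore means a tenfold lower IC50, which is why higher values indicate more potent binding.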