Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (Q67344) has sequence MQIAILVTTVTLHFNQYECDSLADNQVMPCEPIIIERNITEIIYLTNTTIEKEICPKLMEYRNWSRPQCKITGFAPFSKDNSIRLSAGGDIWVTREPYVSCDPGKCYQFALGQGTTLDNKHSNDTIHDRIPHRTLLMNELGVPFHLGTRQVCIAWSSSSCHDGKAWLHVCVTGDDKNATASFIYDGRLVDSMGSWSQNILRTQESECVCINGTCTVVMTDGSASGRADTRILFIEEGKIVHISPLSGSAQHVEECSCYPRYPSVRCICRDNWKGSNRPIVDINIKDYSIDSRYVCSGLVGDTPRNNDRSSSSDCKNPNNDKGNHGVKGWAFDDGNDVWMGRTISKDSRSGYETFKVIDGWSTPNSKSQINRQVIVDRDNRSGYSGIFSVESKGCINRCFYVELIRGRKQETRVWWTSSSIVVFCGTSGTYGKGSWPDGANINFMPI. The compound is CCOC(=O)Cc1csc(NC(=O)[C@@H](N)[C@H](C)CC)n1. The pIC50 is 3.9. (2) The small molecule is NCCCCCC(CCN)SC[C@H]1O[C@@H](n2cnc3c(N)ncnc32)[C@H](O)[C@@H]1O. The target protein (P52788) has sequence MAAARHSTLDFMLGAKADGETILKGLQSIFQEQGMAESVHTWQDHGYLATYTNKNGSFANLRIYPHGLVLLDLQSYDGDAQGKEEIDSILNKVEERMKELSQDSTGRVKRLPPIVRGGAIDRYWPTADGRLVEYDIDEVVYDEDSPYQNIKILHSKQFGNILILSGDVNLAESDLAYTRAIMGSGKEDYTGKDVLILGGGDGGILCEIVKLKPKMVTMVEIDQMVIDGCKKYMRKTCGDVLDNLKGDCYQVLIEDCIPVLKRYAKEGREFDYVINDLTAVPISTSPEEDSTWEFLRLILDLSMKVLKQDGKYFTQGNCVNLTEALSLYEEQLGRLYCPVEFSKEIVCVPSYLELWVFYTVWKKAKP. The pIC50 is 7.3. (3) The compound is CC(=O)Nc1cc(Oc2ccc3c(C(=O)Nc4ccc(CN5CCN(C)CC5)c(C(F)(F)F)c4)cccc3c2)ncn1. The target protein sequence is MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRXDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFLHSHRVLHRDLKPQNLLINTEGAIKLCDFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL. The pIC50 is 5.0. 
(4) The target protein sequence is MALSDLVLLRWLRDSRHSRKLILFIVFLALLLDNMLLTVVVPIIPSYLYSIKHEKNSTEIQTTRPELVVSTSESIFSYYNNSTVLITGNATGTLPGGQSHKATSTQHTVANTTVPSDCPSEDRDLLNENVQVGLLFASKATVQLLTNPFIGLLTNRIGYPIPMFAGFCIMFISTVMFAFSSSYAFLLIARSLQGIGSSCSSVAGMGMLASVYTDDEERGKPMGIALGGLAMGVLVGPPFGSVLYEFVGKTAPFLVLAALVLLDGAIQLFVLQPSRVQPESQKGTPLTTLLKDPYILIAAGSICFANMAIAMLEPALPIWMMETMCSRKWQLGVAFLPASISYLIGTNIFGILAHKMGRWLCALLGMVIVGISILCIPFAKNIYGLIAPNFGVGFAIGMVDSSMMPIMGYLVDLRHVSVYGSVYAIADVAFCMGYAIGPSAGGAIAKAIGFPWLMTIIGIIDIAFAPLCFFLRSPPAKEEKMAILMDHNCPIKRKMYTQNN.... The pIC50 is 8.3. The small molecule is COC(=O)[C@H]1[C@H]2C[C@@H]3c4[nH]c5cc(OC)ccc5c4CCN3C[C@H]2C[C@@H](OC(=O)c2cc(OC)c(OC)c(OC)c2)[C@@H]1OC. (5) The compound is C[C@@H](NC(=O)Cc1ccc(C2CC2)cc1)c1ccc(OCC(F)(F)F)cn1. The target protein (Q15878) has sequence MARFGEAVVARPGSGDGDSDQSRNRQGTPVPASGQAAAYKQTKAQRARTMALYNPIPVRQNCFTVNRSLFIFGEDNIVRKYAKKLIDWPPFEYMILATIIANCIVLALEQHLPEDDKTPMSRRLEKTEPYFIGIFCFEAGIKIVALGFIFHKGSYLRNGWNVMDFIVVLSGILATAGTHFNTHVDLRTLRAVRVLRPLKLVSGIPSLQIVLKSIMKAMVPLLQIGLLLFFAILMFAIIGLEFYSGKLHRACFMNNSGILEGFDPPHPCGVQGCPAGYECKDWIGPNDGITQFDNILFAVLTVFQCITMEGWTTVLYNTNDALGATWNWLYFIPLIIIGSFFVLNLVLGVLSGEFAKERERVENRRAFMKLRRQQQIERELNGYRAWIDKAEEVMLAEENKNAGTSALEVLRRATIKRSRTEAMTRDSSDEHCVDISSVGTPLARASIKSAKVDGVSYFRHKERLLRISIRHMVKSQVFYWIVLSLVALNTACVAIVHHNQ.... The pIC50 is 4.5. (6) The small molecule is Nc1ncnc2c1nc(NCc1ccc(-c3ccccc3)c(OCCCO)c1)n2[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O. The target protein (Q62773) has sequence MAKSEGRKSASQDTSENGMENPGLELMEVGNLEQGKTLEEVTQGHSLKDGLGHSSLWRRILQPFTKARSFYQRHAGLFKKILLGLLCLAYAAYLLAACILNFRRALALFVITCLVIFILACHFLKKFFAKKSIRCLKPLKNTRLRLWLKRVFMGAAVVGLILWLALDTAQRPEQLISFAGICMFILILFACSKHHSAVSWRTVFWGLGLQFVFGILVIRTEPGFNAFQWLGDQIQIFLAYTVEGSSFVFGDTLVQSVFAFQSLPIIIFFGCVMSILYYLGLVQWVIQKIAWFLQITMGTTAAETLAVAGNIFVGMTEAPLLIRPYLADMTLSEIHAVMTGGFATIAGTVLGAFISFGIDASSLISASVMAAPCALALSKLVYPEVEESKFKSKEGVKLPRGEERNILEAASNGATDAIALVANVAANLIAFLAVLAFINSTLSWLGEMVDIHGLTFQVICSYVLRPMVFMMGVQWADCPLVAEIVGVKFFINEFVAYQQL.... The pIC50 is 4.3. 
(7) The compound is CC12CCC(=O)C=C1CCC1C2CCC2(C)C1CC[C@@H]2C(=O)O. The target protein (O35627) has sequence MTAMLTLETMASEEEYGPRNCVVCGDRATGYHFHALTCEGCKGFFRRTVSKTIGPICPFAGRCEVSKAQRRHCPACRLQKCLNVGMRKDMILSAEALALRRARQAQRRAEKASLQLNQQQKELVQILLGAHTRHVGPMFDQFVQFKPPAYLFMHHRPFQPRGPVLPLLTHFADINTFMVQQIIKFTKDLPLFRSLTMEDQISLLKGAAVEILHISLNTTFCLQTENFFCGPLCYKMEDAVHAGFQYEFLESILHFHKNLKGLHLQEPEYVLMAATALFSPDRPGVTQREEIDQLQEEMALILNNHIMEQQSRLQSRFLYAKLMGLLADLRSINNAYSYELQRLEELSAMTPLLGEICS. The pIC50 is 4.4. (8) The drug is C/C1=C\[C@@H](C)[C@@H](C)OC(=O)C[C@H](c2ccc(O)cc2)NC(=O)[C@@H](Cc2c[nH]c3ccccc23)N(C)C(=O)[C@H](C)NC(=O)[C@@H](C)C1. The target protein (P60010) has sequence MDSEVAALVIDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGIMVGMGQKDSYVGDEAQSKRGILTLRYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPMNPKSNREKMTQIMFETFNVPAFYVSIQAVLSLYSSGRTTGIVLDSGDGVTHVVPIYAGFSLPHAILRIDLAGRDLTDYLMKILSERGYSFSTTAEREIVRDIKEKLCYVALDFEQEMQTAAQSSSIEKSYELPDGQVITIGNERFRAPEALFHPSVLGLESAGIDQTTYNSIMKCDVDVRKELYGNIVMSGGTTMFPGIAERMQKEITALAPSSMKVKIIAPPERKYSVWIGGSILASLTTFQQMWISKQEYDESGPSIVHHKCF. The pIC50 is 4.8.
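The pIC50 definition used in these records, pIC50 = -log10(IC50 in M), can be illustrated with a short Python sketch. It assumes the raw IC50 is available in nanomolar; that unit and the function names `pic50_from_ic50_nm` and `ic50_nm_from_pic50` are assumptions made for illustration, not part of the dataset.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50 = -log10(IC50 in M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the definition: recover the IC50 in nanomolar from a pIC50."""
    ic50_molar = 10.0 ** (-pic50)
    return ic50_molar * 1e9  # M -> nM


if __name__ == "__main__":
    # pIC50 labels taken from examples (1), (2) and (4) above.
    for label in (3.9, 7.3, 8.3):
        print(f"pIC50 {label} corresponds to IC50 of {ic50_nm_from_pic50(label):.3g} nM")
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50, so the gap between the labels 3.9 and 8.3 above spans more than four orders of magnitude in concentration.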