Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The pIC50 is 8.0. The compound is CC[C@](OC)(C(=O)OC)c1cc(F)cc(OCc2ccc3c(ccc(=O)n3C)c2)c1. The target protein (P48999) has sequence MPSYTVTVATGSQWFAGTDDYIYLSLIGSAGCSEKHLLDKAFYNDFERGAVDSYDVTVDEELGEIYLVKIEKRKYWLHDDWYLKYITLKTPHGDYIEFPCYRWITGEGEIVLRDGRAKLARDDQIHILKQHRRKELEARQKQYRWMEWNPGFPLSIDAKCHKDLPRDIQFDSEKGVDFVLNYSKAMENLFINRFMHMFQSSWHDFADFEKIFVKISNTISERVKNHWQEDLMFGYQFLNGCNPVLIKRCTALPPKLPVTTEMVECSLERQLSLEQEVQEGNIFIVDYELLDGIDANKTDPCTHQFLAAPICLLYKNLANKIVPIAIQLNQTPGESNPIFLPTDSKYDWLLAKIWVRSSDFHVHQTITHLLRTHLVSEVFGIAMYRQLPAVHPLFKLLVAHVRFTIAINTKAREQLICEYGLFDKANATGGGGHVQMVQRAVQDLTYSSLCFPEAIKARGMDSTEDIPFYFYRDDGLLVWEAIQSFTMEVVSIYYENDQVV.... (2) The drug is CCCc1cc(C)[nH]c(=O)c1CNC(=O)c1cc(-c2ccnc(N3CCN(C)CC3)c2)cc2c1cnn2C(C)C. The target protein (Q9UMN6) has sequence MAAAAGGGSCPGPGSARGRFPGRPRGAGGGGGRGGRGNGAERVRVALRRGGGATGPGGAEPGEDTALLRLLGLRRGLRRLRRLWAGPRVQRGRGRGRGRGWGPSRGCVPEEESSDGESDEEEFQGFHSDEDVAPSSLRSALRSQRGRAPRGRGRKHKTTPLPPPRLADVAPTPPKTPARKRGEEGTERMVQALTELLRRAQAPQAPRSRACEPSTPRRSRGRPPGRPAGPCRRKQQAVVVAEAAVTIPKPEPPPPVVPVKHQTGSWKCKEGPGPGPGTPRRGGQSSRGGRGGRGRGRGGGLPFVIKFVSRAKKVKMGQLSLGLESGQGQGQHEESWQDVPQRRVGSGQGGSPCWKKQEQKLDDEEEEKKEEEEKDKEGEEKEERAVAEEMMPAAEKEEAKLPPPPLTPPAPSPPPPLPPPSTSPPPPLCPPPPPPVSPPPLPSPPPPPAQEEQEESPPPVVPATCSRKRGRPPLTPSQRAEREAARAGPEGTSPPTPTPS.... The pIC50 is 4.0. (3) The compound is Nc1nc(-c2cc3ccc(O)cc3oc2=O)cs1. 
The target protein (P9WHH1) has sequence MTAPPVHDRAHHPVRDVIVIGSGPAGYTAALYAARAQLAPLVFEGTSFGGALMTTTDVENYPGFRNGITGPELMDEMREQALRFGADLRMEDVESVSLHGPLKSVVTADGQTHRARAVILAMGAAARYLQVPGEQELLGRGVSSCATCDGFFFRDQDIAVIGGGDSAMEEATFLTRFARSVTLVHRRDEFRASKIMLDRARNNDKIRFLTNHTVVAVDGDTTVTGLRVRDTNTGAETTLPVTGVFVAIGHEPRSGLVREAIDVDPDGYVLVQGRTTSTSLPGVFAAGDLVDRTYRQAVTAAGSGCAAAIDAERWLAEHAATGEADSTDALIGAQR. The pIC50 is 4.0. (4) The compound is Cc1nn(C)c(C)c1N[S+](=O)([O-])c1ccc(CCCN2CCN(C)CC2)cc1. The target protein (Q9UVX3) has sequence MSDSKDRKGKAPEGQSSEKKDGAVNITPQMAESLLENNPALRNETAGMDKDKAAEAMRKMNIAELLTGLSVSGKNQKDMASYKFWQTQPVPRFDETSTDTGGPIKIIDPEKVSKEPDALLEGFEWATLDLTNETELQELWDLLTYHYVEDDNAMFRFRYSQSFLHWALMSPGWKKEWHVGVRATKSRKLVASICGVPTEINVRNQKLKVVEINFLCIHKKLRSKRLTPVLIKEITRRCYLNGIYQAIYTAGVVLPTPVSSCRYYHRPLDWLKLYEVGFSPLPAGSTKARQITKNHLPSTTSTPGLRPMEPKDIDTVHDLLQRYLSRFALNQAFTREEVDHWLVHKPETVKEQVVWAYVVEDPETHKITDFFSFYNLESTVIQNPKHDNVRAAYLYYYATETAFTNNMKALKERLLMLMNDALILAKKAHFDVFNALTLHDNPLFLEQLKFGAGDGQLHFYLYNYRTAPVPGGVNEKNLPDEKRMGGVGIVML. The pIC50 is 5.0. (5) The drug is CC(=O)N1CCN(C(=O)/C=C/c2ccc(Sc3ccc4c(c3)OCCO4)c(Cl)c2)CC1. The target protein (P20701) has sequence MKDSCITVMAMALLSGFFFFAPASSYNLDVRGARSFSPPRAGRHFGYRVLQVGNGVIVGAPGEGNSTGSLYQCQSGTGHCLPVTLRGSNYTSKYLGMTLATDPTDGSILACDPGLSRTCDQNTYLSGLCYLFRQNLQGPMLQGRPGFQECIKGNVDLVFLFDGSMSLQPDEFQKILDFMKDVMKKLSNTSYQFAAVQFSTSYKTEFDFSDYVKRKDPDALLKHVKHMLLLTNTFGAINYVATEVFREELGARPDATKVLIIITDGEATDSGNIDAAKDIIRYIIGIGKHFQTKESQETLHKFASKPASEFVKILDTFEKLKDLFTELQKKIYVIEGTSKQDLTSFNMELSSSGISADLSRGHAVVGAVGAKDWAGGFLDLKADLQDDTFIGNEPLTPEVRAGYLGYTVTWLPSRQKTSLLASGAPRYQHMGRVLLFQEPQGGGHWSQVQTIHGTQIGSYFGGELCGVDVDQDGETELLLIGAPLFYGEQRGGRVFIYQRR.... The pIC50 is 7.4. (6) The drug is C=CCOc1ccc(C[C@@H](COP(=O)(O)O)NC(=O)CCCCCCC/C=C\CCCCCCCC)cc1. 
The target protein (Q92633) has sequence MAAISTSIPVISQPQFTAMNEPQCFYNESIAFFYNRSGKHLATEWNTVSKLVMGLGITVCIFIMLANLLVMVAIYVNRRFHFPIYYLMANLAAADFFAGLAYFYLMFNTGPNTRRLTVSTWLLRQGLIDTSLTASVANLLAIAIERHITVFRMQLHTRMSNRRVVVVIVVIWTMAIVMGAIPSVGWNCICDIENCSNMAPLYSDSYLVFWAIFNLVTFVVMVVLYAHIFGYVRQRTMRMSRHSSGPRRNRDTMMSLLKTVVIVLGAFIICWTPGLVLLLLDVCCPQCDVLAYEKFFLLLAEFNSAMNPIIYSYRDKEMSATFRQILCCQRSENPTGPTEGSDRSASSLNHTILAGVHSNDHSVV. The pIC50 is 5.0. (7) The compound is O=C1C=CC(=C(c2ccc(O)cc2)c2ccc(O)cc2)C=C1. The target protein sequence is MAEPRQEFEVMEDHAGTYGLGDRKDQGGYTMHQDQEGDTDAGLKESPLQTPTEDGSEEPGSETSDAKSTPTAEDVTAPLVDEGAPGKQAAAQPHTEIPEGTTAEEAGIGDTPSLEDEAAGHVTQARMVSKSKDGTGSDDKKAKGADGKTKIATPRGAAPPGQKGQANATRIPAKTPPAPKTPPSSGEPPKSGDRSGYSSPGSPGTPGSRSRTPSLPTPPTREPKKVAVVRTPPKSPSSAKSRLQTAPVPMPDLKNVKSKIGSTENLKHQPGGGKVQIINKKLDLSNVQSKCGSKDNIKHVPGGGSVQIVYKPVDLSKVTSKCGSLGNIHHKPGGGQVEVKSEKLDFKDRVQSKIGSLDNITHVPGGGNKKIETHKLTFRENAKAKTDHGAEIVYKSPVVSGDTSPRHLSNVSSTGSIDMVDSPQLATLADEVSASLAKQGL. The pIC50 is 5.0.
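
The task description defines pIC50 as -log10(IC50 in M). The following is a minimal sketch of that conversion in both directions. The function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical and chosen here for illustration; they are not part of BindingDB or of any dataset tooling.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M).

    Hypothetical helper for illustration only.
    """
    if ic50_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # An IC50 of 10 nM (1e-8 M) corresponds to a pIC50 of about 8,
    # the value listed for record (1).
    print(round(ic50_to_pic50(1e-8), 3))
    # A pIC50 of 4.0, as in records (2) and (3), corresponds to about 1e-4 M (100 uM).
    print(pic50_to_ic50(4.0))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50, which is why higher values mean more potent compounds.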