Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The target protein (P54687) has sequence MKDCSNGCSAECTGEGGSKEVVGTFKAKDLIVTPATILKEKPDPNNLVFGTVFTDHMLTVEWSSEFGWEKPHIKPLQNLSLHPGSSALHYAVELFEGLKAFRGVDNKIRLFQPNLNMDRMYRSAVRATLPVFDKEELLECIQQLVKLDQEWVPYSTSASLYIRPTFIGTEPSLGVKKPTKALLFVLLSPVGPYFSSGTFNPVSLWANPKYVRAWKGGTGDCKMGGNYGSSLFAQCEAVDNGCQQVLWLYGEDHQITEVGTMNLFLYWINEDGEEELATPPLDGIILPGVTRRCILDLAHQWGEFKVSERYLTMDDLTTALEGNRVREMFGSGTACVVCPVSDILYKGETIHIPTMENGPKLASRILSKLTDIQYGREESDWTIVLS. The drug is CCCCc1cc(=O)n2nc(NCc3c(F)cc(Cl)cc3F)c(C#N)c2[nH]1. The pIC50 is 7.2. (2) The compound is Cc1ccc(-c2c(CN)c(CC(C)C)nc3ccc(N4CC(=O)NCC4=O)cc23)cc1. The target protein (Q9EPB1) has sequence MGLHPCSPVDHGVPSWVLVLLLTLGLCSLQATADSVLDPDFRENYFEQYMDHFNFESFSNKTFGQRFLVSDKFWKMGEGPIFFYTGNEGDIWSLANNSGFIVELAAQQEALLVFAEHRYYGKSLPFGVQSTQRGYTQLLTVEQALADFAVLLQALRHNLGVQDAPTIAFGGSYGGMLSAYMRMKYPHLVAGALAASAPVIAVAGLGNPDQFFRDVTADFYGQSPKCAQAVRDAFQQIKDLFLQGAYDTISQNFGTCQSLSSPKDLTQLFGFARNAFTVLAMMDYPYPTNFLGPLPANPVKVGCERLLSEGQRIMGLRALAGLVYNSSGMEPCFDIYQMYQSCADPTGCGTGSNARAWDYQACTEINLTFDSNNVTDMFPEIPFSDELRQQYCLDTWGVWPRPDWLQTSFWGGDLKAASNIIFSNGDLDPWAGGGIQRNLSTSIIAVTIQGGAHHLDLRASNSEDPPSVVEVRKLEATLIREWVAAARLKQPAEAQWPGPK.... The pIC50 is 4.7. (3) The drug is O=C(NCCN1CCC(n2c(=O)[nH]c3ccccc32)CC1)c1ccc(F)cc1. 
The target protein (O14939) has sequence MTATPESLFPTGDELDSSQLQMESDEVDTLKEGEDPADRMHPFLAIYELQSLKVHPLVFAPGVPVTAQVVGTERYTSGSKVGTCTLYSVRLTHGDFSWTTKKKYRHFQELHRDLLRHKVLMSLLPLARFAVAYSPARDAGNREMPSLPRAGPEGSTRHAASKQKYLENYLNRLLTMSFYRNYHAMTEFLEVSQLSFIPDLGRKGLEGMIRKRSGGHRVPGLTCCGRDQVCYRWSKRWLVVKDSFLLYMCLETGAISFVQLFDPGFEVQVGKRSTEARHGVRIDTSHRSLILKCSSYRQARWWAQEITELAQGPGRDFLQLHRHDSYAPPRPGTLARWFVNGAGYFAAVADAILRAQEEIFITDWWLSPEVYLKRPAHSDDWRLDIMLKRKAEEGVRVSILLFKEVELALGINSGYSKRALMLLHPNIKVMRHPDQVTLWAHHEKLLVVDQVVAFLGGLDLAYGRWDDLHYRLTDLGDSSESAASQPPTPRPDSPATPDLS.... The pIC50 is 5.8. (4) The compound is C[C@]12C(=O)OC(=O)[C@@]1(C)[C@@H]1CC[C@H]2O1. The target protein (Q76MZ3) has sequence MAAADGDDSLYPIAVLIDELRNEDVQLRLNSIKKLSTIALALGVERTRSELLPFLTDTIYDEDEVLLALAEQLGTFTTLVGGPEYVHCLLPPLESLATVEETVVRDKAVESLRAISHEHSPSDLEAHFVPLVKRLAGGDWFTSRTSACGLFSVCYPRVSSAVKAELRQYFRNLCSDDTPMVRRAAASKLGEFAKVLELDNVKSEIIPMFSNLASDEQDSVRLLAVEACVNIAQLLPQEDLEALVMPTLRQAAEDKSWRVRYMVADKFTELQKAVGPEITKTDLVPAFQNLMKDCEAEVRAAASHKVKEFCENLSADCRENVIMTQILPCIKELVSDANQHVKSALASVIMGLSPILGKDNTIEHLLPLFLAQLKDECPEVRLNIISNLDCVNEVIGIRQLSQSLLPAIVELAEDAKWRVRLAIIEYMPLLAGQLGVEFFDEKLNSLCMAWLVDHVYAIREAATSNLKKLVEKFGKEWAHATIIPKVLAMSGDPNYLHRMT.... The pIC50 is 5.1. (5) The target protein sequence is MRFKKISCLLLSPLFIFSTSIYAGNTPKDQEIKKLVDQNFKPLLEKYDVPGMAVGVIQNNKKYEIYYGLQSVQDKKAVNSSTIFELGSVSKLFTATAGGYAKTKGTISFKDTPGKYWKELKNTPIDQVNLLQLATYTSGNLALQFPDEVQTDQQVLTFFKDWKPKNPIGEYRQYSNPSIGLFGKVVALSMNKPFDQVLEKTIFPGLSLKHSYVNVPKTQMQNYAFGYNQENQPIRVNPGPLDAPAYGVKSTLPDMLKFINANLNPQKYPADIQRAINETHQGFYQVGTMYQALGWEEFSYPAPLQTLLDSNSEQIVMKPNKVTAISKEPSVKMFHKTGSTNGFGTYVVFIPKENIGLVMLTNKRIPNEERFKAAYAVLNAIKK. The compound is CC1(C)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The pIC50 is 6.0. (6) The pIC50 is 7.1. The target protein (P14422) has sequence HLLDFRKMIRYTTGKEATTSYGAYGCHCGVGGRGAPKAKFLSYKFSMKKAAAACFKYQFYPNNRCG. The drug is CCCCCCCCCCCC(=O)c1c(O)c(C(=O)CCCCCCCCCCC)c(O)c(C2CC(c3ccc(O)cc3)Oc3cc(O)ccc32)c1O.
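
The pIC50 definition in the task description, pIC50 = -log10(IC50 in M), can be sketched in code as follows. This is a minimal illustration only: the function names are hypothetical and are not part of the dataset, and the nanomolar helper assumes that raw IC50 values are reported in nM, which the text above does not state.

```python
import math


def pic50_from_ic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (assumed unit)."""
    return pic50_from_ic50(ic50_nm * 1e-9)


def ic50_from_pic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A label of 7.2, as in example (1), corresponds to an IC50 in M of:
    print(ic50_from_pic50(7.2))
    # An IC50 of 1 micromolar maps to a pIC50 of about 6:
    print(pic50_from_ic50(1e-6))
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold change in IC50, so the gap between the labels 7.2 and 4.7 in examples (1) and (2) is a factor of roughly 300 in concentration.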