Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The compound is CS[C@@H]1O[C@H](CO)[C@H](O)[C@H](n2cc(C3(O)CCCCC3)nn2)[C@H]1O. The target protein (O00182) has sequence MAFSGSQAPYLSPAVPFSGTIQGGLQDGLQITVNGTVLSSSGTRFAVNFQTGFSGNDIAFHFNPRFEDGGYVVCNTRQNGSWGPEERKTHMPFQKGMPFDLCFLVQSSDFKVMVNGILFVQYFHRVPFHRVDTISVNGSVQLSYISFQNPRTVPVQPAFSTVPFSQPVCFPPRPRGRRQKPPGVWPANPAPITQTVIHTVQSAPGQMFSTPAIPPMMYPHPAYPMPFITTILGGLYPSKSILLSGTVLPSAQRFHINLCSGNHIAFHLNPRFDENAVVRNTQIDNSWGSEERSLPRKMPFVRGQSFSVWILCEAHCLKVAVDGQHLFEYYHRLRNLPTINRLEVGGDIQLTHVQT. The pKd is 3.0. (2) The drug is CC(C)[C@@H](C(=O)N[C@H](C)Cc1ccc(O)c(C(C)(C)C)c1)N(C)C(=O)[C@@H](N)Cc1ccccc1. The target protein sequence is MGSPWNGSDGPEDAREPPWAALPPCDERRCSPFPLGTLVPVTAVCLGLFAVGVSGNVVTVLLIGRYRDMRTTTNLYLGSMAVSDLLILLGLPFDLYRLWRSRPWVFGQLLCRLSLYVGEGCTYASLLHMTALSVERYLAICRPLRARVLVTRRRVRALIAALWAVALLSAGPFFFLVGVEQDPAVFAAPDRNGTVPLDPSSPAPASPPSGPGAEAAALFSRECRPSRAQLGLLRVMLWVTTAYFFLPFLCLSILYGLIARQLWRGRGPLRGPAATGRERGHRQTVRVLLVVVLAFIVCWLPFHVGRIIYINTQDSRMMYFSQYFNIVALQLFYLSASINPILYNLISKKYRAAARRLLRESRAGPSGVCGSRGPEQDVAGDTGGDTAGCTETSANTKTAA. The pKd is 8.4. (3) The compound is CC[C@H](C)[C@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)CCSC)C(=O)O. The target protein (P45040) has sequence MAIYADNSYSIGNTPLVRLKHFGHNGNVVVKIEGRNPSYSVKCRIGANMVWQAEKDGTLTKGKEIVDATSGNTGIALAYVAAARGYKITLTMPETMSLERKRLLCGLGVNLVLTEGAKGMKGAIAKAEEIVASDPSRYVMLKQFENPANPQIHRETTGPEIWKDTDGKVDVVVAGVGTGGSITGISRAIKLDFGKQITSVAVEPVESPVISQTLAGEEVKPGPHKIQGIGAGFIPKNLDLSIIDRVETVDSDTALATARRLMAEEGILAGISSGAAVAAADRLAKLPEFADKLIVVILPSASERYLSTALFEGIEG. The pKd is 4.6. (4) The compound is Cn1cc(C2=C(c3cn(C4CCN(Cc5ccccn5)CC4)c4ccccc34)C(=O)NC2=O)c2ccccc21. 
The target protein (Q9UF33) has sequence MGGCEVREFLLQFGFFLPLLTAWPGDCSHVSNNQVVLLDTTTVLGELGWKTYPLNGWDAITEMDEHNRPIHTYQVCNVMEPNQNNWLRTNWISRDAAQKIYVEMKFTLRDCNSIPWVLGTCKETFNLFYMESDESHGIKFKPNQYTKIDTIAADESFTQMDLGDRILKLNTEIREVGPIERKGFYLAFQDIGACIALVSVRVFYKKCPFTVRNLAMFPDTIPRVDSSSLVEVRGSCVKSAEERDTPKLYCGADGDWLVPLGRCICSTGYEEIEGSCHACRPGFYKAFAGNTKCSKCPPHSLTYMEATSVCQCEKGYFRAEKDPPSMACTRPPSAPRNVVFNINETALILEWSPPSDTGGRKDLTYSVICKKCGLDTSQCEDCGGGLRFIPRHTGLINNSVIVLDFVSHVNYTFEIEAMNGVSELSFSPKPFTAITVTTDQDAPSLIGVVRKDWASQNSIALSWQAPAFSNGAILDYEIKYYEKEHEQLTYSSTRSKAPSV.... The pKd is 5.0. (5) The drug is CO[C@@H]1[C@H](N(C)C(=O)c2ccccc2)C[C@H]2O[C@]1(C)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)C(=O)NC4. The target protein (Q96L34) has sequence MSSRTVLAPGNDRNSDTHGTLGSGRSSDKGPSWSSRSLGARCRNSIASCPEEQPHVGNYRLLRTIGKGNFAKVKLARHILTGREVAIKIIDKTQLNPSSLQKLFREVRIMKGLNHPNIVKLFEVIETEKTLYLVMEYASAGEVFDYLVSHGRMKEKEARAKFRQIVSAVHYCHQKNIVHRDLKAENLLLDAEANIKIADFGFSNEFTLGSKLDTFCGSPPYAAPELFQGKKYDGPEVDIWSLGVILYTLVSGSLPFDGHNLKELRERVLRGKYRVPFYMSTDCESILRRFLVLNPAKRCTLEQIMKDKWINIGYEGEELKPYTEPEEDFGDTKRIEVMVGMGYTREEIKESLTSQKYNEVTATYLLLGRKTEEGGDRGAPGLALARVRAPSDTTNGTSSSKGTSHSKGQRSSSSTYHRQRRHSDFCGPSPAPLHPKRSPTSTGEAELKEERLPGRKASCSTAGSGSRGLPPSSPMVSSAHNPNKAEIPERRKDSTSTPNN.... The pKd is 6.4. (6) The drug is CC(C)CCOc1cc(OCCCN)cc(-c2ccc(CCC(=O)O)c(CCC(N)=O)c2)c1. The target protein (Q99062) has sequence MARLGNCSLTWAALIILLLPGSLEECGHISVSAPIVHLGDPITASCIIKQNCSHLDPEPQILWRLGAELQPGGRQQRLSDGTQESIITLPHLNHTQAFLSCCLNWGNSLQILDQVELRAGYPPAIPHNLSCLMNLTTSSLICQWEPGPETHLPTSFTLKSFKSRGNCQTQGDSILDCVPKDGQSHCCIPRKHLLLYQNMGIWVQAENALGTSMSPQLCLDPMDVVKLEPPMLRTMDPSPEAAPPQAGCLQLCWEPWQPGLHINQKCELRHKPQRGEASWALVGPLPLEALQYELCGLLPATAYTLQIRCIRWPLPGHWSDWSPSLELRTTERAPTVRLDTWWRQRQLDPRTVQLFWKPVPLEEDSGRIQGYVVSWRPSGQAGAILPLCNTTELSCTFHLPSEAQEVALVAYNSAGTSRPTPVVFSESRGPALTRLHAMARDPHSLWVGWEPPNPWPQGYVIEWGLGPPSASNSNKTWRMEQNGRATGFLLKENIRPFQLY.... The pKd is 4.7.
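The pKd definition stated in the task description (pKd = -log10(Kd in M)) can be sketched in code as follows. This is a minimal illustration, not part of the dataset tooling: the function names `kd_to_pkd` and `pkd_to_kd` are hypothetical, and the input is assumed to already be in molar units.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd).

    Hypothetical helper; assumes kd_molar is expressed in molar units.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in M")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd (in M) = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # Label values taken from records (1) and (2) above.
    for pkd in (3.0, 8.4):
        print(f"pKd {pkd} -> Kd = {pkd_to_kd(pkd):.3g} M")
```

Under this definition, the pKd of 3.0 in record (1) corresponds to a Kd of about 1 mM (weak binding), while the pKd of 8.4 in record (2) corresponds to a Kd of roughly 4 nM (strong binding), consistent with "higher means stronger binding".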