This is drug-target binding data from BindingDB, using Kd measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The compound is NS(=O)(=O)c1ccc(C(=O)NCc2cccc(F)c2F)cc1. The target protein sequence is SHHWGYGKHNGPEHWHKDFPIAKGERQSPVDIDTHTAKYDPSLKPLSVSYDQATSLRILNNGHAFNVEFDDSQDKAVLKGGPLDGTYRLIQFHFHWGSLDGQGSEHTVDKKKYAAELHLVHWNTKYGDVGKAVQQPDGLAVLGIFLKVGSAKPGLQKVVDVLDSIKTKGKSADFTNFDPRGLLPESLDYWTYPGSLTTPPLLECVTWIVLKEPISVSSEQVLKFRKLNFNGEGEPEELMVDNWRPAQPLKNRQIKASFK. The pKd is 8.8. (2) The drug is COc1cc2cc(-c3cccc(-c4ccc(-c5ccccc5)cc4)c3)[n+](C)c(C)c2cc1OC. The target protein (P0A031) has sequence MLEFEQGFNHLATLKVIGVGGGGNNAVNRMIDHGMNNVEFIAINTDGQALNLSKAESKIQIGEKLTRGLGAGANPEIGKKAAEESREQIEDAIQGADMVFVTSGMGGGTGTGAAPVVAKIAKEMGALTVGVVTRPFSFEGRKRQTQAAAGVEAMKAAVDTLIVIPNDRLLDIVDKSTPMMEAFKEADNVLRQGVQGISDLIAVSGEVNLDFADVKTIMSNQGSALMGIGVSSGENRAVEAAKKAISSPLLETSIVGAQGVLMNITGGESLSLFEAQEAADIVQDAADEDVNMIFGTVINPELQDEIVVTVIATGFDDKPTSHGRKSGSTGFGTSVNTSSNATSKDESFTSNSSNAQATDSVSERTHTTKEDDIPSFIRNREERRSRRTRR. The pKd is 5.7. (3) The small molecule is COc1ccc(COc2ccc(Cc3cnc(N)nc3N)cc2OC)cc1. The target is PFCDPK1(Pfalciparum). The pKd is 5.0. (4) The small molecule is N=C(N)NCCCC(NC(=O)C1CC2CCCCC2N1C(=O)C1Cc2ccccc2CN1C(=O)C(Cc1cccs1)NC(=O)CCCCCCCCCN)C(=O)O. The target protein (Q28642) has sequence MLNITSQVLAPALNGSVSQSSGCPNTEWSGWLNVIQAPFLWVLFVLATLENLFVLSVFCLHKSSCTVAEVYLGNLAAADLILACGLPFWAVTIANHFDWLFGEALCRVVNTMIYMNLYSSICFLMLVSIDRYLALVKTMSIGRMRRVRWAKLYSLVIWGCTLLLSSPMLVFRTMKDYRDEGYNVTACIIDYPSRSWEVFTNVLLNLVGFLLPLSVITFCTVQILQVLRNNEMQKFKEIQTERRATVLVLAVLLLFVVCWLPFQVSTFLDTLLKLGVLSSCWDEHVIDVITQVGSFMGYSNSCLNPLVYVIVGKRFRKKSREVYRAACPKAGCVLEPVQAESSMGTLRTSISVERQIHKLPEWTRSSQ. The pKd is 6.3.
(5) The target protein sequence is MKQKPAFIPYAGAQFEPEEMLSKSAEYYQFMDHRRTVREFSNRAIPLEVIENIVMTASTAPSGAHKQPWTFVVVSDPQIKAKIRQAAEKEEFESYNGRMSNEWLEDLQPFGTDWHKPFLEIAPYLIVVFRKAYDVLPDGTQRKNYYVQESVGIACGFLLAAIHQAGLVALTHTPSPMNFLQKILQRPENERPFLLVPVGYPAEGAMVPDLQRKDKAAVMVVYHHHHHH. The pKd is 4.0. The compound is Nc1ccc(O)c(I)c1. (6) The small molecule is NC(=O)c1ncn([C@@H]2O[C@H](COP(=O)(O)O)[C@@H](O)[C@H]2O)c1N. The target protein (P62959) has sequence MADEIAKAQVAQPGGDTIFGKIIRKEIPAKIIFEDDRCLAFHDISPQAPTHFLVIPKKHISQISVADDDDESLLGHLMIVGKKCAADLGLKRGYRMVVNEGADGGQSVYHIHLHVLGGRQMNWPPG. The pKd is 2.9. (7) The small molecule is CC(O)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCNC(N)=[NH2+])NC(=O)[C@H](CCCCNC(=O)C[C@@H](NC(=O)CCCCCNC(=O)[C@H]1O[C@@H](n2cc(I)c3c(N)ncnc32)[C@H](O)[C@@H]1O)C(=O)[O-])NC(=O)[C@H](CCCNC(N)=[NH2+])NC(=O)[C@H](C)[NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](CCCCNC(=O)c1ccc(-c2c3ccc(=[N+](C)C)cc-3oc3cc(N(C)C)ccc23)c(C(=O)O)c1)C(N)=O. The target protein sequence is KGPVPFSHCLPTEKLQRCEKIGEGVFGEVFQTIADHTPVAIKIIAIEGPDLVNGSHQKTFEEILPEIIISKELSLLSGEVCNRTEGFIGLNSVHCVQGSYPPLLLKAWDHYNSTKGSANDRPDFFKDDQLFIVLEFEFGGIDLEQMRTKLSSLATAKSILHQLTASLAVAEASLRFEHRDLHWGNVLLKKTSLKKLHYTLNGKSSTIPSCGLQVSIIDYTLSRLERDGIVVFCDVSMDEDLFTGDGDYQFDIYRLMKKENNNRWGEYHPYSNVLWLHYLTDKMLKQMTFKTKCNTPAMKQIKRKIQEFHRTMLNFSSATDLLCQHSLFK. The pKd is 9.2. (8) The small molecule is CO[C@@H]1[C@H](N(C)C(=O)c2ccccc2)C[C@H]2O[C@]1(C)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)C(=O)N[C@H]4O. The target protein (O15197) has sequence MATEGAAQLGNRVAGMVCSLWVLLLVSSVLALEEVLLDTTGETSEIGWLTYPPGGWDEVSVLDDQRRLTRTFEACHVAGAPPGTGQDNWLQTHFVERRGAQRAHIRLHFSVRACSSLGVSGGTCRETFTLYYRQAEEPDSPDSVSSWHLKRWTKVDTIAADESFPSSSSSSSSSSSAAWAVGPHGAGQRAGLQLNVKERSFGPLTQRGFYVAFQDTGACLALVAVRLFSYTCPAVLRSFASFPETQASGAGGASLVAAVGTCVAHAEPEEDGVGGQAGGSPPRLHCNGEGKWMVAVGGCRCQPGYQPARGDKACQACPRGLYKSSAGNAPCSPCPARSHAPNPAAPVCPCLEGFYRASSDPPEAPCTGPPSAPQELWFEVQGSALMLHWRLPRELGGRGDLLFNVVCKECEGRQEPASGGGGTCHRCRDEVHFDPRQRGLTESRVLVGGLRAHVPYILEVQAVNGVSELSPDPPQAAAINVSTSHEVPSAVPVVHQVSRA.... The pKd is 5.0. 
(9) The drug is CCCCCO[C@H]1O[C@H](COS(=O)(=O)[O-])[C@@H](O[C@@H]2O[C@@H](C(=O)[O-])[C@@H](O[C@H]3O[C@H](COS(=O)(=O)[O-])[C@@H](O[C@@H]4O[C@@H](C(=O)[O-])[C@@H](O[C@H]5O[C@H](COS(=O)(=O)[O-])[C@@H](O[C@@H]6O[C@@H](C(=O)[O-])[C@@H](O)[C@H](O)[C@H]6OS(=O)(=O)[O-])[C@H](O)[C@H]5NC(C)=O)[C@H](O)[C@H]4OS(=O)(=O)[O-])[C@H](O)[C@H]3NC(C)=O)[C@H](O)[C@H]2OS(=O)(=O)[O-])[C@H](O)[C@H]1NC(C)=O.[Na+].[Na+].[Na+].[Na+].[Na+].[Na+].[Na+].[Na+].[Na+]. The target protein (Q15465) has sequence MLLLARCLLLVLVSSLLVCSGLACGPGRGFGKRRHPKKLTPLAYKQFIPNVAEKTLGASGRYEGKISRNSERFKELTPNYNPDIIFKDEENTGADRLMTQRCKDKLNALAISVMNQWPGVKLRVTEGWDEDGHHSEESLHYEGRAVDITTSDRDRSKYGMLARLAVEAGFDWVYYESKAHIHCSVKAENSVAAKSGGCFPGSATVHLEQGGTKLVKDLSPGDRVLAADDQGRLLYSDFLTFLDRDDGAKKVFYVIETREPRERLLLTAAHLLFVAPHNDSATGEPEASSGSGPPSGGALGPRALFASRVRPGQRVYVVAERDGDRRLLPAAVHSVTLSEEAAGAYAPLTAQGTILINRVLASCYAVIEEHSWAHRAFAPFRLAHALLAALAPARTDRGGDSGGGDRGGGGGRVALTAPGAADAPGAGATAGIHWYSQLLYQIGTWLLDSEALHPLGMAVKSS. The pKd is 4.6. (10) The drug is Cn1cnc2c(F)c(Nc3ccc(Br)cc3Cl)c(C(=O)NOCCO)cc21. The target protein (Q8IVW4) has sequence MEMYETLGKVGEGSYGTVMKCKHKNTGQIVAIKIFYERPEQSVNKIAMREIKFLKQFHHENLVNLIEVFRQKKKIHLVFEFIDHTVLDELQHYCHGLESKRLRKYLFQILRAIDYLHSNNIIHRDIKPENILVSQSGITKLCDFGFARTLAAPGDIYTDYVATRWYRAPELVLKDTSYGKPVDIWALGCMIIEMATGNPYLPSSSDLDLLHKIVLKVGNLSPHLQNIFSKSPIFAGVVLPQVQHPKNARKKYPKLNGLLADIVHACLQIDPADRISSSDLLHHEYFTRDGFIEKFMPELKAKLLQEAKVNSLIKPKESSKENELRKDERKTVYTNTLLSSSVLGKEIEKEKKPKEIKVRVIKVKGGRGDISEPKKKEYEGGLGQQDANENVHPMSPDTKLVTIEPPNPINPSTNCNGLKENPHCGGSVTMPPINLTNSNLMAANLSSNLFHPSVRLTERAKKRRTSSQSIGQVMPNSRQEDPGPIQSQMEKGIFNERTGH.... The pKd is 5.0.
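
The pKd definition used in this record, pKd = -log10(Kd in M), can be sketched in Python as follows. This is a minimal illustration and not part of the dataset: the function names `kd_to_pkd` and `pkd_to_kd` and the unit labels in `_UNIT_TO_MOLAR` are assumptions introduced here.

```python
import math

# Conversion factors from common concentration units to molar (M).
# The unit labels are illustrative assumptions, not fields of the dataset.
_UNIT_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def kd_to_pkd(kd, unit="M"):
    """Return pKd = -log10(Kd in M) for a positive Kd given in `unit`."""
    if kd <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd * _UNIT_TO_MOLAR[unit])


def pkd_to_kd(pkd):
    """Return Kd in molar for a given pKd (inverse of kd_to_pkd)."""
    return 10.0 ** (-pkd)
```

Under this definition, a pKd of 5.0 corresponds to a Kd of 10 micromolar, and each one-unit increase in pKd means a tenfold lower Kd, that is, tighter binding.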