Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The compound is Nc1nc2c(nc(Cl)n2[C@@H]2O[C@H](COP(=O)([O-])OP(=O)([O-])OP(=O)([O-])O)C(O)C2O)c(=O)[nH]1. The target protein (O66809) has sequence MEEFVNPCKIKVIGVGGGGSNAVNRMYEDGIEGVELYAINTDVQHLSTLKVPNKIQIGEKVTRGLGAGAKPEVGEEAALEDIDKIKEILRDTDMVFISAGLGGGTGTGAAPVIAKTAKEMGILTVAVATLPFRFEGPRKMEKALKGLEKLKESSDAYIVIHNDKIKELSNRTLTIKDAFKEVDSVLSKAVRGITSIVVTPAVINVDFADVRTTLEEGGLSIIGMGEGRGDEKADIAVEKAVTSPLLEGNTIEGARRLLVTIWTSEDIPYDIVDEVMERIHSKVHPEAEIIFGAVLEPQEQDFIRVAIVATDFPEEKFQVGEKEVKFKVIKKEEKEEPKEEPKPLSDTTYLEEEEIPAVIRRKNKRLL. The pKd is 7.1. (2) The compound is Clc1ccc(COC(Cn2ccnc2)c2ccc(Cl)cc2Cl)cc1. The target protein (P9WPN4) has sequence MTSVMSHEFQLATAETWPNPWPMYRALRDHDPVHHVVPPQRPEYDYYVLSRHADVWSAARDHQTFSSAQGLTVNYGELEMIGLHDTPPMVMQDPPVHTEFRKLVSRGFTPRQVETVEPTVRKFVVERLEKLRANGGGDIVTELFKPLPSMVVAHYLGVPEEDWTQFDGWTQAIVAANAVDGATTGALDAVGSMMAYFTGLIERRRTEPADDAISHLVAAGVGADGDTAGTLSILAFTFTMVTGGNDTVTGMLGGSMPLLHRRPDQRRLLLDDPEGIPDAVEELLRLTSPVQGLARTTTRDVTIGDTTIPAGRRVLLLYGSANRDERQYGPDAAELDVTRCPRNILTFSHGAHHCLGAAAARMQCRVALTELLARCPDFEVAESRIVWSGGSYVRRPLSVPFRVTS. The pKd is 5.7. (3) The drug is C[C@@H](O)[C@H](NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@@H](N)CCC(=O)O)C(=O)N[C@@H](Cc1ccc(OP(=O)(O)O)cc1)C(=O)N[C@@H](CC(=O)O)C(=O)NCC(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCC(=O)O)C(=O)O. The target protein sequence is SLEKHSWYHGPVSRNAAEYLLSSGINGSFLVRESESSPGQRSISLRYEGRVYHYRINTASDGKLYVSSESRFNTLAELVHHHSTVADGLITTLHYPAPKRNKPTVY. The pKd is 5.1. (4) The small molecule is CO[C@@H]1O[C@@H](COS(=O)(=O)O)[C@@H](O[C@@H]2O[C@@H](COS(=O)(=O)O)[C@H](OS(=O)(=O)O)[C@H](OS(=O)(=O)O)[C@@H]2OS(=O)(=O)O)[C@H](OS(=O)(=O)O)[C@@H]1OS(=O)(=O)O. 
The target protein (P09038) has sequence MVGVGGGDVEDVTPRPGGCQISGRGARGCNGIPGAAAWEAALPRRRPRRHPSVNPRSRAAGSPRTRGRRTEERPSGSRLGDRGRGRALPGGRLGGRGRGRAPERVGGRGRGRGTAAPRAAPAARGSRPGPAGTMAAGSITTLPALPEDGGSGAFPPGHFKDPKRLYCKNGGFFLRIHPDGRVDGVREKSDPHIKLQLQAEERGVVSIKGVCANRYLAMKEDGRLLASKCVTDECFFFERLESNNYNTYRSRKYTSWYVALKRTGQYKLGSKTGPGQKAILFLPMSAKS. The pKd is 4.8. (5) The drug is COc1cc2c(cc1OC)C1CC(=O)C(CC(C)C)CN1CC2. The target protein sequence is MALSDLVLLRWLRDSRHSRKLILFIVFLALLLDNMLLTVVVPIIPSYLYSIKHEKNSTEIQTTRPELVVSTSESIFSYYNNSTVLITGNATGTLPGGQSHKATSTQHTVANTTVPSDCPSEDRDLLNENVQVGLLFASKATVQLLTNPFIGLLTNRIGYPIPMFAGFCIMFISTVMFAFSSSYAFLLIARSLQGIGSSCSSVAGMGMLASVYTDDEERGKPMGIALGGLAMGVLVGPPFGSVLYEFVGKTAPFLVLAALVLLDGAIQLFVLQPSRVQPESQKGTPLTTLLKDPYILIAAGSICFANMAIAMLEPALPIWMMETMCSRKWQLGVAFLPASISYLIGTNIFGILAHKMGRWLCALLGMVIVGISILCIPFAKNIYGLIAPNFGVGFAIGMVDSSMMPIMGYLVDLRHVSVYGSVYAIADVAFCMGYAIGPSAGGAIAKAIGFPWLMTIIGIIDIAFAPLCFFLRSPPAKEEKMAILMDHNCPIKRKMYTQNN.... The pKd is 7.0. (6) The drug is O=P(O)(O)O[C@H]1[C@@H](O)[C@@H](OP(=O)(O)O)[C@H](O)[C@@H](OP(=O)(O)O)[C@@H]1OP(=O)(O)O. The target protein (P97696) has sequence MDEGGGGEGGSVPEDLSLEEREELLDIRRRKKELIDDIERLKYEIAEVMTEIDNLTSVEESKTTQRNKQIAMGRKKFNMDPKKGIQFLIENDLLQSSPEDVAQFLYKGEGLNKTVIGDYLGERDDFNIKVLQAFVELHEFADLNLVQALRQFLWSFRLPGEAQKIDRMMEAFASRYCLCNPGVFQSTDTCYVLSFAIIMLNTSLHNHNVRDKPTAERFITMNRGINEGGDLPEELLRNLYESIKNEPFKIPEDDGNDLTHTFFNPDREGWLLKLGGGRVKTWKRRWFILTDNCLYYFEYTTDKEPRGIIPLENLSIREVEDPRKPNCFELYNPSHKGQVIKACKTEADGRVVEGNHVVYRISAPSPEEKEEWMKSIKASISRDPFYDMLATRKRRIANKK. The pKd is 7.0.
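
The label definition stated at the top (pKd = -log10(Kd in M); higher means stronger binding) can be sketched in Python as follows. This is a minimal illustration, not the dataset's own preprocessing code. The function names `kd_to_pkd`, `pkd_to_kd`, and `kd_nm_to_pkd` are hypothetical, and the nanomolar helper assumes the raw Kd is supplied in nM, which the text above does not state.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("Kd must be a positive concentration in M")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


def kd_nm_to_pkd(kd_nm: float) -> float:
    """Hypothetical helper: assumes the raw Kd is given in nanomolar."""
    return kd_to_pkd(kd_nm * 1e-9)


if __name__ == "__main__":
    # A pKd of 7.0, as in examples (5) and (6), corresponds to Kd = 1e-7 M (100 nM).
    print(kd_to_pkd(1e-7))
    # A smaller Kd (tighter binding) yields a larger pKd.
    print(kd_to_pkd(1e-8) > kd_to_pkd(1e-6))
```

Because the scale is logarithmic, a difference of one pKd unit corresponds to a tenfold change in Kd, so the example with pKd 7.1 binds more than a hundred times more tightly than the one with pKd 4.8.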