Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is O=C(O)c1ccc(O)c(I)c1. The target protein sequence is MKQKPAFIPYAGAQFEPEEMLSKSAEYYQFMDHRRTVREFSNRAIPLEVIENIVMTASTAPSGAHKQPWTFVVVSDPQIKAKIRQAAEKEEFESYNGRMSNEWLEDLQPFGTDWHKPFLEIAPYLIVVFRKAYDVLPDGTQRKNYYVQESVGIACGFLLAAIHQAGLVALTHTPSPMNFLQKILQRPENERPFLLVPVGYPAEGAMVPDLQRKDKAAVMVVYHHHHHH. The pKd is 4.6. (2) The drug is CC(=O)NC1C(O)CC(OC2C(O)C(CO)OC(OC3C(CO)OC(O)C(O)C3O)C2O)(C(=O)O)OC1[C@H](O)[C@H](O)CO. The target protein sequence is MLAPGSSRVELFKRKNSTVPFEDKAGKVTERVVHSFRLPALVNVDGVMVAIADARYDTSNDNSLIDTVAKYSVDDGETWETQIAIKNSRVSSVSRVVDPTVIVKGNKLYVLVGSYYSSRSYWSSHGDARDWDILLAVGEVTKSIAGGKITASIKWGSPVSLKKFFPAEMEGMHTNQFLGGAGVAIVASNGNLVYPVQVTNKRKQVFSKIFYSEDDGKTWKFGKGRSDFGCSEPVALEWEGKLIINTRVAWKRRLVYESSDMGNTWVEAVGTLSRVWGPSPKSDHPGSQSSFTAVTIEGMRVMLFTHPLNFKGRWLRDRLNLWLTDNQRIYNVGQVSIGDENSAYSSVLYKDDKLYCLHEINTDEVYSLVFARLVGELRIIKSVLRSWKNWDSHLSSICTPADPAASSSESGCGPAVTTVGLVGFLSGNASQNVWEDAYRCVNASTANAERVRNGLKFAGVGGGALWPVSQQGQNQRYRFANHAFTLVASVTIHEAPRAAS.... The pKd is 4.0. (3) The drug is C[N+]1(C)[C@H]2CC(OC(=O)[C@H](CO)c3ccccc3)C[C@@H]1[C@H]1O[C@@H]21. The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKTFRTCFKTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKd is 9.2. (4) The small molecule is Cc1ccc(Nc2nccc(N(C)c3ccc4c(C)n(C)nc4c3)n2)cc1S(N)(=O)=O. 
The target protein (P53671) has sequence MSALAGEDVWRCPGCGDHIAPSQIWYRTVNETWHGSCFRCSECQDSLTNWYYEKDGKLYCPKDYWGKFGEFCHGCSLLMTGPFMVAGEFKYHPECFACMSCKVIIEDGDAYALVQHATLYCGKCHNEVVLAPMFERLSTESVQEQLPYSVTLISMPATTEGRRGFSVSVESACSNYATTVQVKEVNRMHISPNNRNAIHPGDRILEINGTPVRTLRVEEVEDAISQTSQTLQLLIEHDPVSQRLDQLRLEARLAPHMQNAGHPHALSTLDTKENLEGTLRRRSLRRSNSISKSPGPSSPKEPLLFSRDISRSESLRCSSSYSQQIFRPCDLIHGEVLGKGFFGQAIKVTHKATGKVMVMKELIRCDEETQKTFLTEVKVMRSLDHPNVLKFIGVLYKDKKLNLLTEYIEGGTLKDFLRSMDPFPWQQKVRFAKGIASGMAYLHSMCIIHRDLNSHNCLIKLDKTVVVADFGLSRLIVEERKRAPMEKATTKKRTLRKNDR.... The pKd is 6.4. (5) The small molecule is Cc1cn(C)c2cc3c(cc12)N(C(=O)Nc1ccccc1)CC3. The target protein (P30994) has sequence MASSYKMSEQSTISEHILQKTCDHLILTDRSGLKAESAAEEMKQTAENQGNTVHWAALLIFAVIIPTIGGNILVILAVSLEKRLQYATNYFLMSLAVADLLVGLFVMPIALLTIMFEATWPLPLALCPAWLFLDVLFSTASIMHLCAISLDRYIAIKKPIQANQCNSRTTAFVKITVVWLISIGIAIPVPIKGIEADVVNAHNITCELTKDRFGSFMLFGSLAAFFAPLTIMIVTYFLTIHALRKKAYLVRNRPPQRLTRWTVSTVLQREDSSFSSPEKMVMLDGSHKDKILPNSTDETLMRRMSSAGKKPAQTISNEQRASKVLGIVFLFFLLMWCPFFITNVTLALCDSCNQTTLKTLLQIFVWVGYVSSGVNPLIYTLFNKTFREAFGRYITCNYQATKSVKVLRKCSSTLYFGNSMVENSKFFTKHGIRNGINPAMYQSPVRLRSSTIQSSSIILLNTFLTENDGDKVEDQVSYI. The pKd is 8.2. (6) The drug is OCc1cn2ccsc2n1. The target protein (Q63T71) has sequence MDFRIGQGYDVHQLVPGRPLIIGGVTIPYERGLLGHSDADVLLHAITDALFGAAALGDIGRHFSDTDPRFKGADSRALLRECASRVAQAGFAIRNVDSTIIAQAPKLAPHIDAMRANIAADLDLPLDRVNVKAKTNEKLGYLGRGEGIEAQAAALVVREAAA. The pKd is 3.9. (7) The target protein (Q9AIU7) has sequence MADLSSRVNELHDLLNQYSYEYYVEDNPSVPDSEYDKLLHELIKIEEEHPEYKTVDSPTVRVGGEAQASFNKVNHDTPMLSLGNAFNEDDLRKFDQRIREQIGNVEYMCELKIDGLAVSLKYVDGYFVQGLTRGDGTTGEDITENLKTIHAIPLKMKEPLNVEVRGEAYMPRRSFLRLNEEKEKNDEQLFANPRNAAAGSLRQLDSKLTAKRKLSVFIYSVNDFTDFNARSQSEALDELDKLGFTTNKNRARVNNIDGVLEYIEKWTSQRESLPYDIDGIVIKVNDLDQQDEMGFTQKSPRWAIAYKFPAEEVVTKLLDIELSIGRTGVVTPTAILEPVKVAGTTVSRASLHNEDLIHDRDIRIGDSVVVKKAGDIIPEVVRSIPERRPEDAVTYHMPTHCPSCGHELVRIEGEVALRCINPKCQAQLVEGLIHFVSRQAMNIDGLGTKIIQQLYQSELIKDVADIFYLTEEDLLPLDRMGQKKVDNLLAAIQQAKDNSL.... 
The drug is Clc1cccc(-c2nnc[nH]2)n1. The pKd is 5.0. (8) The compound is CN1CCN(c2ccc3nc(-c4c(N)c5c(F)cccc5[nH]c4=O)[nH]c3c2)CC1. The target protein (Q9H0K1) has sequence MVMADGPRHLQRGPVRVGFYDIEGTLGKGNFAVVKLGRHRITKTEVAIKIIDKSQLDAVNLEKIYREVQIMKMLDHPHIIKLYQVMETKSMLYLVTEYAKNGEIFDYLANHGRLNESEARRKFWQILSAVDYCHGRKIVHRDLKAENLLLDNNMNIKIADFGFGNFFKSGELLATWCGSPPYAAPEVFEGQQYEGPQLDIWSMGVVLYVLVCGALPFDGPTLPILRQRVLEGRFRIPYFMSEDCEHLIRRMLVLDPSKRLTIAQIKEHKWMLIEVPVQRPVLYPQEQENEPSIGEFNEQVLRLMHSLGIDQQKTIESLQNKSYNHFAAIYFLLVERLKSHRSSFPVEQRLDGRQRRPSTIAEQTVAKAQTVGLPVTMHSPNMRLLRSALLPQASNVEAFSFPASGCQAEAAFMEEECVDTPKVNGCLLDPVPPVLVRKGCQSLPSNMMETSIDEGLETEGEAEEDPAHAFEAFQSTRSGQRRHTLSEVTNQLVVMPGAGK.... The pKd is 5.4. (9) The pKd is 6.0. The target protein (Q9Y232) has sequence MTFQASHRSAWGKSRKKNWQYEGPTQKLFLKRNNVSAPDGPSDPSISVSSEQSGAQQPPALQVERIVDKRKNKKGKTEYLVRWKGYDSEDDTWEPEQHLVNCEEYIHDFNRRHTEKQKESTLTRTNRTSPNNARKQISRSTNSNFSKTSPKALVIGKDHESKNSQLFAASQKFRKNTAPSLSSRKNMDLAKSGIKILVPKSPVKSRTAVDGFQSESPEKLDPVEQGQEDTVAPEVAAEKPVGALLGPGAERARMGSRPRIHPLVPQVPGPVTAAMATGLAVNGKGTSPFMDALTANGTTNIQTSVTGVTASKRKFIDDRRDQPFDKRLRFSVRQTESAYRYRDIVVRKQDGFTHILLSTKSSENNSLNPEVMREVQSALSTAAADDSKLVLLSAVGSVFCCGLDFIYFIRRLTDDRKRESTKMAEAIRNFVNTFIQFKKPIIVAVNGPAIGLGASILPLCDVVWANEKAWFQTPYTTFGQSPDGCSTVMFPKIMGGASAN.... The drug is CCN(CC)CCCC[C@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](Cc1ccccc1)NC(=O)c1ccc(C(C)(C)C)cc1)C(=O)N[C@@H](CO)C(=O)OC.
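
The pKd definition used in the task description above (pKd = -log10(Kd in M)) can be sketched in Python as follows. This is only an illustrative sketch: the function names `kd_to_pkd`, `pkd_to_kd`, and `kd_nm_to_pkd` are hypothetical and not part of the dataset or any library, and the nanomolar helper assumes the raw Kd value is supplied in nM.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: recover Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Assumed helper: convert a Kd given in nM by rescaling to mol/L first."""
    return kd_to_pkd(kd_nanomolar * 1e-9)


if __name__ == "__main__":
    # pKd labels taken from examples (1) and (3) above.
    for pkd in (4.6, 9.2):
        print(f"pKd {pkd} -> Kd = {pkd_to_kd(pkd):.3g} M")
```

Under this definition, the pKd of 9.2 in example (3) corresponds to a Kd of roughly 0.63 nM, while the pKd of 4.6 in example (1) corresponds to roughly 25 µM, which is why higher pKd means stronger binding.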