Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The compound is O=C(O)/C=C/C(=O)O. The target protein (Q9Z429) has sequence MQQFTIRTRLLMLVGAMFIGFITIELMGFSALQRGVASLNTVYLDRVVPLRDLKTIADLYAVKIVDSSHKARSGRMTYAQAEQEVKDAGRQIDMLWHAYQKTKKIDEEQRSVDALAKLVDEAQDPIERLKGILERGDKAALDTFVENEMYPLIDPLSEGLSHLTQIQVEESKRAYDAAVVLYDSSRTMLALLLLGILICGGVFATRLIRSIIHPLTTLKDAAARVALGDLSQSIQVSGRNEVTDVQQSVQAMQANLRNTLQDIQGSAAQLAAAAEELQTATESTAQGIHRQNDEMQMAATAVTEMSAAVDEVADNANRTSNASHEAMDLADGGRKQVMLTRETIDRLSGKLNETTRTVFRLAEEASNIGRVLDVIRAIAEQTKLLALNAAIEAAHAGEAGRGFAVVADEVRNLAQRTQTSTQEIERMISAIQSVTQEGVRDVQQSCEFAARSQTMSSEADQALTLIAERITEINGMNLVIASAAEEQAQVAREVDRNLVA.... The pKd is 4.8. (2) The drug is Cc1[nH]c(/C=C2\C(=O)Nc3ccc(F)cc32)c(C)c1C(=O)NC[C@H](O)CN1CCOCC1. The target protein (P57078) has sequence MEGDGGTPWALALLRTFDAGEFTGWEKVGSGGFGQVYKVRHVHWKTWLAIKCSPSLHVDDRERMELLEEAKKMEMAKFRYILPVYGICREPVGLVMEYMETGSLEKLLASEPLPWDLRFRIIHETAVGMNFLHCMAPPLLHLDLKPANILLDAHYHVKISDFGLAKCNGLSHSHDLSMDGLFGTIAYLPPERIREKSRLFDTKHDVYSFAIVIWGVLTQKKPFADEKNILHIMVKVVKGHRPELPPVCRARPRACSHLIRLMQRCWQGDPRVRPTFQGNGLNGELIRQVLAALLPVTGRWRSPGEGFRLESEVIIRVTCPLSSPQEITSETEDLCEKPDDEVKETAHDLDVKSPPEPRSEVVPARLKRASAPTFDNDYSLSELLSQLDSGVSQAVEGPEELSRSSSESKLPSSGSGKRLSGVSSVDSAFSSRGSLSLSFEREPSTSDLGTTDVQKKKLVDAIVSGDTSKLMKILQPQDVDLALDSGASLLHLAVEAGQEE.... The pKd is 5.0. (3) The compound is CCOC(=O)/C=C1\C2C3C=CC(CCC3)C2C(=O)N1Cc1ccc(-c2ccccc2-c2nnn[nH]2)cc1. 
The target protein (P34976) has sequence MMLNSSTEDGIKRIQDDCPKAGRHNYIFVMIPTLYSIIFVVGIFGNSLAVIVIYFYMKLKTVASVFLLNLALADLCFLLTLPLWAVYTAMEYRWPFGNYLCKIASASVSFNLYASVFLLTCLSIDRYLAIVHPMKSRLRRTMLVAKVTCIIIWLLAGLASLPAIIHRNVFFIENTNITVCAFHYESQNSTLPIGLGLTKNILGFLFPFLIILTSYTLIWKALKKAYEIQKNKPRNDDIFKIIMAIVLFFFFSWVPHQIFTFLDVLIQLGVIHDCRIADIVDTAMPITICIAYFNNCLNPLFYGFLGKKFKKYFLQLLKYIPPKAKSHSNLSTKMSTLSYRPSDNVSSSSKKPVPCFEVE. The pKd is 7.7. (4) The compound is CCN(CC)c1nc2ccc(C(O)(c3ccc(C(F)(F)F)nc3)c3cncn3C)cc2c(C(F)(F)F)c1Oc1ccc(Cl)cc1. The target protein sequence is MAHHHHHHAGGAENLYFQGAMDSTPEAPYASLTEIEHLVQSVCKSYRETCQLRLEDLLRQRSNIFSREEVTGYQRKSMWEMWERCAHHLTEAIQYVVEFAKRLSGFMELCQNDQIVLLKAGAMEVVLVRMCRAYNADNRTVFFEGKYGGMELFRALGCSELISSIFDFSHSLSALHFSEDEIALYTALVLINAHRPGLQEKRKVEQLQYNLELAFHHHLCKTHRQSILAKLPPKGKLRSLCSQHVERLQIFQHLHPIVVQAAFPPLYKELFSTETESPVGLSK. The pKd is 7.1. (5) The drug is COc1cc(OC)c2c(=O)[nH]c(-c3cc(C)c(OCCO)c(C)c3)nc2c1. The target protein (P25440) has sequence MLQNVTPHNKLPGEGNAGLLGLGPEAAAPGKRIRKPSLLYEGFESPTMASVPALQLTPANPPPPEVSNPKKPGRVTNQLQYLHKVVMKALWKHQFAWPFRQPVDAVKLGLPDYHKIIKQPMDMGTIKRRLENNYYWAASECMQDFNTMFTNCYIYNKPTDDIVLMAQTLEKIFLQKVASMPQEEQELVVTIPKNSHKKGAKLAALQGSVTSAHQVPAVSSVSHTALYTPPPEIPTTVLNIPHPSVISSPLLKSLHSAGPPLLAVTAAPPAQPLAKKKGVKRKADTTTPTPTAILAPGSPASPPGSLEPKAARLPPMRRESGRPIKPPRKDLPDSQQQHQSSKKGKLSEQLKHCNGILKELLSKKHAAYAWPFYKPVDASALGLHDYHDIIKHPMDLSTVKRKMENRDYRDAQEFAADVRLMFSNCYKYNPPDHDVVAMARKLQDVFEFRYAKMPDEPLEPGPLPVSTAMPPGLAKSSSESSSEESSSESSSEEEEEEDEE.... The pKd is 6.6. (6) The pKd is 4.3. The target protein (P9WIK2) has sequence MSGETTRLTEPQLRELAARGAAELDGATATDMLRWTDETFGDIGGAGGGVSGHRGWTTCNYVVASNMADAVLVDLAAKVRPGVPVIFLDTGYHFVETIGTRDAIESVYDVRVLNVTPEHTVAEQDELLGKDLFARNPHECCRLRKVVPLGKTLRGYSAWVTGLRRVDAPTRANAPLVSFDETFKLVKVNPLAAWTDQDVQEYIADNDVLVNPLVREGYPSIGCAPCTAKPAEGADPRSGRWQGLAKTECGLHAS. The small molecule is O=C(O)CSc1nnc2c3ccccc3c3ccccc3c2n1. (7) The small molecule is Cc1ccc(NC(=O)c2ccc(CN3CCN(C)CC3)cc2)cc1Nc1nccc(-c2cccnc2)n1. 
The target protein sequence is MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKT.... The pKd is 8.5. (8) The small molecule is CN[C@@H]1C[C@H]2O[C@@](C)([C@@H]1OC)n1c3ccccc3c3c4c(c5c6ccccc6n2c5c31)C(=O)NC4. The target protein (Q9JI10) has sequence MEQPPASKSKLKKLSEDSLTKQPEEVFDVLEKLGEGSYGSVFKAIHKESGQVVAIKQVPVESDLQEIIKEISIMQQCDSPYVVKYYGSYFKNTDLWIVMEYCGAGSVSDIIRLRNKTLTEDEIATILKSTLKGLEYLHFMRKIHRDIKAGNILLNTEGHAKLADFGVAGQLTDTMAKRNTVIGTPFWMAPEVIQEIGYNCVADIWSLGITSIEMAEGKPPYADIHPMRAIFMIPTNPPPTFRKPELWSDDFTDFVKKCLVKSPEQRATATQLLQHPFIKNAKPVSILRDLIAEAMEIKAKRHEEQQRELEEEEENSDEDELDSHTMVKTSSESVGTMRATSTMSEGAQTMIEHNSTMLESDLGTMVINSEEEEEEEEEEEEDGTMKRNATSPQVQRPSFMDYFDKQDFKNKSHENCDQSMREPGPMSNSVFPDNWRVPQDGDFDFLKNLSLEELQMRLKALDPMMEREIEELHQRYSAKRQPILDAMDAKKRRQQNF. The pKd is 9.6. (9) The small molecule is CC(C)[C@@H]1NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](Cc2ccc([N+](=O)[O-])cc2)NC(=O)C(C)(C)CSSC[C@@H](C(=O)N2CCC[C@H]2C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc2ccc(O)cc2)C(N)=O)NC(=O)[C@H](CC(N)=O)NC1=O. The target protein (P30560) has sequence MSFPRGSQDRSVGNSSPWWPLTTEGSNGSQEAARLGEGDSPLGDVRNEELAKLEIAVLAVIFVVAVLGNSSVLLALHRTPRKTSRMHLFIRHLSLADLAVAFFQVLPQLCWDITYRFRGPDWLCRVVKHLQVFAMFASAYMLVVMTADRYIAVCHPLKTLQQPARRSRLMIATSWVLSFILSTPQYFIFSVIEIEVNNGTKTQDCWATFIQPWGTRAYVTWMTSGVFVAPVVVLGTCYGFICYHIWRNIRGKTASSRHSKGDKGSGEAVGPFHKGLLVTPCVSSVKSISRAKIRTVKMTFVIVSAYILCWAPFFIVQMWSVWDENFIWTDSENPSITITALLASLNSCCNPWIYMFFSGHLLQDCVQSFPCCHSMAQKFAKDDSDSMSRRQTSYSNNRSPTNSTGMWKDSPKSSKSIRFIPVST. The pKd is 8.7.
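The pKd definition given in the task description (pKd = -log10(Kd in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the function names kd_to_pkd and pkd_to_kd are hypothetical, and the sketch assumes Kd is supplied in mol/L.

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    # Hypothetical helper; assumes Kd is already expressed in molar units.
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # A Kd of 20 nM (20e-9 mol/L) gives a pKd of about 7.7.
    print(round(kd_to_pkd(20e-9), 2))
    # A pKd of 8.7 corresponds to a Kd of roughly 2 nM.
    print(pkd_to_kd(8.7))
```

Because the scale is logarithmic, each unit of pKd is a tenfold change in Kd. For instance, the pKd of 8.7 in example (9) corresponds to a Kd of roughly 2 nM, while the pKd of 4.3 in example (6) corresponds to roughly 50 µM.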