Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. From a dataset of Drug-target binding data from BindingDB using Kd measurements. (1) The drug is NC(=O)C[C@H](NC(=O)C1(NC(=O)[C@@H](CC(=O)O)Cc2ccc(CP(=O)(O)O)cc2)CCN(S(=O)(=O)CCc2ccccc2)CC1)C(=O)NCCCc1ccc2ccccc2c1. The target protein (P62993) has sequence MEAIAKYDFKATADDELSFKRGDILKVLNEECDQNWYKAELNGKDGFIPKNYIEMKPHPWFFGKIPRAKAEEMLSKQRHDGAFLIRESESAPGDFSLSVKFGNDVQHFKVLRDGAGKYFLWVVKFNSLNELVDYHRSTSVSRNQQIFLRDIEQVPQQPTYVQALFDFDPQEDGELGFRRGDFIHVMDNSDPNWWKGACHGQTGMFPRNYVTPVNRNV. The pKd is 5.7. (2) The drug is O=S1(=O)C=CC(OS(=O)(=O)c2ccccc2)C1. The target protein sequence is MAAAAAAGPEMVRGQVFDVGPRYTNLSYIGEGAYGMVCSAYDNLNKVRVAIKKISPFEHQTYCQRTLREIKILLRFRHENIIGINDIIRAPTIEQMKDVYIVQDLMETDLYKLLKTQHLSNDHICYFLYQILRGLKYIHSANVLHRDLKPSNLLLNTTCDLKICDFGLARVADPDHDHTGFLTEYVATRWYRAPEIMANSKGYTKSIDIWSVGCILAEMLSNRPIFPGKHYLDQANHILGILGSPSQEDLNCIINLKARNYLLSLPHKNKVPWNRLFPNADSKALDLLDKMLTFNPHKRIEVEQALAHPYLEQYYDPSDEPIAEAPFKFDMELDDLPKEKLKELIFEETARFQPGYRS. The pKd is 4.7. (3) The small molecule is CC(O)[C@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CCCNC(N)=[NH2+])NC(=O)[C@H](CCCCNC(=O)C[C@@H](NC(=O)CCCCCNC(=O)[C@H]1O[C@@H](n2cc(I)c3c(N)ncnc32)[C@H](O)[C@@H]1O)C(=O)[O-])NC(=O)[C@H](CCCNC(N)=[NH2+])NC(=O)[C@H](C)[NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](CCCC[NH3+])C(N)=O. The target protein sequence is KGPVPFSHCLPTEKLQRCEKIGEGVFGEVFQTIADHTPVAIKIIAIEGPDLVNGSHQKTFEEILPEIIISKELSLLSGEVCNRTEGFIGLNSVHCVQGSYPPLLLKAWDHYNSTKGSANDRPDFFKDDQLFIVLEFEFGGIDLEQMRTKLSSLATAKSILHQLTASLAVAEASLRFEHRDLHWGNVLLKKTSLKKLHYTLNGKSSTIPSCGLQVSIIDYTLSRLERDGIVVFCDVSMDEDLFTGDGDYQFDIYRLMKKENNNRWGEYHPYSNVLWLHYLTDKMLKQMTFKTKCNTPAMKQIKRKIQEFHRTMLNFSSATDLLCQHSLFK. The pKd is 9.9. (4) The compound is CN(C)CC(=O)N1CCC(c2ccc(NC(=O)c3ncc(C#N)[nH]3)c(C3=CCCCC3)c2)CC1. 
The target protein (Q96PY6) has sequence MEKYVRLQKIGEGSFGKAILVKSTEDGRQYVIKEINISRMSSKEREESRREVAVLANMKHPNIVQYRESFEENGSLYIVMDYCEGGDLFKRINAQKGVLFQEDQILDWFVQICLALKHVHDRKILHRDIKSQNIFLTKDGTVQLGDFGIARVLNSTVELARTCIGTPYYLSPEICENKPYNNKSDIWALGCVLYELCTLKHAFEAGSMKNLVLKIISGSFPPVSLHYSYDLRSLVSQLFKRNPRDRPSVNSILEKGFIAKRIEKFLSPQLIAEEFCLKTFSKFGSQPIPAKRPASGQNSISVMPAQKITKPAAKYGIPLAYKKYGDKKLHEKKPLQKHKQAHQTPEKRVNTGEERRKISEEAARKRRLEFIEKEKKQKDQIISLMKAEQMKRQEKERLERINRAREQGWRNVLSAGGSGEVKAPFLGSGGTIAPSSFSSRGQYEHYHAIFDQMQQQRAEDNEAKWKREIYGRGLPERGILPGVRPGFPYGAAGHHHFPDA.... The pKd is 5.0. (5) The drug is Cc1ccc(NC(=O)c2ccc(CN3CCN(C)CC3)cc2)cc1Nc1nccc(-c2cccnc2)n1. The target protein sequence is MRGARGAWDFLCVLLLLLRVQTGSSQPSVSPGEPSPPSIHPGKSDLIVRVGDEIRLLCTDPGFVKWTFEILDETNENKQNEWITEKAEATNTGKYTCTNKHGLSNSIYVFVRDPAKLFLVDRSLYGKEDNDTLVRCPLTDPEVTNYSLKGCQGKPLPKDLRFIPDPKAGIMIKSVKRAYHRLCLHCSVDQEGKSVLSEKFILKVRPAFKAVPVVSVSKASYLLREGEEFTVTCTIKDVSSSVYSTWKRENSQTKLQEKYNSWHHGDFNYERQATLTISSARVNDSGVFMCYANNTFGSANVTTTLEVVDKGFINIFPMINTTVFVNDGENVDLIVEYEAFPKPEHQQWIYMNRTFTDKWEDYPKSENESNIRYVSELHLTRLKGTEGGTYTFLVSNSDVNAAIAFNVYVNTKPEILTYDRLVNGMLQCVAAGFPEPTIDWYFCPGTEQRCSASVLPVDVQTLNSSGPPFGKLVVQSSIDSSAFKHNGTVECKAYNDVGKT.... The pKd is 5.5. (6) The drug is O=C(NC(C1CC1)C1CC1)OCCCc1cnc[nH]1. The target protein (Q9JI35) has sequence MERAPPDGLMNASGALAGEAAAAAGGARTFSAAWTAVLAALMALLIVATVLGNALVMLAFVADSSLRTQNNFFLLNLAISDFLVGVFCIPLYVPYVLTGRWTFGRGLCKLWLVVDYLLCTSSVFNIVLISYDRFLSVTRAVSYRAQQGDTRRAVRKMVLVWVLAFLLYGPAILSWEYLSGGSSIPEGHCYAEFFYNWYFLITASTLEFFTPFLSVTFFNLSIYLNIQRRTRLRLDGGAREAGPDPLPEAQSSPPQPPPGCWGCWPKGQGESMPLHRYGVGEAGPGAEAGEAALGGGSGAAASPTSSSGSSSRGTERPRSLKRGSKPSASSASLEKRMKMVSQSITQRFRLSRDKKVAKSLAIIVSIFGLCWAPYTLLMIIRAACHGHCVPDYWYETSFWLLWANSAVNPVLYPLCHYSFRRAFTKLLCPQKLKVQPHSSLEHCWK. The pKd is 9.0. (7) The drug is O=NC1CCc2cc(-c3cn(CCO)nc3-c3ccncc3)ccc21. 
The target protein (P24941) has sequence MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRLDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLADFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL. The pKd is 5.0. (8) The pKd is 5.0. The target protein (Q15208) has sequence MAMTGSTPCSSMSNHTKERVTMTKVTLENFYSNLIAQHEEREMRQKKLEKVMEEEGLKDEEKRLRRSAHARKETEFLRLKRTRLGLEDFESLKVIGRGAFGEVRLVQKKDTGHVYAMKILRKADMLEKEQVGHIRAERDILVEADSLWVVKMFYSFQDKLNLYLIMEFLPGGDMMTLLMKKDTLTEEETQFYIAETVLAIDSIHQLGFIHRDIKPDNLLLDSKGHVKLSDFGLCTGLKKAHRTEFYRNLNHSLPSDFTFQNMNSKRKAETWKRNRRQLAFSTVGTPDYIAPEVFMQTGYNKLCDWWSLGVIMYEMLIGYPPFCSETPQETYKKVMNWKETLTFPPEVPISEKAKDLILRFCCEWEHRIGAPGVEEIKSNSFFEGVDWEHIRERPAAISIEIKSIDDTSNFDEFPESDILKPTVATSNHPETDYKNKDWVFINYTYKRFEGLTARGAIPSYMKAAK. The compound is O=c1ncn2nc(Sc3ccc(F)cc3F)ccc2c1-c1c(Cl)cccc1Cl. (9) The small molecule is O=C(NCC(c1ccccc1)n1ccnc1)c1ccc2sc3ccccc3c(=O)c2c1. The target protein sequence is MLLEVAIFLLTALALYSFYFVKSFNVTRPTDPPVYPVTVPILGHIIQFGKSPLGFMQECKRQLKSGIFTINIVGKRVTIVGDPHEHSRFFLPRNEVLSPREVYSFMVPVFGEGVAYAAPYPRMREQLNFLAEELTIAKFQNFVPAIQHEVRKFMAANWDKDEGEINLLEDCSTMIINTACQCLFGEDLRKRLDARRFAQLLAKMESSLIPAAVFLPILLKLPLPQSARCHEARTELQKILSEIIIARKEEEVNKDSSTSDLLSGLLSAVYRDGTPMSLHEVCGMIVAAMFAGQHTSSITTTWSMLHLMHPANVKHLEALRKEIEEFPAQLNYNNVMDEMPFAERCARESIRRDPPLLMLMRKVMADVKVGSYVVPKGDIIACSPLLSHHDEEAFPEPRRWDPERDEKVEGAFIGFGAGVHKCIGQKFGLLQVKTILATAFRSYDFQLLRDEVPDPDYHTMVVGPTASQCRVKYIRRKAAAA. The pKd is 6.5.
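The pKd definition stated at the top of this record (pKd = -log10(Kd in M)) can be sketched in Python as follows. This is a minimal illustration only: the function names `pkd_from_kd` and `kd_from_pkd` are hypothetical and are not part of the dataset or of any BindingDB tooling.

```python
import math


def pkd_from_kd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def kd_from_pkd(pkd: float) -> float:
    """Invert the definition: Kd in mol/L = 10 ** (-pKd)."""
    return 10.0 ** (-pkd)
```

Under this definition a pKd of 9.0 corresponds to a Kd of 1 nM, and a pKd of 5.7, as in example (1), corresponds to a Kd of roughly 2 µM, so each unit increase in pKd means a tenfold tighter binder.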