Dataset: Drug-target binding data from BindingDB using Kd measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKd (pKd = -log10(Kd in M); higher means stronger binding). Dataset: bindingdb_kd. (1) The drug is CCCCC/C=C\C/C=C\C/C=C\C/C=C\CCCC(=O)NCCO. The target protein (Q05816) has sequence MASLKDLEGKWRLMESHGFEEYMKELGVGLALRKMAAMAKPDCIITCDGNNITVKTESTVKTTVFSCNLGEKFDETTADGRKTETVCTFQDGALVQHQQWDGKESTITRKLKDGKMIVECVMNNATCTRVYEKVQ. The pKd is 7.0. (2) The small molecule is NC(=O)c1ccc(-c2nc(-c3ccccn3)c(-c3ccc4c(c3)OCO4)[nH]2)cc1. The target protein (Q96RR4) has sequence MSSCVSSQPSSNRAAPQDELGGRGSSSSESQKPCEALRGLSSLSIHLGMESFIVVTECEPGCAVDLGLARDRPLEADGQEVPLDTSGSQARPHLSGRKLSLQERSQGGLAAGGSLDMNGRCICPSLPYSPVSSPQSSPRLPRRPTVESHHVSITGMQDCVQLNQYTLKDEIGKGSYGVVKLAYNENDNTYYAMKVLSKKKLIRQAGFPRRPPPRGTRPAPGGCIQPRGPIEQVYQEIAILKKLDHPNVVKLVEVLDDPNEDHLYMVFELVNQGPVMEVPTLKPLSEDQARFYFQDLIKGIEYLHYQKIIHRDIKPSNLLVGEDGHIKIADFGVSNEFKGSDALLSNTVGTPAFMAPESLSETRKIFSGKALDVWAMGVTLYCFVFGQCPFMDERIMCLHSKIKSQALEFPDQPDIAEDLKDLITRMLDKNPESRIVVPEIKLHPWVTRHGAEPLPSEDENCTLVEVTEEEVENSVKHIPSLATVILVKTMIRKRSFGNPF.... The pKd is 5.0. (3) The compound is O=C(O)C(Cc1ccccc1)N1C(=O)/C(=C/c2cccc(Cl)c2)SC1=S. The target protein sequence is MKRKKILIVGAGFSGAVIGRQLAEKGHQVHIIDQRDHIGGNSYDARDSETNVMVHVYGPHIFHTDNESVWNYVNKHAEMMPYVNRVKATVNGQVFSLPINLHTINQFFSKTCSPDEARALIAEKGDSTIADPQTFEEQALRFIGKELYEAFFKGYTIKQWGMQPSELPASILKRLPVRFNYDDNYFNHKFQGMPKCGYTQMIKSILKHENIKVDLQREFIVDERTHYDHVFYSGPLDAFYGYQYGRLGYRTLDFKKFIYQGDYQGCAVMNYCSVDVPYTRITEHKYFSPWEQHDGSVCYKEYSRACEENDIPYYPIRQMGEMALLEKYLSLAENETNITFVGRLGTYRYLDMDVTIAEALKTAEVYLNSLTENQPMPVFTVSVR. The pKd is 4.7. (4) The drug is COC(=O)c1ccc2c(c1)NC(=O)/C2=C(\Nc1ccc(N(C)C(=O)CN2CCN(C)CC2)cc1)c1ccccc1. 
The target protein (O14936) has sequence MADDDVLFEDVYELCEVIGKGPFSVVRRCINRETGQQFAVKIVDVAKFTSSPGLSTEDLKREASICHMLKHPHIVELLETYSSDGMLYMVFEFMDGADLCFEIVKRADAGFVYSEAVASHYMRQILEALRYCHDNNIIHRDVKPHCVLLASKENSAPVKLGGFGVAIQLGESGLVAGGRVGTPHFMAPEVVKREPYGKPVDVWGCGVILFILLSGCLPFYGTKERLFEGIIKGKYKMNPRQWSHISESAKDLVRRMLMLDPAERITVYEALNHPWLKERDRYAYKIHLPETVEQLRKFNARRKLKGAVLAAVSSHKFNSFYGDPPEELPDFSEDPTSSGLLAAERAVSQVLDSLEEIHALTDCSEKDLDFLHSVFQDQHLHTLLDLYDKINTKSSPQIRNPPSDAVQRAKEVLEEISCYPENNDAKELKRILTQPHFMALLQTHDVVAHEVYSDEALRVTPPPTSPYLNGDSPESANGDMDMENVTRVRLVQFQKNTDEP.... The pKd is 5.0. (5) The compound is COc1cc(Oc2ccnc3cc(OC)c(OC)cc23)ccc1NC(=O)NC(C)c1nccs1. The target protein (Q9NQU5) has sequence MFRKKKKKRPEISAPQNFQHRVHTSFDPKEGKFVGLPPQWQNILDTLRRPKPVVDPSRITRVQLQPMKTVVRGSAMPVDGYISGLLNDIQKLSVISSNTLRGRSPTSRRRAQSLGLLGDEHWATDPDMYLQSPQSERTDPHGLYLSCNGGTPAGHKQMPWPEPQSPRVLPNGLAAKAQSLGPAEFQGASQRCLQLGACLQSSPPGASPPTGTNRHGMKAAKHGSEEARPQSCLVGSATGRPGGEGSPSPKTRESSLKRRLFRSMFLSTAATAPPSSSKPGPPPQSKPNSSFRPPQKDNPPSLVAKAQSLPSDQPVGTFSPLTTSDTSSPQKSLRTAPATGQLPGRSSPAGSPRTWHAQISTSNLYLPQDPTVAKGALAGEDTGVVTHEQFKAALRMVVDQGDPRLLLDSYVKIGEGSTGIVCLAREKHSGRQVAVKMMDLRKQQRRELLFNEVVIMRDYQHFNVVEMYKSYLVGEELWVLMEFLQGGALTDIVSQVRLNE.... The pKd is 5.0. (6) The small molecule is CCN(CC)CCNC(=O)c1c(C)[nH]c(/C=C2\C(=O)Nc3ccc(F)cc32)c1C. The pKd is 6.4. The target protein sequence is HHSTVADGLITTLHYPAPKRNKPTVYGVSPNYDKWEMERTDITMKHKLGGGQYGEVYEGVWKKYSLTVAVKTLKEDTMEVEEFLKEAAVMKEIKHPNLVQLLGVCTREPPFYIINEFMTYGNLLDYLRECNRQEVNAVVLLYMATQISSAMEYLEKKNFIHRDLAARNCLVGENHLVKVADFGLSRLMTGDTYTAHAGAKFPIKWTAPESLAYNKFSIKSDVWAFGVLLWEIATYGMSPYPGIDLSQVYELLEKDYRMERPEGCPEKVYELMRACWQWNPSDRPSFAEIHQAFETMFQES.
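
The pKd labels above follow the definition stated in the task description, pKd = -log10(Kd in M). As a minimal sketch of that conversion (the function names `kd_to_pkd`, `kd_nm_to_pkd`, and `pkd_to_kd` are hypothetical helpers introduced here for illustration and are not part of the dataset or any library), the arithmetic can be written as:

```python
import math


def kd_to_pkd(kd_molar: float) -> float:
    """Convert a dissociation constant in mol/L to pKd = -log10(Kd)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration in mol/L")
    return -math.log10(kd_molar)


def kd_nm_to_pkd(kd_nanomolar: float) -> float:
    """Same conversion for a Kd given in nM (1 nM = 1e-9 M)."""
    return kd_to_pkd(kd_nanomolar * 1e-9)


def pkd_to_kd(pkd: float) -> float:
    """Invert the transform: Kd in mol/L from a pKd value."""
    return 10.0 ** (-pkd)


if __name__ == "__main__":
    # A pKd of 7.0, as in example (1), corresponds to Kd = 1e-7 M (100 nM).
    print(kd_to_pkd(1e-7))
    # A pKd of 5.0, as in examples (2), (4) and (5), corresponds to Kd = 10 uM.
    print(kd_nm_to_pkd(10_000))
```

Because the transform is logarithmic, each unit increase in pKd means a tenfold lower Kd, which is why higher values indicate stronger binding.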