Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is C=CCN(Cc1cccc(N)c1)C[C@H]1O[C@@H](n2cnc3c(N)ncnc32)[C@H](O)[C@@H]1O. The target protein (Q8TEK3) has sequence MGEKLELRLKSPVGAEPAVYPWPLPVYDKHHDAAHEIIETIRWVCEEIPDLKLAMENYVLIDYDTKSFESMQRLCDKYNRAIDSIHQLWKGTTQPMKLNTRPSTGLLRHILQQVYNHSVTDPEKLNNYEPFSPEVYGETSFDLVAQMIDEIKMTDDDLFVDLGSGVGQVVLQVAAATNCKHHYGVEKADIPAKYAETMDREFRKWMKWYGKKHAEYTLERGDFLSEEWRERIANTSVIFVNNFAFGPEVDHQLKERFANMKEGGRIVSSKPFAPLNFRINSRNLSDIGTIMRVVELSPLKGSVSWTGKPVSYYLHTIDRTILENYFSSLKNPKLREEQEAARRRQQRESKSNAATPTKGPEGKVAGPADAPMDSGAEEEKAGAATVKKPSPSKARKKKLNKKGRKMAGRKRGRPKKMNTANPERKPKKNQTALDALHAQTVSQTAASSPQDAYRSPHSPFYQLPPSVQRHSPNPLLVAPTPPALQKLLESFKIQYLQFLA.... The pKi is 4.0. (2) The drug is CN(c1cccc(NC(=O)CN2N=C(C3CCCCC3)c3ccccc3N(CC(=O)C(C)(C)C)C2=O)c1)c1nnn[nH]1. The target protein (P30552) has sequence MELLKLNRSAQGSGAGPGASLCRAGGALLNSSGAGNLSCEPPRLRGAGTRELELAIRVTLYAVIFLMSVGGNVLIIVVLGLSRRLRTVTNAFLLSLAVSDLLLAVACMPFTLLPNLMGTFIFGTVVCKAVSYLMGVSVSVSTLSLVAIALERYSAICRPLQARVWQTRSHAARVIIATWMLSGLLMVPYPVYTAVQPAGGARALQCVHRWPSARVRQTWSVLLLLLLFFVPGVVMAVAYGLISRELYLGLRFDEDSDSESRVRSQGGLRGGAGPGPAPPNGSCRPEGGLAGEDGDGCYVQLPRSRQTLELSALTAPTPGPGGGPRPYQAKLLAKKRVVRMLLVIVVLFFLCWLPLYSANTWRAFDSSGAHRALSGAPISFIHLLSYASACVNPLVYCFMHRRFRQACLETCARCCPRPPRARPRPLPDEDPPTPSIASLSRLSYTTISTLGPG. The pKi is 8.3. (3) The drug is CC[C@H](C)[C@@H]1NC(=O)[C@@H](Cc2c[nH]c3ccccc23)NC(=O)[C@@H]2CCCN2C(=O)[C@@H](Cc2cnc[nH]2)NC(=O)[C@@H]2CCCCN2C(=O)[C@H]2CCC=NN2C1=O. 
The target protein (P56494) has sequence MEGELAANWSTEAVNSSAAPPGAEGNCTAGPPRRNEALARVEVAVLCLILFLALSGNACVLLALRTTRHKHSRLFFFMKHLSIADLVVAVFQVLPQLLWDITFRFYGPDLLCRLVKYLQVVGMFASTYLLLLMSLDRCLAICQPLRSLRRRTDRLAVLATWLGCLVASAPQVHIFSLREVADGVFDCWAVFIQPWGPKAYITWITLAVYIVPVIVLAACYGLISFKIWQNLRLKTAAAAAAEAPEGAAAGDGGRMALARVSSVKLISKAKIRTVKMTFIIVLAFIVCWTPFFFVQMWSVWDANAPKEASAFIIVMLLASLNSCCNPWIYMLFTGHLFHELVQRFLCCSASYLKGNRLGETSTSKKSNSSSFVLSHRSSSQRSCSQPSTA. The pKi is 8.6. (4) The small molecule is CC(C)(C)NC(=O)[C@@H]1CN(Cc2cccnc2)CCN1C[C@@H](O)C[C@@H](Cc1ccccc1)C(=O)N[C@H]1c2ccccc2C[C@H]1O. The target protein sequence is PQVTLWKRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGPTPVNVIGRNLLTQIGCTLNF. The pKi is 8.3. (5) The pKi is 2.6. The target protein (Q3MHH4) has sequence MAALDSLSLFTGLGLSEQKARETLKNTVLSAQLREAATQAQQTLGSSIDKATGTLLYGLASRLRDPRRLSFLVSYITSRKIHTETQLSAALEYVRSHPLDPINTEDFEQECGVGVVVTPEQIEEAVEAAINRHRAKLLVERYHFSMGLLMGEARAALKWADGKMIKHEVDMQVLHLLGPKTETDLEKKPKVAKARPEETDQRTAKDVVENGEVVVQTLSLMEQLRGEALKFHKPGENYKTPGYVTTPHTMDLLKQHLDITGGQVRTRFPPEPNGILHIGHAKAINFNFGYAKANNGICFLRFDDTNPEKEEAKFFTAIYDMVAWLGYTPYKVTYASDYFDQLYAWAVELIRRDQAYVCHQRGEELKGHNPLPSPWRDRPIEESLLLFEAMRKGKFAEGEATLRMKLVMEDGKMDPVAYRVKYTPHHRTGDTWCIYPTYDYTHCLCDSIEHITHSLCTKEFQARRSSYFWLCNALDVYCPVQWEYGRLNLHYAVVSKRKIL.... The compound is Nc1ncnc2c1ncn2[C@@H]1O[C@H](COP(=O)(O)CC(=O)[C@@H](N)CCC(=O)O)[C@@H](O)[C@H]1O. (6) The small molecule is Cn1cncc1COc1ccc(C(F)(F)F)cc1C(=O)/N=c1\sc(C(C)(C)C)cn1C[C@H]1CCCO1. The target protein (Q9QZN9) has sequence MAGCRELELTNGSNGGLEFNPMKEYMILSDAQQIAVAVLCTLMGLLSALENVAVLYLILSSQRLRRKPSYLFIGSLAGADFLASVIFACNFVIFHVFHGVDSRNIFLLKIGSVTMTFTASVGSLLLTAVDRYLCLCYPPTYKALVTRGRALVALGVMWVLSALISYLPLMGWTCCPSPCSELFPLIPNDYLLGWLLFIAILFSGIIYTYGYVLWKAHQHVASLAEHQDRQVPGIARMRLDVRLAKTLGLVMAVLLICWFPALALMGHSLVTTLSDKVKEAFAFCSMLCLVNSMINPIIYALRSGEIRSAAQHCLTGWKKYLQGLGSEGKEEAPKSSVTETEAEVKTTTGPGSRTPGCSNC. The pKi is 7.9. (7) The drug is CC(C)CN(C(=O)c1cccc(F)c1C(F)(F)F)[C@H]1CCNC1. 
The target protein (P23975) has sequence MLLARMNPQVQPENNGADTGPEQPLRARKTAELLVVKERNGVQCLLAPRDGDAQPRETWGKKIDFLLSVVGFAVDLANVWRFPYLCYKNGGGAFLIPYTLFLIIAGMPLFYMELALGQYNREGAATVWKICPFFKGVGYAVILIALYVGFYYNVIIAWSLYYLFSSFTLNLPWTDCGHTWNSPNCTDPKLLNGSVLGNHTKYSKYKFTPAAEFYERGVLHLHESSGIHDIGLPQWQLLLCLMVVVIVLYFSLWKGVKTSGKVVWITATLPYFVLFVLLVHGVTLPGASNGINAYLHIDFYRLKEATVWIDAATQIFFSLGAGFGVLIAFASYNKFDNNCYRDALLTSSINCITSFVSGFAIFSILGYMAHEHKVNIEDVATEGAGLVFILYPEAISTLSGSTFWAVVFFVMLLALGLDSSMGGMEAVITGLADDFQVLKRHRKLFTFGVTFSTFLLALFCITKGGIYVLTLLDTFAAGTSILFAVLMEAIGVSWFYGVDR.... The pKi is 7.6. (8) The compound is COC(=O)[C@@H]1C[C@H](OC(C)=O)C(=O)[C@H]2[C@@]1(C)CC[C@H]1C(=O)O[C@H](c3ccoc3)C[C@]21C. The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVALMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSLQFPDDDYSWWDLFMKICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSYYFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. The pKi is 6.6. (9) The small molecule is Nc1nc(N)c2nc(CN(CCCCCC(=O)O)c3ccc(C(=O)NC(CCC(=O)O)C(=O)O)cc3)cnc2n1. The target protein (P00374) has sequence MVGSLNCIVAVSQNMGIGKNGDLPWPPLRNEFRYFQRMTTTSSVEGKQNLVIMGKKTWFSIPEKNRPLKGRINLVLSRELKEPPQGAHFLSRSLDDALKLTEQPELANKVDMVWIVGGSSVYKEAMNHPGHLKLFVTRIMQDFESDTFFPEIDLEKYKLLPEYPGVLSDVQEEKGIKYKFEVYEKND. The pKi is 8.8. (10) The pKi is 7.0. 
The target protein (Q6QHF9) has sequence MESTGSVGEAPGGPRVLVVGGGIAGLGAAQRLCGHSAFPHLRVLEATARAGGRIRSERCFGGVVEVGAHWIHGPSRGNPVFQLAAEYGLLGEKELSQENQLVETGGHVGLPSVSYASSGTSVSLQLVAEMATLFYGLIDQTREFLHAAETPVPSVGEYLKKEIGQHVARLCGHSAFPHLRVLEATARAGGRIRSERCFGGVVEVGAHWIHGPSRGNPVFQLAAEYGLLGEKELSQENQLVETGGHVGLPSVSYASSGASVSLQLVAEMATLFYGLIDQTREFLHAAETPVPSVGEYLKKEIGQHVAGWTEDEETRKLKLAVLNSFFNLECCVSGTHSMDLVALAPFGEYTVLPGLDCTFSKGYQGLTNCMMAALPEDTVVFEKPVKTIHWNGSFQEAAFPGETFPVSVECEDGDRFPAHHVIVTVPLGFLREHLDTFFDPPLPAEKAEAIRKIGFGTNNKIFLEFEEPFWEPDCQLIQLVWEDTSPLEDAAPELQDAWFR.... The drug is CC(=O)CCCCCNCCCN(C)C.
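The pKi values above follow the definition given in the task description, pKi = -log10(Ki in M). As a minimal sketch of that conversion (the function names `ki_to_pki`, `ki_nm_to_pki`, and `pki_to_ki` are illustrative and not part of the dataset, and the assumption that raw Ki values may be supplied in nM is ours, not stated in the source):

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in M")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nanomolar: float) -> float:
    """Convert a Ki given in nM (assumed input unit) to pKi."""
    return ki_to_pki(ki_nanomolar * 1e-9)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: return Ki in mol/L for a given pKi."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A pKi of 4.0 corresponds to Ki = 1e-4 M (100 micromolar).
    print(pki_to_ki(4.0))
    # A Ki of 10 nM corresponds to a pKi of about 8.
    print(ki_nm_to_pki(10.0))
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki, so higher pKi means tighter binding.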