Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is CN1CCC[C@H]1c1cccnc1. The target protein (Q16696) has sequence MLASGLLLVTLLACLTVMVLMSVWRQRKSRGKLPPGPTPLPFIGNYLQLNTEQMYNSLMKISERYGPVFTIHLGPRRVVVLCGHDAVKEALVDQAEEFSGRGEQATFDWLFKGYGVAFSNGERAKQLRRFSIATLRGFGVGKRGIEERIQEEAGFLIDALRGTHGANIDPTFFLSRTVSNVISSIVFGDRFDYEDKEFLSLLRMMLGSFQFTATSTGQLYEMFSSVMKHLPGPQQQAFKELQGLEDFIAKKVEHNQRTLDPNSPRDFIDSFLIRMQEEEKNPNTEFYLKNLVMTTLNLFFAGTETVSTTLRYGFLLLMKHPEVEAKVHEEIDRVIGKNRQPKFEDRAKMPYTEAVIHEIQRFGDMLPMGLAHRVNKDTKFRDFFLPKGTEVFPMLGSVLRDPRFFSNPRDFNPQHFLDKKGQFKKSDAFVPFSIGKRYCFGEGLARMELFLFFTTIMQNFRFKSPQSPKDIDVSPKHVGFATIPRNYTMSFLPR. The pKi is 4.1. (2) The target protein sequence is HAARFKTFFNWPSSVLVNPEQLASAGFYYVGNSDDVKCFCCDGGLRCWESGDDPWVQHAKWFPRCEYL. The pKi is 4.5. The compound is CN[C@H](C)C(=O)N[C@@H](C(=O)N1CCC[C@@H]1c1nc2c(-c3ccccc3Cl)nccc2s1)C1CCOCC1. (3) The compound is Cc1ccc(C[C@H](O)/C=C/[C@H]2[C@H](O)CC(=O)[C@@H]2SCCCSCC(=O)O)cc1. The target protein (P35375) has sequence MSPCGLNLSLADEAATCATPRLPNTSVVLPTGDNGTSPALPIFSMTLGAVSNVLALALLAQVAGRMRRRRSAATFLLFVASLLAIDLAGHVIPGALVLRLYTAGRAPAGGACHFLGGCMVFFGLCPLLLGCGMAVERCVGVTQPLIHAARVSVARARLALAVLAAMALAVALLPLVHVGRYELQYPGTWCFISLGPRGGWRQALLAGLFAGLGLAALLAALVCNTLSGLALLRARWRRRRSRRFRKTAGPDDRRRWGSRGPRLASASSASSITSATATLRSSRGGGSARRVHAHDVEMVGQLVGIMVVSCICWSPLLVLVVLAIGGWNSNSLQRPLFLAVRLASWNQILDPWVYILLRQAMLRQLLRLLPLRVSAKGGPTELGLTKSAWEASSLRSSRHSGFSHL. The pKi is 5.0. (4) The drug is Cc1ccc(-n2c(C)nc3ccc(C(=O)c4c(C)nn(C)c4O)cc3c2=O)c(Br)c1. 
The target protein (P32754) has sequence MTTYSDKGAKPERGRFLHFHSVTFWVGNAKQAASFYCSKMGFEPLAYRGLETGSREVVSHVIKQGKIVFVLSSALNPWNKEMGDHLVKHGDGVKDIAFEVEDCDYIVQKARERGAKIMREPWVEQDKFGKVKFAVLQTYGDTTHTLVEKMNYIGQFLPGYEAPAFMDPLLPKLPKCSLEMIDHIVGNQPDQEMVSASEWYLKNLQFHRFWSVDDTQVHTEYSSLRSIVVANYEESIKMPINEPAPGKKKSQIQEYVDYNGGAGVQHIALKTEDIITAIRHLRERGLEFLSVPSTYYKQLREKLKTAKIKVKENIDALEELKILVDYDEKGYLLQIFTKPVQDRPTLFLEVIQRHNHQGFGAGNFNSLFKAFEEEQNLRGNLTNMETNGVVPGM. The pKi is 7.1. (5) The small molecule is O[C@@H]1[C@H](O)CN2CCC[C@@H](O)[C@H]12. The target protein (P53624) has sequence MYRISPIGRKSNFHSREKCLIGLVLVTLCFLCFGGIFLLPDNFGSDRVLRVYKHFRKAGPEIFIPAPPLAAHAPHRSEDPHFIGDRQRLEQKIRAELGDMLDEPPAAGGGEPGQFQVLAQQAQAPAPVAALADQPLDQDEGHAAIPVLAAPVQGDNAASQASSHPQSSAQQHNQQQPQLPLGGGGNDQAPDTLDATLEERRQKVKEMMEHAWHNYKLYAWGKNELRPLSQRPHSASIFGSYDLGATIVDGLDTLYIMGLEKEYREGRDWIERKFSLDNISAELSVFETNIRFVGGMLTLYAFTGDPLYKEKAQHVADKLLPAFQTPTGIPYALVNTKTGVAKNYGWASGGSSILSEFGTLHLEFAYLSDITGNPLYRERVQTIRQVLKEIEKPKGLYPNFLNPKTGKWGQLHMSLGALGDSYYEYLLKAWLQSGQTDEEAREMFDEAMLAILDKMVRTSPGGLTYVSDLKFDRLEHKMDHLACFSGGLFALGAATRQNDY.... The pKi is 8.5. (6) The compound is O=C1OC2C=CCC=C2C(=O)C1Cc1c(O)c2ccccc2oc1=O. The target protein (P00547) has sequence MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEPRENIVYQCWERFCQELGKQIPVAMTLEKNMPIGSGLGSSACSVVAALMAMNEHCGKPLNDTRLLALMGELEGRISGSIHYDNVAPCFLGGMQLMIEENDIISQQVPGFDEWLWVLAYPGIKVSTAEARAILPAQYRRQDCIAHGRHLAGFIHACYSRQPELAAKLMKDVIAEPYRERLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDKPETAQRVADWLGKNYLQNQEGFVHICRLDTAGARVLEN. The pKi is 6.3. (7) The drug is NCCC(=O)N[C@H](Cc1ccc(Cl)cc1Cl)C(=O)N1CCN(C2(CNC(=O)Cc3ccccc3)CCCCC2)CC1. The target protein (P41968) has sequence MNASCCLPSVQPTLPNGSEHLQAPFFSNQSSSAFCEQVFIKPEVFLSLGIVSLLENILVILAVVRNGNLHSPMYFFLCSLAVADMLVSVSNALETIMIAIVHSDYLTFEDQFIQHMDNIFDSMICISLVASICNLLAIAVDRYVTIFYALRYHSIMTVRKALTLIVAIWVCCGVCGVVFIVYSESKMVIVCLITMFFAMMLLMGTLYVHMFLFARLHVKRIAALPPADGVAPQQHSCMKGAVTITILLGVFIFCWAPFFLHLVLIITCPTNPYCICYTAHFNTYLVLIMCNSVIDPLIYAFRSLELRNTFREILCGCNGMNLG. The pKi is 6.2. 
(8) The compound is CC(C)n1cc2c3c(cccc31)C1CC(C(=O)NC3CCCCC3)CN(C)C1C2. The target protein (P50128) has sequence MDILCEENTSLSSTTNSLMQLNEDTRLYSNDFNSGEANTSDAFNWTVESENRTNLSCEGCLSPSCLSLLHLQEKNWSALLTAVVIILTIAGNILVIMAVSLEKKLQNATNYFLMSLAIADMLLGFLVMPVSMLTILYGYRWPLPSKLCAVWIYLDVLFSTASIMHLCAISLDRYVAIQNPIHHSRFNSRTKAFLKIIAVWTISVGISMPIPVFGLQDDSKVFKEGSCLLADDNFVLIGSFVSFFIPLTIMVITYFLTIKSLQKEATLCVSDLGTRAKLASFSFLPQSSLSSEKLFQRSIHRDPGSYTGRRTMQSISNEQKACKVLGIVFFLFVVMWCPFFITNIMAVICKESCNEDVIGALLNVFVWIGYLSSAVNPLVYTLFNKTYRSAFSRYIQCQYKENKKPLQLILVNTIPALAYKSSQLQMGQKKNSKQDAKTTDNDCSMVALGKQHSEDASKDNSDGVNEKVSCV. The pKi is 7.6. (9) The compound is CC[C@H](C)[C@@H](NC(=O)[C@@H](S)[C@H]([NH3+])CCS(=O)(=O)[O-])C(=O)N[C@@H](CC(=O)O)C(=O)O. The target protein (Q07075) has sequence MNFAEREGSKRYCIQTKHVAILCAVVVGVGLIVGLAVGLTRSCDSSGDGGPGTAPAPSHLPSSTASPSGPPAQDQDICPASEDESGQWKNFRLPDFVNPVHYDLHVKPLLEEDTYTGTVSISINLSAPTRYLWLHLRETRITRLPELKRPSGDQVQVRRCFEYKKQEYVVVEAEEELTPSSGDGLYLLTMEFAGWLNGSLVGFYRTTYTENGQVKSIVATDHEPTDARKSFPCFDEPNKKATYTISITHPKEYGALSNMPVAKEESVDDKWTRTTFEKSVPMSTYLVCFAVHQFDSVKRISNSGKPLTIYVQPEQKHTAEYAANITKSVFDYFEEYFAMNYSLPKLDKIAIPDFGTGAMENWGLITYRETNLLYDPKESASSNQQRVATVVAHELVHQWFGNIVTMDWWEDLWLNEGFASFFEFLGVNHAETDWQMRDQMLLEDVLPVQEDDSLMSSHPIIVTVTTPDEITSVFDGISYSKGSSILRMLEDWIKPENFQK.... The pKi is 6.5. (10) The compound is CC(C(=O)O)N(Cc1ccccc1Cl)S(=O)(=O)c1ccccc1. The target protein sequence is MKKNILKILMDSYSKESKIQTVRRVTSVSLLAVYLTMNTSSLVLAKPIENTNDTSIKNVEKLRNAPNEENSKKVEDSKNDKVEHVKNIEEAKVEQVAPEVKSKSTLRSASIANTNSEKYDFEYLNGLSYTELTNLIKNIKWNQINGLFNYSTGSQKFFGDKNRVQAIINALQESGRTYTANDMKGIETFTEVLRAGFYLGYYNDGLSYLNDRNFQDKCIPAMIAIQKNPNFKLGTAVQDEVITSLGKLIGNASANAEVVNNCVPVLKQFRENLNQYAPDYVKGTAVNELIKGIEFDFSGAAYEKDVKTMPWYGKIDPFINELKALGLYGNITSATEWASDVGIYYLSKFGLYSTNRNDIVQSLEKAVDMYKYGKIAFVAMERITWDYDGIGSNGKKVDHDKFLDDAEKHYLPKTYTFDNGTFIIRAGDKVSEEKIKRLYWASREVKSQFHRVVGNDKALEVGNADDVLTMKIFNSPEEYKFNTNINGVSTDNGGLYIEPR.... The pKi is 4.6.
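
The pKi definition stated at the top of this record (pKi = -log10(Ki in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset: the function names `ki_to_pki`, `ki_nm_to_pki`, and `pki_to_ki` are hypothetical, and the nanomolar helper assumes the raw Ki values are given in nM, which this record does not state.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki in M)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convert a Ki given in nM to pKi (assumption: raw values are in nM).

    Since 1 nM = 1e-9 M, -log10(ki_nm * 1e-9) = 9 - log10(ki_nm).
    """
    if ki_nm <= 0:
        raise ValueError("Ki must be a positive concentration")
    return 9.0 - math.log10(ki_nm)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: return Ki in mol/L for a given pKi."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # Labels taken from examples (1) and (5) in the record above.
    for pki in (4.1, 8.5):
        ki_molar = pki_to_ki(pki)
        print(f"pKi {pki} corresponds to Ki = {ki_molar:.3e} M")
```

Because the scale is logarithmic, each unit of pKi corresponds to a tenfold change in Ki, so the pKi 8.5 label in example (5) indicates a Ki more than four orders of magnitude lower than the pKi 4.1 label in example (1).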