This is drug-target binding data from BindingDB, based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is CNC(=O)CCC(=O)N[C@@H](Cc1ccccc1)[C@H](O)CN(CC1CC1)S(=O)(=O)c1cccc(OC)c1. The target protein sequence is PQVTLWQRPLVTIRVGGQLKEALLDTGADDTVLEDMNLPGRWKPKMIGGIGGFIKVRQYDQITVEICGHKAIGTVLVGPTPVNIIGRNLLTXIGCTLNF. The pKi is 7.6. (2) The small molecule is C[C@]1(Cn2ccnn2)[C@H](C(=O)O)N2C(=O)C[C@H]2S1(=O)=O. The target protein sequence is MSLYRRLVLLSCLSWPLAGFSATALTNLVAEPFAKLEQDFGGSIGVYAMDTGSGATVSYRAEERFPLCSSFKGFLAAAVLARSQQQAGLLDTPIRYGKNALVPWSPISEKYLTTGMTVAELSAAAVQYSDNAAANLLLKELGGPAGLTAFMRSIGDTTFRLDRWELELNSAIPGDARDTSSPRAVTESLQKLTLGSALAAPQRQQFVDWLKGNTTGNHRIRAAVPADWAVGDKTGTCGVYGTANDYAVVWPTGRAPIVLAVYTRAPNKDDKHSEAVIAAAARLALEGLGVNGQ. The pKi is 4.1. (3) The compound is NS(N)(=O)=O. The target protein sequence is MKKTTWVLAMAASMSFGVQASEWGYEGEHAPEHWGKVAPLCAEGKNQSPIDVAQSVEADLQPFTLNYQGQVVGLLNNGHTLQAIVSGNNPLQIDGKTFQLKQFHFHTPSENLLKGKQFPLEAHFVHADEQGNLAVVAVMYQVGSESPLLKALTADMPTKGNSTQLTQGIPLADWIPESKHYYRFNGSLTTPPCSEGVRWIVLKEPAHVSNQQEQQLSAVMGHNNRPVQPHNARLVLQAD. The pKi is 5.1. (4) The small molecule is Cc1nc(-c2ccc(-c3ccc(C(=O)N4CCc5cc6c(cc54)C4(CCN(C)CC4)CO6)cc3)c(C)c2)no1. The target protein (P30545) has sequence MVHQEPYSVQATAAIASAITFLILFTIFGNALVILAVLTSRSLRAPQNLFLVSLAAADILVATLIIPFSLANELLGYWYFWRAWCEVYLALDVLFCTSSIVHLCAISLDRYWAVSRALEYNSKRTPRRIKCIILTVWLIAAVISLPPLIYKGDQRPEPHGLPQCELNQEAWYILASSIGSFFAPCLIMILVYLRIYVIAKRSHCRGLGAKRGSGEGESKKPRPGPAAGGVPASAKVPTLVSPLSSVGEANGHPKPPREKEEGETPEDPEARALPPNWSALPRSVQDQKKGTSGATAEKGAEEDEEEVEECEPQTLPASPASVFNPPLQQPQTSRVLATLRGQVLLSKNVGVASGQWWRRRTQLSREKRFTFVLAVVIGVFVVCWFPFFFSYSLGAICPQHCKVPHGLFQFFFWIGYCNSSLNPVIYTIFNQDFRRAFRRILCRQWTQTGW. The pKi is 6.0.
(5) The target protein (P56658) has sequence MAQTPAFNKPKVELHVHLDGAIKPETILYYGRKRGIALPADTPEELQNIIGMDKPLSLPEFLAKFDYYMPAIAGCREAVKRIAYEFVEMKAKDGVVYVEVRYSPHLLANSKVEPIPWNQAEGDLTPDEVVSLVNQGLQEGERDFGVKVRSILCCMRHQPSWSSEVVELCKKYREQTVVAIDLAGDETIEGSSLFPGHVKAYAEAVKSGVHRTVHAGEVGSANVVKEAVDTLKTERLGHGYHTLEDATLYNRLRQENMHFEVCPWSSYLTGAWKPDTEHPVVRFKNDQVNYSLNTDDPLIFKSTLDTDYQMTKNEMGFTEEEFKRLNINAAKSSFLPEDEKKELLDLLYKAYGMPSPASAEQCL. The pKi is 4.2. The compound is CC(O)[C@H](C=O)O[C@@H](C=O)n1cnc2c(N)ncnc21. (6) The drug is Cc1nc(C#Cc2ccc(-c3cccc(C(F)(F)F)c3)nc2)cs1. The pKi is 7.9. The target protein (P31424) has sequence MVLLLILSVLLLKEDVRGSAQSSERRVVAHMPGDIIIGALFSVHHQPTVDKVHERKCGAVREQYGIQRVEAMLHTLERINSDPTLLPNITLGCEIRDSCWHSAVALEQSIEFIRDSLISSEEEEGLVRCVDGSSSFRSKKPIVGVIGPGSSSVAIQVQNLLQLFNIPQIAYSATSMDLSDKTLFKYFMRVVPSDAQQARAMVDIVKRYNWTYVSAVHTEGNYGESGMEAFKDMSAKEGICIAHSYKIYSNAGEQSFDKLLKKLRSHLPKARVVACFCEGMTVRGLLMAMRRLGLAGEFLLLGSDGWADRYDVTDGYQREAVGGITIKLQSPDVKWFDDYYLKLRPETNLRNPWFQEFWQHRFQCRLEGFAQENSKYNKTCNSSLTLRTHHVQDSKMGFVINAIYSMAYGLHNMQMSLCPGYAGLCDAMKPIDGRKLLDSLMKTNFTGVSGDMILFDENGDSPGRYEIMNFKEMGKDYFDYINVGSWDNGELKMDDDEVWS.... (7) The compound is COC(=O)[C@@H](c1ccccc1)[C@@H]1CCCCN1. The target is MLLARMKPQVQPELGGADQ. The pKi is 7.0. (8) The compound is CCOc1ccc(C[C@@H]2NC(=O)CC(C3CCCC3)(C3CCCC3)SSC[C@@H](C(=O)N[C@H](CCCN=C(N)N)C(=O)N[C@@H](CCCN=C(N)N)C(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](Cc3ccccc3)NC2=O)cc1. The target protein (Q8NFM4) has sequence MARLFSPRPPPSEDLFYETYYSLSQQYPLLLLLLGIVLCALAALLAVAWASGRELTSDPSFLTTVLCALGGFSLLLGLASREQRLQRWTRPLSGLVWVALLALGHAFLFTGGVVSAWDQVSYFLFVIFTAYAMLPLGMRDAAVAGLASSLSHLLVLGLYLGPQPDSRPALLPQLAANAVLFLCGNVAGVYHKALMERALRATFREALSSLHSRRRLDTEKKHQEHLLLSILPAYLAREMKAEIMARLQAGQGSRPESTNNFHSLYVKRHQGVSVLYADIVGFTRLASECSPKELVLMLNELFGKFDQIAKEHECMRIKILGDCYYCVSGLPLSLPDHAINCVRMGLDMCRAIRKLRAATGVDINMRVGVHSGSVLCGVIGLQKWQYDVWSHDVTLANHMEAGGVPGRVHITGATLALLAGAYAVEDAGMEHRDPYLRELGEPTYLVIDPRAEEEDEKGTAGGLLSSLEGLKMRPSLLMTRYLESWGAAKPFAHLSHGDSP.... The pKi is 8.7. 
(9) The small molecule is CN(C)CC/C=C1\c2ccccc2COc2ccc(CC(=O)O)cc21. The target protein (P31389) has sequence MSFLPGMTPVTLSNFSWALEDRMLEGNSTTTPTRQLMPLVVVLSSVSLVTVALNLLVLYAVRSERKLHTVGNLYIVSLSVADLIVGAVVMPMSILYLHRSAWILGRPLCLFWLSMDYVASTASIFSVFILCIDRYRSVQQPLRYLRYRTKTRASATILGAWLLSFLWVIPILGWHHFMAPTSEPREKKCETDFYDVTWFKVMTAIINFYLPTLLMLWFYIRIYKAVRRHCQHRQLINSSLPSFSEMKLKLENAKVDTRRMGKESPWEDPKRCSKDASGVHTPMPSSQHLVDMPCAAVLSEDEGGEVGTRQMPMLAVGDGRCCEALNHMHSQLELSGQSRATHSISARPEEWTVVDGQSFPITDSDTSTEAAPMGGQPRSGSNSGLDYIKFTWRRLRSHSRQYTSGLHLNRERKAAKQLGCIMAAFILCWIPYFVFFMVIAFCKSCSNEPVHMFTIWLGYLNSTLNPLIYPLCNENFRKTFKRILRIPP. The pKi is 8.3. (10) The compound is CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)N[C@H](C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(N)=O)[C@@H](C)CC. The target protein (Q9Z2D5) has sequence MGPIGTEADENQTVEEIKVEPYGPGHTTPRGELAPDPEPELIDSTKLTEVRVVLILAYCSIILLGVVGNSLVIHVVIKFKSMRTVTNFFIANLAVADLLVNTLCLPFTLTYTLMGEWKMGPVLCHLVPYAQGLAVQVSTVTLTVIALDRHRCIVYHLDSKISKQNSFLIIGLAWGISALLASPLAIFREYSLIEIIPDFEIVACTEKWPGEEKSIYGTVYSLSSLLILYVLPLGIISVSYVRIWSKLKNHVSPGAANDHYHQRRQKTTKMLVFVVVVFAVSWLPLHAFQLAVDIDSQVLDLKEYKLIFTVFHIIAMCSTFANPLLYGWMNSNYRKAFLSAFRCQQRLDAIQSEVCVTGKAKTNVEVEKNHGAADSAEATNV. The pKi is 6.5.
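The pKi definition stated in the task description (pKi = -log10(Ki in M)) can be sketched in Python as below. This is a minimal illustration, not part of the dataset: the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical, and the sketch assumes Ki is supplied in molar units as the definition requires.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki, expressed in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # A few pKi labels taken from the examples above.
    for pki in (7.6, 4.1, 8.7):
        print(f"pKi {pki:.1f} corresponds to Ki of about {pki_to_ki_nm(pki):.3g} nM")
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki, which is why a higher pKi indicates stronger inhibition.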