Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki (drug-target binding data from BindingDB using Ki measurements).
(1) The compound is O=C1CCc2ccccc2N1CCCN1CCC(c2ccccc2)CC1. The target protein (P08173) has sequence MANFTPVNGSSGNQSVRLVTSSSHNRYETVEMVFIATVTGSLSLVTVVGNILVMLSIKVNRQLQTVNNYFLFSLACADLIIGAFSMNLYTVYIIKGYWPLGAVVCDLWLALDYVVSNASVMNLLIISFDRYFCVTKPLTYPARRTTKMAGLMIAAAWVLSFVLWAPAILFWQFVVGKRTVPDNQCFIQFLSNPAVTFGTAIAAFYLPVVIMTVLYIHISLASRSRVHKHRPEGPKEKKAKTLAFLKSPLMKQSVKKPPPGEAAREELRNGKLEEAPPPALPPPPRPVADKDTSNESSSGSATQNTKERPATELSTTEATTPAMPAPPLQPRALNPASRWSKIQIVTKQTGNECVTAIEIVPATPAGMRPAANVARKFASIARNQVRKKRQMAARERKVTRTIFAILLAFILTWTPYNVMVLVNTFCQSCIPDTVWSIGYWLCYVNSTINPACYALCNATFKKTFRHLLLCQYRNIGTAR. The pKi is 5.9.
(2) The drug is C=CCN1C2CCC1[C@@H](C(=O)OCC)[C@@H](OC(c1ccc(F)cc1)c1ccc(F)cc1)C2. The target protein sequence is MVTRTRETWGKKIDFLLSVVGFAVDLANVWRFPYLCYKNGGGAFLIPYTLFLIIAGMPLFYMELALGQFNREGAATVWKICPFFKGVGYAVILIALYVGFYYNVIIAWSLYYLFASFTLNLPWTNCGHAWNSPNCTDPKLLNASVLGDHTKYSKYKFTPAAEFYERGVLHLHESSGIHDIGLPQWQLLLCLMVVIVVLYVSLWKGVKTSGKVVWITATLPYFVLFVLLVHGVTLPGASNGINAYLHIDFYRLKEATVWIDAATQIFFSLGAGFGVLIAFASYNKFDNNCYRDALLTSTINCVTSFISGFAIFSILGYMAHEHKVKIEDVATEGAGLVFVLYPEAISTLSGSTFWAVLFFLMLLALGLDSSMGGMEAVITGLADDFQVLKRHRKLFTCAVTLGTFLLAMFCITKGGIYVLTLLDTFAAGTSILFAVLMEAIGVSWFYGVDRFSNDIQQMMGFKPGLYWRLCWKFVSPAFLLFVVVVSIINFKPLTYDDYVY.... The pKi is 6.0.
(3) The drug is O=C1OC[C@H](c2ccc(OCCCC(F)F)cc2)N1c1ccc2[nH]cnc2c1. The target protein (Q16769) has sequence MAGGRHRRVVGTLHLLLLVAALPWASRGVSPSASAWPEEKNYHQPAILNSSALRQIAEGTSISEMWQNDLQPLLIERYPGSPGSYAARQHIMQRIQRLQADWVLEIDTFLSQTPYGYRSFSNIISTLNPTAKRHLVLACHYDSKYFSHWNNRVFVGATDSAVPCAMMLELARALDKKLLSLKTVSDSKPDLSLQLIFFDGEEAFLHWSPQDSLYGSRHLAAKMASTPHPPGARGTSQLHGMDLLVLLDLIGAPNPTFPNFFPNSARWFERLQAIEHELHELGLLKDHSLEGRYFQNYSYGGVIQDDHIPFLRRGVPVLHLIPSPFPEVWHTMDDNEENLDESTIDNLNKILQVFVLEYLHL. The pKi is 7.8.
(4) The compound is CNCCC(Oc1ccccc1C)c1ccccc1. The target is MLLARMKPQVQPELGGADQ. The pKi is 8.7.
(5) The compound is O=C(O[C@H]1CN2CCC1CC2)N1CCc2ccccc2[C@H]1c1ccccc1. The target protein (P08483) has sequence MTLHSNSTTSPLFPNISSSWVHSPSEAGLPLGTVTQLGSYNISQETGNFSSNDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLACADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEAEAENFVHPTGSSRSCSSYELQQQGVKRSSRRKYGRCHFWFTTKSWKPSAEQMDQDHSSSDSWNNNDAAASLENSASSDEEDIGSETRAIYSIVLKLPGHSSILNSTKLPSSDNLQVSNEDLGTVDVERNAHKLQAQKSMGDGDNCQKDFTKLPIQLESAVDTGKTSDTNSSADKTTATLPLSFKEATLAKRFALKTRSQITKRKRMSLIKEKKAAQTLSAILLAFI.... The pKi is 9.0.
(6) The target protein sequence is PQITLWQRPLVTVKIGGQLREALLDTGADDTVLEDINLPGKWKPKMIGGVGGFIKVKQYEQVLIEICGKKVIGTVLVGPTPVNIIGRNMLTQIGCTLNF. The pKi is 9.6. The drug is CC[C@H](C)[C@H](NC(C)=O)C(=O)N[C@@H](Cc1ccccc1)[C@H](O)CN(CC(C)C)S(=O)(=O)c1ccc(OC)cc1.
(7) The compound is Nc1cc(C(O)CNCCCCNc2ncnc3c2ncn3[C@@H]2O[C@H](CO)[C@@H](O)[C@H]2O)ccc1O. The target protein (P07550) has sequence MGQPGNGSAFLLAPNGSHAPDHDVTQERDEVWVVGMGIVMSLIVLAIVFGNVLVITAIAKFERLQTVTNYFITSLACADLVMGLAVVPFGAAHILMKMWTFGNFWCEFWTSIDVLCVTASIETLCVIAVDRYFAITSPFKYQSLLTKNKARVIILMVWIVSGLTSFLPIQMHWYRATHQEAINCYANETCCDFFTNQAYAIASSIVSFYVPLVIMVFVYSRVFQEAKRQLQKIDKSEGRFHVQNLSQVEQDGRTGHGLRRSSKFCLKEHKALKTLGIIMGTFTLCWLPFFIVNIVHVIQDNLIRKEVYILLNWIGYVNSGFNPLIYCRSPDFRIAFQELLCLRRSSLKAYGNGYSSNGNTGEQSGYHVEQEKENKLLCEDLPGTEDFVGHQGTVPSDNIDSQGRNCSTNDSLL. The pKi is 5.4.
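
Note on the label scale: the pKi values listed here follow the definition in the task statement, pKi = -log10(Ki in M). As a minimal sketch of that conversion (the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical helpers for illustration, not part of the dataset or of any BindingDB tooling):

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki, expressed in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # A pKi of 9.0, as in example (5), corresponds to Ki = 1 nM.
    print(round(pki_to_ki_nm(9.0), 6))
    # A Ki of 1 micromolar (1e-6 M) corresponds to pKi = 6.
    print(round(ki_to_pki(1e-6), 6))
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold difference in Ki.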