Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is N[C@@H](CCP(=O)(O)O)C(=O)O. The target protein (P70579) has sequence MVCEGKRLASCPCFFLLTAKFYWILTMMQRTHSQEYAHSIRVDGDIILGGLFPVHAKGERGVPCGELKKEKGIHRLEAMLYAIDQINKDPDLLSNITLGVRILDTCSRDTYALEQSLTFVQALIEKDASDVKCANGDPPIFTKPDKISGVIGAAASSVSIMVANILRLFKIPQISYASTAPELSDNTRYDFFSRVVPPDSYQAQAMVDIVTALGWNYVSTLASEGNYGESGVEAFTQISREIGGVCIAQSQKIPREPRPGEFEKIIKRLLETPNARAVIMFANEDDIRRILEAAKKLNQSGHFLWIGSDSWGSKIAPVYQQEEIAEGAVTILPKRASIDGFDRYFRSRTLANNRRNVWFAEFWEENFGCKLGSHGKRNSHIKKCTGLERIARDSSYEQEGKVQFVIDAVYSMAYALHNMHKERCPGYIGLCPRMVTIDGKELLGYIRAVNFNGSAGTPVTFNENGDAPGRYDIFQYQINNKSTEYKIIGHWTNQLHLKVE.... The pKi is 5.7. (2) The drug is C=C(C=O)C[C@H]1C[C@H](O)[C@]2(C)O[C@@H]3C[C@@H]4O[C@@H]5C[C@]6(C)O[C@]7(C)CC[C@@H]8O[C@@H]9C[C@]%10(C)O[C@@H]%11C(C)=CC(=O)O[C@H]%11C[C@H]%10O[C@H]9C[C@@H](C)[C@H]8O[C@H]7C[C@H]6O[C@@]5(C)C/C=C\[C@H]4O[C@H]3C[C@H]2O1. The target protein (P60615) has sequence MKTLLLTLVVVTIVCLDLGYTIVCHTTATSPISAVTCPPGENLCYRKMWCDAFCSSRGKVVELGCAATCPSKKPYEEVTCCSTDKCNPHPKQRPG. The pKi is 9.3. (3) The drug is [SH-]. The target protein sequence is MSDLQQLFENNVRWAEAIKQEDPDFFAKLARQQTPEYLWIGCSDARVPANEIVGMLPGDLFVHRNVANVVLHTDLNCLSVIQFAVDVLKVKHILVTGHYGCGGVRASLHNDQLGLIDGWLRSIRDLAYEYREHLEQLPTEEERVDRLCELNVIQQVANVSHTSIVQNAWHRGQSLSVHGCIYGIKDGLWKNLNVTVSGLDQLPPQYRLSPLGGCC. The pKi is 3.2. (4) The drug is CCCCCCCCOc1c(OC)cc(Cc2cnc(N)nc2N)cc1OC. The target protein (P00381) has sequence MTAFLWAQDRDGLIGKDGHLPWHLPDDLHYFRAQTVGKIMVVGRRTYESFPKRPLPERTNVVLTHQEDYQAQGAVVVHDVAAVFAYAKQHPDQELVIAGGAQIFTAFKDDVDTLLVTRLAGSFEGDTKMIPLNWDDFTKVSSRTVEDTNPALTHTYEVWQKKA. The pKi is 5.7. (5) The drug is O=C(c1cc(-c2cc(F)cc([N+](=O)[O-])c2O)cc([N+](=O)[O-])c1O)C1CCc2ccccc21. 
The target protein (O43447) has sequence MAVANSSPVNPVVFFDVSIGGQEVGRMKIELFADVVPKTAENFRQFCTGEFRKDGVPIGYKGSTFHRVIKDFMIQGGDFVNGDGTGVASIYRGPFADENFKLRHSAPGLLSMANSGPSTNGCQFFITCSKCDWLDGKHVVFGKIIDGLLVMRKIENVPTGPNNKPKLPVVISQCGEM. The pKi is 4.0. (6) The compound is CN1CNc2c1[nH]c(=O)[nH]c2=O. The target protein (P30543) has sequence MGSSVYITVELAIAVLAILGNVLVCWAVWINSNLQNVTNFFVVSLAAADIAVGVLAIPFAITISTGFCAACHGCLFFACFVLVLTQSSIFSLLAIAIDRYIAIRIPLRYNGLVTGVRAKGIIAICWVLSFAIGLTPMLGWNNCSQKDGNSTKTCGEGRVTCLFEDVVPMNYMVYYNFFAFVLLPLLLMLAIYLRIFLAARRQLKQMESQPLPGERTRSTLQKEVHAAKSLAIIVGLFALCWLPLHIINCFTFFCSTCRHAPPWLMYLAIILSHSNSVVNPFIYAYRIREFRQTFRKIIRTHVLRRQEPFQAGGSSAWALAAHSTEGEQVSLRLNGHPLGVWANGSATHSGRRPNGYTLGLGGGGSAQGSPRDVELPTQERQEGQEHPGLRGHLVQARVGASSWSSEFAPS. The pKi is 7.8. (7) The compound is O=C(CN1CCN(C(=O)c2ccco2)CC1)Nc1cc(C(F)(F)F)ccc1Cl. The target protein sequence is MCGNNMSTPLPAIVPAARKATAAVIFLHGLGDTGHGWAEAFAGIRSSHIKYICPHAPVRPVTLNMNVAMPSWFDLIGLSPDAPEDESGIKQAAENIKALIDQEVKNGIPSNRIILGGFSQGGALSLYTALTTQQKLAGVTALSCWLPLRASFPQGPIGGANRDISILQCHGDCDPLVPLMFGSLTVEKLKTLVNPANVTFKTYEGMMHSSCQQEMMDVKQFIDKLLPPID. The pKi is 5.0. (8) The small molecule is CCCN(CCC)[C@H]1CCc2c(F)ccc(O)c2C1. The target protein sequence is MCRQLQRASFPEHRCSLSRKKNGGPGNQLEIARSPFAQGCCNLTLNQSLPTSDPLNASEKGEVSRMSVREKNWPALLILVVILLTIGGNILVIMAVSLEKKLQNATNFFLMSLAVADMLVGILVMPVSLITVLYDYAWPLPKQLCPIWISLDVLFSTASIMHLCAISLDRYVAIRNPIEHSRFNSRTKAIMKIAAVWTISIGISMPIPVMGLQDDSRVFVNGTCVLNDENFVLIGSFMAFFIPLIIMVITYCLTIQVLQRQATVFMCGEVPRQRRSSVNCLKKENNTENISMLHNHEGASHLNSPVNKEAVLFRKGTMQSINNERRASKVLGIVFFLFLIMWCPFFITNVMSVLCKEACDKDLLSELLDVFVWVGRLQFPERRWGMKFWCGFVILTGITCTLGEV. The pKi is 5.0. (9) The small molecule is O=C([O-])CN1C(=O)/C(=C/c2ccc(O)c(O)c2)SC1=S. 
The target protein (P04036) has sequence MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL. The pKi is 7.6.
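
The task description defines the label as pKi = -log10(Ki in M). The sketch below shows that conversion in both directions, as a minimal illustration only. The function names (ki_to_pki, ki_nm_to_pki, pki_to_ki) are hypothetical and not part of BindingDB or the bindingdb_ki dataset, and the nanomolar helper assumes the raw Ki values are reported in nM, which the text above does not state.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nanomolar: float) -> float:
    """Same conversion for a Ki given in nM (assumed unit; 1 nM = 1e-9 M)."""
    return ki_to_pki(ki_nanomolar * 1e-9)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: recover Ki in mol/L from a pKi value."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # pKi labels quoted in the examples above, mapped back to Ki in mol/L.
    for pki in (9.3, 7.6, 3.2):
        print(f"pKi {pki} -> Ki = {pki_to_ki(pki):.3g} M")
```

Because the scale is logarithmic, each unit of pKi corresponds to a tenfold change in Ki: the pKi of 9.3 in example (2) corresponds to a Ki of roughly 0.5 nM, while the pKi of 3.2 in example (3) corresponds to roughly 0.6 mM.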