Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is O=c1ncccn1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O. The target protein (P03958) has sequence MAQTPAFNKPKVELHVHLDGAIKPETILYFGKKRGIALPADTVEELRNIIGMDKPLSLPGFLAKFDYYMPVIAGCREAIKRIAYEFVEMKAKEGVVYVEVRYSPHLLANSKVDPMPWNQTEGDVTPDDVVDLVNQGLQEGEQAFGIKVRSILCCMRHQPSWSLEVLELCKKYNQKTVVAMDLAGDETIEGSSLFPGHVEAYEGAVKNGIHRTVHAGEVGSPEVVREAVDILKTERVGHGYHTIEDEALYNRLLKENMHFEVCPWSSYLTGAWDPKTTHAVVRFKNDKANYSLNTDDPLIFKSTLDTDYQMTKKDMGFTEEEFKRLNINAAKSSFLPEEEKKELLERLYREYQ. The pKi is 6.0. (2) The compound is CC(C)n1nnc(-c2o[nH]c(=O)c2CC(N)C(=O)O)n1. The target protein (P19492) has sequence MGQSVLRAVFFLVLGLLGHSHGGFPNTISIGGLFMRNTVQEHSAFRFAVQLYNTNQNTTEKPFHLNYHVDHLDSSNSFSVTNAFCSQFSRGVYAIFGFYDQMSMNTLTSFCGALHTSFVTPSFPTDADVQFVIQMRPALKGAILSLLSYYKWEKFVYLYDTERGFSVLQAIMEAAVQNNWQVTARSVGNIKDVQEFRRIIEEMDRRQEKRYLIDCEVERINTILEQVVILGKHSRGYHYMLANLGFTDILLERVMHGGANITGFQIVNNENPMVQQFIQRWVRLDEREFPEAKNAPLKYTSALTHDAILVIAEAFRYLRRQRVDVSRRGSAGDCLANPAVPWSQGIDIERALKMVQVQGMTGNIQFDTYGRRTNYTIDVYEMKVSGSRKAGYWNEYERFVPFSDQQISNDSSSSENRTIVVTTILESPYVMYKKNHEQLEGNERYEGYCVDLAYEIAKHVRIKYKLSIVGDGKYGARDPETKIWNGMVGELVYGRADIAV.... The pKi is 8.4. (3) The drug is COC(=O)[C@@H](CC(C)C)NC(=O)CNC(=O)[C@H](NC(=O)CCC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](NC(=O)[C@H](CCCNC(=N)N)NC(C)=O)C(C)C)C(C)C. 
The target protein (P09958) has sequence MELRPWLLWVVAATGTLVLLAADAQGQKVFTNTWAVRIPGGPAVANSVARKHGFLNLGQIFGDYYHFWHRGVTKRSLSPHRPRHSRLQREPQVQWLEQQVAKRRTKRDVYQEPTDPKFPQQWYLSGVTQRDLNVKAAWAQGYTGHGIVVSILDDGIEKNHPDLAGNYDPGASFDVNDQDPDPQPRYTQMNDNRHGTRCAGEVAAVANNGVCGVGVAYNARIGGVRMLDGEVTDAVEARSLGLNPNHIHIYSASWGPEDDGKTVDGPARLAEEAFFRGVSQGRGGLGSIFVWASGNGGREHDSCNCDGYTNSIYTLSISSATQFGNVPWYSEACSSTLATTYSSGNQNEKQIVTTDLRQKCTESHTGTSASAPLAAGIIALTLEANKNLTWRDMQHLVVQTSKPAHLNANDWATNGVGRKVSHSYGYGLLDAGAMVALAQNWTTVAPQRKCIIDILTEPKDIGKRLEVRKTVTACLGEPNHITRLEHAQARLTLSYNRRGD.... The pKi is 6.5. (4) The compound is Cc1ccc(Oc2ccc(NC(=O)CN3C(=O)/C(=C/c4ccc(O)c(O)c4)SC3=S)cc2)cc1. The target protein (P04036) has sequence MHDANIRVAIAGAGGRMGRQLIQAALALEGVQLGAALEREGSSLLGSDAGELAGAGKTGVTVQSSLDAVKDDFDVFIDFTRPEGTLNHLAFCRQHGKGMVIGTTGFDEAGKQAIRDAAADIAIVFAANFSVGVNVMLKLLEKAAKVMGDYTDIEIIEAHHRHKVDAPSGTALAMGEAIAHALDKDLKDCAVYSREGHTGERVPGTIGFATVRAGDIVGEHTAMFADIGERLEITHKASSRMTFANGAVRSALWLSGKESGLFDMRDVLDLNNL. The pKi is 4.6. (5) The compound is CS(=O)CC[C@H](N)C(=O)O. The target protein (P07314) has sequence MKNRFLVLGLVAVVLVFVIIGLCIWLPTTSGKPDHVYSRAAVATDAKRCSEIGRDMLQEGGSVVDAAIASLLCMGLINAHSMGIGGGLFFTIYNSTTRKAEVINAREMAPRLANTSMFNNSKDSEEGGLSVAVPGEIRGYELAHQRHGRLPWARLFQPSIQLARHGFPVGKGLARALDKKRDIIEKTPALCEVFCRQGKVLQEGETVTMPKLADTLQILAQEGARAFYNGSLTAQIVKDIQEAGGIMTVEDLNNYRAEVIEHPMSIGLGDSTLYVPSAPLSGPVLILILNILKGYNFSPKSVATPEQKALTYHRIVEAFRFAYAKRTMLGDPKFVDVSQVIRNMSSEFYATQLRARITDETTHPTAYYEPEFYLPDDGGTAHLSVVSEDGSAVAATSTINLYFGSKVLSRVSGILFNDEMDDFSSPNFTNQFGVAPSPANFIKPGKQPLSSMCPSIIVDKDGKVRMVVGASGGTQITTSVALAIINSLWFGYDVKRAVEE.... The pKi is 2.2. 
(6) The target protein (P06401) has sequence MTELKAKGPRAPHVAGGPPSPEVGSPLLCRPAAGPFPGSQTSDTLPEVSAIPISLDGLLFPRPCQGQDPSDEKTQDQQSLSDVEGAYSRAEATRGAGGSSSSPPEKDSGLLDSVLDTLLAPSGPGQSQPSPPACEVTSSWCLFGPELPEDPPAAPATQRVLSPLMSRSGCKVGDSSGTAAAHKVLPRGLSPARQLLLPASESPHWSGAPVKPSPQAAAVEVEEEDGSESEESAGPLLKGKPRALGGAAAGGGAAAVPPGAAAGGVALVPKEDSRFSAPRVALVEQDAPMAPGRSPLATTVMDFIHVPILPLNHALLAARTRQLLEDESYDGGAGAASAFAPPRSSPCASSTPVAVGDFPDCAYPPDAEPKDDAYPLYSDFQPPALKIKEEEEGAEASARSPRSYLVAGANPAAFPDFPLGPPPPLPPRATPSRPGEAAVTAAPASASVSSASSSGSTLECILYKAEGAPPQQGPFAPPPCKAPGASGCLLPRDGLPSTSA.... The pKi is 7.6. The small molecule is CC1=CC(C)(C)Nc2cc3c(cc21)-c1c(F)cc(F)cc1C3O.
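The pKi definition stated at the top of this record (pKi = -log10(Ki in M)) can be sketched in a few lines. This is a minimal illustration, not part of the dataset tooling: the function names `ki_to_pki`, `pki_to_ki`, and `ki_nm_to_pki` are hypothetical, and the nanomolar helper assumes the raw Ki values are reported in nM, which should be checked against the actual source export.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: recover Ki in mol/L from a pKi value."""
    return 10.0 ** (-pki)


def ki_nm_to_pki(ki_nanomolar: float) -> float:
    """Same conversion for a Ki given in nM (assumed unit, 1 nM = 1e-9 M)."""
    return ki_to_pki(ki_nanomolar * 1e-9)


if __name__ == "__main__":
    # Labels taken from the examples in this record.
    for pki in (6.0, 8.4, 6.5, 4.6, 2.2, 7.6):
        print(f"pKi {pki:>4} corresponds to Ki of about {pki_to_ki(pki):.3g} M")
```

Because the scale is logarithmic, a difference of 1.0 in pKi corresponds to a tenfold difference in Ki. The label 6.0 in example (1) therefore corresponds to a Ki of about 1 micromolar, while the label 2.2 in example (5) corresponds to a Ki in the millimolar range, that is, very weak binding.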