Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is Nc1ncnc2nc(-c3ccc(N4CCOCC4)nc3)cc(-c3cccc(Br)c3)c12. The target is MLLARMKPQVQPELGGADQ. The pKi is 5.0. (2) The drug is O=C(CI)NCCCCCNc1ncnc2c1ncn2[C@@H]1O[C@H](COP(=O)(O)OP(=O)(O)OP(=O)(O)O)[C@@H](O)[C@H]1O. The target protein (P27144) has sequence MASKLLRAVILGPPGSGKGTVCQRIAQNFGLQHLSSGHFLRENIKASTEVGEMAKQYIEKSLLVPDHVITRLMMSELENRRGQHWLLDGFPRTLGQAEALDKICEVDLVISLNIPFETLKDRLSRRWIHPPSGRVYNLDFNPPHVHGIDDVTGEPLVQQEDDKPEAVAARLRQYKDVAKPVIELYKSRGVLHQFSGTETNKIWPYVYTLFSNKITPIQSKEAY. The pKi is 2.3. (3) The small molecule is CCCC[C@H](NC(=O)c1ccccc1)C(=O)N[C@@H](CCCCN)C(=O)N[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)O)[C@@H](C)O. The target protein (P29990) has sequence MNDQRKKAKNTPFNMLKRERNRVSTVQQLTKRFSLGMLQGRGPLKLYMALVAFLRFLTIPPTAGILKRWGTIKKSKAINVLRGFRKEIGRMLNILNRRRRSAGMIIMLIPTVMAFHLTTRNGEPHMIVSRQEKGKSLLFKTEDGVNMCTLMAMDLGELCEDTITYKCPLLRQNEPEDIDCWCNSTSTWVTYGTCTTMGEHRRQKRSVALVPHVGMGLETRTETWMSSEGAWKHVQRIETWILRHPGFTMMAAILAYTIGTTHFQRALIFILLTAVTPSMTMRCIGMSNRDFVEGVSGGSWVDIVLEHGSCVTTMAKNKPTLDFELIKTEAKQPATLRKYCIEAKLTNTTTESRCPTQGEPSLNEEQDKRFVCKHSMVDRGWGNGCGLFGKGGIVTCAMFRCKKNMEGKVVQPENLEYTIVITPHSGEEHAVGNDTGKHGKEIKITPQSSTTEAELTGYGTVTMECSPRTGLDFNEMVLLQMENKAWLVHRQWFLDLPLPW.... The pKi is 3.3. (4) The small molecule is Nc1ncnc2c1ncn2[C@@H]1O[C@H](CO[P@](=O)(O)OCCCCC[C@@H]2SC[C@@H]3NC(=O)N[C@H]23)[C@@H](O)[C@H]1O. The target protein (P06709) has sequence MKDNTVPLKLIALLANGEFHSGEQLGETLGMSRAAINKHIQTLRDWGVDVFTVPGKGYSLPEPIQLLNAKQILGQLDGGSVAVLPVIDSTNQYLLDRIGELKSGDACIAEYQQAGRGRRGRKWFSPFGANLYLSMFWRLEQGPAAAIGLSLVIGIVMAEVLRKLGADKVRVKWPNDLYLQDRKLAGILVELTGKTGDAAQIVIGAGINMAMRRVEESVVNQGWITLQEAGINLDRNTLAAMLIRELRAALELFEQEGLAPYLSRWEKLDNFINRPVKLIIGDKEIFGISRGIDKQGALLLEQDGIIKPWMGGEISLRSAEK. 
The pKi is 7.1. (5) The compound is C[NH+](C)C1CCc2cc(O)c(O)cc2C1. The target protein (P41595) has sequence MALSYRVSELQSTIPEHILQSTFVHVISSNWSGLQTESIPEEMKQIVEEQGNKLHWAALLILMVIIPTIGGNTLVILAVSLEKKLQYATNYFLMSLAVADLLVGLFVMPIALLTIMFEAMWPLPLVLCPAWLFLDVLFSTASIMHLCAISVDRYIAIKKPIQANQYNSRATAFIKITVVWLISIGIAIPVPIKGIETDVDNPNNITCVLTKERFGDFMLFGSLAAFFTPLAIMIVTYFLTIHALQKKAYLVKNKPPQRLTWLTVSTVFQRDETPCSSPEKVAMLDGSRKDKALPNSGDETLMRRTSTIGKKSVQTISNEQRASKVLGIVFFLFLLMWCPFFITNITLVLCDSCNQTTLQMLLEIFVWIGYVSSGVNPLVYTLFNKTFRDAFGRYITCNYRATKSVKTLRKRSSKIYFRNPMAENSKFFKKHGIRNGINPAMYQSPMRLRSSTIQSSSIILLDTLLLTENEGDKTEEQVSYV. The pKi is 5.7. (6) The drug is N[C@@H](CCC(=O)N[C@@H](CSC(=O)N(O)c1ccc(Cl)cc1)C(=O)NCC(=O)O)C(=O)O. The target protein (Q3B7M2) has sequence MVLGRGLLGRWSVAELGAVCARLGLGPALLGSLHHLGLRKSLTVDQGTMKVELLPALTDNYMYLLIDEDTKEAAIVDPVQPQKVVETARKHGVKLTTVLTTHHHWDHAGGNEKLVKLEPGLKVYGGDDRIGALTHKVTHLSTLQVGSLHVKCLSTPCHTSGHICYFVTKPNSPEPPAVFTGDTLFVAGCGKFYEGTADEMYKALLEVLGRLPADTRVYCGHEYTINNLKFARHVEPDNTAVREKLAWAKEKYSIGEPTVPSTIAEEFTYNPFMRVREKTVQQHAGETEPVATMRAIRKEKDQFKMPRD. The pKi is 5.8.
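The pKi definition stated above (pKi = -log10(Ki in M)) can be illustrated with a minimal Python sketch. The function names `ki_to_pki` and `pki_to_ki` and the `unit` parameter are hypothetical helpers introduced for illustration; they are not part of the dataset or of any BindingDB tooling. The sketch only shows how a label such as 5.0 or 7.1 relates to a molar Ki.

```python
import math

# Hypothetical unit table (an assumption for illustration): factors to convert Ki to molar.
_TO_MOLAR = {"M": 1.0, "mM": 1e-3, "uM": 1e-6, "nM": 1e-9, "pM": 1e-12}


def ki_to_pki(ki: float, unit: str = "M") -> float:
    """Return pKi = -log10(Ki in M); higher means stronger inhibition."""
    ki_molar = ki * _TO_MOLAR[unit]
    if ki_molar <= 0:
        raise ValueError("Ki must be positive to take a logarithm")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the definition: return Ki in M for a given pKi."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # A pKi of 5.0, as in example (1), corresponds to a Ki of about 1e-5 M (10 uM).
    print(pki_to_ki(5.0))
    # A pKi of 7.1, as in example (4), corresponds to a Ki of roughly 8e-8 M.
    print(pki_to_ki(7.1))
    # Round trip from a Ki given in micromolar.
    print(ki_to_pki(10.0, unit="uM"))
```

Because the scale is logarithmic, a difference of one pKi unit corresponds to a tenfold change in Ki, so example (4) at 7.1 binds roughly two orders of magnitude more tightly than example (1) at 5.0.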