Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is Cc1cccc(OC(=O)C2=C(c3ccc(Cl)c(Cl)c3)CCC2)c1. The target protein (P23975) has sequence MLLARMNPQVQPENNGADTGPEQPLRARKTAELLVVKERNGVQCLLAPRDGDAQPRETWGKKIDFLLSVVGFAVDLANVWRFPYLCYKNGGGAFLIPYTLFLIIAGMPLFYMELALGQYNREGAATVWKICPFFKGVGYAVILIALYVGFYYNVIIAWSLYYLFSSFTLNLPWTDCGHTWNSPNCTDPKLLNGSVLGNHTKYSKYKFTPAAEFYERGVLHLHESSGIHDIGLPQWQLLLCLMVVVIVLYFSLWKGVKTSGKVVWITATLPYFVLFVLLVHGVTLPGASNGINAYLHIDFYRLKEATVWIDAATQIFFSLGAGFGVLIAFASYNKFDNNCYRDALLTSSINCITSFVSGFAIFSILGYMAHEHKVNIEDVATEGAGLVFILYPEAISTLSGSTFWAVVFFVMLLALGLDSSMGGMEAVITGLADDFQVLKRHRKLFTFGVTFSTFLLALFCITKGGIYVLTLLDTFAAGTSILFAVLMEAIGVSWFYGVDR.... The pKi is 4.0. (2) The compound is O=C(Nc1ccc(Cl)c(C(F)(F)F)c1)[C@H]1CC=C[C@H]2CCN(Cc3ccccc3)C(=O)[C@@H]12. The target protein sequence is MDSPIQIFRGEPGPTCAPSACLPPNSSAWFPGWAEPDSNGSAGSEDAQLEPAHISPAIPVIITAVYSVVFVVGLVGNSLVMFVIIRYTKMKTATNIYIFNLALADALVTTTMPFQSTVYLMNSWPFGDVLCKIVISIDYYNMFTSIFTLTMMSVDRYIAVCHPVKALDFRTPLKAKIINICIWLLSSSVGISAIVLGGTKVREDVDVIECSAQFPDDDYSWWDLFMKICVFIFAFVIPVLIIIVCYTLMILRLKSVRLLSGSREKDRNLRRITRLVLVVVAVFVVCWTPIHIFILVEALGSTSHSTAALSSYYFCIALGYTNSSLNPILYAFLDENFKRCFRDFCFPLKMRMERQSTSRVRNTVQDPAYLRDIDGMNKPV. The pKi is 7.0. (3) The target protein (P35348) has sequence MVFLSGNASDSSNCTQPPAPVNISKAILLGVILGGLILFGVLGNILVILSVACHRHLHSVTHYYIVNLAVADLLLTSTVLPFSAIFEVLGYWAFGRVFCNIWAAVDVLCCTASIMGLCIISIDRYIGVSYPLRYPTIVTQRRGLMALLCVWALSLVISIGPLFGWRQPAPEDETICQINEEPGYVLFSALGSFYLPLAIILVMYCRVYVVAKRESRGLKSGLKTDKSDSEQVTLRIHRKNAPAGGSGMASAKTKTHFSVRLLKFSREKKAAKTLGIVVGCFVLCWLPFFLVMPIGSFFPDFKPSETVFKIVFWLGYLNSCINPIIYPCSSQEFKKAFQNVLRIQCLCRKQSSKHALGYTLHPPSQAVEGQHKDMVRIPVGSRETFYRISKTDGVCEWKFFSSMPRGSARITVSKDQSSCTTARVRSKSFLQVCCCVGPSTPSLDKNHQVPTIKVHTISLSENGEEV. The pKi is 8.4. 
The drug is COc1ccc([N+](=O)[O-])cc1S(=O)(=O)N[C@H]1CC[C@@H](N2CCN(c3ccccc3OC(C)C)CC2)CC1. (4) The compound is Nc1ccc2c(c1)CC1(C(=O)NC(=O)NC1=O)C1CN(Cc3ccccc3)CCN21. The target protein (O67625) has sequence MVDIIIAEHAGFCFGVKRAVKLAEESLKESQGKVYTLGPIIHNPQEVNRLKNLGVFPSQGEEFKEGDTVIIRSHGIPPEKEEALRKKGLKVIDATCPYVKAVHEAVCQLTREGYFVVLVGEKNHPEVIGTLGYLRACNGKGIVVETLEDIGEALKHERVGIVAQTTQNEEFFKEVVGEIALWVKEVKVINTICNATSLRQESVKKLAPEVDVMIIIGGKNSGNTRRLYYISKELNPNTYHIETAEELQPEWFRGVKRVGISAGASTPDWIIEQVKSRIQEICEGQLVSS. The pKi is 6.2. (5) The drug is CCCC[C@H](NC(=O)[C@H](C)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](Cc1ccccc1)NC[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](N)CCCNC(=N)N)C(C)C)C(N)=O. The target protein sequence is PQITLWKRPLVTIKIGGQLKEALLDTGADDTVIEEMSLPGRWKPKMIGGIGGFIKVRQYDQIIIEIAGHKAISTVLVGPTPVNIIGRNLLTQIGATLNF. The pKi is 6.5. (6) The compound is COC(=O)c1c(C)nc(C)c(C(=O)OCCCN2CCC(c3ccccc3)(c3ccccc3)CC2)c1-c1ccc([N+](=O)[O-])cc1. The target protein (P22002) has sequence MIRAFAQPSTPPYQPLSSCLSEDTERKFKGKVVHEAQLNCFYISPGGSNYGSPRPAHANMNANAAAGLAPEHIPTPGAALSWLAAIDAARQAKLMGSAGNATISTVSSTQRKRQQYGKPKKQGGTTATRPPRALLCLTLKNPIRRACISIVEWKPFEIIILLTIFANCVALAIYIPFPEDDSNATNSNLERVEYLFLIIFTVEAFLKVIAYGLLFHPNAYLRNGWNLLDFIIVVVGLFSAILEQATKADGANALGGKGAGFDVKALRAFRVLRPLRLVSGVPSLQVVLNSIIKAMVPLLHIALLVLFVIIIYAIIGLELFMGKMHKTCYNQEGIIDVPAEEDPSPCALETGHGRQCQNGTVCKPGWDGPKHGITNFDNFAFAMLTVFQCITMEGWTDVLYWMQDAMGYELPWVYFVSLVIFGSFFVLNLVLGVLSGEFSKEREKAKARGDFQKLREKQQLEEDLKGYLDWITQAEDIDPENEDEGMDEDKPRNMSMPTSE.... The pKi is 6.9. (7) The compound is CC/C=C/C[C@H](C[C@H](N)C(=O)O)C(=O)O. 
The target protein (Q13002) has sequence MKIIFPILSNPVFRRTVKLLLCLLWIGYSQGTTHVLRFGGIFEYVESGPMGAEELAFRFAVNTINRNRTLLPNTTLTYDTQKINLYDSFEASKKACDQLSLGVAAIFGPSHSSSANAVQSICNALGVPHIQTRWKHQVSDNKDSFYVSLYPDFSSLSRAILDLVQFFKWKTVTVVYDDSTGLIRLQELIKAPSRYNLRLKIRQLPADTKDAKPLLKEMKRGKEFHVIFDCSHEMAAGILKQALAMGMMTEYYHYIFTTLDLFALDVEPYRYSGVNMTGFRILNTENTQVSSIIEKWSMERLQAPPKPDSGLLDGFMTTDAALMYDAVHVVSVAVQQFPQMTVSSLQCNRHKPWRFGTRFMSLIKEAHWEGLTGRITFNKTNGLRTDFDLDVISLKEEGLEKIGTWDPASGLNMTESQKGKPANITDSLSNRSLIVTTILEEPYVLFKKSDKPLYGNDRFEGYCIDLLRELSTILGFTYEIRLVEDGKYGAQDDANGQWNG.... The pKi is 5.5. (8) The drug is CCC(C)C(=O)Nc1cc(-c2ccccc2)nc(-c2ccccc2)n1. The target protein (P30542) has sequence MPPSISAFQAAYIGIEVLIALVSVPGNVLVIWAVKVNQALRDATFCFIVSLAVADVAVGALVIPLAILINIGPQTYFHTCLMVACPVLILTQSSILALLAIAVDRYLRVKIPLRYKMVVTPRRAAVAIAGCWILSFVVGLTPMFGWNNLSAVERAWAANGSMGEPVIKCEFEKVISMEYMVYFNFFVWVLPPLLLMVLIYLEVFYLIRKQLNKKVSASSGDPQKYYGKELKIAKSLALILFLFALSWLPLHILNCITLFCPSCHKPSILTYIAIFLTHGNSAMNPIVYAFRIQKFRVTFLKIWNDHFRCQPAPPIDEDLPEERPDD. The pKi is 8.6.
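Note on the label: the task statement defines pKi = -log10(Ki in M). A minimal sketch of that conversion follows. The function names are hypothetical, and the nanomolar helper assumes the raw Ki values are given in nM, which this record does not state.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki in mol/L to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nanomolar: float) -> float:
    """Same conversion for a Ki given in nM (assumed unit, not stated in the record)."""
    return ki_to_pki(ki_nanomolar * 1e-9)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: return Ki in mol/L for a given pKi."""
    return 10.0 ** (-pki)
```

Under this definition, the pKi of 4.0 in example (1) corresponds to a Ki of 1e-4 M (100 µM), and each one-unit increase in pKi is a tenfold decrease in Ki.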