Dataset: Drug-target binding data from BindingDB using Ki measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is COc1cc(OC)c(C(=O)CCCCN2CCC3(CC2)NC(=O)NC3=O)cc1NS(=O)(=O)c1ccc(C(F)(F)F)cc1. The target protein (P34968) has sequence MVNLGTAVRSLLVHLIGLLVWQFDISISPVAAIVTDTFNSSDGGRLFQFPDGVQNWPALSIVVIIIMTIGGNILVIMAVSMEKKLHNATNYFLMSLAIADMLVGLLVMPLSLLAILYDYVWPLPRYLCPVWISLDVLFSTASIMHLCAISLDRYVAIRNPIEHSRFNSRTKAIMKIAIVWAISIGVSVPIPVIGLRDESKVFVNNTTCVLNDPNFVLIGSFVAFFIPLTIMVITYFLTIYVLRRQTLMLLRGHTEEELRNISLNFLKCCCKKGDEEENAPNPNPDQKPRRKKKEKRPRGTMQAINNEKKASKVLGIVFFVFLIMWCPFFITNILSVLCGKACNQKLMEKLLNVFVWIGYVCSGINPLVYTLFNKIYRRAFSKYLRCDYKPDKKPPVRQIPRVAATALSGRELNVNIYRHTNERVVRKANDTEPGIEMQVENLELPVNPSNVVSERISSV. The pKi is 5.0. (2) The drug is CC1=C(/C=C/C(C)=C/C=C/C(C)=C/C=C/C=C(C)/C=C/C=C(C)/C=C/C2=C(C)CCCC2(C)C)C(C)(C)CCC1. The target protein (Q9NPD5) has sequence MDQHQHLNKTAESASSEKKKTRRCNGFKMFLAALSFSYIAKALGGIIMKISITQIERRFDISSSLAGLIDGSFEIGNLLVIVFVSYFGSKLHRPKLIGIGCLLMGTGSILTSLPHFFMGYYRYSKETHINPSENSTSSLSTCLINQTLSFNGTSPEIVEKDCVKESGSHMWIYVFMGNMLRGIGETPIVPLGISYIDDFAKEGHSSLYLGSLNAIGMIGPVIGFALGSLFAKMYVDIGYVDLSTIRITPKDSRWVGAWWLGFLVSGLFSIISSIPFFFLPKNPNKPQKERKISLSLHVLKTNDDRNQTANLTNQGKNVTKNVTGFFQSLKSILTNPLYVIFLLLTLLQVSSFIGSFTYVFKYMEQQYGQSASHANFLLGIITIPTVATGMFLGGFIIKKFKLSLVGIAKFSFLTSMISFLFQLLYFPLICESKSVAGLTLTYDGNNSVASHVDVPLSYCNSECNCDESQWEPVCGNNGITYLSPCLAGCKSSSGIKKHTV.... The pKi is 5.7. 
(3) The target protein (P10611) has sequence MSVSALSPTRLPGSLSGLLQVAALLGLLLLLLKAAQLYLHRQWLLRALQQFPCPPFHWLLGHSREFQNDQELERIQKWVEKFPGACPWWLSGNKARLLVYDPDYLKVILGRSDPKAPRNYKLMTPWIGYGLLLLDGQTWFQHRRMLTPAFHYDILKPYVGLMVDSVQIMLDRWEQLISQDSSLEIFQHVSLMTLDTIMKCAFSYQGSVQLDRNSHSYIQAINDLNNLVFYRARNVFHQSDFLYRLSPEGRLFHRACQLAHEHTDRVIQQRKAQLQQEGELEKVRRKRRLDFLDVLLFAKMENGSSLSDQDLRAEVDTFMFEGHDTTASGVSWIFYALATHPEHQHRCREEIQGLLGDGASITWEHLDQMPYTTMCIKEALRLYPPVPSVTRQLSKPVTFPDGRSLPKGVILFLSIYGLHYNPKVWQNPEVFDPFRFAPDSAYHSHAFLPFSGGARNCIGKQFAMRELKVAVALTLVRFELLPDPTRIPIPIARVVLKSKN.... The compound is C#CCCCC(O)CCCCCCCCCC(C)(C)C(=O)O. The pKi is 5.3. (4) The compound is O=C(c1ccccc1)N(C(=O)c1ccccc1)c1nc2ccccc2n2c(=O)n(-c3ccccc3)nc12. The target protein (P28190) has sequence MPPSISAFQAAYIGIEVLIALVSVPGNVLVIWAVKVNQALRDATFCFIVSLAVADVAVGALVIPLAILINIGPRTYFHTCLKVACPVLILTQSSILALLAIAVDRYLRVKIPLRYKTVVTPRRAVVAITGCWILSFVVGLTPMFGWNNLSAVERDWLANGSVGEPVIECQFEKVISMEYMVYFNFFVWVLPPLLLMVLIYMEVFYLIRKQLNKKVSASSGDPQKYYGKELKIAKSLALILFLFALSWLPLHILNCITLFCPSCHMPRILIYIAIFLSHGNSAMNPIVYAFRIQKFRVTFLKIWNDHFRCQPAPPVDEDAPAERPDD. The pKi is 7.5. (5) The pKi is 5.3. The target protein sequence is MQFSGEKISGQRDLQKSKMRFTFTSRCLALFLLLNHPTPILPAFSNQTYPTIEPKPFLYVVGRKKMMDAQYKCYDRMQQLPAYQGEGPYCNRTWDGWLCWDDTPAGVLSYQFCPDYFPDFDPSEKVTKYCDEKGVWFKHPENNRTWSNYTMCNAFTPEKLKNAYVLYYLAIVGHSLSIFTLVISLGIFVFFRKLTTIFPLNWKYRKALSLGCQRVTLHKNMFLTYILNSMIIIIHLVEVVPNGELVRRDPVSCKILHFFHQYMMACNYFWMLCEGIYLHTLIVVAVFTEKQRLRWYYLLGWGFPLVPTTIHAITRAVYFNDNCWLSVETHLLYIIHGPVMAALVVNFFFLLNIVRVLVTKMRETHEAESHMYLKAVKATMILVPLLGIQFVVFPWRPSNKMLGKIYDYVMHSLIHFQGFFVATIYCFCNNEVQTTVKRQWAQFKIQWNQRWGRRPSNRSARAAAAAAEAGDIPIYICHQEPRNEPANNQGEESAEIIPLN.... The compound is Cc1cc(C[C@@H](NC(=O)N2CCC(N3Cc4cccc(F)c4NC3=O)CC2)C(=O)N2CCC(N3CCCCC3)CC2)cc2cn[nH]c12. 
(6) The target protein (P18901) has sequence MAPNTSTMDEAGLPAERDFSFRILTACFLSLLILSTLLGNTLVCAAVIRFRHLRSKVTNFFVISLAVSDLLVAVLVMPWKAVAEIAGFWPLGPFCNIWVAFDIMCSTASILNLCVISVDRYWAISSPFQYERKMTPKAAFILISVAWTLSVLISFIPVQLSWHKAKPTWPLDGNFTSLEDTEDDNCDTRLSRTYAISSSLISFYIPVAIMIVTYTSIYRIAQKQIRRISALERAAVHAKNCQTTAGNGNPVECAQSESSFKMSFKRETKVLKTLSVIMGVFVCCWLPFFISNCMVPFCGSEETQPFCIDSITFDVFVWFGWANSSLNPIIYAFNADFQKAFSTLLGCYRLCPTTNNAIETVSINNNGAVVFSSHHEPRGSISKDCNLVYLIPHAVGSSEDLKKEEAGGIAKPLEKLSPALSVILDYDTDVSLEKIQPVTHSGQHST. The pKi is 6.0. The compound is CCCN(C(C)C)C1COc2cccc(C(N)=O)c2C1.
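The definition given above, pKi = -log10(Ki in M), can be sketched in a few lines of Python to show how the labels relate to raw Ki measurements. This is a minimal illustration, not the dataset's own preprocessing code. The function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical, and the choice of nanomolar as the reporting unit for the inverse is an assumption made for readability.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the transform and report Ki in nanomolar (assumed unit choice)."""
    return 10.0 ** (-pki) * 1e9


if __name__ == "__main__":
    # Hypothetical input: a Ki of 10 micromolar, i.e. 1e-5 mol/L.
    print(round(ki_to_pki(1e-5), 3))
    # A pKi label such as 7.5 converted back to a nanomolar Ki.
    print(round(pki_to_ki_nm(7.5), 2))
```

Under this definition a one-unit increase in pKi corresponds to a tenfold decrease in Ki, which is why higher values indicate stronger inhibition.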