This drug-target binding data comes from BindingDB and uses Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is OC[C@H]1OC(n2cnc3c(NC4CCC[C@@H]4OCc4ccc(F)cc4)ncnc32)[C@H](O)[C@@H]1O. The target protein sequence is MPPRAASLPPGSTCSGCRVLPLHPVPAHILPEELTAAASRLQVRACLSAAVPTMGSWVYITVELAIAVLAVLGNVLVCWAVWLNSNLQNVTNYFVVSLAAADIAVGVLAIPFAITISTGFCAACHGCLFIACFVLVLTQSSIFSLLAIAIDRYIAIRIPLRYNGLVTGTRAKGIIAICWVLSFAIGLTPMLGWNNCGQPREGRNHSQGCGAGQVACLFEDVVPMNYMVYYNFFACVLLPLLLMLGIYLRIFLAARRQLKQMESQPLPGERTRSTLQKEVHAAKSLAIIVGLFALCWLPLHIINCFTLFCPECSHAPLWLMYLAIVLSHSNSVVNPFIYAYRIREFRQTFRKIIRSHVLRRQDPFKAGGTSARALAAHGSDGEHVSLRLNGHPPGLWANGSAPHPQRRPNGYALGLGSTGGARASHRDVSLPDVELLGQERKSMCPESPGLEEPLAQDGAGVS. The pKi is 5.3. (2) The target protein sequence is MHARRLPRLLPLALAFLLSPAAFAADTPAAELLRQAEAERPAYLDTLRQLVAVDSGTGQAEGLGQLSALLAERLQALGAQVRSAPATPSAGDNLVATLDGTGSKRFLLMIHYDTVFAAGSAAKRPFREDAERAYGPGVADAKGGVAMVLHALALLRQQGFRDYGRITVLFNPDEETGSAGSKQLIAELARQQDYVFSYEPPDRDAVTVATNGIDGLLLEVKGRSSHAGSAPEQGRNAILELSHQLLRLKDLGDPAKGTTLNWTLARGGEKRNIIPAEASAEADMRYSDPAESERVLADARKLTGERLVADTEVSLRLDKGRPPLVKNPASQRLAETAQTLYGRIGKRIEPIAMRFGTDAGYAYVPGSDKPAVLETLGVVGAGLHSEAEYLELSSIAPRLYLTVALIRELSAD. The compound is COc1ccc(SC(=O)N[C@H](CCC(=O)O)C(=O)O)cc1. The pKi is 6.5. (3) The small molecule is CNCCCN1c2ccccc2CCc2ccccc21. The target is MLLARMKPQVQPELGGADQ. The pKi is 8.2. (4) The compound is CC(C(=O)O)N(Cc1ccccc1Cl)S(=O)(=O)c1cccc2cccnc12. 
The target protein sequence is MKKNILKILMDSYSKESKIQTVRRVTSVSLLAVYLTMNTSSLVLAKPIENTNDTSIKNVEKLRNAPNEENSKKVEDSKNDKVEHVKNIEEAKVEQVAPEVKSKSTLRSASIANTNSEKYDFEYLNGLSYTELTNLIKNIKWNQINGLFNYSTGSQKFFGDKNRVQAIINALQESGRTYTANDMKGIETFTEVLRAGFYLGYYNDGLSYLNDRNFQDKCIPAMIAIQKNPNFKLGTAVQDEVITSLGKLIGNASANAEVVNNCVPVLKQFRENLNQYAPDYVKGTAVNELIKGIEFDFSGAAYEKDVKTMPWYGKIDPFINELKALGLYGNITSATEWASDVGIYYLSKFGLYSTNRNDIVQSLEKAVDMYKYGKIAFVAMERITWDYDGIGSNGKKVDHDKFLDDAEKHYLPKTYTFDNGTFIIRAGDKVSEEKIKRLYWASREVKSQFHRVVGNDKALEVGNADDVLTMKIFNSPEEYKFNTNINGVSTDNGGLYIEPR.... The pKi is 5.7. (5) The compound is CCNC(=O)[C@H]1CCCN1C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](Cc1ccc(O)cc1)NC(=O)[C@H](CO)NC(=O)[C@H](Cc1c[nH]c2ccccc12)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@@H]1CCC(=O)N1. The target protein (P16235) has sequence MGRRVPALRQLLVLAVLLLKPSQLQSRELSGSRCPEPCDCAPDGALRCPGPRAGLARLSLTYLPVKVIPSQAFRGLNEVVKIEISQSDSLERIEANAFDNLLNLSELLIQNTKNLLYIEPGAFTNLPRLKYLSICNTGIRTLPDVTKISSSEFNFILEICDNLHITTIPGNAFQGMNNESVTLKLYGNGFEEVQSHAFNGTTLISLELKENIYLEKMHSGAFQGATGPSILDISSTKLQALPSHGLESIQTLIALSSYSLKTLPSKEKFTSLLVATLTYPSHCCAFRNLPKKEQNFSFSIFENFSKQCESTVRKADNETLYSAIFEENELSGWDYDYGFCSPKTLQCAPEPDAFNPCEDIMGYAFLRVLIWLINILAIFGNLTVLFVLLTSRYKLTVPRFLMCNLSFADFCMGLYLLLIASVDSQTKGQYYNHAIDWQTGSGCGAAGFFTVFASELSVYTLTVITLERWHTITYAVQLDQKLRLRHAIPIMLGGWLFSTL.... The pKi is 9.9. (6) The compound is Cc1ccc(C(=O)n2nc(-c3cccnc3)nc2N)cc1. The target protein sequence is MATARPPWMWVLCALITALLLGVTEHVLANNDVSCDHPSNTVPSGSNQDLGAGAGEDARSDDSSSRIINGSDCDMHTQPWQAALLLRPNQLYCGAVLVHPQWLLTAAHCRKKVFRVRLGHYSLSPVYESGQQMFQGVKSIPHPGYSHPGHSNNLMLIKLNRRIRPTKDVRPINVSSHCPSAGTKCLVSGWGTTKSPQVHFPKVLQCLNISVLSQKRCEDAYPRQIDDTMFCAGDKAGRDSCQGDSGGPVVCNGSLQGLVSWGDYPCARPNRPGVYTNLCKFTKWIQETIQANS. The pKi is 5.4.
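The pKi values above follow the definition pKi = -log10(Ki in M). As a minimal sketch of that conversion, the Python below maps between Ki and pKi; the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical helpers introduced here for illustration and are not part of the dataset or of any BindingDB tooling.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki expressed in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pki)


if __name__ == "__main__":
    # A Ki of 1 nM (1e-9 M) corresponds to a pKi of 9.
    print(ki_to_pki(1e-9))
    # Each pKi unit is a tenfold change in Ki, so higher pKi means lower Ki.
    for pki in (5.3, 6.5, 8.2, 9.9):
        print(pki, pki_to_ki_nm(pki))
```

Because the scale is logarithmic, the gap between the lowest (5.3) and highest (9.9) pKi values listed here corresponds to a difference of more than four orders of magnitude in Ki.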