This dataset contains drug-target binding data from BindingDB, based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The target value is pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is C=C(C[n+]1ccc(-c2ccc(O)cc2)cc1)c1ccc(Br)cc1. The target protein (Q27963) has sequence MALSELALLRRLQESRHSRKLILFIVFLALLLDNMLLTVVVPIIPSYLYSIEHEKDALEIQTAKPGLTASAPGSFQNIFSYYDNSTMVTGNSTDHLQGALVHEATTQHMATNSSSASSDCPSEDKDLLNENVQVGLLFASKATVQLLTNPFIGLLTNRIGYPIPMFTGFCIMFISTVMFAFSRTYAFLLIARSLQGIGSSCSSVAGMGMLASVYTDDEERGNAMGIALGGLAMGVLVGPPFGSVLYEFVGKTAPFLVLAALVLLDGAIQLFVLQPSRVQPESQKGTPLTTLLRDPYILIAAGSICFANMGIAMLEPALPIWMMETMCSHKWQLGVAFLPASVSYLIGTNVFGILAHKMGRWLCALLGMIIVGMSILCIPLAKNIYGLIAPNFGVGFAIGMVDSSMMPIMGYLVDLRHVSVYGSVYAIADVAFCMGYAIGPSAGGAIAKAIGFPWLMTIIGIIDILFAPLCFFLRSPPAKEEKMAILMDHNCPIKTKMYTQ.... The pKi is 6.3. (2) The drug is CCCCCCCCc1ccc2c3c(c[nH]c13)C[C@@H](CO)NC(=O)[C@H](C(C)C)N2C. The target protein (Q9R1K8) has sequence MGTLGKAREAPRKPCHGSRAGPKGRLEAKSTNSPLPAQPSLAQITQFRMMVSLGHLAKGASLDDLIDSCIQSFDADGNLCRSNQLLQVMLTMHRIIISSAELLQKLMNLYKDALEKNSPGICLKICYFVRYWITEFWIMFKMDASLTSTMEEFQDLVKANGEESHCHLIDTTQINSRDWSRKLTQRIKSNTSKKRKVSLLFDHLEPEELSEHLTYLEFKSFRRISFSDYQNYLVNSCVKENPTMERSIALCNGISQWVQLMVLSRPTPQLRAEVFIKFIHVAQKLHQLQNFNTLMAVIGGLCHSSISRLKETSSHVPHEINKVLGEMTELLSSCRNYDNYRRAYGECTHFKIPILGVHLKDLISLYEAMPDYLEDGKVNVQKLLALYNHINELVQLQDVAPPLDANKDLVHLLTLSLDLYYTEDEIYELSYAREPRNHRAPPLTPSKPPVVVDWASGVSPKPDPKTISKHVQRMVDSVFKNYDLDQDGYISQEEFEKIAA.... The pKi is 9.0. (3) The small molecule is CCCCC[C@H](O)/C=C/[C@H]1[C@H](O)CC(=O)[C@@H]1CCCCCCC(=O)O. 
The target protein (P35375) has sequence MSPCGLNLSLADEAATCATPRLPNTSVVLPTGDNGTSPALPIFSMTLGAVSNVLALALLAQVAGRMRRRRSAATFLLFVASLLAIDLAGHVIPGALVLRLYTAGRAPAGGACHFLGGCMVFFGLCPLLLGCGMAVERCVGVTQPLIHAARVSVARARLALAVLAAMALAVALLPLVHVGRYELQYPGTWCFISLGPRGGWRQALLAGLFAGLGLAALLAALVCNTLSGLALLRARWRRRRSRRFRKTAGPDDRRRWGSRGPRLASASSASSITSATATLRSSRGGGSARRVHAHDVEMVGQLVGIMVVSCICWSPLLVLVVLAIGGWNSNSLQRPLFLAVRLASWNQILDPWVYILLRQAMLRQLLRLLPLRVSAKGGPTELGLTKSAWEASSLRSSRHSGFSHL. The pKi is 8.2. (4) The drug is CN1Cc2c(C(=O)OC(C)(C)C)ncn2-c2ccc(C#C[Si](C)(C)C)cc2C1=O. The target protein (P19969) has sequence MDNGMLSRFIMTKTLLVFCISMTLSSHFGFSQMPTSSVQDETNDNITIFTRILDGLLDGYDNRLRPGLGERITQVRTDIYVTSFGPVSDTEMEYTIDVFFRQSWKDERLRFKGPMQRLPLNNLLASKIWTPDTFFHNGKKSIAHNMTTPNKLLRLEDDGTLLYTMRLTISAECPMQLEDFPMDAHACPLKFGSYAYPNSEVVYVWTNGSTKSVVVAEDGSRLNQYHLMGQTVGTENISTSTGEYTIMTAHFHLKRKIGYFVIQTYLPCIMTVILSQVSFWLNRESVPARTVFGVTTVLTMTTLSISARNSLPKVAYATAMDWFIAVCYAFVFSALIEFATVNYFTKRGWAWDGKKALEAAKIKKKERELILNKSTNAFTTGKLTHPPNIPKEQLPGGTGNAVGTASIRASEEKTSESKKTYNSISKIDKMSRIVFPILFGTFNLVYWATYLNREPVIKGATSPK. The pKi is 8.9. (5) The compound is CCCCCCCCCCCOc1ccc(N2C(N)=NC(N)=NC2(C)C)cc1. The target protein (P00378) has sequence VRSLNSIVAVCQNMGIGKDGNLPWPPLRNEYKYFQRMTSTSHVEGKQNAVIMGKKTWFSIPEKNRPLKDRINIVLSRELKEAPKGAHYLSKSLDDALALLDSPELKSKVDMVWIVGGTAVYKAAMEKPINHRLFVTRILHEFESDTFFPEIDYKDFKLLTEYPGVPADIQEEDGIQYKFEVYQKSVLAQ. The pKi is 6.0. (6) The compound is COc1ccccc1N1CCN(C[C@@H](C(=O)NC(C)(C)C)c2ccccc2)CC1. The target protein sequence is MDVANNTTSPERSPEGAGGPGLAEVTLGYQLLTSLLLGTLILCAVSGNACVIAAIALERSLQTVANYLIGSLAVTDLMVSVLVLPMAALYQVLNKWTLGQVTCDIFISLDVLCCTSSILHLCAIALDRYWAITDPIDYVNKRTPRRAAVLISLTWLIGFLISIPPMLGWRTPEDRSDPDACTISKDHGYTIYSTFGAFYIPLLLMLVLYGRIFKAARFRIRKTVRKVEKKKVADTCLTLSPSALQKKSNGEPGKGWRRTVEHKPGVCVNGAVRQGEDGAALEIIEVQRCNSSSKTHLPLPSEACGSPPPPSFEKRNEKNTEAKRRMALSRERKTVKTLGIIMGTFILCWLPFFIVALVLPFCDSKCYMPKWLEAVINWLGYSNSLLNPIIYAYFNKDFQSAFKKIIKCKFCRQ. The pKi is 8.0. (7) The compound is Brc1ccc([C@H]2CC3CCC2N3)cn1. 
The target protein sequence is MDYTASCLIFFFIAAGPVFSSDHETRLIGDLFANYNKVVRPVETYKDQVVVTVGLQLIQLINVDEVNQIVSTNIRLKQQWVDVNLKWDPAKYGGVKKLRIPSSEVWCPDLVLYNNADGDFAISKDTKILLEHTGKITWTPPAIFKSYCEIIVTHFPFDQQNCSMKFGTWTYDGTLVVINPDRDRPDLSNFMASGEWMMKDYRCWKHWVYYTCCPDKPYLDITYHFVLQRLPLYFIVNVIIPCLLFSFLTGLVFYLPTDSGEKMTLSISVLLSLTVFLLVIVELIPSTSSAVPLIGKYMLFTMVFVIASIIITVIVINTHHRSPSTHTMPPWVRKIFIDTIPNIMFFSTMKRPSQEKQPQTTFAEEMDISDISGKLGPAAVTYQSPALKNPDVKSAIEGIKYIAETMKSDQESNKASEEWKFVAMVLDHILLAVFMTVCVIGTLAVFAGRIIEMNMQD. The pKi is 10.0. (8) The drug is C[N+](C)(C)CCOC(N)=O. The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNCTFRTTFKTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKi is 4.8. (9) The compound is C/C(=C\CC12OC1(CF)C(=O)c1ccccc1C2=O)CCCC(C)CCCC(C)CCCC(C)C. The target protein (Q6B4J2) has sequence MGATWRSPGWVRLALCLAGLVLSLYALHVKAARARDRDYRALCDVGTAISCSRVFSSRWGRGFGLVEHVLGKDSILNQSNSIFGCIFYTLQLLLGCLQGRWASVLLRLSCLVSLAGSVYLAWILFFVLYDFCIVCITTYAINVGLTVLSFREVQGPQGKVKGH. The pKi is 4.8.
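
As a small illustration of the pKi definition given in the task description (pKi = -log10(Ki in M)), the sketch below converts between Ki and pKi. The function names and the nanomolar helpers are hypothetical conveniences introduced here; they are not part of the dataset or of any BindingDB tooling.

```python
import math

# Assumption: Ki values are often reported in nanomolar, so a unit
# conversion to molar is needed before taking the logarithm.
NM_PER_M = 1e9


def ki_to_pki(ki_molar: float) -> float:
    """Return pKi = -log10(Ki), with Ki given in mol/L."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration")
    return -math.log10(ki_molar)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Hypothetical helper: the same conversion for Ki given in nM."""
    return ki_to_pki(ki_nm / NM_PER_M)


def pki_to_ki_nm(pki: float) -> float:
    """Invert the definition: Ki (in nM) = 10**(-pKi) * 1e9."""
    return 10.0 ** (-pki) * NM_PER_M
```

Under this definition a Ki of 1 nM corresponds to a pKi of 9, and each unit increase in pKi corresponds to a tenfold decrease in Ki. For example, the pKi of 6.3 reported for example (1) corresponds to a Ki of roughly 500 nM.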