This data is drug-target binding data from BindingDB, based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is C=C(C)CNC(=N)NCCCCCCN1CCCCCCCCNC(=N)NC1=O. The target protein sequence is MLGFLGKSMALLAALQATLTSATPVSTNDVSVEKRASGYTNAVYFTNWGIYGRNFQPQDLVASDITHVIYPFMNFQADGTVVSGDAYADYQKHYSDDSWNDVGNNAYGCVKQLFKLKKANRNLKVMLSIGGWTWSTNFPSAASTDANRKNFAKTAITFMKDWGFDGIDVDWEYPADDTQATNMVLLLKEIRSQLDAYAAQYAPGYHFLLSIAAPAGPEHYSALHMADLGQVLDYVNLMAYDYAGSWSSYSGHDANLFANPSNPNSSPYNTDQAIKAYINGGVPASKIVLGMPIYGRSFESTNGIGQTYNGIGSGSWENGIWDYKVLPKAGATVQYDSVAQAYYSYDSSSKELISFDTPDMVSKKVSYLKNLGLGGSMFWEASADKTGSDSLIGTSHRALGSLDSTQNLLSYPNSQYDNIRSGLN. The pKi is 4.5. (2) The compound is COc1ccc2ccc(CCNC(C)=O)cc2n1. The target protein (P49286) has sequence MSENGSFANCCEAGGWAVRPGWSGAGSARPSRTPRPPWVAPALSAVLIVTTAVDVVGNLLVILSVLRNRKLRNAGNLFLVSLALADLVVAFYPYPLILVAIFYDGWALGEEHCKASAFVMGLSVIGSVFNITAIAINRYCYICHSMAYHRIYRRWHTPLHICLIWLLTVVALLPNFFVGSLEYDPRIYSCTFIQTASTQYTAAVVVIHFLLPIAVVSFCYLRIWVLVLQARRKAKPESRLCLKPSDLRSFLTMFVVFVIFAICWAPLNCIGLAVAINPQEMAPQIPEGLFVTSYLLAYFNSCLNAIVYGLLNQNFRREYKRILLALWNPRHCIQDASKGSHAEGLQSPAPPIIGVQHQADAL. The pKi is 7.7. (3) The compound is NC1=NC(c2ccccc2)N(c2ccc(Cl)cc2)C(N)=N1. The target protein sequence is MMEQVCDVFDIYAICACCKVESKNEGKKNEVFNNYTFRGLGNKGVLPWKCNSLDMKYFCAVTTYVNESKYEKLKYKRCKYLNKETVDNVNDMPNSKKLQNVVVMGRTSWESIPKKFKPLSNRINVILSRTLKKEDFDEDVYIINKVEDLIVLLGKLNYYKCFIIGGSVVYQEFLEKKLIKKIYFTRINSTYECDVFFPEINENEYQIISVSDVYTSNNTTLDFIIYKKTNNKMLNEQNCIKGEEKNNDMPLKNDDKDTCHMKKLTEFYKNVDKYKINYENDDDDEEEDDFVYFNFNKEKEEKNKNSIHPNDFQIYNSLKYKYHPEYQYLNIIYDIMMNGNKQSDRTGVGVLSKFGYIMKFDLSQYFPLLTTKKLFLRGIIEELLWFIRGETNGNTLLNKNVRIWEANGTREFLDNRKLFHREVNDLGPIYGFQWRHFGAEYTNMYDNYENKGVDQLKNIINLIKNDPTSRRILLCAWNVKDLDQMALPPCHILCQFYVFD.... The pKi is 8.3. 
(4) The target protein (P06835) has sequence MLLPLYGLASFLVLSQAALVNTSAPQASNDDPFNHSPSFYPTPQGGRINDGKWQAAFYRARELVDQMSIAEKVNLTTGVGSASGPCSGNTGSVPRLNISSICVQDGPLSVRAADLTDVFPCGMAASSSFNKQLIYDRAVAIGSEFKGKGADAILGPVYGPMGVKAAGGRGWEGHGPDPYLEGVIAYLQTIGIQSQGVVSTAKHLIGNEQEHFRFAKKDKHAGKIDPGMFNTSSSLSSEIDDRAMHEIYLWPFAEAVRGGVSSIMCSYNKLNGSHACQNSYLLNYLLKEELGFQGFVMTDWGALYSGIDAANAGLDMDMPCEAQYFGGNLTTAVLNGTLPQDRLDDMATRILSALIYSGVHNPDGPNYNAQTFLTEGHEYFKQQEGDIVVLNKHVDVRSDINRAVALRSAVEGVVLLKNEHETLPLGREKVKRISILGQAAGDDSKGTSCSLRGCGSGAIGTGYGSGAGTFSYFVTPADGIGARAQQEKISYEFIGDSWNQ.... The compound is OC[C@H]1NC[C@@H](O)[C@@H](O)C1(F)F. The pKi is 3.0. (5) The small molecule is COc1ccc(C2=N[C@@H](c3ccc(Cl)cc3)[C@@H](c3ccc(Cl)cc3)N2C(=O)N2CCNC(=O)C2)c(OC(C)C)c1. The target protein (P56950) has sequence MCNTNMSVSTGGAVSTSQIPASEQETLVRPKPLLLKLLKSVGAQKDTYTMKEVIFYLGQYIMTKRLYDEKQQHIVYCSNDLLGDLFGVPSFSVKEHRKIYTMIYRNLVVVNQHEPSDSGTSVSENSCHREGGSDQKDPVQELQEEKPSSSDLISRPSTSSRRRTISETEEHADDLPGERQRKRHKSDSISLSFDESLALCVIREICCERSSSSESTGTPSNPDLDAGVSEHSGDWLDQDSVSDQFSVEFEVESLDSEDYSLSEEGQELSDEDDEVYRVTVYQAGESDTDSFEEDPEISLADYWKCTSCNEMNPPLPPHCNRCWALRENWLPEDKGKIPEKATPENSTQVEEGFDVPDCKKAAASDSRESCAEEIDDKITQASHSQESEDYSQPSTSNSIIYSSQEDVKEFEREETQDKEEIVESSFPLNAIEPCVICQGRPKNGCIVHGKTGHLMACFTCAKKLKKRNKPCPVCRQPIQMIVLTYFP. The pKi is 8.0. (6) The compound is CCC(C)c1ccc([C@H](C)C(=O)SCCNC(=O)CCNC(=O)C(O)C(C)(C)COP(=O)(O)OP(=O)(O)OC[C@H]2O[C@@H](n3cnc4c(N)ncnc43)[C@H](O)[C@@H]2OP(=O)(O)O)cc1. The target protein (P70473) has sequence MALRGVRVLELAGLAPGPFCGMILADFGAEVVLVDRLGSVNHPSHLARGKRSLALDLKRSPGAAVLRRMCARADVLLEPFRCGVMEKLQLGPETLRQDNPKLIYARLSGFGQSGIFSKVAGHDINYVALSGVLSKIGRSGENPYPPLNLLADFGGGGLMCTLGILLALFERTRSGLGQVIDANMVEGTAYLSTFLWKTQAMGLWAQPRGQNLLDGGAPFYTTYKTADGEFMAVGAIEPQFYTLLLKGLGLESEELPSQMSIEDWPEMKKKFADVFARKTKAEWCQIFDGTDACVTPVLTLEEALHHQHNRERGSFITDEEQHACPRPAPQLSRTPAVPSAKRDPSVGEHTVEVLKDYGFSQEEIHQLHSDRIIESNKLKANL. The pKi is 4.7. (7) The compound is CN1CC=C(c2ccc(O)cc2)CC1. 
The target protein (P09417) has sequence MAAAAAAGEARRVLVYGGRGALGSRCVQAFRARNWWVASVDVVENEEASASIIVKMTDSFTEQADQVTAEVGKLLGEEKVDAILCVAGGWAGGNAKSKSLFKNCDLMWKQSIWTSTISSHLATKHLKEGGLLTLAGAKAALDGTPGMIGYGMAKGAVHQLCQSLAGKNSGMPPGAAAIAVLPVTLDTPMNRKSMPEADFSSWTPLEFLVETFHDWITGKNRPSSGSLIQVVTTEGRTELTPAYF. The pKi is 5.5. (8) The compound is O=C(Cn1c(-c2ccccn2)nc2ccccc21)Nc1ccc2ccccc2c1. The target protein sequence is MWESKFVKEGLTFDDVLLVPAKSDVLPREVSVKTVLSESLQLNIPLISAGMDTVTEADMAIAMARQGGLGIIHKNMSIEQQAEQVDKVKRSESGVISDPFFLTPEHQVYDAEHLMGKYRISGVPVVNNLDERKLVGIITNRDMRFIQDYSIKISDVMTKEQLITAPVGTTLSEAEKILQKYKIEKLPLVDNNGVLQGLITIKDIEKVIEFPNSAKDKQGRLLVGAAVGVTADAMTRIDALVKASVDAIVLDTAHGHSQGVIDKVKEVRAKYPSLNIIAGNVATAEATKALIEAGANVVKVGIGPGSICTTRVVAGVGVPQLTAVYDCATEARKHGIPVIADGGIKYSGDMVKALAAGAHVVMLGSMFAGVAESPGETEIYQGRQFKVYRGMGSVGAMEKGSKDRYFQEGNKKLVPEGIEGRVPYKGPLADTVHQLVGGLRAGMGYCGAQDLEFLRENAQFIRMSGAGLLESHPHHVQITKEAPNYSL. The pKi is 7.4. (9) The small molecule is CC(C)C[C@@H]1NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](Cc2cnc[nH]2)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H]2CSCc3cc(cc(c3)CSC[C@@H](C(=O)N[C@@H](C)C(=O)O)NC1=O)CSC[C@H](NC(=O)[C@H](C)N)C(=O)N[C@@H](CO)C(=O)N[C@@H](Cc1c[nH]c3ccccc13)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(=N)N)C(=O)N2. The target protein (P14272) has sequence MILFKQVGYFVSLFATVSCGCLSQLYANTFFRGGDLAAIYTPDAQHCQKMCTFHPRCLLFSFLAVSPTKETDKRFGCFMKESITGTLPRIHRTGAISGHSLKQCGHQLSACHQDIYEGLDMRGSNFNISKTDSIEECQKLCTNNIHCQFFTYATKAFHRPEYRKSCLLKRSSSGTPTSIKPVDNLVSGFSLKSCALSEIGCPMDIFQHFAFADLNVSQVVTPDAFVCRTVCTFHPNCLFFTFYTNEWETESQRNVCFLKTSKSGRPSPPIIQENAVSGYSLFTCRKARPEPCHFKIYSGVAFEGEELNATFVQGADACQETCTKTIRCQFFTYSLLPQDCKAEGCKCSLRLSTDGSPTRITYEAQGSSGYSLRLCKVVESSDCTTKINARIVGGTNSSLGEWPWQVSLQVKLVSQNHMCGGSIIGRQWILTAAHCFDGIPYPDVWRIYGGILNLSEITNKTPFSSIKELIIHQKYKMSEGSYDIALIKLQTPLNYTEFQK.... The pKi is 9.0. (10) The compound is c1cnc(N2CCN(Cc3ccc4c(c3)OCO4)CC2)nc1. The target is MLLARMKPQVQPELGGADQ. The pKi is 5.0.
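
The pKi definition stated at the top (pKi = -log10(Ki in M)) can be sketched in code as follows. This is only an illustration of the unit conversion, not part of the dataset: the function names ki_to_pki and pki_to_ki_nm are hypothetical, and the sketch assumes Ki is supplied in mol/L.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (9 - pki)


if __name__ == "__main__":
    # Example (9) above reports pKi 9.0 and example (10) reports pKi 5.0.
    for pki in (9.0, 5.0):
        print(f"pKi {pki} corresponds to Ki of {pki_to_ki_nm(pki):g} nM")
```

On this scale each unit of pKi is a tenfold change in Ki, so the pKi of 9.0 in example (9) corresponds to a Ki of 1 nM and the pKi of 5.0 in example (10) to 10 µM.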