This is drug-target binding data from BindingDB, based on Ki measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The target protein sequence is MDNLLRHLKISKEQITPVVLVVGDPGRVDKIKVVCDSYVDLAYNREYKSVECHYKGQKFLCVSHGVGSAGCAVCFEELCQNGAKVIIRAGSCGSLQPDLIKRGDICICNAAVREDRVSHLLIHGDFPAVGDFDVYDTLNKCAQELNVPVFNGISVSSDMYYPNKIIPSRLEDYSKANAAVVEMELATLMVIGTLRKVKTGGILIVDGCPFKWDEGDFDNNLVPHQLENMIKIALGACAKLATKYA. The drug is O=c1[nH]cnc2c(CN3[C@H](CO)C[C@@H]3CO)c[nH]c12. The pKi is 5.9. (2) The small molecule is CC(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](C)C(=O)C(F)(F)F. The target protein (Q9UNI1) has sequence MLVLYGHSTQDLPETNARVVGGTEAGRNSWPSQISLQYRSGGSRYHTCGGTLIRQNWVMTAAHCVDYQKTFRVVAGDHNLSQNDGTEQYVSVQKIVVHPYWNSDNVAAGYDIALLRLAQSVTLNSYVQLGVLPQEGAILANNSPCYITGWGKTKTNGQLAQTLQQAYLPSVDYAICSSSSYWGSTVKNTMVCAGGDGVRSGCQGDSGGPLHCLVNGKYSVHGVTSFVSSRGCNVSRKPTVFTQVSAYISWINNVIASN. The pKi is 4.4. (3) The drug is Cc1cccnc1NC(P(=O)(O)O)P(=O)(O)O. The target protein sequence is MAHMERFQKVYEEVQEFLLGDAEKRFEMDVHRKGYLKSMMDTTCLGGKYNRGLCVVDVAEAMAKDTQMDAAAMERVLHDACVCGWMIEMLQAHFLVEDDIMDHSKTRRGKPCWYLHPGVTAQVAINDGLILLAWATQMALHYFADRPFLAEVLRVFHDVDLTTTIGQLYDVTSMVDSAKLDAKVAHANTTDYVEYTPFNHRRIVVYKTAYYTYWLPLVMGLLVSGTLEKVDKKATHKVAMVMGEYFQVQDDVMDCFTPPEKLGKIGTDIEDAKCSWLAVTFLTTAPAEKVAEFKANYGSTDPAAVAVIKQLYTEQNLLARFEEYEKAVVAEVEQLIAALEAQNAAFAASVKVLWSKTYKRQK. The pKi is 8.0. (4) The drug is CS(C)(=O)=O. The pKi is 2.2. The target protein sequence is MPLDAGGQNSTQMVLAPGASIFRCRQCGQTISRRDWLLPMGGDHEHVVFNPAGMIFRVWCFSLAQGLRLIGAPSGEFSWFKGYDWTIALCGQCGSHLGWHYEGGSQPQTFFGLIKDRLAEGPAD. (5) The small molecule is C[N+](C)(C)CCOC(N)=O. 
The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKCFRTTFKTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKi is 4.9. (6) The compound is COc1ccc(CCCn2ncc3c2nc(N)n2nc(-c4ccco4)nc32)cc1. The target protein (Q0VC81) has sequence MPVNSTAVSLASVTYISVEILIGLCAIVGNVLVIWVVKLNPSLQTTTFYFIVSLALADIAVGVLVMPLAIVISLGVTIHFYSCLLMTCLLMIFTHASIMSLLAIAVDRYLRVKLTVRYRRVTTQRRIWLALGLCWLVSFLVGLTPMFGWNMKLSSADKNLTFLPCQFRSVMRMDYMVYFSFFTWILIPLVVMCAIYFDIFYVIRNRLSQNFSGSKETGAFYGREFKTAKSLSLVLFLFALSWLPLSIINCIIYFNGEVPQIVLYLGILLSHANSMMNPIVYAYKIKKFKETYLLILKACVICQPSKSMDPSIEQTSE. The pKi is 5.0. (7) The target protein sequence is MDFSLTRLIFLFIAATLVFSSEDESRLINDLFKSYNKVVRPVKAFKDKVVVTLGLQLIQLINVDEVNQIVTTNVRLKQQWEDVHLKWNPEDYGGIKKVRISSGDIWRPDIVLYNNADGDFAIVQETKVLLDYTGKIIWTPPAIFKSYCEMIVTYFPFDLQNCSMKLGTWTYDGSLVVINPESDRPDLSNFMESGEWYMKDYRGWKHWVYYDCCPETPYLDITYHFLLQRLPLYFIVNVVIPCLLFSFLTGLVFYLPTDSGEKITLSVSVLLSLVVFLLVIVELIPSTSSAVPLIGKYMLFTMVFVITSIVITVIVINTHHRSPSTHIMPQWLKKIFIETIPRVMFFSTMKRPAQDQQKKKIFTEDIDISDISGKLGPAAVKYQSPILKNPDVKSAIEGAKYIAETMKSDQESNKASEEWKFVAMVLDHLLLAVFMIVCIIGTLAIFAGRLIELHMQG. The drug is c1cncc([C@H]2CC3CCC2N3)c1. The pKi is 8.5. (8) The compound is C[n+]1ccc(-c2cccc(Cl)c2)cc1. 
The target protein (Q27963) has sequence MALSELALLRRLQESRHSRKLILFIVFLALLLDNMLLTVVVPIIPSYLYSIEHEKDALEIQTAKPGLTASAPGSFQNIFSYYDNSTMVTGNSTDHLQGALVHEATTQHMATNSSSASSDCPSEDKDLLNENVQVGLLFASKATVQLLTNPFIGLLTNRIGYPIPMFTGFCIMFISTVMFAFSRTYAFLLIARSLQGIGSSCSSVAGMGMLASVYTDDEERGNAMGIALGGLAMGVLVGPPFGSVLYEFVGKTAPFLVLAALVLLDGAIQLFVLQPSRVQPESQKGTPLTTLLRDPYILIAAGSICFANMGIAMLEPALPIWMMETMCSHKWQLGVAFLPASVSYLIGTNVFGILAHKMGRWLCALLGMIIVGMSILCIPLAKNIYGLIAPNFGVGFAIGMVDSSMMPIMGYLVDLRHVSVYGSVYAIADVAFCMGYAIGPSAGGAIAKAIGFPWLMTIIGIIDILFAPLCFFLRSPPAKEEKMAILMDHNCPIKTKMYTQ.... The pKi is 4.3. (9) The small molecule is COC(=O)[C@H]1[C@@H](O)CC[C@H]2CN3CCc4c([nH]c5ccccc45)[C@@H]3C[C@@H]21. The target protein sequence is MAYWYFGQVWCGVYLALDVLFCTSSIVHLCAISLDRYWSVTQAVEYNLKRTPRRVKATIVAVWLISAVISFPPLVSFYRRSDGAAYPQCGLNDETWYILSSCIGSFFAPCLIMGLVYARIYRFFLSRRRRARSSVCRRKVAQAREKRFTFVLAVVMGVFVLCWFPFFFSYSLYGICREACQLPEPLFKFFFWIGYCKSSLNPVIYTVFNQDFRRSFKHILFRRRRRGFRQ. The pKi is 9.8. (10) The compound is C[C@@H]1NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](Cc2ccc(-c3ccccc3)cc2)NC(=O)[C@H](Cc2ccccc2)NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@@H]2CCCN2C(=O)[C@H]2CCCN2C(=O)[C@H](Cc2ccccc2)NC1=O. The target protein (P56450) has sequence MNSTHHHGMYTSLHLWNRSSYGLHGNASESLGKGHPDGGCYEQLFVSPEVFVTLGVISLLENILVIVAIAKNKNLHSPMYFFICSLAVADMLVSVSNGSETIVITLLNSTDTDAQSFTVNIDNVIDSVICSSLLASICSLLSIAVDRYFTIFYALQYHNIMTVRRVGIIISCIWAACTVSGVLFIIYSDSSAVIICLISMFFTMLVLMASLYVHMFLMARLHIKRIAVLPGTGTIRQGTNMKGAITLTILIGVFVVCWAPFFLHLLFYISCPQNPYCVCFMSHFNLYLILIMCNAVIDPLIYALRSQELRKTFKEIICFYPLGGICELSSRY. The pKi is 6.8.
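
To make the pKi scale in the records above concrete, here is a minimal Python sketch of the stated definition, pKi = -log10(Ki in M), together with its inverse. The function names `ki_to_pki`, `pki_to_ki`, and `ki_nm_to_pki` are hypothetical helpers introduced only for illustration; they are not part of BindingDB or of any dataset tooling. The nanomolar helper assumes the raw Ki is supplied in nM, which should be checked against the actual source units.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert a Ki given in mol/L to pKi = -log10(Ki in M)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the definition: Ki in mol/L = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convenience wrapper for a Ki given in nM (assumed unit): 1 nM = 1e-9 M."""
    return ki_to_pki(ki_nm * 1e-9)


if __name__ == "__main__":
    # A few of the pKi labels listed above, converted back to molar Ki.
    for pki in (5.9, 4.4, 8.0, 2.2, 9.8):
        print(f"pKi {pki:>4} -> Ki = {pki_to_ki(pki):.3e} M")
```

Under this definition each unit of pKi is a tenfold change in Ki, so record (9) at pKi 9.8 (sub-nanomolar Ki) binds far more tightly than record (4) at pKi 2.2 (millimolar Ki).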