Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki, drug-target binding data from BindingDB using Ki measurements. (1) The compound is Cc1cn([C@@H]2C[C@H](O)[C@@H](CNCc3ccc([N+](=O)[O-])cc3)O2)c(=O)[nH]c1=O. The target protein sequence is MTDDKKKGKFIVFEGLDRSGKSTQSKLLVEYLKNNNVEVKHLYFPNRETGIGQIISKYLKMENSMSNETIHLLFSANRWEHMNEIKSLLLKGIWVVCDRYAYSGVAYSSGALNLNKTWCMNPDQGLIKPDVVFYLNVPPNYAQNRSDYGEEIYEKVETQKKIYETYKHFAHEDYWINIDATRKIEDIHNDIVKEVTKIKVEPEEFNFLWS. The pKi is 3.0. (2) The drug is COP(=O)(C(=O)N[C@@H](CC(C)C)C(=O)[O-])C([NH3+])CC(C)C. The target protein (P28838) has sequence MFLLPLPAAGRVVVRRLAVRRFGSRSLSTADMTKGLVLGIYSKEKEDDVPQFTSAGENFDKLLAGKLRETLNISGPPLKAGKTRTFYGLHQDFPSVVLVGLGKKAAGIDEQENWHEGKENIRAAVAAGCRQIQDLELSSVEVDPCGDAQAAAEGAVLGLYEYDDLKQKKKMAVSAKLYGSGDQEAWQKGVLFASGQNLARQLMETPANEMTPTRFAEIIEKNLKSASSKTEVHIRPKSWIEEQAMGSFLSVAKGSDEPPVFLEIHYKGSPNANEPPLVFVGKGITFDSGGISIKASANMDLMRADMGGAATICSAIVSAAKLNLPINIIGLAPLCENMPSGKANKPGDVVRAKNGKTIQVDNTDAEGRLILADALCYAHTFNPKVILNAATLTGAMDVALGSGATGVFTNSSWLWNKLFEASIETGDRVWRMPLFEHYTRQVVDCQLADVNNIGKYRSAGACTAAAFLKEFVTHPKWAHLDIAGVMTNKDEVPYLRKGMT.... The pKi is 2.8. (3) The pKi is 8.6. The compound is Nc1nc2c(c(N3CC4CCCNC4C3)n1)CCCCC2c1ccccc1. The target protein (Q91ZY1) has sequence MSESNGTDVLPLTAQVPLAFLMSLLAFAITIGNAVVILAFVADRNLRHRSNYFFLNLAISDFFVGVISIPLYIPHTLFNWNFGSGICMFWLITDYLLCTASVYSIVLISYDRYQSVSNAVRYRAQHTGILKIVAQMVAVWILAFLVNGPMILASDSWKNSTNTEECEPGFVTEWYILAITAFLEFLLPVSLVVYFSVQIYWSLWKRGSLSRCPSHAGFIATSSRGTGHSRRTGLACRTSLPGLKEPAASLHSESPRGKSSLLVSLRTHMSGSIIAFKVGSFCRSESPVLHQREHVELLRGRKLARSLAVLLSAFAICWAPYCLFTIVLSTYRRGERPKSIWYSIAFWLQWFNSLINPFLYPLCHRRFQKAFWKILCVTKQPAPSQTQSVSS. (4) The small molecule is O=P(O)(O)CC1=CC(O)C(O)C(CO)O1. 
The target protein (P08037) has sequence MKFREPLLGGSAAMPGASLQRACRLLVAVCALHLGVTLVYYLAGRDLRRLPQLVGVHPPLQGSSHGAAAIGQPSGELRLRGVAPPPPLQNSSKPRSRAPSNLDAYSHPGPGPGPGSNLTSAPVPSTTTRSLTACPEESPLLVGPMLIEFNIPVDLKLVEQQNPKVKLGGRYTPMDCISPHKVAIIIPFRNRQEHLKYWLYYLHPILQRQQLDYGIYVINQAGESMFNRAKLLNVGFKEALKDYDYNCFVFSDVDLIPMNDHNTYRCFSQPRHISVAMDKFGFSLPYVQYFGGVSALSKQQFLSINGFPNNYWGWGGEDDDIYNRLAFRGMSVSRPNAVIGKCRMIRHSRDKKNEPNPQRFDRIAHTKETMLSDGLNSLTYMVLEVQRYPLYTKITVDIGTPS. The pKi is 2.8. (5) The small molecule is CC1(C)OC[C@@H]([C@H]2OC(=O)C(=O)C2O)O1. The target protein sequence is MTQQQVISYYESTAHENEVELILARAKKIIQAQQSLQGNAIVLDIDETALNHYYSLKLAGFPQGENHTIWNELLSRTDAYPIKATLDFYLYCLTSGLKVFFISARFAQYLESTKQALRNAGYVNFEDVFVFPENIEQYNSKDFKNFKAERRAYIESLGYKILISIGDQSSDLLGGYTLYTLQLPNYLYGENSRF. The pKi is 5.5. (6) The compound is CC(=O)N[C@H]1CSCc2cc3cc(c2)CSC[C@H](NC(=O)[C@H](CCCNC(=N)N)NC(=O)[C@H](C)NC(=O)[C@@H]2CCCN2C(=O)[C@H](Cc2c[nH]c4ccccc24)NC(=O)[C@H](CO)NC1=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(=O)O)CSC3. The target protein (P14272) has sequence MILFKQVGYFVSLFATVSCGCLSQLYANTFFRGGDLAAIYTPDAQHCQKMCTFHPRCLLFSFLAVSPTKETDKRFGCFMKESITGTLPRIHRTGAISGHSLKQCGHQLSACHQDIYEGLDMRGSNFNISKTDSIEECQKLCTNNIHCQFFTYATKAFHRPEYRKSCLLKRSSSGTPTSIKPVDNLVSGFSLKSCALSEIGCPMDIFQHFAFADLNVSQVVTPDAFVCRTVCTFHPNCLFFTFYTNEWETESQRNVCFLKTSKSGRPSPPIIQENAVSGYSLFTCRKARPEPCHFKIYSGVAFEGEELNATFVQGADACQETCTKTIRCQFFTYSLLPQDCKAEGCKCSLRLSTDGSPTRITYEAQGSSGYSLRLCKVVESSDCTTKINARIVGGTNSSLGEWPWQVSLQVKLVSQNHMCGGSIIGRQWILTAAHCFDGIPYPDVWRIYGGILNLSEITNKTPFSSIKELIIHQKYKMSEGSYDIALIKLQTPLNYTEFQK.... The pKi is 8.7.
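The pKi definition in the task statement (pKi = -log10(Ki in M)) can be illustrated with a minimal sketch. The function names `ki_to_pki` and `pki_to_ki_molar` are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of any dataset tooling.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        # log10 is undefined for zero or negative concentrations
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_molar(pki: float) -> float:
    """Invert the transform: Ki (in mol/L) = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # Labels taken from the examples above; the Ki values are derived
    # from the definition, not read from the dataset.
    for pki in (2.8, 3.0, 5.5, 8.6, 8.7):
        print(f"pKi {pki:.1f} -> Ki = {pki_to_ki_molar(pki):.3e} M")
```

Under this definition a pKi of 3.0 corresponds to a Ki of 1 mM (weak inhibition), and each additional pKi unit is a tenfold decrease in Ki.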