This is drug-target binding data from BindingDB using Ki measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The small molecule is Oc1cc2c(cc1O)[C@@H]1c3ccccc3CN[C@@H]1CC2. The target protein (P07522) has sequence MLFSLTFLSVFLKITVLSVTAQQTRNCQSGPLERSGTTTYAAAGPPRFLIFLQGNSIFRINTDGTNHQQLVVDAGVSVVMDFHYKEERLYWVDLERQLLQRVFFNGSGQETVCKVDKNVSGLAINWIDGEILRTDRWKGVITVTDMNGNNSRVLLSSLKRPANILVDPTERLIFWSSVVTGNLHRADLGGMDVKTLLEAPERISVLILDILDKRLFWAQDGREGSHGYIHSCDYNGGSIHHIRHQARHDLLTMAIFGDKILYSALKEKAIWIADKHTGKNVVRVNLDPASVPPRELRVVHLHAQPGTENRAQASDSERCKQRRGQCLYSLSERDPNSDSSACAEGYTLSRDRKYCEDVNECALQNHGCTLGCENIPGSYYCTCPTGFVLLPDGKRCHELVACPGNRSECSHDCILTSDGPLCICPAGSVLGKDGKTCTGCSFSDNGGCSQICLPLSLASWECDCFPGYDLQLDRKSCAASMGPQPFLLFANSQDIRHMHF.... The pKi is 5.0. (2) The compound is Cc1ccccc1CNC(=O)[C@H]1N(C(=O)[C@@H](O)[C@H](Cc2ccccc2)NC(=O)c2cccc(O)c2C)CSC1(C)C. The target protein sequence is PQVTLWQRPLVTIKIGGQLREALLDTGADDTIFEEISLPGRWKPKMIGGIGGFVKVRQYDQIPIEICGHKVIGTVLVGPTPANIIGRNLMTQIGCTLNF. The pKi is 9.5. (3) The small molecule is O=C(/C=C/c1c(-c2ccccc2)nn2ccccc12)N1CCCC[C@@H]1CCO. The target protein (Q60612) has sequence MPPYISAFQAAYIGIEVLIALVSVPGNVLVIWAVKVNQALRDATFCFIVSLAVADVAVGALVIPLAILINIGPQTYFHTCLMVACPVLILTQSSILALLAIAVDRYLRVKIPLRYKTVVTQRRAAVAIAGCWILSLVVGLTPMFGWNNLSEVEQAWIANGSVGEPVIKCEFEKVISMEYMVYFNFFVWVLPPLLLMVLIYLEVFYLIRKQLNKKVSASSGDPQKYYGKELKIAKSLALILFLFALSWLPLHILNCITLFCPTCQKPSILIYIAIFLTHGNSAMNPIVYAFRIHKFRVTFLKIWNDHFRCQPKPPIEEDIPEEKADD. The pKi is 9.2. (4) The compound is O=C(O)CSc1cc(NS(=O)(=O)c2ccc(Oc3ccccc3)cc2)c2ccccc2c1O. The target protein (Q92843) has sequence MATPASAPDTRALVADFVGYKLRQKGYVCGAGPGEGPAADPLHQAMRAAGDEFETRFRRTFSDLAAQLHVTPGSAQQRFTQVSDELFQGGPNWGRLVAFFVFGAALCAESVNKEMEPLVGQVQEWMVAYLETQLADWIHSSGGWAEFTALYGDGALEEARRLREGNWASVRTVLTGAVALGALVTVGAFFASK. The pKi is 5.8. 
(5) The drug is C[C@@H]1O[C@@H](Oc2c(-c3cc(O)c(O)c(O)c3)oc3cc(O)cc(O)c3c2=O)[C@H](O[C@@H]2O[C@H](COC(=O)/C=C/c3ccc(O)c(O)c3)[C@@H](O)[C@H](O)[C@H]2O[C@@H]2O[C@H](CO)[C@@H](O)[C@H](O)[C@H]2O)[C@H](O)[C@H]1O. The target protein (P04746) has sequence MKFFLLLFTIGFCWAQYSPNTQQGRTSIVHLFEWRWVDIALECERYLAPKGFGGVQVSPPNENVAIYNPFRPWWERYQPVSYKLCTRSGNEDEFRNMVTRCNNVGVRIYVDAVINHMCGNAVSAGTSSTCGSYFNPGSRDFPAVPYSGWDFNDGKCKTGSGDIENYNDATQVRDCRLTGLLDLALEKDYVRSKIAEYMNHLIDIGVAGFRLDASKHMWPGDIKAILDKLHNLNSNWFPAGSKPFIYQEVIDLGGEPIKSSDYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFVPSDRALVFVDNHDNQRGHGAGGASILTFWDARLYKMAVGFMLAHPYGFTRVMSSYRWPRQFQNGNDVNDWVGPPNNNGVIKEVTINPDTTCGNDWVCEHRWRQIRNMVIFRNVVDGQPFTNWYDNGSNQVAFGRGNRGFIVFNNDDWSFSLTLQTGLPAGTYCDVISGDKINGNCTGIKIYVSDDGKAHFSISNSAED.... The pKi is 7.4. (6) The compound is O=C1NCCCC[C@@H](C(=O)NCC(=O)N2CCNCC2)NC(=O)[C@H](Cc2ccc(-c3cc(C(F)(F)F)cc(C(F)(F)F)c3)cc2)[C@@H](C(=O)NO)CCCO1. The target protein (P51511) has sequence MGSDPSAPGRPGWTGSLLGDREEAARPRLLPLLLVLLGCLGLGVAAEDAEVHAENWLRLYGYLPQPSRHMSTMRSAQILASALAEMQRFYGIPVTGVLDEETKEWMKRPRCGVPDQFGVRVKANLRRRRKRYALTGRKWNNHHLTFSIQNYTEKLGWYHSMEAVRRAFRVWEQATPLVFQEVPYEDIRLRRQKEADIMVLFASGFHGDSSPFDGTGGFLAHAYFPGPGLGGDTHFDADEPWTFSSTDLHGNNLFLVAVHELGHALGLEHSSNPNAIMAPFYQWKDVDNFKLPEDDLRGIQQLYGTPDGQPQPTQPLPTVTPRRPGRPDHRPPRPPQPPPPGGKPERPPKPGPPVQPRATERPDQYGPNICDGDFDTVAMLRGEMFVFKGRWFWRVRHNRVLDNYPMPIGHFWRGLPGDISAAYERQDGRFVFFKGDRYWLFREANLEPGYPQPLTSYGLGIPYDRIDTAIWWEPTGHTFFFQEDRYWRFNEETQRGDPGY.... The pKi is 5.7. (7) The small molecule is CSCC[C@H](NC(=O)[C@H](Cc1ccccc1)NC(=O)CNC(=O)CNC(=O)[C@@H](N)Cc1ccc(O)cc1)C(=O)O. The target protein sequence is MEPPTVTVSDFSERYPLFLHNSSFLEEPAGLLSNWSGGSSELKAVRGSSAVAIAVSITALYSVICVVGLVGNVLVMYGVVRYTKMKTATNIYIFNLALADALATSTLPFQSAKYLMGTWPFGELLCKVVIAIDYYNMFTSIFTLTMMSVDRYIAVCHPVRALDFRTPVKAKIINICVWILSSAVGFPVMVMAVTKELDSGKTICMLKFPDPEWYWDTVTKICVFIFAFVFPVLVITVCYGLMILRLKSVRLLSGSKEKDRNLRRITRMVLVVVAAFIICWTPIHIFIIVKTVVEIDQKNLLVVACWHLCIALGYMNSSLNPVLYAFLDENFKRCFREFCLPFRTRIEQNSFSKARSVIREPISVCAKSESIKQPT. The pKi is 7.3. 
(8) The target protein (P39040) has sequence MSRAYDLVVIGAGSGGLEAGWNAASLHKKRVAVIDLQKHHGPPHYAALGGTCVNVGCVPKKLMVTGANYMDTIRESAGFGWELDRESVRPNWKALIAAKNKAVSGINDSYEGMFADTEGLTFHQGFGALQDNHTVLVRESADPNSAVLETLDTEYILLATGSWPQHLGIEGDDLCITSNEAFYLDEAPKRALCVGGGYISIEFAGIFNAYKARGGQVDLAYRGDMILRGFDSELRKQLTEQLRANGINVRTHENPAKVTKNADGTRHVVFESGAEADYDVVMLAIGRVPRSQTLQLDKAGVEVAKNGAIKVDAYSKTNVDNIYAIGDVTDRVMLTPVAINEGAAFVDTVFANKPRATDHTKVACAVFSIPPMGVCGYVEEDAAKKYDQVAVYESSFTPLMHNISGSTYKKFMVRIVTNHADGEVLGVHMLGDSSPEIIQSVAICLKMGAKISDFYNTIGVHPTSAEELCSMRTPAYFYQKGKRVEKIDSNL. The pKi is 4.9. The compound is O=C(CCc1ccc(O)c(O)c1)NCCCNCCCCNCCCNC(=O)CCc1ccc(O)c(O)c1. (9) The drug is CC(C)=CCC/C(C)=C/COc1c2ccoc2cc2oc(=O)ccc12. The target protein (P00178) has sequence MEFSLLLLLAFLAGLLLLLFRGHPKAHGRLPPGPSPLPVLGNLLQMDRKGLLRSFLRLREKYGDVFTVYLGSRPVVVLCGTDAIREALVDQAEAFSGRGKIAVVDPIFQGYGVIFANGERWRALRRFSLATMRDFGMGKRSVEERIQEEARCLVEELRKSKGALLDNTLLFHSITSNIICSIVFGKRFDYKDPVFLRLLDLFFQSFSLISSFSSQVFELFPGFLKHFPGTHRQIYRNLQEINTFIGQSVEKHRATLDPSNPRDFIDVYLLRMEKDKSDPSSEFHHQNLILTVLSLFFAGTETTSTTLRYGFLLMLKYPHVTERVQKEIEQVIGSHRPPALDDRAKMPYTDAVIHEIQRLGDLIPFGVPHTVTKDTQFRGYVIPKNTEVFPVLSSALHDPRYFETPNTFNPGHFLDANGALKRNEGFMPFSLGKRICLGEGIARTELFLFFTTILQNFSIASPVPPEDIDLTPRESGVGNVPPSYQIRFLAR. The pKi is 5.5. (10) The small molecule is Cc1nc2sccn2c(=O)c1CCN1CCC(=C(c2ccc(F)cc2)c2ccc(F)cc2)CC1. The target protein (Q02152) has sequence MASSYKMSEQSTTSEHILQKTCDHLILTNRSGLETDSVAEEMKQTVEGQGHTVHWAALLILAVIIPTIGGNILVILAVALEKRLQYATNYFLMSLAIADLLVGLFVMPIALLTIMFEAIWPLPLALCPAWLFLDVLFSTASIMHLCAISLDRYIAIKKPIQANQCNSRATAFIKITVVWLISIGIAIPVPIKGIETDVINPHNVTCELTKDRFGSFMVFGSLAAFFAPLTIMVVTYFLTIHTLQKKAYLVKNKPPQRLTRWTVPTVFLREDSSFSSPEKVAMLDGSHRDKILPNSSDETLMRRMSSVGKRSAQTISNEQRASKALGVVFFLFLLMWCPFFITNLTLALCDSCNQTTLKTLLEIFVWIGYVSSGVNPLIYTLFNKTFREAFGRYITCNYRATKSVKALRKFSSTLCFGNSMVENSKFFTKHGIRNGINPAMYQSPMRLRSSTIQSSSIILLDTLLTENDGDKAEEQVSYI. The pKi is 8.4.
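The pKi definition stated in the task description (pKi = -log10(Ki in M)) can be sketched as a small conversion utility. This is a minimal illustration only; the function names `ki_to_pki` and `pki_to_ki_nm` are hypothetical and are not part of BindingDB or the bindingdb_ki dataset.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki expressed in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pki)


if __name__ == "__main__":
    # A Ki of 10 micromolar (1e-5 M), matching the scale of example (1).
    print(round(ki_to_pki(1e-5), 2))
    # The pKi reported in example (2), expressed as Ki in nM.
    print(round(pki_to_ki_nm(9.5), 3))
```

Read this way, a pKi of 5.0 corresponds to a Ki of 10 µM and a pKi of 9.5 to a Ki of roughly 0.32 nM, so each unit of pKi is a tenfold change in Ki.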