This drug-target binding data comes from BindingDB and uses Ki measurements. Task: regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them, expressed as pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The compound is O=C(O)c1ccccc1O. The target protein (P16924) has sequence HTDFFTSIGHMTDLINTEKDLVISKLKDYIKAEESKLEQIKKWAEKLDKLTDTATKDPEGFLGHPANAFKLMKRLNTEWGELESLVLKDMSDGFISNMTIQRQFFPNDEDQTGARKALLRLQDTYNLDTDTLSRGNLPGVKHKSFLTAEDCFELGKIRYTEADYYHTELWMEQALKQLDEGEVSSADKVYILDYLSYAVYQQGDLSKAMMLTKRLLELDPEHQRANGNMKYFEYIMAKEKEANKSSTDAEDQTDKETEVKKKDYLPERRKYEMLCRGEGLKMTPRRQKRLFCRYYDGNRNPRYILGPVKQEDEWDKPRIVRFLDIISDEEIETVKELAKPRLSRATVHDPETGKLTTAHYRVSKSAWLSGYESPVVSRINTRIQDLTGLDVSTAEELQVANYGVGGQYEPHFDFGRKDEPDAFKELGTGNRIATWLFYMSDVSAGGATVFPEVGASVWPKKGTAVFWYNLFPSGEGDYSTRHAACPVLVGNKWVSNKWLH.... The pKi is 2.6. (2) The small molecule is COc1ccccc1N1CCN(CCN(C(=O)C2CCCCC2)c2ccccn2)CC1. The target protein sequence is MQKPEKFLYLPRGAQEEKTREKSASKHQVCRGVKLEPGTLTSMDPLNLSWYSGDIGDRNWSKPLNESGVDQKPQYNYYAMLLTLLIFVIVFGNVLVCMAVSREKALQTTTNYLIVSLAVADLLVATLVMPWVVYLEVVGEWRFSRIHCDIFVTLDVMMCTASILNLCAISIDRYTAVAMPMLYNTRYSSKRRVTVMIAVVWVLSFAISCPLLFGLNNTDENECIIANPAFVVYSSIVSFYVPFIVTLLVYVQIYIVLRRRRKRVNTKRSSHGLDSDTQAPLKDKCTHPEDVKLCTVIVKSNGSFQVNKRKVEVESHIEEMEMVSSTSPLEKTTIKPAAPSNHRLVVPIASTQGTNSTLQAPLDSPGKAEKNGHAKETPRIAKVFEIQSMPNGKLRTSLLKAMNRRKLSQQKEKKATQMLAIVLGVFIICWLPFFITHILNMHCDCSIPPAMYSAFTWLGYVNSAVNPIIYTTFNIEFRKAFMKILHC. The pKi is 7.1. (3) The drug is O=C(O)C(=O)Nc1sccc1C(=O)O. 
The target protein (P23469) has sequence MEPLCPLLLVGFSLPLARALRGNETTADSNETTTTSGPPDPGASQPLLAWLLLPLLLLLLVLLLAAYFFRFRKQRKAVVSTSDKKMPNGILEEQEQQRVMLLSRSPSGPKKYFPIPVEHLEEEIRIRSADDCKQFREEFNSLPSGHIQGTFELANKEENREKNRYPNILPNDHSRVILSQLDGIPCSDYINASYIDGYKEKNKFIAAQGPKQETVNDFWRMVWEQKSATIVMLTNLKERKEEKCHQYWPDQGCWTYGNIRVCVEDCVVLVDYTIRKFCIQPQLPDGCKAPRLVSQLHFTSWPDFGVPFTPIGMLKFLKKVKTLNPVHAGPIVVHCSAGVGRTGTFIVIDAMMAMMHAEQKVDVFEFVSRIRNQRPQMVQTDMQYTFIYQALLEYYLYGDTELDVSSLEKHLQTMHGTTTHFDKIGLEEEFRKLTNVRIMKENMRTGNLPANMKKARVIQIIPYDFNRVILSMKRGQEYTDYINASFIDGYRQKDYFIATQ.... The pKi is 3.1. (4) The small molecule is CCCN(CCC)CCc1ccc(OC)c(OCCc2ccccc2)c1. The target protein (Q15125) has sequence MTTNAGPLHPYWPQHLRLDNFVPNDRPTWHILAGLFSVTGVLVVTTWLLSGRAAVVPLGTWRRLSLCWFAVCGFIHLVIEGWFVLYYEDLLGDQAFLSQLWKEYAKGDSRYILGDNFTVCMETITACLWGPLSLWVVIAFLRQHPLRFILQLVVSVGQIYGDVLYFLTEHRDGFQHGELGHPLYFWFYFVFMNALWLVLPGVLVLDAVKHLTHAQSTLDAKATKAKSKKN. The pKi is 7.8. (5) The compound is CCOc1ccc(C[C@H]2NC(=O)CC(C3CCCC3)(C3CCCC3)SSC[C@@H](C(=O)N3CCC[C@H]3C(=O)N[C@@H](CCCN=C(N)N)C(N)=O)NC(=O)[C@H](CC(N)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@H](Cc3ccccc3)NC2=O)cc1. The target protein (Q8NFM4) has sequence MARLFSPRPPPSEDLFYETYYSLSQQYPLLLLLLGIVLCALAALLAVAWASGRELTSDPSFLTTVLCALGGFSLLLGLASREQRLQRWTRPLSGLVWVALLALGHAFLFTGGVVSAWDQVSYFLFVIFTAYAMLPLGMRDAAVAGLASSLSHLLVLGLYLGPQPDSRPALLPQLAANAVLFLCGNVAGVYHKALMERALRATFREALSSLHSRRRLDTEKKHQEHLLLSILPAYLAREMKAEIMARLQAGQGSRPESTNNFHSLYVKRHQGVSVLYADIVGFTRLASECSPKELVLMLNELFGKFDQIAKEHECMRIKILGDCYYCVSGLPLSLPDHAINCVRMGLDMCRAIRKLRAATGVDINMRVGVHSGSVLCGVIGLQKWQYDVWSHDVTLANHMEAGGVPGRVHITGATLALLAGAYAVEDAGMEHRDPYLRELGEPTYLVIDPRAEEEDEKGTAGGLLSSLEGLKMRPSLLMTRYLESWGAAKPFAHLSHGDSP.... The pKi is 8.4. (6) The pKi is 5.0. 
The target protein (Q62758) has sequence MDRLDANVSSNEGFGSVEKVVLLTFFAMVILMAILGNLLVMVAVCRDRQLRKIKTNYFIVSLAFADLLVSVLVNAFGAIELVQDIWFYGEMFCLVRTSLDVLLTTASIFHLCCISLDRYYAICCQPLVYRNKMTPLRIALMLGGCWVIPMFISFLPIMQGWNNIGIVDVIEKRKFNHNSNSTFCVFMVNKPYAITCSVVAFYIPFLLMVLAYYRIYVTAKEHAQQIQMLQRAGATSESRPQTADQHSTHRMRTETKAAKTLCVIMGCFCFCWAPFFVTNIVDPFIDYTVPEKVWTAFLWLGYINSGLNPFLYAFLNKSFRRAFLIILCCDDERYKRPPILGQTVPCSTTTINGSTHVLRDTVECGGQWESRCHLTATSPLVAAQPVIRRPQDNDLEDSCSLKRSQS. The compound is CN1C2CCC1CC(OC(=O)c1cccc3nc[nH]c13)C2. (7) The compound is CCN(CC)CC#CCOC(=O)[C@](O)(c1ccccc1)C1CCCCC1. The target protein sequence is MANFTPVNGSSSNQSVRLVTSAHNRYETVEMVFIATVTGSLSLVTVVGNVLVMLSIKVNRQLQTVNNYFLFSLACADLIIGAFSMNLYTVYIIKGYWPLGAVVCDLWLALDYVVSNASVMNLLIISFDRYFCVTKPLTYPARRTTKMAGLMIAAAWVLSFVLWAPAILFWQFVVGKRTVPDNQCFIQFLSNPAVTFGTAIAAFYLPVVIMTVLYIHISLASRSRVHKHRPEGQKEKKAKTLAFLKSPLMKQSVKKPPPGEAAREELRNGKLEEAPPPALPPPPRPMADKDTSNESSSGSATQNTKERPATELSTAEATTPAMSAPPLQPRTLNPASKWSKIQIVTKQTGNECVTAIEIVPATPAGMRPAANVARKFASIARNQVRKKRQMAARERKVTRTIFAILLAFILTWTPYNVMVLVNTFCQSCIPDTVWSIGYWLCYVNSTINPACYALCNATFKKTFRHLLLCQYRNIGTAR. The pKi is 9.3. (8) The compound is Nc1c(S(=O)(=O)[O-])cc(Nc2ccc(F)cc2C(=O)O)c2c1C(=O)c1ccccc1C2=O. The target protein (O75355) has sequence MFTVLTRQPCEQAGLKALYRTPTIIALVVLLVSIVVLVSITVIQIHKQEVLPPGLKYGIVLDAGSSRTTVYVYQWPAEKENNTGVVSQTFKCSVKGSGISSYGNNPQDVPRAFEECMQKVKGQVPSHLHGSTPIHLGATAGMRLLRLQNETAANEVLESIQSYFKSQPFDFRGAQIISGQEEGVYGWITANYLMGNFLEKNLWHMWVHPHGVETTGALDLGGASTQISFVAGEKMDLNTSDIMQVSLYGYVYTLYTHSFQCYGRNEAEKKFLAMLLQNSPTKNHLTNPCYPRDYSISFTMGHVFDSLCTVDQRPESYNPNDVITFEGTGDPSLCKEKVASIFDFKACHDQETCSFDGVYQPKIKGPFVAFAGFYYTASALNLSGSFSLDTFNSSTWNFCSQNWSQLPLLLPKFDEVYARSYCFSANYIYHLFVNGYKFTEETWPQIHFEKEVGNSSIAWSLGYMLSLTNQIPAESPLIRLPIEPPVFVGTLAFFTAAALL.... The pKi is 4.2. (9) The drug is COc1ccccc1N1CCN(CCCCn2ncc(=O)n(C)c2=O)CC1. 
The target protein (P05363) has sequence MGACVVMTDINISSGLDSNATGITAFSMPGWQLALWTAAYLALVLVAVMGNATVIWIILAHQRMRTVTNYFIVNLALADLCMAAFNAAFNFVYASHNIWYFGRAFCYFQNLFPITAMFVSIYSMTAIAADRYMAIVHPFQPRLSAPGTRAVIAGIWLVALALAFPQCFYSTITTDEGATKCVVAWPEDSGGKMLLLYHLIVIALIYFLPLVVMFVAYSVIGLTLWRRSVPGHQAHGANLRHLQAKKKFVKTMVLVVVTFAICWLPYHLYFILGTFQEDIYCHKFIQQVYLALFWLAMSSTMYNPIIYCCLNHRFRSGFRLAFRCCPWVTPTEEDKMELTYTPSLSTRVNRCHTKEIFFMSGDVAPSEAVNGQAESPQAGVSTEP. The pKi is 5.0. (10) The compound is CN[C@@H](C)C(=O)N[C@H]1Cc2ccccc2[C@H]2CC[C@@H](C(=O)N[C@H]3CCCc4ccccc43)N2C1=O. The target protein sequence is QLQDTSRYTVSNLSMQTHAARFKTFFNWPSSVLVNPEQLASAGFYYVGNSDDVKCFCCDGGLRCWESGDDPWVQHAKWFPRCEYLIRIKGQEFIRQVQASYPHLLEQLLSTS. The pKi is 6.4.
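
The pKi definition stated in the task description (pKi = -log10(Ki in M)) can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset or its tooling: the function names `ki_to_pki` and `pki_to_ki` are hypothetical, and the sketch assumes Ki is supplied in molar units.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        # The logarithm is undefined for zero or negative concentrations.
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the transform: recover Ki (in mol/L) from a pKi value."""
    return 10.0 ** (-pki)


if __name__ == "__main__":
    # pKi labels taken from the first and seventh examples above.
    for pki in (2.6, 9.3):
        ki = pki_to_ki(pki)
        print(f"pKi {pki} corresponds to Ki = {ki:.3g} M")
```

Because the scale is logarithmic, each unit of pKi is a tenfold change in Ki: a pKi of 9.3 corresponds to a sub-nanomolar Ki, while a pKi of 2.6 corresponds to a Ki in the millimolar range.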