This data is from Drug-target binding data from BindingDB using Ki measurements. The task is: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki. (1) The drug is O=C(O)c1cc(O)c2ccc(Cl)cc2n1. The target protein (Q7TSF2) has sequence MPFNAFDTFKEKILKPGKEGVKNAVGDSLGILQRKLDGTNEEGDAIELSEEGRPVQTSRARAPVCDCSCCGIPKRYIIAVMSGLGFCISFGIRCNLGVAIVEMVNNSTVYVDGKPEIQTAQFNWDPETVGLIHGSFFWGYIVTQIPGGFISNKFAANRVFGAAIFLTSTLNMFIPSAARVHYGCVMCVRILQGLVEGVTYPACHGMWSKWAPPLERSRLATTSFCGSYAGAVVAMPLAGVLVQYIGWASVFYIYGMFGIIWYMFWLLQAYECPAVHPTISNEERTYIETSIGEGANLASLSKFNTPWRRFFTSLPVYAIIVANFCRSWTFYLLLISQPAYFEEVFGFAISKVGLLSAVPHMVMTIVVPIGGQLADYLRSRKILTTTAVRKIMNCGGFGMEATLLLVVGFSHTKGVAISFLVLAVGFSGFAISGFNVNHLDIAPRYASILMGISNGVGTLSGMVCPLIVGAMTKHKTREEWQNVFLIAALVHYSGVIFYGV.... The pKi is 3.2. (2) The compound is C[N+](C)(C)CCOC(N)=O. The target protein sequence is MTLHSQSTTSPLFPQISSSWVHSPSEAGLPLGTVTQLGSYQISQETGQFSSQDTSSDPLGGHTIWQVVFIAFLTGFLALVTIIGNILVIVAFKVNKQLKTVNNYFLLSLASADLIIGVISMNLFTTYIIMNRWALGNLACDLWLSIDYVASNASVMNLLVISFDRYFSITRPLTYRAKRTTKRAGVMIGLAWVISFVLWAPAILFWQYFVGKRTVPPGECFIQFLSEPTITFGTAIAAFYMPVTIMTILYWRIYKETEKRTKELAGLQASGTEIEGRIEGRIEGRTRSQITKRKRMSLIKEKKAAQTLSAILLAFIITWTPYNIMVLVNTFADSAIPKTYWNLGYWLCYINSTVNPVAYALSNKTFRTTFKTLLLSQSDKRKRRKQQYQQRQSVIFHKRVPEQAL. The pKi is 4.5. (3) The compound is Cc1cccc(/C=C/c2ccccc2)n1. 
The target protein (O00222) has sequence MVCEGKRSASCPCFFLLTAKFYWILTMMQRTHSQEYAHSIRVDGDIILGGLFPVHAKGERGVPCGELKKEKGIHRLEAMLYAIDQINKDPDLLSNITLGVRILDTCSRDTYALEQSLTFVQALIEKDASDVKCANGDPPIFTKPDKISGVIGAAASSVSIMVANILRLFKIPQISYASTAPELSDNTRYDFFSRVVPPDSYQAQAMVDIVTALGWNYVSTLASEGNYGESGVEAFTQISREIGGVCIAQSQKIPREPRPGEFEKIIKRLLETPNARAVIMFANEDDIRRILEAAKKLNQSGHFLWIGSDSWGSKIAPVYQQEEIAEGAVTILPKRASIDGFDRYFRSRTLANNRRNVWFAEFWEENFGCKLGSHGKRNSHIKKCTGLERIARDSSYEQEGKVQFVIDAVYSMAYALHNMHKDLCPGYIGLCPRMSTIDGKELLGYIRAVNFNGSAGTPVTFNENGDAPGRYDIFQYQITNKSTEYKVIGHWTNQLHLKVE.... The pKi is 4.0. (4) The pKi is 8.0. The small molecule is Nc1ccc(SC[C@H]2CO[C@](CCc3ccc(Cl)cc3)(Cn3ccnc3)O2)cc1. The target protein (Q64654) has sequence MVLLGLLQSGGSVLGQAMEQVTGGNLLSTLLIACAFTLSLVYLFRLAVGHMVQLPAGAKSPPYIYSPIPFLGHAIAFGKSPIEFLENAYEKYGPVFSFTMVGKTFTYLLGSDAAALLFNSKNEDLNAEEVYGRLTTPVFGKGVAYDVPNAVFLEQKKILKSGLNIAHFKQYVSIIEKEAKEYFKSWGESGERNVFEALSELIILTASHCLHGKEIRSQLNEKVAQLYADLDGGFSHAAWLLPGWLPLPSFRRRDRAHREIKNIFYKAIQKRRLSKEPAEDILQTLLDSTYKDGRPLTDDEIAGMLIGLLLAGQHTSSTTSAWMGFFLARDKPLQDKCYLEQKTVCGEDLPPLTYEQLKDLNLLDRCIKETLRLRPPIMTMMRMAKTPQTVAGYTIPPGHQVCVSPTVNQRLKDSWVERLDFNPDRYLQDNPASGEKFAYVPFGAGRHRCIGENFAYVQIKTIWSTMLRLYEFDLINGYFPSVNYTTMIHTPENPVIRYKR.... (5) The compound is O=[As](O)(O)c1ccccc1. The target protein sequence is MSDLQQLFENNVRWAEAIKQEDPDFFAKLARQQTPEYLWIGCSDARVPANEIVGMLPGDLFVHRNVANVVLHTDLNCLSVIQFAVDVLKVKHILVTGHYGCGGVRASLHNDQLGLIDGWLRSIRDLAYEYREHLEQLPTEEERVDRLCELNVIQQVANVSHTSIVQNAWHRGQSLSVHGCIYGIKDGLWKNLNVTVSGLDQLPPQYRLSPLGGCC. The pKi is 5.0. (6) The compound is CC1(C)[C@H]2CC=C(C[N+](C)(C)Cc3ccc(-c4ccsc4)cc3)[C@@H]1C2. 
The target protein (P49682) has sequence MVLEVSDHQVLNDAEVAALLENFSSSYDYGENESDSCCTSPPCPQDFSLNFDRAFLPALYSLLFLLGLLGNGAVAAVLLSRRTALSSTDTFLLHLAVADTLLVLTLPLWAVDAAVQWVFGSGLCKVAGALFNINFYAGALLLACISFDRYLNIVHATQLYRRGPPARVTLTCLAVWGLCLLFALPDFIFLSAHHDERLNATHCQYNFPQVGRTALRVLQLVAGFLLPLLVMAYCYAHILAVLLVSRGQRRLRAMRLVVVVVVAFALCWTPYHLVVLVDILMDLGALARNCGRESRVDVAKSVTSGLGYMHCCLNPLLYAFVGVKFRERMWMLLLRLGCPNQRGLQRQPSSSRRDSSWSETSEASYSGL. The pKi is 6.5. (7) The drug is c1ccc(OC[C@@H]2CN(CCN3CCc4ccccc43)CCO2)cc1. The target protein (P97718) has sequence MVLLSENASEGSNCTHPPAQVNISKAILLGVILGGLIIFGVLGNILVILSVACHRHLHSVTHYYIVNLAVADLLLTSTVLPFSAIFEILGYWAFGRVFCNIWAAVDVLCCTASIMGLCIISIDRYIGVSYPLRYPTIVTQRRGVRALLCVWALSLVISIGPLFGWRQQAPEDETICQINEEPGYVLFSALGSFYVPLTIILVMYCRVYVVAKRESRGLKSGLKTDKSDSEQVTLRIHRKNVPAEGSGVSSAKNKTHFSVRLLKFSREKKAAKTLGIVVGCFVLCWLPFFLVMPIGSFFPNFKPPETVFKIVFWLGYLNSCINPIIYPCSSQEFKKAFQNVLRIQCLRRRQSSKHALGYTLHPPSQAVEGQHRGMVRIPVGSGETFYKISKTDGVREWKFFSSMPQGSARITMPKDQSACTTARVRSKSFLQVCCCVGSSTPRPEENHQVPTIKIHTISLGENGEEV. The pKi is 5.6. (8) The compound is Nc1ccc(S(N)(=O)=O)cc1Cl. The target protein sequence is MRKILISAVLVLSSISISFAEHEWSYEGEKGPEHWAQLKPEFFWCKLKNQSPINIDKKYKVKANLPKLNLYYKTAKESEVVNNGHTIQINIKEDNTLNYLGEKYQLKQFHFHTPSEHTIEKKSYPLEIHFVHKTEDGKILVVGVMAKLGKTNKELDKILNVAPAEEGEKILDKNLNLNNLIPKDKRYMTYSGSLTTPPCTEGVRWIVLKKPISISKQQLEKLKSVMVNPNNRPVQEINSRWIIEGF. The pKi is 7.4.
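As a minimal sketch of the definition used above (pKi = -log10(Ki in M)), the following shows how a Ki value in molar units maps to pKi and back. The function names ki_to_pki, pki_to_ki and pki_to_ki_nm are hypothetical helpers introduced here for illustration; they are not part of BindingDB or of the dataset's tooling.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki(pki: float) -> float:
    """Invert the definition: Ki (in mol/L) = 10 ** (-pKi)."""
    return 10.0 ** (-pki)


def pki_to_ki_nm(pki: float) -> float:
    """Same inversion, expressed in nanomolar (1 M = 1e9 nM)."""
    return pki_to_ki(pki) * 1e9


if __name__ == "__main__":
    # pKi values taken from the examples above; higher pKi means a smaller Ki.
    for pki in (3.2, 4.5, 8.0):
        print(f"pKi {pki} -> Ki = {pki_to_ki(pki):.3g} M")
```

Because the scale is logarithmic, one pKi unit is a tenfold change in Ki: a pKi of 8.0 corresponds to a Ki of about 10 nM, while a pKi of 4.0 corresponds to about 100 µM.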