Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them as pKi (pKi = -log10(Ki in M); higher means stronger inhibition). Dataset: bindingdb_ki, drug-target binding data from BindingDB based on Ki measurements. (1) The compound is CC(=O)Nc1nnc(S(N)(=O)=O)s1. The target protein sequence is MSDLQQLFENNVRWAEAIKQEDPDFFAKLARQQTPEYLWIGCSDARVPANEIVGMLPGDLFVHRNVANVVLHTDLNCLSVIQFAVDVLKVKHILVTGHYGCGGVRASLHNDQLGLIDGWLRSIRDLAYEYREHLEQLPTEEERVDRLCELNVIQQVANVSHTSIVQNAWHRGQSLSVHGCIYGIKDGLWKNLNVTVSGLDQLPPQYRLSPLGGCC. The pKi is 7.1. (2) The compound is O=C(O)CNC(=O)Cn1ccc(=O)[nH]c1=O. The pKi is 3.0. The target protein sequence is MEAQLRATSFLWHHPLQVSGCLNFLFIYFSSFLFRVLFLFYSTSLLCLFLSVLAVLEMNRVQSSFRVPARVLNSLVHLQDGLNTFMDPDWRQIRHVDDWALAITMESAELIDSYPWKWWKNVKAQADMHNVRIEIADILHFSLSGEMQKRTQDGKGAGDVALKSLKEMGFFCRPPAHAKSTEASDHRTNGGDDDGDDELLELIFFPLTEVASAVATFRNIIQLASIYRFDLITKGLLLAAQDLDFNLVGYYVAKYTLNQIRQLKGYKEGAYVKVREGVEDNELLHECVQSVSVEDVLNEGTYLKTWEKIACSVFDAFGMPEEERRHAYEWLKSAALEGKR. (3) The small molecule is Nc1nc2ncc(CNc3ccc(C(=O)NC(CCP(=O)(O)O)C(=O)O)cc3)cc2c(=O)[nH]1. The target protein (P48760) has sequence MSWARSRLCSTLSLAAVSARGATTEGAARRGMSAWPAPQEPGMEYQDAVRTLNTLQTNASYLEQVKRQRSDPQAQLEAMEMYLARSGLQVEDLNRLNIIHVTGTKGKGSTCAFTERILRNYGLKTGFFSSPHMVQVRERIRINGKPISPELFTKHFWCLYNQLEEFKDDSHVSMPSYFRFLTLMAFHVFLQEKVDLAVVEVGIGGAFDCTNIIRKPVVCGVSSLGIDHTSLLGDTVEKIAWQKGGIFKPGVPAFTVVQPEGPLAVLRDRAQQIGCPLYLCPPLEALEEVGLPLSLGLEGAHQRSNAALALQLAHCWLERQDHQDIQELKVSRPSIRWQLPLAPVFRPTPHMRRGLRDTVWPGRTQILQRGPLTWYLDGAHTTSSVQACVHWYRQSLERSKRTDGGSEVHILLFNSTGDRDSAALLKLLQPCQFDYAVFCPNVTEVSSIGNADQQNFTVTLDQVLLRCLQHQQHWNGLAEKQASSNLWSSCGPDPAGPGSL.... The pKi is 5.0. (4) The compound is CC(C)NCC(O)COc1cccc2[nH]ccc12.
The target protein sequence is MTQYNHSAELALQSSANKSLNFTEALDERTLLGLKISLSVLLSVITLATILANVFVVITIFLTRKLHTPANYLIGSLAVTDLLVSVLVMPISIAYTVTHTWAFGQVLCDIWLSSDITCCTASILHLCVIALDRYWAITDALEYAKRRTAGRAALMIAVVWMISVSISVPPFFWRQVKAHEEIAKCAVNTDQISYTIYSTCGAFYIPSVLLLILYGRIYVAARSRILKPPSLYGKRFTTAHLITGSAGSSLCSINASLHEGHSHPGGSPIFINHVQIKLADSVLERKRISAARERKATKTLGIILGAFIFCWLPFFVMSLVLPICQDACWFHPILLDFFTWLGYLNSLINPVIYTAFNEEFKQAFQNLIRVKKRLP. The pKi is 5.0. (5) The compound is CC(C)COc1cccc(-c2nc3cc(F)c(C(N)=[NH2+])cc3[nH]2)c1[O-]. The target protein sequence is IIGGEFTTIENQPWFAAIYRRHRGGSVTYVCGGSLMSPCWVISATHCFIDYPKKEDYIVYLGRSRLNSNTQGEMKFEVENLILHKDYSADTLAHHNDIALLKIRSKEGRCAQPSRTIQTICLPSMYNDPQFGTSCEITGFGKEASTDYLYPEQLKMTVVKLISHRECQQPHYYGSEVTTKMLCAADPQWKTDACQGDSGGPLVCSLQGRMTLTGIVSWGRGCALKDKPGVYTRVSHFLPWIRSHTKEENGLAL. The pKi is 4.6. (6) The compound is O=C(c1ccc(O)c(O)c1)c1[nH]c(=O)cc2cc(O)c(O)cc12. The target protein (P18065) has sequence MLPRVGCPALPLPPPPLLPLLLLLLGASGGGGGARAEVLFRCPPCTPERLAACGPPPVAPPAAVAAVAGGARMPCAELVREPGCGCCSVCARLEGEACGVYTPRCGQGLRCYPHPGSELPLQALVMGEGTCEKRRDAEYGASPEQVADNGDDHSEGGLVENHVDSTMNMLGGGGSAGRKPLKSGMKELAVFREKVTEQHRQMGKGGKHHLGLEEPKKLRPPPARTPCQQELDQVLERISTMRLPDERGPLEHLYSLHIPNCDKHGLYNLKQCKMSLNGQRGECWCVNPNTGKLIQGAPTIRGDPECHLFYNEQQEARGVHTQRMQ. The pKi is 7.8. (7) The small molecule is Cc1ccc(C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](Cc2cccc(Cl)c2)C(N)=O)c([N+](=O)[O-])c1O. The target protein (P28305) has sequence MFLINGHKQESLAVSDRATQFGDGCFTTARVIDGKVSLLSAHIQRLQDACQRLMISCDFWPQLEQEMKTLAAEQQNGVLKVVISRGSGGRGYSTLNSGPATRILSVTAYPAHYDRLRNEGITLALSPVRLGRNPHLAGIKHLNRLEQVLIRSHLEQTNADEALVLDSEGWVTECCAANLFWRKGNVVYTPRLDQAGVNGIMRQFCIRLLAQSSYQLVEVQASLEESLQADEMVICNALMPVMPVCACGDVSFSSATLYEYLAPLCERPN. The pKi is 2.2. (8) The compound is Cc1nc(COc2ccc(C[C@H](NC(=O)O[C@H]3CO[C@H]4OCC[C@@H]34)[C@H](O)CN(CC(C)C)S(=O)(=O)c3ccc4c(c3)OCO4)cc2)cs1. The target protein sequence is PQVTLWQRPLVTIKIGGQLKEALLDTGADDTVLEEMSLPGRWKPKIIGGIGGFIKVRQYDQIPIEICGHKVIGTVLVGPTPFNVIGRNLLTQIGCTLNF. The pKi is 10.
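The pKi definition used in the task statement (pKi = -log10(Ki in M)) can be illustrated with a small conversion sketch. The helper names `ki_to_pki` and `pki_to_ki_nm` are hypothetical and chosen for illustration only; they are not part of the dataset or of any BindingDB tooling.

```python
import math


def ki_to_pki(ki_molar: float) -> float:
    """Convert an inhibition constant Ki (in mol/L) to pKi = -log10(Ki)."""
    if ki_molar <= 0:
        raise ValueError("Ki must be a positive concentration in mol/L")
    return -math.log10(ki_molar)


def pki_to_ki_nm(pki: float) -> float:
    """Convert pKi back to Ki, expressed in nanomolar (1 M = 1e9 nM)."""
    return 10 ** (-pki) * 1e9


# A Ki of 1 nM (1e-9 M) corresponds to a pKi of 9.
print(ki_to_pki(1e-9))

# The labels quoted in the examples, mapped back to Ki in nM.
for pki in (7.1, 3.0, 5.0, 10.0):
    print(pki, pki_to_ki_nm(pki))
```

Because the scale is logarithmic, each unit of pKi is a tenfold change in Ki: a pKi of 10 corresponds to a Ki of about 0.1 nM, while a pKi of 3.0 corresponds to about 1 mM.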