From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is COc1ccc(CNC(=O)CNS(=O)(=O)c2ccc3[nH]c(=O)oc3c2)cc1. The target protein (P24822) has sequence MQGPWVLLLLGLRLQLSLSVIPVEEENPAFWNKKAAEALDAAKKLQPIQTSAKNLIIFLGDGMGVPTVTATRILKGQLEGHLGPETPLAMDRFPYMALSKTYSVDRQVPDSASTATAYLCGVKTNYKTIGLSAAARFDQCNTTFGNEVFSVMYRAKKAGKSVGVVTTTRVQHASPSGTYVHTVNRNWYGDADMPASALREGCKDIATQLISNMDINVILGGGRKYMFPAGTPDPEYPNDANETGTRLDGRNLVQEWLSKHQGSQYVWNREQLIQKAQDPSVTYLMGLFEPVDTKFDIQRDPLMDPSLKDMTETAVKVLSRNPKGFYLFVEGGRIDRGHHLGTAYLALTEAVMFDLAIERASQLTSERDTLTIVTADHSHVFSFGGYTLRGTSIFGLAPLNALDGKPYTSILYGNGPGYVGTGERPNVTAAESSGSSYRRQAAVPVKSETHGGEDVAIFARGPQAHLVHGVQEQNYIAHVMASAGCLEPYTDCGLAPPADE.... The pIC50 is 5.7. (2) The compound is COc1cccc(C2c3cc(OC)c(OC)cc3CCN2C)c1. The target protein (Q9ERK7) has sequence MARCSNSMALLFSFGLLWLCSGVLGTDTEERLVEHLLDPSRYNKLIRPATNGSELVTVQLMVSLAQLISVHEREQIMTTNVWLTQEWEDYRLTWKPEDFDNMKKVRLPSKHIWLPDVVLYNNADGMYEVSFYSNAVVSYDGSIFWLPPAIYKSACKIEVKHFPFDQQNCTMKFRSWTYDRTEIDLVLKSDVASLDDFTPSGEWDIIALPGRRNENPDDSTYVDITYDFIIRRKPLFYTINLIIPCVLITSLAILVFYLPSDCGEKMTLCISVLLALTVFLLLISKIVPPTSLDVPLVGKYLMFTMVLVTFSIVTSVCVLNVHHRSPTTHTMAPWVKVVFLEKLPTLLFLQQPRHRCARQRLRLRRRQREREGAGTLFFREGPAADPCTCFVNPASMQGLAGAFQAEPAAAGLGRSMGPCSCGLREAVDGVRFIADHMRSEDDDQSVREDWKYVAMVIDRLFLWIFVFVCVFGTIGMFLQPLFQNYTATTFLHSDHSAPSS.... The pIC50 is 3.5. (3) The drug is N#CCNC(=O)OCc1ccccc1. 
The target protein (P0C0J0) has sequence MNKKKLGIRLLSLLALGGFVLANPVFADQNFARNEKEAKDSAITFIQKSAAIKAGARSAEDIKLDKVNLGGELSGSNMYVYNISTGGFVIVSGDKRSPEILGYSTSGSFDANGKENIASFMESYVEQIKENKKLDTTYAGTAEIKQPVVKSLLDSKGIHYNQGNPYNLLTPVIEKVKPGEQSFVGQHAATGCVATATAQIMKYHNYPNKGLKDYTYTLSSNNPYFNHPKNLFAAISTRQYNWNNILPTYSGRESNVQKMAISELMADVGISVDMDYGPSSGSAGSSRVQRALKENFGYNQSVHQINRSDFSKQDWEAQIDKELSQNQPVYYQGVGKVGGHAFVIDGADGRNFYHVNWGWGGVSDGFFRLDALNPSALGTGGGAGGFNGYQSAVVGIKP. The pIC50 is 4.8. (4) The drug is Cc1ccnc2c1C(=NN=C(N)N)CC(c1cccc(Cl)c1)C2. The pIC50 is 7.2. The target protein (P26431) has sequence MMLRWSGIWGLYPPRIFPSLLVVVALVGLLPVLRSHGLQLNPTASTIRGSEPPRERSIGDVTTAPSEPLHHPDDRNLTNLYIEHGAKPVRKAFPVLDIDYLHVRTPFEISLWILLACLMKIGFHVIPTISSIVPESCLLIVVGLLVGGLIKGVGETPPFLQSDVFFLFLLPPIILDAGYFLPLRQFTENLGTILIFAVVGTLWNAFFLGGLLYAVCLVGGEQINNIGLLDTLLFGSIISAVDPVAVLAVFEEIHINELLHILVFGESLLNDAVTVVLYHLFEEFASYEYVGISDIFLGFLSFFVVSLGGVFVGVVYGVIAAFTSRFTSHIRVIEPLFVFLYSYMAYLSAELFHLSGIMALIASGVVMRPYVEANISHKSHTTIKYFLKMWSSVSETLIFIFLGVSTVAGSHQWNWTFVISTLLFCLIARVLGVLVLTWFINKFRIVKLTPKDQFIIAYGGLRGAIAFSLGYLLDKKHFPMCDLFLTAIITVIFFTVFVQG.... (5) The compound is Cc1cc(N)nc2ccccc12. The target protein sequence is MEEKEILWNEAKAFIAACYQELGKAAEVKDRLADIKSEIDLTGSYVHTKEELEHGAKMAWRNSNRCIGRLFWNSLNVIDRRDVRTKEEVRDALFHHIETATNNGKIRPTITIFPPEEKGEKQVEIWNHQLIRYAGYESDGERIGDPASCSLTAACEELGWRGERTDFDLLPLIFRMKGDEQPVWYELPRSLVIEVPITHPDIEAFSDLELKWYGVPIISDMKLEVGGIHYNAAPFNGWYMGTEIGARNLADEKRYDKLKKVASVIGIAADYNTDLWKDQALVELNKAVLHSYKKQGVSIVDHHTAASQFKRFEEQAEEAGRKLTGDWTWLIPPISPAATHIFHRSYDNSIVKPNYFYQDKPYE. The pIC50 is 4.4. (6) The drug is COc1ccc(S(=O)(=O)NN(C(=O)Oc2ccccc2OC)c2ccccc2)cc1. The target protein (Q9JM51) has sequence MPSPGLVMESGQVLPAFLLCSTLLVIKMYAVAVITGQMRLRKKAFANPEDALKRGGLQYYRSDPDVERCLRAHRNDMETIYPFLFLGFVYSFLGPNPLIAWIHFLVVLTGRVVHTVAYLGKLNPRLRSGAYVLAQFSCFSMALQILWEVAHHL. The pIC50 is 5.0. (7) The drug is O=[N+]([O-])c1ccc2nc(Cc3ccc(Cl)cc3Cl)n(Cc3nnc(S)o3)c2c1. 
The target protein (P53008) has sequence MLISKSKMFKTFWILTSIVLLASATVDISKLQEFEEYQKFTNESLLWAPYRSNCYFGMRPRYVHESPLIMGIMWFNSLSQDGLHSLRHFATPQDKLQKYGWEVYDPRIGGKEVFIDEKNNLNLTVYFVKSKNGENWSVRVQGEPLDPKRPSTASVVLYFSQNGGEIDGKSSLAMIGHDGPNDMKFFGYSKELGEYHLTVKDNFGHYFKNPEYETMEVAPGSDCSKTSHLSLQIPDKEVWKARDVFQSLVSDSIRDILEKEETKQRPADLIPSVLTIRNLYNFNPGNFHYIQKTFDLTKKDGFQFDITYNKLGTTQSISTREQVTELITWSLNEINARFDKQFSFGEGPDSIESVEVKRRFALETLSNLLGGIGYFYGNQLIDRETEFDESQFTEIKLLNAKEEGPFELFTSVPSRGFFPRGFYWDEGFHLLQIMEYDFDLAFEILASWFEMIEDDSGWIAREIILGNEARSKVPQEFQVQNPNIANPPTLLLAFSEMLSR.... The pIC50 is 4.6. (8) The drug is O=C(CSc1nnc(-c2ccccc2)c(-c2ccccc2)n1)Nc1nccs1. The target protein (P9WMN1) has sequence MLRVAVPNKGALSEPATEILAEAGYRRRTDSKDLTVIDPVNNVEFFFLRPKDIAIYVGSGELDFGITGRDLVCDSGAQVRERLALGFGSSSFRYAAPAGRNWTTADLAGMRIATAYPNLVRKDLATKGIEATVIRLDGAVEISVQLGVADAIADVVGSGRTLSQHDLVAFGEPLCDSEAVLIERAGTDGQDQTEARDQLVARVQGVVFGQQYLMLDYDCPRSALKKATAITPGLESPTIAPLADPDWVAIRALVPRRDVNGIMDELAAIGAKAILASDIRFCRF. The pIC50 is 5.4. (9) The compound is O=S(=O)(Nc1cncc(Cl)c1O)c1ccc(OC(F)(F)F)c(Cl)c1. The target protein (P06702) has sequence MTCKMSQLERNIETIINTFHQYSVKLGHPDTLNQGEFKELVRKDLQNFLKKENKNEKVIEHIMEDLDTNADKQLSFEEFIMLMARLTWASHEKMHEGDEGPGHHHKPGLGEGTP. The pIC50 is 5.7. (10) The target protein (P06278) has sequence MKQQKRLYARLLTLLFALIFLLPHSAAAAANLNGTLMQYFEWYMPNDGQHWKRLQNDSAYLAEHGITAVWIPPAYKGTSQADVGYGAYDLYDLGEFHQKGTVRTKYGTKGELQSAIKSLHSRDINVYGDVVINHKGGADATEDVTAVEVDPADRNRVISGEHRIKAWTHFHFPGRGSTYSDFKWHWYHFDGTDWDESRKLNRIYKFQGKAWDWEVSNENGNYDYLMYADIDYDHPDVAAEIKRWGTWYANELQLDGFRLDAVKHIKFSFLRDWVNHVREKTGKEMFTVAEYWQNDLGALENYLNKTNFNHSVFDVPLHYQFHAASTQGGGYDMRKLLNSTVVSKHPLKAVTFVDNHDTQPGQSLESTVQTWFKPLAYAFILTRESGYPQVFYGDMYGTKGDSQREIPALKHKIEPILKARKQYAYGAQHDYFDHHDIVGWTREGDSSVANSGLAALITDGPGGAKRMYVGRQNAGETWHDITGNRSEPVVINSEGWGEFH.... The drug is Cc1ccc(S(=O)(=O)Nc2ccc(C(=O)/C=C/c3ccc(O)c(O)c3)cc2)cc1. The pIC50 is 3.7.
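The pIC50 definition stated at the top of this record (pIC50 = -log10(IC50 in M)) can be sketched as a pair of conversion helpers. This is a minimal illustration, not part of the dataset: the function names `ic50_to_pic50` and `pic50_to_ic50` are hypothetical, and the only assumption is that IC50 is expressed in mol/L, as the definition says.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the definition: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A 1 micromolar IC50 (1e-6 M) corresponds to pIC50 = 6.
    print(ic50_to_pic50(1e-6))
    # The pIC50 values listed in the examples above, converted back to molar IC50.
    for value in (5.7, 3.5, 4.8, 7.2):
        print(value, pic50_to_ic50(value))
```

Because the scale is logarithmic, each unit of pIC50 is a tenfold change in IC50: under this definition a pIC50 of 5.7 corresponds to an IC50 of roughly 2 micromolar, while 7.2 corresponds to roughly 63 nanomolar.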