From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is O=C(O)c1ccc(NC(=O)[C@H](CC2CCCC2)n2cnc(S(=O)(=O)C3CCC3)c2)nc1. The target protein (P17712) has sequence MLDDRARMEATKKEKVEQILAEFQLQEEDLKKVMSRMQKEMDRGLRLETHEEASVKMLPTYVRSTPEGSEVGDFLSLDLGGTNFRVMLVKVGEGEAGQWSVKTKHQMYSIPEDAMTGTAEMLFDYISECISDFLDKHQMKHKKLPLGFTFSFPVRHEDLDKGILLNWTKGFKASGAEGNNIVGLLRDAIKRRGDFEMDVVAMVNDTVATMISCYYEDRQCEVGMIVGTGCNACYMEEMQNVELVEGDEGRMCVNTEWGAFGDSGELDEFLLEYDRMVDESSANPGQQLYEKIIGGKYMGELVRLVLLKLVDENLLFHGEASEQLRTRGAFETRFVSQVESDSGDRKQIHNILSTLGLRPSVTDCDIVRRACESVSTRAAHMCSAGLAGVINRMRESRSEDVMRITVGVDGSVYKLHPSFKERFHASVRRLTPNCEITFIESEEGSGRGAALVSAVACKKACMLAQ. The pIC50 is 5.8. (2) The compound is CCN(CC)CCCOC(=O)C(C)(c1ccccc1)C1CCCCC1. The target protein (P00689) has sequence MKFVLLLSLIGFCWAQYDPHTADGRTAIVHLFEWRWADIAKECERYLAPKGFGGVQVSPPNENIIINNPSRPWWERYQPISYKICSRSGNENEFKDMVTRCNNVGVRIYVDAVINHMCGSGNSAGTHSTCGSYFNPNNREFSAVPYSAWYFNDNKCNGEINNYNDANQVRNCRLSGLLDLALDKDYVRTKVADYMNNLIDIGVAGFRLDAAKHMWPGDIKAVLDKLHNLNTKWFSQGSRPFIFQEVIDLGGEAIKGSEYFGNGRVTEFKYGAKLGTVIRKWNGEKMSYLKNWGEGWGFVPTDRALVFVDNHDNQRGHGAGGASILTFWDARMYKMAVGFMLAHPYGFTRVMSSYRRTRNFQNGKDVNDWIGPPNNNGVTKEVTINPDTTCGNDWVCEHRWRQIRNMVAFRNVVNGQPFANWWDNGSNQVAFSRGNRGFIVFNNDDWALSSTLQTGLPAGTYCDVISGDKVNGNCTGLKVNVGSDGKAHFSISNSAEDPFI.... The pIC50 is 7.7. (3) The small molecule is C#C[C@]1(O)CC[C@H]2[C@@H]3CCC4=Cc5oncc5C[C@]4(C)[C@H]3CC[C@@]21C. 
The target protein (P51589) has sequence MLAAMGSLAAALWAVVHPRTLLLGTVAFLLAADFLKRRRPKNYPPGPWRLPFLGNFFLVDFEQSHLEVQLFVKKYGNLFSLELGDISAVLITGLPLIKEALIHMDQNFGNRPVTPMREHIFKKNGLIMSSGQAWKEQRRFTLTALRNFGLGKKSLEERIQEEAQHLTEAIKEENGQPFDPHFKINNAVSNIICSITFGERFEYQDSWFQQLLKLLDEVTYLEASKTCQLYNVFPWIMKFLPGPHQTLFSNWKKLKLFVSHMIDKHRKDWNPAETRDFIDAYLKEMSKHTGNPTSSFHEENLICSTLDLFFAGTETTSTTLRWALLYMALYPEIQEKVQAEIDRVIGQGQQPSTAARESMPYTNAVIHEVQRMGNIIPLNVPREVTVDTTLAGYHLPKGTMILTNLTALHRDPTEWATPDTFNPDHFLENGQFKKREAFMPFSIGKRACLGEQLARTELFIFFTSLMQKFTFRPPNNEKLSLKFRMGITISPVSHRLCAVP.... The pIC50 is 7.7. (4) The small molecule is O=C(Cc1ccc(Cl)cc1B(O)O)N1CCC(Oc2ccc(C(F)(F)F)cn2)CC1. The target protein (P15304) has sequence MKPRRPISFTREITAMEPSSTSVSRPEWRPEAQQTLTDYPGSRELQEFGIPQKQSLPNEATAQQGAEFQQEQGVQQSTLLQKLLTPLAFPVPQQSFPSHKVHSDQQEATSQNGPGAGKVHTTQKELEHRDEHVGTAESGPAEPPPATEVEATSIAQAVSGPDKKLPTQTDLVSQERAEQSDPTAQQTPLVQGVKSDQGSLIESGILARLQKLAIQQPSQEWKTFLDCVTESDMEKYLNSSSKSNPPEPSGGTVIPGTLPSKQKPDCGKMSGYGGKLPHGKKGILQKHKHYWDTASAFSHSMDLRTMTQSLVALAEDNMAFFSSQGPGETARRLSNVFAGVREQALGLEPTLGQLLGVAHHFDLDTETPANGYRSLVHTARCCLAHLLHKSRYVASNRRSIFFRASHNLAELEAYLAALTQLRALAYYAQRLLTINRPGVLFFEGDEGLSADFLQDYVTLHKGCFYGRCLGFQFTPAIRPFLQTLSIGLVSFGEHYKRNET.... The pIC50 is 7.6. (5) The drug is N=C1N[C@H]2[C@H](COC(=O)N3CCCC3)NC(=N)N3CCC(O)(O)[C@]23N1. The target protein sequence is MASSSLPNLVPPGPHCLRPFTPESLAAIEQRAVEEEARLQRNKQMEIEEPERKPRSDLEAGKNLPLIYGDPPPEVIGIPLEDLDPYYSDKKTFIVLNKGKAIFRFSATPALYLLSPFSIVRRVAIKVLIHALFSMFIMITILTNCVFMTMSNPPSWSKHVEYTFTGIYTFESLIKMLARGFCIDDFTFLRDPWNWLDFSVITMAYVTEFVDLGNISALRTFRVLRALKTITVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALVGLQLFMGNLRQKCVRWPPPMNDTNTTWYGNDTWYSNDTWYGNDTWYINDTWNSQESWAGNSTFDWEAYINDEGNFYFLEGSNDALLCGNSSDAGHCPEGYECIKAGRNPNYGYTSYDTFSWAFLALFRLMTQDYWENLFQLTLRAAGKTYMIFFVVIIFLGSFYLINLILAVVAMAYAEQNEATLAEDQEKEEEFQQMLEKYKKHQEELEKAKAAQALESGEEADGDPTHNKD.... The pIC50 is 7.0. (6) The pIC50 is 3.1. 
The target protein (P15101) has sequence MQVPSPSVREAASMYGTAVAVFLVILVAALQGSAPAESPFPFHIPLDPEGTLELSWNISYAQETIYFQLLVRELKAGVLFGMSDRGELENADLVVLWTDRDGAYFGDAWSDQKGQVHLDSQQDYQLLRAQRTPEGLYLLFKRPFGTCDPNDYLIEDGTVHLVYGFLEEPLRSLESINTSGLHTGLQRVQLLKPSIPKPALPADTRTMEIRAPDVLIPGQQTTYWCYVTELPDGFPRHHIVMYEPIVTEGNEALVHHMEVFQCAAEFETIPHFSGPCDSKMKPQRLNFCRHVLAAWALGAKAFYYPEEAGLAFGGPGSSRFLRLEVHYHNPLVITGRRDSSGIRLYYTAALRRFDAGIMELGLAYTPVMAIPPQETAFVLTGYCTDKCTQLALPASGIHIFASQLHTHLTGRKVVTVLARDGRETEIVNRDNHYSPHFQEIRMLKKVVSVQPGDVLITSCTYNTEDRRLATVGGFGILEEMCVNYVHYYPQTQLELCKSAV.... The compound is Cn1cc[nH]c1=S. (7) The small molecule is O=C(O)[C@H]1/C(=C/CO)O[C@@H]2CC(=O)N21. The target protein sequence is MVKKSLRQFTLMATATVTLLLGSVPLYAQTADVQQKLAELERQSGGRLGVALINTADNSQILYRADERFAMCSTSKVMAAAAVLKKSESEPNLLNQRVEIKKSDLVNYNPIAEKHVNGTMSLAELSAAALQYSDNVAMNKLIAHVGGPASVTAFARQLGDETFRLDRTEPTLNTAIPGDPRDTTSPRAMAQTLRNLTLGKALGDSQRAQLVTWMKGNTTGAASIQAGLPASWVVGDKTGSGGYGTTNDIAVIWPKDRAPLILVTYFTQPQPKAESRRDVLASAAKIVTDGL. The pIC50 is 8.0. (8) The small molecule is COc1ccc2cc(CNCCc3ccc(Br)cc3)c(-c3ncco3)nc2c1. The target protein (Q8TDU6) has sequence MTPNSTGEVPSPIPKGALGLSLALASLIITANLLLALGIAWDRRLRSPPAGCFFLSLLLAGLLTGLALPTLPGLWNQSRRGYWSCLLVYLAPNFSFLSLLANLLLVHGERYMAVLRPLQPPGSIRLALLLTWAGPLLFASLPALGWNHWTPGANCSSQAIFPAPYLYLEVYGLLLPAVGAAAFLSVRVLATAHRQLQDICRLERAVCRDEPSALARALTWRQARAQAGAMLLFGLCWGPYVATLLLSVLAYEQRPPLGPGTLLSLLSLGSASAAAVPVAMGLGDQRYTAPWRAAAQRCLQGLWGRASRDSPGPSIAYHPSSQSSVDLDLN. The pIC50 is 5.6. (9) The small molecule is O=C(Nc1cccc(Br)c1O)c1cc2ccc(Br)cc2[nH]1. 
The target protein (Q6GG09) has sequence MRKTKIVCTIGPASESEEMIEKLINAGMNVARLNFSHGSHEEHKGRIDTIRKVAKRLDKIVAILLDTKGPEIRTHNMKDGIIELERGNEVIVSMNEVEGTPEKFSVTYENLINDVQVGSYILLDDGLIELQVKDIDHAKKEVKCDILNSGELKNKKGVNLPGVRVSLPGITEKDAEDIRFGIKENVDFIAASFVRRPSDVLEIREILEEQKANISVFPKIENQEGIDNIEEILEVSDGLMVARGDMGVEIPPEKVPMVQKDLIRQCNKLGKPVITATQMLDSMQRNPRATRAEASDVANAIYDGTDAVMLSGETAAGLYPEEAVKTMRNIAVSAEAAQDYKKLLSDRTKLVETSLVNAIGISVAHTALNLNVKAIVAATESGSTARTISKYRPHSDIIAVTPSEETARQCSIVWGVQPVVKKGRKSTDALLNNAVATAVETGRVTNGDLIIITAGVPTGETGTTNMMKIHLVGDEIANGQGIGRGSVVGTTLVAETVKDL.... The pIC50 is 6.9. (10) The small molecule is CCCCCC[C@H]1C(=O)O[C@H](C)[C@H](NC(=O)c2cccc(NC=O)c2O)C(=O)O[C@@H](C)[C@@H]1OC(=O)CC(C)C. The target protein (P28272) has sequence MTASLTTKFLNNTYENPFMNASGVHCMTTQELDELANSKAGAFITKSATTLEREGNPEPRYISVPLGSINSMGLPNEGIDYYLSYVLNRQKNYPDAPAIFFSVAGMSIDENLNLLRKIQDSEFNGITELNLSCPNVPGKPQVAYDFDLTKETLEKVFAFFKKPLGVKLPPYFDFAHFDIMAKILNEFPLAYVNSINSIGNGLFIDVEKESVVVKPKNGFGGIGGEYVKPTALANVRAFYTRLRPEIKVIGTGGIKSGKDAFEHLLCGASMLQIGTELQKEGVKIFERIEKELKDIMEAKGYTSIDQFRGKLNSI. The pIC50 is 5.0.
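Note on the label scale: the pIC50 values above follow the stated definition pIC50 = -log10(IC50 in M). The sketch below shows that conversion in both directions. The function names are hypothetical and are not part of the dataset or of any BindingDB tooling. The nanomolar helper assumes the raw IC50 is reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 in nanomolar (assumed unit): 1 nM = 1e-9 M."""
    return ic50_to_pic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # Example (1) above reports pIC50 = 5.8, i.e. an IC50 of roughly 1.6 micromolar.
    print(f"{pic50_to_ic50(5.8):.3e} M")
    # Example (6) above reports pIC50 = 3.1, i.e. an IC50 of roughly 0.8 millimolar.
    print(f"{pic50_to_ic50(3.1):.3e} M")
```

Because the scale is logarithmic, a difference of 1.0 in pIC50 corresponds to a tenfold difference in IC50, which is why higher values mean more potent binding.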