From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is CCC(CC)(CC(=O)Nc1cccc(OCc2ccc3ccccc3n2)c1)C(=O)O. The target protein (Q2NNR5) has sequence MDETGNPTIPPASNNTCYDSIDDFRNQVYSTLYSMISVVGFFGNGFVLYVLVKTYHEKSAFQVYMINLAVADLLCVCTLPLRVAYYVHKGIWLFGDFLCRLSTYALYVNLYCSIFFMTAMSFFRCVAIVFPVQNISLVTQKKARLVCIAIWMFVILTSSPFLMANTYKDEKNNTKCFEPPQDNQAKNYVLILHYVSLFIGFIIPFITIIVCYTMIIFTLLKSSMKKNLSSRKRAIGMIIVVTAAFLVSFMPYHIQRTIHLHFLHNKTKPCDSILRMQKSVVITLSLAASNCCFDPLLYFFSGGNFRRRLSTIRKYSLSSMTYIPKKKTSLPQKGKDICKE. The pIC50 is 8.0. (2) The small molecule is NC(=O)c1cc(N2CCOCC2)nc2c(-c3ccnn3C3CCCCO3)nccc12. The target protein (Q13535) has sequence MGEHGLELASMIPALRELGSATPEEYNTVVQKPRQILCQFIDRILTDVNVVAVELVKKTDSQPTSVMLLDFIQHIMKSSPLMFVNVSGSHEAKGSCIEFSNWIITRLLRIAATPSCHLLHKKICEVICSLLFLFKSKSPAIFGVLTKELLQLFEDLVYLHRRNVMGHAVEWPVVMSRFLSQLDEHMGYLQSAPLQLMSMQNLEFIEVTLLMVLTRIIAIVFFRRQELLLWQIGCVLLEYGSPKIKSLAISFLTELFQLGGLPAQPASTFFSSFLELLKHLVEMDTDQLKLYEEPLSKLIKTLFPFEAEAYRNIEPVYLNMLLEKLCVMFEDGVLMRLKSDLLKAALCHLLQYFLKFVPAGYESALQVRKVYVRNICKALLDVLGIEVDAEYLLGPLYAALKMESMEIIEEIQCQTQQENLSSNSDGISPKRRRLSSSLNPSKRAPKQTEEIKHVDMNQKSILWSALKQKAESLQISLEYSGLKNPVIEMLEGIAVVLQLT.... The pIC50 is 5.8. (3) The compound is CC(C)(C)c1ccc(SCCCCC(CN)c2ccc(F)cc2)cc1. The target protein (P04055) has sequence MKLLLLAALLTAGVTAHSISTRAVWQFRNMIKCTIPGSDPLREYNNYGCYCGLGGSGTPVDDLDRCCQTHDHCYNQAKKLESCKFLIDNPYTNTYSYKCSGNVITCSDKNNDCESFICNCDRQAAICFSKVPYNKEYKDLDTKKHC. The pIC50 is 5.3. (4) The compound is O[C@H]1[C@H](NCc2cn(CCC(c3ccccc3)c3ccccc3)nn2)[C@@H](O)[C@H](O)[C@@H](O)[C@@H]1O. 
The target protein (Q653V7) has sequence MMGSPPAPPARRLGALAVFLLALFLAAPWGVDCGYNVASVAGSKNRLRARLELAGGGGGAAPELGPDVRRLSLTASLETDSRLHVRITDADHPRWEVPQDVIPRPSPDSFLAATRPGGGRVLSTATSDLTFAIHTSPFRFTVTRRSTGDVLFDTTPNLVFKDRYLELTSSLPPPGRASLYGLGEQTKRTFRLQRNDTFTLWNSDIAAGNVDLNLYGSHPFYMDVRSGGGGGGGAAHGVLLLNSNGMDVIYGGSYVTYKVIGGVLDFYFFAGPSPLAVVDQYTQLIGRPAPMPYWSFGFHQCRYGYKNVADLEGVVAGYAKARIPLEVMWTDIDYMDAYKDFTLDPVNFPADRMRPFVDRLHRNGQKFVVIIDPGINVNTTYGTFVRGMKQDIFLKWNGSNYLGVVWPGNVYFPDFLNPRAAEFWAREIAAFRRTLPVDGLWVDMNEISNFVDPPPLNAIDDPPYRINNSGVRRPINNKTVPASAVHYGGVAEYDAHNLFG.... The pIC50 is 4.0. (5) The drug is CCCCC(=O)C=C(C)C=CCCC(=O)N1CCCC1=O. The target protein (P24547) has sequence MADYLISGGTSYVPDDGLTAQQLFNCGDGLTYNDFLILPGYIDFTADQVDLTSALTKKITLKTPLVSSPMDTVTEAGMAIAMALTGGIGFIHHNCTPEFQANEVRKVKKYEQGFITDPVVLSPKDRVRDVFEAKARHGFCGIPITDTGRMGSRLVGIISSRDIDFLKEEEHDRFLEEIMTKREDLVVAPAGVTLKEANEILQRSKKGKLPIVNENDELVAIIARTDLKKNRDYPLASKDAKKQLLCGAAIGTHEDDKYRLDLLALAGVDVVVLDSSQGNSIFQINMIKYIKEKYPSLQVIGGNVVTAAQAKNLIDAGVDALRVGMGSGSICITQEVLACGRPQATAVYKVSEYARRFGVPVIADGGIQNVGHIAKALALGASTVMMGSLLAATTEAPGEYFFSDGIRLKKYRGMGSLDAMDKHLSSQNRYFSEADKIKVAQGVSGAVQDKGSIHKFVPYLIAGIQHSCQDIGAKSLTQVRAMMYSGELKFEKRTSSAQVE.... The pIC50 is 3.7. (6) The drug is NC(=O)[C@@H]1CCCN1C(=O)[C@H](Cc1cnc[nH]1)NC(=O)[C@@H]1CCC(=O)N1. The target protein sequence is MDGPSNVSLIHGDTTLGLPEYKVVSVFLVLLVCTLGIVGNAMVILVVLTSRDMHTPTNCYLVSLALADLLVLLAAGLPNVSDSLVGHWIYGRAGCLGITYFQYLGINVSSFSILAFTVERYIAICHPLRAQTVCTVARAKRIIAGIWGVTSLYCLLWFFLVDLNVRDNQRLECGYKVPRGLYLPIYLLDFAVFFIGPLLVTLVLYGLIGRILFQSPLSQEAWQKERQPHGQSEAAPGNCSRAKSSRKQATRMLAVVVLLFAVLWTPYRTLVLLNSFVAQPFLDPWVLLFCRTCVYTNSAVNPVVYSLMSQKFRAAFLKLCWCRAAGPQRRAARVLTSNYSAAQETSEGTEKM. The pIC50 is 7.6. (7) The compound is CC(=O)Nc1cc(Oc2ccc3c(C(=O)Nc4ccc(CN5CCN(C)CC5)c(C(F)(F)F)c4)cccc3c2)ncn1. 
The target protein sequence is MENFQKVEKIGEGTYGVVYKARNKLTGEVVALKKIRXDTETEGVPSTAIREISLLKELNHPNIVKLLDVIHTENKLYLVFEFLHQDLKKFMDASALTGIPLPLIKSYLFQLLQGLAFCHSHRVLHRDLKPQNLLINTEGAIKLCDFGLARAFGVPVRTYTHEVVTLWYRAPEILLGCKYYSTAVDIWSLGCIFAEMVTRRALFPGDSEIDQLFRIFRTLGTPDEVVWPGVTSMPDYKPSFPKWARQDFSKVVPPLDEDGRSLLSQMLHYDPNKRISAKAALAHPFFQDVTKPVPHLRL. The pIC50 is 5.0. (8) The small molecule is Cc1ccc(C2=C(NC(=O)CS(C)(=O)=O)C(=O)NC(c3ccc(OCCCC(F)(F)F)cc3)(C(F)(F)F)C2)cc1. The target protein (Q10469) has sequence MRFRIYKRKVLILTLVVAACGFVLWSSNGRQRKNEALAPPLLDAEPARGAGGRGGDHPSVAVGIRRVSNVSAASLVPAVPQPEADNLTLRYRSLVYQLNFDQTLRNVDKAGTWAPRELVLVVQVHNRPEYLRLLLDSLRKAQGIDNVLVIFSHDFWSTEINQLIAGVNFCPVLQVFFPFSIQLYPNEFPGSDPRDCPRDLPKNAALKLGCINAEYPDSFGHYREAKFSQTKHHWWWKLHFVWERVKILRDYAGLILFLEEDHYLAPDFYHVFKKMWKLKQQECPECDVLSLGTYSASRSFYGMADKVDVKTWKSTEHNMGLALTRNAYQKLIECTDTFCTYDDYNWDWTLQYLTVSCLPKFWKVLVPQIPRIFHAGDCGMHHKKTCRPSTQSAQIESLLNNNKQYMFPETLTISEKFTVVAISPPRKNGGWGDIRDHELCKSYRRLQ. The pIC50 is 7.8. (9) The drug is CC(C)NC(=O)OC[C@@H]1NC(=N)N2CCC(O)(O)[C@@]23NC(=N)N[C@@H]13. The target protein sequence is MASSSLPNLVPPGPHCLRPFTPESLAAIEQRAVEEEARLQRNKQMEIEEPERKPRSDLEAGKNLPLIYGDPPPEVIGIPLEDLDPYYSDKKTFIVLNKGKAIFRFSATPALYLLSPFSIVRRVAIKVLIHALFSMFIMITILTNCVFMTMSNPPSWSKHVEYTFTGIYTFESLIKMLARGFCIDDFTFLRDPWNWLDFSVITMAYVTEFVDLGNISALRTFRVLRALKTITVIPGLKTIVGALIQSVKKLSDVMILTVFCLSVFALVGLQLFMGNLRQKCVRWPPPMNDTNTTWYGNDTWYSNDTWYGNDTWYINDTWNSQESWAGNSTFDWEAYINDEGNFYFLEGSNDALLCGNSSDAGHCPEGYECIKAGRNPNYGYTSYDTFSWAFLALFRLMTQDYWENLFQLTLRAAGKTYMIFFVVIIFLGSFYLINLILAVVAMAYAEQNEATLAEDQEKEEEFQQMLEKYKKHQEELEKAKAAQALESGEEADGDPTHNKD.... The pIC50 is 6.8. (10) The drug is O=C(N[C@H]1CC=CC[C@H]1C(=O)N[C@@H](CCc1ccccc1)C(=O)CO)OCc1ccccc1. The pIC50 is 4.7. 
The target protein sequence is MEYHMEYSPNEVIKQEREVFVGKEKSGSKFKRKRSIFIVLTVSICFMFALMLFYFTRNENNKTLFTNSLSNNINDDYIINSLLKSESGKKFIVSKLEELISSYDKEKKMRTTGAEENNMNMNGIDDKDNKSVSFVNKKNGNLKVNNNNQVSYSNLFDTKFLMDNLETVNLFYIFLKENNKKYETSEEMQKRFIIFSENYRKIELHNKKTNSLYKRGMNKFGDLSPEEFRSKYLNLKTHGPFKTLSPPVSYEANYEDVIKKYKPADAKLDRIAYDWRLHGGVTPVKDQALCGSCWAFSSVGSVESQYAIRKKALFLFSEQELVDCSVKNNGCYGGYITNAFDDMIDLGGLCSQDDYPYVSNLPETCNLKRCNERYTIKSYVSIPDDKFKEALRYLGPISISIAASDDFAFYRGGFYDGECGAAPNHAVILVGYGMKDIYNEDTGRMEKFYYYIIKNSWGSDWGEGGYINLETDENGYKKTCSIGTEAYVPLLE.
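The pIC50 definition given at the top of this record (pIC50 = -log10(IC50 in M)) can be checked numerically. What follows is a minimal Python sketch, not part of the dataset itself; the helper names ic50_to_pic50 and pic50_to_ic50_nm are hypothetical and introduced only for illustration.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nm(pic50: float) -> float:
    """Convert a pIC50 back to an IC50 in nanomolar (1 M = 1e9 nM)."""
    return 10.0 ** (9.0 - pic50)


if __name__ == "__main__":
    # 10 nM expressed in mol/L is 1e-8.
    print(ic50_to_pic50(1e-8))
    # pIC50 values as reported in examples (1) and (4) of this record.
    print(pic50_to_ic50_nm(8.0))
    print(pic50_to_ic50_nm(4.0))
```

Under this definition, the pIC50 of 8.0 in example (1) corresponds to an IC50 of 10 nM, and the pIC50 of 3.7 in example (5) corresponds to roughly 200 µM, which is why higher values mean more potent binding.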