Drug-target binding data from BindingDB, using IC50 measurements. Task: regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them, expressed as pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is CC(C)(C)c1ccc(CN(Cc2cccc(OCC(=O)O)c2)S(=O)(=O)c2cccnc2)cc1. The target protein (Q62928) has sequence MDNSFNDSRRVENCESRQYLLSDESPAISSVMFTAGVLGNLIALALLARRWRGDTGCSAGSRTSISLFHVLVTELVLTDLLGTCLISPVVLASYSRNQTLVALAPESRACTYFAFTMTFFSLATMLMLFAMALERYLAIGHPYFYRRRVSRRGGLAVLPAIYGVSLLFCSLPLLNYGEYVQYCPGTWCFIQHGRTAYLQLYATVLLLLIVAVLGCNISVILNLIRMQLRSKRSRCGLSGSSLRGPGSRRRGERTSMAEETDHLILLAIMTITFAVCSLPFTIFAYMDETSSRKEKWDLRALRFLSVNSIIDPWVFVILRPPVLRLMRSVLCCRTSLRAPEAPGASCSTQQTDLCGQL. The pIC50 is 7.3. (2) The compound is CCN(CCN(C)C)C(=O)CNCc1cc(C(N)=O)ccn1. The target protein (Q9H6W3) has sequence MDGLQASAGPLRRGRPKRRRKPQPHSGSVLALPLRSRKIRKQLRSVVSRMAALRTQTLPSENSEESRVESTADDLGDALPGGAAVAAVPDAARREPYGHLGPAELLEASPAARSLQTPSARLVPASAPPARLVEVPAAPVRVVETSALLCTAQHLAAVQSSGAPATASGPQVDNTGGEPAWDSPLRRVLAELNRIPSSRRRAARLFEWLIAPMPPDHFYRRLWEREAVLVRRQDHTYYQGLFSTADLDSMLRNEEVQFGQHLDAARYINGRRETLNPPGRALPAAAWSLYQAGCSLRLLCPQAFSTTVWQFLAVLQEQFGSMAGSNVYLTPPNSQGFAPHYDDIEAFVLQLEGRKLWRVYRPRVPTEELALTSSPNFSQDDLGEPVLQTVLEPGDLLYFPRGFIHQAECQDGVHSLHLTLSTYQRNTWGDFLEAILPLAVQAAMEENVEFRRGLPRDFMDYMGAQHSDSKDPRRTAFMEKVRVLVARLGHFAPVDAVADQ.... The pIC50 is 4.0. (3) The small molecule is Nc1c(-c2ccc(F)cc2)c(-c2ccncc2)nn1-c1c(Cl)cc(Cl)cc1Cl. 
The target protein sequence is MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK.... The pIC50 is 6.8. (4) The compound is CC(NC(=O)Cc1cc(F)cc(F)c1)C(=O)NC1N=C(c2ccccc2)c2ccccc2N(C)C1=O. The target protein (A4D1B5) has sequence MALRLVADFDLGKDVLPWLRAQRAVSEASGAGSGGADVLENDYESLHVLNVERNGNIIYTYKDDKGNVVFGLYDCQTRQNELLYTFEKDLQVFSCSVNSERTLLAASLVQSTKEGKRNELQPGSKCLTLLVEIHPVNNVKVLKAVDSYIWVQFLYPHIESHPLPENHLLLISEEKYIEQFRIHVAQEDGNRVVIKNSGHLPRDRIAEDFVWAQWDMSEQRLYYIDLKKSRSILKCIQFYADESYNLMFEVPLDISLSNSGFKLVNFGCDYHQYRDKFSKHLTLCVFTNHTGSLCVCYSPKCASWGQITYSVFYIHKGHSKTFTTSLENVGSHMTKGITFLNLDYYVAVYLPGHFFHLLNVQHPDLICHNLFLTGNNEMIDMLPHCPLQSLSGSLVLDCCSGKLYRALLSQSSLLQLLQNTCLDCEKMAALHCALYCGQGAQFLEAQIIQWISENVSACHSFDLIQEFIIASSYWSVYSETSNMDKLLPHSSVLTWNTEIP.... The pIC50 is 8.9. (5) The small molecule is Cc1ccc(CNC(=O)C(=O)O)o1. The target protein sequence is LIVKKNLGDVVLFDIVKNMPHGKALDTSHTNVMAYSNCKVSGSNTYDDLAGADVVIVTAGFTKAPGKSDKEWNRDDLLPLNNKIMIEIGGHIKKNCPNAFIIVVTNPVDVMVQLLHQHSGVPKNKIIGLGGVLDTSRLKYYISQKLNVCPRDVNAHIVGAHGNKMVLLKRYITVGGIPLQEFINNKLISDAELEAIFDRTVNTALEIVNLHASPYVAPAAAIIEMAESYLKDLKKVLICSTLLEGQYGHSDIFGGTPVVLGANGVEQ. The pIC50 is 3.7. (6) The drug is Nc1c(-c2ccc(F)cc2)c(-c2ccncc2)nn1-c1c(Cl)cc(Cl)cc1Cl. 
The target protein sequence is MRPSGTAGAALLALLAALCPASRALEEKKVCQGTSNKLTQLGTFEDHFLSLQRMFNNCEVVLGNLEITYVQRNYDLSFLKTIQEVAGYVLIALNTVERIPLENLQIIRGNMYYENSYALAVLSNYDANKTGLKELPMRNLQEILHGAVRFSNNPALCNVESIQWRDIVSSDFLSNMSMDFQNHLGSCQKCDPSCPNGSCWGAGEENCQKLTKIICAQQCSGRCRGKSPSDCCHNQCAAGCTGPRESDCLVCRKFRDEATCKDTCPPLMLYNPTTYQMDVNPEGKYSFGATCVKKCPRNYVVTDHGSCVRACGADSYEMEEDGVRKCKKCEGPCRKVCNGIGIGEFKDSLSINATNIKHFKNCTSISGDLHILPVAFRGDSFTHTPPLDPQELDILKTVKEITGFLLIQAWPENRTDLHAFENLEIIRGRTKQHGQFSLAVVSLNITSLGLRSLKEISDGDVIISGNKNLCYANTINWKKLFGTSGQKTKIISNRGENSCK.... The pIC50 is 7.0. (7) The small molecule is CCNC(=O)Nc1cn2c(-c3nc(C)cc(C(F)(F)F)n3)cc(-c3cccnc3)cc2n1. The target protein sequence is NYNDDAIQVLEGLDAVRKRPGMYIGSTDGAGLHHLVWEIVDNAVDEALSGFGDRIDVTINKDGSLTVQDHGRGMPTGMHAMGIPTVEVIFTILHAGGKFGQGGYKTSGGLHGVGSSVVNLSSSWLEVEITRDGAIYKQRFENGGKPVTTLKKIGTAPKSKTGTKVTFMPDAGIFSTTDFKYNTISERLNESAFLLKNVTLSLTDKRTDESVEFHYENGVQDFVSYLNEDKETLTPVLYFEGEDNGFQVEVALQYNDGFSDNILSFVNNVRTKDGGTHETGLKSAITKVMNDYARKTGLLKEKDKNLEGSDYREGLAAVLSILVPEEHLQFEGQTKDKLGSPLARPVVDGIVADKLTFFLMENGELASNLIRKAIKARDAREAARKARDESRNGKKNKKDKGLLSGKLTPAQSKNPAKNELYLVEGDSAGGSAKQGRDRKFQAILPLRGKVINTAKAKMADILKNEEINTMIYTIGAGVGADFSIEDANYDKIIIMTDADT.... The pIC50 is 6.3. (8) The compound is CC(C)CCC[C@@H](C)[C@H]1CCC2=C(CCCN3CCCCC3)CCC[C@@]21C. The pIC50 is 4.8. The target protein (Q96WJ0) has sequence MIYGYTEKELEKTDPDGWRLIVEDTGRQRWKYLKTEEERRERPQTYMEKYFLGKNMDLPEQPAAKTPIESARKGFSFYKHLQTSDGNWACEYGGVMFLLPGLIIAMYISKIEFPDEMRIEVIRYLVNHANPEDGGWGIHIEGKSTVFGTALNYVVLRILGLGPDHPVTMKARIRLNELGGAIGCPQWGKFWLAVLNCYGWEGINPILPEFWMLPEWLPIHPSRWWVHTRAVYLPMGYIYGEKFTAPVDPLIESLREELYTQPYSSINFSKHRNTTSPVDVYVPHTRFLRVINSILTFYHTIFRFSWIKDMASKYAYKLIEYENKNTDFLCIGPVNFSIHILAVYWKEGPDSYAFKSHKERMADFLWISKKGMMMNGTNGVQLWDTSFAVQALVESGLAEDPEFKDHMIKALDFLDKCQIQKNCDDQQKCYRHRRKGAWPFSTRQQGYTVSDCTAEALKAVLLLQNLKSFPKRVSYDRLKDSVDVILSLQNKDGGFASYEL.... (9) The small molecule is CC(C)c1ccccc1-c1ncc2[nH]c(=O)n(Cc3ccc(-n4cc(CN(C)C)nn4)cc3)c2n1. 
The target protein (O94782) has sequence MPGVIPSESNGLSRGSPSKKNRLSLKFFQKKETKRALDFTDSQENEEKASEYRASEIDQVVPAAQSSPINCEKRENLLPFVGLNNLGNTCYLNSILQVLYFCPGFKSGVKHLFNIISRKKEALKDEANQKDKGNCKEDSLASYELICSLQSLIISVEQLQASFLLNPEKYTDELATQPRRLLNTLRELNPMYEGYLQHDAQEVLQCILGNIQETCQLLKKEEVKNVAELPTKVEEIPHPKEEMNGINSIEMDSMRHSEDFKEKLPKGNGKRKSDTEFGNMKKKVKLSKEHQSLEENQRQTRSKRKATSDTLESPPKIIPKYISENESPRPSQKKSRVKINWLKSATKQPSILSKFCSLGKITTNQGVKGQSKENECDPEEDLGKCESDNTTNGCGLESPGNTVTPVNVNEVKPINKGEEQIGFELVEKLFQGQLVLRTRCLECESLTERREDFQDISVPVQEDELSKVEESSEISPEPKTEMKTLRWAISQFASVERIVG.... The pIC50 is 10.0. (10) The drug is COc1cc(OC)nc(-n2c(O)cc3ccccc3c2=O)n1. The target protein (P79208) has sequence MLARALLLCAAVVCGAANPCCSHPCQNRGVCMSVGFDQYKCDCTRTGFYGENCTTPEFLTRIKLLLKPTPDTVHYILTHFKGVWNIVNKISFLRNMIMRYVLTSRSHLIESPPTYNVHYSYKSWEAFSNLSYYTRALPPVPDDCPTPMGVKGRKELPDSKEVVKKVLLRRKFIPDPQGTNLMFAFFAQHFTHQFFKTDIERGPAFTKGKNHGVDLSHVYGESLERQHNRRLFKDGKMKYQMINGEMYPPTVKDTQVEMIYPPHIPEHLKFAVGQEVFGLVPGLMMYATIWLREHNRVCDVLKQEHPEWGDEQLFQTSRLILIGETIKIVIEDYVQHLSGYHFKLKFDPELLFNQQFQYQNRIAAEFNTLYHWHPLLPDVFQIDGQEYNYQQFIYNNSVLLEHGVTQFVESFTRQIAGRVAGRRNLPAAVEKVSKASLDQSREMKYQSFNEYRKRFLLKPYESFEELTGEKEMAAELEALYGDIDAMELYPALLVEKPAPD.... The pIC50 is 5.5.
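For reference, the pIC50 definition used throughout these records (pIC50 = -log10(IC50 in M)) can be sketched in Python as below. This is only an illustrative sketch: the function names are hypothetical and are not part of the bindingdb_ic50 dataset or any BindingDB tooling.

```python
import math


def ic50_molar_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50 in M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50_nanomolar(pic50: float) -> float:
    """Invert the definition and report the IC50 in nM (1 M = 1e9 nM)."""
    return 10.0 ** (-pic50) * 1e9


if __name__ == "__main__":
    # An IC50 of 100 nM is 1e-7 M.
    print(ic50_molar_to_pic50(1e-7))
    # pIC50 values taken from records (1) and (2) above.
    for value in (7.3, 4.0):
        print(value, pic50_to_ic50_nanomolar(value))
```

Under this definition, the pIC50 of 7.3 in record (1) corresponds to an IC50 of roughly 50 nM, and each one-unit increase in pIC50 is a tenfold decrease in IC50.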