From a dataset of Drug-target binding data from BindingDB using IC50 measurements. Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The drug is Nc1nc2c(ncn2COCCO)c(=O)[nH]1. The target protein sequence is MASHAGHQDSPALDRVAGSAGHGDHPSALLRIYVDGPHGLGKTTTAAALAAALGRRDEIEYVPEPMAYWQTLGGPQTITRIFDAQHRLDRGEISASEAAMAMASAQVTMSTPYAVTESAVAPHIGAELPPGHGPHPNIDLTLVFDRHPVASLLCYPAARYLMGSLSLPTVLSFAALLPQTTPGTNLVLGALPEAVHAERLAQRQRPGERLDLAMLSAIRRVYDMLGNAIVYLQRGGSWRADWRRLSPARSAAASGRPARILPRPEIEDTIFALFCAPELLDETGEPYRVFAWTLDLLAERLRPMHLLVLDYNQAPHHCWMDLMEMIPEMTPTLPATPGSMLTLQLLAREFAREMTSTRGGDVGGEGRETR. The pIC50 is 3.0. (2) The pIC50 is 5.0. The target protein (P45984) has sequence MSDSKCDSQFYSVQVADSTFTVLKRYQQLKPIGSGAQGIVCAAFDTVLGINVAVKKLSRPFQNQTHAKRAYRELVLLKCVNHKNIISLLNVFTPQKTLEEFQDVYLVMELMDANLCQVIHMELDHERMSYLLYQMLCGIKHLHSAGIIHRDLKPSNIVVKSDCTLKILDFGLARTACTNFMMTPYVVTRYYRAPEVILGMGYKENVDIWSVGCIMGELVKGCVIFQGTDHIDQWNKVIEQLGTPSAEFMKKLQPTVRNYVENRPKYPGIKFEELFPDWIFPSESERDKIKTSQARDLLSKMLVIDPDKRISVDEALRHPYITVWYDPAEAEAPPPQIYDAQLEEREHAIEEWKELIYKEVMDWEERSKNGVVKDQPSDAAVSSNATPSQSSSINDISSMSTEQTLASDTDSSLDASTGPLEGCR. The drug is Cc1nnc(-c2sccc2NC(=O)Cc2cccc3ccccc23)o1. (3) The small molecule is Nc1c(S(=O)(=O)[O-])cc(Nc2ccccc2C(=O)O)c2c1C(=O)c1ccccc1C2=O. The target protein (P51577) has sequence MAGCCSVLGSFLFEYDTPRIVLIRSRKVGLMNRAVQLLILAYVIGWVFVWEKGYQETDSVVSSVTTKAKGVAVTNTSQLGFRIWDVADYVIPAQEENSLFIMTNMIVTVNQTQSTCPEIPDKTSICNSDADCTPGSVDTHSSGVATGRCVPFNESVKTCEVAAWCPVENDVGVPTPAFLKAAENFTLLVKNNIWYPKFNFSKRNILPNITTSYLKSCIYNAQTDPFCPIFRLGTIVEDAGHSFQEMAVEGGIMGIQIKWDCNLDRAASLCLPRYSFRRLDTRDLEHNVSPGYNFRFAKYYRDLAGKEQRTLTKAYGIRFDIIVFGKAGKFDIIPTMINVGSGLALLGVATVLCDVIVLYCMKKKYYYRDKKYKYVEDYEQGLSGEMNQ. The pIC50 is 5.0. (4) The drug is CN(C)CCNc1c2[nH+]c3ccccc3c-2n(C)c2ccccc12. 
The target protein sequence is MHLMRACITFCIASTAVVAVNAALVAEDAPVLSKAFVDRVNRLNRGIWKAKYDGVMQNITLREAKRLNGVIKKNNNASILPKRRFTEEEARAPLPSSFDSAEAWPNCPTIPQIADQSACGSCWAVAAASAMSDRFCTMGGVQDVHISAGDLLACCSDCGDGCNGGDPDRAWAYFSSTGLVSDYCQPYPFPHCSHHSKSKNGYPPCSQFNFDTPKCNYTCDDPTIPVVNYRSWTSYALQGEDDYMRELFFRGPFEVAFDVYEDFIAYNSGVYHHVSGQYLGGHAVRLVGWGTSNGVPYWKIANSWNTEWGMDGYFLIRRGSSECGIEDGGSAGIPLAPNTA. The pIC50 is 4.0. (5) The compound is COc1cc(N2CCN(C(C)=O)CC2)ccc1Nc1ncc(F)c(Nc2ccc(N3CCN(C)CC3)cc2OC(F)F)n1. The target protein sequence is MGAIGLLWLLPLLLSTAAVGSGMGTGQRAGSPAAGPPLQPREPLSYSRLQRKSLAVDFVVPSLFRVYARDLLLPPSSSELKAGRPEARGSLALDCAPLLRLLGPAPGVSWTAGSPAPAEARTLSRVLKGGSVRKLRRAKQLVLELGEEAILEGCVGPPGEAAVGLLQFNLSELFSWWIRQGEGRLRIRLMPEKKASEVGREGRLSAAIRASQPRLLFQIFGTGHSSLESPTNMPSPSPDYFTWNLTWIMKDSFPFLSHRSRYGLECSFDFPCELEYSPPLHDLRNQSWSWRRIPSEEASQMDLLDGPGAERSKEMPRGSFLLLNTSADSKHTILSPWMRSSSEHCTLAVSVHRHLQPSGRYIAQLLPHNEAAREILLMPTPGKHGWTVLQGRIGRPDNPFRVALEYISSGNRSLSAVDFFALKNCSEGTSPGSKMALQSSFTCWNGTVLQLGQACDFHQDCAQGEDESQMCRKLPVGFYCNFEDGFCGWTQGTLSPHTPQ.... The pIC50 is 7.0. (6) The compound is Nc1ncnc2c1ccn2[C@@H]1C[C@H](O)[C@@H](O)[C@H]1O. The target protein (P50247) has sequence MSDKLPYKVADIGLAAWGRKALDIAENEMPGLMRMREMYSASKPLKGARIAGCLHMTVETAVLIETLVALGAEVRWSSCNIFSTQDHAAAAIAKAGIPVFAWKGETDEEYLWCIEQTLHFKDGPLNMILDDGGDLTNLIHTKYPQLLSGIRGISEETTTGVHNLYKMMSNGILKVPAINVNDSVTKSKFDNLYGCRESLIDGIKRATDVMIAGKVAVVAGYGDVGKGCAQALRGFGARVIITEIDPINALQAAMEGYEVTTMDEACKEGNIFVTTTGCVDIILGRHFEQMKDDAIVCNIGHFDVEIDVKWLNENAVEKVNIKPQVDRYWLKNGRRIILLAEGRLVNLGCAMGHPSFVMSNSFTNQVMAQIELWTHPDKYPVGVHFLPKKLDEAVAEAHLGKLNVKLTKLTEKQAQYLGMPINGPFKPDHYRY. The pIC50 is 3.3. (7) The small molecule is CCCCCCc1ccc(NC(=O)C=C(C)C)cc1. The target protein sequence is THKPEPTDEEWELIKTVTEAHVATNAQGSHWKQKRKFLPEDIGQAPIVNAPEGGKVDLEAFSHFTKIITPAITRVVDFAKKLPMFCELPCEDQIILLKGCCMEIMSLRAAVRYDPESETLTLNGEMAVTRGQLKNGGLGVVSDAIFDLGMSLSSFNLDDTEVALLQAVLLMSSDRPGLACVERIEKYQDSFLLAFEHYINYRKHHVTHFWPKLLMKVTDLRMIGACHASRFLHMKVECPTELFPPLFLEVFED. The pIC50 is 4.0. 
(8) The compound is CCO[C@@H](Cc1ccc(OCc2ccc(S(C)(=O)=O)cc2)cc1)C(=O)NOC. The target is CKENALLRYLLDKDD. The pIC50 is 3.6. (9) The small molecule is CC1=NN(c2ccc(C(=O)O)cc2)C(=O)/C1=C\c1ccc(-c2ccc(C(=O)NCC3CC3)cc2)o1. The target protein (P23025) has sequence MAAADGALPEAAALEQPAELPASVRASIERKRQRALMLRQARLAARPYSATAAAATGGMANVKAAPKIIDTGGGFILEEEEEEEQKIGKVVHQPGPVMEFDYVICEECGKEFMDSYLMNHFDLPTCDNCRDADDKHKLITKTEAKQEYLLKDCDLEKREPPLKFIVKKNPHHSQWGDMKLYLKLQIVKRSLEVWGSQEALEEAKEVRQENREKMKQKKFDKKVKELRRAVRSSVWKRETIVHQHEYGPEENLEDDMYRKTCTMCGHELTYEKM. The pIC50 is 4.9. (10) The small molecule is c1cc(C2=NCCN2)ccc1OCCCCCCCOc1ccc(C2=NCCN2)cc1. The target protein (P04631) has sequence MSELEKAMVALIDVFHQYSGREGDKHKLKKSELKELINNELSHFLEEIKEQEVVDKVMETLDEDGDGECDFQEFMAFVSMVTTACHEFFEHE. The pIC50 is 2.6.
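
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion utility. This is a minimal illustration, not part of the dataset tooling: the function names ic50_to_pic50, ic50_nm_to_pic50 and pic50_to_ic50 are hypothetical, and the nanomolar helper assumes raw measurements are reported in nM, which the text above does not state.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 given in mol/L to pIC50 = -log10(IC50)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Same conversion for an IC50 given in nanomolar (assumed input unit)."""
    return ic50_to_pic50(ic50_nm * 1e-9)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: return the IC50 in mol/L for a given pIC50."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # A pIC50 of 7.0, as in example (5), corresponds to an IC50 near 100 nM.
    print(pic50_to_ic50(7.0) * 1e9, "nM")
    # A pIC50 of 3.0, as in example (1), corresponds to an IC50 near 1 mM.
    print(pic50_to_ic50(3.0) * 1e3, "mM")
```

Because the scale is logarithmic, each unit increase in pIC50 means a tenfold lower IC50, which is why higher values indicate more potent binding.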