Dataset: Drug-target binding data from BindingDB using IC50 measurements. Task: Regression. Given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity score between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The compound is C[C@@H]1O[C@@H](O[C@@H]2CCO[C@H](CO)[C@H]2O[C@@H]2O[C@H](O)[C@H](OS(=O)(=O)[O-])[C@H](O)[C@H]2O)[C@@H](O)[C@H](O)[C@@H]1O. The target protein (P16109) has sequence MANCQIAILYQRFQRVVFGISQLLCFSALISELTNQKEVAAWTYHYSTKAYSWNISRKYCQNRYTDLVAIQNKNEIDYLNKVLPYYSSYYWIGIRKNNKTWTWVGTKKALTNEAENWADNEPNNKRNNEDCVEIYIKSPSAPGKWNDEHCLKKKHALCYTASCQDMSCSKQGECLETIGNYTCSCYPGFYGPECEYVRECGELELPQHVLMNCSHPLGNFSFNSQCSFHCTDGYQVNGPSKLECLASGIWTNKPPQCLAAQCPPLKIPERGNMTCLHSAKAFQHQSSCSFSCEEGFALVGPEVVQCTASGVWTAPAPVCKAVQCQHLEAPSEGTMDCVHPLTAFAYGSSCKFECQPGYRVRGLDMLRCIDSGHWSAPLPTCEAISCEPLESPVHGSMDCSPSLRAFQYDTNCSFRCAEGFMLRGADIVRCDNLGQWTAPAPVCQALQCQDLPVPNEARVNCSHPFGAFRYQSVCSFTCNEGLLLVGASVLQCLATGNWNS.... The pIC50 is 3.2. (2) The small molecule is CN1CCN(C(=S)SSC(=S)N2CCN(C)CC2)CC1. The target protein (P11884) has sequence MLRAALSTARRGPRLSRLLSAAATSAVPAPNQQPEVFCNQIFINNEWHDAVSKKTFPTVNPSTGEVICQVAEGNKEDVDKAVKAAQAAFQLGSPWRRMDASDRGRLLYRLADLIERDRTYLAALETLDNGKPYVISYLVDLDMVLKCLRYYAGWADKYHGKTIPIDGDFFSYTRHEPVGVCGQIIPWNFPLLMQAWKLGPALATGNVVVMKVAEQTPLTALYVANLIKEAGFPPGVVNIVPGFGPTAGAAIASHEDVDKVAFTGSTEVGHLIQVAAGSSNLKRVTLELGGKSPNIIMSDADMDWAVEQAHFALFFNQGQCCCAGSRTFVQEDVYDEFVERSVARAKSRVVGNPFDSRTEQGPQVDETQFKKILGYIKSGQQEGAKLLCGGGAAADRGYFIQPTVFGDVKDGMTIAKEEIFGPVMQILKFKTIEEVVGRANNSKYGLAAAVFTKDLDKANYLSQALQAGTVWINCYDVFGAQSPFGGYKMSGSGRELGEYG.... The pIC50 is 4.8. (3) The drug is O=C(O)[C@H](Cc1ccc(-n2c(C3CCC3)nc3cccnc32)cc1)NC1=C(Br)C(=O)C12CCCCC2. 
The target protein (P13612) has sequence MAWEARREPGPRRAAVRETVMLLLCLGVPTGRPYNVDTESALLYQGPHNTLFGYSVVLHSHGANRWLLVGAPTANWLANASVINPGAIYRCRIGKNPGQTCEQLQLGSPNGEPCGKTCLEERDNQWLGVTLSRQPGENGSIVTCGHRWKNIFYIKNENKLPTGGCYGVPPDLRTELSKRIAPCYQDYVKKFGENFASCQAGISSFYTKDLIVMGAPGSSYWTGSLFVYNITTNKYKAFLDKQNQVKFGSYLGYSVGAGHFRSQHTTEVVGGAPQHEQIGKAYIFSIDEKELNILHEMKGKKLGSYFGASVCAVDLNADGFSDLLVGAPMQSTIREEGRVFVYINSGSGAVMNAMETNLVGSDKYAARFGESIVNLGDIDNDGFEDVAIGAPQEDDLQGAIYIYNGRADGISSTFSQRIEGLQISKSLSMFGQSISGQIDADNNGYVDVAVGAFRSDSAVLLRTRPVVIVDASLSHPESVNRTKFDCVENGWPSVCIDLTL.... The pIC50 is 7.7. (4) The small molecule is OCCCc1ccc[n+](CCCCCc2cc(CCCCC[n+]3cccc(CCCO)c3)c(CCCCC[n+]3cccc(CCCO)c3)cc2CCCCC[n+]2cccc(CCCO)c2)c1. The target protein (P43143) has sequence MLNGWGRGDLRSGLCLWICGFLAFFKGSRGCVSEEQLFHTLFAHYNRFIRPVENVSDPVTVHFELAITQLANVDEVNQIMETNLWLRHVWKDYRLCWDPTEYDGIETLRVPADNIWKPDIVLYNNAVGDFQVEGKTKALLKYDGVITWTPPAIFKSSCPMDITFFPFDHQNCSLKFGSWTYDKAEIDLLIIGSKVDMNDFWENSEWEIVDASGYKHDIKYNCCEEIYTDITYSFYIRRLPMFYTINLIIPCLFISFLTVLVFYLPSDCGEKVTLCISVLLSLTVFLLVITETIPSTSLVIPLVGEYLLFTMIFVTLSIVVTVFVLNIHYRTPATHTMPKWVKTMFLQVFPSILMMRRPLDKTKEMDGVKDPKTHTKRPAKVKFTHRKEPKLLKECRHCHKSSEIAPGKRLSQQPAQWVTENSEHPPDVEDVIDSVQFIAENMKSHNETKEVEDDWKYMAMVVDRVFLWVFIIVCVFGTVGLFLQPLLGNTGAS. The pIC50 is 8.5. (5) The small molecule is Cc1[nH]c2nc(N)[nH]c(=O)c2c1Sc1cccc(Br)c1. The target protein (P04818) has sequence MPVAGSELPRRPLPPAAQERDAEPRPPHGELQYLGQIQHILRCGVRKDDRTGTGTLSVFGMQARYSLRDEFPLLTTKRVFWKGVLEELLWFIKGSTNAKELSSKGVKIWDANGSRDFLDSLGFSTREEGDLGPVYGFQWRHFGAEYRDMESDYSGQGVDQLQRVIDTIKTNPDDRRIIMCAWNPRDLPLMALPPCHALCQFYVVNSELSCQLYQRSGDMGLGVPFNIASYALLTYMIAHITGLKPGDFIHTLGDAHIYLNHIEPLKIQLQREPRPFPKLRILRKVEKIDDFKAEDFQIEGYNPHPTIKMEMAV. The pIC50 is 5.9. (6) The compound is Cc1cc(-c2ccc(/N=N/c3c(N)c(S(=O)(=O)O)cc4cc(S(=O)(=O)O)ccc34)c(C)c2)ccc1/N=N/c1c(N)ccc2cc(S(=O)(=O)O)ccc12. 
The target protein (Q9P0U3) has sequence MDDIADRMRMDAGEVTLVNHNSVFKTHLLPQTGFPEDQLSLSDQQILSSRQGHLDRSFTCSTRSAAYNPSYYSDNPSSDSFLGSGDLRTFGQSANGQWRNSTPSSSSSLQKSRNSRSLYLETRKTSSGLSNSFAGKSNHHCHVSAYEKSFPIKPVPSPSWSGSCRRSLLSPKKTQRRHVSTAEETVQEEEREIYRQLLQMVTGKQFTIAKPTTHFPLHLSRCLSSSKNTLKDSLFKNGNSCASQIIGSDTSSSGSASILTNQEQLSHSVYSLSSYTPDVAFGSKDSGTLHHPHHHHSVPHQPDNLAASNTQSEGSDSVILLKVKDSQTPTPSSTFFQAELWIKELTSVYDSRARERLRQIEEQKALALQLQNQRLQEREHSVHDSVELHLRVPLEKEIPVTVVQETQKKGHKLTDSEDEFPEITEEMEKEIKNVFRNGNQDEVLSEAFRLTITRKDIQTLNHLNWLNDEIINFYMNMLMERSKEKGLPSVHAFNTFFFTK.... The pIC50 is 5.7. (7) The drug is O=C(NCC(c1ccccc1)c1ccccc1)Nc1cccc(NC(=O)NCC(c2ccccc2)c2ccccc2)c1. The target protein (P10111) has sequence MVNPTVFFDITADGEPLGRVCFELFADKVPKTAENFRALSTGEKGFGYKGSSFHRIIPGFMCQGGDFTRHNGTGGKSIYGEKFEDENFILKHTGPGILSMANAGPNTNGSQFFICTAKTEWLDGKHVVFGKVKEGMSIVEAMERFGSRNGKTSKKITISDCGQL. The pIC50 is 5.3. (8) The drug is COc1ccc(S(=O)(=O)N2Cc3[nH]c4ccccc4c3C[C@@H]2C(N)=O)cc1. The target protein (Q9HCN6) has sequence MSPSPTALFCLGLCLGRVPAQSGPLPKPSLQALPSSLVPLEKPVTLRCQGPPGVDLYRLEKLSSSRYQDQAVLFIPAMKRSLAGRYRCSYQNGSLWSLPSDQLELVATGVFAKPSLSAQPGPAVSSGGDVTLQCQTRYGFDQFALYKEGDPAPYKNPERWYRASFPIITVTAAHSGTYRCYSFSSRDPYLWSAPSDPLELVVTGTSVTPSRLPTEPPSPVAEFSEATAELTVSFTNEVFTTETSRSITASPKESDSPAGPARQYYTKGNLVRICLGAVILIILAGFLAEDWHSRRKRLRHRGRAVQRPLPPLPPLPLTRKSNGGQDGGRQDVHSRGLCS. The pIC50 is 3.9.
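
The pIC50 definition stated in the task description (pIC50 = -log10(IC50 in M)) can be sketched as a small conversion helper. This is a minimal illustration and not part of the dataset: the function names are hypothetical, and it assumes the raw IC50 values are given in nanomolar.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar (assumed unit) to pIC50 = -log10(IC50 in M)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    ic50_molar = ic50_nm * 1e-9  # nM -> M
    return -math.log10(ic50_molar)


def ic50_nm_from_pic50(pic50: float) -> float:
    """Invert the transform: recover the IC50 in nanomolar from a pIC50 value."""
    return 10.0 ** (-pic50) * 1e9  # M -> nM


if __name__ == "__main__":
    # An IC50 of 1000 nM (1 uM) corresponds to a pIC50 of about 6.
    print(round(pic50_from_ic50_nm(1000.0), 3))
    # A pIC50 of 7.7, as in example (3), corresponds to roughly 20 nM.
    print(round(ic50_nm_from_pic50(7.7), 2))
```

Because the scale is logarithmic, each unit of pIC50 corresponds to a tenfold change in IC50, which is why higher values mean more potent binding.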