This is drug-target binding data from BindingDB, measured as IC50. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. The predicted value is pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COC1O[C@H](CNC(=O)c2ccccc2)[C@@H](O)C(O)[C@@H]1O. The target protein sequence is MATQGVFTLPANTRFGVTAFANSSGTQTVNVLVNNETAATFSGQSTNNAVIGTQVLNSGSSGKVQVQVSVNGRPSDLVSAQVILTNELNFALVGSEDGTDNDYNDAVVVINWPLG. The pIC50 is 4.0. (2) The drug is CN(CC(=O)N1CCOCC1)CC(O)c1c(-c2ccccc2)[nH]c2ccccc12. The target protein (Q8K4Z6) has sequence MALLITVVTCFMIILDTSQSCHTPDDFVAITSPGHIMIGGLFAIHEKMLSSDDHPRRPQIQKCAGFEISVFLQTLAMIHSIEMINNSTLLSGVKLGYEIYDTCTEVTAAMAATLRFLSKFNCSRETVVFQCDYSSYMPRVKAVIGAGYSETSIAVSRMLNLQLMPQVSYESTAEILSDKIRFPSFLRTVPSDFYQTKAMAHLIRQSGWNWIGAITTDDDYGRLALNTFAIQAAENNVCIAFKEVLPAFLSDNTIEVRINQTLEKIIAEAQVNVIVVFLRKFHVFNLFTKAIERKISKIWIASDNWSTATKIITIPNVKKLGKVVGFAFRRGNTSSFHSFLQTLHMYPNDNNKPLHEFAMLVSACKYIKDGDLSQCISNYSQATLTYDTTKTIENHLFKRNDFLWHYTEPGLIYSIQLAVFALGHAIRDLCQARDCKKPNAFQPWELLAVLKNVTFTDGRNSFHFDAHGDLNTGYDVVLWKETNGLMTVTKMAEYDLQRDV.... The pIC50 is 3.4. (3) The small molecule is O=c1c(-c2ccc(O)cc2O)coc2cc(O)cc(O)c12. The target protein (P06760) has sequence MSPRRSVCWFVLGQLLCSCAVALQGGMLFPKETPSRELKVLDGLWSFRADYSNNRLQGFEKQWYRQPLRESGPTLDMPVPSSFNDITQEAELRNFIGWVWYEREAVLPQRWTQDTDRRVVLRINSAHYYAVVWVNGIHVVEHEGGHLPFEADITKLVQSGPLTTFRVTIAINNTLTPYTLPPGTIVYKTDPSMYPKGYFVQDISFDFFNYAGLHRSVVLYTTPTTYIDDITVTTDVDRDVGLVNYWISVQGSDHFQLEVRLLDEDGKIVARGTGNEGQLKVPRAHLWWPYLMHEHPAYLYSLEVTMTTPESVSDFYTLPVGIRTVAVTKSKFLINGKPFYFQGVNKHEDSDIRGRGFDWPLLIKDFNLLRWLGANSFRTSHYPYSEEVLQLCDRYGIVVIDECPGVGIVLPQSFGNVSLRHHLEVMDELVRRDKNHPAVVMWSVANEPVSSLKPAGYYFKTLIAHTKALDPTRPVTFVSNTRYDADMGAPYVDVICVNSY.... The pIC50 is 5.2. (4) The small molecule is CCOC(=O)C1=C[C@@H](OC(CC)CC)[C@H](NC(C)=O)[C@@H](N)C1.
The target protein sequence is MNPNQKIITIGSVCIVIGIVSLMLQIGNIISIWVSHSIQTGNQYQPEPCNQSITEQAVTSVTLAGNSSLCPISGWAIYSKDNGIRIGSKGDVFVIREPFISCSYLECRTFFLTQGALLNDKHSNGTVKDRSPYRTLMSCPMGEAPSPYNSRFESVAWSASACHDGISWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSNGQASYKIFKIEKGKVVKSVELNAPNYHYEECSCYPDAGEIMCVCRDNWHGSNRPWVSFNQNLEYQLGYVCSGVFGDNPRPNDGTGNCGPMSSNGAYGVKGFSFKYGNGVWIGRTKSTSSRSGFEMIWDPNGWTETDSNFSVKQDIVAITDWSGYSGSFVQHPELTGLDCIRPCFWVELIRGRPKENTIWTSGSSISFCGVNSETVGWSWPDGAELPFTIDK. The pIC50 is 7.6. (5) The small molecule is CCc1cc(C(F)(F)F)cc(OC)c1C(=O)N[C@@H]1COCC[C@H]1N1CCCC1. The target protein (Q28039) has sequence MAAAQGPVAPSSLEQNGAVPSEATKKDQNLKRGNWGNQIEFVLTSVGYAVGLGNVWRFPYLCYRNGGGAFMFPYFIMLIFCGIPLFFMELSFGQFASQGCLGVWRISPMFKGVGYGMMVVSTYIGIYYNVVICIAFYYFFSSMTPVLPWTYCNNPWNTPDCMSVLDNPNITNGSQPPALPGNVSQALNQTLKRTSPSEEYWRLYVLKLSDDIGNFGEVRLPLLGCLGVSWVVVFLCLIRGVKSSGKVVYFTATFPYVVLTILFIRGVTLEGAFTGIMYYLTPQWDKILEAKVWGDAASQIFYSLGCAWGGLVTMASYNKFHNNCYRDSVIISITNCATSVYAGFVIFSILGFMANHLGVDVSRVADHGPGLAFVAYPEALTLLPISPLWSLLFFFMLILLGLGTQFCLLETLVTAIVDEVGNEWILQKKTYVTLGVAVAGFLLGIPLTSQAGIYWLLLMDNYAASFSLVIISCIMCVSIMYIYGHQNYFQDIQMMLGFPP.... The pIC50 is 7.8. (6) The compound is CCCC(=O)Nc1ccc2c(c1)C(=O)c1cc(NC(=O)CCC)ccc1-2. The target protein (Q62668) has sequence MSDHPLKEMSDNNRSPPLPEPLSSRYKLYESELSSPTWPSSSQDTHPALPLLEMPEEKDLRSSDEDSHIVKIEKPNERSKRRESELPRRASAGRGAFSLFQAVSYLTGDMKECKNWLKDKPLVLQFLDWVLRGAAQVMFVNNPLSGLIIFIGLLIQNPWWTIAGALGTVVSTLAALALSQDRSAIASGLHGYNGMLVGLLVAVFSEKLDYYWWLLFPVTFASMACPVISSALSTVFAKWDLPVFTLPFNIALTLYLAATGHYNLFFPTTLVKPASSAPNITWSEIEMPLLLQTIPVGVGQVYGCDNPWTGGVILVALFISSPLICLHAAIGSIVGLLAALTVATPFETIYTGLWSYNCVLSCVAIGGMFYVLTWQTHLLALVCALFCAYTGAALSNMMAVVGVPPGTWAFCLSTLTFLLLTSNNPGIHKLPLSKVTYPEANRIYFLTAKRSDEQKPPNGGGGEQSHGGGQRKAEEGSETVFPRRKSVFHIEWSSIRRRSK.... The pIC50 is 4.3.
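The pIC50 values in the examples above follow the definition given in the task description, pIC50 = -log10(IC50 in M). A minimal Python sketch of that conversion is shown here for reference. The function names are hypothetical and are not part of the dataset or of any BindingDB tooling. It also assumes IC50 values may be supplied in nanomolar, which this record does not state.

```python
import math


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50), with IC50 given in mol/L (M)."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_ic50_nanomolar(ic50_nm: float) -> float:
    """Convert an IC50 in nM to pIC50 (assumed unit; 1 nM = 1e-9 M)."""
    return pic50_from_ic50_molar(ic50_nm * 1e-9)


def ic50_molar_from_pic50(pic50: float) -> float:
    """Invert the definition: IC50 in M = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Round trip for the labels that appear in the examples above.
    for label in (4.0, 3.4, 5.2, 7.6, 7.8, 4.3):
        ic50 = ic50_molar_from_pic50(label)
        print(f"pIC50 {label} <-> IC50 {ic50:.3e} M")
```

Because the scale is logarithmic, each additional pIC50 unit corresponds to a tenfold lower IC50. A pIC50 of 4.0, as in example (1), corresponds to an IC50 of 1e-4 M (100 µM).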