This is drug-target binding data from BindingDB, based on IC50 measurements. The task is regression: given a target protein amino acid sequence and a drug SMILES string, predict the binding affinity between them. We predict pIC50 (pIC50 = -log10(IC50 in M); higher means more potent). Dataset: bindingdb_ic50. (1) The small molecule is COC(=O)[C@H](CC(C)C)NC(=O)[C@@H](O)[C@H](N)CCCCN. The target protein (P15684) has sequence MAKGFYISKTLGILGILLGVAAVCTIIALSVVYAQEKNRNAENSAIAPTLPGSTSATTSTTNPAIDESKPWNQYRLPKTLIPDSYQVTLRPYLTPNEQGLYIFKGSSTVRFTCNETTNVIIIHSKKLNYTNKGNHRVALRALGDTPAPNIDTTELVERTEYLVVHLQGSLVKGHQYEMDSEFQGELADDLAGFYRSEYMEGGNKKVVATTQMQAADARKSFPCFDEPAMKASFNITLIHPNNLTALSNMLPKDSRTLQEDPSWNVTEFHPTPKMSTYLLAYIVSEFKYVEAVSPNRVQIRIWARPSAIDEGHGDYALQVTGPILNFFAQHYNTAYPLEKSDQIALPDFNAGAMENWGLVTYRESALVFDPQSSSISNKERVVTVIAHELAHQWFGNLVTVDWWNDLWLNEGFASYVEFLGADYAEPTWNLKDLIVLNDVYRVMAVDALASSHPLSSPANEVNTPAQISELFDSITYSKGASVLRMLSSFLTEDLFKKGLS.... The pIC50 is 3.7. (2) The drug is CC(=O)N[C@@H]1[C@@H](N=C(N)N)C=C(C(=O)O)O[C@H]1[C@H](O)[C@H](O)CO. The target protein sequence is MNPNQKIITIGSICMVIGIVSLMLQIGNMISIWVSHSIQTGNQRQAEPISNTKFLTEKAVASVTLAGNSSLCPISGWAVYSKDNSIRIGSRGDVFVIREPFISCSHLECRTFFLTQGALLNDKHSNGTVKDRSPHRTLMSCPVGEAPSPYNSRFESVAWSASACHDGTSWLTIGISGPDNGAVAVLKYNGIITDTIKSWRNNILRTQESECACVNGSCFTVMTDGPSSGQASYKIFKMEKGKVVKSVELDAPNYHYEECSCYPDAGEITCVCRDNWHGSNRPWVSFNQNLEYQIGYICSGVFGDNPRPNDGTGSCGPVSPNGAYGVKGFSFKYGNGVWIGRTKSTNSRSGFEMIWDPNGWTGTDSSFSVKQDIVAITDWSGYSGSFVQHPELTGLDCIRPCFWVELIRGRPKESTIWTSGSSISFCGVNSDTVSWSWPDGAELPFTIDK. The pIC50 is 8.8. (3) The drug is CN1CCN(c2cccc3nc(CN(C)[C@H]4CCCc5cccnc54)c(CO)n23)CC1.
The target protein (P79394) has sequence MEGISIYTSDNYTEEMGSGDYDSIKEPCFREENAHFNRIFLPTIYSIIFLTGIVGNGLVILVMGYQKKLRSMTDKYRLHLSVADLLFVITLPFWAVDAVANWYFGNFLCKAVHVIYTVNLYSSVLILAFISLDRYLAIVHATNSQKPRKLLAEKVVYVGVWIPALLLTIPDFIFASVSEADDRYICDRFYPNDLWVVVFQFQHIMVGLILPGIDILSCYCIIISKLSHSKGHQKRKALKTTVILILAFFACWLPYYIGISIDSFILLEIIKQGCEFENTVHKWISITEALAFFHCCLNPILYAFLGAKFKTSAQHALTSVSRGSSLKILSKGKRGGHSSVSTESESSSFHSS. The pIC50 is 8.1. (4) The target protein sequence is MDYNMDYAPHEVISQQGERFVDKYVDRKILKNKKSLLVIISLSVLSVVGFVLFYFTPNSRKSDLFKNSSVENNNDDYIINSLLKSPNGKKFIVSKIDEALSFYDSKKNDINKYNEGNNNNNADFKGLSLFKENTPSNNFIHNKDYFINFFDNKFLMNNAEHINQFYMFIKTNNKQYNSPNEMKERFQVFLQNAHKVNMHNNNKNSLYKKELNRFADLTYHEFKNKYLSLRSSKPLKNSKYLLDQMNYEEVIKKYRGEENFDHAAYDWRLHSGVTPVKDQKNCGSCWAFSSIGSVESQYAIRKNKLITLSEQELVDCSFKNYGCNGGLINNAFEDMIELGGICPDGDYPYVSDAPNLCNIDRCTEKYGIKNYLSVPDNKLKEALRFLGPISISVAVSDDFAFYKEGIFDGECGDQLNHAVMLVGFGMKEIVNPLTKKGEKHYYYIIKNSWGQQWGERGFINIETDESGLMRKCGLGTDAFIPLIE. The drug is CC(C)C[C@H](NC(=O)N1CCOCC1)C(=O)N[C@H](C=C1CCN(Cc2ccccc2)S1(=O)=O)Cc1ccccc1. The pIC50 is 4.3.
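The pIC50 definition used for the labels above (pIC50 = -log10(IC50 in M)) can be sketched in Python as follows. This is a minimal illustration, not part of the dataset tooling: the function names ic50_to_pic50 and pic50_to_ic50 are hypothetical, and the input is assumed to be already expressed in mol/L.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50)."""
    # Assumption: the caller has already converted nM or uM values to M.
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration in mol/L")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 in mol/L = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


if __name__ == "__main__":
    # Labels taken from the four examples in this record.
    for label in (3.7, 8.8, 8.1, 4.3):
        print(f"pIC50 {label} -> IC50 {pic50_to_ic50(label):.3g} M")
```

Read this way, the label 8.8 in example (2) corresponds to an IC50 of roughly 1.6 nM, while the label 3.7 in example (1) corresponds to roughly 200 µM, which is why a higher pIC50 means a more potent compound.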