Dataset: Experimental lipophilicity measurements (octanol/water distribution) for 4,200 compounds from AstraZeneca.
Task: Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion properties. Task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (lipophilicity_astrazeneca), the task is regression: we predict Y, the measured lipophilicity in logD units.
(1) The compound is CN[C@@H](C)C(=O)N[C@H](C(=O)N1CCC[C@H]1C(=O)N[C@H](C(=O)OC)C(c1ccccc1)c1ccccc1)C1CCCCC1. The Y is 2.42 logD.
(2) The compound is Nc1nc(Cl)nc(NCc2ccccc2)n1. The Y is 2.26 logD.
(3) The drug is COc1ccc(Nc2cc(Nc3ccc(OCC(O)CN(C)C)cc3)ncn2)cc1. The Y is 1.50 logD.
(4) The compound is Cc1sc(CN2CCNCC2)cc1C(=O)NCC12CC3CC(CC(C3)C1)C2. The Y is 2.55 logD.
(5) The compound is O=C(O)C[C@H]1CC[C@H](c2ccc(NC(=O)c3nnc(Nc4ccccc4F)o3)cc2)CC1. The Y is 2.48 logD.
(6) The drug is COc1cc(C(=O)O)ccc1Cn1ccc2ccc(NC(=O)OC3CCCC3)cc21. The Y is 2.10 logD.
(7) The molecule is CC1(C)CC(=O)c2c(nc3ccccc3c2N)C1. The Y is 3.00 logD.
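The numbered records above follow a fixed sentence template, so they can be split into (index, SMILES, logD) tuples with a regular expression before any model is fit. The following is a minimal Python sketch using only the standard library. The names TEXT, RECORD, parse_records, and mean_logd are illustrative assumptions and not part of the dataset or of any library; TEXT holds three records copied from the examples above, and the mean is shown only as a trivial constant baseline for the regression target.

```python
import re
from statistics import mean

# Three records copied verbatim from the examples above (hypothetical variable name).
TEXT = (
    "(2) The compound is Nc1nc(Cl)nc(NCc2ccccc2)n1. The Y is 2.26 logD. "
    "(3) The drug is COc1ccc(Nc2cc(Nc3ccc(OCC(O)CN(C)C)cc3)ncn2)cc1. The Y is 1.50 logD. "
    "(7) The molecule is CC1(C)CC(=O)c2c(nc3ccccc3c2N)C1. The Y is 3.00 logD."
)

# Assumed template: "(n) The compound|drug|molecule is <SMILES>. The Y is <value> logD."
# SMILES strings contain no whitespace, so a non-greedy \S+ is enough to capture them.
RECORD = re.compile(
    r"\((\d+)\)\s+The (?:compound|drug|molecule) is (\S+?)\.\s+"
    r"The Y is (-?\d+(?:\.\d+)?) logD\."
)


def parse_records(text: str) -> list[tuple[int, str, float]]:
    """Return (index, smiles, logd) for every record found in the text."""
    return [
        (int(m.group(1)), m.group(2), float(m.group(3)))
        for m in RECORD.finditer(text)
    ]


def mean_logd(records: list[tuple[int, str, float]]) -> float:
    """Mean of the logD values: a constant-prediction baseline, nothing more."""
    return mean(value for _, _, value in records)


for index, smiles, logd in parse_records(TEXT):
    print(f"{index}: {smiles} -> {logd:.2f} logD")
```

The parsed SMILES strings could then be passed to a featurizer or model of choice; that step is left out here because the text does not specify one.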