Dataset: Experimental lipophilicity measurements (octanol/water distribution) for 4,200 compounds from AstraZeneca. Task: Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion properties. Task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (lipophilicity_astrazeneca), we predict Y.
(1) The molecule is CC(COc1ccccc1)NC(C)C(O)c1ccc(O)cc1. The Y is 0.560 logD.
(2) The molecule is N=C(N)Nc1ccc(C(=O)Oc2ccc(Cl)cc2)cc1. The Y is 0.700 logD.
(3) The compound is COc1cccc2c1c(NS(=O)(=O)c1ccc(Cl)s1)nn2Cc1cccc(CNC(=O)C(C)(C)O)c1. The Y is 2.09 logD.
(4) The molecule is O=C(NS(=O)(=O)Cc1ccccc1)N1CCC(N2CCC(Oc3ccc(Cl)c(Cl)c3)CC2)CC1. The Y is 2.13 logD.
(5) The Y is 2.80 logD. The compound is O=C1NC(=O)/C(=C/c2ccc3c(c2)OC(F)(F)O3)S1.
(6) The drug is CCCCn1c(Cc2cc(OC)c(OC)c(OC)c2)nc2c(N)ncnc21. The Y is 2.39 logD.
(7) The drug is CN(C)CC(O)COc1ccc(Nc2cc(Nc3cc(Cl)ccc3Cl)ncn2)cc1. The Y is 2.60 logD.
(8) The drug is CCN(CC)CC(=O)Nc1c(C#N)cnn1-c1ccccc1. The Y is 0.400 logD.
(9) The molecule is O=c1cc(-c2ccco2)[nH]c2ccccc12. The Y is 2.21 logD.
(10) The molecule is COc1cccc(Nc2ncnc3cc(OC)c(OC)cc23)c1. The Y is 3.10 logD.
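The numbered examples follow a loose template: each record names a SMILES string after "The molecule is", "The compound is", or "The drug is", and gives the target after "The Y is", in either order. A minimal sketch of how such records might be turned into (SMILES, logD) pairs is shown here, using only the Python standard library. The function name `parse_examples` and the regular expressions are assumptions inferred from the wording of these examples, not part of the dataset's tooling, and the short `sample` string reuses two of the records listed in this prompt.

```python
import re

# Record markers such as "(1) " are assumed to be surrounded by whitespace,
# which cannot occur inside a SMILES string.
RECORD_SPLIT = re.compile(r"(?:^|\s)\(\d+\)\s+")
# The SMILES is the whitespace-free token before the sentence-ending period.
SMILES_RE = re.compile(r"The (?:molecule|compound|drug) is (\S+)\.(?=\s|$)")
# The target is a decimal number followed by the unit label "logD".
Y_RE = re.compile(r"The Y is (-?\d+(?:\.\d+)?) logD")


def parse_examples(text):
    """Return a list of (smiles, logd) pairs found in a prompt string."""
    pairs = []
    # The first piece is the task description before record (1); skip it.
    for chunk in RECORD_SPLIT.split(text)[1:]:
        smiles = SMILES_RE.search(chunk)
        y = Y_RE.search(chunk)
        if smiles and y:
            pairs.append((smiles.group(1), float(y.group(1))))
    return pairs


sample = (
    "For this dataset, we predict Y. "
    "(1) The molecule is O=c1cc(-c2ccco2)[nH]c2ccccc12. The Y is 2.21 logD. "
    "(2) The Y is 2.80 logD. The compound is "
    "O=C1NC(=O)/C(=C/c2ccc3c(c2)OC(F)(F)O3)S1."
)
pairs = parse_examples(sample)
for smiles, logd in pairs:
    print(smiles, logd)
```

Searching for the two sentences independently, rather than matching one fixed sentence order, is what lets the sketch handle records where the Y value precedes the structure.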