Dataset: Experimental lipophilicity measurements (octanol/water distribution) for 4,200 compounds from AstraZeneca.

Task: Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion properties. Task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (lipophilicity_astrazeneca), we predict Y.

(1) The compound is CC(C)Oc1cccc(CC(=O)N2CCC[C@](CC[N+]34CCC(c5ccccc5)(CC3)CC4)(c3ccc(Cl)c(Cl)c3)C2)c1. The Y is 1.23 logD.
(2) The drug is C[C@@H](C[C@@](C)(CS(=O)(=O)N1CCN(CCc2ccc(Cl)cc2Cl)CC1)N(O)C=O)c1ncc(F)cn1. The Y is 3.27 logD.
(3) The molecule is O=C(NS(=O)(=O)N1CCC(N2CCC(Oc3ccc(Cl)c(Cl)c3)CC2)CC1)c1ccccc1. The Y is 2.14 logD.
(4) The drug is CS(=O)(=O)c1ccc(C(CCNC(=O)c2ccc(C#N)cc2)c2ccc(F)cc2)cc1. The Y is 2.90 logD.
(5) The drug is Cn1cc(S(C)(=O)=O)cc1-c1c2c(=O)n(C)c(=O)n(CC3CC3)c2nn1Cc1ccnc2ccc(Cl)cc12. The Y is 3.18 logD.
(6) The drug is Cn1cncc1-c1c2c(=O)n(CC#CCO)c(=O)n(CC3CC3)c2nn1Cc1ccnc2ccc(Cl)cc12. The Y is 2.55 logD.
(7) The drug is Nc1ccc(S(=O)(=O)c2ccc(N)cc2)cc1. The Y is 0.880 logD.
(8) The molecule is O=C(Nc1ccc(N2CCOCC2)cc1)c1nnc(Nc2ccccc2F)o1. The Y is 3.05 logD.
(9) The molecule is c1ccc2ncccc2c1. The Y is 2.09 logD.
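
The numbered records above follow a regular pattern, so they can be pulled into (index, SMILES, logD) tuples for downstream modelling. The following is a minimal sketch using only the Python standard library. The names RECORD and parse_records are hypothetical, the pattern is an assumption based on the wording of the records shown here, and the mean value is only a trivial reference point for sanity checks, not a method described by the dataset.

```python
import re
from statistics import mean

# Hypothetical pattern (an assumption, not part of the dataset):
# "(n) The <compound|drug|molecule> is <SMILES>. The Y is <value> logD."
# SMILES strings contain no whitespace, so \S+ captures the whole string;
# the regex backtracks to leave the sentence-ending period outside the group.
RECORD = re.compile(
    r"\((\d+)\)\s+The (?:compound|drug|molecule) is (\S+)\. "
    r"The Y is (-?\d+(?:\.\d+)?) logD"
)

def parse_records(text):
    """Return a list of (index, smiles, logd) tuples found in text."""
    return [(int(i), smiles, float(y)) for i, smiles, y in RECORD.findall(text)]

# Two records copied verbatim from the examples above.
sample = (
    "(7) The drug is Nc1ccc(S(=O)(=O)c2ccc(N)cc2)cc1. The Y is 0.880 logD. "
    "(9) The molecule is c1ccc2ncccc2c1. The Y is 2.09 logD."
)

records = parse_records(sample)

# Mean of the parsed logD values: a constant-prediction reference point only.
baseline = mean(y for _, _, y in records)
```

Parsing into tuples keeps the SMILES strings untouched, so they can later be passed to a cheminformatics toolkit of choice for featurization.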