Dataset: Experimental lipophilicity measurements (octanol/water distribution) for 4,200 compounds from AstraZeneca.
Task: Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion properties. Task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (lipophilicity_astrazeneca), we predict Y.
(1) The compound is C[C@H]1CN(Cc2cc(Cl)ccc2CC(=O)O)CCN1S(=O)(=O)Cc1ccccc1. The Y is 1.12 logD.
(2) The drug is Cc1c(Sc2ccc(Cl)cc2)c2c(C#N)c(Cl)ccc2n1CC(=O)O. The Y is 1.60 logD.
(3) The compound is CC(C)n1c(=O)n(C)c(=O)c2cn(Cc3cccc4ccccc34)cc21. The Y is 3.97 logD.
(4) The molecule is O=C(Nc1ccccc1-c1ccccc1)OC1CCN(CCCCCCCCCNC[C@H](O)c2ccc(O)c3[nH]c(=O)ccc23)CC1. The Y is 2.16 logD.
(5) The Y is 1.30 logD. The drug is Cn1sc(=O)c2cc(S(=O)(=O)NC3CC3)ccc21.
(6) The molecule is OC[C@H]1C[C@@H](n2nnc3c(N[C@@H]4C[C@H]4c4ccccc4)nc(SCCC(F)(F)F)nc32)[C@H](O)[C@@H]1O. The Y is 4.06 logD.
(7) The molecule is Cn1c(CN2CCN(c3ccc(Cl)cc3)CC2)nc2ccccc21. The Y is 3.54 logD.
(8) The Y is 1.63 logD. The compound is Cc1onc(C(N)=O)c1C(=O)Nc1nccs1.
(9) The drug is O=C(Nc1nnn[nH]1)c1cc(Oc2ccccc2)c2ccccn2c1=O. The Y is 0.670 logD.
(10) The drug is Cc1c(Cl)ccc(OC2CCN(C[C@H](O)CNC(=O)c3c[nH]nc3C(F)(F)F)CC2)c1Cl. The Y is 3.66 logD.
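The numbered examples above pair a SMILES string with a measured logD, but the two sentences of each example appear in either order. As a minimal sketch of how such text could be turned into (index, SMILES, logD) records for a regression model, the Python below parses two of the examples copied verbatim from the text. The names `parse_examples`, `SMILES_RE`, `Y_RE`, and `PROMPT` are hypothetical, and the mean-of-labels baseline is an illustrative assumption, not a method stated in the source.

```python
import re
from statistics import mean

# Two numbered examples copied verbatim from the text above.
PROMPT = (
    "(5) The Y is 1.30 logD. The drug is Cn1sc(=O)c2cc(S(=O)(=O)NC3CC3)ccc21. "
    "(7) The molecule is Cn1c(CN2CCN(c3ccc(Cl)cc3)CC2)nc2ccccc21. The Y is 3.54 logD."
)

# A SMILES string has no whitespace, so read up to the sentence-ending period.
SMILES_RE = re.compile(r"The (?:compound|drug|molecule) is (\S+?)\.(?=\s|$)")
# The label is a decimal number followed by the unit "logD".
Y_RE = re.compile(r"The Y is (-?\d+(?:\.\d+)?) logD")


def parse_examples(text):
    """Return a list of (index, smiles, logd) tuples found in the text."""
    # Split on "(n) " markers that start the text or follow whitespace.
    # Result layout: [preamble, idx1, body1, idx2, body2, ...]
    parts = re.split(r"(?:^|\s)\((\d+)\)\s+", text)
    records = []
    for idx, body in zip(parts[1::2], parts[2::2]):
        smiles = SMILES_RE.search(body)
        label = Y_RE.search(body)
        if smiles and label:  # skip items missing either sentence
            records.append((int(idx), smiles.group(1), float(label.group(1))))
    return records


records = parse_examples(PROMPT)

# Assumed trivial baseline: predict the mean logD of the parsed examples.
baseline_logd = mean(y for _, _, y in records)

for idx, smiles, y in records:
    print(f"({idx}) {smiles} logD={y:.2f}")
print(f"mean-baseline prediction: {baseline_logd:.2f} logD")
```

Matching the SMILES and the label with separate patterns, rather than one pattern per sentence order, keeps the parser indifferent to which sentence comes first. Any real model would replace the mean baseline with features derived from the SMILES strings.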