Dataset: Experimental lipophilicity measurements (octanol/water distribution) for 4,200 compounds from AstraZeneca.
Task: Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion properties. Task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (lipophilicity_astrazeneca), we predict Y.
(1) The drug is Cc1ccc2c(c1)c([S+]([O-])c1ccc(Cl)cc1)c(C)n2CC(=O)O. The Y is 0.0500 logD.
(2) The compound is COc1ccc(-c2nc3c(NCC4CCNCC4)c(Br)cnc3[nH]2)cc1. The Y is 1.36 logD.
(3) The compound is COCCC(NC(=O)C1(N)CCN(c2ncnc3[nH]ccc23)CC1)c1ccc(Cl)cc1. The Y is 2.95 logD.
(4) The Y is 2.75 logD. The molecule is CCN1C(=O)N(C)c2cnc(Nc3ccc(C(=O)NC4CCN(C)CC4)cc3OC)nc2N1C1CCCC1.
(5) The drug is O=C(NC[C@@H](O)CN1CCC(Oc2ccc(Cl)c(Cl)c2)CC1)c1cc(=O)[nH]c2ccccc12. The Y is 3.48 logD.
(6) The compound is COc1cc(N2CCN(C)CC2)c2nc(C(=O)Nc3ccc(N4CCOCC4)cc3)cc(N(C)C)c2c1. The Y is 3.30 logD.
(7) The compound is Cc1cc(Cn2[nH]c(=O)c3[nH]c4cc(Cl)ccc4c(=O)c3c2=O)oc1C. The Y is 1.00 logD.
(8) The drug is O=C(O)c1cn(C2CC2)c2cc(N3CCNCC3)c(F)cc2c1=O. The Y is -0.960 logD.
(9) The drug is Cc1ncc(-c2ccnc(Nc3ccc(C(=O)NCCO)cc3)n2)n1C(C)C. The Y is 2.09 logD.
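
The numbered records above follow a loose textual pattern: a SMILES string introduced by "The drug is", "The compound is", or "The molecule is", and a value introduced by "The Y is", in either order. A minimal Python sketch of how such records might be turned into (SMILES, logD) pairs is given below. The names `parse_records`, `SAMPLE`, and the three regular expressions are hypothetical and not part of the dataset. The sketch assumes that record markers such as "(1)" are always followed by "The", and that no SMILES string contains whitespace.

```python
import re

# Sample text copied from records (1), (4) and (8) above.
SAMPLE = (
    "(1) The drug is Cc1ccc2c(c1)c([S+]([O-])c1ccc(Cl)cc1)c(C)n2CC(=O)O. "
    "The Y is 0.0500 logD. "
    "(4) The Y is 2.75 logD. The molecule is "
    "CCN1C(=O)N(C)c2cnc(Nc3ccc(C(=O)NC4CCN(C)CC4)cc3OC)nc2N1C1CCCC1. "
    "(8) The drug is O=C(O)c1cn(C2CC2)c2cc(N3CCNCC3)c(F)cc2c1=O. "
    "The Y is -0.960 logD."
)

# Assumption: a record starts at "(n) " followed by the word "The".
RECORD_START = re.compile(r"\(\d+\)\s+(?=The\b)")
# The SMILES is the whitespace-free token after the introducing phrase.
SMILES = re.compile(r"The (?:drug|compound|molecule) is (\S+)")
# The target value is a signed decimal followed by the unit "logD".
LOGD = re.compile(r"The Y is (-?\d+(?:\.\d+)?) logD")


def parse_records(text):
    """Return a list of (smiles, logd) tuples found in `text`."""
    records = []
    # Everything before the first record marker (the task header) is dropped.
    for chunk in RECORD_START.split(text)[1:]:
        smiles = SMILES.search(chunk)
        logd = LOGD.search(chunk)
        if smiles is None or logd is None:
            continue  # skip chunks that do not match the assumed pattern
        # The captured token ends with the sentence period; strip it.
        records.append((smiles.group(1).rstrip("."), float(logd.group(1))))
    return records


if __name__ == "__main__":
    for smiles, logd in parse_records(SAMPLE):
        print(f"{logd:6.2f}  {smiles}")
```

Searching each record for the SMILES and the value independently, rather than matching one fixed sentence order, is what lets record (4), where the value comes first, parse the same way as the others.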