Dataset: Experimental lipophilicity measurements (octanol/water distribution coefficient, logD) for 4,200 compounds from AstraZeneca.
Task: Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion (ADME) properties. The task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (lipophilicity_astrazeneca), the task is regression: we predict Y, the measured logD.

(1) The drug is COc1cccc(S(=O)(=O)c2c(C)n(CC(=O)O)c3ccc(C)cc23)c1. The Y is -0.420 logD.
(2) The drug is Oc1nc2cc(Cl)cc(Cl)c2c(O)c1-c1ccccc1. The Y is 1.60 logD.
(3) The Y is 3.25 logD. The drug is Clc1ccc2c(c1Nc1ccnc(Nc3cc(N4CCOCC4)cc(N4CCOCC4)c3)n1)OCO2.
(4) The drug is Nc1ccccc1NC(=O)c1ccc(-c2nccs2)cc1. The Y is 2.30 logD.
(5) The drug is CNC(=O)c1ccc(Nc2nccc(-c3cnc(C)n3C(C)C)n2)cc1F. The Y is 2.95 logD.
(6) The drug is Nc1[nH]ncc1-c1cc(Cl)ccc1Oc1cc(F)c(S(=O)(=O)Nc2cscn2)cc1Cl. The Y is 2.30 logD.
(7) The molecule is Cc1ccc(Cl)c(Nc2cc(Nc3ccc(OCC(O)CN(C)C)cc3)ncn2)c1. The Y is 2.16 logD.
(8) The molecule is CCOC(=O)CN(c1ccccc1C)S(=O)(=O)c1ccc(Cl)cc1. The Y is 4.03 logD.
(9) The molecule is O=c1[nH]c2c(O)ccc([C@@H](O)CNCCc3cccc(CNCc4ccccc4F)c3)c2s1. The Y is 0.690 logD.
(10) The drug is Cc1ccc(S(=O)(=O)Nc2ccc(/C=C/C(=O)Nc3ccccc3N)cc2)cc1. The Y is 2.82 logD.
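The records above can be turned into (SMILES, logD) pairs for a regression workflow. The following is a minimal sketch using only the Python standard library. The names `parse_records`, `mae`, and `RECORDS` are hypothetical helpers introduced for illustration and are not part of any dataset tooling. The parser assumes the sentence patterns seen in the records (either "drug" or "molecule", in either field order) and SMILES strings without whitespace. Only three of the records are copied in, to keep the example short. The mean-value baseline is an assumption added here as a simple reference point, not a method described in the source.

```python
import re
from statistics import mean

# Three records copied verbatim from the examples; the rest are omitted for brevity.
RECORDS = """
(1) The drug is COc1cccc(S(=O)(=O)c2c(C)n(CC(=O)O)c3ccc(C)cc23)c1. The Y is -0.420 logD.
(3) The Y is 3.25 logD. The drug is Clc1ccc2c(c1Nc1ccnc(Nc3cc(N4CCOCC4)cc(N4CCOCC4)c3)n1)OCO2.
(8) The molecule is CCOC(=O)CN(c1ccccc1C)S(=O)(=O)c1ccc(Cl)cc1. The Y is 4.03 logD.
"""

# Record markers such as "(1)" must follow whitespace (or the start of the text),
# so parenthesised atoms inside a SMILES string are never mistaken for markers.
MARKER_RE = re.compile(r"(?:^|\s)\(\d+\)\s+")
# The SMILES is the whitespace-free token before the sentence-ending period.
SMILES_RE = re.compile(r"The (?:drug|molecule) is (\S+)\.")
Y_RE = re.compile(r"The Y is (-?\d+(?:\.\d+)?) logD")


def parse_records(text):
    """Return a list of (smiles, logd) tuples, tolerating either field order."""
    pairs = []
    for chunk in MARKER_RE.split(text):
        smiles = SMILES_RE.search(chunk)
        y = Y_RE.search(chunk)
        if smiles and y:
            pairs.append((smiles.group(1), float(y.group(1))))
    return pairs


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


pairs = parse_records(RECORDS)
targets = [y for _, y in pairs]

# Hypothetical baseline: predict the mean logD for every compound.
baseline = mean(targets)
baseline_mae = mae(targets, [baseline] * len(targets))

for smiles, y in pairs:
    print(f"{y:6.2f}  {smiles}")
print(f"mean-baseline MAE on these records: {baseline_mae:.3f}")
```

Any learned regressor for this task would be expected to improve on such a constant predictor when evaluated on held-out compounds; evaluating on the same records used to compute the mean, as done here, only illustrates the mechanics.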