Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion (ADME) properties. The task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (lipophilicity_astrazeneca), the target Y is the experimental lipophilicity (octanol/water distribution coefficient, logD), drawn from measurements on 4,200 compounds from AstraZeneca.
(1) The drug is O=S(=O)(c1ccccc1-c1ccc(CNC2CCOCC2)cc1)N1CCCC1. The Y is 1.40 logD.
(2) The compound is O=c1[nH]c2c(O)ccc([C@@H](O)CNCCc3ccc(CNCCc4ccccn4)cc3)c2s1. The Y is -0.250 logD.
(3) The molecule is N#Cc1ccc(C(c2ccc(C#N)cc2)n2cncn2)cc1. The Y is 1.70 logD.
(4) The drug is CCN(CC)CCNC(=O)c1cc(S(C)(=O)=O)ccc1OC. The Y is -0.900 logD.
(5) The compound is Cc1ccn2c(NC(=O)c3ccccc3)c(-c3cccs3)nc2c1. The Y is 2.87 logD.
(6) The compound is Cc1cc2n[nH]c(=O)n2c2cc(-c3ccc(C(=O)N4CCOCC4)cc3)ccc12. The Y is 2.77 logD.
(7) The molecule is CC(=O)Nc1ccc2ccn(-c3cc(NCCCCO)n4ncc(C#N)c4n3)c2c1. The Y is 2.58 logD.
(8) The compound is Cc1c(Cl)cccc1S(=O)(=O)Nc1nc(CC(=O)N2CCN(C)CC2)cs1. The Y is 0.570 logD.
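The numbered examples follow a fixed template ("(n) The drug/compound/molecule is <SMILES>. The Y is <value> logD."), so they can be turned back into (SMILES, logD) pairs for a regression pipeline. The following is a minimal sketch using only the Python standard library; the names `RECORD`, `parse_records`, and `sample` are hypothetical and not part of the dataset, and the pattern assumes the template holds exactly as shown in the examples.

```python
import re
from statistics import mean

# Assumed record template (inferred from the examples, not an official spec):
#   "(n) The drug|compound|molecule is <SMILES>. The Y is <value> logD."
# SMILES strings contain no whitespace, so \S+ captures them whole.
RECORD = re.compile(
    r"\((\d+)\) The (?:drug|compound|molecule) is (\S+)\. "
    r"The Y is (-?\d+(?:\.\d+)?) logD\."
)

def parse_records(text):
    """Return a list of (index, smiles, logd) tuples found in text."""
    return [(int(i), smi, float(y)) for i, smi, y in RECORD.findall(text)]

# Two records copied from the examples.
sample = (
    "(3) The molecule is N#Cc1ccc(C(c2ccc(C#N)cc2)n2cncn2)cc1. The Y is 1.70 logD. "
    "(4) The drug is CCN(CC)CCNC(=O)c1cc(S(C)(=O)=O)ccc1OC. The Y is -0.900 logD."
)

records = parse_records(sample)

# Mean label over the parsed records; a constant predictor like this is
# only a trivial reference point, not a model of lipophilicity.
mean_logd = mean(y for _, _, y in records)

for idx, smiles, logd in records:
    print(idx, smiles, logd)
```

Parsing to numeric labels first keeps the SMILES strings untouched, so any featurization step (fingerprints, descriptors, or a learned encoder) can be applied afterwards without re-reading the text.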