Regression/Classification. Given a drug SMILES string, predict its absorption, distribution, metabolism, or excretion properties. Task type varies by dataset: regression for continuous measurements (e.g., permeability, clearance, half-life) or binary classification for categorical outcomes (e.g., BBB penetration, CYP inhibition). For this dataset (clearance_hepatocyte_az), the target is log10(clearance), the base-10 logarithm of clearance in mL/min/kg, taken from a dataset of hepatocyte clearance measurements from AstraZeneca. (1) The compound is CC(C)(N)C(=O)N[C@H](COCc1ccccc1)C(=O)N1CCC2(CC1)CN(S(C)(=O)=O)c1ccccc12. The log10(clearance) is 1.16. (2) The drug is COc1ccc(-c2ccc3c(N4CCOC[C@@H]4C)nc(N4CCOC[C@@H]4C)nc3n2)cc1C(=O)N(C)C. The log10(clearance) is 0.630. (3) The compound is CC(=O)c1cc2ccc(O)cc2oc1=O. The log10(clearance) is 2.18. (4) The log10(clearance) is 0.480. The molecule is CCCS(=O)(=O)N1N=Cc2cc(Cl)ccc2B1O. (5) The drug is Cc1cc(OC2CCN(CC3CCN([C@@](C)(Cc4ccc(F)cc4)C(=O)O)CC3)CC2)ccc1Cl. The log10(clearance) is 0.480. (6) The molecule is COC(=O)C1=C(C)NC(C)=C(C(=O)OC)C1c1ccccc1[N+](=O)[O-]. The log10(clearance) is 2.18. (7) The molecule is c1ccc(Nc2nc3c(s2)CCCc2n[nH]cc2-3)nc1. The log10(clearance) is 1.56. (8) The drug is CCO/N=C/c1ccc(OCCC2CCN(c3ccc(C)nn3)CC2)cc1. The log10(clearance) is 1.41. (9) The molecule is Cc1ccc(F)cc1S(=O)(=O)N[C@@H]1CCN(Cc2ccc(-c3ccccc3)cc2)C1. The log10(clearance) is 2.18.
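
Since the targets above are reported on a log10 scale, it can help to see how they map back to the original clearance scale. The following is a minimal sketch and not part of the dataset: the helper names to_linear and to_log10 are hypothetical, and the list simply copies the nine log10(clearance) values quoted in examples (1) to (9).

```python
import math

# log10(clearance) values copied from examples (1)-(9) above, in order.
LOG10_CLEARANCE = [1.16, 0.630, 2.18, 0.480, 0.480, 2.18, 1.56, 1.41, 2.18]


def to_linear(log10_value: float) -> float:
    """Invert the log10 transform: clearance = 10 ** log10(clearance).

    Hypothetical helper; the result is in whatever units the
    untransformed measurement used.
    """
    return 10.0 ** log10_value


def to_log10(clearance: float) -> float:
    """Apply the log10 transform to a strictly positive clearance value."""
    if clearance <= 0:
        raise ValueError("clearance must be positive to take log10")
    return math.log10(clearance)


# Back-transform every example target to the original scale.
linear_clearance = [to_linear(v) for v in LOG10_CLEARANCE]

if __name__ == "__main__":
    for index, (log_value, value) in enumerate(
        zip(LOG10_CLEARANCE, linear_clearance), start=1
    ):
        print(f"({index}) log10 = {log_value:.3f} -> clearance = {value:.2f}")
```

On this reading, a target of 0.480 corresponds to a clearance of roughly 3 and a target of 2.18 to roughly 150 in the stated units, so the nine examples span about a 50-fold range on the original scale.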