Study design. [a] Upper panel: Training of 1H-NMR metabolomics-based predictors for routinely assessed phenotypic variables available in BBMRI.nl. This data set was created as a collaboration of 28 community- and hospital-based cohorts that collected nuclear magnetic resonance (1H-NMR) metabolomics data (Nightingale) for ∼31,000 individuals, before quality control. Upper panel left: Metabolomics-based predictors were trained using an inner loop of 5-fold cross-validation (CV; 5 repetitions) for hyperparameter optimization and were evaluated in unseen data using an outer loop of 5-fold CV or Leave-One-Biobank-Out Validation (LOBOV). Upper panel right: Using our models, 19 different surrogate values can be derived from a single metabolomics measurement to impute or complement a broad set of conventional clinical variables routinely assessed in epidemiological and clinical studies. Lower panel: Trained metabolomics-based predictors were evaluated in three application scenarios using a held-out study, the Leiden Longevity Study.19 This study is a two-generation family-based cohort consisting of highly aged parents (LLS-SIBS, N = 817, median age = 92 years) and their middle-aged offspring and the partners thereof (LLS-PAROFF, N = 2,280, median age = 59 years), for which we had access to additional detailed phenotypic information. Trained predictors were evaluated for their ability to reconstruct missing data points in an independent dataset (Application 1, lower left), to serve as confounders in Metabolome-Wide Association Studies (Application 2, lower central), and to explore determinants of health in older individuals (Application 3, lower right). This image has been designed using resources from Flaticon.com. [b] Groupings of phenotypic variables routinely assessed in epidemiological and clinical studies for which data were available in BBMRI-NL.
Continuous variables are dichotomized at levels generally accepted to confer an increased risk for cardio-metabolic endpoints. Because various cutoffs on chronological age are in use, in part reflecting the highly non-linear relation between chronological age and disease risk, we chose to split chronological age into three categories, each coded as TRUE/FALSE (I ‘young’: < 45 years; II ‘middle-aged’: ≥ 45 and < 65 years; III ‘old’: ≥ 65 years). We integrated Body Mass Index, waist circumference and sex into one sex-specific measure of ‘obesity’. Similarly, we integrated diastolic blood pressure (DBP) and systolic blood pressure (SBP) to arrive at one variable, ‘high pressure’. Overall, we obtained data for 20 dichotomous phenotypic variables. Colors indicate groupings.
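As an illustration of the age coding described in panel [b], the following minimal sketch derives the three TRUE/FALSE age indicators from chronological age. The function name `age_flags` and the dictionary keys are hypothetical and chosen for this example only; the cutoffs (45 and 65 years) are those given in the caption, and the sketch is not the authors' implementation.

```python
def age_flags(age_years: float) -> dict:
    """Return three TRUE/FALSE age indicators for one individual.

    Hypothetical helper: the name and keys are illustrative assumptions;
    only the cutoffs (45 and 65 years) come from the caption.
    """
    return {
        "young": age_years < 45,               # I: < 45 years
        "middle_aged": 45 <= age_years < 65,   # II: >= 45 and < 65 years
        "old": age_years >= 65,                # III: >= 65 years
    }


if __name__ == "__main__":
    # 59 and 92 are the median ages reported for LLS-PAROFF and LLS-SIBS;
    # 30 is an arbitrary example value.
    for age in (30, 59, 92):
        print(age, age_flags(age))
```

Under this coding exactly one of the three indicators is TRUE for any given age, so the three variables together partition the age range.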