TABLE 3.
Reference | Location; study | DP identification methods | Explained variance % (number of factors) or CFA/CA model | Assessment of reproducibility/validity | Main results |
---|---|---|---|---|---|
Balder, 2003 (6) | Netherlands, Sweden, Finland, and Italy; DIETSCAN (NLCS, SMC, ATBC, ORDET) | Separate EFAs on each of the 4 studies: standardization and separate analysis by sex; within each study, sensitivity analyses assessing the effect of: 1) untransformed vs. dichotomized variables (for FGs with >75% of nonusers); 2) unadjusted vs. energy-adjusted variables using the residual method; 3) solutions with 2–6 factors; 4) split-half analysis using the Procrustes rotation to compare different solutions; Scree test to assess the final number of factors to retain in a range from 2 to 6 factors; Varimax rotation; Loading ≥0.35 cutoff | NLCS: 23 (5) for Ms, 23.2 (5) for Fs; SMC: 21.8 (4); ATBC: 20.3 (3); ORDET: 28.5 (4); final results based on variables unadjusted for energy | Internal reproducibility: see (5) for details; Cross-study reproducibility: no formal assessment | Internal reproducibility: see (5) for details; Cross-study reproducibility: 2 of the identified DPs were qualitatively similar across studies and between Ms and Fs |
Castello, 2016 (12) | Spain; EpiGEICAM, DDM-Spain | Separate PCAs on EpiGEICAM and DDM studies: PCA on EpiGEICAM data: PCA on controls only; EIG > 1; No rotation; Loading ≥0.30 cutoff; PCA on DDM data: separate PCAs on 5000 replicates of the DDM-Spain study within bootstrap estimation with selection of the 3 DPs that were more similar to those from EpiGEICAM study; PCA on controls only; EIG > 1; No rotation; Loading ≥0.30 cutoff | 37 (3) with PCA on EpiGEICAM data | Cross-study reproducibility: CC (95% CI) between factor loadings (with values of 0.85–0.94 indicating fair similarity and values ≥0.95 indicating 2 DPs were equivalent); Spearman correlation coefficient (Corr) (95% CI) between factor scores (considering any significant correlation as being indicative of DP similarity) | Cross-study reproducibility: satisfactory reproducibility of WESTERN DP, but not of PRUDENT and MEDITERRANEAN DPs [WESTERN DPs: CC = 0.90 (95% CI: 0.58–0.95), Corr = 0.92 (95% CI: 0.55–0.98); PRUDENT: CC = 0.76 (95% CI: 0.40–0.84), Corr = 0.83 (95% CI: 0.47–0.91); MEDITERRANEAN: CC = 0.77 (95% CI: 0.65–0.83), Corr = 0.74 (95% CI: 0.63–0.79)]; had we considered any significant correlation as being indicative of similarity, all DPs from the EpiGEICAM data were reproducible in the DDM-Spain study |
Castello, 2016 (11) | Spain; EpiGEICAM | PCA on EpiGEICAM study: PCA on controls only; EIG > 1; No rotation; Loading ≥0.30 cutoff; food consumption information from EpiGEICAM study grouped into FGs proposed in 3 other papers (Bessaoud et al., Adebamowo et al., and Terry et al.) and factor scores calculated with loadings from the original papers and FGs defined as in the original papers but recalculated on EpiGEICAM data; factor loadings recalculated using the definition of FG from (10) | 37 (3) with PCA on EpiGEICAM data | Cross-study reproducibility: CC (95% CI) between factor loadings (with values of 0.85–0.94 indicating fair similarity and values ≥0.95 indicating 2 DPs were equivalent); Spearman correlation coefficient (Corr) (95% CI) between factor scores (considering any significant correlation as being indicative of DP similarity) | Cross-study reproducibility: 5 of the 6 reconstructed DPs showed high CC (>0.9) to their corresponding DP derived on the EpiGEICAM study data [CC (Castello-WESTERN, Bessaoud-WESTERN) = 0.82, Corr (Castello-WESTERN, Bessaoud-WESTERN) = 0.57; CC (Castello-WESTERN, Adebamowo-WESTERN) = 0.92, Corr (Castello-WESTERN, Adebamowo-WESTERN) = 0.83; CC (Castello-WESTERN, Terry-WESTERN) = 0.94, Corr (Castello-WESTERN, Terry-WESTERN) = 0.85; CC (Castello-PRUDENT, Bessaoud-MEDITERRANEAN) = 0.86, Corr (Castello-PRUDENT, Bessaoud-MEDITERRANEAN) = 0.67; CC (Castello-MEDITERRANEAN, Bessaoud-MEDITERRANEAN) = 0.95, Corr (Castello-MEDITERRANEAN, Bessaoud-MEDITERRANEAN) = 0.85; CC (Castello-PRUDENT, Adebamowo-PRUDENT) = 0.95, Corr (Castello-PRUDENT, Adebamowo-PRUDENT) = 0.85; CC (Castello-MEDITERRANEAN, Adebamowo-PRUDENT) = 0.88, Corr (Castello-MEDITERRANEAN, Adebamowo-PRUDENT) = 0.73; CC (Castello-PRUDENT, Terry-HEALTHY) = 0.95, Corr (Castello-PRUDENT, Terry-HEALTHY) = 0.89; CC (Castello-MEDITERRANEAN, Terry-HEALTHY) = 0.77, Corr (Castello-MEDITERRANEAN, Terry-HEALTHY) = 0.52]; some of the smaller CCs between comparable DPs were attributable to FGs lacking in the original studies |
De Vito, 2019 (32) | USA, Italy, and Switzerland; INHANCE | Multi-study factor analysis on the merged dataset including the 7 studies: within-study log-transformation (base e) and standardization; controls-only analysis; identification of shared (among all studies) and (potential) study-specific dietary patterns within an integrated statistical model based on the maximum likelihood approach; number of factors to retain chosen according to a combination of standard techniques for FA, including Horn's parallel analysis, Cattell's scree plot, and Steiger's RMSEA index, for the best number of total factors allowed, and according to the Akaike Information Criterion, for the number of shared factors; Varimax rotation on the shared factor loading matrix; Loading ≥0.60 cutoff for the shared (rotated) factors and loading ≥0.25 cutoff for the study-specific (unrotated) factors; robustness analyses and stratified multi-study factor analysis by sex | 75–81 (3 common DPs shared among all the studies plus 1 additional study-specific DP for each of the 4 US studies) | Cross-study reproducibility: multi-study factor analysis | Cross-study reproducibility: Study populations from Italy, Switzerland, and the United States shared 3 reproducible DPs characterized by consumption of animal products and cereals, vitamin-rich foods, and fats, respectively; each of the American studies was characterized by a somewhat similar additional DP, which opposed calcium and niacin as dominant nutrients |
Judd, 2014 (26) | USA; REGARDS | EFA on the first split-sample, CFA on the second split-sample, and final PCA on the whole sample as far as the model is correctly identified: EFA: 3 separate PCAs by population subgroups [region (southeastern US stroke belt/non-belt), sex (male/female), and race (black/white)] to identify the optimal number of factors in a range from 3 to 6 factors; EIG > 1.5, Scree test, interpretability of results from stratified PCAs; Varimax rotation; Descriptive labeling; CFA: Loading >0.20 cutoff on EFA results; No different correlation structures specified; RMSEA and CFI | NA (5) | Cross-study reproducibility: CC determined for each stratification pair for each of the factor number solutions (“excellent” when the smallest coefficient was >0.8, “good” between 0.65 and 0.8, “acceptable” between 0.5 and 0.65, and “poor” <0.5); Validity: CFA | Cross-study reproducibility: PCA stratified by region of residence on the first half-sample: excellent CC for the 4- and 5-factor solutions, and acceptable CC for the 3- and 6-factor solutions; PCA stratified by gender: good CC for the 5- and 6-factor solutions and poor CC for the 3- and 4-factor solutions; PCA stratified by race: acceptable CC in the 5-factor solution, but poor CC for the other 3; the 5-factor solution had an acceptable CC in all stratified analyses and it was interpretable, so this was the final model selected for CFA; CFA on the second half-sample using the 5-factor solution: very good results, even when removing FGs with low factor loadings (RMSEA values <0.05) |
Männistö, 2005 (7) | Netherlands, Sweden, and Italy; DIETSCAN (NLCS, SMC, ATBC, ORDET) | Separate PCFAs on each of the 3 studies: Scree test; Varimax rotation; Loading ≥0.35 cutoff | NLCS: 23.2 (5); ORDET: 29 (4); SMC: 21.8 (4) | Cross-study reproducibility: no formal assessment | Cross-study reproducibility: both the identified DPs remained quite consistent across cohort studies |
Moskal, 2014 (8) | Europe; EPIC | Overall PCA on combined but country-specific questionnaire intakes and separate PCAs by center: log-transformation (base e) and energy adjustment with energy density method (based on alcohol-free energy) but no adjustment for center; separate analysis by sex; PCA on covariance matrix; Scree-plot, interpretability; Varimax rotation; Loading >0.45 cutoff | Overall PCA: 67 (4) | Cross-study reproducibility: Krzanowski's index, Bk, which measures the proportion of the variance captured by k center-specific PCs that is also captured by the overall PCA | Cross-study reproducibility: >75% of the variance that would be captured by center-specific PCs was captured by the PCs from the overall PCA (Bj > 0.76 for all j ≥ 2, B2 > 0.85 for 23 of 27 centers); retaining ≥4 PCs was sufficient to capture at least 80% of variance in any center (Bj > 0.80 for all j ≥ 4); differences between sexes in each center were small when k > 2 |
Schwerin, 1981 (43) | USA; Ten-State Nutrition Survey (Ten-State), HANES I | Separate PCAs on the 2 surveys: standardization; EIG > 1; Varimax rotation; Alpha-numeric labeling; assignment algorithm of subjects based on the highest factor score; (probably) applied scores on HANES I data based on Ten-State DP loadings in the final solution | 55.3 (7) | Cross-study reproducibility: no formal assessment | Cross-study reproducibility: the identified DPs were similar in the 2 surveys in terms of FGs consumed |
Schwerin, 1982 (44) | USA; Ten-State Nutrition Survey (Ten-State), HANES I, NFCS | Separate PCAs on the 3 surveys: standardization; EIG > 1; Varimax rotation with Kaiser normalization; Alpha-numeric labeling | NA (6, or 7, or 8) | Cross-study reproducibility: no formal assessment | Cross-study reproducibility: 4 of the identified DPs remained quite consistent across studies that covered a decade |
ATBC, Alpha-Tocopherol Beta-Carotene Cancer Prevention Study; CA, cluster analysis; CC, congruence coefficient; CFA, confirmatory factor analysis; CFI, comparative fit index; DDM-Spain, Determinantes de la Densidad Mamográfica en España; DIETSCAN, DIETary patternS and CANcer in 4 European countries project; DP, dietary pattern; EFA, exploratory factor analysis; EIG, eigenvalue; EPIC, European Prospective Investigation into Cancer and Nutrition; EpiGEICAM, Grupo Español de investigación en Cáncer de Mama; F, female; FA, factor analysis; FG, food group; HANES, Health and Nutrition Examination Survey; INHANCE, International Head and Neck Cancer Epidemiology Consortium; M, male; NA, not available; NFCS, Nationwide Food Consumption Survey; NLCS, Netherlands Cohort Study on Diet and Cancer; ORDET, Ormoni e Dieta nella Eziologia dei Tumori in Italy; PC, principal component; PCA, principal component analysis; PCFA, principal component factor analysis; REGARDS, Reasons for Geographic and Racial Differences in Stroke; RMSEA, root mean square error of approximation; SMC, Swedish Mammography Cohort.
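
Several rows of the table assess cross-study reproducibility through the congruence coefficient (CC) between factor loadings. The table does not state the formula, so the sketch below assumes the usual Tucker definition: the sum of products of two loading vectors divided by the square root of the product of their sums of squares. The function names and the loading values are hypothetical and are not taken from any of the cited studies; the cutoffs are the ones quoted in the Castello rows (0.85–0.94 fair similarity, ≥0.95 equivalent).

```python
import math


def congruence_coefficient(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two loading vectors.

    Assumption: both vectors list loadings for the same food groups,
    in the same order.
    """
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must have the same length")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    if norm == 0:
        raise ValueError("loading vectors must not be all zeros")
    return cross / norm


def classify_similarity(cc):
    """Apply the cutoffs quoted in the Castello rows of the table."""
    if cc >= 0.95:
        return "equivalent"
    if cc >= 0.85:
        return "fair similarity"
    return "below the fair-similarity cutoff"


# Made-up loadings for one pattern derived in two hypothetical studies.
study_1 = [0.62, 0.48, -0.10, 0.35]
study_2 = [0.58, 0.51, -0.05, 0.20]

cc = congruence_coefficient(study_1, study_2)
print(round(cc, 3), classify_similarity(cc))
```

Unlike a Pearson correlation, this coefficient does not center the loadings, so it is sensitive to the overall sign and level of the loadings as well as to their shape.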