Summary plot for the effect of substituting soymilk for cow’s milk on intermediate cardiometabolic outcomes. Analyses were conducted using generic inverse-variance random-effects models (at least 5 trials available) or fixed-effects models (fewer than 5 trials available). Between-study heterogeneity was assessed by the Cochran Q statistic, where P_Q < 0.10 was considered statistically significant, and quantified by the I² statistic, where I² ≥ 50% was considered evidence of substantial heterogeneity. Under GRADE, evidence from randomized controlled trials starts at “high” certainty and can be downgraded in 5 domains and upgraded in 1 domain. The white squares represent no downgrades, the filled black squares indicate a single downgrade or upgrade for each outcome, and the black square with a white “2” indicates a double downgrade for each outcome. Because all included trials were randomized or nonrandomized controlled trials, the certainty of the evidence was graded as high for all outcomes by default and then downgraded or upgraded based on prespecified criteria. Criteria for downgrades included risk of bias (ROB; downgraded if most trials were considered to be at high ROB); inconsistency (downgraded if there was substantial unexplained heterogeneity: I² ≥ 50%; P_Q < 0.10); indirectness (downgraded if factors relating to the participants, interventions, or outcomes limited the generalizability of the results); imprecision (downgraded if the 95% CI crossed the minimally important difference (MID) for harm or benefit); and publication bias (downgraded if there was evidence of publication bias based on funnel plot asymmetry and/or a significant Egger or Begg test (P < 0.10), with confirmation by adjustment using the trim-and-fill analysis of Duval and Tweedie). The criteria for upgrades included a significant dose–response gradient.
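The pooling and heterogeneity rules described above can be illustrated with a minimal sketch. It assumes that each trial contributes a mean difference and its standard error, that the random-effects model uses the DerSimonian-Laird estimator of between-study variance (the caption does not name the estimator), and that the 95% CI uses a normal approximation. The function names `chi2_sf` and `pool_inverse_variance` are hypothetical and are not taken from the original analysis.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (series expansion)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    # Series for the regularized lower incomplete gamma function P(a, z).
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def pool_inverse_variance(effects, ses, min_trials_for_random=5):
    """Pool trial effects; random effects if >= 5 trials, else fixed effects."""
    if len(effects) != len(ses) or not effects:
        raise ValueError("effects and ses must be non-empty and equal length")
    k = len(effects)
    w = [1.0 / se ** 2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran Q and I^2 are computed from the fixed-effect weights.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    p_q = chi2_sf(q, df) if df > 0 else 1.0

    tau2 = 0.0
    model = "fixed"
    if k >= min_trials_for_random:
        model = "random"
        # Assumption: DerSimonian-Laird between-study variance.
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        if c > 0:
            tau2 = max(0.0, (q - df) / c)
        w = [1.0 / (se ** 2 + tau2) for se in ses]

    estimate = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return {
        "model": model,
        "estimate": estimate,
        "se": se,
        "ci_low": estimate - 1.96 * se,   # normal-approximation 95% CI
        "ci_high": estimate + 1.96 * se,
        "q": q,
        "p_q": p_q,
        "i2": i2,
        "tau2": tau2,
        "significant_q": p_q < 0.10,      # threshold stated in the caption
        "substantial_i2": i2 >= 50.0,     # threshold stated in the caption
    }
```

The two heterogeneity flags are returned separately because the caption lists both thresholds without stating how they are combined when they disagree.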
To interpret the magnitude of effect, we used the MIDs to assess the importance of our point estimates using the effect size categories according to the new GRADE guidance, as follows: a large effect (≥ 5 × MID); moderate effect (≥ 2 × MID); small important effect (≥ 1 × MID); and trivial/unimportant effect (< 1 × MID). *HDL-C values reversed to show benefit. **LDL-C was not downgraded for imprecision, as the degree to which the upper 95% CI crosses the MID is not clinically meaningful. Additionally, the moderate change in non-HDL-C, with high certainty of evidence, substantiates the high certainty of the LDL-C results.
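The effect size categories above amount to comparing the absolute point estimate with multiples of the MID. A minimal sketch follows, assuming the estimate and the MID are expressed in the same units and that the direction of effect is handled separately; the function name `magnitude_category` is hypothetical, and the values in the usage lines are illustrative, not results from the analysis.

```python
def magnitude_category(estimate, mid):
    """Classify a point estimate by its size relative to the MID."""
    if mid <= 0:
        raise ValueError("mid must be positive")
    ratio = abs(estimate) / mid  # size of the effect in MID units
    if ratio >= 5:
        return "large"
    if ratio >= 2:
        return "moderate"
    if ratio >= 1:
        return "small important"
    return "trivial/unimportant"


# Illustrative values only (MID set to 1.0 for readability).
for value in (5.0, -3.0, 1.0, 0.5):
    print(value, magnitude_category(value, mid=1.0))
```

Taking the absolute value means a harmful and a beneficial effect of the same size fall in the same category, so the sign of the estimate still has to be reported alongside the category.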