a,b, UMAP ordination of metabolomics data (N=232), same as Fig. 1b, colored by Pos Early, Pos Late, and Polar platform batches (a; 2 batches) and by Neg platform batches (b; 3 batches). See Table S4 for which metabolites were measured by each platform. A limited batch effect is observed, which is statistically significant only for the 3 Neg platform batches (PERMANOVA p=0.09 and p=0.023 for 2 and 3 batches, respectively). c, The fraction of samples from each batch (y-axis; top, Pos Early, Pos Late, and Polar platform batches; bottom, Neg platform batches) whose metabolite profiles were assigned to each metabolite cluster (MC; x-axis), shown for each MC separately. No significant batch effect was detected in MC assignments (two-sided Fisher’s exact p > 0.05 for all, without FDR correction). d, Heatmap showing the odds ratio for sPTB (color bar) for each metabolite from Fig. 2a (x-axis), estimated with a logistic regression model adjusting for batch (according to the appropriate platform for the metabolite, Table S4) and stratified by maternal race (y-axis). The exact odds ratio and confidence interval are shown in the cell for all statistically significant associations (FDR < 0.1). e, sPTB classification accuracy (auROC, x-axis) for a prediction model similar to those used for the entire cohort (Fig. 4, Methods), that is: trained and evaluated in cross-validation on batch 1 (N=114; orange; auROC=0.66; one-sided permutation p=0.44 for lower accuracy than random draw); trained on batch 1 (N=114) and evaluated on batch 2 (N=118; violet; auROC=0.66; p=0.46); trained and evaluated in cross-validation on batch 2 (N=118; magenta; auROC=0.66; p=0.44); and trained on batch 2 (N=118) and evaluated on batch 1 (N=114; brown; auROC=0.69; p=0.66). Gray histogram (black line, KDE) shows the accuracy of models evaluated in cross-validation on random samples (N=116) from this cohort (mean auROC=0.67).
This analysis demonstrates that a prediction model trained on one of the two batches generalizes well to the other batch, and that the resulting accuracies are consistent with what is expected given the limited sample size.
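For readers who wish to reproduce the comparison in panel e, the following is a minimal sketch of how a one-sided empirical p-value for "lower accuracy than random draw" could be computed from a set of random-subsample auROCs. The function name, the add-one smoothing, and the example values are illustrative assumptions and are not taken from the Methods; the null auROCs below are made up and are not data from this study.

```python
def empirical_lower_tail_p(observed, null_values):
    """One-sided empirical p-value: share of null draws at or below `observed`.

    Add-one smoothing (an assumption, not stated in the source) keeps the
    p-value strictly above zero for a finite number of draws.
    """
    n_at_or_below = sum(1 for v in null_values if v <= observed)
    return (n_at_or_below + 1) / (len(null_values) + 1)


# Hypothetical auROCs of models cross-validated on random subsamples
# (illustrative numbers only).
null_aurocs = [0.60, 0.63, 0.65, 0.67, 0.68, 0.70, 0.71, 0.72, 0.74]

# Hypothetical observed auROC for a single-batch or cross-batch model.
p = empirical_lower_tail_p(0.66, null_aurocs)
print(p)
```

Under this definition, a large p-value indicates that the batch-specific accuracy is not unusually low relative to models built on random subsamples of comparable size.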