2023 Jan 18;42(8):e112600. doi: 10.15252/embj.2022112600

Figure EV3. Chromosome size alone does not explain the frequency of aneuploidy in yeasts.

  1. Correlation matrix of features of centromeres and aneuploidy frequency. Only correlations with significance (P < 0.05) are shown. The number of CEN‐like regions is taken from Lefrançois et al (2013) and the pericentromeric sizes are taken from Paldi et al (2020).
  2. Histogram of the number of aneuploidies per chromosome from this study and nine additional studies (see text). Inset: boxplot of the same data, with a paired t‐test of the mean difference in aneuploidy frequency between groups A and B. Each dot represents the observed number of aneuploids in our study or in one of the nine additional studies (n = 80 for each group), the central band represents the median, the box extends from the 25th to the 75th percentile, and the whiskers represent the 95% confidence interval.
  3. Randomizations of the mean difference in aneuploidy occurrence between centromere paralog pairs. Left panel: schematic of the analysis. Briefly, we examined the mean difference in aneuploidy frequency between centromere paralog pairs (Group A − Group B, e.g., CEN1 − CEN7) in our study and nine additional studies (Table EV3; Kao et al, 2010; McCulley & Petes, 2010; Selmecki et al, 2015; Gallone et al, 2016; Zhu et al, 2016; Jaffe et al, 2017; Duan et al, 2018; Peter et al, 2018; Sharp et al, 2018). Next, from these 70 paired aneuploid values (note that we excluded the pair CEN15‐CEN13), we calculated the difference in aneuploidy occurrence between the group A and group B member of each pair. We then performed a paired t‐test on the mean difference in aneuploidy occurrence between group A and group B (t = 3.95, P = 0.0001846). To evaluate how extreme the observed t‐statistic is, we randomly assigned a sign (− or +) to each difference between group A and group B paralogs and recalculated the t‐statistic for each of 100,000 such randomized allocations. These randomizations yielded the distribution of t‐statistics expected if the null hypothesis is true (i.e., there is no difference in aneuploidy frequency between the two groups). Right panel: histogram of the t‐statistics from the 100,000 randomizations. The observed t‐statistic is shown as a red dashed line (P < 0.00004).
  4. Comparison of linear regression models of aneuploid frequency explained by the indicated factors. Regression models were compared using the Akaike information criterion (AIC) method.
  5. Example predictions from linear regression models based on chromosome size with or without the centromere paralog information.
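
The sign randomization described in panel 3 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the toy differences, and the one-sided P value are assumptions, since the legend does not say which tail was used or how the test was implemented.

```python
import math
import random


def paired_t_statistic(diffs):
    """t-statistic for the mean of paired differences against zero."""
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    # Undefined (division by zero) if all differences are identical.
    return mean / math.sqrt(variance / n)


def sign_flip_test(diffs, n_random=100_000, seed=0):
    """Randomly flip the sign of each difference to build a null t distribution.

    Returns the observed t, the list of randomized t values, and a one-sided
    P value (an assumption; the legend does not state the tail).
    """
    rng = random.Random(seed)
    observed = paired_t_statistic(diffs)
    null_stats = []
    for _ in range(n_random):
        # Each difference keeps or flips its sign with probability 1/2.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        null_stats.append(paired_t_statistic(flipped))
    n_extreme = sum(1 for t in null_stats if t >= observed)
    return observed, null_stats, n_extreme / n_random


# Hypothetical differences (group A minus group B); not the study's data.
toy_diffs = [3, 5, 2, 4, 6, 1, 7, 3, 4, 5]
t_obs, null_t, p_value = sign_flip_test(toy_diffs, n_random=10_000)
print(f"observed t = {t_obs:.2f}, randomization P = {p_value:.4f}")
```

Flipping signs leaves the magnitude of every paired difference intact, so the randomized t‐statistics show what the test would return if group membership within each pair carried no information.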
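
The model comparison in panels 4 and 5 can be illustrated with a small least‐squares sketch. It assumes two candidate models, chromosome size alone and chromosome size plus a paralog group indicator, and uses the Gaussian form AIC = n ln(RSS/n) + 2k, which matches the full‐likelihood AIC up to a constant shared by models fit to the same data. The data, the variable names, and the choice of predictors are hypothetical; the legend does not list the factors or the software used.

```python
import math


def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X includes a 1s column)."""
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def aic(X, y):
    """AIC of a Gaussian linear model, up to an additive constant."""
    n = len(y)
    beta = fit_ols(X, y)
    rss = sum(
        (yi - sum(b * x for b, x in zip(beta, row))) ** 2 for row, yi in zip(X, y)
    )
    k = len(beta) + 1  # regression coefficients plus the residual variance
    return n * math.log(rss / n) + 2 * k


# Hypothetical toy data; not the study's measurements.
size = [1, 2, 3, 4, 5, 6, 7, 8]    # chromosome size, arbitrary units
group = [1, 0, 1, 0, 1, 0, 1, 0]   # 1 = paralog group A, 0 = group B
y = [12.1, 7.8, 10.15, 5.95, 7.9, 4.2, 5.85, 2.05]  # aneuploidy frequency

X_size = [[1, s] for s in size]
X_full = [[1, s, g] for s, g in zip(size, group)]

print("AIC, size only:        ", round(aic(X_size, y), 2))
print("AIC, size + paralog:   ", round(aic(X_full, y), 2))
```

The model with the lower AIC is preferred; the 2k term penalizes the extra coefficient, so the paralog indicator only wins if it reduces the residual sum of squares enough to offset that penalty.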