2010 Aug 19;6(8):e1000885. doi: 10.1371/journal.pcbi.1000885

Figure 1. Simulation studies used to derive the appropriate penalty term for Inline graphic.

Each panel plots the difference in log likelihood (Inline graphic), normalized by the logarithm of the sample size (number of characters), between the best-fitting GA models with Inline graphic and Inline graphic rates (Inline graphic), against the number of sites in the alignment. For simulations with a single rate class, we plotted Inline graphic (top right). Figures for multiple-rate simulations (2–5 rates) show Inline graphic as black dots (left column) and Inline graphic as blue dots (right column). Values to the right of each row report the simulated rates for each class. The left column reflects power, whereas the right column reflects the degree of over-fitting. For the case where a single rate was simulated, the degree of over-fitting is the rate of false positives. The desired behavior for Inline graphic is achieved when the model with Inline graphic rate classes is preferred to models with Inline graphic and Inline graphic rate classes. For a modified BIC criterion Inline graphic with Inline graphic, the former happens if Inline graphic (more definitively with increasing sample size), and the latter if Inline graphic (regardless of sample size).
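
The relationship between a normalized log-likelihood difference and a BIC-style penalty can be illustrated with a minimal sketch. The caption's own formulas were lost in extraction, so everything below is an assumption rather than the authors' definition: the criterion is taken to be the generic form -2 log L + c · k · log(n), and the names `penalty_scale` (c), `n_params` (k), and `extra_params` are hypothetical. Under that assumed form, the larger model wins exactly when the log-likelihood gain divided by log(n) exceeds c times the number of extra parameters, divided by 2.

```python
import math


def modified_bic(log_likelihood, n_params, sample_size, penalty_scale=1.0):
    """Assumed generic form: -2 log L + penalty_scale * n_params * log(sample_size).

    This is NOT taken from the source; the exact criterion used in the
    figure is not recoverable from the caption text.
    """
    return -2.0 * log_likelihood + penalty_scale * n_params * math.log(sample_size)


def normalized_gain(log_l_small, log_l_large, sample_size):
    """Log-likelihood improvement divided by log(sample_size)."""
    return (log_l_large - log_l_small) / math.log(sample_size)


def prefers_larger_model(log_l_small, log_l_large, extra_params,
                         sample_size, penalty_scale=1.0):
    """True when the larger model has the lower (better) assumed criterion.

    Equivalent to comparing modified_bic() for the two models, because the
    parameters shared by both models cancel out of the difference.
    """
    threshold = penalty_scale * extra_params / 2.0
    return normalized_gain(log_l_small, log_l_large, sample_size) > threshold
```

For example, `prefers_larger_model(-5000.0, -4990.0, extra_params=2, sample_size=1000)` compares a gain of 10 log-likelihood units, scaled by log(1000), against a threshold of 1. How many parameters an extra rate class adds is also an assumption here and should be taken from the paper's model definition.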