Systematic Biology. 2019 Oct 29;69(3):530–544. doi: 10.1093/sysbio/syz069

A Bayesian Approach for Inferring the Impact of a Discrete Character on Rates of Continuous-Character Evolution in the Presence of Background-Rate Variation

Michael R May, Brian R Moore
Editor: Mark Holder
PMCID: PMC7608729  PMID: 31665487

Abstract

Understanding how and why rates of character evolution vary across the Tree of Life is central to many evolutionary questions; for example, does the trophic apparatus (a set of continuous characters) evolve at a higher rate in fish lineages that dwell in reef versus nonreef habitats (a discrete character)? Existing approaches for inferring the relationship between a discrete character and rates of continuous-character evolution rely on comparing a null model (in which rates of continuous-character evolution are constant across lineages) to an alternative model (in which rates of continuous-character evolution depend on the state of the discrete character under consideration). However, these approaches are susceptible to a “straw-man” effect: the influence of the discrete character is inflated because the null model is extremely unrealistic. Here, we describe MuSSCRat, a Bayesian approach for inferring the impact of a discrete trait on rates of continuous-character evolution in the presence of alternative sources of rate variation (“background-rate variation”). We demonstrate by simulation that our method is able to reliably infer the degree of state-dependent rate variation, and show that ignoring background-rate variation leads to biased inferences regarding the degree of state-dependent rate variation in grunts (the fish group Haemulidae). [Bayesian phylogenetic comparative methods; continuous-character evolution; data augmentation; discrete-character evolution.]


Variable rates of continuous-character evolution are central to many evolutionary questions. These questions may involve changes in the rate of character evolution over time (time-dependent scenarios) or among lineages (lineage-specific scenarios). Such questions may be pursued by means of agnostic surveys to detect rate variation (data-exploration approaches; Harmon et al. 2010; Eastman et al. 2011; Venditti et al. 2011) or by testing predictions regarding factors hypothesized to influence rates of character evolution (hypothesis-testing approaches; O’Meara et al. 2006; Collar et al. 2009). A particular type of hypothesis posits that the rate of continuous-character evolution depends on the state of a discrete trait, for example, the evolutionary rate of the feeding apparatus (a set of continuous traits) in a lineage depends on the habitat type (the discrete trait) of its members.

Testing hypotheses regarding state-dependent rates of continuous-character evolution is currently pursued using a computational procedure (e.g., Collar et al. 2009, 2010; Price et al. 2011, 2013) comprising four steps: 1) fit a Brownian motion model to the observations (the tree and continuous-trait values at its tips), where the rate of continuous-character evolution is assumed to be constant across all branches of the tree (the “null” or constant-rate model); 2) generate a sample of discrete-character histories (“stochastic maps”; Nielsen 2002; Huelsenbeck et al. 2003); 3) for each stochastic map, fit a Brownian motion model to the observations, where the instantaneous rate of continuous-character evolution at a given point on a given branch depends on the corresponding state of the discrete-character mapping (the “state-dependent model”; O’Meara et al. 2006); and 4) compare the fit of the state-dependent model (averaged over the sample of stochastic maps) to the constant-rate model using the Akaike Information Criterion (AIC). If the state-dependent model is preferred, we infer that rates of continuous-character evolution are correlated with the state of the discrete character.

The current approach has two potential problems. First, stochastic maps of the discrete character are generated without reference to the continuous characters. By construction, however, the state-dependent model specifies that the discrete and continuous characters are evolving jointly. The continuous characters therefore possess information about the history of the discrete character; disregarding this mutual information will lead to biased parameter estimates (Revell 2012). Second, the null model, in which the continuous characters are assumed to evolve at a constant rate across lineages, is extremely unrealistic. Any variation in the rate of continuous-character evolution (whether or not it is associated with the discrete character under consideration) is apt to be interpreted as evidence against the overly simplistic null model. This “straw-man effect” has the potential to mislead our inferences regarding the factors that impact rates of continuous-character evolution.

We describe a Bayesian approach for inferring the impact of a discrete trait on rates of continuous-character evolution that addresses the problems described above. We begin by developing a stochastic process that explicitly models the joint evolution of the discrete and continuous characters; this stochastic process can accommodate one or more continuous characters evolving under a state-dependent multivariate Brownian motion process. We refer to this new model as MuSSCRat (for Multiple State-Specific Rates of continuous-character evolution). We then develop an inference model that accommodates variation in the background rate of continuous-character evolution (i.e., rate variation across lineages that is independent of the discrete character under consideration). We implement this model in a Bayesian framework, which accommodates uncertainty in the phylogeny, discrete-character history, and parameters of the state-dependent model. We show by simulation that the method is able to reliably infer the state-dependent rates of continuous-character evolution, and that ignoring background-rate variation leads to an inflated false-positive rate. Finally, we demonstrate the new method with an empirical analysis of grunts (a group of haemulid fish) to illustrate the impacts of background-rate variation and prior specification on inferences about state-dependent rate variation.

Methods

Our goal is to develop a state-dependent multivariate Brownian motion model, MuSSCRat, in a Bayesian statistical framework. We begin with a simple simulation to describe the parameters and basic properties of the MuSSCRat model. We then show how to calculate the probability of observing the discrete and continuous characters across the tips of a phylogeny under the model (i.e., how to compute the likelihood). Finally, we describe the relevant details—the priors and Markov chain Monte Carlo (MCMC) machinery—required to perform Bayesian inference under the MuSSCRat model.

The State-Dependent Multivariate Brownian Motion Process

A simulation example

We introduce the salient properties of the MuSSCRat model by describing a simple simulation with a binary discrete character, Inline graphic, and a single continuous character, Inline graphic, over a single branch. The discrete character has two states, 0 and 1; the continuous character can be any real number. The state of the simulation at time Inline graphic is the pair of discrete- and continuous-character values, Inline graphic. (We use capital letters—Inline graphic and Inline graphic—to represent random variables, and lowercase letters—Inline graphic and Inline graphic—to represent specific values of those random variables.)

The discrete trait evolves under a continuous-time Markov process, changing from state 0 to state 1 with rate Inline graphic, and from state 1 to state 0 with rate Inline graphic. The continuous character evolves under a state-dependent Brownian motion process, where the diffusion rate, Inline graphic, measures the rate of continuous-character evolution when the discrete character is in state Inline graphic (i.e., Inline graphic indicates that the continuous character evolves faster in discrete state 1 than in discrete state 0). In a small time interval of duration Inline graphic, where the discrete character begins in state Inline graphic, the continuous character changes by a normally distributed random variable with mean 0 and variance Inline graphic, and the discrete character changes state with probability Inline graphic.

We begin the simulation at time Inline graphic, with the discrete character in state Inline graphic and the continuous character with value Inline graphic. We then increment the simulation forward in time by a small time interval, Inline graphic, applying the above rules describing how the state of the process changes during each time interval. We continue to increment the simulation forward in time until we reach the end of the branch (at time Inline graphic). The outcome of the simulation is a sample path that records the state of the process from the beginning to the end of the branch (Fig. 1a).
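The stepping rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' simulation code: the function name `simulate_path`, the step size, and the numerical rate values are all hypothetical choices made for the example (the values used for Figure 1 are not recoverable from the text).

```python
import numpy as np

def simulate_path(q01, q10, rates, y0, x0, t_end, dt, rng):
    """Discretized simulation of one sample path along a single branch.

    q01, q10 : discrete-character transition rates (0 -> 1 and 1 -> 0)
    rates    : rates[y] is the diffusion rate while in discrete state y
    Returns arrays of times, discrete states, and continuous values.
    """
    n_steps = int(round(t_end / dt))
    times = np.linspace(0.0, n_steps * dt, n_steps + 1)
    y = np.empty(n_steps + 1, dtype=int)
    x = np.empty(n_steps + 1)
    y[0], x[0] = y0, x0
    for k in range(n_steps):
        state = y[k]
        # Continuous change: normal with mean 0 and variance rate * dt.
        x[k + 1] = x[k] + rng.normal(0.0, np.sqrt(rates[state] * dt))
        # Discrete change: switch with probability (rate of leaving) * dt.
        leave_rate = q01 if state == 0 else q10
        y[k + 1] = 1 - state if rng.random() < leave_rate * dt else state
    return times, y, x

# Hypothetical parameter values, chosen only for illustration.
rng = np.random.default_rng(1)
times, y, x = simulate_path(q01=1.0, q10=1.0, rates=(0.5, 2.0),
                            y0=0, x0=0.0, t_end=1.0, dt=1e-3, rng=rng)
```

The approximation improves as `dt` shrinks, because the switching probability `leave_rate * dt` is only accurate to first order in the step size.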

Figure 1.

Simulations under the state-dependent multivariate Brownian motion process. The process is either in discrete state 0 (blue) or discrete state 1 (orange), where the rate of change between discrete states is equal (Inline graphic), and the rate of continuous-character evolution is higher when the process is in the orange state (Inline graphic, Inline graphic). A) A single sample path from Inline graphic to Inline graphic. The process begins and ends in the blue state, but spends some time in the orange state. Note that there is more evolution in the orange state than in the blue state. B) The distribution of end states for 10 million simulated realizations. Solid lines represent the simulated joint probability densities of the discrete and continuous states. Dashed lines represent the normal densities with parameters estimated from the simulated end states. Note that the simulated densities depart from the normal densities (both Kolmogorov–Smirnov Inline graphic

The transition-probability density specifies the probability that the process ends in some state, Inline graphic, given an initial state Inline graphic, after a certain amount of time, Inline graphic, has elapsed. The resulting frequency histogram of end states provides a Monte Carlo approximation of the transition-probability density for a branch of duration Inline graphic. Note that the transition-probability densities of standard (i.e., state-independent) Brownian motion processes are normal densities. By contrast, it is clear from our simulations that the transition-probability densities under the state-dependent process are not normal densities (Fig. 1b).
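One way to see the departure from normality is to note that, conditional on the time spent in each discrete state, the end value is normal, so the marginal end-state distribution is a variance mixture of normals and has positive excess kurtosis. The sketch below approximates this by Monte Carlo. It simulates the discrete character exactly (exponential waiting times) instead of using small time steps; the function names and all parameter values are hypothetical.

```python
import numpy as np

def time_in_state_one(q01, q10, y0, t_end, rng):
    """Total time spent in state 1 on a branch, plus the end state."""
    t, state, time_one = 0.0, y0, 0.0
    while True:
        wait = rng.exponential(1.0 / (q01 if state == 0 else q10))
        if t + wait >= t_end:            # no further change before the end
            if state == 1:
                time_one += t_end - t
            return time_one, state
        if state == 1:
            time_one += wait
        t += wait
        state = 1 - state

def simulate_end_states(n, q01, q10, rates, y0, x0, t_end, rng):
    """Monte Carlo sample of (discrete, continuous) end states."""
    y_end = np.empty(n, dtype=int)
    x_end = np.empty(n)
    for i in range(n):
        t1, y_end[i] = time_in_state_one(q01, q10, y0, t_end, rng)
        # Given the history, the change is normal with summed variance.
        variance = rates[0] * (t_end - t1) + rates[1] * t1
        x_end[i] = x0 + rng.normal(0.0, np.sqrt(variance))
    return y_end, x_end

def excess_kurtosis(v):
    z = (v - v.mean()) / v.std()
    return float((z ** 4).mean() - 3.0)   # 0 for a normal distribution

rng = np.random.default_rng(2)
y_end, x_end = simulate_end_states(20000, 1.0, 1.0, (0.1, 4.0), 0, 0.0, 1.0, rng)
kurt = excess_kurtosis(x_end)
print(f"excess kurtosis of end values: {kurt:.2f}")
```

A histogram of `x_end`, split by `y_end`, gives the kind of Monte Carlo approximation to the transition-probability density described above.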

Parameters of the MuSSCRat model

We model the joint evolution of a discrete binary character, Inline graphic, and a set of Inline graphic continuous characters, Inline graphic, as a stochastic process. The discrete trait has two possible values, which we arbitrarily label Inline graphic and Inline graphic; Inline graphic. The continuous characters are a vector of real-valued random variables, where Inline graphic is the value of the Inline graphic continuous character (Inline graphic). Variables of the MuSSCRat model are summarized in Table 1.

Table 1.

The variables of the MuSSCRat model and their interpretation.

Variable Interpretation
Inline graphic The continuous characters for one lineage
Inline graphic The state of the continuous characters at time Inline graphic
Inline graphic The number of continuous characters
Inline graphic An Inline graphic matrix containing the Inline graphic continuous characters for the Inline graphic species in the tree
Inline graphic The discrete character for one lineage
Inline graphic The state of the discrete character at time Inline graphic
Inline graphic An Inline graphic matrix containing the discrete character for the Inline graphic species in the tree
Inline graphic The complete history of the discrete character (the state at the beginning and end of the process, and all of the changes in between) along the lineage Inline graphic
Inline graphic The instantaneous-rate matrix of the discrete-character CTMC model
Inline graphic The instantaneous rate of change from state Inline graphic to state Inline graphic
Inline graphic The background rate of continuous-character evolution among all lineages
Inline graphic The background rate of continuous-character evolution for lineage Inline graphic
Inline graphic The relative rates of continuous-character evolution for each continuous character
Inline graphic The relative rates of continuous-character evolution for each discrete state
Inline graphic The evolutionary correlation matrix
Inline graphic The evolutionary correlation between continuous characters Inline graphic and Inline graphic
Inline graphic The discrete-state-independent evolutionary variance–covariance matrix

We assume that the discrete character evolves under a continuous-time Markov process, and that the continuous characters evolve under a multivariate Brownian motion process with rates that depend on the state of the binary character. These model components collectively describe how the set of characters, Inline graphic, evolve together over a single branch; we detail the evolutionary dynamics of this process over an entire tree when we describe the likelihood function.

The instantaneous-rate matrix, Inline graphic, describes the rates at which the binary character evolves: Inline graphic describes the instantaneous rate of change from state Inline graphic to state Inline graphic:

graphic file with name M77.gif

The complete history of the discrete trait on branch Inline graphic—which we represent as Inline graphic—specifies the state of the character at the beginning and end of the branch, and also the times of any state changes along the branch.

While the process is in a particular discrete state, we assume the continuous characters evolve under a multivariate Brownian motion model. We allow the background rate of evolution to vary among lineages in the phylogeny by letting each branch have its own rate parameter, Inline graphic (we call this “background-rate variation”). While in discrete state Inline graphic, the background rate is multiplied by the state-specific relative rate, Inline graphic. We also allow the relative rate of evolution to vary among the continuous characters. The vector Inline graphic contains these relative rates; Inline graphic is the relative rate at which continuous character Inline graphic evolves. The evolutionary correlations between characters are contained in the Inline graphic symmetric correlation matrix, Inline graphic, where Inline graphic specifies the correlation between characters Inline graphic and Inline graphic.

We assume that the relative rates of change between characters, Inline graphic, and the evolutionary correlations between characters, Inline graphic, are independent of the discrete state; in other words, we assume that the state of the discrete trait affects only the overall rate of continuous-character evolution, but not the nature of the evolutionary process (as represented by Inline graphic and Inline graphic). We combine the relative rates among characters and the correlation matrix to form the overall evolutionary variance-covariance matrix, Inline graphic:

graphic file with name M96.gif
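As a concrete illustration of combining relative rates and correlations, the sketch below builds a variance-covariance matrix in Python. It assumes that each relative rate plays the role of a variance, so that the covariance between two characters is their correlation multiplied by the square roots of their relative rates; this form and the name `build_sigma` are assumptions, since the equation itself is not reproduced here.

```python
import numpy as np

def build_sigma(relative_rates, corr):
    """Variance-covariance matrix from relative rates and a correlation matrix.

    Assumes relative_rates[i] is the variance of character i, so that
    Sigma[i, j] = corr[i, j] * sqrt(relative_rates[i] * relative_rates[j]).
    """
    scale = np.sqrt(np.asarray(relative_rates, dtype=float))
    return np.outer(scale, scale) * np.asarray(corr, dtype=float)

# Hypothetical values for two continuous characters.
zeta2 = np.array([0.5, 1.5])
R = np.array([[1.0, 0.3],
              [0.3, 1.0]])
Sigma = build_sigma(zeta2, R)
```

Under this assumption the diagonal of `Sigma` recovers the relative rates, and the off-diagonal elements inherit the sign of the corresponding correlations.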

Bayesian Inference

We implement the MuSSCRat model with background-rate variation as a Bayesian model to infer the joint posterior density of the parameters given the observed data. We must specify both the likelihood function and also the joint prior density to compute the joint posterior distribution. We describe each of these components below.

Data

We imagine that we have sampled one discrete character and Inline graphic continuous characters for each of Inline graphic species; relationships among these species are defined by the phylogeny, Inline graphic. We store the discrete characters in an Inline graphic column vector, Inline graphic, and the continuous characters in an Inline graphic matrix, Inline graphic. We assume that the discrete and continuous characters evolve independently along each of the Inline graphic branches of the tree. We index the internal nodes according to their sequence in a post-order traversal of the tree, starting from the root (which has index Inline graphic).

Likelihood function

We simplify likelihood calculations by including the vector of character histories, Inline graphic, where Inline graphic is the discrete-character history along branch Inline graphic (including the state at the beginning and end of the branch), as variables in the model. In effect, we are “augmenting” the discrete-character data observed at the tips of the tree with unobserved discrete-character histories over the entire tree; this technique is referred to as data augmentation (Tanner and Wong 1987; Robinson et al. 2003; Mateiu and Rannala 2006; Lartillot 2006; Landis et al. 2013).

The augmented likelihood is a product of the joint probability of Inline graphic and the conditional probability density of Inline graphic given Inline graphic:

graphic file with name M112.gif

where Inline graphic are the parameters of the MuSSCRat model. We compute the joint probability of Inline graphic as a product of independent probabilities across each of the Inline graphic branches:

graphic file with name M116.gif

where Inline graphic is an indicator function that ensures that character histories are consistent across ancestor-descendant branches (i.e., that the state at the end of one branch matches the state at the beginning of its descendant branches), and ensures that character histories for terminal branches end in the observed state. We compute Inline graphic as illustrated in Figure 2. By convention, we use the stationary distribution of the Markov chain defined by Inline graphic as the probability of the root state, Inline graphic.

Figure 2.

Computing the probability of a character history, Inline graphic. Blue and orange segments correspond to discrete states 0 and 1, respectively. The probability of the history is the product of the probabilities of waiting times between events (or the probability of no event in the final segment) given the current rate of change.
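The calculation described in Figure 2 can be sketched for a binary character as follows. The representation of a history as a list of `(state, duration)` segments and the function name are hypothetical; the result is the log probability density of the history conditional on its starting state.

```python
import math

def log_prob_history(segments, q01, q10):
    """Log probability density of a binary-character history on one branch.

    segments : list of (state, duration) pairs in temporal order
    Each completed segment contributes an exponential waiting-time density;
    the final segment contributes the probability of no further event.
    """
    leave = {0: q01, 1: q10}
    logp = 0.0
    for k, (state, duration) in enumerate(segments):
        rate = leave[state]
        logp += -rate * duration            # no event during this segment
        if k < len(segments) - 1:
            logp += math.log(rate)          # an event ends this segment
    return logp

# Hypothetical history: state 0 for 0.4, state 1 for 0.3, state 0 for 0.3.
history = [(0, 0.4), (1, 0.3), (0, 0.3)]
lp = log_prob_history(history, q01=1.0, q10=2.0)
```

With only two states, each change has a single possible destination, so no additional term is needed for the choice of the new state.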

To calculate the conditional probability density of the continuous characters, we first consider how the continuous characters, Inline graphic, evolve along a single branch of length Inline graphic when the history of the discrete character along that branch is known. Given that the discrete character is in state Inline graphic for duration Inline graphic, changes in the continuous character follow a multivariate normal distribution with mean Inline graphic and variance-covariance matrix Inline graphic. This implies that the changes in Inline graphic while in discrete state Inline graphic, Inline graphic, are multivariate-normally distributed:

graphic file with name M131.gif

where Inline graphic is the amount of time the history spends in discrete state Inline graphic and Inline graphic is a Inline graphic vector of zeros (indicating that the expected amount of change for each continuous character is 0). Because it is the sum of multivariate-normally distributed random variables, Inline graphic is also multivariate-normally distributed:

graphic file with name M137.gif

where Inline graphic is the branch-specific variance-covariance matrix given the discrete-character history and the background rate of evolution for the branch, Inline graphic.
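A minimal sketch of the branch-specific variance-covariance matrix follows. It assumes, based on the verbal description above, that the matrix equals the branch's background rate multiplied by the sum over discrete states of (state-specific relative rate × time spent in that state), multiplied by the shared matrix of evolutionary variances and covariances. The function and argument names are hypothetical.

```python
import numpy as np

def branch_covariance(segments, background_rate, state_rates, Sigma):
    """Variance-covariance of continuous-character change along one branch.

    segments        : list of (state, duration) pairs (the discrete history)
    background_rate : background rate for this branch
    state_rates     : state_rates[j] is the relative rate in discrete state j
    Sigma           : state-independent variance-covariance matrix
    """
    dwell = np.zeros(len(state_rates))
    for state, duration in segments:
        dwell[state] += duration            # total time spent in each state
    scale = background_rate * float(np.dot(state_rates, dwell))
    return scale * np.asarray(Sigma, dtype=float)

# Hypothetical values: 0.7 time units in state 0 and 0.3 in state 1.
history = [(0, 0.4), (1, 0.3), (0, 0.3)]
V = branch_covariance(history, background_rate=2.0,
                      state_rates=np.array([0.5, 1.5]), Sigma=np.eye(2))
```

Only the total time in each state matters here, not the order of the changes, because the increments in different segments are independent and their covariances add.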

Because changes to the continuous characters follow a multivariate normal distribution, we can compute the conditional probability density of the continuous characters, Inline graphic, using standard algorithms to integrate over the distributions of the states at internal nodes. Specifically, we use Felsenstein’s REML algorithm (Felsenstein 1973, 2004), extended to multivariate Brownian motion (Huelsenbeck and Rannala 2003; Freckleton 2012), to compute the conditional probability density of Inline graphic. This algorithm assumes a uniform prior over all possible continuous states at the root.
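To make the pruning step concrete, the following is a univariate sketch of a restricted-likelihood calculation based on independent contrasts. It is not the multivariate implementation used in the article: it handles a single continuous character on a bifurcating tree, and the nested-dictionary tree format is hypothetical. Each branch carries a variance `v`, which under the model above would already be rescaled by the discrete-character history and the background rate.

```python
import math

def prune(node):
    """Return (value, variance, loglik) for the subtree rooted at `node`.

    A node is a dict with "v" (variance accumulated on the branch above it)
    and either "x" (an observed tip value) or "children" (two child nodes).
    """
    if "x" in node:
        return node["x"], node["v"], 0.0
    (xl, vl, ll), (xr, vr, lr) = (prune(child) for child in node["children"])
    v_sum = vl + vr
    # The contrast xl - xr is normal with mean 0 and variance vl + vr.
    contrast_ll = -0.5 * (math.log(2.0 * math.pi * v_sum)
                          + (xl - xr) ** 2 / v_sum)
    # Precision-weighted value at this node, and the variance passed upward.
    x_node = (xl * vr + xr * vl) / v_sum
    v_node = node["v"] + vl * vr / v_sum
    return x_node, v_node, ll + lr + contrast_ll

# Hypothetical three-tip tree ((A, B), C); the root has no branch above it.
tree = {"v": 0.0, "children": [
    {"v": 1.0, "children": [{"x": 1.0, "v": 1.0}, {"x": 3.0, "v": 1.0}]},
    {"x": 0.0, "v": 2.0},
]}
_, _, loglik = prune(tree)
```

Because the likelihood is built from contrasts alone, the root value never enters the calculation, which corresponds to the flat prior on the root state mentioned above.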

Incorporating background-rate variation in the MuSSCRat model does not complicate the computation of the augmented likelihood, since it simply “rescales” the variance-covariance matrices on each branch. However, including background-rate variation does complicate inference; specifically, it causes the MuSSCRat model to become nonidentifiable (see Supplementary Material available on Dryad at http://dx.doi.org/10.5061/dryad.499c4j2). A model is nonidentifiable when multiple combinations of parameters have identical likelihoods (Rannala 2002). Consequently, parameters of a nonidentifiable model cannot be estimated by standard maximum-likelihood methods because there may be no unique “maximum” likelihood. In this case, it is necessary to apply constraints on nonidentifiable parameters that “penalize” different combinations of parameters that have identical likelihoods. Bayesian models provide a natural solution to nonidentifiability, as the prior distributions on the parameters naturally penalize combinations of parameters that might have identical likelihoods (i.e., the joint posterior probability of parameter combinations with identical likelihoods will differ if their joint prior probabilities are different). We describe our assumed prior distribution on background rates in the next section.

Priors

We assume that the background-rate parameters, Inline graphic, are drawn from a hierarchical model with parameters Inline graphic and Inline graphic, and the remainder of the parameters are drawn from independent prior distributions, so that the joint prior density becomes:

graphic file with name M145.gif

We describe our prior distributions in the following paragraphs; these parameterizations reflect our “baseline” model, but we explore alternative priors and prior sensitivity in our empirical analyses.

We draw the lineage-specific background rates of continuous-character evolution, Inline graphic, iid from a shared lognormal distribution with mean Inline graphic and standard deviation Inline graphic. We use a uniform prior on Inline graphic, such that Inline graphic is drawn from a logInline graphic-uniform distribution between Inline graphic and Inline graphic. We draw the standard deviation, Inline graphic, from an exponential distribution with mean Inline graphic. The constant Inline graphic is the standard deviation for a lognormal distribution that indicates that our Inline graphic prior belief ranges over one order of magnitude (see Supplementary Material available on Dryad). This model is the continuous-character analog of the uncorrelated lognormal (UCLN) relaxed-clock model used to describe variation in rates of molecular evolution across lineages (e.g., Drummond et al. 2006; Lemey et al. 2010). Accordingly, we refer to this extension of the MuSSCRat model with background-rate variation as the MuSSCRat + UCLN model. A convenient property of the UCLN model is that—as Inline graphic shrinks to 0—it collapses to a “strict” morphological-clock model, where Inline graphic for all lineages. Our prior on Inline graphic specifies that we expect the values of Inline graphic to range over about one order of magnitude, but the exponential prior allows the standard deviation to shrink to 0 if the data prefer a strict morphological clock. In summary, we specify the background-rate-variation component of the prior model as:

graphic file with name M162.gif
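The hierarchical prior on background rates can be illustrated by drawing from it. The sketch below rests on several assumptions that are not confirmed by the text: the interval that spans one order of magnitude is taken to be the central 95% interval, the lognormal "mean" is taken to be the mean on the natural scale, and the bounds of the log-uniform prior are hypothetical placeholders.

```python
import math
import numpy as np

# Standard deviation of a lognormal whose central 95% interval spans one
# order of magnitude: exp(2 * 1.96 * H) = 10, so H = ln(10) / (2 * 1.96).
H = math.log(10.0) / (2.0 * 1.959964)

def sample_background_rates(n_branches, lower, upper, rng):
    """Draw (mean rate, lognormal sd, branch rates) from the prior sketch."""
    # Mean background rate: log-uniform between hypothetical bounds.
    mean_rate = math.exp(rng.uniform(math.log(lower), math.log(upper)))
    # Standard deviation of the lognormal: exponential with mean H.
    sd = rng.exponential(H)
    # Location chosen so the natural-scale mean equals mean_rate (assumption).
    mu = math.log(mean_rate) - 0.5 * sd ** 2
    branch_rates = rng.lognormal(mean=mu, sigma=sd, size=n_branches)
    return mean_rate, sd, branch_rates

rng = np.random.default_rng(7)
mean_rate, sd, branch_rates = sample_background_rates(10, 1e-3, 1e1, rng)
```

As `sd` approaches 0, the sampled branch rates concentrate on `mean_rate`, which is the strict morphological-clock limit described above.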

The parameter vector Inline graphic describes the relative rate of continuous-character evolution for each of the discrete states. We specify a Dirichlet distribution on half the values of Inline graphic. Specifying the prior on half the values of Inline graphic ensures that the mean value of Inline graphic is 1, which allows us to interpret these parameters as the relative rate of continuous-character evolution in the alternative discrete states. We assume the concentration parameters of the Dirichlet distribution are the same, so this is a symmetric Dirichlet distribution with parameter Inline graphic:

graphic file with name M168.gif

The average rate of change for each of the Inline graphic continuous characters may vary; we allow the relative rate of continuous characters to vary by including a parameter vector, Inline graphic, where Inline graphic is the rate of evolution of the Inline graphic continuous character. We specify a Dirichlet distribution on Inline graphic of the values of Inline graphic. We adopt the same logic as above for the prior on Inline graphic, specifying a symmetric Dirichlet distribution with parameter Inline graphic such that the mean value of Inline graphic is 1:

graphic file with name M178.gif
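Both relative-rate priors use the same device: a symmetric Dirichlet draw, which sums to 1, is multiplied by the number of elements so that the relative rates average 1. A minimal sketch (the function name and the concentration value are hypothetical):

```python
import numpy as np

def sample_relative_rates(k, alpha, rng):
    """Draw k relative rates from a scaled symmetric Dirichlet.

    The Dirichlet draw sums to 1; multiplying by k makes the rates sum to k,
    so their average is exactly 1.
    """
    return k * rng.dirichlet(np.full(k, alpha))

rng = np.random.default_rng(3)
state_rates = sample_relative_rates(2, 1.0, rng)       # two discrete states
character_rates = sample_relative_rates(4, 1.0, rng)   # four continuous characters
```

With two discrete states, each relative rate lies between 0 and 2, and the ratio of the two rates measures the strength of state dependence.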

The symmetric matrix Inline graphic determines the evolutionary correlation between each pair of continuous characters; Inline graphic is the correlation between characters Inline graphic and Inline graphic. The matrix Inline graphic has a special constraint (it must be positive semidefinite) that makes it difficult to specify, for example, iid priors on each matrix element, Inline graphic. We use the LKJ distribution as a prior on Inline graphic, which defines a prior over positive-semidefinite correlation matrices (Lewandowski et al. 2009). Correlation matrices drawn from this distribution have prior density:

graphic file with name M186.gif

where Inline graphic is inversely related to the variance of the correlation parameters: larger values of Inline graphic result in marginal distributions on Inline graphic that are concentrated closer to 0, while smaller values of Inline graphic result in distributions that are more diffuse. We choose Inline graphic, which indicates a uniform distribution over all possible positive-semidefinite correlation matrices:

graphic file with name M192.gif
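The role of the LKJ concentration parameter can be checked numerically. The LKJ density is proportional to the determinant of the correlation matrix raised to the power (concentration − 1), so the sketch below evaluates only that unnormalized term; the function name is hypothetical and the normalizing constant is omitted.

```python
import numpy as np

def lkj_log_density_unnormalized(R, eta):
    """Unnormalized LKJ log density: (eta - 1) * log det(R)."""
    sign, logdet = np.linalg.slogdet(np.asarray(R, dtype=float))
    if sign <= 0:
        return -np.inf          # not a valid (positive-definite) matrix
    return (eta - 1.0) * logdet

R_corr = np.array([[1.0, 0.5],
                   [0.5, 1.0]])
flat = lkj_log_density_unnormalized(R_corr, eta=1.0)
peaked = lkj_log_density_unnormalized(R_corr, eta=2.0)
```

With a concentration of 1 the term is 0 for every valid matrix (a uniform density), whereas larger values penalize matrices with strong correlations, whose determinants are closer to 0.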

Finally, the matrix Inline graphic describes the rates of change between the discrete-character states. We parameterize the stationary frequencies, Inline graphic, and the average rate of change, Inline graphic. We build Inline graphic from these parameters as follows:

graphic file with name M197.gif

where the diagonal elements are specified so that the sum of each row is 0, and the scalar Inline graphic is an arbitrary value that guarantees that the expected number of transitions over a tree of length Inline graphic is Inline graphic.
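One common way to build a rate matrix from stationary frequencies and an average rate is sketched below. This particular construction is an assumption (the article's equation is not reproduced here): off-diagonal rates are made proportional to the stationary frequency of the destination state, and the matrix is rescaled so that the expected rate of change at stationarity equals the requested value.

```python
import numpy as np

def build_q(pi, rate):
    """Rate matrix with stationary distribution pi and a given average rate."""
    pi = np.asarray(pi, dtype=float)
    k = len(pi)
    Q = np.tile(pi, (k, 1))                 # q_ij proportional to pi_j
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))     # rows sum to zero
    mean_rate = -np.dot(pi, np.diag(Q))     # expected rate at stationarity
    return Q * (rate / mean_rate)

# Hypothetical values: asymmetric frequencies, average rate of 5.
pi = np.array([0.3, 0.7])
Q = build_q(pi, rate=5.0)
```

Under this construction, the expected number of transitions over a tree of total length T, starting from stationarity, is `rate * T`.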

Assuming that the rates of change are symmetric (Inline graphic) or asymmetric (Inline graphic) may have some impact on our analysis through the distribution on Inline graphic. Moreover, inferring whether rates of change are (a)symmetric is often of direct interest to researchers studying discrete-character evolution. We therefore specify a mixture distribution on Inline graphic, so that Inline graphic may be symmetric or asymmetric. Specifically, we draw Inline graphic from a degenerate distribution concentrated on equal rates, Inline graphic, with probability Inline graphic, and from a Dirichlet distribution with parameter Inline graphic with probability Inline graphic. We draw Inline graphic from a lognormal prior distribution with standard deviation Inline graphic, and specify the mean such that the expected number of transitions over the entire phylogeny is Inline graphic. The prior expected number of transitions reflects an empirical prior, and should be specified differently for different data sets; for the simulations and analyses we describe later, we use Inline graphic. Our overall prior on Inline graphic is:

graphic file with name M216.gif

Posterior

Having specified the augmented likelihood function and the joint prior density, we can write down the joint posterior density of the model parameters and the discrete-character histories:

graphic file with name M217.gif (1)

where the first two terms on the right-hand side are the augmented likelihood, and the third term is the joint prior density on the model parameters.

The above Bayesian model conditions on a known tree, Inline graphic. It is straightforward to relax the assumption of a fixed tree by including it as a parameter in the model. In this case, we may include a sequence alignment and specify a substitution model, and jointly infer the phylogeny and the parameters of the MuSSCRat and substitution models.

Constant background rates

The MuSSCRat model with constant-background rates is nested within the model with background-rate variation described above: as Inline graphic, the lognormal prior on Inline graphic collapses to a point centered on Inline graphic (so that all values of Inline graphic for all branches become increasingly similar).

To specify the constant-background-rate model explicitly, we draw a single value for Inline graphic from a logInline graphic-uniform distribution between Inline graphic and Inline graphic. Otherwise, we use the same prior distributions for the constant-background-rate model as we described for the variable-background-rate model, above.

Markov chain Monte Carlo

The joint posterior probability density cannot be calculated analytically because we cannot evaluate the normalizing constant of equation 1 (the marginal likelihood). We therefore approximate the joint posterior probability density numerically using MCMC; specifically, we draw samples from the joint posterior distribution using the Metropolis–Hastings and Green algorithm (Metropolis et al. 1953; Hastings 1970; Green 1995). We use standard proposal distributions for the majority of the model parameters; for brevity, we only provide details for two of our more uncommon proposal distributions—for moves between symmetric and asymmetric Inline graphic matrices, and for the discrete-character histories—in the Supplementary Material available on Dryad.

Our data-augmentation strategy involves including the complete history of the discrete character, Inline graphic, as a variable in the Markov chain. As such, the MCMC procedure includes proposals that change the discrete-character history. When a new character history, Inline graphic, is proposed, it is accepted with probability Inline graphic, computed as:

graphic file with name M231.gif

where Inline graphic is the distribution from which the new character history is drawn. Note that the probabilities of the discrete characters, Inline graphic, and continuous characters, Inline graphic, both contribute to the probability that the proposed discrete-character history is accepted. Importantly, this means that the continuous characters are able to correctly influence the discrete-character histories; that is, we are correctly modeling the joint distribution of the discrete and continuous characters.
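The acceptance step can be sketched on the log scale. This assumes the usual Metropolis–Hastings form, in which the ratio of target densities (here the probability of the history multiplied by the density of the continuous characters given that history) is multiplied by the ratio of proposal densities; the prior on the other parameters cancels because they are unchanged by the move. All argument names are hypothetical.

```python
import math

def log_acceptance(logp_hist_new, logp_cont_new, logq_new,
                   logp_hist_old, logp_cont_old, logq_old):
    """Log Metropolis-Hastings acceptance probability for a history proposal.

    logp_hist_* : log probability of the discrete-character history
    logp_cont_* : log density of the continuous characters given the history
    logq_*      : log proposal density of drawing that history
    """
    log_ratio = ((logp_hist_new + logp_cont_new)
                 - (logp_hist_old + logp_cont_old)
                 + (logq_old - logq_new))
    return min(0.0, log_ratio)

def accept(log_alpha, u):
    """Accept when a uniform(0, 1) draw u satisfies log(u) < log_alpha."""
    return math.log(u) < log_alpha

# Hypothetical log densities for a current and a proposed history.
la = log_acceptance(-3.0, -10.0, -2.0, -2.5, -9.5, -2.0)
```

Note that `logp_cont_new` depends on the proposed history, which is how the continuous characters influence which discrete-character histories are retained.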

Implementation

We implemented our MuSSCRat model in the open-source Bayesian phylogenetic software, RevBayes (Höhna et al. 2016). Our implementation relies upon the data-augmentation functionality developed in RevBayes by Michael J. Landis and Sebastian Höhna for discrete characters (unpublished), extended to accommodate phylogenetic uncertainty. Owing to the flexibility of RevBayes, our implementation allows users to explore the impact of binary or multistate discrete traits on rates of continuous-character evolution, provides tremendous flexibility for specifying priors, enables simultaneous inference of ancestral states for both discrete and continuous characters, and allows joint inference of the phylogeny, divergence times, and parameters of the MuSSCRat model. We provide Rev scripts for performing analyses under the MuSSCRat model in RevBayes (see Data Dryad repository http://doi.org/10.5061/dryad.499c4j2 and GitHub repository https://github.com/mikeryanmay/musscrat_supp_archive/releases/tag/1.1).

Statistical behavior

The MuSSCRat model has many parameters relative to the number of observations, Inline graphic and Inline graphic. It is therefore unclear how well this complex model can detect rate variation, or distinguish between state-dependent and background sources of rate variation. Accordingly, we performed a simulation study to characterize the statistical behavior of the state-dependent model. Specifically, we performed experiments to understand: 1) its ability to detect state-dependent rate variation in the absence of background-rate variation; 2) its ability to detect state-dependent rate variation in the presence of background-rate variation; 3) the cost of including background-rate variation in the model when background rates are actually constant; and 4) the consequences of assuming background rates are constant when they are actually variable.

For the following analyses, we approximated the joint posterior probability density by running two replicate MCMC simulations for each simulated data set. We performed MCMC diagnosis to ensure that the joint posterior density was adequately approximated. We provide details of the MCMC simulations and MCMC diagnoses in the Supplementary Material available on Dryad.

Measures of Performance

Frequentist properties

The frequentist interpretation of a Bayesian credible interval (CI) is that the true value of a parameter has a Inline graphic chance of being within the Inline graphic CI of its corresponding marginal posterior distribution (assuming the model is true, see Huelsenbeck and Rannala 2004). With this interpretation in mind, we assessed the frequentist properties of the Inline graphic CI inferred for our simulated data sets (assuming the conventional significance level, Inline graphic). We define the coverage probability as the frequency with which the true value of a parameter is contained in the Inline graphic CI, the false-positive rate as the frequency with which the true value is excluded from the Inline graphic CI (one minus the coverage probability), and the power as the frequency with which a state-independent model is excluded from the Inline graphic CI when the state-dependent model is true.
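These three summaries can be sketched in Python as follows. The sketch assumes that each simulated data set yields one CI stored as a (lower, upper) pair, and that the state-independent model corresponds to a single null value of the focal parameter (for example, a rate ratio of 1); the function names and the example intervals are hypothetical.

```python
def contains(interval, value):
    """True if value lies inside the closed interval (lower, upper)."""
    lower, upper = interval
    return lower <= value <= upper

def coverage_probability(intervals, true_value):
    """Fraction of CIs that contain the true parameter value."""
    hits = sum(contains(ci, true_value) for ci in intervals)
    return hits / len(intervals)

def false_positive_rate(intervals, true_value):
    """One minus the coverage probability."""
    return 1.0 - coverage_probability(intervals, true_value)

def power(intervals, null_value):
    """Fraction of CIs that exclude the state-independent (null) value.

    Only meaningful for data simulated under the state-dependent model.
    """
    misses = sum(not contains(ci, null_value) for ci in intervals)
    return misses / len(intervals)

# Hypothetical CIs from four simulated data sets.
example_intervals = [(0.5, 1.5), (1.1, 2.0), (0.8, 1.2), (0.2, 0.9)]
example_coverage = coverage_probability(example_intervals, true_value=1.0)
```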

Accuracy and bias

We assess the accuracy and bias of the posterior-mean estimate of the state-dependent rate for discrete state 1 using the percent-error statistic, defined as:

$$\text{percent error} = \frac{\text{estimated rate} - \text{true rate}}{\text{true rate}} \times 100\%$$

where the true rate is the true value of the state-dependent rate for discrete state 1, and the estimated rate is the mean of the corresponding marginal posterior distribution. Negative values indicate an underestimate; conversely, positive values indicate an overestimate.
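A minimal Python sketch of this statistic follows, assuming the conventional signed definition (estimate minus truth, divided by truth, times 100) so that underestimates are negative and overestimates are positive. The function names are hypothetical.

```python
def percent_error(estimate, truth):
    """Signed percent error of a posterior-mean estimate."""
    return 100.0 * (estimate - truth) / truth

def mean_percent_error(estimates, truth):
    """Average percent error across replicate simulated data sets."""
    errors = [percent_error(e, truth) for e in estimates]
    return sum(errors) / len(errors)
```

Averaging the signed errors across replicates gives a simple measure of bias, whereas the spread of the individual errors reflects accuracy.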

Simulation Experiments

Experiment 1: Constant background rates

We simulated data sets of different sizes (with Inline graphic continuous characters), over a variety of tree sizes (with Inline graphic species), and state-dependent rate-ratios, Inline graphic (where Inline graphic corresponds to the case when rates do not depend on the state of the discrete trait). For each combination of Inline graphic, Inline graphic, and Inline graphic, we simulated 100 trees under a constant-rate birth–death process with a speciation rate of 1 and an extinction rate of 0.5 using the R (R Core Team 2017) package TESS (Höhna et al. 2015); we then rescaled each tree to have a root height of 1. We simulated discrete-character histories under a symmetric continuous-time Markov chain with a rate specified such that the expected number of transitions was 5. We then drew correlation matrices from an LKJ distribution with Inline graphic, and relative rates for each continuous character from a symmetric Dirichlet distribution with Inline graphic. Finally, we simulated Inline graphic continuous characters under a multivariate Brownian motion model assuming a background rate of 1. We provide more details for the simulation procedure in the Supplementary Material available on Dryad.
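One step of this procedure, simulating a discrete-character history along a single branch, can be sketched in Python as follows. The sketch assumes a binary character under a symmetric continuous-time Markov chain and sets the transition rate to the expected number of transitions divided by the total tree length. The function names are hypothetical, and this is not the simulation code used in the study.

```python
import random

def transition_rate(expected_transitions, tree_length):
    """Rate giving the requested expected number of transitions over the tree."""
    return expected_transitions / tree_length

def simulate_branch_history(start_state, branch_length, rate, rng):
    """Simulate a binary symmetric CTMC along one branch.

    Returns a list of (state, duration) segments whose durations
    sum to branch_length.
    """
    segments = []
    state = start_state
    remaining = branch_length
    while True:
        # Waiting time to the next transition is exponentially distributed.
        wait = rng.expovariate(rate)
        if wait >= remaining:
            segments.append((state, remaining))
            return segments
        segments.append((state, wait))
        remaining -= wait
        state = 1 - state  # switch between states 0 and 1

rng = random.Random(1)
history = simulate_branch_history(0, 2.0, transition_rate(5, 10.0), rng)
```

The time spent in each state along a branch is what allows the rate of continuous-character evolution to differ between segments of the same branch.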

We analyzed each simulated data set in RevBayes under the MuSSCRat model, assuming the true phylogeny was known. We constrained the model so that the background rate was equal for each lineage (i.e., the constant-rate model), and excluded the standard deviation parameter, Inline graphic, from the model. We estimated the remaining model parameters from the simulated data using the priors described in the Bayesian inference section, above. Since the generating model and the inference model both exclude background-rate variation, this simulation scenario reflects the performance of the method when the model is correctly specified.

The false-positive rate for this simulation experiment was Inline graphic (i.e., when Inline graphic; Fig. 3, the top row of the top three panels), which is indistinguishable from the expected Inline graphic (two-tailed binomial test Inline graphic, Inline graphic). Next, we computed the power when rates of continuous-character evolution varied among discrete states (Inline graphic). The power ranged from Inline graphic to Inline graphic from the worst-case to the best-case scenarios. Predictably, power improved as the number of continuous characters and species increased; overall, the average power was Inline graphic (Fig. 3, top three panels, excluding the top row). The posterior-mean estimate of Inline graphic was slightly biased for small numbers of continuous characters (Inline graphic) but quickly converged to the true value as the number of characters and species increased (Fig. 4, top row of panels).
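The comparison of an observed false-positive rate against its expected value can be sketched with an exact two-tailed binomial test. The Python sketch below sums the probabilities of all outcomes that are no more likely than the observed one; the small tolerance for floating-point ties follows the convention used by R's binom.test. The function names are hypothetical, and the counts in the example (5 false positives in 100 simulations, against an expected rate of 0.05) are illustrative rather than taken from the study.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def two_tailed_binomial_test(k, n, p):
    """Exact two-tailed p-value for observing k successes in n trials."""
    observed = binom_pmf(k, n, p)
    # Sum all outcomes at most as probable as the observed one.
    threshold = observed * (1 + 1e-7)
    total = sum(prob for prob in (binom_pmf(i, n, p) for i in range(n + 1))
                if prob <= threshold)
    return min(1.0, total)

# Illustrative counts only: 5 false positives in 100 simulated data sets.
p_value = two_tailed_binomial_test(5, 100, 0.05)
```

A large p-value indicates that the observed frequency is statistically indistinguishable from the expected rate.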

Figure 3.

The frequency with which Inline graphic was excluded from the Inline graphic CI when background rates were constant (top row of panels) or variable (bottom row of panels). Each panel corresponds to simulations for a given number of species, Inline graphic. Within each panel, rows correspond to different degrees of state-dependent rate variation, Inline graphic, and columns correspond to different numbers of continuous characters, Inline graphic. Each cell represents the fraction of the Inline graphic CIs that exclude Inline graphic, colored according to the scale (at right).

Figure 4.

The percent error of the posterior-mean estimates of Inline graphic when background rates were constant (top row of panels) or variable (bottom row of panels). Each column of panels corresponds to simulations for a given number of species, Inline graphic. Within each panel, boxplots depict the distribution of percent error across 100 simulated data sets for each of the Inline graphic continuous characters (along the x-axis), colored by the true state-dependent rates, Inline graphic (see inset legend). Boxplots represent the middle Inline graphic (boxes) and the middle Inline graphic (whiskers) of simulations.

Experiment 2: Variable background rates

In this simulation scenario, we reused all of the simulated trees, discrete-character histories, correlation matrices, and relative-rate parameters describing the degree of variation among continuous characters from the first simulation experiment (with constant background rates). In this simulation, however, we simulated lineage-specific rates of continuous-character evolution by drawing the background rate for each lineage, Inline graphic, from a lognormal distribution with mean Inline graphic and standard deviation Inline graphic.
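Drawing lineage-specific background rates from a lognormal distribution can be sketched in Python as follows. The sketch assumes that the stated mean refers to the real (untransformed) scale and the standard deviation to the log scale, so the log-scale location is the log of the mean minus half the squared standard deviation. The function names and the values used in the example (mean 1, standard deviation 0.5) are illustrative assumptions, not the settings used in the study.

```python
import math
import random

def lognormal_mu(mean, sigma):
    """Log-scale location that gives the requested real-scale mean."""
    return math.log(mean) - 0.5 * sigma ** 2

def draw_background_rates(num_branches, mean, sigma, rng):
    """Draw one independent background rate per branch."""
    mu = lognormal_mu(mean, sigma)
    return [rng.lognormvariate(mu, sigma) for _ in range(num_branches)]

# Illustrative values only.
rng = random.Random(42)
rates = draw_background_rates(20000, mean=1.0, sigma=0.5, rng=rng)
```

Each branch receives an independent draw, which is the sense in which the rates are uncorrelated among lineages.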

For this scenario, we analyzed each simulated data set using the MuSSCRat model with background-rate variation, by allowing Inline graphic to vary among branches, as described in the Bayesian inference section (MuSSCRat + UCLN). Again, this simulation scenario reflects the performance of the method when the model is correctly specified, since the data-generating model and the inference model both allow background rates to vary among lineages.

The false-positive rate for this experiment was Inline graphic (Fig. 3, top row of bottom three panels), again indistinguishable from the expected Inline graphic (two-tailed binomial test Inline graphic, Inline graphic). The power was only slightly lower than that of experiment 1: the power was Inline graphic in the worst case, and Inline graphic in the best case. On average, the power was Inline graphic (Fig. 3, bottom three panels, excluding top row). Again, the posterior-mean estimate of Inline graphic was only modestly biased for analyses based on a small number of continuous characters (Fig. 4, bottom row of panels).

Experiment 3: Cost of background-rate variation

When background rates of continuous-character evolution are constant, we expect that including unnecessary parameters (i.e., to accommodate background-rate variation) in the inference model should decrease our ability to detect state-dependent rate variation. The goal of this simulation experiment is to understand the cost of accommodating background-rate variation when it is absent. To achieve this, we reused the data sets from Experiment 1 (simulated under constant background rates) with Inline graphic, Inline graphic, and Inline graphic, but analyzed these data sets under the MuSSCRat + UCLN model.

For Experiment 1, where we correctly assumed that background rates were constant, the coverage probability was Inline graphic (two-tailed binomial test Inline graphic, Inline graphic). By contrast, in this experiment, when we incorrectly assumed that background rates are variable, the coverage probability was Inline graphic (two-tailed binomial test Inline graphic, Inline graphic). Overall, the cost of accommodating background-rate variation when absent was therefore quite modest (Inline graphic). Additionally, the posterior distributions of the background-rate-variation parameter, Inline graphic, shrank strongly toward the true value, Inline graphic (the constant-background-rate model): the average posterior-mean estimate of Inline graphic across these simulations was Inline graphic (compared to a prior mean of Inline graphic; see Supplementary Material available on Dryad).

Experiment 4: Consequences of ignoring background-rate variation

When background rates of continuous-character evolution vary among lineages, we expect that excluding background-rate variation from the inference model may be positively misleading. The goal of this simulation experiment is to understand the consequences of failing to accommodate background-rate variation on inferences about state-dependent rates of continuous-character evolution. To achieve this, we reused the data sets from Experiment 2 (simulated under variable background rates), with Inline graphic, Inline graphic, and Inline graphic, but analyzed these data sets using the “constrained” MuSSCRat model (i.e., that assumes a constant background rate of evolution by forcing Inline graphic to be the same for all lineages).

For Experiment 2—where we correctly assumed that background rates are variable—the coverage probability was Inline graphic (two-tailed binomial test Inline graphic, Inline graphic). By contrast, in this experiment—when we incorrectly assumed that background rates are constant—the coverage probability decreased to Inline graphic (two-tailed binomial test Inline graphic, Inline graphic). This decreased coverage probability implies that we are very confident in the wrong answer about Inline graphic more often than we should be. For example, when state-dependent rates are truly equal (Inline graphic), we will incorrectly—but confidently—infer that state-dependent rates differ Inline graphic of the time.

Empirical analyses

Haemulids (grunts) are a group of percomorph fishes that have previously been used to explore state-dependent rates of continuous-character evolution (Price et al. 2013). Specifically, the hypothesis posits that—owing to the increased habitat complexity of reefs—the feeding apparatus (comprising several continuous traits) of reef-dwelling grunt species should evolve at a higher rate than that of their non-reef-dwelling relatives. We revisit this hypothesis by analyzing the haemulid data from Price et al. (2013) under the MuSSCRat model, using a phylogeny estimated from the more extensive molecular data set from Tavera et al. (2018).

Phylogenetic Analyses

We assembled a molecular data set by subsampling the alignments from Tavera et al. (2018) to include only the 49 species represented in our morphological data set. We estimated a chronogram under a partitioned substitution model assuming an uncorrelated lognormal branch-rate prior model and a sampled birth–death node-age prior model. We performed posterior-predictive tests to ensure that the substitution model provided an adequate description of the substitution process. We computed the maximum a posteriori (MAP) chronogram from the posterior distribution of sampled trees and conditioned on this tree in our comparative analyses. We provide details of these analyses in the Supplementary Material available on Dryad.

Comparative Analyses

We analyzed the continuous morphological data under the MuSSCRat model, with habitat type (reef/non-reef) as the discrete character. In these analyses, we conditioned on the MAP chronogram estimated above. We performed a series of analyses to understand: 1) the impact of including or excluding background-rate variation, and; 2) the sensitivity of posterior estimates to the specified priors. For the following analyses, we approximated the joint posterior density by running four replicate MCMC simulations for each analysis using RevBayes. Again, we provide details of the MCMC simulations and MCMC diagnoses in the Supplementary Material available on Dryad.

Character data

We used eight continuous morphological characters related to the feeding apparatus from Price et al. (2013); we included species that also had molecular sequence data from Tavera et al. (2018), resulting in a total of 49 species. The continuous characters include: 1) the mass of the adductor mandibulae muscle; 2) the length of the ascending process of the premaxilla; 3) the length of the longest gill raker; 4) the diameter of the eye; 5) the length of the buccal cavity; 6) the width of the buccal cavity; 7) the height of the head, and; 8) the length of the head. Rather than size correcting these characters, we included body size as an additional character (for a total of nine continuous characters). Following Price et al. (2013), we log-transformed each character before the analyses (and cube-rooted the adductor mass prior to log transformation). We used the habitat data from Price et al. (2013) to score each species for the binary discrete character; we coded non-reef-dwelling species and reef-dwelling species as states 0 and 1, respectively.
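The transformation described above can be sketched in Python as follows. The sketch assumes natural logarithms (the base is not specified here) and strictly positive measurements; the function name and the `is_mass` flag are hypothetical.

```python
import math

def transform_character(value, is_mass=False):
    """Log-transform a measurement, cube-rooting masses first.

    Cube-rooting a mass puts it on a linear scale comparable to the
    length measurements before the log transformation.
    """
    if value <= 0:
        raise ValueError("measurements must be strictly positive")
    if is_mass:
        value = value ** (1.0 / 3.0)
    return math.log(value)
```

For example, a (hypothetical) adductor mass of 8 units is cube-rooted to 2 before the logarithm is taken, whereas length measurements are log-transformed directly.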

Inferring state-dependent rates

To understand the impact of background-rate variation, we estimated the posterior distribution of the MuSSCRat model parameters with and without background-rate variation using the prior settings described in the Bayesian inference section.

The treatment of background-rate variation had a profound impact on both the habitat-specific rate of continuous-character evolution, and also on the inferred history of habitat evolution (Fig. 5). Under the MuSSCRat model without background-rate variation, we inferred that the feeding apparatus of reef-dwelling haemulids evolved Inline graphic times faster than that of their non-reef-dwelling relatives; under the MuSSCRat + UCLN model, we inferred a Inline graphic-fold increase in the evolutionary rate of reef-dwelling species (Inline graphic CIs Inline graphic and Inline graphic, respectively).

Figure 5.

At left, the posterior densities (curves) and the Inline graphic CI (shaded regions) for the state-dependent rate-ratio, Inline graphic, when the background-rates are constant (orange), or when they vary among lineages (blue), inferred for the haemulid data set. The dashed vertical line corresponds to Inline graphic. At right, the posterior distribution (lines) and the Inline graphic CI (shaded regions) for the number of habitat transitions, Inline graphic, assuming the background-rates are constant (orange), or vary among lineages (blue).

Examining the posterior distribution of habitat transitions reveals that excluding background-rate variation implies biologically implausible scenarios of habitat evolution. When we disallowed background-rate variation, we inferred Inline graphic transitions between reef- and non-reef habitats across the phylogeny; when we allowed background rates to vary, we inferred a more reasonable Inline graphic transitions (Inline graphic CIs Inline graphic and Inline graphic, respectively). The inferred history of the habitat across the branches of the phylogeny was similarly distorted when we assumed that background rates did not vary (see Supplementary Material available on Dryad).

Prior sensitivity

We assessed the prior sensitivity of inferences by performing a series of analyses using different prior values for various parameters of the model. Specifically, we explored the following prior values:

graphic file with name M340.gif

where Inline graphic is the prior expected standard deviation of the background-rate variation model. We varied a single prior setting at a time, rather than testing all possible combinations of these priors; we left the remaining priors as described in the Bayesian inference section, for a total of 23 prior combinations.

Most prior settings appear to have little impact on the posterior distribution of the focal parameter, Inline graphic (Fig. 6). Unsurprisingly, the prior on the focal parameter, Inline graphic, had the greatest influence on the state-dependent rate estimates: the posterior-mean estimate ranged from Inline graphic to Inline graphic over the priors that we tested (Fig. 6, left band); in all cases Inline graphic was excluded from the Inline graphic CI. We discuss the (negligible) prior sensitivity of the remaining model parameters in the Supplementary Material available on Dryad.

Figure 6.

The posterior densities of the state-dependent rate-ratio, Inline graphic, for the haemulid data set under various priors. Each band of boxplots corresponds to a different prior-sensitivity experiment. Within each band, boxplots represent the Inline graphic CI (box) and Inline graphic CI (whiskers) for the posterior density under a particular value of that prior.

Discussion

Understanding the factors that drive variation in rates of character evolution is a fundamental goal for evolutionary comparative biologists. Current approaches for assessing the influence of a discrete character on rates of continuous-character evolution suffer from two problems: 1) they do not correctly characterize the mutually informative relationship between the discrete and continuous characters, and; 2) they compare against a simple—and likely unrealistic—null model, potentially misleading inferences about state-dependent rates due to rate variation that is unrelated to the discrete character of interest, which we term “background-rate variation”. This second problem is especially concerning, given that rates of evolution are likely to vary greatly across the Tree of Life, and for many reasons not related to the discrete character a particular researcher is investigating.

We present a Bayesian method that deals with both of these issues using a model (MuSSCRat) that correctly integrates over discrete-character histories with extensions that accommodate background-rate variation. This method involves estimating a large number of parameters, especially compared to the size of typical morphological data sets. This raises serious questions about the reliability of inferences made using the method—especially because the background-rate variation model may wash out any signal of state-dependent rate variation—and also about the sensitivity of inferences to the choice of priors. In the following sections, we describe simulation and empirical results that shed light on the statistical behavior of the method.

Statistical behavior under simulation

We explored the ability of the MuSSCRat model to infer state-dependent rates of continuous-character evolution using simulated data. We varied these simulations over the number of species, the number of continuous characters, and the degree of state-dependent rate variation. We repeated our simulations under different background-rate models: “background-constant” simulations, where background rates were the same across lineages, and “background-variable” simulations, where background rates were allowed to vary among lineages.

When the model was correctly specified (i.e., when we inferred parameters under the true background-rate model), the method had appropriate frequentist behavior: the false-positive rate was approximately Inline graphic, and the power increased with the number of taxa, the number of continuous characters, and the degree of state-dependent rate variation. The power was modestly reduced for background-variable simulations (approximately 74%) compared to the background-constant simulations (approximately 80%). Posterior-mean estimates of the state-dependent rate parameters were biased only for small trees (Inline graphic) or data sets with only one or two continuous characters. These results suggest that researchers should be able to reliably infer the state-dependent rate parameters for data sets with a reasonable number of species and continuous characters.

We also used our simulated data sets to assess the costs of including background-rate variation, as well as the consequences of ignoring it. Including unnecessary parameters in the model (overspecification) should lead to increased uncertainty and a concomitant decrease in power; that is, for background-constant data, allowing for background-rate variation in the inference model should dampen the signal of state dependence. Conversely, excluding parameters from the model (underspecification) should lead to artifactually increased confidence and a higher false-positive rate; that is, for background-variable data, an inference model that assumes that background rates are constant may spuriously interpret the unmodeled rate variation as additional evidence for state dependence. In our simulations, the cost of model overspecification (an approximately 2% decrease in power) was minor compared to the consequences of model underspecification (an approximately 10% increase in the false-positive rate).

Empirical impact of background-rate variation

We reanalyzed the trophic-character data for the haemulids (grunts) from Price et al. (2013) with constant and variable background rates. The inclusion (or exclusion) of background-rate variation in the inference model had a profound impact on inferences regarding the degree of state-dependent rate variation. Under the constant-background-rate model, reef-dwelling species were inferred to evolve more than 15 times faster than their non-reef-dwelling relatives. By contrast, we inferred an approximately 2.6-fold increase when we allowed background-rate variation. Furthermore, the history of the discrete character inferred under the constant-background-rate model involved an implausible number of habitat transitions. Together, these results suggest that, while trophic-character evolution within the haemulids is elevated within reefs, other factors (manifest as background-rate variation) also played an important role in the evolution of continuous traits in this group.

Benefits of being Bayesian

Our Bayesian implementation comes with all of the usual advantages of Bayesian inference: the marginal posterior distributions for parameters have a natural interpretation (the Inline graphic CI contains the true value of the parameter Inline graphic of the time), and these estimates are automatically averaged over uncertainty in all of the parameters. Indeed, our implementation also allows us to accommodate uncertainty in the tree topology and divergence times, although the impact of phylogenetic uncertainty appears to be relatively mild for haemulids (see Supplementary Material available on Dryad).

Of course, the need to specify prior densities for each model parameter means that posterior estimates may be sensitive to arbitrary prior choices. However, at least for haemulids, inferences about state-dependent rates appear to be quite robust over a range of reasonable priors. We note that our results regarding the impact of phylogenetic uncertainty and prior sensitivity are data set specific, and may not be generally true for all/most data sets. For this reason, we urge users to perform similar sensitivity assessments for their empirical studies.

Beyond the prosaic strengths and weaknesses of Bayesian inference, adopting a Bayesian framework allowed us to overcome two critical issues for the MuSSCRat model. First, the joint distribution of the discrete and continuous characters implied by the MuSSCRat model makes it difficult (perhaps impossible) to calculate the full likelihood analytically. Specifying our model in a Bayesian framework allowed us to use an MCMC technique, data augmentation, to simplify these likelihood calculations and correctly describe the joint evolution of discrete and continuous characters. We note that similar Monte Carlo integration techniques might be practical for maximum-likelihood applications; indeed, these Monte Carlo solutions have been used to perform maximum-likelihood inference for similar types of problems (Mayrose and Otto 2010; Levy Karin et al. 2017). We conducted experiments that suggest that such approaches would be unreliable for the haemulid data set (see Supplementary Material available on Dryad), but more work is necessary to understand the generality of these results.

Second, adopting a Bayesian framework allowed us to include background-rate variation while retaining the ability to perform reliable inference under the model. The MuSSCRat model with background-rate variation is inherently nonidentifiable: multiple combinations of background-rate and state-dependent-rate parameters can have identical likelihoods, so a unique maximum-likelihood estimate of the parameters may not exist. Within a Bayesian setting, the joint prior distribution acts to “tease apart” combinations of parameters that would otherwise have identical likelihoods, thus making it possible to infer parameters under nonidentifiable models.
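The nonidentifiability can be illustrated with a small Python sketch. It assumes, for simplicity, that each branch spends its entire duration in one discrete state and that the rate of continuous-character evolution on a branch is the product of its background rate and the state-dependent rate of that state. The function name and all numerical values are hypothetical.

```python
def branch_rates(background, states, state_rates):
    """Overall rate on each branch: background rate times state-dependent rate."""
    return [b * state_rates[s] for b, s in zip(background, states)]

# Discrete state occupied by each of four branches (hypothetical).
states = [0, 1, 1, 0]

# Explanation A: constant background rates, state 1 evolves twice as fast.
rates_a = branch_rates([1.0, 1.0, 1.0, 1.0], states, {0: 1.0, 1: 2.0})

# Explanation B: no state dependence, but the background rates happen to be
# doubled on exactly the branches that are in state 1.
rates_b = branch_rates([1.0, 2.0, 2.0, 1.0], states, {0: 1.0, 1: 1.0})
```

Both explanations imply the same overall rate on every branch, so the continuous characters have the same likelihood under each; only the prior on the background rates and state-dependent rates can favor one over the other.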

Broader context

There is mounting concern within the phylogenetic comparative community that methods for understanding relationships between evolutionary variables can be unreliable in the face of model misspecification (Beaulieu et al. 2013; Maddison and FitzJohn 2014; Rabosky and Goldberg 2015; Beaulieu and O’Meara 2016; Uyeda et al. 2018). One particular problem, identified by Rabosky and Goldberg (2015), is that our model-selection procedures often involve comparisons against an extremely unrealistic—and therefore easy-to-reject—null model.

Beaulieu and O’Meara (2016) clarify the fundamental issue raised by Rabosky and Goldberg (2015): when asked to choose between two models, it should come as no surprise when model-comparison procedures reject an overly simplistic, constant-rate null model in favor of a very specific, variable-rate alternative model. The danger is that any rate variation (whether or not it is associated with a focal variable of interest) will be interpreted by a model-comparison procedure as evidence against a constant-rate null model. This logic also applies to parameter-estimation procedures: when considering a variable-rate model with a single explanatory variable, it seems likely that any evidence for heterogeneity is at risk of being spuriously attributed to the factor of interest. We refer to this problem as the “straw-man effect,” and we suspect that it applies whenever model comparison includes at least one constant-rate model or—in the case of parameter estimation—whenever a variable-rate model is overly simplistic.

A possible solution to the straw-man effect—one that we favor, but by no means the only conceivable solution—is to move away from null-model hypothesis testing with overly specific variable-rate models toward more general hierarchical models that include various sources of rate variation, as we have done in the present work. A justifiable concern with such complex hierarchical models is that the results may be sensitive to the assumed nature of the background variation—whether it is due to lineage-specific effects, heritable (but unobserved) biological traits, the environment, or any number of alternatives—which may be difficult to justify a priori or distinguish a posteriori.

Future prospects

Our MuSSCRat model makes many simplifying assumptions about the nature of the evolutionary process under consideration. For example, we assume that the continuous characters evolve under a Brownian motion process; that the underlying evolutionary variance-covariance matrix, Inline graphic, is the same over the entire tree; that the discrete characters evolve under a simple continuous-time Markov process; and that the background-rate variation is adequately described by an uncorrelated lognormal distribution. Each of these assumptions provides opportunities for future model development.

There are many alternatives to the Brownian motion model in the phylogenetic comparative toolkit, perhaps chief among them the Ornstein–Uhlenbeck (OU) process (Hansen 1997; Butler and King 2004). State-dependent OU process models have been widely used to detect shifts in evolutionary optima associated with discrete characters. It seems that these approaches are vulnerable to unmodeled process heterogeneity, since they typically compare state-dependent models against homogeneous models (as demonstrated by Uyeda et al. 2018). Extending the data-augmentation framework presented here to a variable-optimum OU process (e.g., as described in Uyeda et al. 2018) should be straightforward.

We have assumed that the “structure” of the evolutionary variance-covariance matrix is independent of the discrete character (in the sense that the discrete character affects only the magnitude of the variance-covariance matrix). Recently, Caetano and Harmon (2019) developed a method that allows the structure of the variance-covariance matrix to depend on the state of a discrete character. While this is an important advance over models that assume that the variance-covariance matrix is homogeneous, it too may suffer from the straw-man effect: heterogeneity in the variance-covariance matrix unrelated to the discrete character may be positively misleading. Extending the framework developed by Caetano and Harmon (2019) to incorporate background variation in the variance-covariance matrix is an important avenue for future development.

We used a relatively simple continuous-time Markov process to model the evolution of the discrete character. In reality, it may be that the rate or process of discrete-character evolution varies over the tree (Beaulieu et al. 2013), or that the discrete character impacts rates of lineage diversification (Maddison 2006), or violates the assumptions of a Markov model in some other way. The extent to which these complex models of discrete-character evolution impact the distribution of discrete-character histories (and therefore compromise inferences about discrete-state-dependent rates of continuous-character evolution) is an open question, though we expect the impact might be mild for the haemulids (see Supplementary Material available on Dryad). Although the modeling task may be straightforward, developing the computational machinery to do inference under such models, where a discrete character affects rates of continuous-character evolution while itself evolving under a complex process, is a nontrivial technical challenge.

Finally, we have assumed that background-rate variation follows an uncorrelated lognormal prior distribution. A strong assumption of this model is that rates of evolution are uncorrelated between lineages; however, if the rate of character evolution is itself evolving, then we might expect rates to be correlated between ancestors and descendants (as in the model described by Huelsenbeck et al. 2000). Because the MuSSCRat model with background-rate variation is nonidentifiable, specifying a prior on background-rate variation is critical to teasing apart the relative contributions of background- and state-dependent effects on the overall rate variation. However, the Bayesian solution to nonidentifiability is not without caveats: since we rely on the prior to distinguish the relative effects of background-rate variation and state-dependent rates, our inferences about state-dependent rates may be sensitive to the assumed model of background-rate variation. Indeed, we performed exploratory analyses of the haemulid data set using an alternative background-rate model that allowed for autocorrelated rates (following Eastman et al. 2011) and found that posterior estimates were fairly sensitive to this aspect of the model (see Supplementary Material available on Dryad). Unfortunately, it is not clear that alternative models of background-rate variation can be reliably distinguished using standard model-selection procedures. However, Bayesian models provide a natural way for updating our prior beliefs and integrating information across disparate data sets, and we foresee a fruitful Bayesian research program that leverages information from across the Tree of Life (extinct and extant) to characterize overall heterogeneity in the evolutionary process, and to more accurately distinguish among its multifarious causes.

Acknowledgements

We would like to thank Bruce Rannala, Jiansi Gao, Nikolai Vetr, Sebastian Höhna, Michael Landis, Xavier Meyer, Peter Wainwright, and Samantha Price for their thoughtful and generous discussion throughout the process of writing this manuscript. We would also like to thank Mark Holder, Nicolas Lartillot, and an anonymous reviewer for providing extremely helpful suggestions that greatly improved the manuscript.

Supplementary material

Supplementary scripts and data (including the Haemulidae data and simulated data used in this study) can be found in the Dryad Digital Repository (http://dx.doi.org/10.5061/dryad.499c4j2) and in the GitHub repository https://github.com/mikeryanmay/musscrat_supp_archive/releases/tag/bioRxiv1.0.

Funding

This research was supported by the National Science Foundation (NSF) DEB-0842181, DEB-0919529, DBI-1356737, and DEB-1457835 to B.R.M.

References

  1. Beaulieu, J.M., O’Meara, B.C. 2016. Detecting hidden diversification shifts in models of trait-dependent speciation and extinction. Syst. Biol. 65(4):583–601.
  2. Beaulieu, J.M., O’Meara, B.C., Donoghue, M.J. 2013. Identifying hidden rate changes in the evolution of a binary morphological character: the evolution of plant habit in campanulid angiosperms. Syst. Biol. 62(5):725–737.
  3. Butler, M.A., King, A.A. 2004. Phylogenetic comparative analysis: a modeling approach for adaptive evolution. Am. Nat. 164(6):683–695.
  4. Caetano, D.S., Harmon, L.J. 2019. Estimating correlated rates of trait evolution with uncertainty. Syst. Biol. 68(3):412–429.
  5. Collar, D., Schulte, J., O’Meara, B., Losos, J. 2010. Habitat use affects morphological diversification in dragon lizards. J. Evol. Biol. 23(5):1033–1049.
  6. Collar, D.C., O’Meara, B.C., Wainwright, P.C., Near, T.J. 2009. Piscivory limits diversification of feeding morphology in centrarchid fishes. Evolution 63(6):1557–1573.
  7. Drummond, A.J., Ho, S.Y., Phillips, M.J., Rambaut, A. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4(5):e88.
  8. Eastman, J.M., Alfaro, M.E., Joyce, P., Hipp, A.L., Harmon, L.J. 2011. A novel comparative method for identifying shifts in the rate of character evolution on trees. Evolution 65(12):3578–3589.
  9. Felsenstein, J. 1973. Maximum-likelihood estimation of evolutionary trees from continuous characters. Am. J. Hum. Genetics 25(5):471.
  10. Felsenstein, J. 2004. Inferring phylogenies. Vol. 2. Sunderland (MA): Sinauer Associates.
  11. Freckleton, R.P. 2012. Fast likelihood calculations for comparative analyses. Methods Ecol. Evol. 3(5):940–947.
  12. Green, P.J. 1995. Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82(4):711–732.
  13. Hansen, T.F. 1997. Stabilizing selection and the comparative analysis of adaptation. Evolution 51(5):1341–1351.
  14. Harmon, L.J., Losos, J.B., Davies, T.J., Gillespie, R.G., Gittleman, J.L., Jennings, W.B., Kozak, K.H., McPeek, M.A., Moreno-Roark, F., Near, T.J., Purvis, A. 2010. Early bursts of body size and shape evolution are rare in comparative data. Evolution 64(8):2385–2396.
  15. Hastings, W.K. 1970. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1):97–109.
  16. Höhna, S., Landis, M.J., Heath, T.A., Boussau, B., Lartillot, N., Moore, B.R., Huelsenbeck, J.P., Ronquist, F. 2016. RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Syst. Biol. 65(4):726–736.
  17. Höhna, S., May, M.R., Moore, B.R. 2015. TESS: an R package for efficiently simulating phylogenetic trees and performing Bayesian inference of lineage diversification rates. Bioinformatics 32(5):789–791.
  18. Huelsenbeck, J.P., Larget, B., Swofford, D. 2000. A compound Poisson process for relaxing the molecular clock. Genetics 154(4):1879–1892.
  19. Huelsenbeck, J.P., Nielsen, R., Bollback, J.P. 2003. Stochastic mapping of morphological characters. Syst. Biol. 52(2):131–158.
  20. Huelsenbeck, J.P., Rannala, B. 2003. Detecting correlation between characters in a comparative analysis with uncertain phylogeny. Evolution 57(6):1237–1247.
  21. Huelsenbeck, J.P., Rannala, B. 2004. Frequentist properties of Bayesian posterior probabilities of phylogenetic trees under simple and complex substitution models. Syst. Biol. 53(6):904–913.
  22. Landis, M.J., Matzke, N.J., Moore, B.R., Huelsenbeck, J.P. 2013. Bayesian analysis of biogeography when the number of areas is large. Syst. Biol. 62(6):789–804.
  23. Lartillot, N. 2006. Conjugate Gibbs sampling for Bayesian phylogenetic models. J. Comput. Biol. 13(10):1701–1722.
  24. Lemey, P., Rambaut, A., Welch, J.J., Suchard, M.A. 2010. Phylogeography takes a relaxed random walk in continuous space and time. Mol. Biol. Evol. 27(8):1877–1885.
  25. Levy Karin, E., Wicke, S., Pupko, T., Mayrose, I. 2017. An integrated model of phenotypic trait changes and site-specific sequence evolution. Syst. Biol. 66(6):917–933.
  26. Lewandowski, D., Kurowicka, D., Joe, H. 2009. Generating random correlation matrices based on vines and extended onion method. J. Multivariate Anal. 100(9):1989–2001.
  27. Maddison, W.P. 2006. Confounding asymmetries in evolutionary diversification and character change. Evolution 60(8):1743–1746.
  28. Maddison, W.P., FitzJohn, R.G. 2014. The unsolved challenge to phylogenetic correlation tests for categorical characters. Syst. Biol. 64(1):127–136.
  29. Mateiu, L., Rannala, B. 2006. Inferring complex DNA substitution processes on phylogenies using uniformization and data augmentation. Syst. Biol. 55(2):259–269.
  30. Mayrose, I., Otto, S.P. 2010. A likelihood method for detecting trait-dependent shifts in the rate of molecular evolution. Mol. Biol. Evol. 28(1):759–770.
  31. Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys. 21(6):1087–1092.
  32. Nielsen, R. 2002. Mapping mutations on phylogenies. Syst. Biol. 51(5):729–739.
  33. O’Meara, B.C., Ané, C., Sanderson, M.J., Wainwright, P.C. 2006. Testing for different rates of continuous trait evolution using likelihood. Evolution 60(5):922–933.
  34. Price, S.A., Holzman, R., Near, T.J., Wainwright, P.C. 2011. Coral reefs promote the evolution of morphological diversity and ecological novelty in labrid fishes. Ecol. Lett. 14(5):462–469.
  35. Price, S.A., Tavera, J.J., Near, T.J., Wainwright, P.C. 2013. Elevated rates of morphological and functional diversification in reef-dwelling haemulid fishes. Evolution 67(2):417–428.
  36. R Core Team. 2017. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
  37. Rabosky, D.L., Goldberg, E.E. 2015. Model inadequacy and mistaken inferences of trait-dependent speciation. Syst. Biol. 64(2):340–355.
  38. Rannala, B. 2002. Identifiability of parameters in MCMC Bayesian inference of phylogeny. Syst. Biol. 51(5):754–760.
  39. Revell, L.J. 2012. A comment on the use of stochastic character maps to estimate evolutionary rate variation in a continuously valued trait. Syst. Biol. 62(2):339–345.
  40. Robinson, D.M., Jones, D.T., Kishino, H., Goldman, N., Thorne, J.L. 2003. Protein evolution with dependence among codons due to tertiary structure. Mol. Biol. Evol. 20(10):1692–1704.
  41. Tanner, M.A., Wong, W.H. 1987. The calculation of posterior distributions by data augmentation. J. Am. Stat. Assoc. 82(398):528–540.
  42. Tavera, J., Acero, A., Wainwright, P.C. 2018. Multilocus phylogeny, divergence times, and a major role for the benthic-to-pelagic axis in the diversification of grunts (Haemulidae). Mol. Phylogenet. Evol. 121:212–223.
  43. Uyeda, J.C., Zenil-Ferguson, R., Pennell, M.W. 2018. Rethinking phylogenetic comparative methods. Syst. Biol. 67(6):1091–1109.
  44. Venditti, C., Meade, A., Pagel, M. 2011. Multiple routes to mammalian diversity. Nature 479(7373):393.

Articles from Systematic Biology are provided here courtesy of Oxford University Press
