Bioinformatics. 2016 Aug 29;32(17):i736–i745. doi: 10.1093/bioinformatics/btw462

A weighted exact test for mutually exclusive mutations in cancer

Mark DM Leiserson 1, Matthew A Reyna 1, Benjamin J Raphael 1,*
PMCID: PMC5013919  PMID: 27587696

Abstract

Motivation: The somatic mutations in the pathways that drive cancer development tend to be mutually exclusive across tumors, providing a signal for distinguishing driver mutations from a larger number of random passenger mutations. This mutual exclusivity signal can be confounded by high and highly variable mutation rates across a cohort of samples. Current statistical tests for exclusivity that incorporate both per-gene and per-sample mutational frequencies are computationally expensive and have limited precision.

Results: We formulate a weighted exact test for assessing the significance of mutual exclusivity in an arbitrary number of mutational events. Our test conditions on the number of samples with a mutation as well as per-event, per-sample mutation probabilities. We provide a recursive formula to compute P-values for the weighted test exactly as well as a highly accurate and efficient saddlepoint approximation of the test. We use our test to approximate a commonly used permutation test for exclusivity that conditions on per-event, per-sample mutation frequencies. However, our test is more efficient and it recovers more significant results than the permutation test. We use our Weighted Exclusivity Test (WExT) software to analyze hundreds of colorectal and endometrial samples from The Cancer Genome Atlas, which are two cancer types that often have extremely high mutation rates. On both cancer types, the weighted test identifies sets of mutually exclusive mutations in cancer genes with fewer false positives than earlier approaches.

Availability and Implementation: See http://compbio.cs.brown.edu/projects/wext for software.

Contact: braphael@cs.brown.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

1 Introduction

A key challenge in cancer genomics is distinguishing the small number of somatic mutations that drive cancer from the vast majority of mutations that accumulate randomly. The ability to distinguish these driver mutations from the random passenger mutations may lead to better understanding of cancer biology and personalized therapies customized to a tumor’s mutational profile. However, large scale cancer sequencing efforts such as The Cancer Genome Atlas (TCGA) (The Cancer Genome Atlas Research Network, 2012, 2014; The Cancer Genome Atlas Research Network, et al., 2013) and the International Cancer Genome Consortium (ICGC) have shown that many driver mutations are rare across patient cohorts and thus distinguishing the driver mutations from the passengers by their frequency of occurrence is a difficult problem.

Driver mutations are hypothesized to group into a small number of pathways or hallmarks (Hanahan and Weinberg, 2011), and this hypothesis is a widely accepted explanation for the observed mutational heterogeneity of cancer (Vogelstein et al., 2013). Thus, researchers have developed methods to identify combinations of mutations using varying levels of prior knowledge, from pathway databases (Mootha et al., 2003; Subramanian et al., 2005; Wendl et al., 2011) to protein–protein interaction networks (Ciriello et al., 2012; Leiserson et al., 2015a; Ruffalo et al., 2015; Vandin et al., 2011).

As prior knowledge of pathways and interactions is often noisy or unavailable, de novo methods that do not use prior information are advantageous. The vast number of possible combinations of mutated genes makes complete de novo discovery of combinations computationally and statistically intractable. However, a number of methods (Babur et al., 2015; Ciriello et al., 2012; Constantinescu et al., 2015; Kim et al., 2015, 2016; Leiserson et al., 2013, 2015b; Miller et al., 2011; Szczurek and Beerenwinkel, 2014; Vandin et al., 2012) use the observation that mutations within the same pathway are often mutually exclusive across tumors (Thomas et al., 2007; Yeang et al., 2008). These methods differ in how they score mutual exclusivity and in how they identify the best scoring set(s) of mutations.

The first type of score for mutual exclusivity is a combinatorial score, such as the scores employed in Miller et al. (2011), Vandin et al. (2012) and Leiserson et al. (2013). For example, in the Dendrix algorithm (Vandin et al., 2012), the score for a set M of mutational events is the difference between the number of samples with a mutation in M (coverage) and the number of mutations in M occurring in more than one sample (coverage overlap). The advantage of a combinatorial score is that it is easy to compute, but it was observed by Leiserson et al. (2015b) and others that the score is often biased towards sets with frequently mutated genes.

The second type of score for mutual exclusivity is a statistical score (Babur et al., 2015; Ciriello et al., 2012; Constantinescu et al., 2015; Kim et al., 2015, 2016; Leiserson et al., 2015b; Ley et al., 2013; Szczurek and Beerenwinkel, 2014). A particularly useful statistical score for exclusivity is based on the exact distribution that conditions on the observed number of mutated samples in each gene (Babur et al., 2015; Leiserson et al., 2015b). For a pair of mutations, such a test is a one-sided Fisher’s exact test for independence (Babur et al., 2015; Leiserson et al., 2015b; Ley et al., 2013). For more than two genes, Leiserson et al. (2015b) generalized the exact test to multi-dimensional contingency tables. They introduced the CoMEt algorithm that computes a generalization of Fisher’s exact test for event sets of any size using either an exact tail enumeration algorithm or an approximation. They showed that conditioning on the number of mutations in each event reduces bias towards frequently mutated events compared to combinatorial scores.

Statistical scores that condition only on mutation frequencies do not account for the variation in mutation rate among tumors. It has been observed that the number of mutations in a tumor can vary over several orders of magnitude (Lawrence et al., 2013; Roberts and Gordenin, 2014; Vogelstein et al., 2013). For example, colorectal tumors with microsatellite stability have a median of 66 non-synonymous mutations, but colorectal tumors with microsatellite instability have a median of 777 mutations (Vogelstein et al., 2013). Another example is from The Cancer Genome Atlas Research Network et al. (2013), who classified a subset of TCGA endometrial cancers as ultramutated or hypermutated.

Another useful statistical test for mutual exclusivity conditions on both the number of mutated samples in each event and the number of mutation events in each sample (Ciriello et al., 2012; Kim et al., 2015). Since computing this distribution exactly is not computationally efficient, permutation tests are used. The permutation tests, which compare observed results to a number of samples (e.g. $10^4$) drawn from a null distribution, are more tractable than computing the P-value exactly on genome-scale data, but the significance of the score is directly limited by the number of permutations. MEMo (Ciriello et al., 2012) computes the significance of the coverage (number of mutated samples) of M using this permutational distribution for sets of any size k. MEMCover (Kim et al., 2015) computes the significance of the exclusivity of pairs to search for exclusivity within, between, and across cancer types. Both MEMo and MEMCover restrict their analysis to sets of genes that interact in a protein-interaction network. WeSME (Kim et al., 2016), which appeared while this paper was under review, computes the significance of exclusivity of pairs of genes with a less expensive approximation to the permutational distribution. To the best of our knowledge, there is no method for quickly computing the significance of mutual exclusivity conditioned on both the observed number of mutations per event and number of mutations per sample.

1.1 Contributions

We introduce a weighted test for mutual exclusivity that conditions on the frequency of each mutational event in a set M and also incorporates the probability that each event is mutated in each sample. Our test was inspired by a model derived by Manescu and Keich (2015), who compute the significance of the overlap between two sets of genes weighted by gene length. We introduce the weighted exclusivity test to approximate the fixed event and sample frequency permutation test quickly and accurately by estimating the mutation probabilities from the null distribution of the permutation test. We present a recursive formula for computing the P-value of this test exactly and derive a saddlepoint approximation for arbitrarily sized groups of events. We show that the saddlepoint approximation is both a fast and accurate approximation of the permutational distribution. We also demonstrate that the saddlepoint approximation can be used to rapidly compute the CoMEt statistical test, which is a special case of the weighted test where the mutation probabilities for a given event are the same in each sample.

We use our Weighted Exclusivity Test (WExT) software to identify sets of exclusive mutations in hundreds of colorectal and endometrial cancers. Cancers of these types often have extremely high mutation rates (Vogelstein et al., 2013), which make them difficult to analyze when conditioning only on the number of mutated samples per event. However, our weighted statistical test allows us to effectively condition on the number of mutation events per sample, and we identify exclusive patterns of mutations in these cancers that were missed by earlier approaches. We find that the weighted test identifies more biologically interesting sets than CoMEt (Leiserson et al., 2015b). We expect that the weighted test for mutual exclusivity will prove useful for many cancer types where defects in DNA damage or environmental exposures, e.g. ultraviolet light, lead to very high mutation rates in some samples.

2 Methods

We introduce a new weighted test for mutual exclusivity that incorporates per-event, per-sample mutation probabilities, and we describe how to use particular instances of our test to approximate commonly used tests for mutual exclusivity, which we refer to as the row exclusivity (R-exclusivity) and row-column exclusivity (RC-exclusivity) tests.

First, in Section 2.2, we describe the R-exclusivity and RC-exclusivity tests. Next, in Section 2.3, we introduce our new weighted test for mutual exclusivity, which we call the weighted-row exclusivity (WR-exclusivity) test, that incorporates event and sample mutation frequencies without using permutations. In Section 2.4, we describe how to approximate the R-exclusivity and RC-exclusivity tests with the WR-exclusivity test. Then, in Section 2.5, we provide a recursive formula for computing the WR-exclusivity P-value exactly, and we derive a fast and accurate saddlepoint approximation for the WR-exclusivity P-value. Finally, in Section 2.6, we describe how we search for exclusive sets, and in Section 2.7, we describe our WExT software.

We summarize the tests and our contributions in this article in Table 1.

Table 1.

Three tests for mutual exclusivity, the values that are fixed in each test and different algorithms for computing the P-values associated with the tests

| Test | P-value | Conditioning | Algorithms |
|---|---|---|---|
| Row (R) exclusivity | $\Phi_R$ | Event frequencies | Tail enumeration (CoMEt), saddlepoint approximation |
| Row-column (RC) exclusivity | $\Phi_{RC}$ | Event and sample frequencies | Permutations |
| Weighted-row (WR) exclusivity | $\Phi_{WR}$ | Event frequencies and per-event, per-sample weights ($W$) | Recursive formula, saddlepoint approximation |

Bold face entries indicate contributions by this manuscript.

2.1 Notation

We observe the presence or absence of mutational events across a collection of samples. The presence of an event may reflect a variety of genomic (e.g. the canonical BRAF V600E mutation or deletions in CDKN2A), proteomic and/or epigenomic alterations. In this study, we analyze single nucleotide variants and small insertion/deletions (indels) by gene. For the clarity of exposition, we will describe these events at the gene level, but our weighted test can accommodate a broader class of mutational events.

Let $\{g_i\}_{i=1}^{m}$ be a set of $m$ genes and $\{s_j\}_{j=1}^{n}$ be a set of $n$ samples. For each sample, we observe the presence of one or more mutations in each gene, and we record the presence or absence of mutations in a per-gene, per-sample binary mutation matrix $A \in \{0,1\}^{m \times n}$, where $A = [a_{ij}]$ with $a_{ij} = 1$ if gene $g_i$ is mutated in sample $s_j$ and $a_{ij} = 0$ otherwise.

Let $M \subseteq \{g_i\}_{i=1}^{m}$ be a set of $k$ genes. The gene set $M$ has co-occurring mutations in sample $s_j$ if multiple genes are mutated in that sample, i.e. there exist distinct $g_i, g_\ell \in M$ such that $a_{ij} = 1$ and $a_{\ell j} = 1$. Alternatively, the gene set $M$ has a mutually exclusive mutation in sample $s_j$ if one and only one gene is mutated in that sample, i.e. there exists $g_\ell \in M$ such that $a_{\ell j} = 1$ and $a_{ij} = 0$ for $g_i \in M \setminus \{g_\ell\}$. Our goal is to identify sets of genes with statistically significant numbers of mutually exclusive mutations.

Let $r_i = \sum_{j=1}^{n} a_{ij}$ be the number of samples with mutations in $g_i$, let $c_j = \sum_{i=1}^{m} a_{ij}$ be the number of genes with mutations in $s_j$, let $z_M$ be the number of samples with co-occurring mutations in $M$ and let $t_M$ be the number of samples with mutually exclusive mutations in $M$.

For any mutation matrix $B$, let $B(M)$ be the submatrix of $B$ with rows corresponding to the gene set $M$, and let $t_B(M)$ be the number of mutually exclusive mutations in $B(M)$. We will use $t_M = t_A(M)$.
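As a concrete illustration of this notation, the following minimal sketch computes the row sums, column sums, and the exclusive and co-occurring counts for a toy binary matrix. The function name `exclusivity_counts` and the toy matrix are hypothetical and are not taken from the WExT software; genes are referred to by row index.

```python
from typing import Dict, Sequence


def exclusivity_counts(A: Sequence[Sequence[int]], M: Sequence[int]) -> Dict[str, object]:
    """Return r (row sums), c (column sums), t_M and z_M for the gene set M.

    A is a binary gene-by-sample matrix; M lists the row indices of the gene set.
    """
    m, n = len(A), len(A[0])
    r = [sum(row) for row in A]                              # samples mutated per gene
    c = [sum(A[i][j] for i in range(m)) for j in range(n)]   # genes mutated per sample
    hits = [sum(A[i][j] for i in M) for j in range(n)]       # mutated genes of M per sample
    t_M = sum(1 for h in hits if h == 1)                     # exactly one gene of M mutated
    z_M = sum(1 for h in hits if h > 1)                      # two or more genes of M mutated
    return {"r": r, "c": c, "t_M": t_M, "z_M": z_M}


# Toy example: 3 genes, 6 samples (illustrative data only).
A = [
    [1, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
]
counts = exclusivity_counts(A, M=[0, 1])
```

For the gene set consisting of the first two rows, four samples carry a mutually exclusive mutation and one sample carries co-occurring mutations.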

2.2 Permutation tests for mutual exclusivity

We describe two different permutation tests for mutual exclusivity. First, the row-exclusivity (R-exclusivity) test finds the probability $\Phi_R(M)$ of observing at least $t_M$ mutually exclusive mutations in a gene set $M$ given that each $g_i \in M$ is mutated in $r_i$ samples. We describe this test as the row-exclusivity test because it conditions on the row sums of the mutation matrix.

Formally, we define $\Omega_R$ to be the set of mutation matrices with the same row sums as $A$. Let $E_R = \{B \in \Omega_R : t_B(M) \ge t_M\}$ be the set of mutation matrices with at least $t_M$ mutually exclusive mutations in $M$. Then

$$\Phi_R(M) = \frac{|E_R|}{|\Omega_R|} \tag{1}$$

is the P-value of the R-exclusivity test.

Since the R-exclusivity test only conditions on the row sums of $A$, we can consider each row of $A$ independently. This implies that to compute $\Phi_R(M)$, we use only the rows corresponding to $M$. Thus, for $k = 2$, the P-value $\Phi_R(M)$ is equal to the P-value from the one-sided Fisher’s exact test, which computes the tail probability by summing the hypergeometric probability of 2 × 2 contingency tables with fixed margins. The hypergeometric probability of each contingency table is the proportion of matrices in $\Omega_R$ that gives a contingency table with those margins. Note also that, when $k = 2$, the fixed row sums give $t_M = r_1 + r_2 - 2 z_M$, so the probability of observing $t_M$ or more mutually exclusive mutations is equal to the probability of observing $z_M$ or fewer co-occurring mutations. Leiserson et al. (2015b) generalized this test to k > 2 genes as part of the CoMEt algorithm.
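The pairwise case can be checked numerically. The sketch below is an illustration under stated assumptions, not the CoMEt implementation: it computes the R-exclusivity P-value for two genes from the hypergeometric distribution and compares it with a brute-force enumeration of all pairs of rows with the given row sums. It relies on the identity t = r1 + r2 - 2z for fixed row sums, so that a large number of exclusive mutations corresponds to a small number of co-occurrences. The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def r_exclusivity_pair_hypergeom(n: int, r1: int, r2: int, t_obs: int) -> float:
    """P(T >= t_obs) for two genes with row sums r1, r2 over n samples.

    With fixed row sums, t = r1 + r2 - 2*z, so T >= t_obs iff z <= (r1 + r2 - t_obs) / 2.
    The co-occurrence count z follows a hypergeometric distribution.
    """
    z_max = (r1 + r2 - t_obs) // 2
    z_min = max(0, r1 + r2 - n)            # smallest feasible overlap
    upper = min(r1, r2, z_max)
    favourable = sum(comb(r1, z) * comb(n - r1, r2 - z) for z in range(z_min, upper + 1))
    return favourable / comb(n, r2)


def r_exclusivity_pair_bruteforce(n: int, r1: int, r2: int, t_obs: int) -> float:
    """Same probability by enumerating every pair of rows with the given row sums."""
    hits = total = 0
    for s1 in combinations(range(n), r1):
        set1 = set(s1)
        for s2 in combinations(range(n), r2):
            z = len(set1.intersection(s2))
            total += 1
            hits += (r1 + r2 - 2 * z) >= t_obs
    return hits / total


p_hyper = r_exclusivity_pair_hypergeom(n=6, r1=3, r2=3, t_obs=4)
p_brute = r_exclusivity_pair_bruteforce(n=6, r1=3, r2=3, t_obs=4)
```

The brute-force enumeration is only feasible for very small matrices; it is included purely to confirm that the two quantities agree.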

The row–column-exclusivity (RC-exclusivity) test finds the probability $\Phi_{RC}(M)$ of observing at least $t_M$ mutually exclusive mutations in a gene set $M$ given that each $g_i \in M$ is mutated in $r_i$ samples and each $s_j$ is mutated in $c_j$ genes. We describe this test as the row–column-exclusivity test because it conditions on the row and column sums of the mutation matrix.

Formally, we define $\Omega_{RC}$ to be the set of mutation matrices with the same row and column sums as $A$. Let $E_{RC} = \{B \in \Omega_{RC} : t_B(M) \ge t_M\}$ be the set of mutation matrices with at least $t_M$ mutually exclusive mutations in $M$. Then

$$\Phi_{RC}(M) = \frac{|E_{RC}|}{|\Omega_{RC}|} \tag{2}$$

is the P-value of the RC-exclusivity test. Since $\Omega_{RC}$ depends on the row and column sums of $A$, we cannot consider the rows of $A$, or even $A(M)$, independently.

The RC-exclusivity test is related to the co-occurrence and mutual exclusivity tests used in Ciriello et al. (2012) and Kim et al. (2015, 2016). Ciriello et al. (2012) use coverage (i.e. $t_M + z_M$) instead of exclusivity as the test statistic, while Kim et al. (2015, 2016) limit to pairs of genes. Both Ciriello et al. (2012) and Kim et al. (2015) use permutation tests that sample matrices from $\Omega_{RC}$, so their P-values are limited by the number of draws (e.g. both use $10^4$ permutations).

2.3 Weighted exact test for mutual exclusivity

We introduce a new weighted test for mutual exclusivity. The weighted-row-exclusivity (WR-exclusivity) test finds the probability $\Phi_{WR}(M)$ of observing at least $t_M$ mutually exclusive mutations in a gene set $M$ given that each $g_i \in M$ is mutated in $r_i$ samples and given a per-gene, per-sample mutation probability matrix $W$ that assigns weights to the presence or absence of individual mutations. We describe this test as the weighted-row-exclusivity test because it conditions on the row sums of the mutation matrix and a mutation probability weight matrix.

For our model, we assume that $\{X_{ij}\}_{j=1}^{n}$ is a set of mutually independent Bernoulli random variables for each gene $g_i$ with success probabilities $W = [w_{ij}]$, i.e.

$$\Pr(X_{ij} = \ell) = \begin{cases} w_{ij}, & \text{if } \ell = 1, \\ 1 - w_{ij}, & \text{if } \ell = 0, \end{cases} \tag{3}$$

where $w_{ij}$ is the probability that gene $g_i$ is mutated in sample $s_j$. Let $T_{M,j}$ be a random variable with $T_{M,j} = 1$ if $s_j$ has a mutually exclusive mutation in a gene set $M$ and $T_{M,j} = 0$ otherwise. Then $Y_i = \sum_{j=1}^{n} X_{ij}$ is a Poisson binomial distributed variable for the number of mutations in $g_i$ and $T_M = \sum_{j=1}^{n} T_{M,j}$ is a test statistic for mutual exclusivity indicating the number of mutually exclusive mutations in $M$. We want to find the tail probability (commonly referred to as the P-value) of observing at least $t_M$ mutually exclusive mutations in $M$ given that each $g_i$ is mutated in $r_i$ samples. The WR-exclusivity P-value $\Phi_{WR}(M)$ is the probability of observing at least $t_M$ mutually exclusive mutations in a gene set $M$ under this model with

$$\Phi_{WR}(M) = \Pr(T_M \ge t_M \mid Y_M = r_M) \tag{4}$$

where $Y_M = [Y_i]_{i \in M}$ and $r_M = [r_i]_{i \in M}$.

Note that, for any gene $g_i$, the assumption that $Y_i = r_i$ implies that

$$\sum_{j=1}^{n} w_{ij} = \sum_{j=1}^{n} E[X_{ij}] = E\Big[\sum_{j=1}^{n} X_{ij}\Big] = E[Y_i] = r_i \tag{5}$$

by the definitions of $\{X_{ij}\}_{j=1}^{n}$ and $Y_i$.

2.4 Approximating the permutation tests with the weighted exclusivity test

Each of the sets $\Omega_R$ and $\Omega_{RC}$ underlying the R-exclusivity and RC-exclusivity tests, respectively, determines a per-gene, per-sample weight matrix $W = [w_{ij}]$ by considering the probability $w_{ij}$ of observing a mutation in gene $g_i$ in sample $s_j$, i.e.

$$W = \frac{1}{|\Omega|} \sum_{B \in \Omega} B \tag{6}$$

where $\Omega \in \{\Omega_R, \Omega_{RC}\}$. Since both $\Omega_R$ and $\Omega_{RC}$ fix the number of mutated samples per gene, the weight matrix $W$ in (6) with $\Omega \in \{\Omega_R, \Omega_{RC}\}$ satisfies (5). We define $W_R$ to be the weight matrix with $\Omega = \Omega_R$ and $W_{RC}$ to be the weight matrix with $\Omega = \Omega_{RC}$.

For the R-exclusivity test, each row of $B \in \Omega_R$ can be considered separately, so (6) for the set $\Omega_R$ is given by $W_R = [w_{ij}]$, with $w_{ij} = r_i / n$.

However, for the RC-exclusivity test, each row of $B \in \Omega_{RC}$ cannot be considered separately, so, to the best of our knowledge, there is no closed-form expression for (6) for the set $\Omega_{RC}$. Therefore, we generate an empirical weight matrix $W_{RC}^{N} = [w_{ij}]$ for $\Omega_{RC}$ by drawing $N$ matrices $\Omega_{RC}^{N}$ uniformly at random from $\Omega_{RC}$ and computing (6) with $\Omega_{RC}^{N}$ instead of $\Omega_{RC}$. We assume that there is a nonzero probability that a gene is mutated in a sample, and thus set $w_{ij} = \frac{1}{2N}$ when no mutation in gene $g_i$ is observed in sample $s_j$ in $\Omega_{RC}^{N}$.

Estimating $W_{RC}^{N}$ in this way gives an accurate approximation of $\Phi_{RC}(M)$ using relatively small values of $N$.
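The estimation of the empirical weight matrix can be sketched as follows. This is an illustrative sketch, not the WExT code: it uses a simple double edge swap on the bipartite gene-sample graph to generate matrices with the same row and column sums, averages them as in (6), and applies the 1/(2N) floor to entries that are never mutated. The number of swaps per draw (`swaps_per_edge`), the choice to continue the chain from the previous draw rather than restart from the observed matrix, and all function names are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple

Edge = Tuple[int, int]  # (gene index, sample index) of a mutation


def swap_permute(edges: Sequence[Edge], n_swaps: int, rng: random.Random) -> List[Edge]:
    """Attempt n_swaps double edge swaps; row and column sums are preserved."""
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        a = rng.randrange(len(edge_list))
        b = rng.randrange(len(edge_list))
        (i1, j1), (i2, j2) = edge_list[a], edge_list[b]
        if i1 == i2 or j1 == j2:
            continue  # swap would be a no-op or would change a margin
        if (i1, j2) in edge_set or (i2, j1) in edge_set:
            continue  # swap would create a duplicate mutation
        edge_set -= {(i1, j1), (i2, j2)}
        edge_set |= {(i1, j2), (i2, j1)}
        edge_list[a], edge_list[b] = (i1, j2), (i2, j1)
    return edge_list


def empirical_weights(A: Sequence[Sequence[int]], N: int,
                      swaps_per_edge: int = 10, seed: int = 0) -> List[List[float]]:
    """Average N permuted matrices; unobserved entries get the floor 1 / (2N)."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    edges = [(i, j) for i in range(m) for j in range(n) if A[i][j]]
    counts = [[0] * n for _ in range(m)]
    for _ in range(N):
        # Assumption: each draw continues the chain from the previous draw.
        edges = swap_permute(edges, swaps_per_edge * len(edges), rng)
        for i, j in edges:
            counts[i][j] += 1
    return [[counts[i][j] / N if counts[i][j] else 1.0 / (2 * N) for j in range(n)]
            for i in range(m)]
```

Because of the floor, each row of the estimated matrix sums to slightly more than the corresponding row sum whenever some entry was never mutated in the draws; the excess is at most n/(2N).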

2.5 Computing the weighted exclusivity test

Our weighted test for mutual exclusivity requires computing the tail probability in (4), which can be computationally expensive. We compute the tail probability using two different strategies: a recursive formula and a saddlepoint approximation.

2.5.1 Recursive formula for the weighted exclusivity test

We present a recursive formula for computing the tail probability in (4) exactly for sets $M$ of any size $k$. Assuming that $\{Y_i\}_{i=1}^{m}$ are mutually independent, we can write (4) as

$$\Pr(T_M \ge t_M \mid Y_M = r_M) = \frac{\Pr(T_M \ge t_M,\, Y_M = r_M)}{\prod_{i \in M} \Pr(Y_i = r_i)}. \tag{7}$$

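Without loss of generality, let $M = \{1, \ldots, k\}$. We first find the joint probability in the numerator of (7) using a recursive formula, where $\Pr(T_M \ge t_M,\, Y_M = r_M) = F(t_M, r_1, \ldots, r_k, n)$ is computed by the recurrence relation

$$F(t, x_1, \ldots, x_k, j) = \sum_{\pi \in \{0,1\}^k} \prod_{i=1}^{k} q_{ij}^{\pi_i} \, F\big(w_\pi(t),\, y_{\pi_1}(x_1), \ldots, y_{\pi_k}(x_k),\, j - 1\big), \tag{8}$$

where

$$q_{ij}^{\ell} = \begin{cases} w_{ij} & \text{if } \ell = 1, \\ 1 - w_{ij} & \text{otherwise,} \end{cases}$$

$$w_\pi(t) = \begin{cases} t - 1 & \text{if } \sum_{i=1}^{k} \pi_i = 1, \\ t & \text{otherwise,} \end{cases}$$

and

$$y_\ell(x) = \begin{cases} x - 1 & \text{if } \ell = 1, \\ x & \text{otherwise.} \end{cases}$$

Here $q_{ij}^{\ell}$ is the probability in (3) that $X_{ij} = \ell$. The base cases for (8) are

$$F(t, x_1, \ldots, x_k, j) = \begin{cases} 1, & \text{if } t = x_1 = \cdots = x_k = j = 0, \\ 0, & \text{if } \min\{t, x_1, \ldots, x_k, j\} < 0,\ t > \sum_{i=1}^{k} x_i, \text{ or } \max_{i=1}^{k} x_i > n. \end{cases} \tag{9}$$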

We then find the marginal probabilities in the denominator of (7) using dynamic programming, which is a standard method for computing the Poisson-Binomial probability mass function (Hong, 2013).
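The exact computation can be sketched with a forward dynamic program over samples. This is a minimal illustration under stated assumptions, not the WExT implementation in Python and C: it tracks the joint distribution of the number of exclusive samples and the per-gene mutation counts after each sample, sums the mass with at least t_M exclusive samples and counts equal to the observed row sums, and divides by the product of Poisson binomial probabilities as in (7). The function names and the pruning of states that exceed the row sums are choices made for the example.

```python
from itertools import product
from math import prod
from typing import Dict, List, Sequence, Tuple


def poisson_binomial_pmf(p: Sequence[float]) -> List[float]:
    """PMF of a sum of independent Bernoulli(p_j) variables by dynamic programming."""
    pmf = [1.0]
    for pj in p:
        nxt = [0.0] * (len(pmf) + 1)
        for x, mass in enumerate(pmf):
            nxt[x] += mass * (1.0 - pj)
            nxt[x + 1] += mass * pj
        pmf = nxt
    return pmf


def wr_exclusivity_exact(W: Sequence[Sequence[float]], r: Sequence[int], t_obs: int) -> float:
    """Pr(T_M >= t_obs | Y_M = r) for the k rows of W (one row per gene in M)."""
    k, n = len(W), len(W[0])
    # State: (t, x_1, ..., x_k) -> probability after processing the first j samples.
    states: Dict[Tuple[int, ...], float] = {(0,) * (k + 1): 1.0}
    for j in range(n):
        nxt: Dict[Tuple[int, ...], float] = {}
        for pi in product((0, 1), repeat=k):          # mutation pattern of sample j
            q = prod(W[i][j] if pi[i] else 1.0 - W[i][j] for i in range(k))
            excl = 1 if sum(pi) == 1 else 0
            for (t, *x), mass in states.items():
                new_x = tuple(xi + b for xi, b in zip(x, pi))
                if any(nx > ri for nx, ri in zip(new_x, r)):
                    continue                          # can no longer match the row sums
                key = (t + excl,) + new_x
                nxt[key] = nxt.get(key, 0.0) + mass * q
        states = nxt
    joint = sum(mass for (t, *x), mass in states.items()
                if t >= t_obs and tuple(x) == tuple(r))
    marginal = prod(poisson_binomial_pmf(W[i])[r[i]] for i in range(k))
    return joint / marginal


# With uniform weights r_i / n the test reduces to the R-exclusivity test.
p_uniform = wr_exclusivity_exact([[0.5] * 6, [0.5] * 6], r=[3, 3], t_obs=4)
```

The number of states grows with the product of the row sums, which is consistent with the cost of the exact computation noted in the text; the sketch is intended for small examples only.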

2.5.2 Saddlepoint approximation for the weighted exclusivity test

We derive a saddlepoint approximation (Butler, 2007) for computing the conditional tail probability in (4). This approach is inspired by Manescu and Keich (2015), who derive a saddlepoint approximation for an enrichment test for differentially expressed genes in Gene Ontology categories weighted by gene lengths. The saddlepoint approximation is specifically designed to provide a quick and accurate approximation of the tail probability. We present the key equation in (10) and provide a full derivation for k = 3 in the supplement. The saddlepoint approximation is given by

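$$\Pr(T_M \ge t_M \mid Y_M = r_M) \approx 1 - \Phi(\tilde{w}) - \phi(\tilde{w}) \left( \frac{1}{\tilde{w}} - \frac{1}{\tilde{u}} \right), \tag{10}$$

where $\Phi$ and $\phi$ are, in this setting, the cumulative distribution and density functions, respectively, of the standard normal distribution, and $\tilde{w}$ and $\tilde{u}$ are defined as follows.

Without loss of generality, let $M = \{1, \ldots, k\}$. First, for $\lambda \in \mathbb{R}^{k+1}$, let $M_{Y_M,T_M}(\lambda) = E\big[e^{\sum_{i \in M} \lambda_i Y_i + \lambda_{k+1} T_M}\big]$ be the joint moment generating function of $\{Y_i\}_{i \in M}$ and $T_M$, and let $K_{Y_M,T_M}(\lambda) = \log M_{Y_M,T_M}(\lambda)$ be the corresponding joint cumulant generating function.

Similarly, let $M_{Y_i}(\lambda) = E[e^{\lambda Y_i}]$ be the moment generating function of $Y_i$, and let $K_{Y_i}(\lambda) = \log M_{Y_i}(\lambda)$ be the corresponding cumulant generating function.

Next, let $K'_{Y_M,T_M}(\lambda)$ and $K''_{Y_M,T_M}(\lambda)$ be the gradient vector and Hessian matrix, respectively, of $K_{Y_M,T_M}(\lambda)$, and let $K'_{Y_i}(\lambda)$ and $K''_{Y_i}(\lambda)$ be the first and second derivatives, respectively, of $K_{Y_i}(\lambda)$.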

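Finally, define $\tilde{w}$ by

$$\tilde{w} = \operatorname{sgn}(\tilde{y}_{k+1}) \sqrt{2 \left[ \sum_{i \in M} K_{Y_i}(\hat{x}_i) - K_{Y_M,T_M}(\tilde{y}) + (\tilde{y} - \hat{x})^{T} \tilde{x} \right]} \tag{11}$$

and $\tilde{u}$ by

$$\tilde{u} = 2 \sinh\!\left( \frac{\tilde{y}_{k+1}}{2} \right) \sqrt{\frac{\big| K''_{Y_M,T_M}(\tilde{y}) \big|}{\prod_{i \in M} K''_{Y_i}(\hat{x}_i)}}, \tag{12}$$

where $\tilde{x} = (r_1, \ldots, r_k, t_M - \tfrac{1}{2})$ and $\tilde{y} = (\tilde{y}_1, \ldots, \tilde{y}_{k+1})$ with $\tilde{y}$ the unique solution for $K'_{Y_M,T_M}(\tilde{y}) = \tilde{x}$ and (10) undefined if $\tilde{y}_{k+1} = 0$, and $\hat{x} = (\hat{x}_1, \ldots, \hat{x}_k, 0)$ with $\hat{x}_i$ the unique solution for $K'_{Y_i}(\hat{x}_i) = r_i$.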
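For orientation, the generating functions used above have simple explicit forms under the Bernoulli model in (3). The expansion below is not quoted from the supplement; it is a worked consequence of the assumption that the mutation indicators are independent across genes and samples, so that the contributions of the n samples factorize. The indicator notation is introduced here only for illustration.

```latex
% Marginal cumulant generating function of Y_i (a Poisson binomial variable):
K_{Y_i}(\lambda) = \sum_{j=1}^{n} \log\bigl(1 - w_{ij} + w_{ij} e^{\lambda}\bigr)

% Joint cumulant generating function of (Y_1, \ldots, Y_k, T_M):
% each sample j contributes one factor, summed over its 2^k mutation patterns \pi.
K_{Y_M,T_M}(\lambda) = \sum_{j=1}^{n} \log \sum_{\pi \in \{0,1\}^k}
  \Bigl[\prod_{i=1}^{k} w_{ij}^{\pi_i} (1 - w_{ij})^{1 - \pi_i}\Bigr]
  \exp\Bigl(\sum_{i=1}^{k} \lambda_i \pi_i
            + \lambda_{k+1}\, \mathbf{1}\bigl[\textstyle\sum_{i=1}^{k} \pi_i = 1\bigr]\Bigr)
```

Each evaluation of the joint function costs on the order of n times 2^k operations, independent of the row sums, which is one way to see why the approximation scales more gently than the exact recursion.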

2.6 Searching for sets of mutually exclusive mutations

Our goal is to identify sets M of genes with significantly exclusive mutations, i.e. extremely small P-values ΦWR(M). There has been a considerable amount of work on methods for optimizing scores for mutually exclusive mutations, including Markov chain Monte Carlo methods (Leiserson et al., 2015b; Vandin et al., 2012), integer linear programs (Leiserson et al., 2013; Zhang et al., 2014), greedy algorithms (Babur et al., 2015) and others. These methods have been shown to be able to search datasets of many hundreds of genes for mutually exclusive mutations. Many of these methods can be modified to use our weighted exclusivity test to identify the most significant sets.

Since the focus of this work is on a statistical test for exclusivity, we instead enumerate and test all sets M of k genes that satisfy the following basic criteria using the R-exclusivity, RC-exclusivity and WR-exclusivity tests:

  1. The number $t_M$ of samples with mutually exclusive mutations must be larger than the number $z_M$ of samples with co-occurring mutations, i.e. $t_M > z_M$.

  2. Each gene $g_i \in M$ must have at least one exclusive mutation.
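The two criteria above can be sketched as a filter over all gene sets of a given size. The generator name `candidate_sets` and the toy matrix are hypothetical; the sketch enumerates sets by row index and does not reflect any optimizations in the WExT software.

```python
from itertools import combinations
from typing import Iterator, Sequence, Tuple


def candidate_sets(A: Sequence[Sequence[int]], k: int) -> Iterator[Tuple[Tuple[int, ...], int, int]]:
    """Yield (M, t_M, z_M) for every k-subset of rows that passes both criteria."""
    m, n = len(A), len(A[0])
    for M in combinations(range(m), k):
        hits = [sum(A[i][j] for i in M) for j in range(n)]   # mutated genes of M per sample
        t_M = sum(1 for h in hits if h == 1)
        z_M = sum(1 for h in hits if h > 1)
        # Criterion 2: every gene has a sample where it is the only mutated gene of M.
        each_exclusive = all(
            any(A[i][j] == 1 and hits[j] == 1 for j in range(n)) for i in M
        )
        if t_M > z_M and each_exclusive:                     # Criterion 1 and Criterion 2
            yield M, t_M, z_M


# Toy example: rows 0 and 3 are identical, so the pair (0, 3) is never exclusive.
A = [
    [1, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]
pairs = [M for M, _, _ in candidate_sets(A, k=2)]
```

In this toy matrix every pair passes except the pair of identical rows, which has three co-occurring samples and no exclusive ones.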

We use the Benjamini–Hochberg procedure (Benjamini and Hochberg, 1995) to control the false discovery rate (FDR). We examine the subset of genes in each dataset with a minimum mutation frequency so that we can enumerate and test all combinations of genes of a certain size in a reasonable amount of time.
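For readers unfamiliar with the procedure, the following is a minimal sketch of the Benjamini–Hochberg adjustment applied to a list of P-values. It is a generic textbook implementation, not code from WExT, and the function name and example P-values are illustrative.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P-values in the original input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])   # indices by increasing P-value
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


q = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [qi <= 0.05 for qi in q]   # sets reported at FDR 0.05
```

A set is reported at a given FDR threshold when its adjusted P-value does not exceed that threshold.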

2.7 Implementation

We implemented the recursive formula for the WR-exclusivity test in Python and C, and we implemented the saddlepoint approximation for the WR-exclusivity test in Python using the NumPy and SciPy numerical libraries. We implemented the RC-exclusivity test in Python, and we used a bipartite double edge swap algorithm (Gobbi et al., 2014; Milo et al., 2003) that has been shown empirically to sample uniformly from ΩRC. Our code, along with commands and data for reproducing the results and figures in this paper, is available as the WExT software package at http://compbio.cs.brown.edu/projects/wext.

3 Results

We compare the results of the WR-exclusivity test to both the R-exclusivity and RC-exclusivity tests on real data. In general, we can choose any weights to compute WR-exclusivity, but, in this article, we specifically consider weights to allow us to approximate the R-exclusivity and RC-exclusivity tests. We use WExT to discover mutually exclusive sets of mutations in thyroid, colorectal and endometrial cancers, restricting our analysis to mutations at the gene level.

The rest of this section is organized as follows. In Section 3.1, we describe the data used in our experiments. In Section 3.2, we compare the tail enumeration and saddlepoint approximation algorithms for computing the R-exclusivity P-values ΦR(M), and we show that the saddlepoint approximation provides a fast and accurate approximation for ΦR(M). In Section 3.3, we compare the results of the recursive and saddlepoint approximation algorithms for computing the WR-exclusivity P-values ΦWR(M) with the results of the RC-exclusivity test, and we show that ΦWR(M) is an accurate approximation of ΦRC(M) using either the recursive or saddlepoint approximation algorithms. In Section 3.4, we show that ΦWR(M) provides an accurate approximation of ΦRC(M) even with coarser estimates of the weight matrix W. Finally, in Sections 3.5 and 3.6, we present the results of the WR-exclusivity test on thyroid, colorectal and endometrial cancers.

3.1 Data

We analyzed non-synonymous single nucleotide variants (SNVs) and small insertions or deletions (indels) in 224 colorectal (COADREAD) (The Cancer Genome Atlas Research Network, 2012), 402 papillary thyroid carcinoma (THCA) (The Cancer Genome Atlas Research Network, 2014), and 248 uterine corpus endometrial carcinoma (UCEC) (The Cancer Genome Atlas Research Network et al., 2013) samples from The Cancer Genome Atlas (TCGA). We analyzed the mutations in the COADREAD and UCEC samples from the TCGA Pan-Cancer project (Weinstein et al., 2013) by downloading the mutations in Mutation Annotation Format (MAF) from Synapse. We downloaded the mutations in THCA from Firehose. We restricted our analysis to non-synonymous mutations, ignoring mutations classified as ‘Silent’, ‘Intron’, ‘3′UTR’, ‘5′UTR’, ‘IGR’, ‘lincRNA’ and ‘RNA’. We also downloaded lists of hypermutator samples for COADREAD and UCEC. We created a list of 35 hypermutator samples in COADREAD listed in The Cancer Genome Atlas Research Network (2012) in their Supplementary Table S3, and 82 hypermutator samples in UCEC listed by The Cancer Genome Atlas Research Network et al. (2013) as samples labeled ‘POLE OR MSI’ in their Supplementary Datafile S1.1. We restricted our analysis to genes mutated in at least 20, 5, and 30 samples in the COADREAD, THCA and UCEC datasets, analyzing 76, 30 and 62 genes in each dataset, respectively.

In general, COADREAD samples have the most mutated genes (median: 78.5), with COADREAD hypermutators having mutations in at least an order of magnitude more genes than non-hypermutators (median for hypermutators: 797; median for non-hypermutators: 69). THCA samples have the fewest mutated genes per sample (median: 12), with no hypermutators, while UCEC has more mutated genes per sample (median: 57.5), with UCEC hypermutators mutated in approximately an order of magnitude more genes than non-hypermutators (median for hypermutators: 355; median for non-hypermutators: 43.5). See Supplementary Figure S1.

For each dataset, we estimated the weights $W_{RC}^{N}$ using the permutation procedure described in Section 2.4 with $N = 10^3$ permutations. We show the weights for each dataset in Figure 1.

Fig. 1.


The weights $W_{RC}^{N}$ estimated by sampling $N = 10^3$ permuted matrices on the THCA, COADREAD and UCEC datasets. Samples (x-axis) are sorted by the number of mutated genes in increasing order from left to right, with hypermutators (right) separated from non-hypermutators (left) with a dashed line in COADREAD and UCEC. Genes (y-axis) are sorted by the number of mutated samples in increasing order from top to bottom

3.2 Comparison of methods for computing the R-exclusivity test on real data

First, we investigated the accuracy and speed of the saddlepoint approximation of the R-exclusivity P-value ΦR(M). We enumerated triples according to the procedure described in Section 2.6 in the THCA, COADREAD and UCEC datasets, and computed ΦR(M) exactly using the CoMEt software from Leiserson et al. (2015b) as well as approximately using the saddlepoint approximation with WR given in Section 2.4.

Supplementary Figure S2 shows a comparison of the P-values and runtimes given by the two methods, where the weights for the WR-exclusivity test are uniform across samples. On these datasets, the saddlepoint approximation is an extremely accurate approximation of the tail enumeration procedure ($\rho^2 = 0.995$). Additionally, while the median runtimes of the two algorithms are similar, the tail enumeration procedure is much slower for sets with co-occurring mutations while the saddlepoint approximation is largely unaffected. We expect the discrepancy between runtimes to grow for gene sets of larger sizes.

3.3 Comparison of methods for computing the WR-exclusivity test on real data

Next, we compared the results of methods for computing the WR-exclusivity test with weights $W_{RC}^{N}$ with the RC-exclusivity test on pairs of genes from the THCA, COADREAD, and UCEC datasets. We chose pairs instead of triples because of the prohibitive cost of computing the recursive formula for $\Phi_{WR}(M)$. We used $N = 10^4$ permutations to compute $\Phi_{RC}(M)$, and also included the tail enumeration procedure for $\Phi_R(M)$ as a control.

Table 2 shows that the results of the WR-exclusivity test, computed either with the recursive formula or the saddlepoint approximation, are strongly correlated with the RC-exclusivity test (Fig. 2b). The results of the R-exclusivity test are more weakly correlated with the RC-exclusivity test (Fig. 2a), showing that conditioning on the number of mutations in each sample changes the distribution of mutually exclusive mutations. This discrepancy remains when we restrict to gene sets $M$ with $\Phi_{RC}(M) \ge 10^{-4}$, i.e. sets of genes for which at least one of the $10^4$ permuted matrices has at least $t_M$ mutually exclusive mutations in $M$.

Table 2.

Pearson’s correlation coefficient $\rho^2$ of P-values of pairs of genes from the THCA, COADREAD and UCEC datasets using the tail enumeration R-exclusivity P-values $\Phi_R(M)$, RC-exclusivity P-values $\Phi_{RC}(M)$, and the recursive formula and saddlepoint approximations of the WR-exclusivity P-values $\Phi_{WR}(M)$ using weights $W_R$

| Pairs | $\Phi_R$ (CoMEt) | $\Phi_{WR}$ (recursive) | $\Phi_{WR}$ (saddlepoint) |
|---|---|---|---|
| All | 0.71291 | 0.99816 | 0.99481 |
| $\Phi_{RC}(M) \ge 10^{-4}$ | 0.65376 | 0.99811 | 0.99404 |

The correlations were computed for two sets of pairs of genes: P-values for 5014 pairs (all) and 4926 pairs ($\Phi_{RC}(M) \ge 10^{-4}$).

Fig. 2.


Comparison of P-values and runtimes of different tests on THCA, COADREAD and UCEC pairs. (a–b) Scatter plots comparing the RC-exclusivity test with $N = 10^4$ permutations against (a) the R-exclusivity test and (b) the WR-exclusivity test (recursive) with weights $W_{RC}^{N}$. (c) The WR-exclusivity (recursive) P-values versus the WR-exclusivity (saddlepoint) P-values with weights $W_{RC}^{N}$. (d) Boxplots of the runtimes for computing the weighted test with the recursive formula (red) and with the saddlepoint approximation (blue) for each pair of genes in the datasets

The WR-exclusivity P-values computed exactly and with the saddlepoint approximation are highly correlated (Fig. 2c), with a Pearson’s correlation coefficient of 0.996 for all P. For smaller P-values with ΦWR(M)<104 from either the recursive formula or the saddlepoint approximation, the correlation increases to 0.9999.

The runtime to compute ΦWR using the recursive formula varies widely because pairs with co-occurring mutations require more computation, but the runtime of the saddlepoint approximation is more consistent. As a result, testing all pairs with the recursive formula requires approximately 2 hours, but testing the same pairs with the saddlepoint approximation requires approximately 30 seconds. Note that the runtime does not include generating the weights WRCN, which requires several minutes.

3.4 Approximating the RC-exclusivity test with the WR-exclusivity test

We compared the saddlepoint approximation of the WR-exclusivity test to the RC-exclusivity test using gene triples from the COADREAD dataset, again using the R-exclusivity test as a control. We computed ΦRC(M) with N=10⁶ permutations. We computed the saddlepoint approximation for ΦWR(M) using WRCN with N=10³ draws from ΩRC, which is three orders of magnitude fewer than the number of permutations that we used to compute ΦRC(M). The P-values ΦR(M) and ΦRC(M) are weakly correlated in the tail (ρ²=0.67 for ΦRC(M)<0.001; see Fig. 3a). In contrast, the ΦWR(M) (saddlepoint) P-values provide an accurate approximation of the ΦRC(M) P-values. The RC-exclusivity and WR-exclusivity P-values are highly correlated in the tail (ρ²=0.948 for ΦRC(M)<0.001; see Fig. 3b). Moreover, ΦWR(M) is an accurate estimate of ΦRC(M) to within one or more digits for most triples and within an order of magnitude for all triples. Furthermore, despite the much smaller number of permutations used to generate WRCN, ΦWR(M) provides smaller P-values than ΦRC(M) with a greater number of significant predictions, and is much faster than the permutation test.
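The idea of estimating per-event, per-sample weights from a modest number of draws from the permutational distribution can be sketched as below. This is an illustrative sketch under stated assumptions: it uses a simple checkerboard-swap chain to sample binary matrices with fixed row and column sums, which is not necessarily the randomization procedure used to generate WRCN, and the function names, swap counts and toy matrix are hypothetical.

```python
import numpy as np

def checkerboard_swaps(B, n_swaps, rng):
    """Apply random checkerboard swaps in place.

    A swap flips a 2x2 submatrix [[1,0],[0,1]] <-> [[0,1],[1,0]],
    which leaves every row sum and column sum unchanged.
    """
    m, n = B.shape
    for _ in range(n_swaps):
        r1, r2 = rng.choice(m, size=2, replace=False)
        c1, c2 = rng.choice(n, size=2, replace=False)
        is_checkerboard = (
            B[r1, c1] == B[r2, c2]
            and B[r1, c2] == B[r2, c1]
            and B[r1, c1] != B[r1, c2]
        )
        if is_checkerboard:
            for r, c in ((r1, c1), (r1, c2), (r2, c1), (r2, c2)):
                B[r, c] = 1 - B[r, c]
    return B

def estimate_weights(A, n_draws=1000, swaps_per_draw=100, seed=0):
    """Estimate W[g, s] as the fraction of permuted matrices in which
    gene g is mutated in sample s (assumed weight definition)."""
    rng = np.random.default_rng(seed)
    B = np.array(A, dtype=int)
    total = np.zeros(B.shape, dtype=float)
    for _ in range(n_draws):
        checkerboard_swaps(B, swaps_per_draw, rng)
        total += B
    return total / n_draws

# Toy mutation matrix (genes x samples), for illustration only.
A = np.array([
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 1],
    [0, 0, 1, 0, 1, 0],
])
W = estimate_weights(A, n_draws=50, swaps_per_draw=100, seed=1)
```

Because every draw preserves the margins, the rows of W sum to the per-gene mutation counts and the columns sum to the per-sample counts, so samples with many mutations receive larger weights.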

Fig. 3.

Comparison of the R-exclusivity test (left, y-axis) and the saddlepoint approximation of the WR-exclusivity test (right, y-axis) to the RC-exclusivity test (x-axis) on triples from the COADREAD dataset. The saddlepoint approximation uses weights WRCN estimated from N=10³ permutations, while the RC-exclusivity P-values were computed with N=10⁶ permutations

3.5 Mutually exclusive mutations in thyroid carcinomas

We computed the WR-exclusivity P-values for all triples of genes that were each mutated in at least 5 of the 402 thyroid carcinomas in the THCA dataset. The WR-exclusivity test identifies 48 triples with significantly exclusive mutations (FDR < 0.001), while the R-exclusivity test identifies 38 triples (FDR < 0.001).
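Selecting triples at a given FDR can be sketched as follows. The sketch assumes the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995, appears in the reference list, but this section does not state which correction was applied); the function name and toy P-values are hypothetical.

```python
import numpy as np

def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the i-th smallest P-value by m / i.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest P-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m, dtype=float)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted

# Toy P-values for illustration only.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = q < 0.05
```

With real data, the input would be the vector of P-values for all tested triples and the cutoff would be 0.001.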

The top 25 ranked triples by both tests are identical, which is not surprising since THCA samples have low mutation rates compared to most cancer types [see Vogelstein et al. (2013) and Section 3.1]. In addition, the P-values for the top ranked triples are all within a few orders of magnitude, demonstrating that the two tests are very similar on this dataset.

Supplementary Table S1 shows the top triples, which include many known thyroid cancer genes. The top five triples include seven genes, five of which are well-known cancer genes with known roles in thyroid cancer (The Cancer Genome Atlas Research Network, 2014): BRAF, HRAS, NRAS, EIF1AX and ATM. The other two genes are BDP1 and TG, both of which may play a role in cancer. Woiwode et al. (2008) describe a role for BDP1 in AKT signaling, which was also noted in the TCGA thyroid publication (The Cancer Genome Atlas Research Network, 2014), although BDP1 has greater than 11 000 nucleotides in its coding sequence, so it may also accumulate many passenger mutations. TG is the thyroglobulin gene, and is used as a tumor marker in papillary thyroid carcinoma, which is the same subtype of thyroid cancer analyzed in TCGA.

3.6 Mutually exclusive mutations in colorectal cancers and endometrial carcinomas

We expected that the difference between the R-exclusivity and WR-exclusivity tests would be more pronounced on cancer types with higher and highly variable mutation rates. Thus, we computed P-values on triples of genes from colorectal cancers (COADREAD) and endometrial carcinomas (UCEC). We find that the WR-exclusivity test predicts more biologically interesting triples than the R-exclusivity test. The WR-exclusivity test identifies 5290 and 6835 triples (many of which overlap) with significantly mutually exclusive mutations (FDR < 0.001) in the 224 COADREAD and 248 UCEC samples, respectively. In contrast, the R-exclusivity test identifies 4 and 130 triples, respectively, with significantly mutually exclusive mutations (FDR < 0.001).

Compared to the R-exclusivity test results, the highest ranked triples by the WR-exclusivity test include fewer long genes that tend to accumulate random, passenger mutations—especially in samples with high mutation rates (Tables 3 and 4).

Table 3.

Five most significant triples identified by the R-exclusivity (top 5) and WR-exclusivity (bottom 5) tests on the COADREAD dataset

| ΦR rank | ΦWR rank | Triple M | ΦR(M) | ΦWR(M) | Hypermutator mutations |
|---|---|---|---|---|---|
| 1 | 2 | ACVR2A, PIK3CA, TP53 | 2.65·10⁻⁷ | 2.54·10⁻¹⁸ | 31 |
| 2 | 32 | APC, BRAF, PRDM2 | 5.44·10⁻⁷ | 2.30·10⁻¹³ | 33 |
| 2 | 33 | APC, BRAF, WDFY3 | 5.44·10⁻⁷ | 2.43·10⁻¹³ | 32 |
| 4 | 3 | ATM, PIK3CA, TP53 | 5.87·10⁻⁷ | 1.15·10⁻¹⁷ | 24 |
| 5 | 81 | APC, BRAF, FAT2 | 1.93·10⁻⁶ | 6.48·10⁻¹² | 35 |
| 6 | 1 | BRAF, KRAS, NRAS | 2.50·10⁻⁶ | 9.95·10⁻¹⁹ | 26 |
| 1 | 2 | ACVR2A, PIK3CA, TP53 | 2.65·10⁻⁷ | 2.54·10⁻¹⁸ | 31 |
| 4 | 3 | ATM, PIK3CA, TP53 | 5.87·10⁻⁷ | 1.15·10⁻¹⁷ | 24 |
| 10 | 4 | ARID1A, TGFBR2, TP53 | 5.89·10⁻⁶ | 1.76·10⁻¹⁶ | 29 |
| 9 | 5 | ABCA12, TGFBR2, TP53 | 4.29·10⁻⁶ | 1.83·10⁻¹⁶ | 28 |

Genes in bold are among 600 longest genes (at least 9560 nucleotides in coding sequence).

Table 4.

Five most significant triples identified by the R-exclusivity (top 5) and WR-exclusivity (bottom 5) tests on the UCEC dataset

| ΦR rank | ΦWR rank | Triple M | ΦR(M) | ΦWR(M) | Hypermutator mutations |
|---|---|---|---|---|---|
| 1 | 20 | CACNA1E, PTEN, TP53 | 3.11·10⁻¹² | 5.71·10⁻³⁰ | 77 |
| 2 | 21 | LAMA2, PTEN, TP53 | 4.13·10⁻¹² | 8.05·10⁻³⁰ | 77 |
| 3 | 29 | PTEN, RYR2, TP53 | 4.60·10⁻¹² | 4.85·10⁻²⁹ | 78 |
| 4 | 28 | NBEA, PTEN, TP53 | 8.40·10⁻¹² | 3.32·10⁻²⁹ | 76 |
| 5 | 39 | FAT4, PTEN, TP53 | 1.23·10⁻¹¹ | 3.00·10⁻²⁸ | 75 |
| 22 | 1 | CTNNB1, RPL22, TP53 | 2.11·10⁻¹⁰ | 1.20·10⁻⁴¹ | 47 |
| 44 | 2 | CTNNB1, KRAS, TP53 | 3.05·10⁻⁹ | 1.10·10⁻³⁷ | 48 |
| 55 | 3 | CTNNB1, MLL4, TP53 | 4.26·10⁻⁸ | 5.78·10⁻³⁶ | 42 |
| 57 | 4 | CTCF, CTNNB1, TP53 | 4.84·10⁻⁸ | 3.29·10⁻³⁵ | 43 |
| 60 | 5 | CTNNB1, RYR1, TP53 | 1.14·10⁻⁷ | 9.51·10⁻³⁵ | 40 |

Notation as in Table 3.

On COADREAD, the WR-exclusivity test identifies ten different genes in the five most significant triples (Table 3). Nine of these genes are well-known cancer genes (BRAF, KRAS, NRAS, ACVR2A, PIK3CA, TP53, ATM, TGFBR2 and ARID1A), while the 10th gene (ABCA12) is known to have an association with colorectal cancers (Hlavata et al., 2012). The R-exclusivity test results are similar (two of the top five triples identified by the WR-exclusivity test are in the top five triples identified by the R-exclusivity test), but the R-exclusivity test does not identify ARID1A, TGFBR2, KRAS or NRAS. Further, the three additional genes identified by the R-exclusivity test (APC, FAT2 and WDFY3) are all in the top 600 longest genes in the human genome (at least 9560 nucleotides in the coding sequence). While mutations in APC are well-known to play a role in colorectal cancers, there is currently little evidence for the roles of FAT2 or WDFY3 in cancer, and it is likely that these long genes have accumulated many passenger mutations, particularly in hypermutated samples. Also of note is the fact that the number of hypermutator samples that contain mutations in the top triples from the WR-exclusivity test is not appreciably different from the number of hypermutator samples that contain mutations in the top triples from the R-exclusivity test. This demonstrates that the WR-exclusivity test is not systematically excluding hypermutator samples from consideration, but rather weighting the contribution of these samples appropriately in evaluating the significance of mutual exclusivity.

On UCEC, the differences between the R-exclusivity and WR-exclusivity tests are even more pronounced. The WR-exclusivity test identifies seven genes in the top five most significant triples (Table 4). These include six genes with known roles in cancer (CTNNB1, TP53, RPL22, KRAS, CTCF and MLL4), with only one gene (RYR1) with likely spurious mutations. In contrast, the top five triples ranked by the R-exclusivity test include PTEN and TP53, two well-known cancer genes, but also five genes with no known role in cancer that all have greater than 11 000 nucleotides in their coding sequences: CACNA1E, LAMA2, RYR2, NBEA and FAT4. Further, none of the top five triples identified by the WR-exclusivity test are in the top twenty R-exclusivity triples. Finally, the R-exclusivity triples include many more mutations in hypermutator samples (ranging from mutations in 75 to 78 of the 81 hypermutators, versus 40 to 48 for the WR-exclusivity triples; Table 4). This further demonstrates how the results of the R-exclusivity test are skewed by hypermutator samples, while the WR-exclusivity test incorporates the contribution of these samples appropriately in evaluating the significance of mutual exclusivity.
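The per-triple quantities discussed above (exclusive mutations and the number of hypermutator samples hit) can be tallied with a short helper. This is a hedged sketch: the function name and toy matrix are hypothetical, and the hypermutator labels are passed in as a boolean mask because the criterion used to define hypermutators is not given in this section.

```python
import numpy as np

def summarize_gene_set(A, gene_idx, hypermutator):
    """Summarize a gene set M on a binary genes-x-samples mutation matrix A.

    Returns the number of samples with exactly one mutated gene in M,
    the number with more than one, and the number of hypermutator
    samples carrying at least one mutation in M.
    """
    sub = np.asarray(A, dtype=int)[list(gene_idx), :]
    per_sample = sub.sum(axis=0)
    hyper = np.asarray(hypermutator, dtype=bool)
    return {
        "exclusive": int((per_sample == 1).sum()),
        "co_occurring": int((per_sample > 1).sum()),
        "hypermutators_hit": int(((per_sample > 0) & hyper).sum()),
    }

# Toy example (3 genes, 4 samples; the last two samples are labeled hypermutators).
A = [[1, 0, 0, 1],
     [0, 1, 0, 1],
     [0, 0, 0, 0]]
summary = summarize_gene_set(A, gene_idx=[0, 1], hypermutator=[False, False, True, True])
```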

4 Discussion

We introduce a weighted exact test for the mutual exclusivity of mutations in cancer. We use this test to approximate the permutation test for exclusivity where the number of mutations in each event and each sample are fixed. To do so, we estimate per-event, per-sample mutation probabilities directly from the permutational distribution. We derive a recursive formula and a saddlepoint approximation of the P-value of the weighted test for event sets of any size, and we demonstrate the accuracy and efficiency of the saddlepoint approximation on genome-scale mutation datasets. Together, these contributions allow us to overcome the significant computational challenge of finding highly significant sets of mutually exclusive mutations conditioned on both the observed number of mutations per-event and per-sample.
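For intuition, the quantity that the recursive formula and the saddlepoint approximation compute can be written as a brute-force enumeration on a tiny example. This sketch is not the recursive formula: it assumes the test statistic is the number of samples with exactly one mutation in the event set, that events are independent Bernoulli variables with probabilities W, and that the P-value is the upper tail probability conditioned on the observed number of mutations per event. The function name is hypothetical, and the enumeration is exponential in the matrix size, so it is only usable for toy inputs.

```python
from itertools import combinations, product
import numpy as np

def weighted_exact_pvalue_bruteforce(A, W):
    """Tail probability of the exclusivity statistic by full enumeration.

    A: binary matrix (events x samples) restricted to the event set.
    W: matrix of the same shape with per-event, per-sample probabilities.
    """
    A = np.asarray(A, dtype=int)
    W = np.asarray(W, dtype=float)
    k, n = A.shape
    row_sums = A.sum(axis=1)
    t_obs = int((A.sum(axis=0) == 1).sum())

    def row_configs(i):
        # All ways to place row_sums[i] mutations of event i, with weights.
        configs = []
        for subset in combinations(range(n), int(row_sums[i])):
            x = np.zeros(n, dtype=int)
            x[list(subset)] = 1
            prob = float(np.prod(np.where(x == 1, W[i], 1.0 - W[i])))
            configs.append((x, prob))
        return configs

    total = 0.0
    tail = 0.0
    for combo in product(*(row_configs(i) for i in range(k))):
        prob = float(np.prod([p for _, p in combo]))
        per_sample = np.sum([x for x, _ in combo], axis=0)
        t = int((per_sample == 1).sum())
        total += prob
        if t >= t_obs:
            tail += prob
    return tail / total

# Two events, three samples, uniform weights: reduces to the unweighted case.
A = [[1, 0, 0],
     [0, 1, 0]]
W = np.full((2, 3), 0.5)
p_uniform = weighted_exact_pvalue_bruteforce(A, W)
```

In the toy case each event has one mutation, so 6 of the 9 equally likely placements put the two mutations in different samples, giving a P-value of 2/3.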

We then demonstrate the weighted test on three datasets with hundreds of samples from TCGA, including colorectal and endometrial cancers that have high variability in the number of mutations per sample. The weighted test identifies sets of mutually exclusive mutations including known cancer genes in each dataset, and its results include many fewer long genes and mutations in hypermutator samples than do the results of the generalization of Fisher’s exact test from CoMEt (Leiserson et al., 2015b).

There are several avenues for improving analyses with the weighted test. First, while we restricted our study to non-synonymous SNVs and indels, one should also test mutual exclusivity between other types of aberrations, such as copy number aberrations and gene fusions. We searched for mutually exclusive mutations by enumerating sets containing the most mutated genes, but the weighted test could easily be used in existing algorithms for optimizing mutual exclusivity scores [e.g. the MCMC from Vandin et al. (2012), the greedy approach from Babur et al. (2015)] or to search for multiple sets simultaneously [e.g. from Leiserson et al. (2015b)]. We estimated the per-event, per-sample mutation probability weights directly from the permutational distribution, but we also anticipate alternative methods for setting the weights that incorporate different event or sample attributes, such as gene length, to further reduce the number of false positives.

The weighted test may be of broader interest beyond searching for mutually exclusive mutations, both in other areas of computational biology and other disciplines. For example, statistical tests of ‘presence–absence’ matrices with fixed row and column sums are a common tool in ecology for looking at species-associations, but can be computationally prohibitive (Miklós and Podani, 2004). The weighted exact test presented here may offer a fast, alternative approach for computing the significance of associations with high accuracy.

Supplementary Material

Supplementary Data

Acknowledgments

The authors would like to acknowledge Manescu and Keich (2015) for inspiring this work, and Uri Keich (2015) for generously helping us run their code and providing comments on our manuscript.

Funding

This study was supported by the US National Institutes of Health (NIH) grants (R01HG005690, R01HG007069 and R01CA180776 to B.J.R.), NSF fellowship (GRFP DGE 0228243 to M.D.M.L.) and a Career Award at the Scientific Interface from the Burroughs Wellcome Fund, an Alfred P. Sloan Research Fellowship and an NSF CAREER Award (CCF-1053753 to B.J.R.).

References

  1. https://doi.org/10.7303/syn1710680.4.
  2. http://gdac.broadinstitute.org/runs/stddata__2016_01_28/data/THCA/20160128/gdac.broadinstitute.org_THCA.Mutation_Packager_Calls.Level_3.2016012800.0.0.tar.gz.
  3. http://www.mayomedicallaboratories.com/test-catalog/Clinical+and+Interpretive/62800.
  4. Babur Ö. et al. (2015) Systematic identification of cancer driving signaling pathways based on mutual exclusivity of genomic alterations. Genome Biol., 16, 45.
  5. Benjamini Y., Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol., 57, 289–300.
  6. Butler R.W. (2007) Saddlepoint Approximations with Applications, volume 22. Cambridge University Press, Cambridge.
  7. Ciriello G. et al. (2012) Mutual exclusivity analysis identifies oncogenic network modules. Genome Res., 22, 398–406.
  8. Constantinescu S. et al. (2015) TiMEx: a waiting time model for mutually exclusive cancer alterations. Bioinformatics (Oxford, England), 32, 968–975.
  9. Gobbi A. et al. (2014) Fast randomization of large genomic datasets while preserving alteration counts. Bioinformatics (Oxford, England), 30, i617–i623.
  10. Hanahan D., Weinberg R.A. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646–674.
  11. Hlavata I. et al. (2012) The role of ABC transporters in progression and clinical outcome of colorectal cancer. Mutagenesis, 27, 187–196.
  12. Hong Y. (2013) On computing the distribution function for the Poisson binomial distribution. Comput. Stat. Data Anal., 59, 41–51.
  13. Kim Y.A. et al. (2015) MEMCover: integrated analysis of mutual exclusivity and functional network reveals dysregulated pathways across multiple cancer types. Bioinformatics (Oxford, England), 31, i284–i292.
  14. Kim Y.A. et al. (2016) Wesme: Uncovering mutual exclusivity of cancer drivers and beyond. doi:10.1093/bioinformatics/btw242.
  15. Lawrence M.S. et al. (2013) Mutational heterogeneity in cancer and the search for new cancer-associated genes. Nature, 499, 214–218.
  16. Leiserson M.D.M. et al. (2013) Simultaneous identification of multiple driver pathways in cancer. PLoS Comput. Biol., 9, e1003054.
  17. Leiserson M.D.M. et al. (2015a) Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat. Genet., 47, 106–114.
  18. Leiserson M.D.M. et al. (2015b) CoMEt: a statistical approach to identify combinations of mutually exclusive alterations in cancer. Genome Biol., 16, 160.
  19. Ley T.J. et al. (2013) Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia. N. Engl. J. Med., 368, 2059–2074.
  20. Manescu D., Keich U. (2015) A symmetric length-aware enrichment test. J. Comput. Biol., 23, 508–525.
  21. Miklós I., Podani J. (2004) Randomization of presence-absence matrices: comments and new algorithms. Ecology, 85, 86–92.
  22. Miller C.A. et al. (2011) Discovering functional modules by identifying recurrent and mutually exclusive mutational patterns in tumors. BMC Med. Genomics, 4, 34.
  23. Milo R. et al. (2003) On the uniform generation of random graphs with prescribed degree sequences. arXiv Preprint cond-Mat/0312028.
  24. Mootha V.K. et al. (2003) Responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genet., 34, 267–273.
  25. Roberts S.A., Gordenin D.A. (2014) Hypermutation in human cancer genomes: footprints and mechanisms. Nat. Rev. Cancer, 14, 786–800.
  26. Ruffalo M. et al. (2015) Network-based integration of disparate omic data to identify “silent players” in cancer. PLoS Comput. Biol., 11, e1004595.
  27. Subramanian A. et al. (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl. Acad. Sci. USA, 102, 15545–15550.
  28. Szczurek E., Beerenwinkel N. (2014) Modeling mutual exclusivity of cancer mutations. PLoS Comput. Biol., 10, e1003503.
  29. The Cancer Genome Atlas Research Network. (2012) Comprehensive molecular characterization of human colon and rectal cancer. Nature, 487, 330–337.
  30. The Cancer Genome Atlas Research Network. (2014) Integrated genomic characterization of papillary thyroid carcinoma. Cell, 159, 676–690.
  31. The Cancer Genome Atlas Research Network, et al. (2013) Integrated genomic characterization of endometrial carcinoma. Nature, 497, 67–73.
  32. Thomas R.K. et al. (2007) High-throughput oncogene mutation profiling in human cancer. Nat. Genet., 39, 347–351.
  33. Vandin F. et al. (2011) Algorithms for detecting significantly mutated pathways in cancer. J. Comput. Biol., 18, 507–522.
  34. Vandin F. et al. (2012) De novo discovery of mutated driver pathways in cancer. Genome Res., 22, 375–385.
  35. Vogelstein B. et al. (2013) Cancer genome landscapes. Science, 339, 1546–1558.
  36. Weinstein J.N. et al. (2013) The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet., 45, 1113–1120.
  37. Wendl M.C. et al. (2011) PathScan: a tool for discerning mutational significance in groups of putative cancer genes. Bioinformatics (Oxford, England), 27, 1595–1602.
  38. Woiwode A. et al. (2008) PTEN represses RNA polymerase III-dependent transcription by targeting the TFIIIB complex. Mol. Cell. Biol., 28, 4204–4214.
  39. Yeang C.H. et al. (2008) Combinatorial patterns of somatic gene mutations in cancer. FASEB J., 22, 2605–2622.
  40. Zhang J. et al. (2014) Discovery of co-occurring driver pathways in cancer. BMC Bioinformatics, 15, 271.

Articles from Bioinformatics are provided here courtesy of Oxford University Press
