J Math Biol. 2017 Jun 12;75(6):1827–1840. doi: 10.1007/s00285-017-1117-6

Minimum triplet covers of binary phylogenetic X-trees

K T Huber 1, V Moulton 1, M Steel 2
PMCID: PMC5641367  PMID: 28608005

Abstract

Trees with labelled leaves and with all other vertices of degree three play an important role in systematic biology and other areas of classification. A classical combinatorial result ensures that such trees can be uniquely reconstructed from the distances between the leaves (when the edges are given any strictly positive lengths). Moreover, a linear number of these pairwise distance values suffices to determine both the tree and its edge lengths. A natural set of pairs of leaves is provided by any ‘triplet cover’ of the tree (based on the fact that each non-leaf vertex is the median vertex of three leaves). In this paper we describe a number of new results concerning triplet covers of minimum size. In particular, we characterize such covers in terms of an associated graph being a 2-tree. Also, we show that minimum triplet covers are ‘shellable’ and thereby provide a set of pairs for which the inter-leaf distance values will uniquely determine the underlying tree and its associated branch lengths.

Keywords: Trees, Median vertex, 2-Trees, Shellability, Reconstruction

Introduction

Trees play a central role in systematic biology, and other areas of classification, such as linguistics. It is often assumed that such a tree T has a labelled leaf set X, that all vertices have degree 1 or at least three, and that there is an assignment of a positive real-valued length to each edge of T.

A classical and important result from the 1960s and 1970s asserts that any such tree $T$ with edge lengths is uniquely determined from the induced leaf-to-leaf distances between each pair of elements of $X$. This result is the basis of widely-used methods for inferring trees from distance data, such as the popular ‘Neighbor-Joining’ algorithm (Saitou and Nei 1987). Moreover, when $T$ is binary (each non-leaf vertex has degree 3) we do not require distance values for all of the $\binom{n}{2}$ pairs from $X$ (where $n=|X|$), since just $2n-3$ carefully selected pairs of leaves suffice to determine $T$ and its edge lengths [see Guénoche et al. (2004); more recent results appear in Dress et al. (2012), motivated by the irregular distribution of genes across species in biological data].

This value of 2n-3 cannot be made any smaller, since a binary unrooted tree with n leaves has 2n-3 edges, and the inter-leaf distances are linear combinations of the corresponding 2n-3 edge lengths (so, by linear algebra, these values cannot be uniquely determined by fewer than 2n-3 equations).

There is a particularly natural way to select a subset of $\binom{X}{2}$ for $T$ when $T$ is binary. Since each non-leaf vertex is incident with three subtrees of $T$, let us (i) select a leaf from each subtree, (ii) consider the three pairs of leaves we can form from this triple, and then (iii) take the union of these sets of pairs over all non-leaf vertices of $T$. This process produces a ‘triplet cover’ of $T$ (defined more precisely below).

A triplet cover need not be of this minimum size (i.e. of size 2n-3) but in this paper we characterize when it is. Also, we show that in that case the resulting triplet cover is ‘shellable’ which implies that the inter-leaf distances defined on these pairs uniquely determine the tree and its edge lengths. These, and other results obtained along the way complement recent work into phylogenetic ‘lasso’ sets (Dress et al. 2012; Huber and Steel 2014), as well as a Hall-type characterization of the median function on trees in Dress and Steel (2009).

We begin with some definitions.

Definitions

Let $X$ be a finite set with $|X|\ge 3$. We denote elements in $\binom{X}{2}$ and $\binom{X}{3}$ also by $ab$ and $abc$, respectively, where $a,b,c\in X$ are distinct. We refer to the elements in $\binom{X}{3}$ as triples.

A (binary) phylogenetic X-tree is an unrooted tree $T=(V,E)$ which has leaf set $X$, and for which each non-leaf vertex is unlabelled and of degree three. We let $B(X)$ denote the set of binary phylogenetic X-trees (two such trees are regarded as equivalent if there is a graph isomorphism between them that maps leaf $x$ in one tree to leaf $x$ in the other tree, for all $x\in X$). In evolutionary biology, the set $X$ usually corresponds to some collection of species or taxa.

Note that a phylogenetic X-tree $T$ must contain at least one cherry $\{a,b\}$, that is, a pair of leaves $a$ and $b$ that are adjacent to the same interior vertex of $T$. Moreover, if $|X|>3$ then each tree $T\in B(X)$ has at least two cherries that are vertex disjoint from each other; if $T$ has exactly two cherries we say it is a caterpillar tree [every tree in $B(X)$ is a caterpillar when $|X|=4$ or $|X|=5$]. When $|X|=4$, we say that $T\in B(X)$ is a quartet, and if the two cherries of this tree are (say) $\{a,b\}$ and $\{c,d\}$ then we denote $T$ by $ab|cd$.

We let $\mathring{V}=\mathring{V}(T)\subseteq V$ denote the set of $|X|-2$ interior vertices of $T$. Given $x\in X$ where $|X|\ge 4$, we let $T-x$ denote the phylogenetic $(X-\{x\})$-tree which is obtained by removing the leaf $x$ (and its incident edge) from $T$ and suppressing the resulting degree 2 vertex.

Suppose that $\mathcal{T}$ is a subset of $\binom{X}{2}$, and $T=(V,E)\in B(X)$. We say that a triple in $\binom{X}{3}$ supports a vertex $v\in\mathring{V}$ in $T$ (relative to $\mathcal{T}$) if we can select leaves $a,b,c\in X$, one from each connected component of the graph obtained by removing $v$ and its incident edges from $T$, such that $ab,ac,bc\in\mathcal{T}$. We call a subset $\mathcal{T}\subseteq\binom{X}{2}$ a triplet cover for $T$ if for each vertex $v\in\mathring{V}$ there is some triple in $\binom{X}{3}$ that supports $v$ (relative to $\mathcal{T}$). Note that $X=\bigcup_{A\in\mathcal{T}}A$ holds in this case. Given a non-empty subset $\mathcal{T}\subseteq\binom{X}{2}$, we define the cover graph $\Gamma(\mathcal{T})=(X,\mathcal{T})$ (of $\mathcal{T}$) to be the graph with vertex set $X$ and edge set $\mathcal{T}$.

We illustrate these concepts in Fig. 1. For the binary phylogenetic X-tree in Fig. 1i (with $X=\{a,\ldots,e\}$) the vertex $v$ (in Fig. 1ii) is supported by the triple $bce$ (there are three other triples that support $v$). If $u$ is supported by, say, $abc$ and $w$ by $cde$ then we obtain the triplet cover

$$\mathcal{T}=\binom{\{b,c,e\}}{2}\cup\binom{\{a,b,c\}}{2}\cup\binom{\{c,d,e\}}{2}=\{ab,ac,bc,cd,ce,de,be\}.$$

The corresponding cover graph $\Gamma(\mathcal{T})$ is shown in Fig. 1iii.
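The example can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the figure is not reproduced here, so the tree of Fig. 1i is assumed to be the caterpillar with cherries {a, b} and {d, e} and middle leaf c (the only shape consistent with the three supporting triples named above). The names `TREE`, `leaf_components` and `support` are hypothetical helpers, not notation from the paper.

```python
from itertools import combinations

# Assumed encoding of Fig. 1(i): interior vertices u, v, w with their neighbours.
TREE = {
    "u": ["a", "b", "v"],
    "v": ["u", "c", "w"],
    "w": ["v", "d", "e"],
}
LEAVES = set("abcde")


def neighbours(tree, p):
    """Neighbours of p; leaves are not keys, so look them up in the lists."""
    if p in tree:
        return tree[p]
    return [q for q, adj in tree.items() if p in adj]


def leaf_components(tree, v):
    """Leaf sets of the three components left after deleting interior vertex v."""
    parts = []
    for start in tree[v]:
        seen, stack = {v, start}, [start]
        while stack:
            p = stack.pop()
            for q in neighbours(tree, p):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        parts.append({p for p in seen if p not in tree})
    return parts


def support(tree, cover, v):
    """All triples supporting v: one leaf per component, all three pairs in cover."""
    A, B, C = leaf_components(tree, v)
    return {
        frozenset((a, b, c))
        for a in A for b in B for c in C
        if all(frozenset(p) in cover for p in combinations((a, b, c), 2))
    }


def pairs_of(triple):
    """The three 2-element subsets of a triple."""
    return {frozenset(p) for p in combinations(triple, 2)}


# Union of the pairs from the triples chosen for v, u and w.
cover = pairs_of("bce") | pairs_of("abc") | pairs_of("cde")
is_cover = all(support(TREE, cover, v) for v in TREE)

print(sorted("".join(sorted(p)) for p in cover))
print(is_cover, len(cover) == 2 * len(LEAVES) - 3)
```

Under the assumed tree shape, every interior vertex has a non-empty support, and the cover has exactly 2n-3 = 7 pairs.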

Fig. 1

i A tree $T\in B(X)$ for $X=\{a,b,c,d,e\}$; ii vertex $v$ is supported by the triple $bce$ (the dashed lines show the edge-disjoint paths from $v$ to these three leaves); iii the cover graph $\Gamma(\mathcal{T})$ corresponding to the triplet cover $\mathcal{T}$ obtained by taking all pairs from the triple $bce$ that supports $v$ and from the triples $abc$ and $cde$ that support vertices $u$ and $w$, respectively. This triplet cover is minimal, and since its size is 7 ($=2n-3$ for $n=|X|$) it is also a minimum triplet cover for the tree (by Proposition 3)

Given a tree $T\in B(X)$, a triplet cover $\mathcal{T}$ for $T$ is called

  • minimal if $\mathcal{T}-\{ab\}$ is not a triplet cover for $T$, for any $ab\in\mathcal{T}$;

  • minimum if $|\mathcal{T}|\le|\mathcal{T}'|$ for every triplet cover $\mathcal{T}'$ for $T$.

These two concepts are different; there exist minimal triplet covers that are not minimum (we describe an example in the final section).

Note that it can be shown that any minimum triplet cover on X must have cardinality 2|X|-3 [by applying Theorem 1 and Proposition 1 of Dress et al. (2012)]. Moreover, there are various ways to construct triplet covers that are minimum [for example, ‘pointed covers’ (Dress et al. 2012, Theorem 7) and ‘stable triplet covers’ (Huber and Steel 2014, Theorem 1)].

Outline of main results

In this paper, we prove a structural result concerning minimum triplet covers. Namely, we prove that a set $\mathcal{T}\subseteq\binom{X}{2}$ is a minimum triplet cover for a tree $T\in B(X)$ if and only if the associated cover graph $\Gamma(\mathcal{T})=(X,\mathcal{T})$ is a 2-tree (see Theorem 1 and Sect. 5 for the definition of a 2-tree).

Using the concepts that we develop to prove this result, we also give an independent proof [that does not require the notion of phylogenetic ‘lassos’ from Dress et al. (2012)] that any minimum triplet cover on $X$ must have cardinality $2|X|-3$ (Proposition 3). As a corollary of our structural result, we also show that if $\mathcal{T}$ is a minimum triplet cover for $T$ then it is shellable for $T$ (Proposition 4).

This corollary has two important implications. First it implies [from results in Dress et al. (2012)] that if $\mathcal{T}$ is a minimum triplet cover for $T$, then $T$ (together with its edge lengths) can be uniquely reconstructed from the tree metric restricted to the pairs in $\mathcal{T}$. Note that this can also be deduced from results in Leclerc and Makarenkov (1998) that relate 2-trees and tree metrics [see also Guénoche et al. (2004)].

Second, the corollary gives an independent proof of Dress et al. (2012), Theorem 7 and Huber and Steel (2014), Theorem 1 which state that pointed triplet covers and stable triplet covers are shellable, respectively.

The support graph

In this section we introduce a graph that can be associated to a triplet cover of a tree. Properties of this graph will be used to help prove our results later on. We begin with some further definitions.

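Suppose for the following that $T=(V,E)\in B(X)$. Given a subset $\mathcal{T}\subseteq\binom{X}{2}$ and $v\in\mathring{V}$, we let $S_v(\mathcal{T})$ be the subset of $\binom{X}{3}$ which contains precisely those triples in $\binom{X}{3}$ that support $v$ (relative to $\mathcal{T}$). We call $S_v(\mathcal{T})$ the support of $v$ (relative to $\mathcal{T}$). In addition, suppose that $a,b,c\in V$ are pairwise distinct. Then we call the unique vertex of $T$ that simultaneously lies on the shortest path from $a$ to $b$, from $b$ to $c$, and from $a$ to $c$ the median of $a$, $b$, and $c$, denoted by $\mathrm{med}_T(a,b,c)$. The following observation linking medians with supports will be useful.

Lemma 1

Let $T=(V,E)\in B(X)$ and $\mathcal{T}\subseteq\binom{X}{2}$. If $abc\in S_v(\mathcal{T})$, $v\in\mathring{V}$, then $v=\mathrm{med}_T(a,b,c)$. Moreover, $\mathcal{T}$ is a triplet cover of $T$ if and only if $|S_v(\mathcal{T})|\ge 1$ for all $v\in\mathring{V}$.

Now, given a non-empty subset $\mathcal{T}\subseteq\binom{X}{2}$ and some $x\in X$, we put

$$\mathcal{T}-x=\mathcal{T}-\{xa : a\in X-\{x\}\text{ and }xa\in\mathcal{T}\}.$$

Put differently, $\mathcal{T}-x$ is the subset of $\mathcal{T}$ obtained by removing from $\mathcal{T}$ precisely those elements in $\mathcal{T}$ which contain $x$. We also define a bipartite graph $G(\mathcal{T})=(X\amalg\mathring{V},E(\mathcal{T}))$, with edge $\{x,v\}\in E(\mathcal{T})$, $x\in X$, $v\in\mathring{V}$, if $x\in A$ for all $A\in S_v(\mathcal{T})$. We call $G(\mathcal{T})$ the support graph associated to $\mathcal{T}$. For any vertex $p$ of $G(\mathcal{T})$, we let $\deg_{\mathcal{T}}(p)=\deg_{G(\mathcal{T})}(p)$ denote the degree of $p$ in $G(\mathcal{T})$. In Fig. 2ii we illustrate the support graph for the triplet cover $\mathcal{T}$ given in Fig. 1.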

Fig. 2

For the triplet cover $\mathcal{T}$ of the example from Fig. 1, reproduced in (i) with the triple supporting each interior vertex shown in parentheses, the corresponding support graph $G(\mathcal{T})$ is shown in (ii)
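The support graph of this example can be recomputed directly from the definition. The sketch below is illustrative only: it again assumes that the tree of Fig. 1i is the caterpillar with cherries {a, b} and {d, e} and middle leaf c, and the helper names (`support`, `support_graph`) are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumed encoding of the tree in Fig. 1(i) (interior vertices u, v, w).
TREE = {"u": ["a", "b", "v"], "v": ["u", "c", "w"], "w": ["v", "d", "e"]}
COVER = {frozenset(p) for p in ["ab", "ac", "bc", "cd", "ce", "de", "be"]}


def neighbours(tree, p):
    """Neighbours of p; leaves are not keys, so look them up in the lists."""
    if p in tree:
        return tree[p]
    return [q for q, adj in tree.items() if p in adj]


def leaf_components(tree, v):
    """Leaf sets of the three components left after deleting interior vertex v."""
    parts = []
    for start in tree[v]:
        seen, stack = {v, start}, [start]
        while stack:
            p = stack.pop()
            for q in neighbours(tree, p):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        parts.append({p for p in seen if p not in tree})
    return parts


def support(tree, cover, v):
    """All triples supporting v relative to the given set of pairs."""
    A, B, C = leaf_components(tree, v)
    return {
        frozenset((a, b, c))
        for a in A for b in B for c in C
        if all(frozenset(p) in cover for p in combinations((a, b, c), 2))
    }


def support_graph(tree, cover):
    """Edges (x, v): leaf x lies in every triple that supports v."""
    edges = set()
    for v in tree:
        triples = support(tree, cover, v)
        if not triples:
            continue  # only reached if `cover` is not a triplet cover
        for x in frozenset.intersection(*triples):
            edges.add((x, v))
    return edges


edges = support_graph(TREE, COVER)
degrees = Counter(x for x, _ in edges)
print(sorted(edges))
print(dict(sorted(degrees.items())))
```

In this example each interior vertex is supported by exactly one triple, so each has degree 3 in the support graph, while the leaves a and d have degree 1.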

We now list some properties of G(T).

Proposition 1

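Suppose that $\mathcal{T}$ and $\mathcal{T}'$ are triplet covers of a tree $T=(V,E)\in B(X)$, and that $x\in X$.

  1. If $v\in\mathring{V}$, then $0\le\deg_{\mathcal{T}}(v)\le 3$, and $1\le\deg_{\mathcal{T}}(x)\le|X|-2$.

  2. If $\mathcal{T}\subseteq\mathcal{T}'$, then $E(\mathcal{T}')\subseteq E(\mathcal{T})$. In particular, if there exists some $x\in X$ with $\deg_{\mathcal{T}}(x)=1$, then $\deg_{\mathcal{T}'}(x)=1$.

  3. If $\mathcal{T}$ is a minimal triplet cover for $T$, then for all $ab\in\mathcal{T}$, there exists some $v\in\mathring{V}$ such that $a,v,b$ is a path in $G(\mathcal{T})$.

  4. Suppose that $v$ is the vertex adjacent to $x$ in $T$. Then $\{v,x\}\in E(\mathcal{T})$. Furthermore $\deg_{\mathcal{T}}(x)=1$ if and only if $\{v,x\}$ is the only edge in $G(\mathcal{T})$ that contains $x$.

  5. $\deg_{\mathcal{T}}(x)=1$ if and only if $\mathcal{T}-x$ is a triplet cover of $T-x$.

  6. If $\deg_{\mathcal{T}}(x)=1$, then $|\mathcal{T}|\ge|\mathcal{T}-x|+2$.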

Proof

  1. The inequality $\deg_{\mathcal{T}}(v)\le 3$ follows immediately from the definition of the support $S_w(\mathcal{T})$ of a vertex $w\in\mathring{V}$ and the fact that $T$ is binary. The inequality $1\le\deg_{\mathcal{T}}(x)$ follows since $x\in A$ for all $A\in S_u(\mathcal{T})$ for the vertex $u$ that is adjacent to $x$ in $T$. The inequality $\deg_{\mathcal{T}}(x)\le|X|-2$ follows from the fact that $T\in B(X)$ and so has $|X|-2$ interior vertices.

  2. Suppose that $\{v,x\}\in E(\mathcal{T}')$, $x\in X$, $v\in\mathring{V}$. Then $x\in A$, for all $A\in S_v(\mathcal{T}')$. Since $S_v(\mathcal{T})\subseteq S_v(\mathcal{T}')$ as $\mathcal{T}\subseteq\mathcal{T}'$ it follows that $x\in A$ for all $A\in S_v(\mathcal{T})$. Hence, $\{v,x\}\in E(\mathcal{T})$. The second statement is a trivial consequence in light of the inequality $1\le\deg_{\mathcal{T}'}(x)$ from (P1).

  3. Suppose for contradiction that there exists some $ab\in\mathcal{T}$ such that for all $v\in\mathring{V}$, we have that $a,v,b$ is not a path in $G(\mathcal{T})$. Then for all $v\in\mathring{V}$ there must exist some $A\in S_v(\mathcal{T})$ such that $ab\not\subseteq A$. Hence, $\mathcal{T}'=\mathcal{T}-\{ab\}$ is a triplet cover of $T$. Since $\mathcal{T}'\subsetneq\mathcal{T}$ clearly holds, we obtain a contradiction in view of the minimality of $\mathcal{T}$.

  4. That $\{v,x\}\in E(\mathcal{T})$ holds is an immediate consequence of the choice of $v$. If $\deg_{\mathcal{T}}(x)=1$, then since $x\in A$ for all $A\in S_v(\mathcal{T})$, it follows that $\{x,v\}$ is in $E(\mathcal{T})$. The rest of the statement follows immediately.

  5. Suppose that $\mathcal{T}-x$ is not a triplet cover of $T-x$. Then, by Lemma 1, there exists an interior vertex $u'$ of $T-x$ such that $S_{u'}(\mathcal{T}-x)=\emptyset$. Let $u$ be the vertex in $T$ that corresponds to $u'$ in $T-x$. Then as $S_{u'}(\mathcal{T}-x)=\emptyset$, it follows that $x\in A$ for all $A\in S_u(\mathcal{T})$. Hence $\{x,u\}\in E(\mathcal{T})$ and, so, $\deg_{\mathcal{T}}(x)\ge 1$. Moreover, if $v$ is the vertex adjacent to $x$ in $T$, then $v\ne u$. By (P4), it follows that $\{x,v\}$ is also an edge in $E(\mathcal{T})$. Therefore $\deg_{\mathcal{T}}(x)>1$.

    Conversely, suppose that $\mathcal{T}-x$ is a triplet cover for $T-x$, and assume for contradiction that $\deg_{\mathcal{T}}(x)\ge 2$. Then there exist $u,v\in\mathring{V}$ distinct such that $x\in A$ for all $A\in S_u(\mathcal{T})$ and $x\in B$ for all $B\in S_v(\mathcal{T})$. Without loss of generality, we may assume that $v$ is the vertex in $T$ that is adjacent to $x$. Let $u'$ be the vertex in $T-x$ that corresponds to $u$ in $T$. Then $S_{u'}(\mathcal{T}-x)=\emptyset$ since $x\in A$ for all $A\in S_u(\mathcal{T})$. Hence $\mathcal{T}-x$ is not a triplet cover for $T-x$, a contradiction.

  6. If $v$ is the vertex in $T$ adjacent to $x$, then $S_v(\mathcal{T})\ne\emptyset$ by Lemma 1. Hence, there must be some $A\in S_v(\mathcal{T})$ with $x\in A$. But then $|\mathcal{T}-(\mathcal{T}-x)|\ge 2$.

We now show that the size of any minimal triplet cover of a tree in $B(X)$ grows at most linearly with $|X|$.

Corollary 1

Suppose that $\mathcal{T}$ is a minimal triplet cover of some $T\in B(X)$. Then

$$|\mathcal{T}|\le 3(|X|-2).$$

Proof

Put $T=(V,E)$. First we observe that if $B=(X\amalg\mathring{V},E')$ is a bipartite graph in which every vertex in $\mathring{V}$ has degree at most 3, then the number of length 2 paths in $B$ of the form $x,v,y$ with $x,y\in X$ and $v\in\mathring{V}$ is equal to

$$\sum_{v\in\mathring{V}}\bigl|\{x,v,y : x,y\in X\text{ and }x,v,y\text{ a path in }B\}\bigr|.$$

Now, by (P3), $|\mathcal{T}|$ is less than or equal to the number of length 2 paths in $G(\mathcal{T})$ of the form $x,v,y$ with $x,y\in X$ and $v\in\mathring{V}$. Since $|\mathring{V}|=|X|-2$, and each term in the above sum is at most 3, the corollary follows.

Multiplicities

In this section we derive some bounds for degrees of vertices in the cover graph of a triplet cover. Suppose that $\mathcal{T}$ is a triplet cover of $T\in B(X)$. For $x\in X$ we define the multiplicity $\mu(x)=\mu_{\mathcal{T}}(x)$ of $x$ (relative to $\mathcal{T}$) to be the number of elements in $\mathcal{T}$ that contain $x$ [or in other words, the degree of the vertex $x$ in the cover graph $\Gamma(\mathcal{T})$]. The multiplicity of $\mathcal{T}$ is $\mu(\mathcal{T})=\min\{\mu_{\mathcal{T}}(x):x\in X\}$.
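As a small illustration, the multiplicities of the seven-pair cover from the example of Fig. 1 can be tallied as follows. This is only a sketch; the variable names are hypothetical.

```python
from collections import Counter

# The triplet cover from the Fig. 1 example, written as two-letter pairs.
COVER = ["ab", "ac", "bc", "cd", "ce", "de", "be"]

# mu(x) = number of pairs containing x = degree of x in the cover graph.
mult = Counter(x for pair in COVER for x in pair)
mu_cover = min(mult.values())  # multiplicity of the cover itself

print(dict(sorted(mult.items())))
print(mu_cover, sum(mult.values()) == 2 * len(COVER))
```

The multiplicities sum to twice the number of pairs, which is the double-counting identity used in the proofs below.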

The following observation relating multiplicities with degrees will be useful later.

Lemma 2

Suppose that $\mathcal{T}$ is a triplet cover for some tree $T\in B(X)$ and $x\in X$. If $\mu(x)=2$, then $\deg_{\mathcal{T}}(x)=1$.

Proof

If $\mu(x)=2$, then $x$ can be contained in at most one element of $\bigcup_{v\in\mathring{V}}S_v(\mathcal{T})$. But $x$ must be contained in every element of $S_u(\mathcal{T})$ for $u$ the vertex in $\mathring{V}$ that is adjacent to $x$ in $T$. Hence $|S_u(\mathcal{T})|=1$, and the only edge contained in the support graph $G(\mathcal{T})$ that contains $x$ (which must exist by (P1)) is $\{x,u\}$. In particular, $\deg_{\mathcal{T}}(x)=1$.

We now derive some bounds for multiplicities of minimal and minimum triplet covers.

Proposition 2

Suppose that $T\in B(X)$.

  1. If $\mathcal{T}$ is a minimal triplet cover for $T$, then $2\le\mu(\mathcal{T})\le 5$.

  2. If $\mathcal{T}$ is a minimum triplet cover for $T$, then $2\le\mu(\mathcal{T})\le 3$.

Proof

(M1):

Suppose that $x\in X$. Let $v$ be the vertex in $T$ adjacent to $x$ in $T$. Then, as $\mathcal{T}$ is a triplet cover for $T$, by Lemma 1 there must exist some $axy\in S_v(\mathcal{T})$ where $a,y\in X-\{x\}$ are distinct. Therefore $2\le\mu(x)$ for all $x\in X$ and so $2\le\mu(\mathcal{T})$.

To see that the remaining inequality holds, we show that there is some element of $X$ that is contained in at most 5 elements of $\mathcal{T}$. We use a simple counting argument based on pairs $(x,c)$ where $x\in X$ is an element in some $c\in\mathcal{T}$. By Corollary 1, $|\mathcal{T}|\le 3(|X|-2)$ as $\mathcal{T}$ is minimal. Since each element of $\mathcal{T}$ contains 2 elements of $X$, the size of the set $R$ of pairs $(x,c)$ is at most $6(|X|-2)$. On the other hand $\sum_{x\in X}\mu(x)=|R|$. Hence, since $|X|\ge 3$, there must exist some $x\in X$ with $\mu(x)\le 5$.

(M2):

We again count pairs $(x,c)$ where $x\in c$ and $c$ is an element in $\mathcal{T}$. This is $2|\mathcal{T}|=2(2|X|-3)$ and also equal to $\sum_{x\in X}\mu(x)$. Since $2(2|X|-3)<4|X|$ and $|X|\ge 3$, there is some $x\in X$ with $\mu(x)\le 3$. That $\mu(\mathcal{T})\ge 2$ holds follows from (M1).

A lower bound

In this section, we show that a minimum triplet cover of a tree $T\in B(X)$ has size $2|X|-3$. As mentioned in the introduction, this result can also be derived by applying Theorem 1 and Proposition 1 of Dress et al. (2012). However, it is of interest to have a direct proof that is independent of results concerning tree metrics.

Proposition 3

Suppose that $\mathcal{T}$ is a triplet cover for some $T\in B(X)$. Then we have $|\mathcal{T}|\ge 2|X|-3$. Moreover this bound is tight.

Proof

We use induction on $n=|X|$. The result clearly holds for $n=3$. So, suppose that the result holds for all triplet covers of trees in $B(X)$ with $3\le|X|\le n-1$.

Suppose that $\mathcal{T}$ is a triplet cover for a tree $T$ in $B(X)$ with $|X|=n$. If there exists some $a\in X$ such that $\deg_{\mathcal{T}}(a)=1$, then by (P5) $\mathcal{T}-a$ is a triplet cover for $T-a$. Hence, by (P6) and induction, $|\mathcal{T}|\ge|\mathcal{T}-a|+2\ge 2n-3$.

So, suppose that $\deg_{\mathcal{T}}(x)\ge 2$ for all $x\in X$. Note that there must exist some $a\in X$ with $\deg_{\mathcal{T}}(a)=2$ (otherwise, $\deg_{\mathcal{T}}(x)\ge 3$ for all $x\in X$ implies that there is a vertex $v\in\mathring{V}$ with $\deg_{\mathcal{T}}(v)\ge 4$, which contradicts (P1)). Suppose that $v,u\in\mathring{V}$ are distinct with $\{a,v\},\{a,u\}$ in $E(\mathcal{T})$. Then there exist distinct elements $b,c,x,y\in X-\{a\}$ with $\{b,x\}\ne\{c,y\}$ such that $abx\in S_v(\mathcal{T})$ and $acy\in S_u(\mathcal{T})$. Put $C:=\{b,x\}\cap\{c,y\}$. Then since $\{b,x\}\ne\{c,y\}$ it follows that $|C|<2$ and so we consider the two possible cases ($|C|=1$ and $|C|=0$).

Case 1:

$|C|=1$. Without loss of generality we may assume $x=c$ and $y\ne b$. Then it is straightforward to see that without loss of generality, $v$ is adjacent to $a$ in $T$, $u$ lies on the path in $T$ between $v$ and $c$, and $T$ restricted to the set $\{a,b,c,y\}$ is the quartet $ab|cy$. Note that $by\notin\mathcal{T}$ since otherwise $bcy\in S_u(\mathcal{T})$ which contradicts $\{a,u\}\in E(\mathcal{T})$.

Consider the triplet cover $\mathcal{T}'=\mathcal{T}\cup\{by\}$ of $T$. Then $acy,bcy\in S_u(\mathcal{T}')$. Hence, since $E(\mathcal{T}')\subseteq E(\mathcal{T})$ by (P2), $\deg_{\mathcal{T}'}(a)=1$. Therefore, by (P5), $\mathcal{T}'-a$ is a triplet cover of $T-a$. But the elements $ab$, $ac$, $ay$ of $\mathcal{T}'$ are not contained in $\mathcal{T}'-a$ and, so,

$$|\mathcal{T}'-a|+3\le|\mathcal{T}'|=|\mathcal{T}|+1.$$

The fact that $|\mathcal{T}|\ge 2|X|-3$ holds now follows immediately by induction.

Case 2:

$|C|=0$. Then $x\ne c$ and $y\ne b$. Without loss of generality, we can assume that $v$ is adjacent to $a$ in $T$, and that $T$ restricted to the set $\{a,b,c,y,x\}$ is a caterpillar tree with cherry $\{a,x\}$. We consider the case where $\{y,c\}$ is also a cherry in this caterpillar tree and $u$ is adjacent to both $y$ and $c$ in $T$. The argument for the remaining case (where $\{b,y\}$ or $\{b,c\}$ is also a cherry) is similar.

First note that if $bc\in\mathcal{T}$, then $by\notin\mathcal{T}$, since otherwise $byc\in S_u(\mathcal{T})$ which would contradict $\{a,u\}\in E(\mathcal{T})$. Similarly if $cx\in\mathcal{T}$, then $yx\notin\mathcal{T}$. Hence, by symmetry, we can assume that $\mathcal{T}$ does not contain at least one element from the set $\{bc,by\}$ and at least one element from the set $\{cx,yx\}$. Now, let $P$ be a subset of $\{bc,by,cx,yx\}-\mathcal{T}$ of minimum size such that $\mathcal{T}\cup P$ contains precisely one of the sets $\{bc,by\}$ or $\{cx,yx\}$, noting that $|P|\le 2$. Consider the triplet cover $\mathcal{T}'=\mathcal{T}\cup P$ of $T$. Then it is easily seen that $\deg_{\mathcal{T}'}(a)=1$, and so by (P5) $\mathcal{T}'-a$ is a triplet cover of $T-a$. But the elements $ab$, $ac$, $ax$, $ay$ of $\mathcal{T}'$ are not contained in $\mathcal{T}'-a$ and so

$$|\mathcal{T}'-a|+4\le|\mathcal{T}'|=|\mathcal{T}|+|P|\le|\mathcal{T}|+2.$$

The fact that $|\mathcal{T}|\ge 2|X|-3$ holds now follows by induction.

The fact that the bound is tight follows since for every $T\in B(X)$ there exists some triplet cover of $T$ with cardinality $2|X|-3$ [e.g. a pointed cover; see Dress et al. (2012)].

A characterization of minimum triplet covers

In this section, we prove our main result, namely a characterization of minimum triplet covers in terms of the structure of their cover graphs. First, we recall that a graph $H=(V,E)$ is called a 2-tree if there exists an ordering $v_1,v_2,\ldots,v_m$ of $V$ such that $\{v_1,v_2\}\in E$ and, for $i=3,\ldots,m$, the vertex $v_i$ has degree 2 and belongs to a unique triangle in the subgraph induced by $H$ on the set $\{v_1,v_2,\ldots,v_i\}$ (Guénoche et al. 2004, p. 235). It is easily seen that a 2-tree has treewidth at most 2, and conversely, every graph of treewidth at most 2 is a subgraph of a 2-tree.
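The ordering in this definition can be searched for by running it backwards: repeatedly delete a vertex of degree 2 whose two neighbours are adjacent, until a single edge remains. The Python sketch below implements this peeling check. It is a standard recognition idea for 2-trees rather than a procedure taken from the paper, and the function name `is_two_tree` is hypothetical.

```python
def is_two_tree(vertices, edges):
    """Peel degree-2 vertices with adjacent neighbours until one edge is left."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    remaining = set(vertices)
    while len(remaining) > 2:
        for v in sorted(remaining):
            if len(adj[v]) == 2:
                p, q = adj[v]
                if q in adj[p]:
                    break  # v closes exactly one triangle: safe to peel
        else:
            return False  # no peelable vertex, so no valid ordering exists
        for p in adj.pop(v):
            adj[p].discard(v)
        remaining.discard(v)
    if len(remaining) != 2:
        return False
    p, q = remaining
    return q in adj[p]


# Cover graph of the seven-pair example from Fig. 1.
example = ["ab", "ac", "bc", "cd", "ce", "de", "be"]
# A 4-cycle has degree-2 vertices but no triangles.
four_cycle = ["ab", "bc", "cd", "da"]

print(is_two_tree("abcde", example))
print(is_two_tree("abcd", four_cycle))
```

Each peeling step removes one vertex and two edges, which is consistent with a 2-tree on n vertices having 2n-3 edges.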

Theorem 1

Suppose that $\mathcal{T}$ is a triplet cover for a tree $T\in B(X)$. Then $\mathcal{T}$ is a minimum triplet cover if and only if $\Gamma(\mathcal{T})$ is a 2-tree.

Proof

Put $T=(V,E)$. Suppose that $\Gamma(\mathcal{T})$ is a 2-tree. Then since 2-trees on $n$ vertices have $2n-3$ edges (Leclerc and Makarenkov 1998, p. 227) and $|X|=n$, we have $|\mathcal{T}|=2|X|-3$. So, by Proposition 3, $\mathcal{T}$ is a minimum triplet cover for $T$.

Conversely, suppose that $\mathcal{T}$ is a minimum triplet cover for some tree $T\in B(X)$. We shall prove that $\Gamma(\mathcal{T})$ is a 2-tree by induction on $n=|X|$. If $|X|=3$ or $|X|=4$ it is clearly true. Suppose the statement holds for all $X$ with $3\le|X|\le n-1$.

Let $\mathcal{T}$ be a minimum triplet cover for $T$ on $X$ with $n=|X|$. Note that, by (M2), $\mu(\mathcal{T})$ equals 2 or 3. Also, note that $\mathcal{T}$ must be a minimal triplet cover for $T$.

Suppose that $\mu(\mathcal{T})=2$. Let $x\in X$ be such that $\mu(x)=2$. Then there exist $a,b\in X-\{x\}$ with $xa,xb\in\mathcal{T}$. Consider the vertex $v\in\mathring{V}(T)$ adjacent to $x$ in $T$ (as shown in Fig. 3i). Then as $\mathcal{T}$ is a triplet cover, and $xa$, $xb$ are the only elements in $\mathcal{T}$ containing $x$, it follows that $S_v(\mathcal{T})=\{xab\}$.

Fig. 3

Figures for the proof of Theorem 1. i Leaf $x$ and the other two leaves that form the triple in $S_v(\mathcal{T})$; ii the tree $T-x$ obtained from $T$ by restricting this tree to $X-\{x\}$; iii the labelling of additional vertices in the case where $\mu(\mathcal{T})=3$. Squiggly lines denote paths in $T$

Hence, $ab\in\mathcal{T}$. It follows that $\mathcal{T}':=\mathcal{T}-\{xa,xb\}$ is a triplet cover for $T-x$ (see Fig. 3ii) and since $|\mathcal{T}|=2|X|-3$, it follows that $|\mathcal{T}'|=2(|X|-1)-3$ and so $\mathcal{T}'$ is a minimum triplet cover for $T-x$. Since $T-x$ has one fewer leaf than $T$, we can apply the induction hypothesis and conclude that $\Gamma(\mathcal{T}')$ is a 2-tree. Then, since $\Gamma(\mathcal{T})$ is obtained from $\Gamma(\mathcal{T}')$ by attaching $x$ to the endpoints of the edge $\{a,b\}$ in $\Gamma(\mathcal{T}')$, it follows that $\Gamma(\mathcal{T})$ is also a 2-tree.

Now suppose that $\mu(\mathcal{T})=3$. We shall show that this is not possible, from which the theorem follows. Let $x\in X$ be such that $\mu(x)=3$ and let $v\in\mathring{V}$ denote the vertex adjacent to $x$ in $T$. Then since $\mathcal{T}$ is a minimal triplet cover for $T$ there must exist $a,b\in X-\{x\}$ distinct such that $xab\in S_v(\mathcal{T})$. Moreover, as $\mu(x)=3$ there must exist some $c\in X-\{x,a,b\}$ with $xc\in\mathcal{T}$. Since we also have $xa,xb\in\mathcal{T}$, and since $\mathcal{T}$ is a minimum triplet cover, it follows that $bc\in\mathcal{T}$.

Without loss of generality, assume $T$ restricted to $\{x,a,b,c\}$ is the quartet $xa|bc$ (notice that we have symmetry involving $a$ and $b$, and the quartet cannot be $xc|ab$ because of the assumption that $xab\in S_v(\mathcal{T})$ where $v$ is the vertex adjacent to $x$ in $T$), as shown in Fig. 3iii. Let $w\in\mathring{V}$ be such that $w=\mathrm{med}_T(x,b,c)$.

We claim that $ac\notin\mathcal{T}$. Assume for contradiction that $ac\in\mathcal{T}$. Since $\mathcal{T}$ is minimal and $xc\in\mathcal{T}$, there exists some vertex $u\in\mathring{V}$ and some $A\in S_u(\mathcal{T})$ such that $xc\subseteq A$. Note that as $\mu(x)=3$, we must have $u\in\{v,w\}$. If $u=v$ then $\mathcal{T}-\{xc\}$ is a smaller triplet cover for $T$ (since $v$ is still supported by $abx$), and this contradicts the minimality of $\mathcal{T}$. Thus we may assume that $u=w$, in which case there is a set $A\in S_w(\mathcal{T})$ with $xc\subseteq A$. Since $\mu(x)=3$ and we already have $ax,bx,cx\in\mathcal{T}$ it follows that $A=xbc\in S_w(\mathcal{T})$ which implies that $bc\in\mathcal{T}$. However, as we already have $ab\in\mathcal{T}$, the additional assumption that $ac\in\mathcal{T}$ means that $\mathcal{T}-\{xc\}$ contains $ab$, $ac$, $bc$, which provides an alternative set, namely $abc$ in $S_w(\mathcal{T})$, in which case $\mathcal{T}-\{xc\}$ remains a triplet cover for $T$. But again this contradicts the minimality of $\mathcal{T}$. Thus, $ac\notin\mathcal{T}$, as claimed.

Therefore, in summary, $xa,xb,xc,ab,bc\in\mathcal{T}$ and $ac\notin\mathcal{T}$. We claim next that $\mathcal{T}'=(\mathcal{T}-\{xb\})\cup\{ac\}$ is a triplet cover for $T$. Indeed, if $xb$ is contained in some element in $S_u(\mathcal{T})$ for some $u\in\mathring{V}$, then since $\mu(x)=3$ we must have $u\in\{v,w\}$. Since $acx\in S_v(\mathcal{T}')$ and $abc\in S_w(\mathcal{T}')$ it follows that $\mathcal{T}'$ must be a triplet cover for $T$, as claimed.

To complete the proof, note that since $\mu_{\mathcal{T}'}(x)=2$, Lemma 2 implies $\deg_{\mathcal{T}'}(x)=1$. Hence, by (P5), $\mathcal{T}'-x=\mathcal{T}'-\{xa,xc\}$ is a triplet cover of $T-x$. Since $T-x$ has one fewer leaf than $T$ we can apply the induction hypothesis and conclude that the graph $\Gamma(\mathcal{T}'-x)=(X-\{x\},\mathcal{T}'-x)$ is a 2-tree. Since any 2-tree has at least two vertices with degree 2 (Leclerc and Makarenkov 1998, p. 227), it follows that in $\Gamma(\mathcal{T}'-x)$ at least one of the two vertices $a$ or $c$ has degree 2 (since there cannot be a vertex $y\in X-\{x,a,b,c\}$ such that the degree of $y$ in $\Gamma(\mathcal{T}'-x)$ is equal to 2 as, by assumption, $\mu(\mathcal{T})=3$). But if, without loss of generality, the degree of $a$ in $\Gamma(\mathcal{T}'-x)$ is equal to 2, then $\mu_{\mathcal{T}}(a)=2$ must hold too, which contradicts $\mu(\mathcal{T})=3$. This completes the proof.

The next result follows immediately from the last theorem and the fact that any 2-tree has at least two vertices with degree 2 [see e.g. Leclerc and Makarenkov (1998), p. 227]. It improves on the bound given in Proposition 2 (M2).

Corollary 2

If $\mathcal{T}$ is a minimum triplet cover for some tree $T\in B(X)$ then $\mu(\mathcal{T})=2$.

Note that a 2-tree is a 2d-tree, but not necessarily conversely [Guénoche et al. (2004), Proposition 3.4] (a graph $G=(V,E)$ is called a 2d-tree if there exists an ordering $x_1,x_2,\ldots,x_n$ of $V$ such that $\{x_1,x_2\}\in E$ and, for $i=3,\ldots,n$, the vertex $x_i$ has degree 2 in the subgraph of $G$ induced by $\{x_1,x_2,\ldots,x_i\}$). So Theorem 1 can be used to strengthen Theorem 1 of Huber and Steel (2014).

Shellings

Given a triplet cover $\mathcal{T}$ of a tree $T\in B(X)$, we say that $\mathcal{T}$ is $T$-shellable if there exists an ordering of the elements in $\binom{X}{2}-\mathcal{T}$, say $a_1b_1,a_2b_2,\ldots,a_mb_m$, such that for every $1\le i\le m$, there exists a pair $x_i,y_i$ of distinct elements in $X-\{a_i,b_i\}$ such that the restriction of $T$ to the set $Y_i=\{a_i,b_i,x_i,y_i\}$ is the quartet $x_ia_i|y_ib_i$, and all elements in $\binom{Y_i}{2}$ except $a_ib_i$ are contained in $\mathcal{T}_i=\mathcal{T}\cup\{a_jb_j:1\le j\le i-1\}$. If $T$ is clear from the context then we sometimes just say that $\mathcal{T}$ is shellable, and we refer to the ordering of $\binom{X}{2}-\mathcal{T}$ as a shellable ordering.

Although this combinatorial definition of shellability seems somewhat involved, its motivation rests on it being a sufficient condition for recursively determining the distances between all pairs of leaves (when the edges of $T$ are assigned arbitrary positive edge lengths) starting with just the distance values for the pairs in the triplet cover. In other words, if a triplet cover $\mathcal{T}$ of a tree $T\in B(X)$ is shellable then the pairs of elements from $X$ that are not already present in $\mathcal{T}$ can be ordered in a sequence so that the distance in $T$ between the leaves in each pair is uniquely determined from the distance values on pairs that are either (i) present as an element of $\mathcal{T}$ or (ii) appear earlier in the sequence.

For example, for the tree $T$ shown in Fig. 1i, and the triplet cover $\mathcal{T}$ consisting of the 7 pairs of elements of $X$ that form the edges of $\Gamma(\mathcal{T})$ in Fig. 1iii, there are just three pairs from $\binom{X}{2}$ that are not present in $\mathcal{T}$, namely $ad$, $ae$, $bd$. Ordering the pairs as $a_1b_1=ae$, $a_2b_2=ad$, $a_3b_3=bd$ provides a shellable ordering, since for $ae$ we can select $x_1y_1=bc\in\mathcal{T}$ and observe that $x_1a_1|y_1b_1=ba|ce$ is the quartet obtained by restricting $T$ to $\{a,b,c,e\}$, the distance between $a_1=a$ and $b_1=e$ in $T$ is determined uniquely by the five other distances involving pairs from $\{a,b,c,e\}$, and these five pairs are present in $\mathcal{T}$. Having determined the distance for $a_1b_1$ one can now use this (and the distances for pairs in $\mathcal{T}$) to compute the distance value for the pair $a_2b_2$ and, subsequently, for the pair $a_3b_3$.
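The recursion in this example can be made concrete. For a quartet $xa|yb$ with positive edge lengths, the four-point condition gives $d(a,b)+d(x,y)=d(x,b)+d(a,y)$, so the missing distance is $d(a,b)=d(x,b)+d(a,y)-d(x,y)$. The Python sketch below applies this along the shellable ordering $ae$, $ad$, $bd$. Several things in it are assumptions made for illustration: the edge lengths are invented, the tree is taken to be the caterpillar with cherries {a, b} and {d, e}, and the helper pairs chosen for $ad$ and $bd$ are one valid choice (the text only names $x_1y_1=bc$).

```python
# Hypothetical positive edge lengths on the assumed Fig. 1(i) caterpillar.
LENGTHS = {
    ("a", "u"): 1, ("b", "u"): 2, ("u", "v"): 3, ("c", "v"): 4,
    ("v", "w"): 5, ("d", "w"): 6, ("e", "w"): 7,
}
ADJ = {}
for (p, q), length in LENGTHS.items():
    ADJ.setdefault(p, {})[q] = length
    ADJ.setdefault(q, {})[p] = length


def tree_distance(s, t):
    """Path length between s and t in the tree (depth-first search)."""
    stack = [(s, None, 0)]
    while stack:
        p, parent, dist = stack.pop()
        if p == t:
            return dist
        stack.extend((q, p, dist + w) for q, w in ADJ[p].items() if q != parent)
    raise ValueError("vertices are not connected")


# Start from the distances on the seven pairs of the triplet cover only.
COVER = ["ab", "ac", "bc", "cd", "ce", "de", "be"]
known = {frozenset(p): tree_distance(*p) for p in COVER}


def d(p, q):
    """Look up an already known inter-leaf distance."""
    return known[frozenset((p, q))]


# Each entry is (a_i, b_i, x_i, y_i) with T restricted to them equal to x_i a_i | y_i b_i.
SHELLING = [("a", "e", "b", "c"), ("a", "d", "c", "e"), ("b", "d", "a", "e")]
for a, b, x, y in SHELLING:
    # Four-point condition on the quartet xa|yb.
    known[frozenset((a, b))] = d(x, b) + d(a, y) - d(x, y)

for pair in ["ae", "ad", "bd"]:
    print(pair, d(*pair), tree_distance(*pair))
```

Each step uses only distances that are either in the cover or were recovered earlier, which is exactly what the shellable ordering guarantees.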

We now gather together some facts concerning the shellability of triplet covers, including shellability of minimum triplet covers.

Proposition 4

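  1. Suppose that $T\in B(X)$, $x\in X$, and $\mathcal{T}$ is a triplet cover of $T$ such that $\mathcal{T}-x$ is a triplet cover of $T-x$. If $\mathcal{T}-x$ is $(T-x)$-shellable, then $\mathcal{T}$ is $T$-shellable.

  2. Suppose that $\mathcal{T}$, $\mathcal{T}'$ are triplet covers of some tree $T\in B(X)$ and that $\mathcal{T}\subseteq\mathcal{T}'$. If $\mathcal{T}$ is $T$-shellable, then so is $\mathcal{T}'$.

  3. If $\mathcal{T}$ is a minimum triplet cover for a tree $T\in B(X)$, then $\mathcal{T}$ is $T$-shellable.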

Proof

(S1): Put $T=(V,E)$. Suppose $x\in X$ is such that $\mathcal{T}-x$ is a triplet cover of $T-x$ which is shellable. Suppose that $v\in\mathring{V}$ is the vertex in $T$ that is adjacent to $x$ in $T$. Then there must exist $a,b\in X-\{x\}$ distinct with $xab\in S_v(\mathcal{T})$. Let $\mathcal{T}(x)=\{de\in\mathcal{T}:x\in\{d,e\}\}$ and $\overline{\mathcal{T}}(x)=\{de\in\binom{X}{2}:x\in\{d,e\}\text{ and }de\notin\mathcal{T}(x)\}$, so that $\mathcal{T}=(\mathcal{T}-x)\amalg\mathcal{T}(x)$ and

$$\binom{X}{2}-\mathcal{T}=\Bigl(\binom{X-\{x\}}{2}-(\mathcal{T}-x)\Bigr)\amalg\overline{\mathcal{T}}(x).$$

Since $\mathcal{T}-x$ is $(T-x)$-shellable, there is a shellable ordering of $\binom{X-\{x\}}{2}-(\mathcal{T}-x)$ so that all of the elements in that set can be added into $\mathcal{T}-x$ to obtain $\binom{X-\{x\}}{2}$.

To complete the shellable ordering it remains to add the elements of $\binom{X}{2}$ that contain $x$ to the ordering so far constructed. We consider two cases. First, suppose that neither $\{x,a\}$ nor $\{x,b\}$ forms a cherry of $T$. Then for all $px\in\overline{\mathcal{T}}(x)$, without loss of generality, the quartet induced by $T$ on $\{x,a,b,p\}$ is $ap|xb$. Since we have $xa$, $xb$, $ab$ (as $xab\in S_v(\mathcal{T})$) and also $ap$ and $bp$ (as we have all elements in $\binom{X-\{x\}}{2}$), it follows that we can add in $xp$ as a next element of the shellable ordering. We can repeat this adding-in process for all remaining elements in $\overline{\mathcal{T}}(x)$ (in any order) to obtain $\binom{X}{2}$. So $\mathcal{T}$ is $T$-shellable in this case.

Second, suppose without loss of generality that $\{x,a\}$ forms a cherry. Then if $px\in\overline{\mathcal{T}}(x)$, the quartet induced by $T$ on the set $\{x,a,b,p\}$ is $xa|bp$. So, using similar arguments as in the previous case, we can add in $xp$ as a next element in the shellable ordering. It follows that we can repeat this process for all remaining elements in $\overline{\mathcal{T}}(x)$ (in any order) to obtain a shellable ordering of $\binom{X}{2}-\mathcal{T}$. So $\mathcal{T}$ is $T$-shellable in this case too.

(S2): This follows immediately from the definition of shellability.

(S3): We proceed using induction on $n=|X|$. For $n=4$ the statement is clearly true. Suppose the statement is true up to and including $n-1\ge 4$.

Let $\mathcal{T}$ be a minimum triplet cover for some binary phylogenetic X-tree $T$ with $|X|=n$. By Corollary 2, $\mu(\mathcal{T})=2$. Suppose that $x\in X$ with $\mu(x)=2$. Then, by Lemma 2, $\deg_{\mathcal{T}}(x)=1$. By (P5) it follows that $\mathcal{T}-x$ is a triplet cover for $T-x$. Note that $\mathcal{T}-x$ is minimum since $|\mathcal{T}-x|=|\mathcal{T}|-2$. Thus by induction $\mathcal{T}-x$ is $(T-x)$-shellable. Therefore, $\mathcal{T}$ is $T$-shellable by (S1).

Corollary 3

For any tree $T\in B(X)$, suppose that $\mathcal{T}$ is a minimum triplet cover for $T$. Consider any assignment of strictly positive lengths to the edges of $T$, and the resulting assignment of inter-leaf distances on the pairs from $\mathcal{T}$. This function from $\mathcal{T}$ to $\mathbb{R}_{>0}$ uniquely determines $T$ and its edge lengths, since no different tree $T'\in B(X)$ can induce the same inter-leaf distances on pairs from $\mathcal{T}$ under any positive weighting of the edges of $T'$.

Proof

This follows immediately from Part (S3) of Proposition 4, combined with Theorem 6 of Dress et al. (2012).

Note that there are examples of sets $\mathcal{T}\subseteq\binom{X}{2}$ having cardinality $2|X|-3$ that determine $T$ and any set of positive edge lengths from inter-leaf distances, but which are not $T$-shellable (see Example 1).

Example 1

Put $X=\{a,b,c,d,e,f,g\}$ and let $T$ be the caterpillar tree with exactly two cherries $\{a,b\},\{f,g\}$ and intermediate leaves $c$, $d$, $e$ (as shown in Fig. 4ii). Put $\mathcal{T}=\{ab,ad,bc,be,cd,cf,de,dg,ef,fg,ag\}$. Then $\mathcal{T}$ determines $T$ and any set of positive edge lengths from inter-leaf distances, but it is not $T$-shellable (Dress et al. 2012, Example 6.2).

Fig. 4

i A phylogenetic X-tree with $X=\{a,\ldots,h\}$. The set $\mathcal{T}=\{ab,ac,bc,cd,bd,ce,de,df,ef,ah,ag,fg,fh,gh\}$ is a minimal triplet cover but not a minimum one. The set $S_v(\mathcal{T})$ associated with each interior vertex $v$ of $T$ generates the following sequence (from left-most to right-most interior vertex): $abc$, $agh$, $cbd$, $fgh$, $dec$, $edf$. ii A phylogenetic X-tree with $X=\{a,\ldots,g\}$ for which the set $\mathcal{T}=\{ab,ad,bc,be,cd,cf,de,dg,ef,fg,ag\}$ determines $T$ along with an assignment of positive edge lengths from the induced inter-leaf distances, yet $\mathcal{T}$ is not shellable

Conclusion and open problems

As mentioned earlier, there are examples of minimal triplet covers T that are not minimum. The following provides a specific example.

Example 2

Let $X=\{a,b,c,d,e,f,g,h\}$ and $T$ be the phylogenetic X-tree having cherries $\{a,b\},\{e,f\}$ and remaining leaves, starting from the cherry $\{a,b\}$, labeled in the order $g$, $c$, $h$, $d$ (see Fig. 4i). Let

$$\mathcal{T}=\{ab,ac,bc,cd,bd,ce,de,df,ef,ah,ag,fg,fh,gh\}.$$

Then $\mathcal{T}$ is a minimal triplet cover for $T$. Since $|\mathcal{T}|=14>2|X|-3=13$ it follows that $\mathcal{T}$ is not minimum.
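The claims in this example can be verified by brute force. The Python sketch below encodes the caterpillar as described in the text (interior vertex names `v1` to `v6` are hypothetical labels, numbered from the cherry {a, b}), then checks that the 14 pairs form a triplet cover and that deleting any single pair destroys this property.

```python
from itertools import combinations

# Caterpillar of Example 2: cherries {a,b} and {e,f}, middle leaves g, c, h, d.
TREE = {
    "v1": ["a", "b", "v2"], "v2": ["v1", "g", "v3"], "v3": ["v2", "c", "v4"],
    "v4": ["v3", "h", "v5"], "v5": ["v4", "d", "v6"], "v6": ["v5", "e", "f"],
}
COVER = {frozenset(p) for p in "ab ac bc cd bd ce de df ef ah ag fg fh gh".split()}
N_LEAVES = 8


def neighbours(tree, p):
    """Neighbours of p; leaves are not keys, so look them up in the lists."""
    if p in tree:
        return tree[p]
    return [q for q, adj in tree.items() if p in adj]


def leaf_components(tree, v):
    """Leaf sets of the three components left after deleting interior vertex v."""
    parts = []
    for start in tree[v]:
        seen, stack = {v, start}, [start]
        while stack:
            p = stack.pop()
            for q in neighbours(tree, p):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        parts.append({p for p in seen if p not in tree})
    return parts


def support(tree, cover, v):
    """All triples supporting v relative to the given set of pairs."""
    A, B, C = leaf_components(tree, v)
    return {
        frozenset((a, b, c))
        for a in A for b in B for c in C
        if all(frozenset(p) in cover for p in combinations((a, b, c), 2))
    }


def is_cover(tree, cover):
    """Every interior vertex has at least one supporting triple."""
    return all(support(tree, cover, v) for v in tree)


def is_minimal(tree, cover):
    """A cover from which no single pair can be removed."""
    return is_cover(tree, cover) and not any(
        is_cover(tree, cover - {p}) for p in cover
    )


print(is_cover(TREE, COVER), is_minimal(TREE, COVER))
print(len(COVER), 2 * N_LEAVES - 3)
```

In this cover every interior vertex is supported by exactly one triple, which is why no pair can be dropped even though the cover has one pair more than the minimum size.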

An interesting problem would be to investigate the structure of the cover graph for minimal triplet covers.

Our results also suggest further questions for future work.

  • (i) There are formulae for counting the number of labeled 2-trees (Moon 1969). Is there a formula for counting the number of minimum triplet covers for a given phylogenetic X-tree?

  • (ii) We have shown that minimum triplet covers are shellable. It would be interesting to see how far this result extends. For example, is every triplet cover shellable? Understanding the structure of minimal triplet covers might help to shed light on this question.

Acknowledgements

We thank the two anonymous reviewers for helpful comments, particularly Reviewer 1 for numerous helpful suggestions. MS thanks the Allan Wilson Centre for helping fund this work. VM and KTH thank the London Mathematical Society for helping to fund their visit to the University of Canterbury, Christchurch.

Contributor Information

K. T. Huber, Email: K.Huber@uea.ac.uk

V. Moulton, Email: V.Moulton@uea.ac.uk

M. Steel, Email: mike.steel@canterbury.ac.nz

References

  1. Dress A, Steel M. A Hall-type theorem for triplet systems based on medians in trees. Appl Math Lett. 2009;22:1789–1793. doi: 10.1016/j.aml.2009.07.001.
  2. Dress A, Huber KT, Steel M. ‘Lassoing’ a phylogenetic tree I: basic properties, shellings and covers. J Math Biol. 2012;65:77–105. doi: 10.1007/s00285-011-0450-4.
  3. Guénoche A, Leclerc B, Makarenkov V. On the extension of a partial metric to a tree metric. Discrete Math. 2004;276:229–248. doi: 10.1016/S0012-365X(03)00294-2.
  4. Huber KT, Steel M. Reconstructing fully-resolved trees from triplet cover distances. Electron J Comb. 2014;21(2):P2.15.
  5. Leclerc B, Makarenkov V. On some relations between 2-trees and tree metrics. Discrete Math. 1998;192:223–249. doi: 10.1016/S0012-365X(98)00073-9.
  6. Moon J. The number of labeled k-trees. J Comb Theory. 1969;6:196–199. doi: 10.1016/S0021-9800(69)80119-5.
  7. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4:406–425. doi: 10.1093/oxfordjournals.molbev.a040454.

Articles from Journal of Mathematical Biology are provided here courtesy of Springer
