2018 Mar 27;81(2):598–617. doi: 10.1007/s11538-018-0419-1

Phylogenetic Flexibility via Hall-Type Inequalities and Submodularity

Katharina T Huber 1, Vincent Moulton 1, Mike Steel 2
PMCID: PMC6342847  PMID: 29589255

Abstract

Given a collection τ of subsets of a finite set X, we say that τ is phylogenetically flexible if, for any collection R of rooted phylogenetic trees whose leaf sets comprise the collection τ, R is compatible (i.e. there is a rooted phylogenetic X-tree that displays each tree in R). We show that τ is phylogenetically flexible if and only if it satisfies a Hall-type inequality condition of being ‘slim’. Using submodularity arguments, we show that there is a polynomial-time algorithm for determining whether or not τ is slim. This ‘slim’ condition reduces to a simpler inequality in the case where all of the sets in τ have size 3, a property we call ‘thin’. Thin sets were recently shown to be equivalent to the existence of an (unrooted) tree for which the median function provides an injective mapping to its vertex set; we show here that the unrooted tree in this representation can always be chosen to be a caterpillar tree. We also characterise when a collection τ of subsets of size 2 is thin (in terms of the flexibility of total orders rather than phylogenies) and show that this holds if and only if an associated bipartite graph is a forest. The significance of our results for phylogenetics is in providing precise and efficiently verifiable conditions under which supertree methods that require consistent inputs of trees can be applied to any input trees on given subsets of species.

Keywords: Phylogenetic tree, Set systems, Partial taxon coverage, Bipartite graph, Hall’s marriage theorem, Submodularity

Introduction

In phylogenomics, biologists often encounter the following problem: Given a collection τ of different subsets of species, the corresponding phylogenetic trees—each one reconstructed from the genomic data available for the corresponding subset—cannot be consistently combined into a single phylogenetic tree for all the species. When this occurs, various heuristic and somewhat ad hoc ‘supertree’ methods (such as ‘matrix recoding with parsimony’) are often applied to provide some estimate of a parent tree (Felsenstein 2004). However, when the collection of subsets of species has sufficiently sparse overlap (in a sense we will make precise shortly), then any phylogenetic tree assignment for τ will lead to a set of trees that can be consistently combined into a parent tree. Figure 1i provides an example of this.

Fig. 1

(i) A collection $\tau = \{\{a,b,c\},\{a,b,d\},\{b,c,e\},\{d,e,f\}\}$ of four sets that is phylogenetically flexible and (ii) a collection $\tau' = \tau \cup \{\{b,d,e\}\}$ that fails to be phylogenetically flexible. In (i), all of the $3^4 = 81$ choices of rooted triples (one for each of the four leaf sets) give a set of rooted triples that is displayed by at least one rooted phylogenetic tree on the six leaves $a, b, \ldots, f$. However, in (ii) this fails; for example, the set of rooted triples $ab|c$, $bd|a$, $bc|e$, $df|e$ together with $be|d$ (for the fifth set) is not displayed by any tree on the six leaves. The set $\tau$ is thin, but $\tau'$ is not, since it has a subset (namely $\tau'$ itself) which has strictly negative excess (equal to $-1$)

In this paper, we investigate the conditions under which the existence of a consistent parent tree can be guaranteed regardless of the tree structure for each subset. Here ‘parent’ tree means that the leaf set of the tree is the union of the leaf sets of the input trees. For example, given a set of input trees, if there is a parent tree that displays each tree, then a simple, fast and well-known algorithm due to Aho et al. (1981) constructs such a tree in a canonical way. However, this method will fail to return any phylogenetic tree when presented with input trees that are incompatible (i.e. cannot be displayed by any parent tree). In this paper, we characterise when such a method will always be safe to use on any set of input trees, given the sets of taxa that form the leaf sets of those trees. Thus, we consider as input just subsets of species and develop mathematical characterisations and algorithms for this combinatorial question in the special case where each subset has a fixed (small) size. Later in the paper, we consider how the results extend to more general set systems. Our approach throughout is to reduce certain combinatorial questions in phylogenetics to the study of systems of inequalities involving linear expressions and related submodularity properties.

In the Discussion section, we mention a further biological context where the results may be relevant. Note that there are many reasons why phylogenetic trees are constructed on different subsets of species; a particularly topical one is that the genes used to estimate a given phylogeny may only be present (or have been sequenced) in a given subset of the species, and these subsets vary from gene to gene (Sanderson et al. 2011).

Our work is motivated in part by a remarkable combinatorial result by Grünewald (2012) involving unrooted binary trees. In that paper, a set $\mathcal{P}$ of binary trees having leaves labelled from some set $X$ is said to be ‘slim’ if for every non-empty subset $\mathcal{P}'$ of $\mathcal{P}$, the number of leaves appearing in at least one tree in $\mathcal{P}'$ is at least the total number of interior edges of the trees in $\mathcal{P}'$ plus 3. Theorem 1.1 of Grünewald (2012) then states that for any such slim collection $\mathcal{P}$ there is a tree with leaf set $X$ that ‘displays’ each of the trees in $\mathcal{P}$. In particular, this leads to the rather striking consequence that ‘the property of being slim only depends on the involved leaf sets of the trees and not on which phylogenetic tree is chosen for a fixed leaf set’ (Grünewald 2012, p. 324). In this paper, we explore this notion further, and by working with rooted trees (rather than unrooted ones) we are able to establish precise characterisations of the analogous ‘slim’ property.

Our work is also partly motivated by results from Dress and Steel (2009) where slim-type properties also arise in a tree-based setting, but for a quite different question involving ‘median’ vertices. To explain this, given a tree $T = (V,E)$ and a subset $S$ of $V$ of size 3, say $S = \{x,y,z\}$, consider the path in $T$ connecting $x$ and $y$, the path connecting $x$ and $z$, and the path connecting $y$ and $z$. There is a unique vertex that is shared by these three paths, the median vertex of $S$ in $T$, denoted $\mathrm{med}_T(S)$. In Dress and Steel (2009), the authors show that ‘slim’-type properties characterise when a set of triples from $X$ can be realised as providing an encoding of the interior vertices of an (unrooted) tree with leaf set $X$. (An extension of this to sets of subsets of $X$ of size greater than 3 is also described.) In this paper, we extend this result further by showing that the tree that provides this encoding can be chosen to have a particular special type of structure (a ‘caterpillar’).

The phylogenetic combinatorics of subsets of a species set is a topic that has also been explored recently in the setting of ‘phylogenetic decisiveness’ (Steel and Sanderson 2010). However, the questions that we consider here are quite different from that setting; rather than requiring a dense overlap of the species subsets in the phylogenetic decisiveness setting, here we investigate sparse overlap.

We begin with some definitions. Throughout this paper, X will denote a fixed finite set.

Thin Set Systems

Suppose $\tau$ is a non-empty subset of $\binom{X}{r}$, $r \ge 2$. Let $L(\tau) = \bigcup_{s \in \tau} s$ (i.e. the set of elements of $X$ that appear in at least one set in $\tau$) and define the excess of $\tau$, denoted $\mathrm{exc}(\tau)$, by:

exc(τ)=|L(τ)|-|τ|-(r-1).

We say that $\tau$ is thin if, for all non-empty subsets $\tau'$ of $\tau$, we have:

$$\mathrm{exc}(\tau') \ge 0.$$

This notion appears in related but slightly different settings, namely for the leaf sets of unrooted trees in Grünewald (2012), in the median representation of sets of triples in Dress and Steel (2009), and as sparse triplet covers in Grünewald et al. (2017).

In the following lemma, recall that a collection of (not necessarily distinct) sets $\{B_1, B_2, \ldots, B_m\}$ has a system of distinct representatives if one can select an element $x_i \in B_i$ for each $i \in \{1, \ldots, m\}$ so that the elements $x_1, x_2, \ldots, x_m$ are all distinct. For $\tau$ a non-empty subset of $\binom{X}{r}$, $r \ge 2$, with $L(\tau) = X$, and for $x \in X$, let $n_\tau(x)$ be the number of elements in $\tau$ that contain $x$.

Lemma 1

Let $\tau$ be a non-empty subset of $\binom{X}{r}$, $r \ge 2$, with $L(\tau) = X$. If $\tau$ is thin, then the following properties hold:

  • (i)

    $|\tau| \le n - r + 1$ where $n = |X|$.

  • (ii)

    For some $x \in X$, $n_\tau(x) \le r - 1$.

  • (iii)

    For any subset $B$ of $X$ of size $r - 1$, the collection of sets $\{S - B : S \in \tau\}$ has a system of distinct representatives.

Proof

Part (i) follows from the defining condition for thin upon taking $\tau' = \tau$.

Part (ii) can be established by the following double-counting argument. Suppose that there is no element $x \in X$ with $n_\tau(x) \le r - 2$, so that $n_\tau(x) \ge r - 1$ for all $x \in X$. Let $\Omega = \{(x,S) : x \in S \in \tau\}$. We then have:

$$|\Omega| = \sum_{x \in X} n_\tau(x) \ge (r-1)k + r(n-k), \tag{1}$$

where $k = |\{x \in X : n_\tau(x) = r - 1\}|$. On the other hand:

$$|\Omega| = r|\tau| \le r(n - (r-1)), \tag{2}$$

where the inequality is from Part (i). Combining (1) and (2) gives $k \ge r(r-1)$ and, so, $k \ge 2$. By the definition of $k$, (ii) follows.

For Part (iii), consider the union of any $l$ sets $A_1, A_2, \ldots, A_l$ where $A_i = S_i - B$ and $S_i \in \tau$ for $i = 1, \ldots, l$. (Note that these sets may have different sizes and a set may occur more than once.) Since $\tau$ is thin, $|\bigcup_{i=1}^{l} S_i| \ge l + (r-1)$, and so, since $B$ has size $r - 1$, $|\bigcup_{i=1}^{l} A_i| \ge |\bigcup_{i=1}^{l} S_i| - (r-1) \ge l$. Since the inequality $|\bigcup_{i=1}^{l} A_i| \ge l$ holds for all $1 \le l \le |\tau|$, Hall’s marriage theorem (Hall 1935) ensures that the collection $\{S - B : S \in \tau\}$ has a system of distinct representatives.

For the first part of this paper, we will deal with the case where $r = 3$. However, the main theorem in this setting (Theorem 1) will be used in Sect. 4 to derive a result for the more general case where the sets have different sizes. When $r = 3$, notice that if $|\tau'| = 1$, then $\mathrm{exc}(\tau') = 3 - 1 - 2 = 0$, and if $|\tau'| = 2$, then $\mathrm{exc}(\tau') \ge 4 - 2 - 2 = 0$; so it suffices, in the definition of thin, to consider subsets $\tau'$ of $\tau$ of size at least 3.

A simple way to generate a thin set is to take any ordered sequence of subsets of $X$ of size 3 with the property that each member contains at least one element of $X$ that is not present in any earlier member of the sequence. However, not all thin sets can be obtained in this way. For example, consider the collection $\{\{a,b,c\},\{c,d,e\},\{b,e,f\},\{a,d,f\}\}$ of four subsets of $X = \{a, b, \ldots, f\}$. This collection of subsets is thin, yet these four sets cannot be ordered so as to satisfy the property described.
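The definitions above can be checked directly on small examples. The following is a minimal brute-force sketch, not part of the original paper: the function names `excess`, `is_thin` and `can_be_ordered` are hypothetical, sets are written as strings of single-letter labels, and the enumeration over all sub-collections is exponential, so it is only meant for collections of the size shown in Fig. 1.

```python
from itertools import combinations

def excess(subsets, r):
    """exc = |L| - (number of sets) - (r - 1) for a non-empty collection of r-sets."""
    leaves = set().union(*subsets)
    return len(leaves) - len(subsets) - (r - 1)

def is_thin(tau, r):
    """Brute force: every non-empty sub-collection must have excess >= 0."""
    tau = [frozenset(s) for s in tau]
    return all(
        excess(sub, r) >= 0
        for size in range(1, len(tau) + 1)
        for sub in combinations(tau, size)
    )

def can_be_ordered(tau):
    """Is there an ordering in which each set has an element absent from all
    earlier sets? Peel from the back: the last set must contain an element
    that lies in no other remaining set."""
    remaining = [frozenset(s) for s in tau]
    while remaining:
        for s in remaining:
            others = set().union(*(t for t in remaining if t is not s))
            if s - others:
                remaining.remove(s)
                break
        else:
            return False  # no set has a private element
    return True

example = ["abc", "cde", "bef", "adf"]
print(is_thin(example, 3), can_be_ordered(example))
```

On the four-set example from the text, the sketch reports that the collection is thin although no ordering of the required kind exists.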

Phylogenetic Trees and Flexible Sets

Following Semple and Steel (2003), a rooted phylogenetic tree $T$ is a rooted tree having a set $L(T)$ of labelled leaves (vertices of out-degree 0) and for which every non-leaf vertex is unlabelled and has out-degree at least 2. We let $\rho_T$, or more briefly $\rho$, denote the root vertex of $T$, which has in-degree 0. In case each non-leaf vertex has out-degree exactly 2, we say that $T$ is binary. If $L(T) = X$, we will also say that $T$ is a rooted phylogenetic $X$-tree. We refer to the non-leaf vertices of $T$ as its interior vertices. Similarly, an unrooted phylogenetic tree $T$ is an unrooted tree having a set $L(T)$ of labelled leaves (vertices of degree 1) and for which every non-leaf vertex is unlabelled and has degree at least 3. In case each non-leaf vertex has degree exactly 3, we say that $T$ is binary. If $L(T) = X$, we will also say that $T$ is an unrooted phylogenetic $X$-tree.

A rooted triple is a rooted binary phylogenetic tree on three leaves, and we denote such a tree as ab|c if it has leaf set {a,b,c} with leaf c adjacent to the root. A rooted phylogenetic X-tree T is said to display the rooted triple ab|c if some subdivision of the tree ab|c is a subgraph of T.

A cherry in a (rooted or unrooted) phylogenetic tree is a pair of leaves that is adjacent to the same vertex. A rooted (respectively, unrooted) caterpillar tree on X is a rooted (resp. unrooted) binary phylogenetic X-tree for which the number of cherries is at most 1 (respectively, 2).

These notions are illustrated in Fig. 2.

Fig. 2

(i) A rooted phylogenetic tree on leaf set $\{a, b, c, \ldots, f\}$. This tree is not binary, as it has a vertex of out-degree 3 (adjacent to $a$ and $d$). (ii) The rooted triple $ab|c$, for which $a$ and $b$ form a cherry. This rooted triple is also a rooted caterpillar and it is displayed by the tree in (i)

A set $R$ of rooted triples chosen from $X$ is said to be compatible if there is a rooted phylogenetic $X$-tree $T$ that displays each rooted triple in $R$ (in which case, we say that $T$ displays $R$). Note that if $R$ is compatible, then $T$ can always be chosen to be a binary tree and $R$ can contain at most one tree for any triplet (i.e. at most one of $ab|c$, $ac|b$, and $bc|a$ can be present in $R$).

Suppose that we have a set $R$ of rooted triples with leaves chosen from $X$. We will let $\|R\|$ denote the subset of $\binom{X}{3}$ consisting of the leaf sets of the trees in $R$. We say that a non-empty subset $\tau$ of $\binom{X}{3}$ is phylogenetically flexible if every set $R$ of rooted triples for which $\|R\| = \tau$ holds is compatible. An example to illustrate this notion is provided in Fig. 1.

The following observation that phylogenetic flexibility is hereditary is straightforward to check.

Lemma 2

Suppose $\tau$ is a non-empty subset of $\binom{X}{3}$ that is phylogenetically flexible. If $\tau'$ is a non-empty subset of $\tau$, then $\tau'$ is phylogenetically flexible.

Characterisation Result

We can now state our first main result.

Theorem 1

Suppose that $\tau$ is a non-empty subset of $\binom{X}{3}$. Then $\tau$ is phylogenetically flexible if and only if $\tau$ is thin.

The ‘if’ direction of Theorem 1 can be established by applying Theorem 1.1 of Grünewald (2012); however, we give a shorter and more direct proof of this direction here (as well as establishing the converse). We begin with some preliminary results, which are required for the argument.

Given a rooted phylogenetic tree $T = (V,E)$ with leaf set $X$ and every non-root interior vertex having degree three, we say that a rooted triple $xy|z$ supports a vertex $v$ in $T$ if $xy|z$ is displayed by $T$ and $v = \mathrm{lca}_T(x,y)$.

For a set $R$ of rooted triples on $X$, put $L(R) = \bigcup_{t \in R} L(t)$. Furthermore, for a non-empty subset $S$ of $X$, let $[R, S]$ be the graph with vertex set $S$ and with an edge $\{a,b\}$ if and only if there exists a rooted triple $ab|c \in R$ for at least one element $c \in S$. By Bryant and Steel (1995, Theorem 2), $R$ is compatible if and only if the graph $[R, S]$ is disconnected for all subsets $S$ of $X$ of size at least 2.
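The Bryant and Steel criterion quoted above lends itself to a direct, if exponential, test on small inputs. The sketch below is an illustration under stated assumptions rather than the algorithm of Aho et al. (1981): a rooted triple $ab|c$ is encoded as the hypothetical tuple `(a, b, c)`, the helper names are invented for this example, and every subset $S$ of the leaves is enumerated explicitly.

```python
from itertools import combinations

def num_components(triples, S):
    """Number of connected components of [R, S], computed with union-find.
    A triple (a, b, c) stands for ab|c and contributes the edge {a, b}
    when all three of its leaves lie in S."""
    parent = {v: v for v in S}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for a, b, c in triples:
        if a in S and b in S and c in S:
            parent[find(a)] = find(b)
    return len({find(v) for v in S})

def is_compatible(triples):
    """R is compatible iff [R, S] is disconnected for all S with |S| >= 2."""
    leaves = sorted({x for t in triples for x in t})
    for size in range(2, len(leaves) + 1):
        for S in combinations(leaves, size):
            if num_components(triples, set(S)) == 1:
                return False
    return True

# The rooted triples named in the caption of Fig. 1.
fig1 = [("a", "b", "c"), ("b", "d", "a"), ("b", "c", "e"), ("d", "f", "e")]
print(is_compatible(fig1), is_compatible(fig1 + [("b", "e", "d")]))
```

Adding $be|d$ to the four triples makes $[R, X]$ connected, which is why the second call reports incompatibility.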

Lemma 3

Suppose that $T$ is a rooted binary phylogenetic $X$-tree, that $R$ is a set of rooted triples with $L(R) = X$, and that each rooted triple supports a unique (interior non-root) vertex in $T$. Then, the graph $[R, X]$ has precisely two connected components.

Proof

For each non-root interior vertex $v$ of $T$, let $X_v$ be the leaf set of the rooted subtree of $T$ with root $v$. We claim that for every such $v$, the graph induced by $[R, X]$ on $X_v$ is connected. The lemma then follows immediately by considering the graphs induced by $[R, X]$ on $X_u$ and $X_w$, for $u$ and $w$ the children of the root of $T$.

To prove the claim, for $u$ (a child of the root $\rho_T$ of $T$), we consider the following set:

$$\mathcal{X}_u = \{X_v : v \text{ is an internal vertex of } T \text{ below or equal to } u\},$$

where $v$ is said to be below $u$ if $u$ lies on the path from $\rho_T$ to $v$. Note that since $|X| \ge 3$, there must exist a child $u$ of $\rho_T$ such that $\mathcal{X}_u \neq \emptyset$, and also there exists some vertex $v \in V(T)$ below or equal to $u$ such that $|X_v| \ge 2$. We use induction on $|X_v|$ for $X_v$ in $\mathcal{X}_u$. If $|X_v| = 2$, then both children of $v$ are leaves and the claim holds because if $X_v = \{p,q\}$, then, by assumption, there exists a rooted triple in $R$ of the form $r|pq$ for some $r \in X - \{p,q\}$ that supports $v$. Hence, there is an edge $\{p,q\}$ in $[R, X]$, and therefore, the graph induced by $[R, X]$ on $X_v$ is connected.

Now suppose that $v$ is an internal vertex of $T$ below or equal to $u$ such that $|X_v| \ge 3$. Then at least one of the two children $v_1$ and $v_2$ of $v$ is not a leaf of $T$. Without loss of generality, we may assume that $v_1$ is that child. Therefore, $2 \le |X_{v_1}| < |X_v|$ and so, by induction, the graph induced by $[R, X]$ on $X_{v_1}$ is connected. If $v_2$ is not a leaf of $T$, then the same arguments as before imply that the graph induced by $[R, X]$ on $X_{v_2}$ is also connected. If $v_2$ is a leaf of $T$, then the graph induced by $[R, X]$ on $X_{v_2}$ is a single vertex and therefore is (trivially) connected. Since, by assumption, there exists a rooted triple in $R$ that supports $v$, there is an edge $\{y,z\}$ in $[R, X]$ with $y \in X_{v_1}$ and $z \in X_{v_2}$. Hence, the graph induced by $[R, X]$ on $X_v$ is connected.

Proof of Theorem 1

We first establish the ‘if’ direction. Suppose that τ is thin, and let R be a set of rooted triples with leaves chosen from X with ||R||=τ. We show that any such choice of R is compatible.

We will establish the compatibility of $R$ via the aforementioned characterisation that $R$ is compatible if and only if $[R, S]$ is disconnected for all subsets $S$ of $X$ of size at least 2. To that end, let $S$ be a subset of $X$ of size at least two.

Notice that $[R, S] = [R|_S, S]$, where $R|_S$ is the subset of those rooted triples in $R$ that have all three of their leaves in $S$. If $R|_S$ is empty, then $[R, S]$ has no edges and is disconnected since $|S| \ge 2$; so assume otherwise and let $\tau' = \|R|_S\|$. Since $\tau$ is thin, we have $\mathrm{exc}(\tau') \ge 0$, in other words:

$$|L(\tau')| - |\tau'| \ge 2. \tag{3}$$

Now (i) the number of vertices of $[R, S]$ is $|S|$ and $|S| \ge |L(\tau')|$; (ii) the number of edges of $[R, S]$ is at most $\big|R|_S\big| = |\tau'|$. Thus, by Inequality (3), the number of vertices of $[R, S]$ minus the number of edges of this graph is at least 2. But any finite graph with this property must be disconnected. Since this holds for all subsets $S$ of $X$ of size at least two, it follows that $R$ is compatible.

We turn now to the ‘only if’ direction.

We use induction on $|\tau|$. If $|\tau| = 1$, then $\tau$ is clearly thin. So, suppose the ‘only if’ direction holds for all $\tau' \subseteq \binom{X}{3}$ with $1 \le |\tau'| < m$, for some $m \ge 2$, and let $\tau \subseteq \binom{X}{3}$ be such that $|\tau| = m$. Without loss of generality, we may assume that $X = L(\tau)$.

Suppose that $\tau'$ is a non-empty proper subset of $\tau$. By Lemma 2, $\tau'$ is phylogenetically flexible. Hence by induction, $\tau'$ is thin. Thus, $|L(\tau')| \ge |\tau'| + 2$. To show that $\tau$ is thin, it therefore suffices to prove that $|L(\tau)| \ge |\tau| + 2$.

Suppose for the purposes of obtaining a contradiction that $|L(\tau)| < |\tau| + 2$. Let $\{x,y,z\} \in \tau$ and set $\tau' = \tau - \{\{x,y,z\}\}$. Then, as $\tau'$ is thin by induction,

$$|\tau| + 2 > |L(\tau)| = |L(\tau')| + (3 - |L(\tau') \cap \{x,y,z\}|) \ge |\tau| + 4 - |L(\tau') \cap \{x,y,z\}|. \tag{4}$$

Hence $|L(\tau') \cap \{x,y,z\}| > 2$ and, so, $\{x,y,z\} \subseteq L(\tau')$. Thus, $L(\tau') = X$.

Now, since $\tau'$ is thin, there exists an (unrooted) phylogenetic tree $T = (V,E)$ with leaf set $X$, and all interior vertices of degree 3, for which the map $\mathrm{med}_T$ from $\tau'$ to the set of interior vertices of $T$ is one to one (Dress and Steel 2009) (see also Sect. 3). We claim that the map $\mathrm{med}_T$ must in fact be bijective. Suppose that this is not the case. Then, there exists some interior vertex $v$ of $T$ such that $\mathrm{med}_T(s) \neq v$ for all $s \in \tau'$. Hence, the number of interior vertices of $T$, which equals $|X| - 2$, exceeds $|\tau'| = |\tau| - 1$ and, so, $|X| - 1 > |\tau|$. But then $|X| + 1 > |\tau| + 2 > |L(\tau)| = |X|$, which is impossible as $|\tau| + 2$ is an integer. Hence, $\mathrm{med}_T$ is a bijection as claimed.

Now, root the tree $T$ by inserting a root vertex $\rho$ into an edge which separates $x$ and $y$ from $z$ when the edge is removed from $T$. Let $R$ be a set of rooted triples induced by the map $\mathrm{med}_T$ (for each element $\{a,b,c\}$ in $\tau'$, $\mathrm{med}_T$ maps $\{a,b,c\}$ to some interior vertex $v$, so that we get a rooted triple with leaf set $\{a,b,c\}$ which supports $v$ in the rooted version of $T$) with $\|R\| = \tau'$ and $L(R) = X$. Since $\mathrm{med}_T$ is a bijection, $R$ satisfies the conditions of Lemma 3 for the rooted version of $T$. Hence, the graph $[R, X]$ has two connected components, one that contains $x$ and $y$ in its vertex set and the other that contains $z$.

Now consider the set of rooted triples $R' = R \cup \{y|zx\}$. Then $L(R') = X$, $[R', X]$ is connected and so $R'$ is not compatible, and $\|R'\| = \tau$. But this is impossible, since $\tau$ is phylogenetically flexible.
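As a sanity check of Theorem 1 on the two collections of Fig. 1, one can enumerate every assignment of rooted triples and compare the outcome with the thin condition. The following sketch is illustrative only: the function names are hypothetical, a tuple `(a, b, c)` stands for the rooted triple $ab|c$, compatibility is tested with the Bryant and Steel graph criterion rather than with the algorithm of Aho et al. (1981), and the search is exponential in both $|\tau|$ and $|X|$.

```python
from itertools import combinations, product

def is_thin(tau):
    """Every non-empty sub-collection tau' of 3-sets has |L(tau')| - |tau'| - 2 >= 0."""
    tau = [frozenset(s) for s in tau]
    for size in range(1, len(tau) + 1):
        for sub in combinations(tau, size):
            if len(frozenset().union(*sub)) - size - 2 < 0:
                return False
    return True

def is_connected(vertices, edges):
    """Depth-first search connectivity test on a small undirected graph."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for a, b in edges:
            for p, q in ((a, b), (b, a)):
                if p == v and q not in seen:
                    seen.add(q)
                    stack.append(q)
    return seen == vertices

def is_compatible(triples):
    """[R, S] must be disconnected for every S with |S| >= 2."""
    leaves = sorted({x for t in triples for x in t})
    for size in range(2, len(leaves) + 1):
        for S in map(set, combinations(leaves, size)):
            edges = [(a, b) for a, b, c in triples if {a, b, c} <= S]
            if is_connected(S, edges):
                return False
    return True

def is_flexible(tau):
    """Try all 3^|tau| ways of choosing one rooted triple per leaf set."""
    options = []
    for s in tau:
        a, b, c = sorted(s)
        options.append([(a, b, c), (a, c, b), (b, c, a)])
    return all(is_compatible(R) for R in product(*options))

fig1_i = ["abc", "abd", "bce", "def"]
fig1_ii = fig1_i + ["bde"]
for tau in (fig1_i, fig1_ii):
    print(is_thin(tau), is_flexible(tau))
```

For both collections the two tests agree, as Theorem 1 predicts.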

The following corollary of Theorem 1 is now immediate from Lemma 1(i).

Corollary 1

If a non-empty subset $\tau$ of $\binom{X}{3}$ is phylogenetically flexible, then $|\tau| \le n - 2$ where $n = |X|$.

We end this section by considering how many trees can display a set of rooted triples R when ||R|| is phylogenetically flexible. It might be suspected that since the overlap between the leaf sets of the trees in R is sparse, the number of trees displaying R would need to be large. Indeed, this is sometimes the case; for example, suppose that the leaf sets in R are all disjoint, so the total number of leaves is given by n=3k, where k=|R|. In this case, the number N of rooted binary trees on n leaves that display R is given by:

$$N = \frac{(2n-3)!!}{3^{n/3}}, \tag{5}$$

which grows exponentially with $n$. The proof of Eq. (5) is to observe that each of the $3^k$ ways to select a rooted triple from the $k$ triples in $\|R\|$ provides a set of rooted triples that is displayed by at least one rooted phylogenetic tree [by the algorithm from Aho et al. (1981)] and hence by at least one rooted binary tree, and these rooted binary trees are pairwise distinct, since any two of them display a different rooted triple for at least one triple in $\|R\|$.
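To get a feel for the size of $N$ in Eq. (5), the formula can be evaluated for small $k$. This is a minimal sketch with hypothetical function names; it uses the standard count of $(2n-3)!!$ rooted binary phylogenetic trees on $n$ leaves and assumes $n = 3k$ as in the text.

```python
def odd_double_factorial(m):
    """m!! = m * (m - 2) * ... * 3 * 1 for odd m >= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result

def num_displaying_trees(k):
    """N from Eq. (5) for k pairwise disjoint rooted triples (n = 3k leaves)."""
    n = 3 * k
    total = odd_double_factorial(2 * n - 3)  # all rooted binary trees on n leaves
    # Each tree displays exactly one of the 3^k possible choices of triples,
    # so the total splits evenly and the integer division is exact.
    return total // 3 ** k

for k in range(1, 5):
    print(3 * k, num_displaying_trees(k))
```

For a single triple ($n = 3$) this gives $N = 1$, as expected, since a rooted triple is displayed only by itself among the three rooted binary trees on three leaves.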

At the other extreme, if R has the maximum possible size for a phylogenetically flexible set on n leaves (namely n-2 by Corollary 1), then it is possible for there to be just a single rooted phylogenetic tree that displays R; this is stated more precisely in the next proposition.

Proposition 1

  • (i)

    For every rooted binary phylogenetic $X$-tree $T$ on $n \ge 3$ leaves, there exists a set $R_T$ of $n - 2$ rooted triples for which (a) $T$ is the only phylogenetic $X$-tree that displays $R_T$ and (b) $\|R_T\|$ is thin.

  • (ii)

    There exist phylogenetically flexible sets of triples of size $n - 2$ on $n$ leaves ($n \ge 6$) for which each assignment of a tree structure to these triples leads to a set of rooted triples that can be displayed by more than one rooted phylogenetic tree.

Proof

(i) We use induction on $n$. For $n = 3$, we can write $T = ab|c$, in which case $R_T = \{ab|c\}$ satisfies Conditions (a) and (b). Suppose now that Proposition 1 holds for all $k \le n$, where $n \ge 3$, and that $T$ is a rooted binary phylogenetic $X$-tree with $n + 1$ leaves. Select a pair of leaves $a, b$ that are adjacent to the same vertex (say $v$) of $T$ (i.e. $\{a,b\}$ is a cherry of $T$), let vertex $u$ be the parent of vertex $v$ in $T$, and let $c$ be any leaf of $T$ present in the component of $T - u$ (the graph obtained by deleting $u$ from $T$) that contains neither the root nor the leaves $a, b$. Put $X' = X - \{a\}$ and let $T'$ be the rooted binary phylogenetic $X'$-tree obtained from $T$ by deleting leaf $a$ and its incident edge and suppressing the resulting vertex of degree 2. Since $T'$ has $n$ leaves, the induction hypothesis ensures that there is a set $R_{T'}$ of $n - 2$ rooted triples for which $T'$ is the only phylogenetic $X'$-tree that displays $R_{T'}$ and for which $\|R_{T'}\|$ is thin. If we now let $R_T = R_{T'} \cup \{ab|c\}$, then $R_T$ is a set of $(n+1) - 2$ rooted triples and $R_T$ satisfies Conditions (a) and (b) for the tree $T$. This establishes the induction step and thereby the proposition.

(ii) Let $\tau = \{\{1,2,j\} : 2 < j \le n\}$. In this case, $\tau$ is a thin (and therefore phylogenetically flexible) set of size $n - 2$. Now, for $n \ge 6$, it can be checked that any assignment of a tree structure to these triples leads to a set of rooted triples that can be displayed by more than one rooted phylogenetic tree.

Median Characterisations

Given a phylogenetic tree $T$ with leaf set $X$ and a set $s \in \binom{X}{3}$, let $\mathrm{med}_T(s)$ refer to the vertex that is the unique median vertex of $T$ for the three elements of $s$.

The following result was established in Dress and Steel (2009, Theorem 1.1). Suppose that $\tau$ is a subset of $\binom{X}{3}$ with $L(\tau) = X$. The following are equivalent:

  • (i)

    τ is thin.

  • (ii)

    There exists a binary unrooted phylogenetic $X$-tree $T = (V,E)$ for which the function $s \mapsto \mathrm{med}_T(s)$ from the elements $s$ of $\tau$ to the set of interior vertices of $T$ is one to one.

When (ii) holds, we say that T provides a median representation of τ. Figure 3i illustrates how this equivalence applies.

Fig. 3

(i) Associating each member of the thin collection of sets {{a,b,c},{c,d,e},{a,e,f},{b,e,g},{a,d,g}} with its median vertex in the tree shown provides a one-to-one mapping. (ii) A caterpillar tree that also provides a median representation of this thin collection of sets

We now strengthen this result from Dress and Steel (2009) by showing that the tree T can always be chosen to be an unrooted caterpillar tree. For example, for the thin collection of sets considered in Fig. 3, we may select the caterpillar tree shown in Fig. 3ii.
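For small collections, a suitable caterpillar can be found by exhaustive search. The sketch below is an illustration with hypothetical function names, not the constructive argument used in the proof. It relies on the observation that in an unrooted caterpillar the median of three leaves is the spine vertex at which the middle leaf (in end-to-end order) attaches, so a caterpillar is encoded simply as an ordering of the leaves, with the first two and the last two leaves forming the cherries. It assumes at least four leaves.

```python
from itertools import permutations

def spine_positions(order):
    """Map each leaf to the index (1 .. n-2) of the spine vertex it hangs from.
    The first two leaves share spine vertex 1, the last two share vertex n-2."""
    n = len(order)
    return {leaf: min(max(i, 1), n - 2) for i, leaf in enumerate(order)}

def median_map(order, tau):
    """Spine index of the median vertex of each triple in tau."""
    pos = spine_positions(order)
    return {frozenset(s): sorted(pos[x] for x in s)[1] for s in tau}

def find_caterpillar(tau):
    """Return a leaf ordering whose median map is one to one, or None."""
    leaves = sorted(set().union(*tau))
    for order in permutations(leaves):
        medians = median_map(order, tau)
        if len(set(medians.values())) == len(medians):
            return order
    return None

# The thin collection from the caption of Fig. 3.
fig3 = ["abc", "cde", "aef", "beg", "adg"]
order = find_caterpillar(fig3)
print(order, median_map(order, fig3))
```

The ordering returned by the search need not coincide with the caterpillar drawn in Fig. 3ii; any ordering with pairwise distinct medians gives a median representation.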

Theorem 2

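Suppose $\tau$ is a non-empty subset of $\binom{X}{3}$, where $|X| \ge 4$. If $\tau$ is thin, then there exists an unrooted caterpillar tree $T = (V,E)$ with leaf set $X$ for which the function $s \mapsto \mathrm{med}_T(s)$ from $\tau$ to the set of interior vertices of $T$ is one to one.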

Proof

We adapt the proof of (3) $\Rightarrow$ (2) of Dress and Steel (2009, Theorem 1.1) and use induction on the size of $X$. If $|X| = 4$, the theorem clearly holds in view of Lemma 1(i). Let us suppose that it holds whenever $4 \le |X| \le n - 1$, for some $n \ge 5$. Let $X$ be such that $|X| = n$. By Lemma 1(ii), we may assume that one of the following two cases holds:

  • (A)

    There is an element $x$ of $X$ with $n_\tau(x) = 1$.

  • (B)

    There is an element $x$ of $X$ with $n_\tau(x) = 2$.

In Case (A), there is some triple $\{a,b,x\} \in \tau$ such that for $\tau' = \tau - \{\{a,b,x\}\}$ we have that $\tau'$ is thin. Put $X' = L(\tau')$. By induction, there is an unrooted caterpillar tree $T'$ with leaf set $X'$ for which the function $\mathrm{med}_{T'}$ from $\tau'$ to the set of interior vertices of $T'$ is one to one. Now we can create a tree $T$ by inserting an edge $\{x,u\}$ where $u$ is a new vertex subdividing an interior edge of $T'$ on the path between $a$ and $b$. The resulting tree $T$ is clearly an unrooted caterpillar tree on $X$ and $\mathrm{med}_T$ is one to one on $\tau$. This establishes the induction step in this case.

In Case (B), there is an element $x$ in $X$ with $n_\tau(x) = 2$. Then there exist two distinct triples $t, t' \in \tau$ each of which contains $x$. We consider the following two possible cases: (i) $|t \cap t'| = 2$ and (ii) $|t \cap t'| = 1$.

Case (i): $|t \cap t'| = 2$. In this case, there exist $a, b, b' \in X$ with $b \neq b'$ such that $t = \{a,b,x\}$ and $t' = \{a,b',x\}$. Since $\tau$ is thin, it follows that

$$\tau' = \big(\tau - \{\{a,b,x\},\{a,b',x\}\}\big) \cup \{\{a,b,b'\}\}$$

is also thin. Put $X' = L(\tau')$. Then, by induction, there is an unrooted caterpillar tree $T'$ with leaf set $X'$ for which $\mathrm{med}_{T'}$ is one to one on $\tau'$.

Consider the leaf b of T. Let Inline graphic denote the vertex adjacent to b. As T is an unrooted caterpillar tree, it suffices to consider the following two subcases:

Subcase (a): The leaves a and b are on the same side of T relative to b (i.e. they are in the same connected component of T-b as b). Without loss of generality, assume that the distance from a to b in T is less than or equal to distance from b to b in T. Note that in this case medT(a,b,b) is the vertex in T that is adjacent to a. Now create a tree T with leaf set X by inserting a new vertex u and a new edge {u,x} into T such that {u,b} is an edge on the path connecting b and a. The tree T is again an unrooted caterpillar tree on X. Furthermore, Inline graphic is one to one since (i) medT is one to one, and (ii) medT(x,a,b)=u and the median of {x,a,b} in T corresponds to the median vertex of {a,b,b} in T and therefore is a vertex of T that is different from any other median vertex of an element in τ.

Subcase (b): The leaves a and b are on different sides of T relative to b. Note that in this case, medT(a,b,b)=b. Now create a tree T with leaf set X by inserting a new vertex u and a new edge {x,u} into T such that {u,b} is an edge on the path connecting b and b. T is then clearly an unrooted caterpillar tree on X. Since medT(x,b,b)=u and the median of {x,a,b} in T corresponds to the median vertex of {a,b,b} in T, the same arguments as in the previous case imply that Inline graphic is one to one.

Case (ii):|tt|=1. In this case, there exist pairwise distinct elements a,a,b,b in X such that t={a,b,x} and t={a,b,x}. We may assume that τ does not contain both {a,a,b} and {a,a,b} as, otherwise, the claim follows from Case (B)(i)(a). By symmetry, we can assume without loss of generality that {a,b,b} is not in τ. Let τ=τ-{{a,b,x},{a,b,x}}{{a,a,b}}. Then since τ is thin it follows that τ is thin. Put X=L(τ). Then, by induction, there is an unrooted caterpillar tree T with leaf set X and Inline graphic is one to one.

Consider the leaf a. As T is a caterpillar tree on X, we can again consider two subcases ((a) and (b)), the first of which involves two further subcases:

Case (a): The leaves a and b are on the same side of T relative to b. Without loss of generality, assume that the distance from a to b in T is less than or equal to distance from b to b in T. We now have two subcases to consider for this subcase:

Case (a1): The leaf a is on the same side of the caterpillar T as a and b relative to b. If a and b are on the same side of T relative a, then the same arguments as in the Case (B)(i)(a) apply with a playing the role of b. If a and b are on different sides of T relative a, then the same arguments apply as in Case (B)(i)(b) with a playing the role of b.

Case (a2): The leaf a is on a different side of the caterpillar T from a and b relative to b. Now create a tree T on X by inserting a new vertex u and a new edge {x,u} into T such that with Inline graphic the vertex adjacent with b we have that {b,u} is an edge on the path connecting a and b. Then, T is clearly a unrooted caterpillar tree with leaf set X. Since medT(x,a,b) is u and the median of {x,a,b} in T corresponds to the median of {a,a,b} in T, the same arguments as in Case (B)(i)(a) imply that Inline graphic is one to one.

Case (b): The leaves a and b are on different sides of T relative to b. If a lies on the same side of T as a relative b and a and b lie on different sides of T relative a, then the same arguments as in the Case (B)(i)(a) apply with a playing the role of b. In all other cases, the same arguments as in the Case (B)(i)(b) apply with a playing the role of b

The Case r=2

The concept of phylogenetic flexibility does not directly carry over to the case where r=2, since in this case, there is just a single rooted phylogenetic tree. Instead, we use a stronger notion of tree structure (namely, total order) to obtain an analogue of Theorem 1.

We say that a non-empty subset $\tau$ of $\binom{X}{2}$ is total-order flexible if every choice of a total order on the set $s$, for each $s \in \tau$, is compatible with a total order on $X$. More formally, for every $s = \{x,y\} \in \tau$, if we declare that either $x \prec y$ or $y \prec x$, then for any such selection of choices (one for each $s \in \tau$), there is a total order on $X$ that agrees with these inequalities. For example, $\tau = \{\{a,b\},\{b,c\}\}$ is total-order flexible but $\{\{a,b\},\{b,c\},\{a,c\}\}$ is not, since the orderings $a \prec b$, $b \prec c$, $c \prec a$ are not compatible with any total order on $\{a,b,c\}$. The following result is the analogue of Theorem 1 for the case where $r = 2$.

Theorem 3

Suppose that $\tau$ is a non-empty subset of $\binom{X}{2}$. Then $\tau$ is thin if and only if $\tau$ is total-order flexible.

Proof

We first show that if $\tau$ is not thin, then $\tau$ is not total-order flexible. Suppose that $\tau$ is not thin. Then there exists a non-empty subset $\tau'$ of $\tau$ for which $|L(\tau')| \le |\tau'|$. Let $G_{\tau'}$ be the graph $(L(\tau'), \tau')$ that has vertex set $L(\tau')$ and edge set $\tau'$. Since $G_{\tau'}$ has at least as many edges as vertices, this graph has a connected component that contains a cycle. If the edges of this cycle are $\{x_1,x_2\}, \ldots, \{x_i,x_{i+1}\}, \ldots, \{x_r,x_1\}$, then the total orders $x_1 \prec x_2, \ldots, x_i \prec x_{i+1}, \ldots, x_r \prec x_1$ on these pairs are not compatible with any total order on $X$ (since transitivity would imply that $x_1 \prec x_1$).

We now show that the thin property implies total-order flexibility by using induction on $k = |\tau|$. The result clearly holds for $k = 1$, so suppose that the result holds for subsets of $\binom{X}{2}$ of size $k \ge 1$ and that $\tau \subseteq \binom{X}{2}$ is a thin set of size $k + 1$. By Lemma 1(ii), there is an element $x$ in $X$ that is present in precisely one set, say $\{x,y\}$, in $\tau$. Let $\tau'$ be the set obtained from $\tau$ by deleting $\{x,y\}$. Then $\tau'$ is thin and, since $|\tau'| = k$, the induction hypothesis implies that $\tau'$ is total-order flexible. Then any choice of a total order on the set $s$ for each $s \in \tau'$ is compatible with a total order on $X - \{x\}$ (recall that $x \notin L(\tau')$ by the choice of $x$). If we now introduce a total order on $\{x,y\}$, then we can extend the total order to $X$ by placing $x$ before $y$ if $\{x,y\}$ is ordered as $x \prec y$, and placing $x$ after $y$ otherwise.
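Theorem 3 can be illustrated computationally on small examples. The sketch below is a hedged illustration with hypothetical function names. It uses two observations: a choice of orientations on the pairs extends to a total order exactly when the resulting directed graph has no directed cycle, and (as the cycle argument in the proof suggests) a collection of pairs is thin exactly when the graph $(L(\tau), \tau)$ has no cycle. Pairs are written as tuples and all $2^{|\tau|}$ orientations are enumerated.

```python
from itertools import product

def is_forest(pairs):
    """Union-find cycle check on the undirected graph with edge set `pairs`."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for x, y in pairs:
        rx, ry = find(x), find(y)
        if rx == ry:
            return False  # this edge closes a cycle
        parent[rx] = ry
    return True

def has_total_order(relations):
    """True if the relations (x, y), read as 'x before y', contain no directed
    cycle, i.e. they extend to a total order. Repeatedly removes a minimal element."""
    nodes = {v for e in relations for v in e}
    edges = set(relations)
    while nodes:
        minimal = [v for v in nodes if not any(y == v for _, y in edges)]
        if not minimal:
            return False
        v = minimal[0]
        nodes.remove(v)
        edges = {(x, y) for x, y in edges if x != v}
    return True

def is_total_order_flexible(pairs):
    """Check every one of the 2^|pairs| orientations."""
    return all(
        has_total_order([(x, y) if keep else (y, x)
                         for (x, y), keep in zip(pairs, choice)])
        for choice in product([True, False], repeat=len(pairs))
    )

path = [("a", "b"), ("b", "c")]
triangle = path + [("a", "c")]
for tau in (path, triangle):
    print(is_forest(tau), is_total_order_flexible(tau))
```

On the two examples from the text, the forest test and the exhaustive flexibility test agree.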

We now present some characterisations for when a non-empty set $\tau \subseteq \binom{X}{2}$ is thin. We begin with an analogue of Dress and Steel (2009, Theorem 1.1), which was stated in the last section.

Given a rooted tree $T$ with leaf set $X$ and a set $s \in \binom{X}{2}$, let $\mathrm{lca}_T(s)$ refer to the unique vertex of $T$ that is the least common ancestor of the elements in the set $s$.

Theorem 4

Suppose that $\tau$ is a subset of $\binom{X}{2}$ with $L(\tau) = X$. The following are equivalent:

  • (i) $\tau$ is thin.

  • (ii) There exists a rooted binary phylogenetic $X$-tree $T = (V, E)$ for which the function $s \mapsto \mathrm{lca}_T(s)$ from the elements of $\tau$ to the set of interior vertices of $T$ is one to one.

  • (iii) As for (ii) but with $T$ a rooted caterpillar tree.

Proof

(iii) $\Rightarrow$ (ii) is trivial.

(i) $\Rightarrow$ (iii) Suppose $\tau$ is thin. We use induction on the size of $X$. If $|X| = 3$, then clearly (iii) holds. Therefore, suppose it holds whenever $3 \le |X| \le n-1$, for some $n \ge 4$. Let $|X| = n$.

By Lemma 1(ii), there is some $x$ with $n_\tau(x) = 1$, so $x$ lies in a unique pair $\{a,x\} \in \tau$. Let $\tau' = \tau - \{\{a,x\}\}$. Clearly $\tau'$ is thin as $\tau$ is thin, and either $L(\tau') = X - \{x\}$ or $L(\tau') = X - \{x,a\}$.

Assume first that $L(\tau') = X'$, where $X' = X - \{x\}$. By induction, there is a rooted caterpillar tree $T'$ with leaf set $X'$ and root $\rho'$ for which the function $s \mapsto \mathrm{lca}_{T'}(s)$ from $\tau'$ to the set of interior vertices of $T'$ is one to one. Now, we can create a new rooted tree $T$ with root $\rho$ by adding two new edges $\{\rho,\rho'\}$ and $\{\rho,x\}$ to $T'$, where $\rho$ is a new vertex that is not in $T'$. $T$ is then clearly a rooted caterpillar tree on $X$; every interior vertex of $T$ has out-degree 2 and the function $s \mapsto \mathrm{lca}_T(s)$ on $\tau$ is one to one. This establishes the induction step, and so (iii) holds.

Assume next that $L(\tau') = X'$, where $X' = X - \{x,a\}$. By induction, there exists a rooted caterpillar tree $T'$ on $X'$ for which the function $s \mapsto \mathrm{lca}_{T'}(s)$ on $\tau'$ is one to one. Let $\rho'$ denote the root of $T'$. Let $T$ be the rooted caterpillar tree obtained from $T'$ via the following two-step process. First, add a new root $\rho$ and a new edge $e = \{\rho,\rho'\}$ to $T'$. Then, in the resulting tree, subdivide $e$ by a vertex $c$ and add the edges $\{c,a\}$ and $\{\rho,x\}$. Clearly, the function $s \mapsto \mathrm{lca}_T(s)$ on $\tau$ is one to one. This again establishes the induction step, and so (iii) holds in this case too.

(ii) $\Rightarrow$ (i) Suppose $x$ is an element which is not in $L(\tau) = X$. Given a non-empty subset $\omega$ of $\tau$, let $\omega' = \{t \cup \{x\} : t \in \omega\}$.

Suppose a rooted phylogenetic tree $T$ on $X$ satisfies the conditions in Part (ii) of the theorem. Add the new leaf $x$ to $T$ by adding the edge $\{\rho_T, x\}$ and regard the resulting tree as an unrooted phylogenetic tree $T'$ on $X \cup \{x\}$. In $T'$, the map $\mathrm{med}_{T'}$ from $\tau'$ to the internal vertices of $T'$ is then one to one. Hence, by Dress and Steel (2009, Theorem 1.1), $\tau'$ is thin, and thus, for any non-empty subset $\omega$ of $\tau$, we have:

$$|L(\omega)| + 1 = |L(\omega')| \ge |\omega'| + 2 = |\omega| + 2.$$

It immediately follows that $\tau$ is thin.
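The induction in the proof of (i) $\Rightarrow$ (iii) can be unrolled into a short procedure. The sketch below is illustrative only and makes some assumptions that are not in the text: a caterpillar is encoded as the list of its leaves read from the root end down to the cherry, interior vertex $i$ is the parent of the leaf at position $i$ (the last leaf shares the parent of the second-to-last leaf), and the names `caterpillar_order` and `lca_index` are hypothetical. Under this encoding the least common ancestor of a pair is the interior vertex at the smaller of the two positions.

```python
def caterpillar_order(tau):
    """Return the leaves of a caterpillar, listed from the root end downwards.

    tau is a collection of 2-element sets. A ValueError is raised if at some
    stage no element lies in exactly one remaining pair, which cannot happen
    when tau is thin (Lemma 1(ii)).
    """
    remaining = [frozenset(s) for s in tau]
    order = []
    while remaining:
        count = {}
        for s in remaining:
            for v in s:
                count[v] = count.get(v, 0) + 1
        # Pick an element x that lies in exactly one remaining pair.
        x = next((v for v in sorted(count) if count[v] == 1), None)
        if x is None:
            raise ValueError("tau is not thin")
        pair = next(s for s in remaining if x in s)
        (a,) = pair - {x}
        remaining.remove(pair)
        order.append(x)        # x is attached at the current top of the tree
        if count[a] == 1:      # a lies in no other pair: attach it just below x
            order.append(a)
    return order

def lca_index(order, s):
    """Position of the interior vertex that is the lca of the pair s."""
    return min(order.index(v) for v in s)

tau = [{"a", "b"}, {"b", "c"}, {"d", "e"}]
order = caterpillar_order(tau)
print(order)                               # ['a', 'b', 'c', 'd', 'e']
print([lca_index(order, s) for s in tau])  # [0, 1, 3]
```

Each pair is assigned the position of the element that was removed together with it, so distinct pairs receive distinct interior vertices, mirroring the two cases of the induction step.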

Interestingly, we can give an alternative characterisation of thin subsets $\tau$ of $\binom{X}{2}$ in terms of bipartite graphs.

We first recall some results from matching theory. For a graph $G$ and $v$ a vertex in $G$, we let $\deg_G(v)$ denote the degree of $v$ in $G$. Given a bipartite graph $G = (A \cup B, E)$ and a non-empty set $Y \subseteq A$, we let $N_G(Y)$ denote the set of vertices in $B$ that are adjacent to some vertex in $Y$, and we define the surplus $\sigma_G(Y)$ of $Y$ to be:

$$\sigma_G(Y) = \sigma(Y) = |N_G(Y)| - |Y|.$$

We also define the surplus $\sigma(G)$ of $G$ to be the minimum surplus over all non-empty subsets of $A$. We say that a bipartite graph $G = (A \cup B, E)$ has positive surplus (as viewed from $A$) if $\sigma(G) > 0$. The following result is from Lovász and Plummer (1986, Theorem 1.3.8).

Theorem 5

A bipartite graph $G = (A \cup B, E)$ has positive surplus (as viewed from $A$) if and only if $G$ contains a forest $F$ such that $\deg_F(u) = 2$ for all $u \in A$.

We now apply this result to the setting of thin sets. Let $\tau$ be a non-empty collection of non-empty subsets of $X$. We associate a bipartite graph $G(\tau)$ to $\tau$ that has the vertex set $\tau \cup L(\tau)$ and the edge set given by containment (i.e. $\{t,x\}$ is an edge in $G(\tau)$ if and only if $x \in t$ with $x \in L(\tau)$ and $t \in \tau$). Thus, we are representing our set $\tau$ by a bipartite graph $G = (A \cup B, E)$ with $A = \tau$, $B = X$ and $E$ given by containment.

Since τ is thin if and only if G(τ) has positive surplus, by Theorem 5, the following corollary is straightforward.

Corollary 2

Suppose that $\tau$ is a subset of $\binom{X}{2}$ with $L(\tau) = X$. Then $\tau$ is thin if and only if $G(\tau)$ is a forest.
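Corollary 2 can be illustrated on small inputs by comparing a brute-force surplus computation with a direct forest test on $G(\tau)$. This is a minimal sketch with hypothetical function names; the surplus routine enumerates all non-empty subcollections and is therefore exponential, while the forest test uses a standard union-find structure.

```python
from itertools import combinations

def surplus(tau):
    """Surplus of G(tau) viewed from A = tau: min of |N(Y)| - |Y| over non-empty Y."""
    sets = [frozenset(s) for s in tau]
    best = None
    for k in range(1, len(sets) + 1):
        for Y in combinations(sets, k):
            value = len(frozenset().union(*Y)) - k
            best = value if best is None else min(best, value)
    return best

def incidence_graph_is_forest(tau):
    """Check that G(tau) has no cycle, using union-find over its vertices."""
    sets = [frozenset(s) for s in tau]
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for i, s in enumerate(sets):
        for x in s:
            # Edge between the set-vertex ("set", i) and the element-vertex ("elt", x).
            ru, rv = find(("set", i)), find(("elt", x))
            if ru == rv:
                return False  # this edge would close a cycle
            parent[ru] = rv
    return True

path = [{"a", "b"}, {"b", "c"}]
triangle = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
print(surplus(path), incidence_graph_is_forest(path))          # 1 True
print(surplus(triangle), incidence_graph_is_forest(triangle))  # 0 False
```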

For r=3, it might also be interesting to characterise those graphs G(τ) for which τ is thin.

The General Case (Slim Set Systems)

Suppose we have a non-empty collection $\tau$ of subsets of $X$, each of size at least 3. Consider the modified notion of excess, denoted exc and defined as follows:

$$\mathrm{exc}(\tau) = |L(\tau)| - 2 - \sum_{s \in \tau} (|s| - 2).$$

Notice that when $\tau \subseteq \binom{X}{3}$ this notion of excess agrees with the earlier one.

Given a non-empty collection $\tau$ of subsets of $X$, each of size at least 3, we say that $\tau$ is slim if for every non-empty subset $\tau'$ of $\tau$, we have $\mathrm{exc}(\tau') \ge 0$. The next result relates slim to thin; the two notions coincide when $\tau \subseteq \binom{X}{3}$; however, slim is a more restrictive notion than thin when $\tau \subseteq \binom{X}{r}$, for $r > 3$. Note, however, that (unlike the thin property) the slim property does not require the sets in $\tau$ to all have the same size.

Lemma 4

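Suppose that $\tau \subseteq \binom{X}{r}$, $r \ge 3$. If $\tau$ is slim, then $\tau$ is thin. Moreover, for $r = 3$, $\tau$ is thin if and only if $\tau$ is slim.

Proof

If $|s| = r$ for each $s \in \tau$, then $\sum_{s \in \tau'} (|s| - 2) = |\tau'|(r-2)$; therefore, $\tau$ is slim if and only if $|L(\tau')| \ge |\tau'|(r-2) + 2$ for every non-empty subset $\tau'$ of $\tau$, whereas $\tau$ is thin if and only if $|L(\tau')| \ge |\tau'| + (r-1)$ for every such $\tau'$. We now impose the assumption that $r \ge 3$. First, if $r > 3$, then the required inequality $|\tau'|(r-2) + 2 \ge |\tau'| + (r-1)$ is equivalent to the condition that $|\tau'| \ge 1$, which holds by the assumption that $\tau'$ is non-empty. Thus if $\tau$ is slim, it is also thin. Moreover, when $r = 3$, the inequality $|\tau'|(r-2) + 2 \ge |\tau'| + (r-1)$ becomes an equality; in this case, $\tau$ is slim if and only if it is thin.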
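The definitions of excess, slim and thin translate directly into a brute-force check over all non-empty subcollections. The sketch below is only meant for small examples (it is exponential in $|\tau|$), its function names are hypothetical, and the thin test uses the inequality $|L(\tau')| \ge |\tau'| + (r-1)$ that appears in the proof of Lemma 4. The example shows a collection of 4-element sets that is thin but not slim.

```python
from itertools import combinations

def leaf_set(tau):
    """L(tau): the union of the sets in tau."""
    return frozenset().union(*tau)

def nonempty_subcollections(tau):
    """Yield every non-empty subcollection of tau as a tuple of frozensets."""
    sets = [frozenset(s) for s in tau]
    for k in range(1, len(sets) + 1):
        yield from combinations(sets, k)

def exc(tau):
    """Modified excess: |L(tau)| - 2 - sum over s in tau of (|s| - 2)."""
    return len(leaf_set(tau)) - 2 - sum(len(s) - 2 for s in tau)

def is_slim(tau):
    return all(exc(sub) >= 0 for sub in nonempty_subcollections(tau))

def is_thin(tau, r):
    """Thin for r-element sets: |L(sub)| >= |sub| + (r - 1) for all non-empty sub."""
    return all(len(leaf_set(sub)) >= len(sub) + r - 1
               for sub in nonempty_subcollections(tau))

tau = [{1, 2, 3, 4}, {2, 3, 4, 5}]
print(is_thin(tau, 4), is_slim(tau))  # True False
```

For collections of 3-element sets the two tests agree, as Lemma 4 states.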

We can extend the notion of phylogenetic flexibility introduced in Sect. 1 to arbitrary collections of subsets of $X$. To do so, we first extend the earlier definitions of ‘display’ and ‘compatibility’ from sets of rooted triples to arbitrary collections of rooted trees.

Given a rooted phylogenetic $X$-tree $T$ and a binary phylogenetic tree $T'$ with leaf set $Y \subseteq X$, $T$ is said to display $T'$ if $T$ contains a subdivision of $T'$ as a (directed) subtree. [This is equivalent to the condition that each rooted triple displayed by $T'$ is also displayed by $T$ (Bryant and Steel 1995).] A set $R$ of rooted binary phylogenetic trees is said to be compatible if there is a rooted phylogenetic tree $T$ that displays each of the trees in $R$.

For a set $R$ of rooted binary phylogenetic trees, let $\|R\|$ denote the collection of their leaf sets. Thus, $\|R\|$ is a set of sets. Given a non-empty collection $\tau$ of subsets of $X$, each of size at least 3, we say that $\tau$ is phylogenetically flexible if every set $R$ of rooted binary phylogenetic trees for which $\|R\| = \tau$ holds is compatible.

This notion agrees with the earlier notion of phylogenetic flexibility in the case where each set in τ has size exactly 3. Moreover, as before, we can assume without loss of generality that the tree T (in the definition) is binary.

The following result is a strengthening of our earlier Theorem 1; one direction follows from that theorem, the other direction is a consequence of a result from Grünewald (2012) (which dealt with unrooted trees).

Theorem 6

Suppose that τ is a collection of sets, each of size at least 3. Then τ is phylogenetically flexible if and only if τ is slim.

Proof

We first establish the ‘only if’ direction. Suppose that $\tau$ is phylogenetically flexible. For each set $s \in \tau$, select two elements $x, y \in s$ and let:

$$A(s) = \{\{x,y,z\} : z \in s,\ z \neq x, y\},$$

and for any non-empty subset $\tau'$ of $\tau$ let

$$\alpha(\tau') = \bigcup_{s \in \tau'} A(s).$$

Thus, $A(s)$ is a set of $|s| - 2$ triples, and $\alpha(\tau')$ is also a set of triples.

Claim 1

$A(s) \cap A(s') = \emptyset$ for each $s, s' \in \tau$, $s \neq s'$.

To see this, suppose that a triple, say $\{a,b,c\}$, lies in $A(s)$ and $A(s')$ for two distinct elements $s$ and $s'$ of $\tau$. Then, we can select a rooted binary phylogenetic tree $T_s$ with leaf set $s$ that displays the rooted triple $ab|c$ and select a rooted binary phylogenetic tree $T_{s'}$ with leaf set $s'$ that displays the rooted triple $ac|b$. But no rooted binary phylogenetic tree can display both $T_s$ and $T_{s'}$ (since such a tree would also simultaneously display two different rooted triples with leaf set $\{a,b,c\}$). This contradicts the assumption that $\tau$ is phylogenetically flexible, so such a shared triple $\{a,b,c\}$ in $A(s) \cap A(s')$ cannot exist. This establishes Claim 1.

Claim 2

$\alpha(\tau)$ is phylogenetically flexible.

To see this, suppose that for each triple $t \in \alpha(\tau)$, we have an associated rooted triple $T_t$ with leaf set $t$. We need to show that there is a rooted binary phylogenetic tree that displays $\{T_t : t \in \alpha(\tau)\}$. Observe that $A(s)$ is thin for each $s \in \tau$, since if $A'$ is a non-empty subset of $A(s)$ of size $k$ (say), then $|L(A')| = k + 2$, and so $|L(A')| = |A'| + 2$. Theorem 1 (the ‘if’ direction) then ensures that for each $s$ in $\tau$, there is a rooted phylogenetic tree $T_s$ with leaf set $s$ that displays $\{T_t : t \in \alpha(\tau) \cap A(s)\}$. Moreover, since $\tau$ is phylogenetically flexible, there is a rooted binary phylogenetic tree $T$ that displays $T_s$ for each $s \in \tau$. It follows that the tree $T$ displays $\{T_t : t \in \alpha(\tau)\}$, and so $\alpha(\tau)$ is phylogenetically flexible, as claimed.

Claim 2 implies that $\alpha(\tau)$ is thin, by Theorem 1 (the ‘only if’ direction). We now show that this implies that $\tau$ is slim. Let $\tau'$ be a non-empty subset of $\tau$ and consider $\alpha(\tau')$. Since $L(\tau') = L(\alpha(\tau'))$, we have:

$$|L(\tau')| = |L(\alpha(\tau'))| \ge |\alpha(\tau')| + 2, \tag{6}$$

where the inequality holds because $\alpha(\tau)$ (and thereby its subset $\alpha(\tau')$) is thin.

Now:

$$|\alpha(\tau')| = \sum_{s \in \tau'} |A(s)| = \sum_{s \in \tau'} (|s| - 2),$$

where the first equality holds by Claim 1. Combining this last equation with Inequality (6) gives:

$$|L(\tau')| - 2 \ge \sum_{s \in \tau'} (|s| - 2),$$

which shows that $\tau$ is slim as claimed.

This establishes the ‘only if’ direction. Notice in doing so that we have used both directions of Theorem 1 in different places in this proof.

We turn now to the ‘if’ direction. Given $\tau$, select a new element, say $x$, that is not present in any of the sets in $\tau$, and add this to each of the sets in $\tau$ to produce a set $\tau + x$. Notice that if $\tau$ is slim, then $\tau + x$ satisfies the property that for each non-empty subset $\tau'$ of $\tau + x$, we have:

$$|L(\tau')| - 3 \ge \sum_{s \in \tau'} (|s| - 3).$$

It follows from Theorem 1.1 of Grünewald (2012) that for any assignment of unrooted binary phylogenetic trees with leaf sets that correspond to the sets in $\tau + x$, there is a binary phylogenetic tree $T_x$ that displays each of these unrooted trees. Suppose now that we have an assignment of rooted binary phylogenetic trees having leaf sets that correspond to the sets in $\tau$. By attaching $x$ as a leaf adjacent to the root of each of these trees, we obtain an assignment of unrooted binary phylogenetic trees with leaf sets that correspond to the sets in $\tau + x$. Hence, by the result just stated, there is an unrooted binary phylogenetic tree $T_x$ that displays each of these unrooted trees. If we now let $T$ be the rooted binary phylogenetic tree obtained from $T_x$ by deleting the leaf $x$ and rooting the resulting tree on the vertex adjacent to $x$, then $T$ displays the original assignment of rooted binary phylogenetic trees. Since this holds for all possible assignments of rooted phylogenetic trees to the sets in $\tau$, it follows that $\tau$ is phylogenetically flexible.
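The reduction to triples used in the ‘only if’ direction can be made concrete as follows. This is a small sketch under one stated assumption: the proof allows any two elements $x, y$ of each set $s$ to be selected, and here the two smallest elements are used purely for definiteness. The function names `A` and `alpha` follow the notation of the proof.

```python
def A(s):
    """Triples {x, y, z} with z ranging over s minus {x, y}.

    Assumption: x and y are taken to be the two smallest elements of s;
    the proof permits any choice of two elements.
    """
    x, y, *rest = sorted(s)
    return {frozenset({x, y, z}) for z in rest}

def alpha(tau):
    """Union of A(s) over all s in tau."""
    triples = set()
    for s in tau:
        triples |= A(s)
    return triples

tau = [{1, 2, 3, 4}, {4, 5, 6}]
triples = alpha(tau)
print(len(triples))                          # 3
print(sorted(frozenset().union(*triples)))   # [1, 2, 3, 4, 5, 6]
```

In this example the sets $A(s)$ are pairwise disjoint, so $|\alpha(\tau)|$ equals the sum of $|s| - 2$ over $s \in \tau$, and $L(\alpha(\tau)) = L(\tau)$, as used in Inequality (6).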

Polynomial-Time Algorithms for Thin and Slim

Given a finite set $S$, a function $f : 2^S \to \mathbb{R}$ is called submodular if for all $A, B \subseteq S$:

$$f(A) + f(B) \ge f(A \cup B) + f(A \cap B).$$

Submodular functions play an important role in optimisation and matroid theory (see, e.g. Lovász and Plummer 1986; Bixby and Cunningham 1995; Welsh 1995). In this section, we exploit these connections to show that there are polynomial-time algorithms to decide whether sets are thin or slim.

Suppose that $\tau$ is a subset of $2^X$. For $\tau' \subseteq \tau$, we define

$$\sigma(\tau') = |L(\tau')| - |\tau'|,$$

and

$$\gamma(\tau') = |L(\tau')| - \sum_{s \in \tau'} (|s| - 2).$$

Note that $\sigma(\emptyset) = \gamma(\emptyset) = 0$ (since the summation term is then empty). Although the following result is straightforward to show by using results concerning submodular functions in the literature, for completeness we give a direct proof.

Theorem 7

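For $\tau$ a non-empty subset of $2^X$, the functions $\sigma : 2^\tau \to \mathbb{R};\ \tau' \mapsto \sigma(\tau')$ and $\gamma : 2^\tau \to \mathbb{R};\ \tau' \mapsto \gamma(\tau')$ are submodular.

Proof

Suppose that $\tau$ is a non-empty subset of $2^X$ and that $\tau_1, \tau_2 \subseteq \tau$ are non-empty. Clearly, $|L(\tau_1 \cup \tau_2)| = |L(\tau_1) \cup L(\tau_2)|$ and $|L(\tau_1 \cap \tau_2)| \le |L(\tau_1) \cap L(\tau_2)|$. Hence:

$$|L(\tau_1 \cup \tau_2)| + |L(\tau_1 \cap \tau_2)| \le |L(\tau_1) \cup L(\tau_2)| + |L(\tau_1) \cap L(\tau_2)| = \bigl(|L(\tau_1)| + |L(\tau_2)| - |L(\tau_1) \cap L(\tau_2)|\bigr) + |L(\tau_1) \cap L(\tau_2)| = |L(\tau_1)| + |L(\tau_2)|.$$

The fact that $\sigma$ is submodular now follows, since $|\tau_1| + |\tau_2| = |\tau_1 \cup \tau_2| + |\tau_1 \cap \tau_2|$ and thus, by the above inequality, we have:

$$\sigma(\tau_1) + \sigma(\tau_2) = |L(\tau_1)| + |L(\tau_2)| - (|\tau_1| + |\tau_2|) \ge |L(\tau_1 \cup \tau_2)| + |L(\tau_1 \cap \tau_2)| - |\tau_1 \cup \tau_2| - |\tau_1 \cap \tau_2| = \sigma(\tau_1 \cup \tau_2) + \sigma(\tau_1 \cap \tau_2).$$

Similarly, $\gamma$ is submodular, since

$$\sum_{s \in \tau_1} (|s| - 2) + \sum_{s \in \tau_2} (|s| - 2) = \sum_{s \in \tau_1 \cup \tau_2} (|s| - 2) + \sum_{s \in \tau_1 \cap \tau_2} (|s| - 2)$$

and therefore:

$$\gamma(\tau_1) + \gamma(\tau_2) = |L(\tau_1)| + |L(\tau_2)| - \sum_{s \in \tau_1} (|s| - 2) - \sum_{s \in \tau_2} (|s| - 2) \ge |L(\tau_1 \cup \tau_2)| + |L(\tau_1 \cap \tau_2)| - \sum_{s \in \tau_1 \cup \tau_2} (|s| - 2) - \sum_{s \in \tau_1 \cap \tau_2} (|s| - 2) = \gamma(\tau_1 \cup \tau_2) + \gamma(\tau_1 \cap \tau_2).$$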
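Theorem 7 can be sanity-checked numerically on a small collection by testing the submodular inequality for every pair of subcollections. The following is a minimal sketch with hypothetical function names; it is a finite check on one example, not a proof.

```python
from itertools import combinations

def powerset(items):
    """Yield every subset of items (including the empty set) as a frozenset."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def L(sub):
    """Union of the sets in sub (empty for the empty collection)."""
    return frozenset().union(*sub)

def sigma(sub):
    return len(L(sub)) - len(sub)

def gamma(sub):
    return len(L(sub)) - sum(len(s) - 2 for s in sub)

def is_submodular(f, ground):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all subsets A, B of ground."""
    subsets = list(powerset(ground))
    return all(f(A) + f(B) >= f(A | B) + f(A & B)
               for A in subsets for B in subsets)

tau = [frozenset(s) for s in ({1, 2, 3}, {3, 4, 5}, {1, 2, 4}, {2, 5, 6, 7})]
print(is_submodular(sigma, tau), is_submodular(gamma, tau))  # True True
```

The members of `tau` are stored as frozensets so that subcollections of `tau` can themselves be represented as sets.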

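For $\tau \subseteq 2^X$, we define:

$$\overline{\sigma}(\tau) = \min\{\sigma(\tau') : \tau' \subseteq \tau,\ \tau' \neq \emptyset\}, \quad\text{and}\quad \overline{\gamma}(\tau) = \min\{\gamma(\tau') : \tau' \subseteq \tau,\ \tau' \neq \emptyset\}.$$

Lemma 5

  • (i) Suppose that $\tau \subseteq \binom{X}{r}$ where $r \ge 3$. Then $\tau$ is thin if and only if $\overline{\sigma}(\tau) \ge r - 1$.

  • (ii) Suppose $\tau \subseteq 2^X$ is such that each element in $\tau$ has size at least three. Then $\tau$ is slim if and only if $\overline{\gamma}(\tau) \ge 2$.

Proof

  • (i) $\tau$ is thin if and only if $\sigma(\tau') \ge r - 1$ for all non-empty $\tau' \subseteq \tau$, if and only if $\overline{\sigma}(\tau) \ge r - 1$.

  • (ii) $\tau$ is slim if and only if $\gamma(\tau') \ge 2$ for all non-empty $\tau' \subseteq \tau$, if and only if $\overline{\gamma}(\tau) \ge 2$.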

The following result (Lovász 1983, Theorem 4.4) is originally due to Grötschel et al. (1981) (see also Lovász and Plummer 1986, pp. 417–418).

Theorem 8

Let f be a submodular function defined on the subsets of some finite set S. A set minimising f over all non-empty subsets of S can then be found in polynomial time.

In light of this theorem and Theorem 7, it follows that we can determine the minima $\overline{\sigma}(\tau)$ and $\overline{\gamma}(\tau)$ for a given set $\tau \subseteq 2^X$ in polynomial time. Therefore, by Lemma 5, we can determine in polynomial time whether or not a given set $\tau$ for which each element has size at least three is thin or slim.

Note that although this shows that polynomial-time algorithms exist for determining whether or not a set is thin or slim, these are likely to be impracticable (Lovász and Plummer 1986, pp. 417–418). However, for the case of determining whether or not a set $\tau$ is thin a more explicit algorithm can be given. More specifically, in Fritzilas et al. (2013, Theorem 2) a polynomial-time algorithm is presented for computing the surplus $\sigma(G)$ of a bipartite graph $G$. Since for a set $\tau \subseteq \binom{X}{r}$, $r \ge 3$, the surplus of the bipartite graph $G(\tau)$ as defined in Sect. 3.1 is equal to $\overline{\sigma}(\tau)$, we can therefore apply this algorithm to determine if $\tau$ is thin. It would be interesting to find an explicit algorithm for determining whether a set is slim.
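For collections of 3-element sets, one simple polynomial-time thin test can be sketched directly from Hall's theorem. This is not the algorithm of Fritzilas et al. (2013), and the function names are hypothetical. It uses the observation that every non-empty $Y \subseteq \tau$ satisfies $|N(Y)| \ge |Y| + 2$ in $G(\tau)$ exactly when, after deleting any two elements of $L(\tau)$, the remaining graph still has a matching that covers every set in $\tau$; each such matching is found with standard augmenting paths.

```python
from itertools import combinations

def has_saturating_matching(sets, removed):
    """Augmenting-path (Kuhn) matching: can every set be matched to a
    distinct element of its own that is not in `removed`?"""
    match = {}  # element -> index of the set currently matched to it

    def augment(i, seen):
        for x in sets[i]:
            if x in removed or x in seen:
                continue
            seen.add(x)
            # Use x if it is free, or if its current partner can be re-matched.
            if x not in match or augment(match[x], seen):
                match[x] = i
                return True
        return False

    return all(augment(i, set()) for i in range(len(sets)))

def is_thin_triples(tau):
    """Thin test for 3-element sets: Hall's condition must survive the
    removal of any two elements of L(tau)."""
    sets = [frozenset(s) for s in tau]
    elements = sorted(frozenset().union(*sets))
    return all(has_saturating_matching(sets, set(pair))
               for pair in combinations(elements, 2))

print(is_thin_triples([{1, 2, 3}, {3, 4, 5}, {1, 2, 4}]))  # True
print(is_thin_triples([{1, 2, 3}, {1, 2, 4}, {1, 3, 4}]))  # False
```

The test runs one matching computation for each of the $\binom{|L(\tau)|}{2}$ pairs of elements, so it is polynomial in the size of the input, although no attempt has been made to optimise it.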

Theorem 7 has another consequence that relates to phylogenetics. Recall that a patchwork is a non-empty collection $P$ of sets that satisfies the property: if $A, B \in P$ and $A \cap B \neq \emptyset$, then $A \cap B, A \cup B \in P$. A combinatorial theory of patchworks, relevant to phylogenetics, was developed in Böcker and Dress (2001). Patchworks were also referred to as ‘intersecting families’ in earlier work by Lovász (1983, p. 240). The following is a generalisation of Dress and Steel (2009, Lemma 1.2), and the proof follows a similar argument to that result.

Corollary 3

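If $\tau$ is slim, then the collection $P$ of non-empty subsets $\tau'$ of $\tau$ such that $|s| \ge 3$ for all $s \in \tau'$ and $\mathrm{exc}(\tau') = 0$ forms a patchwork.

Proof

Suppose $\tau_1, \tau_2 \in P$ satisfy $\tau_1 \cap \tau_2 \neq \emptyset$. For $i = 1, 2$, notice that $\mathrm{exc}(\tau_i) = \gamma(\tau_i) - 2$ and so, by the submodularity property of $\gamma$ from Theorem 7, we have:

$$\mathrm{exc}(\tau_1) + \mathrm{exc}(\tau_2) \ge \mathrm{exc}(\tau_1 \cup \tau_2) + \mathrm{exc}(\tau_1 \cap \tau_2), \tag{7}$$

noting that $\mathrm{exc}(\tau_1 \cap \tau_2)$ is well defined by the condition that $\tau_1 \cap \tau_2 \neq \emptyset$. Since $\mathrm{exc}(\tau_1) = \mathrm{exc}(\tau_2) = 0$, Inequality (7) gives:

$$0 \ge \mathrm{exc}(\tau_1 \cup \tau_2) + \mathrm{exc}(\tau_1 \cap \tau_2).$$

It follows that the terms $\mathrm{exc}(\tau_1 \cup \tau_2)$ and $\mathrm{exc}(\tau_1 \cap \tau_2)$ on the right of this inequality must both be zero, since $\tau_1 \cup \tau_2$ and $\tau_1 \cap \tau_2$ are non-empty subsets of the slim set $\tau$ and so each has non-negative excess. Thus, $\tau_1 \cap \tau_2, \tau_1 \cup \tau_2 \in P$, as required.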
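Corollary 3 can be observed on a small slim collection by listing all subcollections of excess zero and testing the patchwork property directly. The sketch below uses hypothetical function names and exhaustive enumeration, so it is suitable only for small examples.

```python
from itertools import combinations

def exc(sub):
    """Modified excess |L| - 2 - sum(|s| - 2) of a non-empty collection."""
    return len(frozenset().union(*sub)) - 2 - sum(len(s) - 2 for s in sub)

def zero_excess_family(tau):
    """All non-empty subcollections of tau whose excess is exactly 0."""
    sets = [frozenset(s) for s in tau]
    return {frozenset(c)
            for k in range(1, len(sets) + 1)
            for c in combinations(sets, k)
            if exc(c) == 0}

def is_patchwork(family):
    """If A, B are in the family and A & B is non-empty, then A & B and
    A | B must also be in the family."""
    return all((A & B) in family and (A | B) in family
               for A in family for B in family if A & B)

tau = [{1, 2, 3}, {3, 4, 5}, {1, 2, 4}]  # a slim collection of triples
P = zero_excess_family(tau)
print(len(P), is_patchwork(P))  # 5 True
```

Here the family consists of the three singletons, the pair $\{\{1,2,3\},\{1,2,4\}\}$ and $\tau$ itself.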

Discussion

When an evolutionary biologist compares a number of trees on different, but overlapping, leaf sets, it is typically very rare that these trees are found to be compatible, due mainly to errors in the estimation of phylogenetic trees. Thus, in cases where the trees are compatible, this fact alone may provide the biologist with some heightened confidence in the accuracy of the input trees. However, such confidence should clearly depend, in part, on the pattern of taxon coverage. In the extreme case where the subsets of species on which the input trees were built form a phylogenetically flexible collection, it is clear that compatibility provides absolutely no hint of accuracy of the input trees, since any trees that had been considered for those subsets would be compatible. For applications, it might therefore be useful to quantify how close to ‘phylogenetically flexible’ a given pattern of taxon coverage is.

Our results also suggest a second possible future research direction. Since submodular functions are connected to matroid theory, are there relevant connections between thin/slim sets and matroids? Other matroid structures in phylogenetics have recently been described, in different contexts, by Dress et al. (2014) and Hellmuth and Seemann (2017).

Acknowledgements

We thank the organisers of the Algebraic and Combinatorial Phylogenetics Workshop (Barcelona, 26–30 June 2017) where some of the ideas in this paper were conceived, and the London Mathematical Society for supporting the visit of KTH and VM to MS in New Zealand. We also thank the two anonymous reviewers of this paper for numerous helpful suggestions.

Contributor Information

Katharina T. Huber, Email: K.Huber@uea.ac.uk

Vincent Moulton, Email: V.Moulton@uea.ac.uk.

Mike Steel, Email: mike.steel@canterbury.ac.nz.

References

  1. Aho AV, Sagiv Y, Szymanski TG, Ullman JD. Inferring a tree from lowest common ancestors with an application to the optimization of relational expressions. SIAM J Comput. 1981;10:405–421. doi: 10.1137/0210030.
  2. Bixby RE, Cunningham WH, et al. Matroid optimization and algorithms. In: Graham RL, et al., editors. Handbook for combinatorics. New York: Elsevier; 1995. pp. 551–609.
  3. Böcker S, Dress AWM. Patchworks. Adv Math. 2001;157:1–21. doi: 10.1006/aima.1999.1912.
  4. Bryant DJ, Steel M. Extension operations on sets of leaf-labelled trees. Adv Appl Math. 1995;16(4):425–453. doi: 10.1006/aama.1995.1020.
  5. Dress A, Steel M. A Hall-type theorem for triplet set systems based on medians in trees. Appl Math Lett. 2009;22:1789–1792. doi: 10.1016/j.aml.2009.07.001.
  6. Dress A, Huber KT, Steel M. A matroid associated with a phylogenetic tree. Discret Math Theor Comput Sci. 2014;16(2):41–56.
  7. Felsenstein J. Inferring phylogenies. Sunderland: Sinauer Associates; 2004.
  8. Fritzilas E, Milanič M, Monnot J, Rios-Solis YA. Resilience and optimization of identifiable bipartite graphs. Discret Appl Math. 2013;161(4):593–603. doi: 10.1016/j.dam.2012.01.005.
  9. Grötschel M, Lovász L, Schrijver A. The ellipsoid method and its consequences in combinatorial optimization. Combinatorica. 1981;1:169–197. doi: 10.1007/BF02579273.
  10. Grünewald S. Slim sets of binary trees. J Comb Theory A. 2012;119:323–330. doi: 10.1016/j.jcta.2011.09.007.
  11. Grünewald S, Huber KT, Moulton V, Steel M. Combinatorial properties of triplet covers for binary trees. 2017. arXiv:1707.07908.
  12. Hall P. On representatives of subsets. J Lond Math Soc. 1935;10(1):26–30. doi: 10.1112/jlms/s1-10.37.26.
  13. Hellmuth M, Seemann CR. The matroid structure of representative triple sets and triple-closure computation. 2017. arXiv:1707.01667.
  14. Lovász L. Submodular functions and convexity. In: Mathematical programming the state of the art. Springer; 1983. pp. 235–257.
  15. Lovász L, Plummer MD. Matching theory. New York: Elsevier; 1986.
  16. Sanderson MJ, McMahon MM, Steel M. Terraces in phylogenetic tree space. Science. 2011;333:448–450. doi: 10.1126/science.1206357.
  17. Semple C, Steel M. Phylogenetics. Oxford: Oxford University Press; 2003.
  18. Steel M, Sanderson MJ. Characterizing phylogenetically decisive taxon coverage. Appl Math Lett. 2010;23:82–86. doi: 10.1016/j.aml.2009.08.009.
  19. Welsh DJA, et al. Matroids: fundamental concepts. In: Graham RL, et al., editors. Handbook for combinatorics. New York: Elsevier; 1995. pp. 481–550.

Articles from Bulletin of Mathematical Biology are provided here courtesy of Springer
