2023 Oct 13;87(5):70. doi: 10.1007/s00285-023-02004-5

Exploring spaces of semi-directed level-1 networks

Simone Linz 1, Kristina Wicke 2
PMCID: PMC10575830  PMID: 37831304

Abstract

Semi-directed phylogenetic networks have recently emerged as a class of phylogenetic networks sitting between rooted (directed) and unrooted (undirected) phylogenetic networks as they contain both directed as well as undirected edges. While various spaces of rooted phylogenetic networks and unrooted phylogenetic networks have been analyzed in recent years and several rearrangement moves to traverse these spaces have been introduced, little is known about spaces of semi-directed phylogenetic networks. Here, we propose a simple rearrangement move for semi-directed phylogenetic networks, called cut edge transfer (CET), and show that the space of semi-directed level-1 networks with precisely k reticulations is connected under CET. This level-1 space is currently the predominantly used search space for most algorithms that reconstruct semi-directed phylogenetic networks. Our results imply that every semi-directed level-1 network with a fixed number of reticulations and leaf set can be reached from any other such network by a sequence of CETs. By introducing two additional moves, R+ and R-, that allow for the addition and deletion, respectively, of a reticulation, we then establish connectedness for the space of all semi-directed level-1 networks on a fixed leaf set. As a byproduct of our results for semi-directed phylogenetic networks, we also show that the space of rooted level-1 networks with a fixed number of reticulations and leaf set is connected under CET, when translated into the rooted setting.

Keywords: Phylogenetic networks, Level-1, Cut edge transfer, Semi-directed networks

Introduction

Phylogenetic networks are a generalization of phylogenetic trees allowing for the representation of speciation and reticulate evolutionary events such as hybridization or lateral gene transfer. Traditionally, two types of phylogenetic networks were considered in the literature: unrooted (also referred to as undirected or implicit) phylogenetic networks and rooted (also referred to as directed or explicit) phylogenetic networks (see for example Huson et al. 2010). While the former are often used to represent conflict in data and lack evolutionary directionality, the latter explicitly depict evolution as a directed process from some common ancestor that is represented by the root to the present-day species that are represented by the leaves of the network. Importantly, rooted phylogenetic networks are rooted directed acyclic graphs that, in comparison with phylogenetic trees, contain vertices with in-degree at least two that represent reticulation events.

Recently, a class of phylogenetic networks that have directed and undirected edges, called semi-directed phylogenetic networks, has emerged in the literature. Roughly speaking, semi-directed phylogenetic networks are obtained from rooted phylogenetic networks by suppressing the root whose position is not identifiable under many models of sequence evolution and ignoring the direction of all edges, except for those directed into a vertex of in-degree at least two, thereby keeping information on which vertices represent reticulation events. Formal definitions of a semi-directed phylogenetic network and other mathematical concepts used in this paper are given in the next section.

Semi-directed phylogenetic networks have been the focus of studies concerning identifiability (see, e.g., Allman et al. 2022; Ardiyansyah 2021; Baños 2018; Gross and Long 2018; Gross et al. 2021; Hollering and Sullivant 2021; Solís-Lemus and Ané 2016; Solís-Lemus et al. 2020; Xu and Ané 2023) and also play a major role in phylogenetic network estimation algorithms such as NANUQ (Allman et al. 2019), SNaQ (Solís-Lemus and Ané 2016), and PhyNEST (Kong et al. 2022). The latter two find an optimal semi-directed phylogenetic network that best “fits” the observed data under a composite likelihood (also called pseudo-likelihood) framework and search through a space of semi-directed phylogenetic networks (detailed below). While SNaQ is implemented in the popular software tool PhyloNetworks (Solís-Lemus et al. 2017) and uses gene trees and quartet concordance factors as input, PhyNEST reconstructs an optimal network from site patterns. Like the reconstruction of rooted and unrooted phylogenetic networks, the reconstruction of an optimal semi-directed phylogenetic network typically involves searching the space of all semi-directed phylogenetic networks on a fixed leaf set. More specifically, given an initial phylogenetic network, the network is modified by locally rearranging its structure, the fit of the new network is evaluated, and if there is an improvement in fit, the search continues from that network until a local optimum is found. This strategy is referred to as hill-climbing. Although alternative optimization strategies such as simulated annealing exist, they all involve the need of traversing spaces of phylogenetic networks.
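The hill-climbing strategy described above can be sketched generically. The following Python sketch is illustrative only and is not taken from SNaQ or PhyNEST: the names `hill_climb`, `neighbors`, and `score` are hypothetical, and integers stand in for networks so that the example runs on its own.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def hill_climb(start: T,
               neighbors: Callable[[T], Iterable[T]],
               score: Callable[[T], float]) -> T:
    """Greedy local search: move to a better-scoring neighbor until none exists."""
    current, current_score = start, score(start)
    improved = True
    while improved:
        improved = False
        for candidate in neighbors(current):
            candidate_score = score(candidate)
            if candidate_score > current_score:
                # Accept the first improving rearrangement and rescan from there.
                current, current_score = candidate, candidate_score
                improved = True
                break
    return current  # a local optimum with respect to `neighbors`


# Toy stand-in (assumption): a "network" is an integer and a move changes it by one.
toy_result = hill_climb(0, lambda n: [n - 1, n + 1], lambda n: -(n - 7) ** 2)
```

Whether such a search can reach every network at all depends on the neighborhood function, which is exactly the connectedness question studied in this paper.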

A fundamental question that arises in this regard is whether the space of phylogenetic networks is connected under a given rearrangement operation. In other words, can every phylogenetic network of a space of networks (e.g., all semi-directed phylogenetic networks on a fixed leaf set) be reached from any other phylogenetic network in the space by applying a sequence of these rearrangement operations such that the resulting network after each operation is also in the space? This question has been analyzed for various spaces of unrooted and rooted phylogenetic trees (e.g., Allen and Steel 2001; Bordewich and Semple 2005; Hein et al. 1996), unrooted phylogenetic networks (e.g., Huber et al. 2015, 2016; Francis et al. 2017; Janssen and Klawitter 2019) and rooted phylogenetic networks (e.g., Bordewich et al. 2017; Erdős et al. 2021; Gambette et al. 2017; Janssen 2021a; Janssen et al. 2018; Klawitter 2018), and several rearrangement moves to traverse these spaces have been introduced. We also refer the reader to two excellent PhD theses on the topic by Janssen (2021b) and Klawitter (2020). While Janssen (2021b) argues that connectedness of the space of all semi-directed phylogenetic networks follows from the connectedness of all rooted phylogenetic networks, much less is known about smaller spaces of semi-directed phylogenetic networks such as level-1 or other popular network classes.

Focusing on the reconstruction of semi-directed level-1 networks, which are networks whose underlying cycles are vertex disjoint, Solís-Lemus and Ané (2016) suggested that the moves employed in SNaQ assure connectivity due to their similarity to moves for which there is an established connectivity result for unrooted level-1 networks (Huber et al. 2015). However, this has not been formally proven yet. Indeed, Fig. 1 of Huber et al. (2015) shows that, although the space of all unrooted level-1 networks on four leaves is connected under the operation proposed in that paper, the space of all such networks restricted to those with two reticulations is not connected under the same operation.

Fig. 1

(a) A semi-directed phylogenetic network Ns on X = {x1, x2} and the unique almost level-1 rooted partner Nr of Ns. As the child of the root of Nr is incident with two reticulation edges in parallel, Ns contains a directed loop. (b) A semi-directed phylogenetic network Ns′ on X = {x1, x2} and the unique level-1 rooted partner Nr′ of Ns′. As the child of the root of Nr′ is the source of a cycle of length three, Ns′ contains a pair of parallel edges. Each of Nr′ and Ns′ is level-1.

The main purpose of this paper is to establish rigorous connectivity results for spaces of semi-directed level-1 networks because SNaQ (Solís-Lemus and Ané 2016) and other algorithms in this area of research such as NANUQ (Allman et al. 2019) and PhyNEST (Kong et al. 2022) also focus on the reconstruction of semi-directed level-1 networks or (in case of PhyNEST) use them as an intermediate step in the estimation of rooted level-1 networks. To this end, we propose a new rearrangement operation for semi-directed phylogenetic networks, called cut edge transfer (CET), which prunes a subnetwork of a semi-directed phylogenetic network by deleting a cut edge and reconnects the two smaller networks by adjoining them with a new cut edge. We then prove that, under CET, the space of semi-directed level-1 networks with a fixed number k of reticulations and leaf set X is connected. Hence, every semi-directed level-1 network with k reticulations and leaf set X can be reached from any other such network by a sequence of CETs such that the network resulting from each CET in the sequence is also a semi-directed level-1 network with k reticulations and leaf set X. As a byproduct of our results, we establish connectivity of rooted level-1 networks with a fixed number of reticulations and leaf set under a rooted version of CET. While CETs operate on semi-directed networks of the same “reticulate complexity” (i.e., the same number of reticulations), we additionally introduce two moves R+ and R- that allow for a change in the number of reticulations by one. Here, we show that (unsurprisingly) under CET, R+, and R-, the space of all semi-directed phylogenetic networks on a fixed leaf set and the space of all semi-directed level-1 networks with a fixed leaf set are connected. Lastly, we show that if two semi-directed level-1 networks are connected by a single CET, then they are also connected by a sequence of restricted local CETs. 
Such a restricted CET, which we refer to as CET1, moves a pruned subnetwork across a single internal edge. This last result suggests that the rearrangement moves employed in SNaQ (Solís-Lemus and Ané 2016) are sufficient to reach any semi-directed level-1 network in the search space if their so-called “nearest neighbor interchange (NNI) move on a tree edge” is slightly relaxed to allow for NNI moves on undirected and directed edges.

The remainder of the paper is organized as follows. We begin by defining rooted and semi-directed phylogenetic networks, as well as several concepts in the study of phylogenetic networks in Sect. 2. In Sect. 3 we introduce the CET operation and discuss some of its properties. Subsequently, in Sect. 4 we establish connectedness results for spaces of rooted level-1 networks under CET that play a crucial role in establishing analogous results for spaces of semi-directed level-1 networks. In Sect. 5, we finally turn to semi-directed phylogenetic networks. We first establish connectedness of semi-directed level-1 networks with a fixed number of reticulations and leaf set in Sect. 5.1 and then connectedness for all such networks if only the leaf set is fixed in Sect. 5.2. Lastly, in Sect. 5.3 we show that if two semi-directed level-1 networks are connected by a single CET, then they are also connected by a sequence of local CET1 moves. We end the paper with some concluding remarks and directions for future research in Sect. 6.

Preliminaries

Throughout this paper, X denotes a non-empty finite set.

Rooted phylogenetic networks and related concepts

Let G be a rooted acyclic directed graph. A loop (v, v) of G is an edge that connects a vertex v with itself. Furthermore, two edges (u, v) and (u′, v′) of G are said to be in parallel if u = u′ and v = v′. Intuitively, if (u, v) and (u′, v′) are in parallel, then they are two copies of the same edge. Now a rooted binary phylogenetic network Nr on X is a rooted acyclic directed graph with no loops that satisfies the following three properties:

  • (i)

    The (unique) root ρ has in-degree zero and out-degree one;

  • (ii)

    A vertex of out-degree zero has in-degree one, and the set of vertices with out-degree zero is X; and

  • (iii)

    All other vertices have either in-degree one and out-degree two, or in-degree two and out-degree one.

The set X is called the leaf set of Nr. As with other publications on spaces of phylogenetic networks (Bordewich et al. 2017; Janssen and Klawitter 2019), we allow edges to be in parallel or, equivalently, underlying cycles of length two. Although we do allow edges to be in parallel in a rooted phylogenetic network, we note that we do not allow them in rooted level-1 networks as defined later in this section. A vertex with in-degree two and out-degree one is called a reticulation, and a vertex with in-degree one and out-degree two is called a tree vertex. Similarly, an edge directed into a reticulation is called a reticulation edge and each non-reticulation edge is called a tree edge. Lastly, for two vertices u and v, we say that u is a parent of v and v is a child of u if (u, v) is an edge of Nr.
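As a minimal sketch of how the degree conditions (i)-(iii) classify vertices, the following Python function checks them on an edge list. The representation (a list of `(parent, child)` pairs in which a repeated pair models parallel edges) and the name `classify_vertices` are assumptions made for illustration; acyclicity is deliberately not checked.

```python
from collections import Counter


def classify_vertices(edges, leaves):
    """Check conditions (i)-(iii); return (root, tree_vertices, reticulations).

    Acyclicity is NOT verified here; the caller is assumed to supply a DAG.
    """
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        if u == v:
            raise ValueError("loops are not allowed in a rooted network")
        outdeg[u] += 1
        indeg[v] += 1
    roots, tree_vertices, reticulations, found_leaves = [], set(), set(), set()
    for v in set(indeg) | set(outdeg):
        degrees = (indeg[v], outdeg[v])
        if degrees == (0, 1):        # condition (i): the root
            roots.append(v)
        elif degrees == (1, 0):      # condition (ii): a leaf
            found_leaves.add(v)
        elif degrees == (1, 2):      # condition (iii): a tree vertex
            tree_vertices.add(v)
        elif degrees == (2, 1):      # condition (iii): a reticulation
            reticulations.add(v)
        else:
            raise ValueError(f"vertex {v!r} has invalid degrees {degrees}")
    if len(roots) != 1 or found_leaves != set(leaves):
        raise ValueError("root or leaf set does not match the definition")
    return roots[0], tree_vertices, reticulations


# A network on X = {x1, x2} whose only cycle is the 3-cycle on u, p, v.
example = [("rho", "u"), ("u", "p"), ("u", "v"), ("p", "v"), ("p", "x1"), ("v", "x2")]
root, tree_vertices, reticulations = classify_vertices(example, {"x1", "x2"})
```

In this example, v is the only reticulation, so (u, v) and (p, v) are the reticulation edges and all remaining edges are tree edges.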

A rooted binary phylogenetic X-tree T is a rooted binary phylogenetic network on X with no reticulation. Let |X| = n. We call T a caterpillar if n = 1, or if n ≥ 2 and we can order the elements in X, say x1, x2, …, xn, so that x1 and x2 have the same parent and, for all i ∈ {2, 3, …, n-1}, we have that (pi+1, pi) is an edge in T, where pi+1 and pi are the parents of xi+1 and xi, respectively. We denote such a caterpillar T by (x1, x2, x3, …, xn) or, equivalently, (x2, x1, x3, …, xn).

Finally, we introduce two graph operations for a rooted acyclic directed graph G. Let e = (u, v) be an edge of G. Then subdividing e with a vertex w refers to deleting e, adding a new vertex w, and adding the edges (u, w) and (w, v). Conversely, given a degree-2 vertex w of G such that (u, w) and (w, v) are edges, suppressing w refers to deleting w and adding a new edge (u, v).
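The two operations can be sketched on the same kind of edge list. This is an illustrative sketch, assuming a directed graph stored as a list of `(tail, head)` pairs; the function names are hypothetical.

```python
def subdivide(edges, e, w):
    """Replace one copy of the edge e = (u, v) by (u, w) and (w, v); w must be new."""
    u, v = e
    result = list(edges)
    result.remove(e)  # raises ValueError if e is not an edge
    return result + [(u, w), (w, v)]


def suppress(edges, w):
    """Delete a vertex w with edges (u, w) and (w, v) and add the edge (u, v)."""
    incoming = [e for e in edges if e[1] == w]
    outgoing = [e for e in edges if e[0] == w]
    if len(incoming) != 1 or len(outgoing) != 1:
        raise ValueError("w must have exactly one incoming and one outgoing edge")
    u, v = incoming[0][0], outgoing[0][1]
    return [e for e in edges if w not in e] + [(u, v)]


tree = [("rho", "a"), ("a", "x"), ("a", "y")]
subdivided = subdivide(tree, ("a", "x"), "w")
restored = suppress(subdivided, "w")
```

Suppressing the vertex that was just introduced by a subdivision returns the original graph, which is why the two operations are described as converses.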

Semi-directed phylogenetic networks and related concepts

We next define a second network type that will play an important role in this paper and that has directed and undirected edges. Adapting the definition that is used in Solís-Lemus and Ané (2016), we say that a network Ns with leaf set X is a semi-directed binary phylogenetic network on X if it can be obtained from a rooted binary phylogenetic network Nr on X and with root ρ in one of the following three ways:

  • (i)

    If the unique child u of ρ is incident with two reticulation edges in parallel that are both directed from u to a vertex w, then undirect all tree edges of Nr, delete ρ and u, and add a (directed) loop (w, w).

  • (ii)

    If the unique child u of ρ is incident with one reticulation edge (u, v) and one tree edge (u, v′), then undirect all tree edges of Nr, delete ρ and u, and add a directed edge (v′, v).

  • (iii)

    If the unique child u of ρ is incident with two tree edges (u, v) and (u, v′), then undirect all tree edges of Nr, delete ρ and u, and add an undirected edge {v, v′}.

We define a loop and a pair of parallel edges of a semi-directed phylogenetic network in the same way as for a rooted phylogenetic network. An example for (i) and (ii) is shown in Fig. 1. If Ns can be obtained from Nr by applying (i), (ii), or (iii), then we say that Nr is a rooted partner of Ns. Moreover, Nr is the unique rooted partner of Ns if (i) applies, in which case (w, w) is the unique loop in Ns. On the other hand, Nr is not necessarily the unique rooted partner of Ns if (ii) or (iii) applies, in which case Ns has no loop. Lastly, we call a vertex v of Ns a reticulation if there either exist two edges that are directed into v or (v, v) is a loop, and we call an edge of Ns that is directed a reticulation edge.
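The three cases of this construction can be sketched as follows. The function name `semi_direct` and the output format (a list of directed edges plus a list of undirected edges stored as `frozenset`s) are assumptions for illustration; the input is assumed to be a valid rooted binary phylogenetic network given as `(parent, child)` pairs, and this is not checked.

```python
from collections import Counter


def semi_direct(edges, root):
    """Return (directed_edges, undirected_edges) of the semi-directed network."""
    indeg = Counter(child for _, child in edges)
    (u,) = [c for p, c in edges if p == root]   # unique child of the root
    a, b = [c for p, c in edges if p == u]      # heads of the two edges leaving u
    rest = [(p, c) for p, c in edges if p not in (root, u)]
    # Reticulation edges keep their direction; tree edges are undirected.
    directed = [(p, c) for p, c in rest if indeg[c] == 2]
    undirected = [frozenset((p, c)) for p, c in rest if indeg[c] != 2]
    if a == b:                                  # case (i): parallel reticulation edges
        directed.append((a, a))
    elif indeg[a] == 2 or indeg[b] == 2:        # case (ii): one reticulation edge
        v, v_prime = (a, b) if indeg[a] == 2 else (b, a)
        directed.append((v_prime, v))
    else:                                       # case (iii): two tree edges
        undirected.append(frozenset((a, b)))
    return directed, undirected


# The child u of the root is the source of the 3-cycle on u, p, v, so case (ii) applies.
rooted = [("rho", "u"), ("u", "p"), ("u", "v"), ("p", "v"), ("p", "x1"), ("v", "x2")]
directed, undirected = semi_direct(rooted, "rho")
```

For this input, the directed edge (p, v) appears twice in the output, so the resulting semi-directed network contains a pair of parallel edges.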

Let Ns and Ns′ be two semi-directed binary phylogenetic networks on X with vertex and edge sets V and E, and V′ and E′, respectively. Then Ns and Ns′ are isomorphic if there is a bijection ψ: V → V′ such that ψ(x) = x for all x ∈ X and (u, v) ∈ E (resp. {u, v} ∈ E) if and only if (ψ(u), ψ(v)) ∈ E′ (resp. {ψ(u), ψ(v)} ∈ E′) for all u, v ∈ V. If Ns and Ns′ are isomorphic, we write Ns ≅ Ns′ and, otherwise, we write Ns ≇ Ns′.

For the remainder of the paper, we will refer to the two types of rooted binary phylogenetic networks and semi-directed binary phylogenetic networks as rooted phylogenetic networks and semi-directed phylogenetic networks, respectively, as all such networks considered here are binary. Moreover, whenever we use the expression of a phylogenetic network N without specifying a type, then the following statement or definition applies to both types of networks. Making use of this last convention, we use r(N) to denote the number of reticulations of a phylogenetic network N. Additionally, in all figures except for Fig. 1, the edges of rooted phylogenetic networks are directed down the page and we omit arrowheads.

Similar to rooted phylogenetic networks, we next define the two operations of subdividing an edge and suppressing a vertex for mixed graphs that have directed and undirected edges, and are therefore a generalization of semi-directed phylogenetic networks. Let G be a mixed graph with at least one undirected edge, and let e and e′ be two edges of G. First, if e is a directed edge (u, v) with u ≠ v (resp. u = v), then subdividing e is the operation that replaces e with the undirected edge {u, w} and the directed edge (w, v) (resp. with two directed edges in parallel from w to u). Second, if e is an undirected edge {u, v}, then subdividing e is the operation that replaces e with the two undirected edges {u, w} and {w, v}. Conversely, for a degree-2 vertex w of G, we distinguish five cases of suppressing w.

  • (i)

    If e is an undirected edge {u, w} and e′ is a directed edge (w, v), then suppressing w replaces e and e′ with a single directed edge (u, v).

  • (ii)

    If e is a directed edge (u, w) and e′ is an undirected edge {w, v}, then suppressing w replaces e and e′ with an undirected edge {u, v}.

  • (iii)

    If e (resp. e′) is an undirected edge {u, w} (resp. {w, v}), then suppressing w is the operation of replacing e and e′ with an undirected edge {u, v}.

  • (iv)

    If e (resp. e′) is a directed edge (u, w) (resp. (w, v)), then suppressing w is the operation of replacing e and e′ with a directed edge (u, v).

  • (v)

    If e is a directed edge (w, u) and e′ is a directed edge (w, v) with u = v, then suppressing w replaces e and e′ with a (directed) loop (v, v).

Cycles and cut edges

Let N be a phylogenetic network. Recall that N may have a loop if it is semi-directed. For ℓ ≥ 1, we refer to a sequence v1, v2, …, vℓ of distinct vertices of N as a cycle of length ℓ or as an ℓ-cycle if {vℓ, v1} and, for each i ∈ {1, 2, …, ℓ-1}, {vi, vi+1} are edges in the underlying graph of N. If ℓ = 1, the definition of a cycle of length one coincides with that of a loop. Furthermore, if the length of an ℓ-cycle is irrelevant, we simply refer to it as a cycle. Now, let e be an edge of N. Recalling that all networks in this contribution are binary, e is called a cut edge (or bridge) of N if the deletion of e from N results in a graph with exactly two connected components. Note that this in particular implies that a cut edge cannot be contained in a cycle.
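The definition of a cut edge translates directly into a (deliberately naive) test: delete the edge and count the connected components of the underlying graph. The sketch below assumes an edge list of vertex pairs whose directions are ignored; the function names are hypothetical, and a linear-time bridge-finding algorithm would be preferable for large networks.

```python
def connected_components(vertices, edges):
    """Number of connected components of the underlying undirected multigraph."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, count = set(), 0
    for start in vertices:
        if start in seen:
            continue
        count += 1
        seen.add(start)
        stack = [start]
        while stack:  # depth-first search from `start`
            x = stack.pop()
            for y in adjacency[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
    return count


def cut_edges(edges):
    """Edges whose deletion leaves exactly two connected components."""
    vertices = {v for e in edges for v in e}
    return [e for i, e in enumerate(edges)
            if connected_components(vertices, edges[:i] + edges[i + 1:]) == 2]


network = [("rho", "u"), ("u", "p"), ("u", "v"), ("p", "v"), ("p", "x1"), ("v", "x2")]
bridges = cut_edges(network)
```

In the example, the three edges of the cycle on u, p, v are not cut edges, consistent with the observation that a cut edge cannot be contained in a cycle.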

Level-1 networks

Let Nr be a rooted phylogenetic network. Then Nr is said to be level-1 if it has no pair of parallel edges and no two cycles have a common vertex. Moreover, if Nr is a rooted level-1 network and v is a vertex of a cycle C of Nr, we call v the source of C if no edge of Nr that is directed into v lies on C. If, on the other hand, v is the unique reticulation of C, then we call it the sink of C. Since Nr is level-1, each cycle of Nr has a unique source and sink.

Extending the definition of level-1 to a semi-directed phylogenetic network Ns, we say that Ns is level-1 if there exists a rooted partner of Ns that is level-1. Notice that a semi-directed level-1 network may contain one pair of parallel edges. This is the case if it was obtained from a rooted level-1 network with the property that the unique child of the root is the source of a cycle of length three. An example of this is depicted in Fig. 1b.

We remark that the number of reticulations in rooted and semi-directed level-1 networks is bounded.

Lemma 2.1

Let N be a rooted or semi-directed level-1 network on X. Then N has at most |X|-1 reticulations.

Proof

First, suppose that N is a rooted level-1 network. Then the lemma follows from Cardona et al. (2008) and McDiarmid et al. (2015), together with the fact that each level-1 network is also tree-child (Huber et al. 2022). Second, suppose that N is a semi-directed level-1 network. Let Nr be a rooted partner of N that is level-1. By construction, v is a reticulation in N if and only if v is a reticulation in Nr. As Nr has at most |X|-1 reticulations, so does N.

Almost level-1 networks

A rooted phylogenetic network on X is called almost level-1 if it has at most one 2-cycle, all other cycles have length at least three, and no two cycles have a common vertex. Similarly, a semi-directed phylogenetic network is called almost level-1 if it has a rooted partner that is almost level-1. Thus, a semi-directed almost level-1 network has at most two cycles of length two and no loop, or at most one loop and no cycle of length two.

Cut edge transfers

In this section we introduce a new rearrangement operation that can be applied to phylogenetic networks and that will play a crucial role in establishing that the space of semi-directed level-1 networks on a fixed leaf set (and a fixed number of reticulations) is connected.

Rooted CET moves

Let Nr be a rooted phylogenetic network, and let e = (u, v) be a cut edge of Nr such that e is not incident with ρ and u is not a reticulation. Obtain a network Nr′ from Nr by deleting e, suppressing u, subdividing an edge of the connected component that contains ρ with a new vertex u′, and adding a new edge (u′, v). Clearly, Nr′ is a rooted phylogenetic network on X. If Nr ≇ Nr′, we say that Nr′ is obtained from Nr by a single cut edge transfer (CET). Furthermore, if Nr′ can be obtained from Nr by a single CET, then conversely Nr can also be obtained from Nr′ by the single CET that reverses the roles of u and u′. Hence, any CET is reversible. Lastly, if Nr is a rooted phylogenetic X-tree, then CETs coincide with rooted subtree prune and regraft (rSPR) operations (Bordewich and Semple 2005).
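The four steps of a rooted CET (delete, suppress, subdivide, reattach) can be sketched on an edge list as follows. This is a minimal sketch under stated assumptions: the function name `rooted_cet` and its arguments are hypothetical, the caller must guarantee that `e` is a cut edge not incident with the root whose tail is a tree vertex, `target` must be an edge of the root's component after the tail of `e` has been suppressed, and the final non-isomorphism check is omitted.

```python
def rooted_cet(edges, e, target, new_vertex):
    """Prune the cut edge e = (u, v) and regraft v onto the edge `target`.

    None of the preconditions of a CET are verified here.
    """
    u, v = e
    pruned = list(edges)
    pruned.remove(e)                                 # delete e
    [(parent, _)] = [f for f in pruned if f[1] == u]
    [(_, sibling)] = [f for f in pruned if f[0] == u]
    pruned = [f for f in pruned if u not in f] + [(parent, sibling)]  # suppress u
    a, b = target
    pruned.remove(target)                            # subdivide `target` with the new vertex
    return pruned + [(a, new_vertex), (new_vertex, b), (new_vertex, v)]


# The caterpillar (x1, x2, x3); moving x1 next to x3 yields the caterpillar (x1, x3, x2).
caterpillar = [("rho", "a"), ("a", "b"), ("a", "x3"), ("b", "x1"), ("b", "x2")]
moved = rooted_cet(caterpillar, ("b", "x1"), ("a", "x3"), "c")
```

Because the input here is a tree, this single move is also an rSPR operation, in line with the remark above.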

Semi-directed CET moves

In the following, we extend the definition of a CET to semi-directed phylogenetic networks. We begin by establishing a relationship between cut edges and reticulation edges of such networks.

Lemma 3.1

Let Ns be a semi-directed phylogenetic network, and let e be an edge of Ns. If e is a reticulation edge of Ns, then e is an edge of a cycle in Ns. Moreover, no cut edge of Ns is a reticulation edge.

Proof

Let Nr be a rooted partner of Ns. Suppose that e is a reticulation edge of Ns. By construction of Ns from Nr, it follows that, as e is an edge of a cycle in Nr, e is also an edge of a cycle in Ns. Now, let f be a cut edge of Ns. Since f is not an edge of a cycle, f is not a reticulation edge of Ns.

We next establish a lemma that pinpoints the relationship between cut edges of a semi-directed phylogenetic network and those of a rooted partner.

Lemma 3.2

Let Ns be a semi-directed phylogenetic network, and let Nr be a rooted partner of Ns with root ρ. Let u and v be two vertices of Ns. Then e={u,v} is a cut edge of Ns if and only if exactly one of the following two conditions applies:

  • (i)

    (u, v) or (v, u) is a cut edge of Nr, or

  • (ii)

    (ρ, t), (t, u), and (t, v) are cut edges of Nr, where t is the unique child of ρ.

Proof

Let t be the unique child of ρ in Nr. By construction of Ns from Nr it follows that {u, v} ∩ {ρ, t} = ∅. First, suppose that e = {u, v} is a cut edge of Ns. If (i) does not apply, then, by construction of Ns from Nr, it follows that neither (u, v) nor (v, u) is an edge of Nr. Hence, t is the parent of each of u and v in Nr, thereby implying that (ii) holds.

Second, suppose that one of (i) and (ii) applies. Clearly, if (i) applies, then {u,v} is a cut edge of Ns. On the other hand, if (ii) applies, then it again follows from the construction of Ns from Nr that {u,v} is a cut edge of Ns.

We are now in a position to introduce CET moves for semi-directed phylogenetic networks. Let Ns be a semi-directed phylogenetic network on X. Furthermore, let e={u,v} be a cut edge of Ns such that u is not a reticulation and there exists a rooted partner Nr of Ns that satisfies one of the following two conditions.

  1. u is the parent of v in Nr or

  2. there exist three cut edges (ρ, t), (t, u), and (t, v) in Nr, where t is the unique child of ρ.

Observe that, by Lemma 3.2, these are the only two possibilities. Then obtain a network Ns′ from Ns by deleting e, suppressing u, subdividing an edge of the connected component that does not contain v with a new vertex u′, and adding a new edge {u′, v}. Recall that if u′ subdivides a loop (w, w) of Ns, then Ns′ has two parallel edges (u′, w). To see that Ns′ is a semi-directed phylogenetic network, observe the following. If the connected component containing v does not contain any cycle, then the operation described above clearly preserves the fact that the edges of the resulting graph can be directed to yield a rooted phylogenetic network, which implies that Ns′ has a rooted partner. If, on the other hand, the connected component containing v contains a cycle, then, by the choice of u and v, there exists a rooted partner Nr of Ns satisfying Conditions 1. or 2. given above. In particular, all edges in the connected component of Ns that contains v must be directed away from v in Nr. So again, the described operation results in a graph that can be directed to yield a rooted phylogenetic network, implying that, in both cases, Ns′ is a semi-directed phylogenetic network. If Ns ≇ Ns′, we say that Ns′ is obtained from Ns by a single cut edge transfer (CET). Similar to the rooted case, if Ns′ can be obtained from Ns by a single CET, then conversely Ns can also be obtained from Ns′ by a single CET.

To illustrate, Fig. 2 shows two semi-directed networks Ns and Ns′ such that the latter network can be obtained from the former by a single CET. We remark that carefully choosing a cut edge e = {u, v} in the definition of a CET is crucial to ensure that the CET results in a semi-directed phylogenetic network. For arbitrary choices of u and v, a CET may result in a graph that is not a semi-directed phylogenetic network. To see this, we refer back to Fig. 2 and note that the roles of u and v cannot be interchanged (i.e., we cannot suppress v while keeping u) because there exists no rooted partner of Ns such that v is a parent of u or each of (ρ, t), (t, u), and (t, v) is a cut edge, where t is the child of ρ.

Fig. 2

A semi-directed phylogenetic network Ns with cut edge e = {u, v}. It can easily be checked that there exists a rooted partner of Ns with u being a parent of v. Deleting e, suppressing u, subdividing an edge of the connected component that does not contain v with a new vertex u′, and adding a new edge {u′, v} is thus a valid CET, and the semi-directed phylogenetic network Ns′ is obtained from Ns by one such operation.

We end this section with several definitions that will be used throughout the remaining sections and that apply to rooted as well as to semi-directed phylogenetic networks.

CET sequences

We call a sequence N0, N1, N2, …, Nm of rooted phylogenetic networks on X or of semi-directed phylogenetic networks on X a CET sequence of length m if each Ni with i ∈ {1, 2, …, m} can be obtained from Ni-1 by a single CET.

(Weak) connectedness under CET

Let C be a space of phylogenetic networks on X. We say that C is connected under CET if, for any pair N and N′ of networks in C, there exists a CET sequence that transforms N into N′ and every network in the sequence is in C.

In the remainder of this paper, we additionally require the notion of weak connectedness. More precisely, we say that the space of rooted level-1 networks with exactly k reticulations is weakly connected under CET if, for all rooted level-1 networks with exactly k reticulations, Nr and Nr′ say, there is a CET sequence connecting Nr and Nr′ whereby every network in the sequence is a rooted almost level-1 network. Similarly, we say that the space of semi-directed level-1 networks with exactly k reticulations is weakly connected under CET if, for all semi-directed level-1 networks with exactly k reticulations, Ns and Ns′ say, there is a CET sequence connecting Ns and Ns′ whereby every network in the sequence is a semi-directed almost level-1 network.

CET distance and diameter

Suppose that a space C of phylogenetic networks is connected under CET. Then the CET distance between two phylogenetic networks N and N′ in C is the minimum length of a CET sequence that connects N and N′, where every network in the sequence is in C. Furthermore, the diameter of C under CET is the maximum CET distance over all pairs of phylogenetic networks in C.
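For a small space, the distance just defined could in principle be computed by breadth-first search over the graph whose vertices are the networks in C and whose edges are single moves. The sketch below is generic and hypothetical: `neighbors` is assumed to return only networks inside C, each network is assumed to be represented by a hashable canonical form so that isomorphic networks compare equal, and integers stand in for networks in the toy example.

```python
from collections import deque


def move_distance(start, goal, neighbors):
    """Length of a shortest move sequence from start to goal, or None if none exists."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            return distance[current]
        for nxt in neighbors(current):
            if nxt not in distance:   # first visit gives the shortest distance
                distance[nxt] = distance[current] + 1
                queue.append(nxt)
    return None                       # goal lies in a different connected component


# Toy stand-in (assumption): six "networks" 0..5 arranged on a path.
def path_neighbors(n):
    return [m for m in (n - 1, n + 1) if 0 <= m <= 5]


toy_distance = move_distance(0, 4, path_neighbors)
```

A return value of `None` for some pair would witness that the space is not connected under the chosen move; the diameter is the largest value returned over all pairs.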

Connectedness of rooted level-1 networks

In this section, we establish connectedness results under CET for spaces of rooted level-1 networks that have a fixed number of reticulations. These results are then used in the next section to establish analogous connectedness results for spaces of semi-directed level-1 networks. As we will see, almost all work goes into proving connectedness for rooted level-1 networks. Once the results of this section are in place, connectedness for spaces of semi-directed level-1 networks follows relatively easily by considering semi-directed level-1 networks and their rooted partners that are level-1.

Definitions

Standard form and standard shape of rooted level-1 networks

We now introduce what we call the standard form of a rooted level-1 network with precisely k reticulations. This network will play a crucial role in what follows since each rooted level-1 network with precisely k reticulations can be transformed into it by using a sequence of CETs. Let Nr be a rooted level-1 network on X with precisely k reticulations and |X| = n. We say that Nr is in standard form if either k = 0 and Nr is a caterpillar, or k ≥ 1 and Nr has the following properties:

  • (i)

    Nr contains precisely k 3-cycles. For each such cycle Ci with i ∈ {1, 2, …, k}, we denote its source by ui, its sink by vi, and its third vertex by pi.

  • (ii)

    For each i ∈ {1, 2, …, k}, vertex pi is the parent of leaf xi.

  • (iii)

    Vertex u1 is the child of the root of Nr, and Nr contains the edges (vi, ui+1) for each i ∈ {1, 2, …, k-1}.

  • (iv)
    Leaves xk+1, xk+2, …, xn are the leaves of a caterpillar T, such that:
    (a) If n = k+1, leaf xn is the only leaf of T and Nr contains the edge (vk, xn);
    (b) If n > k+1, leaves xk+1, …, xn of T are ordered such that xk+1 and xk+2 have the same parent and, for all i ∈ {k+2, k+3, …, n-1}, we have that (pi+1, pi) is an edge in Nr, where pi+1 and pi are the parents of xi+1 and xi, respectively, and such that Nr contains the edge (vk, pn).
    Note that since a rooted level-1 network on X has at most |X|-1 reticulations, i.e., k ≤ n-1, we always have n ≥ k+1, and thus one of (a) and (b) must occur.

A generic example of a rooted level-1 network in standard form is depicted in Fig. 3. For fixed X and fixed k, there is a unique rooted level-1 network of standard form. Continuing on from the definition of a network of standard form, we say that a rooted level-1 network is of standard shape if it only differs from a network in standard form by a permutation of its leaf labels.
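To make the definition concrete, the following sketch builds the edge list of the network in standard form for given n and k ≥ 1 (the case k = 0 is simply a caterpillar and is not covered). The vertex names are strings mirroring the notation above; the function name `standard_form` and the edge-list representation are assumptions made for illustration.

```python
def standard_form(n, k):
    """Edge list of the standard-form network on x1..xn with k >= 1 reticulations."""
    if not 1 <= k <= n - 1:
        raise ValueError("need 1 <= k <= n - 1")
    edges = [("rho", "u1")]                      # (iii): u1 is the child of the root
    for i in range(1, k + 1):
        u, v, p = f"u{i}", f"v{i}", f"p{i}"
        # (i) and (ii): the 3-cycle C_i with source u_i, sink v_i, and leaf x_i below p_i
        edges += [(u, p), (u, v), (p, v), (p, f"x{i}")]
        if i < k:
            edges.append((v, f"u{i + 1}"))       # (iii): chain consecutive cycles
    if n == k + 1:
        edges.append((f"v{k}", f"x{n}"))         # (iv)(a): single remaining leaf
    else:
        edges.append((f"v{k}", f"p{n}"))         # (iv)(b): caterpillar below v_k
        edges += [(f"p{k + 2}", f"x{k + 1}"), (f"p{k + 2}", f"x{k + 2}")]
        for i in range(k + 2, n):
            edges += [(f"p{i + 1}", f"p{i}"), (f"p{i + 1}", f"x{i + 1}")]
    return edges


small = standard_form(4, 1)
```

For n = 4 and k = 1 this yields ten edges: the root edge, the 3-cycle on u1, p1, v1 with leaf x1, and a caterpillar on x2, x3, x4 attached below v1.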

Fig. 3

The rooted level-1 network on X = {x1, x2, …, xn} with precisely k reticulations in standard form

Finally, we introduce two technical concepts, chains of length k and the notion of the correct position of a leaf, that will be used in subsequent lemmas.

Chains of length k

Let Nr be a rooted almost level-1 network on X. For k ≥ 1, we say that a collection of k cycles forms a chain of length k of Nr if there is an ordering (C1, C2, …, Ck) of these cycles such that the path from ρ to u1 contains only tree vertices, where u1 is the source of C1, and, for each i ∈ {1, 2, …, k-1}, vi is an ancestor of each vertex in {vi+1, vi+2, …, vk}, where vi denotes the sink of Ci.

Correct position

Let Nr be a rooted almost level-1 network on X={x1,x2,…,xn} with precisely k reticulations. For each i ∈ {1,2,…,k}, let vi be the sink of cycle Ci. We say that xi with i ∈ {1,2,…,n} is in its correct position if one of the following two conditions is satisfied.

  1. If i ≤ k, then xi is adjacent to a non-sink and non-source vertex of Ci.

  2. If i>k, then xi is a leaf of a caterpillar σ=(y1,y2,…,yn′) with n′ ≤ n that is rooted at vk such that the sequence obtained from σ by deleting each element in {x1,x2,…,xk,xi+1,xi+2,…,xn} is equal to (xk+1,xk+2,…,xi) or (xk+2,xk+1,…,xi).
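The second condition can be read as a small filtering test. The following Python sketch is an illustration only: it assumes that leaves are represented by their indices (so xi is the integer i) and that the caterpillar σ is given as a sequence of such indices in the order (y1, y2, …). The function name is hypothetical and not part of the paper.

```python
def caterpillar_leaf_in_correct_position(sigma, i, k, n):
    """Check condition 2 for leaf x_i (with i > k).

    sigma -- leaf indices of the caterpillar rooted at v_k, in caterpillar order
             (an assumed encoding: x_j is represented by the integer j)
    """
    if not k < i <= n:
        raise ValueError("condition 2 only applies to k < i <= n")
    if i not in sigma:
        return False
    # Delete every element of {x_1..x_k, x_{i+1}..x_n} from sigma.
    deleted = set(range(1, k + 1)) | set(range(i + 1, n + 1))
    kept = [y for y in sigma if y not in deleted]
    # Allowed results: (x_{k+1}, x_{k+2}, ..., x_i) or the same with the
    # first two entries interchanged.
    target = list(range(k + 1, i + 1))
    swapped = target[:]
    if len(swapped) >= 2:
        swapped[0], swapped[1] = swapped[1], swapped[0]
    return kept == target or kept == swapped
```

For instance, with n=6 and k=2, the caterpillar (4,3,6,5) has x5 in its correct position, because deleting x6 leaves (4,3,5), but x6 is not, because (4,3,6,5) is neither (3,4,5,6) nor (4,3,5,6).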

Results

The aim of this section is to establish the following theorem.

Theorem 4.1

Let k be a fixed non-negative integer. If k ≤ |X|-2, then the space of all rooted level-1 networks on X with exactly k reticulations is connected under CET. Otherwise, if k=|X|-1, then the space of rooted level-1 networks on X with exactly k reticulations is weakly connected under CET. Moreover, in both cases, the diameter of the space of rooted level-1 networks on X with exactly k reticulations is O(|X|+k) under CET.

In order to prove Theorem 4.1, we require several technical lemmas. We start with a lemma on the number of tree vertices in a rooted phylogenetic network followed by a lemma that investigates level-1 networks whose cycles all have length three. To this end, recall that the root of a rooted phylogenetic network has in-degree zero and out-degree one. By translating Lemma 2.1 and its proof of McDiarmid et al. (2015) into the language of the present paper, we have the following result.

Lemma 4.2

Let Nr be a rooted phylogenetic network on X. Let k be the number of reticulations in Nr, and let t be the number of tree vertices of Nr. Then t=k+|X|-1.
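Lemma 4.2 is easy to sanity-check on small instances. The sketch below builds the edge list of a network in standard form and counts vertex types by in- and out-degree. It rests on assumptions that go beyond the text shown here: each cycle Ci is taken to be a 3-cycle with source ui, sink vi, and a third vertex whose child is leaf xi, and for k=0 the caterpillar is attached directly below the root. All vertex names (rho, ui, si, vi, pi) and function names are hypothetical.

```python
from collections import Counter


def standard_form(n, k):
    """Edge list of an assumed standard-form network on x1..xn with k 3-cycles."""
    if not 0 <= k <= n - 1:
        raise ValueError("need 0 <= k <= n - 1")
    edges = []
    attach = "rho"  # vertex from which the next component hangs
    for i in range(1, k + 1):
        u, s, v = f"u{i}", f"s{i}", f"v{i}"
        # 3-cycle with source u, sink v, and side vertex s carrying leaf x_i
        edges += [(attach, u), (u, s), (u, v), (s, v), (s, f"x{i}")]
        attach = v
    if n == k + 1:
        edges.append((attach, f"x{n}"))
    else:
        # caterpillar: x_{k+1} and x_{k+2} share a parent; (p_{i+1}, p_i) are edges
        edges += [(f"p{k + 2}", f"x{k + 1}"), (f"p{k + 2}", f"x{k + 2}")]
        for i in range(k + 3, n + 1):
            edges += [(f"p{i}", f"p{i - 1}"), (f"p{i}", f"x{i}")]
        edges.append((attach, f"p{n}"))
    return edges


def vertex_counts(edges):
    """Return (reticulations, tree vertices, leaves) of a directed edge list."""
    indeg, outdeg = Counter(), Counter()
    for a, b in edges:
        outdeg[a] += 1
        indeg[b] += 1
    vertices = set(indeg) | set(outdeg)
    retic = sum(1 for v in vertices if indeg[v] == 2 and outdeg[v] == 1)
    tree = sum(1 for v in vertices if indeg[v] == 1 and outdeg[v] == 2)
    leaves = sum(1 for v in vertices if indeg[v] == 1 and outdeg[v] == 0)
    return retic, tree, leaves
```

Calling `vertex_counts(standard_form(n, k))` for small n and k returns a triple whose middle entry equals k+n-1, in line with the lemma.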

Lemma 4.3

Let Nr be a rooted level-1 network on X with root ρ such that each cycle has length three. Suppose that Nr has exactly k reticulations. Then each reticulation and tree vertex of Nr is a vertex of a cycle if and only if k=|X|-1.

Proof

Let t be the number of tree vertices of Nr. Since each cycle of Nr has length three, we have that k+t ≥ 3k. By Lemma 2.1, Nr has at most |X|-1 reticulations. Furthermore, by Lemma 4.2, the number of reticulations and tree vertices of Nr is

k+t = k+k+|X|-1.    (1)

First, assume that k=|X|-1. Then, Eq. (1) simplifies to k+t=3k. Moreover, since each cycle of Nr has length three, it follows that each reticulation and each tree vertex of Nr is a vertex of a cycle.

Second, assume that k<|X|-1. Using again Eq. (1), we have k+t>3k. Hence, there exists a vertex v in Nr that is not a vertex of a cycle. By Lemma 3.1, v is a tree vertex.
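The counting step behind both cases can be written out in one line. The display below only restates Eq. (1) and the bound of at most |X|-1 reticulations; it adds no new assumption.

```latex
\begin{align*}
(k + t) - 3k &= \bigl(2k + |X| - 1\bigr) - 3k && \text{by Eq. (1)} \\
             &= |X| - 1 - k \;\ge\; 0          && \text{since } k \le |X| - 1 .
\end{align*}
```

Equality holds exactly when k=|X|-1, which is the case in which the 3k cycle vertices account for every reticulation and tree vertex.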

The next lemma shows that every rooted level-1 network on X with precisely k reticulations can be transformed into a rooted level-1 network of standard shape using a sequence of CETs.

Lemma 4.4

Let Nr be a rooted level-1 network on X with precisely k reticulations. Then, there exists a CET sequence of length at most 2|X|+2k that transforms Nr into a rooted level-1 network Nr′ on X with k reticulations of standard shape, whereby

  • (i)

    If k ≤ |X|-2, every network in the sequence is a rooted level-1 network on X with precisely k reticulations;

  • (ii)

    If k=|X|-1, every network in the sequence is a rooted almost level-1 network on X with precisely k reticulations.

The high-level idea of the proof is the following: Given a rooted level-1 network Nr with k ≥ 1 cycles that is not of standard shape, we first transform all cycles into 3-cycles. We then arrange these 3-cycles into a chain of length k and finish the transformation by moving individual leaves.

Proof of Lemma 4.4

If Nr is already in standard shape, there is nothing to show. Else, let C1,C2,…,Ck denote the cycles of Nr with k ≥ 0, and let ui denote the source and vi the sink of Ci for each i ∈ {1,2,…,k}. In what follows, we generate a CET sequence of rooted almost level-1 networks on X whereby each network in the sequence has precisely k cycles. Although the length of a cycle Ci may change throughout the sequence, its sink remains vi. For each network in the sequence, we therefore refer to the cycle with sink vi as cycle Ci.

Let (C1,C2,…,Ck) be an ordering on the cycles in Nr such that Ci precedes Cj if ui is a descendant of uj for i<j. For each i ∈ {1,2,…,k} in order, we now apply a sequence of CETs to transform Ci into a 3-cycle if Ci has length at least four. Intuitively, each such CET reduces the length of Ci by one. Suppose that Nr′ has been obtained from Nr by a sequence of CETs and that cycles C1,C2,…,Ci-1 are 3-cycles in Nr′. Consider the cycle Ci, and let mi denote its length. Further, assume that the vertices of Ci are {ui,vi,s1,s2,…,smi-2}. Let Nr0=Nr′ and set j=1. We apply the following CET to each j ∈ {1,2,…,mi-3}: Let e=(sj,tj) be the cut edge incident with sj. Then we obtain Nrj from Nrj-1 by deleting e, suppressing sj, subdividing the edge incident with ρ with a new vertex uj′, adding the edge (uj′,tj), and incrementing j by one. By the choice of the vertices sj, all moves are valid CETs and, since we apply mi-3 of them, no pair of parallel edges is created in the process. Moreover, when j=mi-2, the length of Ci is three and the process stops. Let Nr′′ denote the rooted level-1 network obtained from Nr by transforming all cycles of Nr into 3-cycles. It follows that each CET in the CET sequence that transforms Nr into Nr′′ cuts an edge e=(sj,tj) in Nrj-1 such that tj is either a leaf or a tree vertex. If tj is a tree vertex, then it has at least one descendant that is a leaf. Hence, by the chosen ordering (C1,C2,…,Ck), Nr′′ is obtained from Nr by at most |X| CETs.
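The cycle-shortening step is a purely mechanical rewiring, which the following Python sketch illustrates on a directed edge list. It is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the edge-list encoding, and the vertex labels are hypothetical, and the function does not verify that the deleted edge is a cut edge or any other validity condition of a CET.

```python
def rooted_cet(edges, cut, target, new_vertex):
    """Delete cut=(u, v), suppress u, subdivide target with new_vertex,
    and add the edge (new_vertex, v). Edges are (tail, head) pairs."""
    u, v = cut
    rest = list(edges)
    rest.remove(cut)  # delete e = (u, v)
    into_u = [e for e in rest if e[1] == u]
    out_of_u = [e for e in rest if e[0] == u]
    if len(into_u) != 1 or len(out_of_u) != 1:
        raise ValueError("u must have in-degree 1 and out-degree 1 after deletion")
    parent, child = into_u[0][0], out_of_u[0][1]
    rest.remove(into_u[0])
    rest.remove(out_of_u[0])
    rest.append((parent, child))  # suppress u
    if target not in rest:
        raise ValueError("target must be an edge after suppressing u")
    a, b = target
    rest.remove(target)
    rest += [(a, new_vertex), (new_vertex, b), (new_vertex, v)]
    return rest


# A 4-cycle with source u, sink v and side vertices s1, s2 (hypothetical labels).
net = [("rho", "u"), ("u", "s1"), ("s1", "s2"), ("s2", "v"), ("u", "v"),
       ("s1", "x1"), ("s2", "x2"), ("v", "x3")]
# Move the cut edge (s1, x1) to the edge incident with the root.
out = rooted_cet(net, cut=("s1", "x1"), target=("rho", "u"), new_vertex="w")
```

In `out`, the cycle consists of u, s2 and v only, so its length has dropped from four to three, while leaf x1 now hangs from the new vertex w directly below the root.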

Now let (C1,C2,…,Ck) be a sequence of the cycles in Nr′′ such that Ci precedes Cj if the source ui of Ci is an ancestor of the source uj of Cj for i<j. We apply a sequence of CETs to transform Nr′′ into a chain of 3-cycles of length k. If C1,C2,…,Ck already form a chain of 3-cycles, we apply no CET. Else assume that for some maximum k′ with 1 ≤ k′ < k, Nr′′ has a chain Hk′ of 3-cycles of length k′. Consider the minimum j ∈ {1,2,…,k} such that Cj is not part of Hk′. Note that j=1 is possible. Let e=(tj,uj) denote the edge directed into the source uj of Cj. By the chosen ordering, tj is neither the root nor a reticulation of Nr′′. We now distinguish two cases:

  (a) If k<|X|-1, by Lemma 4.3, there exists at least one tree vertex in Nr′′, t say, that is not in a cycle. Let e′=(t,c) denote one of its two out-going edges. We apply a sequence of three CETs. The first CET deletes e′, suppresses t, subdivides the edge e=(tj,uj) with a new vertex t′, and adds the edge (t′,c). The second CET deletes the edge (t′,uj) directed into uj, suppresses t′, subdivides the edge incident with ρ with a new vertex tj′, and adds the edge (tj′,uj). Clearly, no parallel edges are created in this step. Finally, let wj denote the child of tj′ that is not uj. The third CET deletes the edge (tj′,wj), suppresses tj′, subdivides the cut edge incident with the sink vj of Cj with a new vertex tj′′, and adds the edge (tj′′,wj). Again, no parallel edges are created in this step. Moreover, tj′′ is a tree vertex in the resulting rooted level-1 network that is not in a cycle. An example of this sequence is depicted in Fig. 4.

  (b) If k=|X|-1, the procedure is similar to Case (a) except that we only perform the second and third CET since, by Lemma 4.3, there is no tree vertex in Nr′′ that is not in a cycle. To be precise, the second CET move deletes the edge (tj,uj) instead of the edge (t′,uj), which implies that this CET creates a pair of parallel edges because tj is a vertex of a cycle of length three in Nr′′. Furthermore, applying the third CET as in Case (a) results in a rooted almost level-1 network with exactly one pair of parallel edges and in which tj′′ is a tree vertex that is not in a cycle.

Let K be the subsequence of (Cj+1,Cj+2,…,Ck) that precisely contains each element that is not a cycle of Hk′. Since each of Cases (a) and (b) above results in a rooted almost level-1 network with a tree vertex that is not in a cycle, we now apply the sequence of three CETs as described in Case (a) to each cycle in K in order. It is straightforward to check that, for k<|X|-1, no parallel edges are created throughout the process, whereas for k=|X|-1 one pair of parallel edges is created by deleting (tj,uj), but no more pairs of parallel edges arise when applying the CETs described in Case (a) to the cycles in K. Moreover, the first CET as described in Case (a) ensures that we can subsequently delete the edge directed into the source of a cycle in K since this edge is not incident with a reticulation. Let Nr′′′ denote the rooted almost level-1 network obtained from Nr′′ by the process of moving all 3-cycles as described above. Since each of Cases (a) and (b) requires at most three CETs, it follows that Nr′′′ is obtained from Nr′′ by a sequence of at most 3k CETs. Moreover, by construction, Nr′′′ is such that the cycles C1,C2,…,Ck form a chain of cycles of length k such that each cycle has length three except for one cycle of length two if k=|X|-1. If k>0, we may assume without loss of generality that the sink vk of Ck has no descendant that is a sink. Otherwise, we set vk to be the root of Nr′′′.

Fig. 4. Sequence of three CETs as described in the proof of Lemma 4.4. Triangles can be single leaves, tree-like structures, cycles, or combinations of all. Moreover, the edges connecting cycles in Nr may be paths with further branching structure, which are omitted for simplicity. The chain of 3-cycles (whose length is increased by one as a result of the sequence of CETs) is depicted in bold.

We now complete the transformation of Nr′′′ into a rooted level-1 network on X of standard shape with precisely k reticulations. Let S be the rooted binary subtree of Nr′′′ whose root is vk, and let XS be the leaf set of S. If S is not a caterpillar in Nr′′′, then we apply a sequence of at most |XS| CETs that each delete a cut edge that is incident with an element in XS and that collectively transform Nr′′′ into a rooted almost level-1 network on X such that vk is the root of a caterpillar with leaf set XS. We next distinguish again two cases. First, if k<|X|-1, we move each leaf x in X\XS that is not adjacent to any 3-cycle in Nr′′′ by deleting the edge that is directed into x and subdividing the edge that is directed out of vk. This transformation requires a single CET for each x. Second, if k=|X|-1, then Nr′′′ contains precisely one 2-cycle. Furthermore, there is at most one leaf x in X\XS that is not adjacent to a 3-cycle. If no such x exists, then |XS|=2, in which case we set x to be one of these two leaves. Let e=(u,v) be an edge of the 2-cycle in Nr′′′. We move x by deleting the edge directed into x and subdividing e. This step requires a single CET and results in a network whose cycles all have length three. Let Nr′′′′ be the network obtained from Nr′′′ as described. Then Nr′′′′ is obtained from Nr′′′ by at most |X|-k CETs. Furthermore, by construction, Nr′′′′ is a rooted level-1 network with precisely k reticulations of standard shape. It now follows that Nr′′′′ can be obtained from Nr by a sequence of at most |X|+3k+|X|-k=2|X|+2k CETs and each intermediate network is a rooted level-1 network with precisely k reticulations if k<|X|-1, or a rooted almost level-1 network with precisely k reticulations if k=|X|-1. This completes the proof.

The following lemma shows that a rooted level-1 network of standard shape can be transformed into a rooted level-1 network in standard form using a sequence of CETs.

Lemma 4.5

Let Nr be a rooted level-1 network on X with precisely k reticulations such that Nr is of standard shape. Then, there exists a CET sequence of length at most 3|X| that transforms Nr into the (unique) rooted level-1 network on X with precisely k reticulations in standard form, whereby

  • (i)

    If k ≤ |X|-2, every network in the sequence is a rooted level-1 network on X with precisely k reticulations;

  • (ii)

    If k=|X|-1, every network in the sequence is a rooted almost level-1 network on X with precisely k reticulations.

Proof

Let X={x1,x2,…,xn}. Furthermore, for some k ≥ 0, let C1,C2,…,Ck denote the 3-cycles of Nr where each cycle Ci with i ∈ {1,2,…,k} has sink vi. Since Nr is of standard shape and only differs from a network in standard form by a permutation on the leaves, vi is an ancestor of each element in {vi+1,vi+2,…,vk} for each i ∈ {1,2,…,k-1}. Similar to the proof of Lemma 4.4, we generate a CET sequence of rooted almost level-1 networks on X whereby each network in the sequence has precisely k cycles. Although the length of a cycle Ci may change throughout the sequence, its sink remains vi. For each network in the sequence, we therefore refer to the cycle with sink vi as cycle Ci and to the caterpillar with root vk as T.

Intuitively, we turn Nr into the network of standard form by a sequence of CETs that sequentially swap the positions of leaves until every leaf is in its correct position (see Fig. 5 for an example). To this end, each CET deletes a cut edge that is incident with a leaf xi and moves it to its correct position in the standard form, whereby we subdivide either an edge of T or an edge of a cycle. The key idea is that if k ≤ |X|-2, we can guarantee that no parallel edges are created, whereas if k=|X|-1, the creation of one pair of parallel edges is unavoidable.

Fig. 5. Sequence of CETs transforming a rooted level-1 network of standard shape but not standard form into a rooted level-1 network of standard form. The first two CETs swap leaves x1 and x2, thereby moving x1 to its correct position. The next two CETs then move x2 to its correct position by swapping leaves x2 and x4. The resulting network is already of standard form, implying that no more CETs are required.

More formally, let Nr be a rooted level-1 network on X with precisely k reticulations of standard shape. Suppose that Nr′ has been obtained from Nr by a sequence of CETs such that the leaves x1,x2,…,xi-1 are already in their correct position in Nr′ for some i<|X|, whereas xi is not in its correct position. If there is no such xi, then all leaves are in their correct positions and Nr′ is already in standard form, in which case there is nothing to show. We now distinguish the following cases to move xi to its correct position via a sequence of CETs:

  1. If xi is a leaf of T and i>k, we apply one CET to move xi to its correct position such that (xk+1,xk+2,…,xi) is a caterpillar. Note that the resulting network is a rooted level-1 network on X with precisely k reticulations of standard shape.

  2. If xi is a leaf of T and i ≤ k, we distinguish two cases:
    • (i)
      If k ≤ |X|-2, then T consists of at least two leaves. In this case, we move xi to its correct position using a single CET, i.e., we move xi to the cycle Ci whose sink is vi. Note that this CET turns Ci into a cycle of length four since Nr is of standard shape and all cycles of Nr have length exactly three. In particular, there exists a leaf xj with j>i that is adjacent to a non-sink and non-source vertex of Ci. We now apply a second CET to move xj to the edge of T that xi had been incident with. Intuitively, this sequence of two CETs swaps the positions of leaves xi and xj and the resulting network is again a rooted level-1 network with precisely k reticulations of standard shape.
    • (ii)
      If k=|X|-1, then xi is the only leaf in T and its parent is vk. Thus we cannot directly perform a CET that deletes (vk,xi). In this case, we consider the cycle Ci whose sink is vi. As Ci has length exactly three, there exists a leaf xj with j>i adjacent to the non-sink non-source vertex of Ci. Note that xj must exist since xi ≠ xn, as otherwise xi=xn would already be in its correct position. We now first move leaf xj to the edge (vk,xi) of T. Then, we move xi to Ci. Intuitively, we again swap the positions of xi and xj using two CETs. However, while the network resulting from the second CET is a rooted level-1 network with precisely k reticulations of standard shape, the network resulting from the first CET contains one pair of parallel edges and is therefore a rooted almost level-1 network.
  3. If xi is adjacent to a non-source and non-sink vertex of a cycle C of Nr, we distinguish three cases:
    • (i)
      If k=|X|-1 and i ≤ k, we directly move xi to its correct position, i.e., we move xi to cycle Ci. Since Nr is a rooted level-1 network with precisely k reticulations of standard shape and all of its cycles are 3-cycles, this move creates a pair of parallel edges and therefore a rooted almost level-1 network. However, for analogous reasons as above, there exists a leaf xj with j>i adjacent to a non-source and non-sink vertex of Ci, and we move xj to C. This sequence of two CETs swaps the roles of xi and xj and results in a rooted level-1 network with precisely k reticulations of standard shape.
    • (ii)
      If k=|X|-1 and i>k, then i=n. In this case, xi is already in its correct position, i.e., it is the single leaf of T adjacent to vk. This is due to the assumption that leaves x1,x2,…,xi-1=xn-1 are already in their correct positions and Nr is a rooted level-1 network with precisely k reticulations of standard shape. In this case, we perform no further CETs.
    • (iii)
      If k ≤ |X|-2, the subtree T of Nr contains at least two leaves. Let xj with j>i be one of these leaves (which must exist for similar reasons as in the cases described above). Furthermore, if i ≤ k, let xj′ be the leaf that is adjacent to the non-source and non-sink vertex of Ci. Since xi is not in its correct position, we have Ci ≠ C and xj′ ≠ xi. We now first move xj to C, thereby turning C into a 4-cycle. Next, we move xi to its correct position, i.e., we move it either to cycle Ci if i ≤ k, thereby turning Ci into a 4-cycle and C into a 3-cycle, or to T if i>k. If i ≤ k, we perform one more CET and move xj′ to the edge of T that xj had been incident with. Again, this sequence of at most three CETs swaps the positions of leaves xi and xj, and possibly xj′, such that each network in the sequence is a rooted level-1 network with precisely k reticulations and the final network is additionally of standard shape.

In summary, if k ≤ |X|-2, we transform Nr into a rooted level-1 network of standard form by a sequence of CETs, whereby every intermediate network is a rooted level-1 network with precisely k reticulations. If k=|X|-1, a single pair of parallel edges might be created during the transformation and, so, every intermediate network is a rooted almost level-1 network. Moreover, since each of the cases requires at most three CETs, it follows that the (unique) rooted level-1 network on X with precisely k reticulations in standard form can be obtained from Nr by a sequence of at most 3|X| CETs. This completes the proof.

We are now finally in the position to prove Theorem 4.1.

Proof of Theorem 4.1

Let Nr and Nr′ be two rooted level-1 networks on X with exactly k reticulations. First, if k ≤ |X|-2 then, by Lemmas 4.4 and 4.5, Nr (resp. Nr′) can be transformed into the rooted level-1 network on X with precisely k reticulations in standard form such that each intermediate network is level-1 and has exactly k reticulations. Hence, if k ≤ |X|-2, it follows from the reversibility of CET that the space of rooted level-1 networks with exactly k reticulations is connected. Second, if k=|X|-1 then, again by Lemmas 4.4 and 4.5, Nr (resp. Nr′) can be transformed into the rooted level-1 network on X with precisely k reticulations in standard form such that each intermediate network is almost level-1 and has exactly k reticulations. Hence, if k=|X|-1, then the space of rooted level-1 networks with exactly k reticulations is weakly connected. Moreover, applying Lemmas 4.4 and 4.5 one more time, it requires at most 2|X|+2k+3|X|=5|X|+2k CETs to transform each of Nr and Nr′ into the unique rooted level-1 network on X with exactly k reticulations in standard form. Hence, if k<|X|-1 (resp. k=|X|-1), then there exists a CET sequence of length at most 10|X|+4k that connects Nr and Nr′ in the space of all rooted level-1 networks on X with exactly k reticulations (resp. in the space of all rooted almost level-1 networks on X with exactly k reticulations). In both cases, the diameter is therefore O(|X|+k).
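The arithmetic behind the diameter bound can be tabulated directly. The short Python sketch below only combines the bounds stated in Lemmas 4.4 and 4.5 and in the proof above; the function and key names are hypothetical, and the values are upper bounds on sequence lengths, not exact CET distances.

```python
def cet_sequence_bounds(n, k):
    """Upper bounds on CET sequence lengths for |X| = n leaves and k reticulations."""
    if not 0 <= k <= n - 1:
        raise ValueError("a rooted level-1 network has 0 <= k <= |X| - 1")
    to_standard_shape = 2 * n + 2 * k          # Lemma 4.4
    shape_to_form = 3 * n                      # Lemma 4.5
    to_standard_form = to_standard_shape + shape_to_form  # 5|X| + 2k
    return {
        "to_standard_shape": to_standard_shape,
        "shape_to_form": shape_to_form,
        "to_standard_form": to_standard_form,
        # Go from one network to the standard form and back out to the other.
        "between_two_networks": 2 * to_standard_form,      # 10|X| + 4k
    }
```

For example, with |X|=5 and k=2 the bounds are 14, 15, 29 and 58, the last of which equals 10|X|+4k.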

We remark in passing that Theorem 4.1 strengthens a previous result on the connectedness of the space of rooted level-1 networks on X with exactly k reticulations. In particular, Klawitter (2020) showed that this space is connected if one allows for k pairs of parallel edges, whereas our result requires at most one pair of parallel edges.

Connectedness of semi-directed level-1 networks

Connectedness for networks with a fixed number of reticulations

In this section, we use the results established in Sect. 4 to establish connectedness results under CET for spaces of semi-directed level-1 networks with a fixed number of reticulations.

The main result of this section is the following theorem.

Theorem 5.1

Let k be a fixed non-negative integer. If k ≤ |X|-2, then the space of all semi-directed level-1 networks on X with exactly k reticulations is connected under CET. Otherwise, if k=|X|-1, then the space of semi-directed level-1 networks on X with exactly k reticulations is weakly connected under CET. Moreover, in both cases, the diameter of the space of semi-directed level-1 networks on X with exactly k reticulations is O(|X|+k) under CET.

To motivate the allowance of parallel edges in establishing connectedness results for semi-directed level-1 networks, note that if k=|X|-1, the space of semi-directed level-1 networks on X with precisely k reticulations is not necessarily connected. As an example, consider the space of semi-directed level-1 networks with |X|=2 and k=1. Let Ns be the semi-directed level-1 network depicted in Fig. 1b, and let Ns′ be the semi-directed level-1 network obtained from Ns by interchanging x1 and x2. Then, Ns ≠ Ns′ and there exists no CET sequence that transforms Ns into Ns′, whereby every network in the sequence is a semi-directed level-1 network with one reticulation. However, it is possible to transform Ns into Ns′ by a sequence of two CETs, whereby the network obtained from Ns by the first CET is a semi-directed almost level-1 network with one reticulation.

Before proving Theorem 5.1, we establish a connection between a sequence of CETs connecting two semi-directed almost level-1 networks and such a sequence connecting their rooted partners that are almost level-1.

Lemma 5.2

Let Ns1 and Ns2 be two distinct semi-directed almost level-1 networks, and let Nr1 and Nr2 be two almost level-1 rooted partners of Ns1 and Ns2, respectively. If Nr2 can be obtained from Nr1 by a single CET, then Ns2 can be obtained from Ns1 by one CET.

Proof

Suppose that Nr2 can be obtained from Nr1 by a single CET. Let e=(u,v) be the cut edge of Nr1 that is deleted in obtaining Nr2 from Nr1. Let M and M′ be the two connected subnetworks that result from deleting e and suppressing u, where M contains ρ and M′ contains v. Furthermore, let f be the edge of M that is subdivided with a new vertex u′ in obtaining Nr2 from M and M′ by adding the edge (u′,v). Observe that f is also an edge of Nr1. Moreover, by definition of a CET, u is not a reticulation and u ≠ ρ. Now, let t be the unique child of ρ in Nr1. Since Nr1 ≠ Nr2, it follows that e and f cannot both be incident with t. To complete the proof, we consider three cases.

First, assume that neither e nor f is incident with t. By Lemma 3.2, {u,v} is a cut edge of Ns1. Moreover, since Ns1 is obtained from Nr1 by applying one of the operations (s1)–(s3), it is easily checked that f is also an edge of Ns1. It now follows that Ns2 can be obtained from Ns1 by the CET that deletes {u,v}, suppresses u, subdivides f with a new vertex u′, and joins the two vertices u′ and v with a new edge.

Second, assume that e is incident with t. Then t=u and all three edges that are incident with t are cut edges of Nr1. Let w be the second child of t that is not v. It follows from Lemma 3.2 that {v,w} is a cut edge of Ns1. Furthermore, as f is not incident with t, an argument analogous to that used in the first case implies that f is also an edge of Ns1. Hence, Ns2 can be obtained from Ns1 by the CET that deletes {v,w}, suppresses w, subdivides f with a new vertex u′, and joins the two vertices u′ and v with a new edge.

Third, assume that f is incident with t. As before, {u,v} is a cut edge of Ns1 by Lemma 3.2. If t is not the source of a cycle, let w and w′ be the two children of t in Nr1. Then {w,w′} is an edge in Ns1. Hence, Ns2 can be obtained from Ns1 by the CET that deletes {u,v}, suppresses u, subdivides {w,w′} with a new vertex u′, and joins the two vertices u′ and v with a new edge. On the other hand, if t is the source of a cycle in Nr1, let (t,w) and (t,w′) be the two edges that are directed out of t. It follows that {w,w′}, (w,w′), or (w′,w) is an edge of Ns1 depending on whether or not one of w and w′ is a reticulation in Ns1. Since Nr1 is almost level-1, we may have w=w′, in which case (w,w′) is a loop. Thus, Ns2 can be obtained from Ns1 by the CET that deletes {u,v}, suppresses u, subdivides {w,w′} with a new vertex u′, and joins the two vertices u′ and v with a new edge. Additionally, if {w,w′} is a loop in Ns1, then one of the two resulting parallel edges that each join u′ and w is initially undirected and therefore directed into w in Ns2.

It now follows that, for all three cases, Ns2 can be obtained from Ns1 by one CET; thereby establishing the lemma.

We are now in a position to prove Theorem 5.1.

Proof of Theorem 5.1

Let Ns and Ns′ be two semi-directed level-1 networks on X that each have exactly k reticulations. Furthermore, let Nr and Nr′ be a level-1 rooted partner of Ns and Ns′, respectively. By Theorem 4.1 and its proof, there exists a CET sequence

Nr ≅ Nr1, Nr2, …, Nrm-1, Nrm ≅ Nr′

with m ≤ 10|X|+4k that connects Nr and Nr′ such that each network in the sequence is either a rooted almost level-1 network on X with exactly k reticulations if k=|X|-1 or a rooted level-1 network on X with exactly k reticulations if k<|X|-1. For each i ∈ {2,3,…,m-1}, let Nsi be the semi-directed network on X that is obtained from Nri by applying one of the operations (s1)–(s3). By construction, Nsi has exactly k reticulations and Nri is a rooted partner of Nsi.

Set Ns1=Ns and Nsm=Ns′. Then, for each i ∈ {1,2,…,m}, Nsi is level-1 (resp. almost level-1) if and only if Nri is level-1 (resp. almost level-1). Now consider Nsi and Nsi+1 for each i ∈ {1,2,…,m-1}. We may have Nsi ≅ Nsi+1. It follows from Lemma 5.2 that Nsi+1 can be obtained from Nsi by at most one CET. Hence, there exists a sequence of at most m CETs that connects Ns and Ns′ such that each network in the sequence is either a semi-directed almost level-1 network with exactly k reticulations if k=|X|-1 or a semi-directed level-1 network on X with exactly k reticulations if k ≤ |X|-2. The theorem now follows.

Connectedness for networks with a varying number of reticulations

In this section, we show that the space of semi-directed level-1 networks on a fixed leaf set is connected under CET and two additional operations, which we now introduce. Intuitively, these two operations change the number of reticulations in a semi-directed phylogenetic network by one.

Definitions

Throughout this section, let Ns be a semi-directed phylogenetic network.

R- moves. Let e=(u,v) be a reticulation edge of Ns such that, if u ≠ v, then u is not a reticulation. If (u,v) is a loop, obtain a network Ns′ from Ns by deleting u and suppressing the resulting degree-two vertex, say w. Observe that, if w is a vertex of a 2-cycle in Ns, then this cycle becomes a loop in Ns′. On the other hand, if (u,v) is not a loop, obtain Ns′ from Ns by undirecting the edge that is directed into v and not e, deleting e, and suppressing u and v.

If e is a loop, then Ns has a unique rooted partner and it follows that the neighbor of u in Ns is not a reticulation. Hence, regardless of whether e is a loop in Ns or not, Ns′ is a semi-directed phylogenetic network on X.

R+ moves. Let e be an edge of Ns. Obtain a network N from Ns in one of the following two ways: (i) Subdivide e with a new vertex v, add the edge {u,v}, where u is a new vertex, and add the (directed) loop (u,u); or (ii) subdivide e with a new vertex v, subdivide an edge in the resulting network with a new vertex u, add the new edge (u,v), and direct one of the two other edges incident with v into v.

In contrast to R-, observe that R+ does not necessarily result in a semi-directed phylogenetic network. For example, if Ns contains a loop and N is obtained from Ns by an R+ as described in (i), then N contains two loops and is not a semi-directed phylogenetic network.

Extended CET and extended CET distance. Now let Ns and Ns′ be two semi-directed phylogenetic networks on X. If Ns′ can be obtained from Ns by a single R+ (resp. R-), then Ns can be obtained from Ns′ by a single R- (resp. R+). Furthermore, we say that Ns′ can be obtained from Ns by a single extended CET if it can be obtained by applying exactly one of CET, R-, and R+ to Ns. Similar to the CET distance, we refer to the minimum number of extended CETs that are required to transform Ns into Ns′ as the extended CET distance between Ns and Ns′.

Results

The main aim of this section is to establish two connectedness results for semi-directed networks that do not have a fixed number of reticulations.

We start with an observation that we freely use throughout this section. For a semi-directed phylogenetic network Ns with no reticulation, the definition of a CET on Ns coincides with that of a subtree prune and regraft (SPR) operation for unrooted phylogenetic trees. To be precise, an unrooted binary phylogenetic X-tree T is an undirected tree whose leaves are bijectively labeled with X and whose internal vertices all have degree three. Under the subtree prune and regraft operation, it is well-known that the space of all unrooted phylogenetic trees on a fixed leaf set is connected (Allen and Steel 2001; Maddison 1991).

Theorem 5.3

The space of all semi-directed level-1 networks on X is connected under extended CET.

Proof

Let Ns and Ns′ be two semi-directed level-1 networks on X, and let k=r(Ns) and k′=r(Ns′). Furthermore, let (v1,v2,…,vk) be an ordering on the reticulations of Ns and, similarly, let (v′1,v′2,…,v′k′) be an ordering on the reticulations of Ns′.

Now, setting Ns0=Ns, repeat the following operation for each i ∈ {1,2,…,k} in order. Obtain a network Nsi from Nsi-1 by applying an R- to a reticulation edge (ui,vi) that is incident with vi. Since Ns0 is level-1, it follows that u1 is not a reticulation. Thus Ns1 is a semi-directed level-1 network on X. Repeating this argument, it follows that each Nsi with i ∈ {0,1,2,…,k} is a semi-directed level-1 network on X and Nsk is an unrooted phylogenetic tree on X. Let Ts=Nsk, and let Ts′ be an unrooted phylogenetic tree on X obtained from Ns′ by applying k′ R- moves in an analogous way. Since Ts′ can be obtained from Ts by a sequence of subtree prune and regraft operations, it follows that Ts′ can be obtained from Ts by a sequence of CETs and each tree in the sequence is an unrooted phylogenetic tree on X. The theorem now follows from the reversibility of CET, R+, and R-.

The next theorem is similar to Theorem 5.3 and establishes connectedness for the larger space of semi-directed phylogenetic networks on a fixed leaf set.

Theorem 5.4

The space of all semi-directed phylogenetic networks on X is connected under extended CET.

Proof

Let Ns and Ns′ be two semi-directed phylogenetic networks on X with k=r(Ns) and k′=r(Ns′). Let Nr and Nr′ be a rooted partner of Ns and Ns′, respectively. Furthermore, let (v1,v2,…,vk) be an ordering on the reticulations of Ns such that, for all i,j ∈ {1,2,…,k} with i<j, vi is not a descendant of vj in Nr. Similarly, let (v′1,v′2,…,v′k′) be an ordering on the reticulations of Ns′ such that, for all distinct i,j ∈ {1,2,…,k′} with i<j, v′i is not a descendant of v′j in Nr′. The theorem can now be established analogously to Theorem 5.3. The more constrained ordering of the reticulations of Ns and Ns′ in comparison to that used in the proof of Theorem 5.3 guarantees that each R- is applied to a reticulation edge (u,v) of a semi-directed phylogenetic network on X such that u is not a reticulation.

The next corollary follows immediately from Theorems 5.3 and 5.4, and the fact that each extended CET is reversible.

Corollary 5.5

The extended CET distance is a metric on the space of all semi-directed phylogenetic networks as well as on all semi-directed level-1 networks on X.
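Since the space is connected and every move is reversible, the distance between two networks is the length of a shortest path in the undirected graph whose vertices are networks and whose edges are single moves. The sketch below illustrates this with a breadth-first search. It assumes a hypothetical callable `neighbors` that returns all objects one move away. Enumerating the extended CET neighbors of an actual network is not addressed here, and the toy dictionary merely stands in for a network space.

```python
from collections import deque


def move_distance(start, target, neighbors):
    """Length of a shortest move sequence from start to target (BFS)."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, dist = queue.popleft()
        for nxt in neighbors(current):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target unreachable; cannot occur in a connected space


# Toy stand-in for a space of networks: a path A - B - C - D.
toy_space = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
```

Because each move is reversible, the neighbor relation is symmetric, so the computed distance is symmetric as well.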

Connectedness using CET1 moves

In the following, we consider CETs that operate “locally” in the sense that when a cut edge {u,v} of a semi-directed phylogenetic network is deleted, the connected component containing v is re-attached via the introduction of a new cut edge in close proximity to its original position (see formal definitions below). We then show that every CET that satisfies a mild constraint can be translated into a sequence of these local CETs.

Definitions

Using similar terminology as Gambette et al. (2017), we define the central concept of this section, namely CET1 moves.

CET1 moves. Let Ns be a semi-directed phylogenetic network on X. First, when a CET deletes a cut edge {u,v}, we refer to the two edges incident with u in Ns that are different from the edge {u,v} as the donor edges, and to the edge that is subdivided by u in Ns prior to adjoining u and v with a new edge as the recipient edge. Then, a CET1 is a CET applied to Ns such that the recipient edge is incident with one of the two donor edges.

As an example, the CET depicted in Fig. 2 is a CET1 since the recipient edge, i.e., the edge incident with leaf x5, is also incident with one of the two donor edges incident with u. If we had instead subdivided the edge incident with leaf x3 by u and then added the edge {u,v}, the resulting CET would not have been a CET1.
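The CET1 condition can be checked mechanically once the donor edges and the recipient edge are known. The following is a minimal sketch, assuming edges are given as pairs of vertex labels and that the network has no parallel edges among the edges involved (labels alone cannot tell parallel edges apart). The function name and the labels in the usage lines are hypothetical and are not taken from Fig. 2.

```python
def is_cet1(donor_edges, recipient_edge):
    """Return True if the recipient edge shares an endpoint with a donor edge.

    donor_edges: the two edges incident with u other than the cut edge {u, v}.
    recipient_edge: the edge that is subdivided when re-attaching v.
    Edge direction is ignored, so each edge is treated as a set of endpoints.
    """
    recipient = set(recipient_edge)
    return any(recipient & set(donor) for donor in donor_edges)


# Hypothetical labels: u has the further neighbors a and b.
donors = [("u", "a"), ("u", "b")]
local_move = is_cet1(donors, ("a", "c"))    # recipient edge touches donor edge {u, a}
distant_move = is_cet1(donors, ("c", "d"))  # recipient edge touches neither donor edge
```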

Note that a CET1 move may be interpreted as an NNI move for semi-directed phylogenetic networks. In particular, a CET1 move on such a network with no reticulation coincides with an NNI move on an unrooted phylogenetic tree.

Next, we consider two particular types of CET moves affecting loops and parallel edges.

Changing the location of a pair of parallel edges and exchanging a loop for a pair of parallel edges. Again, let Ns be a semi-directed almost level-1 network that contains at least one pair of parallel edges and at least one 3-cycle. We say that a CET applied to Ns changes the location of a pair of parallel edges if it deletes a cut edge e whose two donor edges are edges of a 3-cycle (turning this 3-cycle into a 2-cycle) and whose recipient edge is an edge of a 2-cycle (turning this 2-cycle into a 3-cycle). Similarly, if Ns contains (i) precisely one loop and at least one 3-cycle, or (ii) precisely two pairs of parallel edges, we say that a CET applied to Ns exchanges a loop for two pairs of parallel edges or vice versa if it (i) deletes a cut edge e whose two donor edges are edges of a 3-cycle (turning this 3-cycle into a 2-cycle) and whose recipient edge is the loop of Ns (turning the loop into a second 2-cycle), or (ii) deletes a cut edge e whose two donor edges form a 2-cycle (turning this 2-cycle into a loop) and whose recipient edge is an edge of a 2-cycle (turning this 2-cycle into a 3-cycle).
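The three special move types defined above depend only on which short cycles the donor edges and the recipient edge lie on. The lookup below is a small sketch of that case distinction. The integer encoding of the cycle types and the function name are assumptions made for illustration, and the preconditions on the network itself (for example, that it contains precisely one loop) are not checked.

```python
def classify_cet(donor_cycle, recipient_cycle):
    """Classify a CET by the short cycles its donor and recipient edges lie on.

    donor_cycle: 3 if both donor edges are edges of a 3-cycle,
                 2 if they form a 2-cycle, otherwise None.
    recipient_cycle: 2 if the recipient edge is an edge of a 2-cycle,
                     1 if it is a loop, otherwise None.
    """
    special = {
        (3, 2): "changes the location of a pair of parallel edges",
        (3, 1): "exchanges a loop for two pairs of parallel edges",
        (2, 2): "exchanges two pairs of parallel edges for a loop",
    }
    return special.get((donor_cycle, recipient_cycle), "neither")
```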

Results

In this section we show that if $N_s$ and $N_s'$ are two semi-directed (almost) level-1 networks on $X$ with exactly $k$ reticulations that are one CET apart such that the CET does not change the location of a pair of parallel edges, and does not exchange a loop for two pairs of parallel edges or vice versa, then there is also a sequence of CET1 moves connecting $N_s$ and $N_s'$, whereby every network in the sequence is a semi-directed (almost) level-1 network with exactly $k$ reticulations. The restriction of not changing the location of a pair of parallel edges or exchanging a loop for a pair of parallel edges is required to ensure that every network in the sequence is indeed an (almost) level-1 network.

Proposition 5.6

Let $N_s$ and $N_s'$ be two semi-directed level-1 networks on $X$ with exactly $k$ reticulations if $k<|X|-1$, respectively two semi-directed almost level-1 networks on $X$ with exactly $k$ reticulations if $k=|X|-1$, such that $N_s'$ can be obtained from $N_s$ by a single CET that neither changes the location of a pair of parallel edges nor exchanges a loop for two pairs of parallel edges or vice versa. Then, there exists a CET1 sequence transforming $N_s$ into $N_s'$ such that each network in the sequence is level-1 and has exactly $k$ reticulations if $k<|X|-1$, or each network in the sequence is almost level-1 and has exactly $k$ reticulations if $k=|X|-1$.

Proof

We first show that there is a CET1 sequence connecting $N_s$ and $N_s'$, whereby every network in the sequence is a semi-directed network with precisely $k$ reticulations. Let $e=\{u,v\}$ be the edge of $N_s$ that is deleted in obtaining $N_s'$ from $N_s$ by a single CET. Furthermore, let $e'=\{p,q\}$ (respectively, $e'=(p,q)$ if $e'$ is directed) denote the recipient edge in $N_s$. If $e'$ is incident with one of the two donor edges incident with $u$ in $N_s$, the CET to obtain $N_s'$ from $N_s$ is a CET1 and there is nothing to show. Thus, assume that $e'$ is not incident with either of the two donor edges. As $N_s$ is connected, there exists an undirected path $P$ between $u$ and $p$. Let $\{u,u_1\},\{u_1,u_2\},\ldots,\{u_{l-1},u_l\},\{u_l,u_{l+1}\}$ be the sequence of edges of $P$ with $\{u_l,u_{l+1}\}=\{p,q\}$. To ease reading, we view all edges of $P$ as being undirected regardless of whether they are tree or reticulation edges of $N_s$. We now argue that the CET transforming $N_s$ into $N_s'$ can also be realized as a CET1 sequence along $P$. More precisely, the first CET1 consists of deleting $e=\{u,v\}$ and suppressing $u$, subdividing the edge $\{u_1,u_2\}$ with a new vertex $u_1'$, and introducing the edge $\{u_1',v\}$. Because every CET1 is also a CET, this results in a semi-directed phylogenetic network $N_s^1$ with cut edge $\{u_1',v\}$ and precisely $k$ reticulations. Moreover, by construction, $N_s^1$ has a rooted partner, $N_r^1$ say, such that $u_1'$ is the parent of $v$ in $N_r^1$ or there exist three cut edges $(\rho,t)$, $(t,u_1')$, and $(t,v)$ in $N_r^1$. Lastly, observe that $\{u_2,u_3\},\{u_3,u_4\},\ldots,\{u_l,u_{l+1}\}$ is a path in $N_s^1$. We now perform a second CET1, whereby we delete $\{u_1',v\}$ and suppress $u_1'$ in $N_s^1$, subdivide the edge $\{u_2,u_3\}$ with a new vertex $u_2'$, and introduce the edge $\{u_2',v\}$. By construction, this results in a semi-directed network $N_s^2$ with cut edge $\{u_2',v\}$ and precisely $k$ reticulations, where $u_2'$ and $v$ are again such that $u_2'$ is a parent of $v$ in the rooted partner $N_r^2$ of $N_s^2$, or $N_r^2$ contains the three cut edges $(\rho,t)$, $(t,u_2')$, and $(t,v)$. Furthermore, $\{u_3,u_4\},\{u_4,u_5\},\ldots,\{u_l,u_{l+1}\}$ is a path in $N_s^2$.
If $l>2$, we next apply a CET1 to $\{u_2',v\}$ in $N_s^2$ with recipient edge $\{u_3,u_4\}$ and repeat. As $P$ consists of a finite number of edges, this process will eventually lead to a semi-directed network $N_s^l$ obtained from the semi-directed network $N_s^{l-1}$ by deleting the edge $\{u_{l-1}',v\}$, suppressing $u_{l-1}'$, subdividing the edge $\{u_l,u_{l+1}\}=\{p,q\}$ with a new vertex $u_l'$, and adding the edge $\{u_l',v\}$. Since all vertices $u_i'$ with $1\le i<l$ introduced during this process are immediately suppressed in subsequent steps, clearly $N_s^l\cong N_s'$, which completes the first part of the proof.
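The first part of the proof amounts to walking the attachment point of v along the path P, one edge at a time. As a rough illustration, the sketch below lists the recipient edges of the successive CET1 moves for a given path. It assumes the path is supplied as a list of vertex labels starting at u and ending with the two endpoints of the original recipient edge. The function name and labels are hypothetical, and the sketch tracks only edge labels rather than performing the moves on a network.

```python
def cet1_recipient_edges(path):
    """Recipient edges of the successive CET1 moves along a path.

    `path` lists the vertices u, u_1, ..., u_l, u_{l+1}, where the final
    edge {u_l, u_{l+1}} is the recipient edge of the original CET.
    """
    if len(path) < 3:
        raise ValueError("path needs at least the vertices u, u_1, u_2")
    # The i-th CET1 subdivides the edge {u_i, u_{i+1}}. The first path edge
    # {u, u_1} is incident with u and is never used as a recipient edge.
    return [(path[i], path[i + 1]) for i in range(1, len(path) - 1)]


# Path with l = 3: the original recipient edge is {u3, u4}.
steps = cet1_recipient_edges(["u", "u1", "u2", "u3", "u4"])
```

For this path, `steps` contains three edges, matching the l moves in the argument above, and its last entry is the recipient edge of the original CET.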

It remains to argue that every network in the sequence is level-1 if $k<|X|-1$ and is almost level-1 if $k=|X|-1$. We achieve this by showing that each network in the sequence satisfies certain properties that imply that there exists a rooted partner that is level-1, respectively almost level-1, allowing us to conclude that the semi-directed network itself is level-1, respectively almost level-1. Consider the above CET1 sequence $N_s,N_s^1,N_s^2,\ldots,N_s^l\cong N_s'$ transforming $N_s$ into $N_s'$. We first consider the CET1 transforming $N_s$ into $N_s^1$ and distinguish two cases:

  • (i) If $k<|X|-1$, $N_s$ and $N_s'$ are semi-directed level-1 networks and contain at most one pair of parallel edges each and no loop. First, suppose that $N_s$ contains one pair of parallel edges. Since the CET transforming $N_s$ into $N_s'$ by assumption does not change the location of a pair of parallel edges, this implies that the donor edges of $N_s$ cannot be part of a 3-cycle. Hence, when deleting $e=\{u,v\}$ from $N_s$ to obtain $N_s^1$, no additional pair of parallel edges is created. Second, suppose that $N_s$ contains no pair of parallel edges. Then $N_s^1$ contains at most one pair of parallel edges. Thus, in both cases, $N_s^1$ is also level-1. Indeed, $N_r^1$ is a rooted level-1 partner of $N_s^1$.

  • (ii) If $k=|X|-1$, $N_s$ and $N_s'$ are semi-directed almost level-1 networks and each contain at most one loop and no pair of parallel edges, or at most two pairs of parallel edges but no loop. Assume for the sake of a contradiction that deleting $e=\{u,v\}$ from $N_s$ to obtain $N_s^1$ results in $N_s^1$ not being almost level-1, i.e., containing either three pairs of parallel edges, two loops, or one loop and a pair of parallel edges, while deleting $e=\{u,v\}$ from $N_s$ to obtain $N_s'$ results in $N_s'$ containing at most one loop and no pair of parallel edges, or at most two pairs of parallel edges but no loop.
    • If $N_s^1$ contains three pairs of parallel edges, $N_s$ contains two pairs of parallel edges and the donor edges of $N_s$ are edges of a 3-cycle. Since $e=\{u,v\}$ is also deleted when transforming $N_s$ into $N_s'$, either $N_s'$ also contains three pairs of parallel edges, a contradiction to the fact that $N_s'$ is almost level-1, or the recipient edge for the CET from $N_s$ into $N_s'$ is an edge of a 2-cycle, which is also a contradiction, since the CET by assumption does not change the location of a pair of parallel edges.
    • If $N_s^1$ contains two loops, $N_s$ contains at least one loop and at least one pair of parallel edges. This contradicts the fact that $N_s$ is an almost level-1 network.
    • Finally, if $N_s^1$ contains one loop and one pair of parallel edges, then either (a) $N_s$ contains one loop and the donor edges of $N_s$ are edges of a 3-cycle, or (b) $N_s$ contains two pairs of parallel edges, and the donor edges of $N_s$ are edges of such a pair. Again, as $e$ is also deleted to obtain $N_s^1$ from $N_s$, either $N_s'$ also contains one loop and one pair of parallel edges, contradicting the fact that $N_s'$ is almost level-1, or the CET from $N_s$ to $N_s'$ is such that (a) the loop of $N_s$ is exchanged for two pairs of parallel edges, or (b) the two pairs of parallel edges of $N_s$ are exchanged for a loop. Both cases contradict the fact that the CET transforming $N_s$ into $N_s'$ does not exchange a loop for two pairs of parallel edges or vice versa.
    As all three cases lead to a contradiction, $N_s^1$ is an almost level-1 network.
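The counting conditions on loops and pairs of parallel edges used in cases (i) and (ii) can be summarized in a small predicate. The sketch below encodes only these counts as they are stated in the proof. It is a necessary check, not a full test for being (almost) level-1, and the function and parameter names are illustrative assumptions.

```python
def satisfies_cycle_bounds(loops, parallel_pairs, almost=False):
    """Check the loop and parallel-edge counts used in the case analysis.

    Level-1 (almost=False): no loop and at most one pair of parallel edges.
    Almost level-1 (almost=True): at most one loop and no pair of parallel
    edges, or no loop and at most two pairs of parallel edges.
    """
    if not almost:
        return loops == 0 and parallel_pairs <= 1
    return (loops <= 1 and parallel_pairs == 0) or (
        loops == 0 and parallel_pairs <= 2
    )


# The three configurations ruled out in case (ii), as (loops, pairs).
excluded = [(0, 3), (2, 0), (1, 1)]
all_rejected = not any(satisfies_cycle_bounds(l, p, almost=True) for l, p in excluded)
```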

Now, consider $i\in\{1,2,\ldots,l-1\}$ and the CET1 transforming $N_s^i$ into $N_s^{i+1}$. Suppose that deleting the cut edge $\{u_i',v\}$ introduces an excessive loop or pair of parallel edges such that $N_s^{i+1}$ is not level-1 if $k<|X|-1$ or is not almost level-1 if $k=|X|-1$. Since $\{u_i',v\}$ was newly introduced when transforming $N_s^{i-1}$ into $N_s^i$, this new loop or pair of parallel edges must have already existed in $N_s^{i-1}$ and thus ultimately in $N_s$. Thus, it cannot be excessive and $N_s^{i+1}$ is a level-1, respectively almost level-1 network. This completes the proof.

Revisiting the CET sequences used in the proofs of Lemmas 4.4 and 4.5 to establish (weak) connectedness under CET for rooted level-1 networks with precisely k reticulations and translating these sequences into their semi-directed counterparts to establish (weak) connectedness under CET for semi-directed level-1 networks with precisely k reticulations, we notice that no CET changes the location of a pair of parallel edges or exchanges a loop for two pairs of parallel edges (or vice versa). Thus, the conditions of Proposition 5.6 are satisfied and the next corollary follows from Theorem 5.1, where the definition of weakly connected under CET1 is analogous to that of weakly connected under CET.

Corollary 5.7

Under CET1, the space of all semi-directed level-1 networks on $X$ with exactly $k$ reticulations is connected if $k\le|X|-2$ and is weakly connected if $k=|X|-1$.

As mentioned in the introduction, Solís-Lemus and Ané (2016) conjectured that the five types of moves employed in SNaQ are sufficient to guarantee connectedness of the space of semi-directed level-1 networks with a fixed leaf set. While one of these five types increases the number of reticulations by one, no move decreases this number. Hence, SNaQ must effectively guarantee connectedness of the space of semi-directed level-1 networks with a fixed number of reticulations and leaf set, because once a search through the space of semi-directed level-1 networks reaches a network with k reticulations every network that is investigated later in the search has at least k reticulations. Although a precise definition of SNaQ’s fourth move type, called NNI move on a tree edge, is unfortunately missing in Solís-Lemus and Ané (2016), Corollary 5.7 suggests that the space of level-1 networks with a fixed number of reticulations and leaf set is connected under the five moves employed in SNaQ if the authors additionally allow for NNI moves on a reticulation edge. Our results also imply that, if k=|X|-1, then semi-directed level-1 networks that allow for at most two 2-cycles and a single loop need to be considered when searching for an optimal network although, as noted in Solís-Lemus and Ané (2016), reticulations in a 2-cycle and certain other types of short cycles with small adjacent subnetworks are either not detectable or their parameters are not all identifiable.

Concluding remarks

In this paper, we have introduced a new rearrangement operation on semi-directed phylogenetic networks, called CET, that can transform any semi-directed level-1 network with precisely k reticulations into any other such network with the same set of leaves. Moreover, we have introduced two additional operations, R+ and R-, that allow one to move between semi-directed phylogenetic networks and between semi-directed level-1 networks with a fixed leaf set and an arbitrary number of reticulations. While CET moves have a similar flavor to SPR and rSPR moves on unrooted, respectively rooted phylogenetic trees and networks (Allen and Steel 2001; Bordewich and Semple 2005; Bordewich et al. 2017; Gambette et al. 2017), we have also shown that any CET that satisfies a mild constraint can be translated into a sequence of more local CET1 moves, which are similar to NNI moves studied on phylogenetic trees and networks (Gambette et al. 2017; Huber et al. 2015; Janssen and Klawitter 2019; Robinson 1971). Such CET1 moves essentially coincide with moves that are used in the popular network inference software PhyloNetworks (Solís-Lemus and Ané 2016; Solís-Lemus et al. 2017) up to a slight relaxation of one of their moves. Thus, our theoretical results on the connectedness of the space of semi-directed level-1 networks provide some level of assurance that an optimal semi-directed level-1 network can be reached from any such starting network.

While our main focus has been to establish connectedness and diameter results for the space of semi-directed level-1 networks with a fixed number of reticulations and leaf set, there are several open questions to explore in future research. For instance, it would be interesting to analyze the computational complexity of determining the CET distance between any two semi-directed level-1 networks. It would also be interesting to analyze further properties of the space of semi-directed phylogenetic networks on a fixed leaf set or subspaces of it such as the radius of the space. Finally, one could ask which of the results presented in this paper carry over to unrooted phylogenetic networks.

Acknowledgements

We thank Janosch Döcker, Sungsik Kong, Laura Kubatko, and Claudia Solís-Lemus for helpful discussions. Moreover, we thank two anonymous reviewers for detailed comments on an earlier version of this paper. SL thanks the New Zealand Marsden Fund for their financial support.

Funding

Open Access funding enabled and organized by CAUL and its Member Institutions.

Data availability

Not applicable.

Code availability

Not applicable.

Declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Footnotes

1

A connected component of a graph G is a subgraph in which each pair of vertices is connected via a path of edges.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Allen BL, Steel M. Subtree transfer operations and their induced metrics on evolutionary trees. Ann Comb. 2001;5(1):1–15. doi: 10.1007/s00026-001-8006-8.
  2. Allman ES, Baños H, Rhodes JA. NANUQ: a method for inferring species networks from gene trees under the coalescent model. Algorithms Mol Biol. 2019;14(1):1–25. doi: 10.1186/s13015-019-0159-2.
  3. Allman ES, Baños H, Rhodes JA. Identifiability of species network topologies from genomic sequences using the logDet distance. J Math Biol. 2022;84:35. doi: 10.1007/s00285-022-01734-2.
  4. Ardiyansyah M (2021) Distinguishing level-2 phylogenetic networks using phylogenetic invariants. arXiv:2104.12479
  5. Baños H. Identifying species network features from gene tree quartets under the coalescent model. Bull Math Biol. 2018;81(2):494–534. doi: 10.1007/s11538-018-0485-4.
  6. Bordewich M, Semple C. On the computational complexity of the rooted subtree prune and regraft distance. Ann Comb. 2005;8(4):409–423. doi: 10.1007/s00026-004-0229-z.
  7. Bordewich M, Linz S, Semple C. Lost in space? Generalising subtree prune and regraft to spaces of phylogenetic networks. J Theor Biol. 2017;423:1–12. doi: 10.1016/j.jtbi.2017.03.032.
  8. Cardona G, Rosselló F, Valiente G. Comparison of tree-child phylogenetic networks. IEEE ACM Trans Comput Biol Bioinform. 2008;6(4):552–569. doi: 10.1109/TCBB.2007.70270.
  9. Erdős PL, Francis A, Mezei TR. Rooted NNI moves and distance-1 tail moves on tree-based phylogenetic networks. Discrete Appl Math. 2021;294:205–213. doi: 10.1016/j.dam.2021.02.016.
  10. Francis A, Huber KT, Moulton V, et al. Bounds for phylogenetic network space metrics. J Math Biol. 2017;76(5):1229–1248. doi: 10.1007/s00285-017-1171-0.
  11. Gambette P, van Iersel L, Jones M, et al. Rearrangement moves on rooted phylogenetic networks. PLoS Comput Biol. 2017;13(8):e1005611. doi: 10.1371/journal.pcbi.1005611.
  12. Gross E, Long C. Distinguishing phylogenetic networks. SIAM J Appl Algebra Geom. 2018;2(1):72–93. doi: 10.1137/17M1134238.
  13. Gross E, van Iersel L, Janssen R, et al. Distinguishing level-1 phylogenetic networks on the basis of data generated by Markov processes. J Math Biol. 2021;83:32. doi: 10.1007/s00285-021-01653-8.
  14. Hein J, Jiang T, Wang L, et al. On the complexity of comparing evolutionary trees. Discrete Appl Math. 1996;71(1–3):153–169. doi: 10.1016/S0166-218X(96)00062-5.
  15. Hollering B, Sullivant S. Identifiability in phylogenetics using algebraic matroids. J Symb Comput. 2021;104:142–158. doi: 10.1016/j.jsc.2020.04.012.
  16. Huber KT, Linz S, Moulton V, et al. Spaces of phylogenetic networks from generalized nearest-neighbor interchange operations. J Math Biol. 2015;72(3):699–725. doi: 10.1007/s00285-015-0899-7.
  17. Huber KT, Moulton V, Wu T. Transforming phylogenetic networks: moving beyond tree space. J Theor Biol. 2016;404:30–39. doi: 10.1016/j.jtbi.2016.05.030.
  18. Huber KT, van Iersel L, Janssen R et al (2023) Orienting undirected phylogenetic networks. J Comput Syst Sci 103480
  19. Huson DH, Rupp R, Scornavacca C. Phylogenetic networks: concepts, algorithms and applications. Cambridge: Cambridge University Press; 2010.
  20. Janssen R. Heading in the right direction? Using head moves to traverse phylogenetic network space. J Graph Algorithms Appl. 2021;25(1):263–310. doi: 10.7155/jgaa.00559.
  21. Janssen R (2021b) Rearranging phylogenetic networks. PhD thesis, Delft University of Technology
  22. Janssen R, Klawitter J. Rearrangement operations on unrooted phylogenetic networks. Theory Appl Graphs. 2019;22(1):1–31.
  23. Janssen R, Jones M, Erdős PL, et al. Exploring the tiers of rooted phylogenetic network space using tail moves. Bull Math Biol. 2018;80(8):2177–2208. doi: 10.1007/s11538-018-0452-0.
  24. Klawitter J. The SNPR neighbourhood of tree-child networks. J Graph Algorithms Appl. 2018;22(2):329–355. doi: 10.7155/jgaa.00472.
  25. Klawitter J (2020) Spaces of phylogenetic networks. PhD thesis, University of Auckland
  26. Kong S, Swofford D, Kubatko L (2022) Inference of phylogenetic networks from sequence data using composite likelihood. bioRxiv
  27. Maddison DR. The discovery and importance of multiple islands of most-parsimonious trees. Syst Biol. 1991;40(3):315–328. doi: 10.1093/sysbio/40.3.315.
  28. McDiarmid C, Semple C, Welsh D. Counting phylogenetic networks. Ann Comb. 2015;19(1):205–224. doi: 10.1007/s00026-015-0260-2.
  29. Robinson DF. Comparison of labeled trees with valency three. J Comb Theory Ser B. 1971;11(2):105–119. doi: 10.1016/0095-8956(71)90020-7.
  30. Solís-Lemus C, Ané C. Inferring phylogenetic networks with maximum pseudolikelihood under incomplete lineage sorting. PLoS Genet. 2016;12(3):e1005896. doi: 10.1371/journal.pgen.1005896.
  31. Solís-Lemus C, Bastide P, Ané C. PhyloNetworks: a package for phylogenetic networks. Mol Biol Evol. 2017;34(12):3292–3298. doi: 10.1093/molbev/msx235.
  32. Solís-Lemus C, Coen A, Ané C (2020) On the identifiability of phylogenetic networks under a pseudolikelihood model. arXiv:2010.01758
  33. Xu J, Ané C. Identifiability of local and global features of phylogenetic networks from average distances. J Math Biol. 2023;86:12. doi: 10.1007/s00285-022-01847-8.



Articles from Journal of Mathematical Biology are provided here courtesy of Springer
