Group-based phylogenetic models on 3-sunlet networks

Shelby Cox; Elizabeth Gross; Samuel Martin

doi:10.1007/s11538-025-01506-1

2025 Aug 18;87(9):132. doi: 10.1007/s11538-025-01506-1

PMCID: PMC12358336 PMID: 40820189

Abstract

Phylogenetic networks describe the evolution of a set of taxa for which reticulate events have occurred at some point in their evolutionary history. Of particular interest is when the evolutionary history between a set of just three taxa has a reticulate event. In molecular phylogenetics, substitution models can model the process of evolution at the genetic level, and the case of three taxa with a reticulate event can be modeled using a substitution model on a semi-directed graph called a 3-sunlet. We investigate a class of substitution models called group-based phylogenetic models on 3-sunlet networks. In particular, we investigate the discrete geometry of the parameter space and how this relates to the dimension of the phylogenetic variety associated to the model. This enables us to give a dimension formula for this variety for general group-based models when the order of the group is odd.

Introduction

Phylogenetic networks are directed graphs that describe the evolution of a set of taxa for which reticulate events have occurred. Such events, which include horizontal gene transfer and hybridization, are increasingly being discovered to have occurred between taxa, and the development of methods to reconstruct phylogenetic networks from molecular sequence data is an active area of research. It is therefore important that phylogenetic networks and the models that are placed on them are well understood.

In this work, we focus on phylogenetic network-based substitution models. These are latent-variable Markov models where the state space is a set of biological molecules (usually the four nucleic acids $\{A,G,C,T\}$), and along each edge in the network, a transition matrix gives the probabilities of each possible substitution occurring along that edge (see Gross and Long (2018), Nakhleh (2011) for further details). We restrict attention to a family of Markov models called group-based models, so called because the state space of the Markov process is identified with a finite abelian group. In molecular phylogenetics, there are several nucleotide substitution models that are group-based models, such as the Jukes-Cantor (JC) model, the Kimura 2-parameter (K2P) model, and the Kimura 3-parameter (K3P) model. In all these cases, the state space of the four nucleic acids $\{A,G,C,T\}$ is identified with the Klein four-group $Z / 2 Z \times Z / 2 Z$.

For Markov models on phylogenetic networks, the joint probabilities of observing particular patterns at the leaves of the network have polynomial parameterizations in terms of the numerical parameters of the model, i.e., the substitution rates along each edge and reticulation edge parameters. This makes them amenable to algebraic study, and, in particular, the space of all possible joint probabilities at the leaves is the intersection of an algebraic variety with the probability simplex.

In this work we are concerned with the dimension of the variety associated to a phylogenetic network and group-based model. The dimension of such varieties, in this case, t-varieties, is an interesting geometric question in its own right, but also has applications to identifiability and phylogenetic network inference. While Gross et al. (2024) establishes the dimension for most group-based phylogenetic network models, the most elusive case has been when the network contains 3-cycles. In this work, we focus on the smallest phylogenetic network containing a 3-cycle, which is called a 3-sunlet. While these networks remain the most elusive to understand mathematically, they are perhaps the most important cycles to understand biologically. Indeed, it is assumed that 3-cycles are the most common cycle motif in true phylogenetic networks, since they indicate hybridization or lateral gene transfer between two very closely related taxa, whereas larger cycles would indicate such reticulation events between less closely related taxa, which are, in many cases, assumed to be rare. Understanding the dimensions of 3-sunlets can help us establish the statistical property of identifiability, as demonstrated in Proposition 3.9. It can also help us understand how 3-sunlet models are geometrically embedded within larger sunlet models, helping interpret residuals when using algebraic methods as in Barton et al. (2022), Martin et al. (2023) or determining the most appropriate penalty term when using Bayesian methods.

For group-based models on phylogenetic trees, after a transformation, the parameterization of the model is monomial (Evans and Speed 1993; Székely et al. 1993), and thus the corresponding variety is a toric variety. These models have been well studied (see e.g. Sturmfels and Sullivant (2005)). The parameterization of the model on a sunlet network has a combinatorial interpretation, and, after the same transformation, is described by binomials. Here we look at the 3-cycle case in depth, and establish a dimension result for group-based models for groups of odd order.

Theorem 1.1

Let G be a finite abelian group of odd order $ℓ + 1 \geq 5$ , and let $N$ be the 3-sunlet network under the general group-based model given by G, with corresponding phylogenetic network variety $V_{N}^{G}$ . Then the affine dimension of $V_{N}^{G}$ is given by

\dim V_{N}^{G} = 5 ℓ + 1 .

We call the quantity $5 ℓ + 1$ the expected dimension of the model Gross et al. (2024), and Theorem 1.1 agrees with the conjecture given in Gross et al. (2024). As we will see in Section 4, we believe that this conjecture holds for all finite abelian groups (of order at least 5), but as we discuss in detail in Section 5, the proof strategy that we use here is not easily modified for groups of even order, and thus those cases remain open.

Whilst in the analysis of DNA sequences the state space of our models is the four nucleic acids and therefore has order four, odd-order state spaces are common in other analyses. For example, in codon models such as the Goldman-Yang model (Goldman and Yang 1994), triplets of nucleotides, called codons, code for amino acids, and are the states of the model. There are 64 possible nucleotide triplets, but often only 61 are used, since three are ‘stop’ codons, which are not modeled. In amino acid models, the 20 amino acids of the standard genetic code are often recoded to a smaller set. One example is Dayhoff six-state recoding (Dayhoff et al. 1978), where chemically related amino acids are grouped together to form six states. This grouping is based on the substitution rates in the PAM250 matrix, with amino acids with high substitution rates between them grouped together. However, substitution rates between amino acids can depend on factors such as secondary structure and functional domain (Goldman et al. 1998), and so methods have been developed to recode based on replacement patterns from, for example, domain-specific databases (Kosiol et al. 2004). This introduces the possibility of substitution models with odd-order state spaces.

The paper is organized as follows. In Section 2, we describe phylogenetic networks and the parameterization map for 3-sunlet networks for the general group-based model. We also outline the tropicalization method from Draisma (2008) that we use to determine a lower bound for the dimension. The method is rooted in tropical geometry and leads to hyperplane arrangements on spaces of weight vectors. We close Section 2 with observations about the chambers of these hyperplane arrangements for 3-sunlet networks. In Section 3, we prove the main theorem of the paper (Theorem 1.1), which gives a formula for the dimension for 3-sunlet networks under general group-based models of odd order greater than or equal to 5. We end the section with a partial identifiability result (Proposition 3.9) for general group-based models of odd order. In Section 4, we investigate the dimension for small finite abelian groups (both even and odd) through computational experiments. In particular, we explore chambers of hyperplane arrangements to highlight the difficulty in finding appropriate weight vectors that can be used to establish dimensions of 3-sunlets. Section 5 closes the paper with a discussion about the challenges involved in understanding 3-sunlets, and more generally, networks with 3-cycles.

Background

A (rooted binary) phylogenetic network is a rooted, acyclic, directed graph where each non-root internal vertex has in-degree one and out-degree two, or in-degree two and out-degree one. We refer to the internal vertices with in-degree one as tree vertices and the internal vertices with in-degree two as reticulation vertices. The leaves of the phylogenetic network (the vertices of in-degree 1 and out-degree 0) are labelled by a set of taxa, for which we will always use the set of the first n positive integers $[n] = \{1, \dots, n\}$. The two edges directed into a reticulation vertex are called reticulation edges. A phylogenetic network $N$ is said to be level-1 if, in the undirected skeleton of $N$, no two cycles share an edge. For an example of a level-1 phylogenetic network, see Figure 1. A semi-directed phylogenetic network is a mixed graph that is obtained from a phylogenetic network by suppressing the root vertex and un-directing all non-reticulation edges. Semi-directed networks generalize the notion of unrooted trees, and, for group-based models, if two phylogenetic networks have the same underlying semi-directed topology, then their corresponding varieties are also equal (Gross et al. 2024, Lemma 2.2). Since we are concerned with the dimensions of the corresponding varieties, we will only consider semi-directed phylogenetic networks.

Fig. 1 — A rooted, level-1 phylogenetic network. This network contains a single 3-cycle and a single 4-cycle. Reticulation edges are drawn with dashed lines

The fundamental building blocks of level-1 semi-directed phylogenetic networks are unrooted trees and k-sunlet networks, which are the minimal semi-directed phylogenetic networks containing a k-cycle. In this paper, we focus on 3-sunlet networks, which are the minimal semi-directed phylogenetic networks containing a triangle (i.e., a 3-cycle). A 3-sunlet can be obtained from the phylogenetic network in Figure 1 by restricting the network to the leaves labelled by taxa 1, 3, and 6 and suppressing vertices of degree 2 (restriction is discussed in more detail towards the end of Section 3). Phylogenetic networks with triangles are thought to be among the most common phylogenetic networks, because hybridization usually occurs between closely related species. Despite this, 3-sunlet networks are the least understood of the sunlet networks.

We place a group-based model of evolution on a level-1 semi-directed phylogenetic network $N$ by arbitrarily assigning direction to all undirected edges (i.e., non-reticulation edges), and choosing a finite abelian group G and a subgroup B of the automorphism group of G, denoted $Aut (G)$. We note that arbitrarily assigning directions to all undirected edges results in the same variety as rooting the semi-directed network even if the chosen edge directions are not consistent with any placement of a root vertex (see the proof of Lemma 2.2 in Gross et al. (2024)). The group G is identified with the state space of the model, and the group B encodes additional constraints that the transition matrices must adhere to. When we choose $B = \{id\}$ we call the model the general group-based model for G. For example, the Kimura 3-parameter (K3P) model is the general group-based model for $G = Z / 2 Z \times Z / 2 Z$. The Jukes-Cantor (JC) model is the group-based model with $G = Z / 2 Z \times Z / 2 Z$ and $B = Aut (G)$, which we identify with $S_{3}$, the symmetric group on three letters. In between these two we have the Kimura 2-parameter model (K2P), where B is a subgroup isomorphic to $S_{2}$. Group-based models have the desirable property that for any phylogenetic tree there exists a Fourier transformation that transforms expressions for the marginal probabilities of observations at the leaves into monomial expressions (see e.g., (Sullivant 2018, Chapter 15) for an overview). For level-1 phylogenetic networks, that same transformation significantly simplifies the expressions for the marginal probabilities, although they are not monomial (Gross and Long 2018, Prop 4.2).

We are interested in identifying the semi-directed phylogenetic network from observed data on the leaves. As noted above, for group-based models, the root of the network is not identifiable. For certain group-based models, identifiability results are known (see e.g. Gross and Long (2018), Gross et al. (2021), Hollering and Sullivant (2021), Cummings et al. (2023)), but a general result for all group-based models has yet to be determined. Understanding the dimension of the variety associated to a phylogenetic network and model can assist in determining identifiability. A step in this direction was taken in Gross et al. (2024), and some identifiability results were obtained for arbitrary group-based models. Here, one limiting factor was being unable to determine the dimension of the varieties associated to the 3-sunlet network.

The 3-sunlet networks

The 3-sunlet is the semi-directed network topology of a simple 3-leaf phylogenetic network with a single cycle. It poses a particular problem to phylogeneticists, because under the most commonly used 4-state group-based models (JC, K2P, and K3P), the reticulation vertex is not identifiable from data at the leaves of the network (see e.g. (Gross et al. 2021, Lemma 1)). Thus many of the identifiability results obtained for these models require the phylogenetic networks to be ‘triangle free’ (Gross et al. 2021).

Mathematically, the 3-sunlet is a semi-directed graph whose skeleton consists of a single 3-cycle and one leaf vertex adjacent to each vertex in the cycle. One of the vertices in the cycle is the reticulation vertex, and the two cycle edges adjacent to this vertex are reticulation edges. The reticulation edges are the only directed edges and they are directed towards the reticulation vertex (see Figure 2 for an example). By removing either of the edges $e_{6}$ or $e_{5}$ in Figure 2 and undirecting the remaining edge, we obtain an unrooted phylogenetic tree (with a vertex of degree 2), which we denote by $T_{1}$ and $T_{2}$ respectively.

Fig. 2 — (Left) The semi-directed network topology of the 3-sunlet network with taxa labels 1,2, and 3. (Right) A directed 3-sunlet network

In order to simplify our exposition, we begin with the phylogenetic network parameterization in the transformed coordinates. Readers interested in the derivation of this parameterization from the substitution model can consult (Gross et al. 2024, Section 2) for a full explanation. In order to specify the parameterization we must direct the undirected edges of the semi-directed network topology. By (Gross et al. 2024, Lemma 2.2) we may arbitrarily choose these directions, and so for the remainder of this work we denote by $N$ the directed 3-sunlet network in Figure 2 (right).

Let G be a finite abelian group and let B be a subgroup of $Aut (G)$. Let $B \cdot G$ be the set of B-orbits of G, and define $ℓ + 1 := | B \cdot G |$. Note that when $B = \{id\}$ we have $ℓ + 1 = | G |$. A consistent leaf-labelling is a triple $g = (g_{1}, g_{2}, g_{3}) \in G^{3}$ satisfying $g_{1} + g_{2} + g_{3} = 0$. For a fixed G there are exactly $| G |^{2}$ consistent leaf-labellings. We give $C^{6 (ℓ + 1)}$ a basis indexed by B-orbits and edges of $N$, and denote the basis element corresponding to the B-orbit [g] and edge $e_{i}$ by $E_{i}^{g}$. We give $C^{| G |^{2}}$ a basis indexed by consistent leaf-labellings $g = (g_{1}, g_{2}, g_{3})$. Then the parameterization map $ϕ_{N}^{(G, B)}$ (in transformed coordinates) is given by

\begin{matrix} ϕ_{N}^{(G, B)} : C^{6 (ℓ + 1)} & \to C^{| G |^{2}}, \\ (ϕ_{N}^{(G, B)} (w))_{g} & = w_{1}^{g_{1}} w_{2}^{g_{2}} w_{3}^{g_{3}} w_{4}^{g_{1} + g_{2}} w_{5}^{g_{1}} + w_{1}^{g_{1}} w_{2}^{g_{2}} w_{3}^{g_{3}} w_{4}^{g_{2}} w_{6}^{g_{1}} \\ & = m_{1} (g) + m_{2} (g), \end{matrix}

where $w_{i}^{g}$ is the coefficient of $E_{i}^{g}$ in w. Here, the first term $m_{1} (g) := w_{1}^{g_{1}} w_{2}^{g_{2}} w_{3}^{g_{3}} w_{4}^{g_{1} + g_{2}} w_{5}^{g_{1}}$ comes from the phylogenetic tree $T_{1}$ (obtained from $N$ by removing the edge $e_{6}$). Each superscript is given by the edge-labelling of the edge $e_{i}$ in the right-hand diagram in Figure 2 (see Gross et al. (2024) for further details). Similarly, the second term $m_{2} (g) := w_{1}^{g_{1}} w_{2}^{g_{2}} w_{3}^{g_{3}} w_{4}^{g_{2}} w_{6}^{g_{1}}$ comes from the phylogenetic tree $T_{2}$ (obtained from $N$ by removing the edge $e_{5}$). The phylogenetic variety of $N$ and (G, B) is defined as the Zariski closure of the image of $ϕ_{N}^{(G, B)}$, denoted

V_{N}^{(G, B)} = \overline{\operatorname{im} ϕ_{N}^{(G, B)}},

and this is the object that we study. Since the map $ϕ_{N}^{(G, B)}$ is homogeneous, the variety $V_{N}^{(G, B)}$ is a projective variety. However, we will mostly remain in affine space and consider the affine cone.
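To make the indexing concrete, the following Python sketch enumerates the consistent leaf-labellings and lists the parameters appearing in $m_{1} (g)$ and $m_{2} (g)$. It assumes a cyclic group $Z / n Z$ with $B = \{id\}$, and the function names `consistent_labellings` and `monomials` are hypothetical, chosen only for this illustration.

```python
from itertools import product


def consistent_labellings(n):
    """All triples (g1, g2, g3) in (Z/nZ)^3 with g1 + g2 + g3 = 0."""
    return [(g1, g2, (-g1 - g2) % n) for g1, g2 in product(range(n), repeat=2)]


def monomials(g, n):
    """Return (m1, m2) as lists of (edge index, group element) pairs.

    m1 uses edges e1..e5 (tree T1); m2 uses e1..e4 and e6 (tree T2).
    Each pair (i, h) stands for the parameter w_i^h, with exponent 1.
    """
    g1, g2, g3 = g
    shared = [(1, g1), (2, g2), (3, g3)]
    m1 = shared + [(4, (g1 + g2) % n), (5, g1)]
    m2 = shared + [(4, g2), (6, g1)]
    return m1, m2


n = 3  # assumption: G = Z/3Z, chosen only to keep the output short
labels = consistent_labellings(n)
print(len(labels))  # 9 = n**2
for g in labels:
    m1, m2 = monomials(g, n)
    print(g, m1, m2)
```

For $n = 2$ the same functions reproduce the four binomials of the $Z / 2 Z$ case.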

Observe that the map $ϕ_{N}^{(G, B)}$ is a morphism of affine varieties. It has comorphism given by

\begin{matrix} ψ_{N}^{(G, B)} : C [q_{g} \mid g_{1} + g_{2} + g_{3} = 0] & \to C [a_{i}^{g} \mid i = 1, \dots, 6, \text{ and } g \in B \cdot G], \\ q_{g} & ⟼ a_{1}^{g_{1}} a_{2}^{g_{2}} a_{3}^{g_{3}} a_{4}^{g_{1} + g_{2}} a_{5}^{g_{1}} + a_{1}^{g_{1}} a_{2}^{g_{2}} a_{3}^{g_{3}} a_{4}^{g_{2}} a_{6}^{g_{1}}, \end{matrix}

where we think of the coordinate ring of $C^{| G |^{2}}$ as being generated by variables $q_{g}$ for consistent leaf-labellings $g = (g_{1}, g_{2}, g_{3})$, and we think of the coordinate ring of $C^{6 (ℓ + 1)}$ as being generated by variables $a_{i}^{g}$ for $i = 1, \dots, 6$ and $g \in B \cdot G$. Under this definition, the vanishing ideal of $V_{N}^{(G, B)}$, which we denote $I_{N}^{(G, B)}$, is given by $\ker ψ_{N}^{(G, B)}$. This ideal, and in particular its generating sets, are an important object of study in mathematical phylogenetics as they can be used for model selection (Barton et al. 2022; Cummings et al. 2024; Martin et al. 2023) and to establish identifiability (Gross and Long 2018; Hollering and Sullivant 2021; Gross et al. 2021; Cummings et al. 2023). While some polynomials are known for some group-based models such as CFN (Cummings et al. 2024; Cummings et al. 2023), and for certain networks, such as the 4-sunlet, polynomials have been calculated for the JC, K2P (Martin et al. 2023), and K3P (Cummings and Hollering 2026) models, generating sets of $I_{N}^{(G, B)}$ are not known in general.

Determining Dimension

In Section 3, we give a dimension result for the 3-sunlet network and the general group-based model for groups of odd order by following the approach taken in Gross et al. (2024). Here, we introduce the concepts and objects we need. First, for a level-1 phylogenetic network $N$ and group-based model (G, B), the variety $V_{N}^{(G, B)}$ is defined as the Zariski closure of the image of a polynomial map (as defined above for the 3-sunlet). Considering the number of free parameters in the domain of the map gives us an upper bound on the dimension.

Lemma 2.1

(Gross et al. 2024, Proposition 4.2) Let $N$ be the 3-sunlet network, G a finite abelian group and B a subgroup of $Aut (G)$ with $| B \cdot G | = ℓ + 1$. Then

\dim V_{N}^{(G, B)} \leq 5 ℓ + 1 .

$□$

As a result of Lemma 2.1, to prove Theorem 1.1, it is sufficient for us to give a lower bound on the dimension. We do this by exhibiting a Jacobian matrix of the tropicalization of the parameterization that has sufficient rank.

Let $ϕ = ϕ_{N}^{(G, B)} : C^{6 (ℓ + 1)} \to C^{| G |^{2}}$ be the parameterization map of the 3-sunlet network under the general group-based model for an abelian group G, and for a consistent leaf-labelling $g = (g_{1}, g_{2}, g_{3})$, let $ϕ_{g}$ be the component of $ϕ$ mapping onto the $g$-coordinate of $C^{| G |^{2}}$. Since $ϕ$ is a polynomial map, each $ϕ_{g}$ is a polynomial, and we can define

\begin{matrix} Trop (ϕ_{g}) : R^{6 (ℓ + 1)} & \to R, \\ λ & \mapsto \min_{α \in M} ⟨ λ, α ⟩, \end{matrix}

where $M \subset Z^{6 (ℓ + 1)}$ denotes the set of exponent vectors corresponding to the monomials in the polynomial expression for $ϕ_{g}$. Then $Trop (ϕ) : R^{6 (ℓ + 1)} \to R^{| G |^{2}}$ is the map with components $Trop (ϕ_{g})$. Now for $λ \in R^{6 (ℓ + 1)}$ at which $Trop (ϕ)$ is differentiable there exists a matrix $A_{λ}$ such that $Trop (ϕ) (μ) = A_{λ}^{T} μ$ for all $μ$ in an open neighbourhood of $λ$ (in fact $A_{λ}^{T}$ is the Jacobian of $Trop (ϕ)$ at $λ$). Then we have a lower bound on the dimension of the affine variety $V_{N}^{G}$ given by

\dim V_{N}^{G} \geq \max_{λ \in R^{6 (ℓ + 1)}} \operatorname{rank}_{R} A_{λ} .

This is a specific case of Corollary 2.3 in Draisma (2008). For full details we recommend the reader consult Draisma (2008) and (Gross et al. 2024, Section 2.3).

In this paper, we study the matrices $A_{λ}$, and, in particular, the cone in the space $R^{6 (ℓ + 1)}$ that they induce. We can think about this space as being a kind of ‘tropical dual’ to the parameter space $C^{6 (ℓ + 1)}$, and we adopt the same indexing. That is, the entries of $λ \in R^{6 (ℓ + 1)}$ are indexed by B-orbits and edges of $N$. Then $λ (w_{i}^{g}) = λ_{i}^{g}$, where $i \in \{1, 2, 3, 4, 5, 6\}$ and $g \in B \cdot G$. Observe that the vector $λ$ defines a monomial order on the polynomial ring $C [a_{i}^{g} \mid i = 1, \dots, 6, \text{ and } g \in B \cdot G]$ (provided we specify another order for resolving ties). For this reason, we call $λ \in R^{6 (ℓ + 1)}$ a weight vector.

For weight vectors $λ$ where $Trop (ϕ)$ is differentiable, each column of $A_{λ}$ is indexed by a consistent leaf-labelling $g = (g_{1}, g_{2}, g_{3})$ . The $g$ -entry of the parameterization $ϕ_{N}$ is given by $m_{1} (g) + m_{2} (g)$ , where $m_{i} (g)$ is the $g$ -entry of the parameterization of $T_{i}$ . For each $g$ and $m_{i} = m_{i} (g)$ we define $λ (m_{i})$ to be the natural product of the row vector of exponents of $m_{i}$ , with the column vector $λ$ . That is,

\begin{matrix} λ (m_{1}) & = λ (w_{1}^{g_{1}} w_{2}^{g_{2}} w_{3}^{g_{3}} w_{4}^{g_{1} + g_{2}} w_{5}^{g_{1}}) \\ & = λ_{1}^{g_{1}} + λ_{2}^{g_{2}} + λ_{3}^{g_{3}} + λ_{4}^{g_{1} + g_{2}} + λ_{5}^{g_{1}} \qquad (1) \end{matrix}

and

\begin{matrix} λ (m_{2}) & = λ (w_{1}^{g_{1}} w_{2}^{g_{2}} w_{3}^{g_{3}} w_{4}^{g_{2}} w_{6}^{g_{1}}) \\ & = λ_{1}^{g_{1}} + λ_{2}^{g_{2}} + λ_{3}^{g_{3}} + λ_{4}^{g_{2}} + λ_{6}^{g_{1}} . \qquad (2) \end{matrix}

Thus for a given weight vector $λ$, we take the column of $A_{λ}$ indexed by $g$ to be the exponent vector of the monomial $m_{i}$ where $λ (m_{i}) < λ (m_{j})$ for $i \neq j \in \{1, 2\}$; in this case, we will say that $λ$ assigns $g$ to tree $T_{i}$. The procedure just described is equivalent to describing an initial ideal of the ideal generated by the image of $ψ_{N}^{(G, B)}$, where the initial term for each generator, $in_{λ} (ψ_{N}^{(G, B)} (q_{g}))$, is given by the monomial $m_{i}$.
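The column-selection rule can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the group is cyclic, $G = Z / n Z$ with $B = \{id\}$, a weight vector is stored as a dictionary keyed by pairs (edge index, group element), and the helper names `exponents`, `weight` and `tropical_jacobian` are hypothetical.

```python
from itertools import product


def exponents(g, n):
    """Parameters (edge, element) appearing in m1(g) and in m2(g)."""
    g1, g2, g3 = g
    shared = {(1, g1), (2, g2), (3, g3)}
    return shared | {(4, (g1 + g2) % n), (5, g1)}, shared | {(4, g2), (6, g1)}


def weight(lam, monomial):
    """lambda(m): sum of the weights of the parameters appearing in m."""
    return sum(lam[key] for key in monomial)


def tropical_jacobian(lam, n):
    """Build A_lambda: rows are parameters w_i^h, columns are labellings g."""
    rows = [(i, h) for i in range(1, 7) for h in range(n)]
    cols = [(g1, g2, (-g1 - g2) % n) for g1, g2 in product(range(n), repeat=2)]
    matrix = [[0] * len(cols) for _ in rows]
    assignment = {}
    for c, g in enumerate(cols):
        m1, m2 = exponents(g, n)
        w1, w2 = weight(lam, m1), weight(lam, m2)
        if w1 == w2:
            # Trop(phi) is not differentiable here, so A_lambda is undefined.
            raise ValueError(f"lambda lies on the hyperplane H_g for g = {g}")
        assignment[g] = 1 if w1 < w2 else 2
        chosen = m1 if w1 < w2 else m2
        for r, key in enumerate(rows):
            matrix[r][c] = int(key in chosen)
    return rows, cols, matrix, assignment


n = 3  # assumption: G = Z/3Z
lam = {(i, h): 0 for i in range(1, 7) for h in range(n)}
for h in range(n):
    lam[(6, h)] = 10  # large weights on e6 push every column to T1
rows, cols, A, assignment = tropical_jacobian(lam, n)
print(set(assignment.values()))  # {1}
```

Raising an error on ties is a design choice of this sketch: a tie means the weight vector lies on one of the hyperplanes where $Trop (ϕ)$ is not differentiable, so no column can be assigned.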

In Section 3, we give a solution to the dimension problem for general group-based models on the 3-sunlet network for all finite abelian groups of odd order at least 7 (we handle $Z / 5 Z$ by computation), by constructing $λ$ such that $A_{λ}$ has maximal rank.

Defining Hyperplanes

The matrix $A_{λ}$ is determined by inequalities between linear combinations of the coordinates of the weight vector $λ$ . In this subsection, we construct a hyperplane arrangement $H_{G}$ , which divides Euclidean space into the regions on which $Trop (ϕ)$ is differentiable and $A_{λ}$ is constant. The hyperplanes themselves correspond to regions on which $Trop (ϕ)$ is not differentiable. By understanding the defining hyperplanes and resulting geometry, we are able to construct a weight vector $λ$ so that $A_{λ}$ has the maximum possible rank, and therefore gives the best lower bound on the dimension of the variety $V_{N}^{(G, B)}$ .

Definition 2.2

A hyperplane arrangement is a collection of hyperplanes $H = \{H_{i}\}_{i \in I}$, $H_{i} \subset R^{n}$. A connected component of the complement $R^{n} \setminus \bigcup_{i \in I} H_{i}$ is a chamber of $H$. We denote by $C (H)$ the set of chambers of $H$.

For the 3-sunlet network, each column of $A_{λ}$ corresponds to a consistent leaf-labelling $g = (g_{1}, g_{2}, g_{3})$. The column of $A_{λ}$ corresponding to $g$ can be one of two vectors, and depends on which inequality the coordinates of $λ$ satisfy. Thus, each consistent leaf-labelling $g$ determines a hyperplane of the arrangement we want to construct for the 3-sunlet network. As above, for $i = 1, 2$ we write $m_{i} (g)$ for the monomial corresponding to the $i^{th}$ tree in the $g$-entry of the parameterization map $ϕ_{N}$. Then, we can define the hyperplane arrangement corresponding to the 3-sunlet as the collection $H = \{H_{g} \mid g = (g_{1}, g_{2}, g_{3}) \text{ is a consistent leaf-labelling}\}$, where

\begin{matrix} H_{g} & := \{λ \in R^{6 (ℓ + 1)} \mid λ (m_{1} (g)) = λ (m_{2} (g))\} \\ & = \{λ \in R^{6 (ℓ + 1)} \mid λ_{1}^{g_{1}} + λ_{2}^{g_{2}} + λ_{3}^{g_{3}} + λ_{4}^{g_{1} + g_{2}} + λ_{5}^{g_{1}} = λ_{1}^{g_{1}} + λ_{2}^{g_{2}} + λ_{3}^{g_{3}} + λ_{4}^{g_{2}} + λ_{6}^{g_{1}}\} . \end{matrix}

In Lemma 2.3, we make several observations about the hyperplanes $H_{g}$ . Note that the hyperplane arrangement $H$ and the resulting chambers $C (H)$ form the Gröbner fan of the ideal generated by the image of $ψ_{N}^{(G, B)}$ , denoted $⟨ im ψ_{N}^{(G, B)} ⟩$ . In this interpretation, the hyperplanes themselves consist of those points $λ$ for which the initial ideal ${in}_{λ} (⟨ im ψ_{N}^{(G, B)} ⟩)$ is not a monomial ideal, and the chambers consist of those points $λ$ for which the initial ideal ${in}_{λ} (⟨ im ψ_{N}^{(G, B)} ⟩)$ is monomial.

We make the following observation about the matrix $A_{λ}$ .

Lemma 2.3

Let $μ_{g} : = λ_{6}^{g} - λ_{5}^{g}$ for all $g \in G$ . The inequalities determining the matrix $A_{λ}$ are:

\begin{matrix} 0 < μ_{0} \text{ or } μ_{0} < 0, \text{ and} \qquad (3) \end{matrix}

\begin{matrix} λ_{4}^{g_{1} + g_{2}} - λ_{4}^{g_{2}} < μ_{g_{1}} \text{ or } μ_{g_{1}} < λ_{4}^{g_{1} + g_{2}} - λ_{4}^{g_{2}} \text{ for all } g_{1}, g_{2} \in G \text{ such that } g_{1} \neq 0 . \qquad (4) \end{matrix}

In particular, the determining inequalities depend only on $g_{1}$ and $g_{2}$ .

Proof

The matrix $A_{λ}$ is constant on a region R if and only if each column is constant on R, so the hyperplane arrangement we seek is the union of the arrangements that each cut out the regions on which a single column is constant.

The assignment of columns is equivalent to choosing the direction of the inequality in $λ (m_{1}) < λ (m_{2})$ or $λ (m_{1}) > λ (m_{2})$ . Without loss of generality, assume $λ (m_{1}) < λ (m_{2})$ . Expanding $λ (m_{1})$ and $λ (m_{2})$ as in equations (1) and (2) we obtain

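λ_{1}^{g_{1}} + λ_{2}^{g_{2}} + λ_{3}^{g_{3}} + λ_{4}^{g_{1} + g_{2}} + λ_{5}^{g_{1}} < λ_{1}^{g_{1}} + λ_{2}^{g_{2}} + λ_{3}^{g_{3}} + λ_{4}^{g_{2}} + λ_{6}^{g_{1}} .

Cancelling terms we see that

λ_{4}^{g_{1} + g_{2}} - λ_{4}^{g_{2}} < λ_{6}^{g_{1}} - λ_{5}^{g_{1}} = μ_{g_{1}} .

When $g_{1} = 0$ the left-hand side vanishes and the inequality reduces to $0 < μ_{0}$. $□$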

Inequality (3) controls the assignment of the $| G | = ℓ + 1$ columns with label $(0, g_{2}, - g_{2})$, for $g_{2} \in G$. The inequalities given by (4) each control a single column with label $(g_{1}, g_{2}, - g_{1} - g_{2})$, with $g_{1} \neq 0$. Thus, crossing the hyperplane $μ_{0} = 0$ changes exactly $ℓ + 1$ of the columns of $A_{λ}$, while crossing a hyperplane of the form $μ_{g_{1}} = λ_{4}^{g_{1} + g_{2}} - λ_{4}^{g_{2}}$ changes exactly one of the columns of $A_{λ}$. Observe that we cannot simply choose which inequalities are satisfied and expect to find a weight vector $λ$ that achieves this. That is, some combinations of inequalities cannot be simultaneously satisfied. Thus, for a given G, it is unclear how many chambers lie in the hyperplane arrangement $H$, but since $H$ consists of $ℓ (ℓ + 1) + 1$ distinct hyperplanes, there are at most $2^{ℓ (ℓ + 1) + 1}$. Lemma 3.2 in the next section gives some restrictions on these assignments for groups of odd order, and we will see further examples in Section 4.
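The hyperplane count behind this bound can be checked with a short Python sketch. It assumes a cyclic group $Z / n Z$ with $B = \{id\}$, so that $ℓ = n - 1$; the function names are hypothetical. Each hyperplane $H_{g}$ is represented by the coefficient vector of $λ (m_{1} (g)) - λ (m_{2} (g))$, and duplicates are removed.

```python
from itertools import product


def hyperplane_normal(g1, g2, n):
    """Coefficients of lambda(m1(g)) - lambda(m2(g)) for g = (g1, g2, -g1-g2).

    The contributions of edges e1, e2, e3 cancel, so only the weights on
    e4, e5 and e6 remain. Zero coefficients are dropped.
    """
    normal = {}
    terms = [((4, (g1 + g2) % n), 1), ((5, g1), 1), ((4, g2), -1), ((6, g1), -1)]
    for key, coeff in terms:
        normal[key] = normal.get(key, 0) + coeff
    return frozenset((key, c) for key, c in normal.items() if c != 0)


def distinct_hyperplanes(n):
    """Set of distinct hyperplanes H_g over all consistent leaf-labellings.

    Every normal has coefficient +1 on a weight of edge e5, so two normals
    are proportional only if they are equal; comparing them directly is
    therefore enough to detect repeated hyperplanes.
    """
    return {hyperplane_normal(g1, g2, n) for g1, g2 in product(range(n), repeat=2)}


for n in (2, 3, 5):
    ell = n - 1
    print(n, len(distinct_hyperplanes(n)), ell * (ell + 1) + 1)
```

All labellings with $g_{1} = 0$ collapse to the single hyperplane $μ_{0} = 0$, which is why the count is $ℓ (ℓ + 1) + 1$ rather than $(ℓ + 1)^{2}$.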

Example 2.4

Let $G = Z / 2 Z$ . We have four consistent leaf-labellings given by (0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0). The image of $ϕ_{N}^{G}$ is given by

\begin{matrix} ϕ_{N}^{G} {(w)}_{000} & = w_{1}^{0} w_{2}^{0} w_{3}^{0} w_{4}^{0} w_{5}^{0} + w_{1}^{0} w_{2}^{0} w_{3}^{0} w_{4}^{0} w_{6}^{0}, \\ ϕ_{N}^{G} {(w)}_{011} & = w_{1}^{0} w_{2}^{1} w_{3}^{1} w_{4}^{1} w_{5}^{0} + w_{1}^{0} w_{2}^{1} w_{3}^{1} w_{4}^{1} w_{6}^{0}, \\ ϕ_{N}^{G} {(w)}_{101} & = w_{1}^{1} w_{2}^{0} w_{3}^{1} w_{4}^{1} w_{5}^{1} + w_{1}^{1} w_{2}^{0} w_{3}^{1} w_{4}^{0} w_{6}^{1}, \\ ϕ_{N}^{G} {(w)}_{110} & = w_{1}^{1} w_{2}^{1} w_{3}^{0} w_{4}^{0} w_{5}^{1} + w_{1}^{1} w_{2}^{1} w_{3}^{0} w_{4}^{1} w_{6}^{1}, \end{matrix}

where in each case the first monomial, $m_{1}$, corresponds to $T_{1}$, and the second monomial, $m_{2}$, corresponds to $T_{2}$. Pick a weight vector $λ$ with $λ_{4}^{0} = 3, λ_{4}^{1} = 1, μ_{0} = 1$, and $μ_{1} = - 1$. We construct the matrix $A_{λ}$. The columns of $A_{λ}$ are indexed by the 4 consistent leaf-labellings, and the rows are indexed by the 12 parameters $w_{i}^{g}$ for $i = 1, \dots, 6$ and $g = 0, 1$. For columns (0, 0, 0) and (0, 1, 1) we have that $0 < μ_{0} = 1$, so in these cases the monomial from $T_{1}$ is chosen. For (1, 0, 1) we have that $λ_{4}^{1} - λ_{4}^{0} = - 2 < μ_{1} = - 1$, so the monomial from $T_{1}$ is chosen. For (1, 1, 0) we have that $λ_{4}^{0} - λ_{4}^{1} = 2 > μ_{1} = - 1$, so the monomial from $T_{2}$ is chosen. This gives us the following matrix $A_{λ}$:

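A_{λ} = \begin{array}{c|cccc} & 000 & 011 & 101 & 110 \\ \hline w_{1}^{0} & 1 & 1 & 0 & 0 \\ w_{1}^{1} & 0 & 0 & 1 & 1 \\ w_{2}^{0} & 1 & 0 & 1 & 0 \\ w_{2}^{1} & 0 & 1 & 0 & 1 \\ w_{3}^{0} & 1 & 0 & 0 & 1 \\ w_{3}^{1} & 0 & 1 & 1 & 0 \\ w_{4}^{0} & 1 & 0 & 0 & 0 \\ w_{4}^{1} & 0 & 1 & 1 & 1 \\ w_{5}^{0} & 1 & 1 & 0 & 0 \\ w_{5}^{1} & 0 & 0 & 1 & 0 \\ w_{6}^{0} & 0 & 0 & 0 & 0 \\ w_{6}^{1} & 0 & 0 & 0 & 1 \end{array}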

where the entry corresponding to column $g_{1} g_{2} g_{3}$ and row $w_{i}^{g}$ is the exponent of $w_{i}^{g}$ in the monomial from the $g_{1} g_{2} g_{3}$ entry of the expression for $ϕ_{N}^{G} (w)$ corresponding to the tree chosen.
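The matrix of Example 2.4 can be recomputed with the Python sketch below, which also reports its rank. The example only fixes $λ_{4}^{0}$, $λ_{4}^{1}$, $μ_{0}$ and $μ_{1}$; setting all remaining weights to 0 (so that $λ_{6}^{g} = μ_{g}$) is an assumption of this sketch, as are the helper names. The rank is computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction
from itertools import product


def exponents(g, n):
    """Parameters (edge, element) appearing in m1(g) and in m2(g)."""
    g1, g2, g3 = g
    shared = {(1, g1), (2, g2), (3, g3)}
    return shared | {(4, (g1 + g2) % n), (5, g1)}, shared | {(4, g2), (6, g1)}


def tropical_jacobian(lam, n):
    """Rows: parameters w_i^h; columns: consistent leaf-labellings."""
    rows = [(i, h) for i in range(1, 7) for h in range(n)]
    cols = [(g1, g2, (-g1 - g2) % n) for g1, g2 in product(range(n), repeat=2)]
    assignment, chosen = {}, []
    for g in cols:
        m1, m2 = exponents(g, n)
        w1, w2 = sum(lam[k] for k in m1), sum(lam[k] for k in m2)
        if w1 == w2:
            raise ValueError(f"lambda lies on the hyperplane H_g for g = {g}")
        assignment[g] = 1 if w1 < w2 else 2
        chosen.append(m1 if w1 < w2 else m2)
    matrix = [[int(key in col) for col in chosen] for key in rows]
    return rows, cols, matrix, assignment


def rank(matrix):
    """Rank over the rationals by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


n = 2
lam = {(i, h): 0 for i in range(1, 7) for h in range(n)}
lam[(4, 0)], lam[(4, 1)] = 3, 1   # lambda_4^0 = 3, lambda_4^1 = 1
lam[(6, 0)], lam[(6, 1)] = 1, -1  # with lambda_5 = 0: mu_0 = 1, mu_1 = -1
rows, cols, A, assignment = tropical_jacobian(lam, n)
print(assignment)  # only (1, 1, 0) is assigned to T2
print(rank(A))     # 4
```

For this weight vector the four columns are linearly independent, so the chamber containing $λ$ has rank 4.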

To end this section, we make some observations on the hyperplane arrangement $H$ coming from the 3-sunlet and a group-based model (G, B). We define the rank of a chamber $C \in C (H)$ as the rank of the corresponding tropical Jacobian matrix $A_{λ}$ for any $λ \in C$.

First, observe that when we choose $λ$ so that all columns correspond to $T_{1}$ or $T_{2}$ (this is possible by choosing $μ_{g}$ either very large or very small for all g), then $A_{λ}$ is equal to the corresponding tropical Jacobian for the group-based phylogenetic tree model for $T_{1}$ or $T_{2}$ (albeit with some extra rows of 0’s corresponding to the parameters $w_{6}^{g}$ or $w_{5}^{g}$ respectively). Since the matrix $A_{λ}$ has rank equal to the dimension of the toric variety corresponding to $A_{λ}$ (see e.g., (Sturmfels 1996, Lemma 4.2)), and these varieties are well studied for trees, in both cases, $rank A_{λ} = 3 ℓ + 1$ (see e.g. (Gross et al. 2024, Lemma 4.1)). This describes two chambers with rank equal to $3 ℓ + 1$. In fact, there are two more, as the next proposition demonstrates.

Proposition 2.5

Let C and $C^{'}$ be two adjacent chambers in $C (H)$ , separated by the hyperplane $μ_{0} = 0$ . Then $rank C = rank C^{'}$ .

Proof

Let $λ \in C$ and $λ^{'} \in C^{'}$ be weight vectors, and consider the matrices $A_{λ}$ and $A_{λ^{'}}$ . Suppose, without loss of generality, that in $A_{λ}$ the columns indexed by $(0, g, - g)$ with $g \in G$ correspond to $T_{1}$ , and in $A_{λ^{'}}$ they correspond to $T_{2}$ . Thus, in the columns of $A_{λ}$ indexed by $(0, g, - g)$ , entries in the row indexed by $w_{5}^{0}$ are 1, and entries in the row indexed by $w_{6}^{0}$ are 0. On the other hand, in the columns of $A_{λ^{'}}$ indexed by $(0, g, - g)$ , entries in the row indexed by $w_{5}^{0}$ are 0, and entries in the row indexed by $w_{6}^{0}$ are 1. All other entries of $A_{λ}$ and $A_{λ^{'}}$ are the same. Finally, observe that in the rows indexed by $w_{5}^{0}$ and $w_{6}^{0}$ , all entries are 0 outside the columns indexed by $(0, g, - g)$ for $g \in G$ . Thus, the difference between $A_{λ}$ and $A_{λ^{'}}$ is a swap of the rows indexed by $w_{5}^{0}$ and $w_{6}^{0}$ , and this does not affect the rank. $□$

Applying Proposition 2.5 to the two chambers of rank $3 ℓ + 1$ found above, we have four chambers of this rank. These come from column assignments where all columns corresponding to a leaf-labelling $(0, g, - g)$ are assigned to $T_{i}$ , and all other columns assigned to $T_{j}$ for $i, j \in {1, 2}$ . It is easy to see that all other chambers have rank strictly greater than this (changing any column corresponding to $(g_{1}, g_{2}, g_{3})$ with $g_{1} \neq 0$ will introduce a 1 into the row $w_{5}^{g_{1}}$ or $w_{6}^{g_{1}}$ which was previously all 0’s), thus we have exactly four chambers of minimal rank $3 ℓ + 1$ .

Lemma 2.6

Let $H$ be the hyperplane arrangement given by the 3-sunlet network and group-based model (G, B). Then $H$ has exactly four chambers of minimal rank $3 ℓ + 1$ . $□$
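The four minimal chambers can be checked numerically in a small case. The following Python sketch builds $A_{λ}$ for a cyclic group $Z / n Z$ directly from a column assignment. It assumes, following the exponent patterns used in Section 3, that a column assigned to $T_{1}$ has a 1 in the rows $w_{1}^{g_{1}}, w_{2}^{g_{2}}, w_{3}^{g_{3}}, w_{4}^{g_{1} + g_{2}}, w_{5}^{g_{1}}$ , and that a column assigned to $T_{2}$ has a 1 in the rows $w_{1}^{g_{1}}, w_{2}^{g_{2}}, w_{3}^{g_{3}}, w_{4}^{g_{2}}, w_{6}^{g_{1}}$ . The function names are illustrative and are not taken from the authors' code.

```python
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank over the rationals by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def tropical_jacobian(n, tree_of):
    """A_lambda for G = Z/nZ; tree_of(g1, g2) returns 1 or 2 (assumed convention)."""
    def index(edge, g):
        return (edge - 1) * n + g % n

    columns = []
    for g1, g2 in product(range(n), repeat=2):
        col = [0] * (6 * n)
        for edge, g in ((1, g1), (2, g2), (3, -g1 - g2)):
            col[index(edge, g)] = 1
        if tree_of(g1, g2) == 1:
            col[index(4, g1 + g2)] = 1
            col[index(5, g1)] = 1
        else:
            col[index(4, g2)] = 1
            col[index(6, g1)] = 1
        columns.append(col)
    # Transpose so that rows are parameters and columns are leaf-labellings.
    return [list(row) for row in zip(*columns)]


def minimal_chamber_ranks(n):
    assignments = {
        "all T1": lambda g1, g2: 1,
        "all T2": lambda g1, g2: 2,
        "(0, g, -g) to T1, rest to T2": lambda g1, g2: 1 if g1 == 0 else 2,
        "(0, g, -g) to T2, rest to T1": lambda g1, g2: 2 if g1 == 0 else 1,
    }
    return {name: matrix_rank(tropical_jacobian(n, f)) for name, f in assignments.items()}


for name, rank in minimal_chamber_ranks(3).items():
    print(name, rank)
```

Under these assumptions each of the four assignments gives rank $3 ℓ + 1 = 7$ for $Z / 3 Z$ , since in every case the row space coincides with that of the 3-star tree.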

Proposition 2.7

Let $H$ be the hyperplane arrangement given by the 3-sunlet network and group-based model (G, B). Given a chamber $C \in C (H)$ of rank r, the rank of each adjacent chamber is between $r - 1$ and $r + 1$ .

Proof

As described above, moving to an adjacent chamber is equivalent to either swapping the assignment of all columns indexed by $(0, g, - g)$ for $g \in G$ , or swapping the assignment of a single column indexed by $(g, h, - g - h)$ with $g \neq 0$ . In the first case, the rank does not change by Proposition 2.5. In the second case, the rank can change by at most 1. $□$

Proposition 2.5 above shows that every chamber is adjacent to at least one other chamber of equal rank. We would also like to know whether every chamber is adjacent to another chamber of strictly greater rank, or equivalently, whether the only maximal chambers (with respect to rank) are the globally maximal chambers. As we will see in Section 4.3, the answer to this is no, and there do exist locally maximal chambers.

Dimension for Odd Order Groups

In this section we give the dimension of the 3-sunlet variety for general group-based models where G is a finite abelian group of odd order at least 5. Our results agree with (Gross et al. 2024, Conjecture 7.1). Our method is to find a weight vector $λ$ so that the corresponding tropical Jacobian $A_{λ}$ has rank greater than or equal to $5 ℓ + 1$ , where $| G | = ℓ + 1$ , as detailed in Section 2.

First we set out some notation. Assume |G| is odd and let $| G | = 2 t + 1$ . Choose a subset $X \subset G$ with $| X | = t$ and satisfying the property that if $g \in X$ then $- g \notin X$ . In particular, $0 \notin X$ . For example, for G equal to the cyclic group of order $2 t + 1$ we could have $X = {1, 2, \dots, t}$ . We will need the following lemma.

Lemma 3.1

Let G be an abelian group with $| G | = 2 t + 1$ , where $t \in N$ and $t \geq 3$ , i.e. |G| is odd with $| G | \geq 7$ . Let $X \subset G$ be a subset with $| X | = t$ such that if $g \in X$ then $- g \notin X$ . Then there exists an injective function $h : X \to G \ X$ such that $h_{g} : = h (g)$ is not equal to $0, - g,$ or 2g.

Proof

Let $X = {g_{1}, \dots, g_{t}}$ and let $k_{i} = - g_{i} \notin X$ . We will give an iterative method of choosing the value of $h_{g_{i}}$ , where at each step we update the set of available choices. To begin with, let $K = {k_{1}, \dots, k_{t}}$ ; this will be our initial set of available choices. For each $i = 1, \dots, t - 3$ , choose the value of $h_{g_{i}}$ to be an element of K satisfying $h_{g_{i}} \neq k_{i}$ and $h_{g_{i}} \neq 2 g_{i}$ (should $2 g_{i}$ be in K), and remove the value of $h_{g_{i}}$ from K. This is possible, since at each stage, K contains at least 3 elements.

Now it remains to choose $h_{g_{t - 2}}, h_{g_{t - 1}}$ and $h_{g_{t}}$ . In the worst case, we have $K = {k_{t - 2}, k_{t - 1}, k_{t}} = {2 g_{t - 2}, 2 g_{t - 1}, 2 g_{t}} .$ For each $g_{i}$ we take $h_{g_{i}}$ to be the (possibly unique) $k_{j}$ such that $i \neq j$ and $k_{j} \neq 2 g_{i}$ . $□$
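For small cyclic groups the function of Lemma 3.1 can also be found by exhaustive search. The sketch below is a plain backtracking search, not the iterative procedure of the proof, and it assumes $G = Z / n Z$ with $X = {1, \dots, t}$ ; the name `find_injection` is hypothetical.

```python
def find_injection(n):
    """Search for an injective h: X -> G \\ X with h(g) not in {0, -g, 2g}.

    Assumes G = Z/nZ with n = 2t + 1 odd and X = {1, ..., t}.
    Returns a dict g -> h(g), or None if no such function exists.
    """
    t = (n - 1) // 2
    X = list(range(1, t + 1))
    # Allowed targets: elements outside X, excluding 0.
    candidates = [k for k in range(n) if k not in X and k != 0]

    def allowed(g, k):
        return k != (-g) % n and k != (2 * g) % n

    def extend(i, used, h):
        if i == len(X):
            return dict(h)
        g = X[i]
        for k in candidates:
            if k not in used and allowed(g, k):
                h[g] = k
                result = extend(i + 1, used | {k}, h)
                if result is not None:
                    return result
                del h[g]
        return None

    return extend(0, frozenset(), {})


print(find_injection(7))
print(find_injection(5))
```

For $Z / 7 Z$ the search returns $h_{1} = 4, h_{2} = 6, h_{3} = 5$ , while for $Z / 5 Z$ (where $t = 2$ ) it finds no valid function, which is consistent with the hypothesis $t \geq 3$ in the lemma.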

The next lemma demonstrates that there are relationships among the inequalities in Lemma 2.3.

Lemma 3.2

Fix $g \in G$ such that $g \neq - g$ , and suppose that we have a weight vector $λ$ such that for some $h \in G$ the consistent leaf labelling $(g, h, - g - h)$ is assigned to $T_{1}$ and $(g, h^{'}, - g - h^{'})$ is assigned to $T_{2}$ for all $h^{'} \neq h$ . If there exists $k \in G$ such that $(- g, k, g - k)$ is assigned to $T_{2}$ then $(- g, g + h, - h)$ is assigned to $T_{2}$ .

Proof

By assumption, since $λ$ assigns $(g, h, - g - h)$ to $T_{1}$ and $(g, h^{'}, - g - h^{'})$ to $T_{2}$ , we have $λ_{4}^{g + h} - λ_{4}^{h} < μ_{g}$ and $λ_{4}^{g + h^{'}} - λ_{4}^{h^{'}} > μ_{g}$ for all $h^{'} \neq h$ (see proof of Lemma 2.3). Thus $λ_{4}^{g + h} - λ_{4}^{h} < λ_{4}^{g + h^{'}} - λ_{4}^{h^{'}}$ for all $h^{'} \neq h$ . Now if $k = g + h$ then the result is tautological, so assume $k \neq g + h$ . Let $h^{'} = k - g$ so that $(- g, k, g - k) = (- g, g + h^{'}, - h^{'})$ . Now if $(- g, g + h^{'}, - h^{'})$ is assigned to $T_{2}$ then $λ_{4}^{h^{'}} - λ_{4}^{g + h^{'}} > μ_{- g}$ . Since $λ_{4}^{h^{'}} - λ_{4}^{g + h^{'}} = - (λ_{4}^{g + h^{'}} - λ_{4}^{h^{'}})$ and $- (λ_{4}^{g + h^{'}} - λ_{4}^{h^{'}}) < - (λ_{4}^{g + h} - λ_{4}^{h})$ , we have $- (λ_{4}^{g + h} - λ_{4}^{h}) = λ_{4}^{h} - λ_{4}^{g + h} > μ_{- g}$ . Thus, $λ (m_{1} ((- g, g + h, - h))) > λ (m_{2} ((- g, g + h, - h)))$ , and consequently, $(- g, g + h, - h)$ is assigned to $T_{2}$ . $□$

Next we describe a procedure for choosing $λ$ so that the matrix $A_{λ}$ has rank equal to the expected dimension. This procedure will be illustrated below in Example 3.3. For ease of notation, we will write $ν$ for $λ_{4} \in R^{2 t + 1}$ , so that $ν_{g} = λ_{4}^{g}$ for all $g \in G$ . In this notation, for a leaf labeling $g = (g_{1}, g_{2}, - g_{1} - g_{2})$ ,

λ (m_{1}) < λ (m_{2}) ⟺ ν_{g_{1} + g_{2}} - ν_{g_{2}} < μ_{g_{1}} .

This means $g = (g_{1}, g_{2}, - g_{1} - g_{2})$ is assigned to $T_{1}$ if and only if $ν_{g_{1} + g_{2}} - ν_{g_{2}} < μ_{g_{1}},$ and $g$ is assigned to $T_{2}$ if and only if $ν_{g_{1} + g_{2}} - ν_{g_{2}} > μ_{g_{1}} .$

Choose $ν$ such that $ν_{g} \geq 0$ for all $g \in G$ , with $ν_{0}$ large enough and all $ν_{g}$ small enough such that for all $h, h^{'}, k^{'} \in G \ {0}$ we have $ν_{0} - ν_{h} > ν_{h^{'}} - ν_{k^{'}} > ν_{h} - ν_{0}$ . Fix $g \in X$ . By our choice of $ν$ , we can find $μ_{g}$ such that $ν_{0} - ν_{- g} > μ_{g}$ and $μ_{g} > ν_{g + h} - ν_{h}$ for all $h \in G$ with $h \neq - g$ . Then for the consistent leaf-labelling $(g, - g, 0)$ we have $ν_{0} - ν_{- g} > μ_{g}$ , so $(g, - g, 0)$ is assigned to $T_{2}$ . For all $h \neq - g$ we have $ν_{g + h} - ν_{h} < μ_{g}$ so the consistent leaf-labelling $(g, h, - g - h)$ is assigned to $T_{1}$ .

Next, consider $- g \notin X$ . In this case, for all $h \in G$ the consistent leaf-labelling $g^{'} = (- g, h, g - h)$ is assigned to $T_{1}$ if and only if

ν_{- g + h} - ν_{h} < μ_{- g},

and $g^{'}$ is assigned to $T_{2}$ if and only if

ν_{- g + h} - ν_{h} > μ_{- g} .

Choose $μ_{- g} = - μ_{g}$ . Then for $(- g, 0, g)$ we have

ν_{- g} - ν_{0} = - (ν_{0} - ν_{- g}) < - μ_{g} = μ_{- g},

and thus $(- g, 0, g)$ is assigned to $T_{1}$ . Now, consider the inequality $μ_{g} > ν_{g + h} - ν_{h}$ , which holds for all $h \neq - g$ . Write $h = - g + k$ with $k \neq 0$ . Then we have $μ_{g} > ν_{k} - ν_{- g + k}$ and therefore $ν_{- g + k} - ν_{k} > - μ_{g} = μ_{- g}$ . Thus $(- g, k, g - k)$ is assigned to $T_{2}$ for all $k \neq 0$ .

To summarize, at this point, we have found $ν, μ_{g}$ and $μ_{- g}$ to give us the following assignments of the consistent leaf-labellings $(g, h, - g - h)$ and $(- g, h, g - h)$ for all $h \in G$ :

To $T_{1}$ we have assigned $(- g, 0, g)$ and $(g, h, - g - h)$ for all $h \neq - g$ .
To $T_{2}$ we have assigned $(g, - g, 0)$ and $(- g, h, g - h)$ for all $h \neq 0$ .

We repeat the above procedure for every $g \in X$ . Then for each of the t pairs ${g, - g} \subset G \ {0}$ we have $ℓ + 1$ consistent leaf-labellings assigned to $T_{1}$ and $ℓ + 1$ consistent leaf-labellings assigned to $T_{2}$ . Now, for all $h \in G$ assign $(0, h, - h)$ to $T_{2}$ so that we have $t (ℓ + 1) + (ℓ + 1) = (t + 1) (2 t + 1)$ consistent leaf-labellings assigned to $T_{2}$ and $t (2 t + 1)$ consistent leaf-labellings assigned to $T_{1}$ . Note that for $t \geq 2$ we have $2 ℓ = 4 t < t (2 t + 1) < 2 t (2 t - 1) = ℓ (ℓ - 1)$ .
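The procedure above can be sketched in Python for a cyclic group. The sketch assumes $G = Z / n Z$ with $X = {1, \dots, t}$ , and makes one illustrative choice of weights: $ν_{0} = 10 n$ , $ν_{g} = g$ for $g \neq 0$ , $μ_{g}$ the midpoint of its admissible interval for $g \in X$ , and $μ_{0} = - 1$ . None of these specific values come from the text; any values satisfying the stated inequalities would serve.

```python
def construct_assignment(n):
    """Assign each consistent leaf-labelling of Z/nZ (n odd) to T1 or T2."""
    t = (n - 1) // 2
    # nu_0 large, all other nu_g small and non-negative (illustrative values).
    nu = [10 * n] + list(range(1, n))
    mu = [0.0] * n
    mu[0] = -1.0  # negative, so every (0, h, -h) goes to T2
    for g in range(1, t + 1):
        upper = nu[0] - nu[(-g) % n]
        lower = max(nu[(g + h) % n] - nu[h] for h in range(n) if h != (-g) % n)
        mu[g] = (upper + lower) / 2  # strictly between lower and upper
        mu[(-g) % n] = -mu[g]
    assignment = {}
    for g1 in range(n):
        for g2 in range(n):
            diff = nu[(g1 + g2) % n] - nu[g2]
            label = (g1, g2, (-g1 - g2) % n)
            assignment[label] = 1 if diff < mu[g1] else 2
    return assignment


def expected_t1(n):
    """The set S = S1 union S2 of labellings assigned to T1 in the text."""
    t = (n - 1) // 2
    s1 = {(g, h, (-g - h) % n) for g in range(1, t + 1) for h in range(n) if h != (-g) % n}
    s2 = {((-g) % n, 0, g) for g in range(1, t + 1)}
    return s1 | s2


for n in (3, 5, 7):
    a = construct_assignment(n)
    t1 = {label for label, tree in a.items() if tree == 1}
    print(n, len(t1), len(a) - len(t1), t1 == expected_t1(n))
```

With these weights the number of labellings assigned to $T_{1}$ is $t (2 t + 1)$ and the number assigned to $T_{2}$ is $(t + 1) (2 t + 1)$ , matching the counts above.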

We give a small example to illustrate.

Example 3.3

Let $G = Z / 3 Z = {0, 1, 2}$ , and let $X = {1}$ . Pick $ν_{0} = 10$ , $ν_{1} = 0$ , and $ν_{2} = 2$ . When $g = 1$ , for the consistent leaf-labellings (1, 2, 0), (1, 1, 1), and (1, 0, 2) respectively we have

ν_{0} - ν_{2} = 8 > ν_{2} - ν_{1} = 2 > ν_{1} - ν_{0} = - 10 .

We choose $μ_{1} = 7$ so that (1, 2, 0) is assigned to $T_{2}$ ; and (1, 1, 1) and (1, 0, 2) are assigned to $T_{1}$ .

When $g = 2$ , for the consistent leaf-labellings (2, 1, 0), (2, 2, 2) and (2, 0, 1) respectively we have

ν_{0} - ν_{1} = 10 > ν_{1} - ν_{2} = - 2 > ν_{2} - ν_{0} = - 8 .

Setting $μ_{2} = - μ_{1} = - 7$ gives us (2, 1, 0) and (2, 2, 2) assigned to $T_{2}$ and (2, 0, 1) assigned to $T_{1}$ .

Finally, we choose $μ_{0} = - 1$ so that (0, 0, 0), (0, 1, 2), and (0, 2, 1) are assigned to $T_{2}$ .

Returning to an arbitrary finite abelian group of odd order, we choose $ν$ and $μ$ as above (with $μ_{0}$ negative) so that we have the assignments of leaf-labellings as described above. We will show that for this choice of $λ$ we have $rank A_{λ} \geq 5 ℓ + 1$ . Our strategy is to perform row and column operations on $A_{λ}$ in order to turn it into a block upper triangular matrix without changing the rank.

We will denote the row of $A_{λ}$ corresponding to the parameter $w_{i}^{g}$ as $r_{i}^{g}$ . Our first observation is that for any assignment of consistent leaf-labellings to $T_{1}$ and $T_{2}$ we have $r_{1}^{g} = r_{5}^{g} + r_{6}^{g}$ for all $g \in G$ . The rows $r_{1}^{g}$ therefore do not contribute to the rank of $A_{λ}$ , so we remove them. Note that this corresponds to the notion of a contracted semi-directed network, which we do not introduce here (see Gross et al. (2024) for further details). Next, perform column swaps so that all columns assigned to $T_{2}$ are to the left of the columns assigned to $T_{1}$ . Observe that for all columns assigned to $T_{2}$ , the entry corresponding to $w_{2}^{g}$ is equal to the entry corresponding to $w_{4}^{g}$ for all $g \in G$ (where edge labels are as in Figure 2), and the entry corresponding to $w_{5}^{g}$ is 0 for all $g \in G$ . In terms of the tree topology, this is because in $T_{2}$ the vertex between $e_{2}$ and $e_{4}$ has degree 2, and because $T_{2}$ does not contain the edge $e_{5}$ . Perform row swaps so that the corresponding edge order is $e_{2}, e_{3}, e_{6}, e_{4}, e_{5}$ from top to bottom. Next perform the row operations

r_{4}^{g} \to r_{4}^{g} - r_{2}^{g}

for all $g \in G$ . This gives a block upper-triangular matrix of the form

\begin{matrix} A_{λ} = (\begin{matrix} A & * \\ 0 & B \end{matrix}) . \qquad (5) \end{matrix}

It follows that $rank A_{λ} \geq rank A + rank B$ . The submatrix A consists of all columns assigned to $T_{2}$ and rows corresponding to $w_{2}^{g}, w_{3}^{g},$ and $w_{6}^{g}$ for all $g \in G$ . These are all the variables associated to edges in $T_{2}$ that form a 3-star tree, and so we know $rank A \leq dim T_{2} = 3 ℓ + 1$ . In Lemma 3.5, we show that $rank A = dim T_{2} = 3 ℓ + 1$ . The submatrix B consists of all columns assigned to $T_{1}$ and rows $r_{4}^{g} - r_{2}^{g}$ and $r_{5}^{g}$ for all $g \in G$ . The columns of B are given by the consistent leaf-labellings assigned to $T_{1}$ which are

S = S_{1} \cup S_{2} = {(g, h, - g - h) | g \in X, h \neq - g} \cup {(- g, 0, g) | g \in X},

so $| S_{1} | = 2 t (t) = \frac{ℓ}{2} ℓ$ and $| S_{2} | = t = \frac{ℓ}{2}$ . In the next example, we illustrate the block triangular matrix in (5) for $G = Z / 3 Z$ .

Example 3.4

Here we continue Example 3.3. With $G = Z / 3 Z$ and assignments of consistent leaf-labellings as in the example, after all row operations we have the following matrix.

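[Matrix display not reproduced: the block upper triangular form (5) of $A_{λ}$ for $G = Z / 3 Z$ , with rows ordered $e_{2}, e_{3}, e_{6}, e_{4}, e_{5}$ and the columns assigned to $T_{2}$ placed to the left of those assigned to $T_{1}$ .]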

Note that in this case the dimension of the space containing the variety is ${(ℓ + 1)}^{2} = 9$ , which is less than the expected dimension of $5 ℓ + 1 = 11$ . Therefore the expected dimension cannot be reached, and indeed, explicit computation shows that the variety has dimension 9. Here we can see explicitly that the submatrices A and B have rank strictly less than the values given in Lemmas 3.5 and 3.7.
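The ranks in this example can be recomputed with a short Python sketch. It uses the weights of Example 3.3 ( $ν = (10, 0, 2)$ and $μ = (- 1, 7, - 7)$ ) and assumes the exponent patterns described in this section: a column assigned to $T_{1}$ has entries in the rows $w_{1}^{g_{1}}, w_{2}^{g_{2}}, w_{3}^{g_{3}}, w_{4}^{g_{1} + g_{2}}, w_{5}^{g_{1}}$ , and a column assigned to $T_{2}$ has entries in the rows $w_{1}^{g_{1}}, w_{2}^{g_{2}}, w_{3}^{g_{3}}, w_{4}^{g_{2}}, w_{6}^{g_{1}}$ . The row and column order need not match the displayed matrix, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

N = 3
NU = [10, 0, 2]   # nu_0, nu_1, nu_2 from Example 3.3
MU = [-1, 7, -7]  # mu_0, mu_1, mu_2 from Example 3.3


def matrix_rank(rows):
    """Rank over the rationals by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def assigned_tree(g1, g2):
    """T1 if nu_{g1+g2} - nu_{g2} < mu_{g1}, otherwise T2."""
    return 1 if NU[(g1 + g2) % N] - NU[g2] < MU[g1] else 2


def column(g1, g2):
    """Sparse column of A_lambda keyed by (edge, group element)."""
    col = {(1, g1): 1, (2, g2): 1, (3, (-g1 - g2) % N): 1}
    if assigned_tree(g1, g2) == 1:
        col[(4, (g1 + g2) % N)] = 1
        col[(5, g1)] = 1
    else:
        col[(4, g2)] = 1
        col[(6, g1)] = 1
    return col


labels = list(product(range(N), repeat=2))
cols = {lab: column(*lab) for lab in labels}
t1 = [lab for lab in labels if assigned_tree(*lab) == 1]
t2 = [lab for lab in labels if assigned_tree(*lab) == 2]

full = [[cols[lab].get((i, g), 0) for lab in labels] for i in range(1, 7) for g in range(N)]
# Block A: rows w_2, w_3, w_6 restricted to the T2 columns.
block_a = [[cols[lab].get((i, g), 0) for lab in t2] for i in (2, 3, 6) for g in range(N)]
# Block B: rows r_4 - r_2 and r_5 restricted to the T1 columns.
block_b = [[cols[lab].get((4, g), 0) - cols[lab].get((2, g), 0) for lab in t1] for g in range(N)]
block_b += [[cols[lab].get((5, g), 0) for lab in t1] for g in range(N)]

rank_full, rank_a, rank_b = matrix_rank(full), matrix_rank(block_a), matrix_rank(block_b)
print(rank_full, rank_a, rank_b)
```

Under these assumptions the sketch gives $rank A = 6$ and $rank B = 3$ , both below the values $3 ℓ + 1 = 7$ and $2 ℓ = 4$ that hold for larger groups, and $rank A_{λ} = 9$ .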

In the next two lemmas, we give the rank of the submatrix A and a lower bound on the rank of the submatrix B.

Lemma 3.5

Let G be an abelian group with $| G | = ℓ + 1$ odd and $| G | \geq 5$ , and $A_{λ}$ as above in (5). Then $rank A = 3 ℓ + 1$ .

Proof

The submatrix A consists of columns corresponding to consistent leaf-labellings assigned to $T_{2}$ , and rows corresponding to the variables $w_{2}^{g}, w_{3}^{g}$ , and $w_{6}^{g}$ for all $g \in G$ . This submatrix also appears in the corresponding matrix of exponents for the 3-star tree with edges $e_{2}, e_{3}$ , and $e_{6}$ as in Figure 3 (possibly after performing some column swaps).

Fig. 3 — A 3-star tree with taxa labels and edge labels

Denote by $K_{3}$ the matrix of exponents corresponding to the general group-based model of G on the 3-star tree. The dimension of the variety corresponding to this model is $3 ℓ + 1$ (Gross et al. 2024, Lemma 4.1). Since the parameterization of this model is monomial, $K_{3}$ has rank equal to the dimension of the model.

Let $K_{(g_{1}, g_{2}, g_{3})}$ be the column of $K_{3}$ corresponding to the consistent leaf-labelling $(g_{1}, g_{2}, g_{3})$ , so that we have

K_{(g_{1}, g_{2}, g_{3})} = E_{6}^{g_{1}} + E_{2}^{g_{2}} + E_{3}^{g_{3}} .

We will show that the columns of $K_{3}$ corresponding to consistent leaf-labellings that we have assigned to $T_{1}$ can be written as linear combinations of columns corresponding to consistent leaf-labellings we have assigned to $T_{2}$ , thereby showing that $rank A = rank K_{3} = 3 ℓ + 1$ .

First, consider the consistent leaf-labelling $(- g, 0, g)$ . The reader can check that for any $h \in X$ with $h \neq g$ we have

K_{(- g, 0, g)} = K_{(0, 0, 0)} + K_{(- g, h, g - h)} + K_{(- h, h - g, g)} - K_{(- h, h, 0)} - K_{(0, h - g, g - h)}, \qquad (6)

and all terms on the right hand side are from consistent leaf-labellings that are assigned to $T_{2}$ . Note that since $| G | \geq 5$ , at least one such h exists.

Next consider the consistent leaf-labelling $(g, h, - g - h)$ for $g \in X$ and $h \neq - g$ . We consider two cases separately:

Case 1: $h \notin X$ . For this case we may write

\begin{matrix} K_{(g, h, - g - h)} = K_{(g, - g, 0)} + K_{(- g, h, g - h)} + K_{(h, g, - g - h)} - K_{(- g, g, 0)} - K_{(h, - g, g - h)} . \end{matrix}

Case 2: $h \in X$ . We break this down into two further cases. First, suppose $- g - h \in X$ . Then we have

\begin{matrix} K_{(g, h, - g - h)} = K_{(g, - g, 0)} + K_{(0, h, - h)} + K_{(g + h, 0, - g - h)} - K_{(0, 0, 0)} - K_{(g + h, - g, - h)}, \qquad (7) \end{matrix}

where we are using that although $(g + h, 0, - g - h)$ is assigned to $T_{1}$ , the column $K_{(g + h, 0, - g - h)}$ is linearly dependent on columns corresponding to leaf-labellings assigned to $T_{2}$ by the first part of the proof. Therefore we can substitute the relation from (6) into (7) to obtain $K_{(g, h, - g - h)}$ as a linear combination of columns assigned to $T_{2}$ . On the other hand, if $- g - h \notin X$ then

K_{(g, h, - g - h)} = K_{(g, - g, 0)} + K_{(- g - h, h, g)} + K_{(0, g + h, - g - h)} - K_{(0, - g, g)} - K_{(- g - h, g + h, 0)} .

$□$
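The four relations used in this proof can be verified mechanically. The following Python sketch checks them for $G = Z / 7 Z$ (an illustrative choice), representing each column $K_{(g_{1}, g_{2}, g_{3})} = E_{6}^{g_{1}} + E_{2}^{g_{2}} + E_{3}^{g_{3}}$ as a sparse vector. It checks only the vector identities, not which tree each labelling is assigned to.

```python
from collections import Counter

N = 7  # G = Z/7Z, an illustrative choice


def K(g1, g2, g3):
    """Column E_6^{g1} + E_2^{g2} + E_3^{g3}, keyed by (edge, group element)."""
    assert (g1 + g2 + g3) % N == 0, "leaf-labelling must be consistent"
    return Counter({(6, g1 % N): 1, (2, g2 % N): 1, (3, g3 % N): 1})


def combination(plus, minus):
    """Sum of the columns in `plus` minus the sum of the columns in `minus`."""
    total = Counter()
    for col in plus:
        total.update(col)
    for col in minus:
        total.subtract(col)
    return {key: value for key, value in total.items() if value != 0}


def relations_hold(g, h):
    first = combination(
        [K(0, 0, 0), K(-g, h, g - h), K(-h, h - g, g)],
        [K(-h, h, 0), K(0, h - g, g - h)],
    ) == dict(K(-g, 0, g))
    case_1 = combination(
        [K(g, -g, 0), K(-g, h, g - h), K(h, g, -g - h)],
        [K(-g, g, 0), K(h, -g, g - h)],
    ) == dict(K(g, h, -g - h))
    case_2a = combination(
        [K(g, -g, 0), K(0, h, -h), K(g + h, 0, -g - h)],
        [K(0, 0, 0), K(g + h, -g, -h)],
    ) == dict(K(g, h, -g - h))
    case_2b = combination(
        [K(g, -g, 0), K(-g - h, h, g), K(0, g + h, -g - h)],
        [K(0, -g, g), K(-g - h, g + h, 0)],
    ) == dict(K(g, h, -g - h))
    return first and case_1 and case_2a and case_2b


all_ok = all(relations_hold(g, h) for g in range(N) for h in range(N))
print(all_ok)
```

Each identity holds coordinate by coordinate for every pair $g, h$ , so the check does not depend on the choice of X.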

Remark 3.6

Observe that the relations used in the proof above correspond to the cubic binomials in the ideal for the 3-star tree, as described in Sturmfels and Sullivant (2005).

Lemma 3.7

Let G be a finite abelian group with $| G | = ℓ + 1$ odd and $| G | \geq 7$ , and $A_{λ}$ as above in (5). Then $rank B \geq 2 ℓ$ .

Proof

The columns of B correspond to the consistent leaf-labellings we assign to $T_{1}$ , that is, those in the set S, where

S = S_{1} \cup S_{2} = {(g, h, - g - h) | g \in X, h \neq - g} \cup {(- g, 0, g) | g \in X} .

The rows of B correspond to the variables $w_{5}^{g}$ and $w_{4}^{g}$ , where we have performed the row operation $r_{4}^{g} \to r_{4}^{g} - r_{2}^{g}$ , for all $g \in G$ . For the leaf-labelling $(g_{1}, g_{2}, g_{3})$ , we have $m_{1} (g_{1}, g_{2}, g_{3}) = w_{1}^{g_{1}} w_{2}^{g_{2}} w_{3}^{g_{3}} w_{4}^{g_{1} + g_{2}} w_{5}^{g_{1}}$ . Thus, the column of B corresponding to that leaf labelling is the vector in $C^{2 ℓ + 2}$ given by

B_{(g_{1}, g_{2}, g_{3})} = E_{5}^{g_{1}} + E_{4}^{g_{1} + g_{2}} - E_{4}^{g_{2}} .

To prove the result, it is sufficient to find a subset $B \subset S$ of $2 ℓ$ linearly independent columns.

First consider the columns corresponding to consistent leaf-labellings in $S_{2}$ . Here each column is given by

B_{(- g, 0, g)} = E_{5}^{- g} + E_{4}^{- g} - E_{4}^{0},

for $g \in X$ . Since $- g \notin X$ , the column $B_{(- g, 0, g)}$ is the only column of B with a non-zero component in the basis vector $E_{5}^{- g}$ , and is therefore linearly independent of all other columns.

Next we consider columns coming from consistent leaf-labellings in $S_{1}$ . For each $g \in X$ consider the leaf-labellings $(g, 0, - g), (g, g, - 2 g),$ and $(g, h_{g}, - g - h_{g})$ , where for each $g \in X$ the element $h_{g}$ is chosen from $G \ (X \cup {0})$ with the conditions that $h_{g} \neq - g, 2 g$ , and if $g \neq g^{'}$ then $h_{g} \neq h_{g^{'}}$ (this is possible by Lemma 3.1). We claim that the set

B = {B_{(- g, 0, g)} | g \in X} \cup {B_{(g, 0, - g)}, B_{(g, g, - 2 g)}, B_{(g, h_{g}, - g - h_{g})} | g \in X}

is linearly independent. Let

V_{g} = {span}_{C} {B_{(g, 0, - g)}, B_{(g, g, - 2 g)}, B_{(g, h_{g}, - g - h_{g})}} .

By the choice of $h_{g}$ , $dim V_{g} = 3$ for all $g \in X$ . Thus, to prove the claim, it is sufficient to show that $V_{g} \cap V_{g^{'}} = {0}$ for all $g, g^{'} \in X$ with $g \neq g^{'}$ .

First, by considering the basis vectors $E_{5}^{g}$ and $E_{5}^{g^{'}}$ , we must have that $V_{g} \cap V_{g^{'}} \subset W$ where $W = {span}_{C} {E_{4}^{h} | h \in G}$ , so that $V_{g} \cap V_{g^{'}} = (V_{g} \cap W) \cap (V_{g^{'}} \cap W)$ . Now $V_{g} \cap W$ is spanned by the vectors $B_{(g, 0, - g)} - B_{(g, g, - 2 g)}$ and $B_{(g, 0, - g)} - B_{(g, h_{g}, - g - h_{g})}$ , where

\begin{matrix} B_{(g, 0, - g)} - B_{(g, g, - 2 g)} & = 2 E_{4}^{g} - E_{4}^{2 g} - E_{4}^{0}, \\ B_{(g, 0, - g)} - B_{(g, h_{g}, - g - h_{g})} & = E_{4}^{g} - E_{4}^{g + h_{g}} + E_{4}^{h_{g}} - E_{4}^{0} . \end{matrix}

An arbitrary element $v \in V_{g} \cap W$ is then given by

\begin{matrix} v & = α (B_{(g, 0, - g)} - B_{(g, g, - 2 g)}) + β (B_{(g, 0, - g)} - B_{(g, h_{g}, - g - h_{g})}) \\ & = (2 α + β) E_{4}^{g} - α E_{4}^{2 g} + β E_{4}^{h_{g}} - β E_{4}^{g + h_{g}} - (α + β) E_{4}^{0}, \end{matrix}

for $α, β \in C$ . Observe that from the choice of $h_{g}$ , the basis vectors appearing in this expression are distinct. Now suppose that for $g^{'} \in X$ with $g \neq g^{'}$ , we have another such element $v^{'} \in V_{g^{'}} \cap W$ with coefficients $α^{'}$ and $β^{'}$ . We will show that if $v = v^{'}$ then $v = v^{'} = 0$ so that $(V_{g} \cap W) \cap (V_{g^{'}} \cap W) = {0}$ as claimed. We have

\begin{matrix} v & = (2 α + β) E_{4}^{g} - α E_{4}^{2 g} + β E_{4}^{h_{g}} - β E_{4}^{g + h_{g}} - (α + β) E_{4}^{0}, \end{matrix}

\begin{matrix} v^{'} & = (2 α^{'} + β^{'}) E_{4}^{g^{'}} - α^{'} E_{4}^{2 g^{'}} + β^{'} E_{4}^{h_{g^{'}}} - β^{'} E_{4}^{g^{'} + h_{g^{'}}} - (α^{'} + β^{'}) E_{4}^{0} . \end{matrix}

Now suppose $v = v^{'}$ . By examining the coefficients of $E_{4}^{0}$ we must have

\begin{matrix} α + β = α^{'} + β^{'} . \qquad (10) \end{matrix}

Next consider the coefficient of $E_{4}^{g^{'}}$ . Either $E_{4}^{g^{'}}$ appears in the expression for v or $2 α^{'} + β^{'} = 0$ . For the former, since $g^{'} \neq g, 0$ and $g^{'} \in X$ so $g^{'} \neq h_{g}$ , we must have either $g^{'} = 2 g$ or $g^{'} = g + h_{g}$ . We consider these three cases separately.

Case 1: $g^{'} = 2 g$ . By equating coefficients of $E_{4}^{g^{'}} = E_{4}^{2 g}$ we have $2 α^{'} + β^{'} = - α$ , and substituting in to equation (10) gives $β = 3 α^{'} + 2 β^{'}$ . Next consider the coefficient of $E_{4}^{h_{g^{'}}}$ . Either this is zero (i.e. $β^{'} = 0$ ) or $E_{4}^{h_{g^{'}}}$ appears in the expression for v.

If $β^{'} \neq 0$ then since $h_{g^{'}} \neq g, h_{g}, 0,$ or $g^{'} = 2 g$ , we must have $h_{g^{'}} = g + h_{g},$ and therefore $β^{'} = - β$ , from which it follows by Equation (10) that $α + 2 β = α^{'}$ . Now since the coefficient of $E_{4}^{g^{'} + h_{g^{'}}}$ is not zero we must have either $g^{'} + h_{g^{'}} = g$ or $h_{g}$ . If $g^{'} + h_{g^{'}} = h_{g}$ then substituting in $h_{g^{'}} = g + h_{g}$ gives $g^{'} + g + h_{g} = h_{g}$ and therefore $g = - g^{'}$ , a contradiction. We therefore must have $g^{'} + h_{g^{'}} = g$ so then $- β^{'} = 2 α + β$ , i.e., $α = 0$ . But now we have simultaneous equations $2 β = α^{'}$ and $3 β = 3 α^{'}$ (the latter from $β = 3 α^{'} + 2 β^{'}$ with $β^{'} = - β$ ), so $α^{'} = β = 0$ .

Now, if $β^{'} = 0$ then we have $α = - 2 α^{'}$ and $β = 3 α^{'}$ . Substituting these values into equation (8) and setting $v = v^{'}$ gives

- α^{'} E_{4}^{g} + 2 α^{'} E_{4}^{2 g} + 3 α^{'} E_{4}^{h_{g}} - 3 α^{'} E_{4}^{g + h_{g}} = 2 α^{'} E_{4}^{g^{'}} - α^{'} E_{4}^{2 g^{'}},

so we conclude that $α^{'} = 0$ , and therefore $v = v^{'} = 0$ .

Case 2: $g^{'} = g + h_{g}$ . By equating coefficients of $E_{4}^{g^{'}} = E_{4}^{g + h_{g}}$ we have $2 α^{'} + β^{'} = - β$ . Next consider the coefficient of $E_{4}^{h_{g^{'}}}$ . As before, either this is zero or $E_{4}^{h_{g^{'}}}$ appears in the expression for v.

If $β^{'} \neq 0$ then we must have $h_{g^{'}} = 2 g$ , so that $β^{'} = - α$ and therefore $β = α^{'} + 2 β^{'}$ by Equation (10), and then $3 α^{'} + 3 β^{'} = 0$ . Now consider the coefficient of $E_{4}^{g^{'} + h_{g^{'}}}$ . Either $g^{'} + h_{g^{'}} = g$ or $g^{'} + h_{g^{'}} = h_{g}$ . If $g^{'} + h_{g^{'}} = g$ then $g + h_{g} + h_{g^{'}} = g$ so $h_{g} = - h_{g^{'}}$ , but this contradicts how $h_{g}$ and $h_{g^{'}}$ were chosen; both $h_{g}$ and $h_{g^{'}} \in G \ (X \cup {0})$ , but then by definition of X we must have $- h_{g}$ and $- h_{g^{'}} \in X$ . Thus we must have $g^{'} + h_{g^{'}} = h_{g}$ , and so $β = - β^{'}$ . Now we have $β^{'} = - α = - β$ so by Equation (10) we have $0 = α^{'} + 3 β^{'}$ . Solving this simultaneously with $3 α^{'} + 3 β^{'} = 0$ gives $v = v^{'} = 0$ .

If $β^{'} = 0$ we have $2 α^{'} = - β$ and so by Equation (10) $α = 3 α^{'}$ . Substituting into equation (8) and setting $v = v^{'}$ gives us $α^{'} = 0$ as in case 1, and therefore $v = v^{'} = 0$ .

Case 3: $2 α^{'} + β^{'} = 0$ . In this case we have $β^{'} = - 2 α^{'}$ so $α^{'} = - (α + β)$ . This gives us

v^{'} = (α + β) E_{4}^{2 g^{'}} + 2 (α + β) E_{4}^{h_{g^{'}}} - 2 (α + β) E_{4}^{g^{'} + h_{g^{'}}} - (α + β) E_{4}^{0} .

In particular, if $v^{'} \neq 0$ then $α + β \neq 0$ so that $v^{'}$ has 4 linearly independent non-zero terms. We have

v = (2 α + β) E_{4}^{g} - α E_{4}^{2 g} + β E_{4}^{h_{g}} - β E_{4}^{g + h_{g}} - (α + β) E_{4}^{0},

so if $v = v^{'}$ we must have exactly four non-zero terms in the expression for v. This is only possible if $α = 0$ or $2 α + β = 0$ . If $α = 0$ then we have

\begin{matrix} v & = β E_{4}^{g} + β E_{4}^{h_{g}} - β E_{4}^{g + h_{g}} - β E_{4}^{0}, \\ v^{'} & = β E_{4}^{2 g^{'}} + 2 β E_{4}^{h_{g^{'}}} - 2 β E_{4}^{g^{'} + h_{g^{'}}} - β E_{4}^{0} . \end{matrix}

If $v = v^{'}$ then by equating coefficients we must have $β = 0$ and thus $v = v^{'} = 0$ . On the other hand, if $2 α + β = 0$ we get

\begin{matrix} v & = - α E_{4}^{2 g} - 2 α E_{4}^{h_{g}} + 2 α E_{4}^{g + h_{g}} + α E_{4}^{0}, \\ v^{'} & = - α E_{4}^{2 g^{'}} - 2 α E_{4}^{h_{g^{'}}} + 2 α E_{4}^{g^{'} + h_{g^{'}}} + α E_{4}^{0} . \end{matrix}

Examining the coefficient of $E_{4}^{2 g}$ we see that since $g \neq g^{'}$ and $g \neq 0$ we must have $- α = \pm 2 α$ , so $α = 0$ and $v = v^{'} = 0$ . $□$
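The linear independence claimed in Lemma 3.7 can be confirmed directly in the smallest case it covers. The Python sketch below takes $G = Z / 7 Z$ , $X = {1, 2, 3}$ and $h_{1} = 4, h_{2} = 6, h_{3} = 5$ (one valid choice under Lemma 3.1), and computes the rank of the $2 ℓ = 12$ selected columns of B, using $B_{(g_{1}, g_{2}, g_{3})} = E_{5}^{g_{1}} + E_{4}^{g_{1} + g_{2}} - E_{4}^{g_{2}}$ . It is a single-case check and not a substitute for the proof.

```python
from fractions import Fraction

N = 7
X = [1, 2, 3]
H = {1: 4, 2: 6, 3: 5}  # injective, with h_g outside X and h_g not in {0, -g, 2g}


def matrix_rank(rows):
    """Rank over the rationals by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def b_column(g1, g2):
    """E_5^{g1} + E_4^{g1+g2} - E_4^{g2}; coordinates 0..N-1 are E_4, N..2N-1 are E_5."""
    col = [0] * (2 * N)
    col[(g1 + g2) % N] += 1
    col[g2 % N] -= 1
    col[N + g1 % N] += 1
    return col


selected = []
for g in X:
    selected += [b_column(-g, 0), b_column(g, 0), b_column(g, g), b_column(g, H[g])]

rank = matrix_rank(selected)
print(len(selected), rank)
```

The twelve selected vectors have rank 12, so $rank B \geq 2 ℓ$ in this case.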

Putting together the results of this section and (Gross et al. 2024, Proposition 4.2), we get Theorem 1.1, which we restate here for convenience. By direct calculation (see Table 4), we have that for $G = Z / 5 Z$ , the dimension of $V_{N}^{G}$ is 21, the expected dimension, so we include this result in the statement.

Table 4.

The number of samples for each $A_{λ}$ rank when $G = Z / 5 Z$

rank	13	14	15	16	17	18	19	20	21
# chambers	4	80	560	2,160	5,228	11,520	27,960	41,360	24,480
% chambers	.004%	.071%	.494%	1.906%	4.612%	10.163%	24.667%	36.488%	21.596%


Theorem 1.1

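Let G be a finite abelian group of odd order $ℓ + 1 \geq 5$ and let $N$ be the 3-sunlet network. Then, under the general group-based model for G,

dim V_{N}^{G} = 5 ℓ + 1 .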

Proof

For $G = Z / 5 Z$ , the result is achieved computationally by using random sampling to find a weight vector $λ$ for which $A_{λ}$ has rank $5 ℓ + 1 = 21$ (Table 4). For all other G, choose $λ$ as described in this section. Through row operations we can transform the matrix $A_{λ}$ into a block upper triangular matrix of the form

\begin{matrix} A_{λ} = (\begin{matrix} A & * \\ 0 & B \end{matrix}), \end{matrix}

so that $rank A_{λ} \geq rank A + rank B$ . By Lemma 3.5, $rank A = 3 ℓ + 1$ , and, by Lemma 3.7, $rank B \geq 2 ℓ$ . Thus, we obtain $rank A_{λ} \geq 5 ℓ + 1$ , and so the affine dimension of $V_{N}^{G}$ is at least $5 ℓ + 1$ . On the other hand, Lemma 2.1 says that $dim V_{N}^{G}$ is at most $5 ℓ + 1$ . $□$

As discussed at the beginning of this work, the 3-sunlet was the only sunlet where a dimension formula was not given in Gross et al. (2024). Since level-1 phylogenetic networks can be broken down into trees and sunlet networks, for the case when G is an abelian group of odd order at least 5 and $N$ is a level-1 phylogenetic network, we can now give a full dimension result.

Theorem 3.8

Let $N$ be a level-1 phylogenetic network with n leaves, m edges, and c cycles. Let G be a finite abelian group of odd order $ℓ + 1 \geq 5$ . Then the variety corresponding to $N$ under the general group-based model for G, denoted $V_{N}^{G}$ , has dimension $ℓ (m - c) + 1$ .

Proof

Following (Gross et al. 2024, Theorem 1.1), we prove the result by induction on the number of cut edges of $N$ . If $N$ has no cut edges, then it is either the 3-star tree or a sunlet network. If $N$ is the 3-star tree, then it has no cycles and $dim V_{N}^{G} = 3 ℓ + 1$ (Gross et al. 2024, Lemma 4.1). If $N$ is a 3-sunlet network then it has a single cycle and 6 edges, and $dim V_{N}^{G} = 5 ℓ + 1$ by Theorem 1.1. If $N$ is an n-sunlet network with $n > 3$ then it has a single cycle and 2n edges, and $dim V_{N}^{G} = ℓ (2 n - 1) + 1$ (Gross et al. 2024, Theorem 4.7). Thus in all cases the result holds.

For the induction step, suppose that $N$ is a level-1 phylogenetic network with a cut edge e, m edges, and c cycles. Let $N_{1}$ and $N_{2}$ be the networks obtained by cutting $N$ at e and observe that both $N_{1}$ and $N_{2}$ have strictly fewer cut edges than $N$ . For $i = 1, 2$ let $m_{i}$ and $c_{i}$ be the number of edges and cycles respectively of $N_{i}$ ; then by induction we have $dim V_{N_{i}}^{G} = ℓ (m_{i} - c_{i}) + 1$ . Next we have that the ideal $I_{N}^{G}$ defining $V_{N}^{G}$ is given by the toric fiber product of $I_{N_{1}}^{G}$ and $I_{N_{2}}^{G}$ (Cummings et al. 2024, Remark 3.3). Using (Gross et al. 2024, Corollary 3.4) we obtain

\begin{matrix} dim V_{N}^{G} & = dim V_{N_{1}}^{G} + dim V_{N_{2}}^{G} - (ℓ + 1) \\ & = ℓ (m_{1} - c_{1}) + 1 + ℓ (m_{2} - c_{2}) + 1 - (ℓ + 1) \\ & = ℓ (m_{1} + m_{2} - 1 - (c_{1} + c_{2})) + 1 \\ & = ℓ (m - c) + 1 . \end{matrix}

$□$
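The arithmetic of the induction can be illustrated with a few lines of Python. The sketch below encodes the formula $ℓ (m - c) + 1$ and the gluing rule $dim V_{N_{1}}^{G} + dim V_{N_{2}}^{G} - (ℓ + 1)$ used in the proof, and checks that they agree on the base cases and on one glued network; the function names are illustrative.

```python
def level1_dimension(ell, edges, cycles):
    """Dimension formula of Theorem 3.8: ell * (m - c) + 1."""
    return ell * (edges - cycles) + 1


def glued_dimension(ell, dim_1, dim_2):
    """Dimension after gluing two networks along a cut edge."""
    return dim_1 + dim_2 - (ell + 1)


ell = 4  # G = Z/5Z

star = level1_dimension(ell, edges=3, cycles=0)      # 3-star tree: 3*ell + 1
sunlet_3 = level1_dimension(ell, edges=6, cycles=1)  # 3-sunlet: 5*ell + 1
sunlet_4 = level1_dimension(ell, edges=8, cycles=1)  # 4-sunlet: ell*(2*4 - 1) + 1

# Glue a 3-sunlet to a 3-star along one edge: m = 6 + 3 - 1 = 8 edges, c = 1 cycle.
glued = glued_dimension(ell, sunlet_3, star)
direct = level1_dimension(ell, edges=8, cycles=1)
print(star, sunlet_3, sunlet_4, glued, direct)
```

The glued value and the direct value coincide because the cut edge is counted once in m but appears in both $N_{1}$ and $N_{2}$ .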

The dimension results in Gross et al. (2024) were used to prove identifiability statements for level-1 phylogenetic networks under group-based models. Now that we have the dimension result for 3-sunlet networks, we can remove the ‘triangle-free’ restriction placed on those statements for the general group-based model for G, when G is a finite abelian group of odd order at least 5. The proof of the following result is identical to (Gross et al. 2024, Proposition 6.9). First, recall the following two definitions. We say that two phylogenetic networks $N_{1}$ and $N_{2}$ are distinguishable over G if $V_{N_{1}}^{G} ⊄ V_{N_{2}}^{G}$ and $V_{N_{2}}^{G} ⊄ V_{N_{1}}^{G}$ . Given a level-1 phylogenetic network $N$ on n leaves and a subset A of the leaf-set [n], the network restricted to A, denoted ${N |}_{A}$ , is the level-1 phylogenetic network obtained from $N$ by removing all edges and vertices that do not lie on any path between two leaves in A, and suppressing any resulting vertices of degree 2 (except the root vertex). See (Gross and Long 2018, Definition 4.1) for a full definition.

Proposition 3.9

Let $N_{1}$ and $N_{2}$ be two level-1 phylogenetic networks on n leaves and both with exactly c cycles. Let G be a finite abelian group of odd order $\geq 5$ . If there exists a subset $A \subset [n]$ such that either

$N_{1} |_{A}$ and $N_{2} |_{A}$ are level-1 phylogenetic networks with distinct numbers of cycles; or
$N_{1} |_{A}$ is a tree and $N_{2} |_{A}$ is a level-1 phylogenetic network (i.e. with at least 1 cycle); or
$N_{1} |_{A}$ and $N_{2} |_{A}$ are distinct trees;

then $N_{1}$ and $N_{2}$ are distinguishable over G. $□$

Experimental Results

Theorem 1.1 confirms that for groups of odd order (other than $G = Z / 3 Z$ ), the general group-based model on a 3-sunlet has the expected dimension $5 ℓ + 1$ . In this section, we investigate the dimension for small finite abelian groups. Whilst an analogous construction of $λ$ for even order groups does not give us a maximal rank $A_{λ}$ , through experiments we find that the expected dimension is obtained in all cases once G is sufficiently large. The code we used to perform these calculations was written in Julia (Bezanson et al. 2017) and is available to download at https://github.com/shelbycox/3-Sunlet.

Sampling Methods

We use the hyperplane description from Section 2.3 to compute the possible matrices $A_{λ}$ (and their ranks) that can appear for the groups $Z / 3 Z$ , $Z / 4 Z$ , $Z / 2 Z \times Z / 2 Z$ , and $Z / 5 Z$ . For each of the possible $2^{ℓ (ℓ + 1) + 1}$ regions, we use OSCAR (Oscar 2024) to test whether the region is full dimensional. We then use random sampling of points $p = (μ, λ_{4})$ in ${(- 0.5, 0.5)}^{2 (ℓ + 1)}$ to obtain exactly one point in each region. For other small groups mentioned in this section, we used only random sampling with $2^{32}$ samples for each group to obtain the results. In each case, we retain at most one point for each region.
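The random sampling step can be sketched in Python as follows. This is not the authors' Julia implementation and it omits the OSCAR feasibility test. It assumes a cyclic group $Z / n Z$ , the assignment rule that $(g_{1}, g_{2}, g_{3})$ goes to $T_{1}$ exactly when $ν_{g_{1} + g_{2}} - ν_{g_{2}} < μ_{g_{1}}$ , and the exponent patterns of Section 3. Each sampled point is reduced to its sign pattern, and one rank is recorded per pattern.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import product


def matrix_rank(rows):
    """Rank over the rationals by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def jacobian(n, labels, in_t1):
    """Exponent matrix A_lambda for a given column assignment (assumed pattern)."""
    columns = []
    for (g1, g2), t1 in zip(labels, in_t1):
        col = [0] * (6 * n)
        for edge, g in ((1, g1), (2, g2), (3, -g1 - g2)):
            col[(edge - 1) * n + g % n] = 1
        if t1:
            col[3 * n + (g1 + g2) % n] = 1  # w_4^{g1+g2}
            col[4 * n + g1] = 1             # w_5^{g1}
        else:
            col[3 * n + g2] = 1             # w_4^{g2}
            col[5 * n + g1] = 1             # w_6^{g1}
        columns.append(col)
    return [list(row) for row in zip(*columns)]


def sample_chambers(n, samples, seed=0):
    """Map each sampled sign pattern to the rank of its matrix A_lambda."""
    rng = random.Random(seed)
    labels = list(product(range(n), repeat=2))
    rank_of = {}
    for _ in range(samples):
        mu = [rng.uniform(-0.5, 0.5) for _ in range(n)]
        nu = [rng.uniform(-0.5, 0.5) for _ in range(n)]
        in_t1 = tuple(nu[(g1 + g2) % n] - nu[g2] < mu[g1] for g1, g2 in labels)
        if in_t1 not in rank_of:  # retain at most one point per region
            rank_of[in_t1] = matrix_rank(jacobian(n, labels, in_t1))
    return rank_of


chambers = sample_chambers(3, samples=5000)
histogram = Counter(chambers.values())
print(len(chambers), sorted(histogram.items()))
```

For $Z / 3 Z$ there are 7 hyperplanes, so at most $2^{7} = 128$ sign patterns can occur, and every rank lies between $3 ℓ + 1 = 7$ and 9. With enough samples one would expect the histogram to approach the counts in Table 1, although this sketch has not been compared against the authors' code.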

Chamber counts for small groups

Tables 1, 2, 3, 4 and 5 are the result of the computations described above for the groups $Z / 3 Z$ , $Z / 4 Z$ , $Z / 2 Z \times Z / 2 Z$ , $Z / 5 Z$ , and $Z / 6 Z$ respectively. For the first four groups listed, we confirmed that these are exactly the regions of the hyperplane arrangement. For $Z / 6 Z$ , the data in the table is the result of sampling $2^{32}$ points and then retaining no more than one point per region. In addition, we include an illustration of relationships between chambers for $Z / 3 Z$ in Figure 4. We make some observations.

Table 1.

The number of samples for each $A_{λ}$ rank when $G = Z / 3 Z$

rank	7	8	9
# chambers	4	24	64
% chambers	4.3%	26.1%	69.6%


Table 2.

The number of samples for each $A_{λ}$ rank when $G = Z / 4 Z$

rank	10	11	12	13	14	15	16
# chambers	4	48	180	496	864	624	112
% chambers	.2%	2.06%	7.7%	21.3%	37.1%	26.8%	4.8%


Table 3.

The number of samples for each $A_{λ}$ rank when $G = Z / 2 Z \times Z / 2 Z$

rank	10	11	12	13	14	15	16
# chambers	4	48	156	584	1056	480	0
% chambers	.2%	2.06%	6.7%	25.1%	45.4%	20.6%	0%


Table 5.

The number of samples for each $A_{λ}$ rank when $G = Z / 6 Z$

rank	16	17	18	19	20	21
# chambers	4	120	1296	7,180	26,576	79,156
% chambers	.00004%	.00129%	.01391%	.07707%	.28528%	.84969%
rank	22	23	24	25	26
# chambers	229,069	742,458	2,148,510	3,606,334	2,475,117
% chambers	2.45892%	7.96986%	23.06303%	38.71193%	26.56897%

Fig. 4 — Poset of $H (Z / 3 Z)$ , graded by distance (right to left) from the starting chamber, $T_{1}$ (at the bottom of the poset or far right of the diagram). The number in each node is the rank of the corresponding chamber. **Key**: blue/square - $rank A_{λ} = 7$ , red/triangle - $rank A_{λ} = 8$ , yellow/circle - $rank A_{λ} = 9$

Remark 4.1

As observed in Lemma 2.6, in all cases we have exactly 4 chambers of lowest rank, which is equal to the dimension of the 3-star tree, $3 ℓ + 1$ . Two of the chambers correspond to when all leaf-labellings are assigned to either $T_{1}$ or $T_{2}$ . Weight vectors $λ$ with $λ_{5}^{g}$ sufficiently large and $λ_{6}^{g} = 0$ for all $g \in G$ , or vice-versa, lie in these chambers. The remaining two chambers correspond to when all leaf-labellings $(0, g, - g)$ are assigned to $T_{1}$ and all others assigned to $T_{2}$ ; and when all leaf-labellings $(0, g, - g)$ are assigned to $T_{2}$ and all others assigned to $T_{1}$ . These chambers can be reached by taking the previous weight vector $λ$ and swapping the values of $λ_{5}^{0}$ and $λ_{6}^{0}$ .

For $G = Z / 3 Z$ , these chambers can be seen in Figure 4, drawn in squares and highlighted in blue at the very top and very bottom of the diagram. The shortest path in the poset between $T_{1}$ and $T_{2}$ has length 7, meaning that 7 hyperplanes need to be crossed to reach one from the other.

Remark 4.2

When $G = Z / 3 Z$ , most chambers have rank 9, which is the maximum possible, because the space containing the corresponding variety is $C^{9}$ . However, for $Z / 4 Z$ the maximum rank, 16, is achieved by only $4.8 %$ of chambers. For $Z / 2 Z \times Z / 2 Z$ the maximum rank is 15, which is achieved by only $20.6 %$ of chambers. For both $Z / 4 Z$ and $Z / 2 Z \times Z / 2 Z$ , rank 14 chambers are the most common ( $37.1 %$ and $45.4 %$ , respectively).

Remark 4.3

From Table 2 and Table 3, we observe that the number of chambers in the hyperplane arrangement is the same for $Z / 4 Z$ and $Z / 2 Z \times Z / 2 Z$ . We speculate that this is true in general for groups of the same order. However, we also observe that the distribution of ranks for these chambers differs. Thus, the distribution of ranks can depend on the structure of G, and not just |G|.
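The totals behind this observation can be recomputed directly from the tables. The following Python sketch uses only the counts transcribed from Tables 2 and 3; the helper name `summarize` is ours and purely illustrative.

```python
# Chamber counts per rank, transcribed from Table 2 (Z/4Z) and Table 3 (Z/2Z x Z/2Z).
z4 = {10: 4, 11: 48, 12: 180, 13: 496, 14: 864, 15: 624, 16: 112}
klein = {10: 4, 11: 48, 12: 156, 13: 584, 14: 1056, 15: 480, 16: 0}

def summarize(counts):
    """Return the total number of chambers and the percentage share of each rank."""
    total = sum(counts.values())
    shares = {rank: round(100 * n / total, 1) for rank, n in counts.items()}
    return total, shares

total_z4, shares_z4 = summarize(z4)
total_klein, shares_klein = summarize(klein)

# Same number of chambers, different distribution of ranks.
print(total_z4, total_klein)
print(shares_z4)
print(shares_klein)
```

Both totals come out to 2328, while the shares per rank differ, which is the distinction drawn in the remark.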

Locally maximal chambers

In our investigations, we were curious whether a greedy algorithm could be used to find an appropriate $λ$ by moving from chamber to chamber. For such a method to work, there should be no locally maximal chambers that do not achieve the maximum rank. A chamber is locally maximal in terms of rank if all adjacent chambers have rank less than or equal to the rank of the chamber. Here, we describe our search for locally maximal chambers that do not achieve maximum rank. To do this computationally, we cycle through every chamber in the hyperplane arrangement found by random sampling (as in Section 4.2) and check whether the rank of this chamber is less than the global maximum, but greater than or equal to the rank of every adjacent chamber. The code is available in the file locallyMaximalChambers.jl.
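The search can be sketched in a few lines of Python. This is not the contents of locallyMaximalChambers.jl; it is a minimal illustration under two assumptions of ours: each chamber is stored as a tuple of signs (+1 or -1, one per hyperplane), and two chambers are adjacent when their sign vectors differ in exactly one position. The ranks in the toy example are made up.

```python
def flip(signs, i):
    """Sign vector obtained by crossing the i-th hyperplane."""
    return signs[:i] + (-signs[i],) + signs[i + 1:]

def locally_maximal_deficient(rank_of):
    """Chambers below the global maximum whose neighbours all have rank <= their own.

    rank_of maps sign vectors (tuples of +1/-1) to the rank of the chamber.
    Sign vectors missing from rank_of are treated as empty regions.
    """
    global_max = max(rank_of.values())
    found = []
    for signs, rank in rank_of.items():
        if rank == global_max:
            continue  # globally maximal chambers are not deficient
        neighbour_ranks = [
            rank_of[flip(signs, i)]
            for i in range(len(signs))
            if flip(signs, i) in rank_of
        ]
        if all(r <= rank for r in neighbour_ranks):
            found.append(signs)
    return found

# Hypothetical toy data: two hyperplanes, four chambers, made-up ranks.
toy_ranks = {(1, 1): 3, (1, -1): 2, (-1, 1): 2, (-1, -1): 2}
print(locally_maximal_deficient(toy_ranks))
```

In the toy data, only the chamber (-1, -1) is reported: it has rank 2, below the maximum 3, and both of its neighbours also have rank 2. If the chamber list is incomplete, a missing neighbour of higher rank would cause a false positive, which is the caveat raised for $Z / 6 Z$ .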

The number of such chambers for small groups is given in Table 6. For groups with $| G | \leq 5$ , the rank of the locally maximal chambers of deficient rank is always one less than the dimension of the model. For $G = Z / 6 Z$ , this appears not to be the case. For this group, our code did not complete, but had found 625 locally maximal chambers of ranks 23, 24, and 25 before it was terminated. However, since we were not able to verify that random sampling had found all chambers in the hyperplane arrangement, some of the chambers reported as locally maximal may have an unsampled adjacent chamber of higher rank, and so may not be locally maximal at all.

Table 6.

The number of locally maximal chambers of deficient rank for each group. Locally maximal chambers of deficient rank are chambers that have non-maximal rank and for which all adjacent chambers have rank less than or equal to the rank of the chamber.

Group	$Z / 3 Z$	$Z / 4 Z$	$Z / 2 Z \times Z / 2 Z$	$Z / 5 Z$	$Z / 6 Z$
Locally maximal chambers	0	128	0	1840	$\geq 625$
%	0%	5.50%	0%	1.62%	$\geq 0.006 %$

Weight vector for groups of even order

In Section 3, we construct a weight vector $λ$ for which $A_{λ}$ has the maximum possible rank for groups of odd order. Here, for even order groups, we construct an analogous vector $λ$ by following the construction in Section 3, but including elements of order 2 in the set X. In this case, we find that the rank of $A_{λ}$ does not always achieve the empirical maximum rank. That is, random sampling sometimes finds $λ^{'}$ with $A_{λ^{'}}$ having greater rank than $A_{λ}$ .

Observe that in all cases, through random sampling we are able to find a weight vector where $A_{λ}$ has the maximum possible rank according to Lemma 2.1. This means that in all cases in Table 7, the dimension of the variety is equal to the expected dimension of $5 ℓ + 1$ .

Table 7.

The third column of the table (Empirical maximum) lists the maximum rank of $A_{λ}$ found by randomly sampling $λ$ . The fourth column (Modified Section 3 construction) lists the rank of $A_{λ}$ , when $λ$ is the vector described at the beginning of this section. The last column records the difference between the third and fourth columns.

Group	\|G\|	Empirical maximum of $rank (A_{λ})$	$rank (A_{λ})$ from modified Section 3 construction	Gap
$Z / 4 Z$	4	16	15	1
${(Z / 2 Z)}^{2}$	4	15	13	2
$Z / 6 Z$	6	26	26	–
$Z / 8 Z$	8	36	36	–
$Z / 4 Z \times Z / 2 Z$	8	36	36	–
${(Z / 2 Z)}^{3}$	8	36	29	7
$Z / 10 Z$	10	46	46	–
$Z / 12 Z$	12	56	56	–
$Z / 3 Z \times {(Z / 2 Z)}^{2}$	12	56	56	–
$Z / 14 Z$	14	66	66	–
$Z / 16 Z$	16	76	76	–
$Z / 8 Z \times Z / 2 Z$	16	76	76	–
$Z / 4 Z \times Z / 4 Z$	16	76	76	–
$Z / 4 Z \times {(Z / 2 Z)}^{2}$	16	76	76	–
${(Z / 2 Z)}^{4}$	16	76	61	15
$Z / 18 Z$	18	86	86	–
${(Z / 3 Z)}^{2} \times Z / 2 Z$	18	86	86	–
${(Z / 2 Z)}^{5}$	32	156	125	31

We see that for $λ$ constructed according to the methods in Section 3, $A_{λ}$ often achieves the maximum empirical rank; among the groups in Table 7, the construction falls short only for $Z / 4 Z$ and for powers of $Z / 2 Z$ . Furthermore, for the powers of $Z / 2 Z$ , the difference between the rank of $A_{λ}$ and the maximum possible rank $5 ℓ + 1$ is equal to $| G | - 1$ . In Appendix A, we provide an example comparing the constructed $λ$ and the empirically maximal $λ$ for $G = Z / 4 Z$ . Specifically, we find a weight vector $λ^{'}$ for which the corresponding matrix, $A_{λ^{'}}$ , achieves the maximum rank of 16, and compare it to the matrix $A_{λ}$ obtained from a construction analogous to that in Section 3. This gives good evidence that it is always possible to achieve the maximum possible rank (i.e., $5 ℓ + 1$ ) as the rank of $A_{λ}$ , when $| G | \geq 5$ .
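The arithmetic behind Table 7 can be checked with a short Python sketch. It assumes $ℓ = | G | - 1$ , an identification we infer from the tables (for example, the lowest rank $3 ℓ + 1$ equals 7 when $| G | = 3$ ), and it compares the rank from the modified Section 3 construction with $5 ℓ + 1$ . The ranks are transcribed from the table; the variable and function names are ours.

```python
# (group, |G|, rank of A_lambda from the modified Section 3 construction), from Table 7.
table7 = [
    ("Z/4Z", 4, 15), ("(Z/2Z)^2", 4, 13), ("Z/6Z", 6, 26),
    ("Z/8Z", 8, 36), ("Z/4Z x Z/2Z", 8, 36), ("(Z/2Z)^3", 8, 29),
    ("Z/10Z", 10, 46), ("Z/12Z", 12, 56), ("Z/3Z x (Z/2Z)^2", 12, 56),
    ("Z/14Z", 14, 66), ("Z/16Z", 16, 76), ("Z/8Z x Z/2Z", 16, 76),
    ("Z/4Z x Z/4Z", 16, 76), ("Z/4Z x (Z/2Z)^2", 16, 76), ("(Z/2Z)^4", 16, 61),
    ("Z/18Z", 18, 86), ("(Z/3Z)^2 x Z/2Z", 18, 86), ("(Z/2Z)^5", 32, 125),
]

def expected_dimension(order):
    """Expected dimension 5*l + 1, assuming l = |G| - 1."""
    return 5 * (order - 1) + 1

# Keep only the groups where the construction falls short of 5*l + 1.
shortfalls = {
    label: expected_dimension(order) - constructed
    for label, order, constructed in table7
    if expected_dimension(order) != constructed
}
print(shortfalls)
```

Under these assumptions, the only groups with a shortfall are $Z / 4 Z$ (by 1) and the powers of $Z / 2 Z$ (by $| G | - 1$ in each case).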

Discussion

In this paper, we give a dimension formula for varieties associated to a 3-sunlet phylogenetic network and general group-based model, where the group G is a finite abelian group of odd order. To do this, we use ideas from tropical geometry and linear algebra. Our proof relies on the fact that odd order groups have no elements of order 2, so the non-identity elements can be partitioned into two sets of equal size (one of which we refer to as X in Section 3), each consisting of the inverses of the elements of the other. Thus, our proof does not obviously generalise to even-order groups, and a full understanding of the dimension of these models remains open. In Section 4.4 we construct weight vectors for even-order groups by assigning self-inverse elements to X and following the construction in Section 3. However, as shown in Table 7, for those groups that are products of $Z / 2 Z$ , the rank of the corresponding $A_{λ}$ is less than the dimension of the model. Interestingly, the difference is $| G | - 1$ .

We have not yet explored models for which the subgroup $B \subset Aut G$ is non-trivial. In Gross et al. (2024), a dimension formula is given for triangle-free phylogenetic networks for all group-based models (i.e., for all such subgroups B). This is achieved by choosing a weight vector that, for each edge in the network, is constant on the coordinates associated to the B-orbits on that edge. Since the structure of $Aut (G)$ varies with G, this is only possible if we consider the two orbits $\{0\}$ and $G \setminus \{0\}$ . However, in our experiments we found that for the 3-sunlet, the weight vectors $λ$ with $A_{λ}$ of maximal rank were not constant on these orbits, suggesting that a case-by-case analysis may be required.

We identified equivalent weight vectors $λ$ (i.e. those for which $A_{λ}$ is the same) with chambers in a hyperplane arrangement. Each chamber can be interpreted as a toric degeneration of the variety. Two of the chambers correspond to the two distinct phylogenetic trees displayed by a 3-sunlet. These have the lowest rank among all chambers, and the rank is equal to the dimension of the variety associated to the group-based model on the corresponding tree. The remaining chambers correspond to a mixture of the Fourier coordinates for these two trees, and so there is no clear phylogenetic interpretation of these chambers.

The investigations in this paper highlight the intricate challenges involved in understanding 3-cycles. In the sampling experiments in Section 4, we were able to examine the ranks of all chambers in the hyperplane arrangement. However, it is only when $| G | \geq 5$ that the expected dimension of the variety ( $5 ℓ + 1$ ) is less than the dimension of the ambient space ( ${| G |}^{2}$ ), so the cases with $| G | \leq 4$ are exceptional. We observe that as the group gets larger, there are proportionally more chambers of maximal rank. However, due to the large growth in the number of chambers as the size of the group grows, we were not able to determine if this is a general pattern.
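The threshold $| G | \geq 5$ can be recovered by a short calculation. Writing $ℓ = | G | - 1$ (an identification we infer from the lowest ranks in Tables 1, 2 and 5, for example $3 ℓ + 1 = 7$ when $| G | = 3$ ), and using $| G | \geq 2$ , we have

$$5 ℓ + 1 < {| G |}^{2} \iff {| G |}^{2} - 5 | G | + 4 > 0 \iff (| G | - 1) (| G | - 4) > 0 \iff | G | > 4 .$$

For $| G | = 4$ the two quantities coincide, $5 ℓ + 1 = {| G |}^{2} = 16$ , and for $| G | = 3$ the expected dimension $5 ℓ + 1 = 11$ exceeds the ambient dimension 9.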

A first approach to obtaining the result for all finite abelian groups may be to try to adapt the weight vector $λ$ . Guidance on appropriate adaptations could be found by understanding how changes in the vector correspond to moving between chambers. Indeed, a good understanding of the hyperplane arrangement may make it possible to devise an algorithm to search for a weight vector $λ$ for which $A_{λ}$ has maximal rank. However, as we have shown in Section 4.3, this is also not straightforward. In that section, we found that for at least some groups there are locally maximal chambers of deficient rank, and therefore a greedy algorithm starting at a lowest rank chamber and moving to chambers of strictly larger rank may not terminate on a globally maximal chamber.

To further understand the varieties associated to 3-sunlet networks and group-based models, we would like to be able to describe generating sets of the corresponding ideals. Polynomials in these ideals are called phylogenetic invariants, and are useful for determining model identifiability, model selection, and even topology inference from sequence data. However, calculating generating sets is challenging. Using Macaulay2 (Grayson and Stillman 2022) and elimination theory, we attempted to find a Gröbner basis for the ideal corresponding to a 3-sunlet under the general group-based model with $G = Z / 5 Z$ , the smallest odd-order group for which we have a dimension result and for which the variety does not fill the whole space. After 50 days on an HPC these computations were still running. We also used the MultigradedImplicitization package (Cummings and Hollering 2026) to find generators of fixed total degree. The computations completed for degrees up to and including 7, and in each case there were no generators. After 45 days the computations for degree 8 were still running. In this case we have a polynomial ring with $n = 25$ generators, so the dimension of the space spanned by monomials of degree $d = 8$ is large: $\binom{n + d - 1}{d} = \binom{32}{8} = 10518300$ . This highlights the difficulty in calculating generators even for small models. Further work is necessary for this to be achievable.
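The monomial count quoted above is a standard stars-and-bars computation and can be reproduced with Python's `math.comb`. The sketch below only illustrates how quickly the search space grows with the degree; the function name `monomial_count` is ours.

```python
from math import comb

def monomial_count(n_vars, degree):
    """Number of monomials of exactly the given total degree in n_vars variables."""
    return comb(n_vars + degree - 1, degree)

# Growth of the monomial space for the 25 variables of the Z/5Z model.
for degree in range(1, 9):
    print(degree, monomial_count(25, degree))
```

For degree 8 this gives the value 10518300 stated in the text.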

Acknowledgements

SC was supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1841052, and by the National Science Foundation under Grant No. 1855135 during the writing of this paper. SM was supported by the Biotechnology and Biological Sciences Research Council (BBSRC), part of UK Research and Innovation, through the Core Capability Grant BB/CCG1720/1 at the Earlham Institute and is grateful for HPC support from NBI’s Research Computing group. SM is grateful for further funding from BBSRC (grant number BB/X005186/1) which also supported this work. EG was supported by the National Science Foundation grant DMS-1945584. This project was initiated at the “Algebra of Phylogenetic Networks Workshop” held at the University of Hawai‘i at Mānoa and supported by National Science Foundation grant DMS-1945584. Additional parts of this research were performed while EG and SC were visiting the Institute for Mathematical and Statistical Innovation (IMSI) for the semester-long program on “Algebraic Statistics and Our Changing World.” IMSI is supported by the National Science Foundation (Grant No. DMS-1929348).

Appendix A. Example Weights and Matrices

A.1 Example when $G = Z / 4 Z$ .

Below, we study the $λ$ -construction from Section 3, adapted for even order groups (so all order two elements are in X), for $G = Z / 4 Z$ . We denote this weight vector by $λ_{guess}$ and denote the corresponding matrix by $A_{guess}$ . As noted in Table 7, $rk A_{guess} = 15$ , which is not the empirical maximum. Through random sampling, we find in Table 2 that there are 112 chambers of the $Z / 4 Z$ -sunlet arrangement whose corresponding matrices achieve the maximal rank. We pick a weight vector, $λ_{max}$ , in one of these chambers so that the corresponding matrix, $A_{max}$ , has maximal rank and differs from $A_{guess}$ in exactly one column. For the $λ_{max}$ chosen here, the two matrices differ only in the column [[2], [1], [1]].

Funding

Open Access funding enabled and organized by Projekt DEAL.

Footnotes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

Barton T, Gross E, Long C, Rusinko J (2022) Statistical learning with phylogenetic network invariants. arXiv preprint arXiv:2211.11919
Bezanson J, Edelman A, Karpinski S, Shah VB (2017) Julia: A fresh approach to numerical computing. SIAM Rev 59(1):65–98
Cummings J, Gross E, Hollering B, Martin S, Nometa I (2023) The Pfaffian structure of CFN phylogenetic networks. arXiv preprint arXiv:2312.07450
Cummings J, Hollering B (2026) Computing implicitizations of multi-graded polynomial maps. J Symb Comput 132:102459
Cummings J, Hollering B, Manon C (2024) Invariants for level-1 phylogenetic networks under the Cavendar-Farris-Neyman model. Adv Appl Math 153:102633
Dayhoff MO, Schwartz RM, Orcutt BC (1978) Atlas of protein sequences and structure, Vol 5, chapter A model of evolutionary change in proteins, pages 345–352. National Biomedical Research Foundation
Draisma J (2008) A tropical approach to secant dimensions. J Pure Appl Algebra 212(2):349–363
Evans SN, Speed TP (1993) Invariants of some probability models used in phylogenetic inference. Ann Stat 21(1):355–377
Goldman N, Thorne JL, Jones DT (1998) Assessing the impact of secondary structure and solvent accessibility on protein evolution. Genetics 149(1):445–458
Goldman N, Yang Z (1994) A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol 11:725–736
Grayson DR, Stillman ME (2022) Macaulay2, Version 1.20, http://www.math.uiuc.edu/Macaulay2/
Gross E, Krone R, Martin S (2024) Dimensions of level-1 group-based phylogenetic networks. Bull Math Biol 86(8):90
Gross E, Long C (2018) Distinguishing phylogenetic networks. SIAM J Appl Algebra Geom 2(1):72–93
Gross E, van Iersel L, Janssen R, Jones M, Long C, Murakami Y (2021) Distinguishing level-1 phylogenetic networks on the basis of data generated by Markov processes. J Math Biol 83(3):32
Hollering B, Sullivant S (2021) Identifiability in phylogenetics using algebraic matroids. J Symb Comput 104:142–158
Kosiol C, Goldman N, Buttimore NH (2004) A new criterion and method for amino acid classification. J Theor Biol 228:97–106
Martin S, Moulton V, Leggett RM (2023) Algebraic invariants for inferring 4-leaf semi-directed phylogenetic networks. bioRxiv:2023.09.11.557152
Nakhleh L (2011) Problem Solving Handbook in Computational Biology and Bioinformatics, chapter Evolutionary Phylogenetic Networks: Models and Issues, pages 125–158. Springer Science+Business Media, LLC
Oscar (2024) OSCAR – Open Source Computer Algebra Research system, Version 1.0.0
Sturmfels B (1996) Gröbner Bases and Convex Polytopes, vol 8. University Lecture Series. American Mathematical Society, Providence, RI
Sturmfels B, Sullivant S (2005) Toric ideals of phylogenetic invariants. J Comput Biol 12(4):457–481
Sullivant S (2018) Algebraic Statistics, vol 194. Graduate Studies in Mathematics. American Mathematical Society, Providence, RI
Székely LA, Steel MA, Erdős PL (1993) Fourier calculus on evolutionary trees. Adv Appl Math 14:200–216
