Point configurations, phylogenetic trees, and dissimilarity vectors

Alessio Caminata; Noah Giansiracusa; Han-Bom Moon; Luca Schaffler

doi:10.1073/pnas.2021244118

2021 Mar 15;118(12):e2021244118.

PMCID: PMC8000033 PMID: 33723055

Significance

Motivated by the desire to estimate phylogenetic trees from data on collections of taxa rather than just pairs, in 2004 a new variant of phylogenetic tree reconstruction was introduced, and two important theoretical questions were asked about it. One of these questions was solved within a few years; the other remained open until now. We resolve this second question and in doing so significantly enlarge a bridge between phylogenetics and a rapidly developing area of contemporary mathematics called tropical geometry.

Keywords: phylogenetic tree, dissimilarity vector, Grassmannian, tropical geometry, rational normal curve

Abstract

In 2004, Pachter and Speyer introduced the higher dissimilarity maps for phylogenetic trees and asked two important questions about their relation to the tropical Grassmannian. Multiple authors, using independent methods, answered the first of these questions affirmatively, showing that dissimilarity vectors lie on the tropical Grassmannian, but the second question, whether the set of dissimilarity vectors forms a tropical subvariety, remained open. We resolve this question by showing that the tropical balancing condition fails. However, by replacing the definition of the dissimilarity map with a weighted variant, we show that weighted dissimilarity vectors form a tropical subvariety of the tropical Grassmannian in exactly the way that Pachter and Speyer envisioned. Moreover, we provide a geometric interpretation in terms of configurations of points on rational normal curves and construct a finite tropical basis that yields an explicit characterization of weighted dissimilarity vectors.

In one of the first papers on tropical geometry, Speyer and Sturmfels (1) introduced the tropical Grassmannian and showed that ${G r}^{t r o p} (2, n) \subseteq R^{(\binom{n}{2})}$ coincides with the space of $n$ -leaf phylogenetic trees, a tropical analogue of the moduli space of stable rational $n$ -pointed curves that plays an important role in genomics. With this Euclidean embedding, each phylogenetic tree is identified with its dissimilarity vector, the $(\binom{n}{2})$ -tuple of path lengths connecting each pair of the $n$ leaves.

Pachter and Speyer (2) generalized this embedding by introducing the higher dissimilarity maps: for each integer $r$ with $2 \leq r \leq \frac{n + 1}{2}$ they showed that any phylogenetic tree can be recovered from its $r$ -dissimilarity vector, the $(\binom{n}{r})$ -tuple recording the sum of edge lengths in the subtree spanned by each subset of $r$ leaves. They also stated two questions concerning the possible tropical geometry of these higher dissimilarity maps: 1) Is the space of $r$ -dissimilarity vectors in $R^{(\binom{n}{r})}$ contained in the tropical Grassmannian ${G r}^{t r o p} (r, n)$ ? If so, then 2) is there a rational map $G r (2, n) \dashrightarrow G r (r, n)$ whose image tropicalizes to yield the space of $r$ -dissimilarity vectors? The first question was answered positively by several authors using distinct methods (3–5), whereas the second question has remained open except in the case $r = 3$ , which was confirmed in the original paper (2). There have been numerous papers studying other aspects of Pachter and Speyer’s higher dissimilarity maps as well (e.g., refs. 6–13).

In this paper we resolve the second question of Pachter and Speyer and introduce and study a variant of the higher dissimilarity maps that is more compatible with tropical geometry.

1. Statement of Results

By direct calculation we provide a negative answer to the second tropical question of Pachter and Speyer (recall that the first open case is $r = 4$ , $n = 7$ ).

Theorem 1.1. For $n = 7$ the space of four-dissimilarity vectors in $R^{(\binom{7}{4})}$ is a polyhedral complex that is not balanced, for any choice of weights on the facets, and hence is not a tropical variety (see also Theorem 3.1).

However, this is not the end of the story. The rational map $G r (2, n) \dashrightarrow G r (3, n)$ in ref. 2, which motivated their second tropical question, does not tropicalize to a map sending the two-dissimilarity vector of each phylogenetic tree to the corresponding three-dissimilarity vector; as Pachter and Speyer point out, the output is twice the corresponding three-dissimilarity vector. This generalizes to a rational map $G r (2, n) \dashrightarrow G r (r, n)$ whose tropicalization sends the two-dissimilarity vector of a phylogenetic tree to the $(\binom{n}{r})$ -tuple recording, for each size $r$ subset of the $n$ leaves, the sum of all path lengths connecting all pairs of leaves in this subset. It is just a coincidence that for $r = 3$ these two different notions of subtree weights differ by a scalar. We call these $(\binom{n}{r})$ -tuples defined using path lengths within subtrees weighted $r$ -dissimilarity vectors, and we call the map sending a phylogenetic tree to its weighted $r$ -dissimilarity vector the weighted $r$ -dissimilarity map. While for $r > 3$ the original $r$ -dissimilarity vectors do not have the tropical geometry interpretation Pachter and Speyer had hoped for, it turns out these weighted variants do.

Theorem 1.2. For $2 \leq r \leq n - 2$ , the weighted $r$ -dissimilarity map embeds the space of phylogenetic trees as a tropical subvariety in $R^{(\binom{n}{r})}$ . This tropical variety is the tropicalization of a subvariety of $G r (r, n)$ that is both 1) the image of a natural rational map $G r (2, n) \dashrightarrow G r (r, n)$ and 2) the Gelfand–MacPherson correspondence applied to the open subvariety of ${(P^{r - 1})}^{n}$ parameterizing configurations of $n$ distinct points that lie on a rational normal curve in $P^{r - 1}$ (see also Theorem 4.1 and Proposition 5.1).

The equations for the Zariski closure of the locus in ${(P^{r - 1})}^{n}$ mentioned in the preceding theorem were studied in ref. 14. While they are not known in full generality, we prove here that a particularly simple subset of the defining equations, after applying the Gelfand–MacPherson correspondence, yields a tropical basis for an ideal whose set-theoretic vanishing locus is the subvariety of $G r (r, n)$ alluded to in the preceding theorem. As a consequence of this tropical basis result, we obtain the following characterization of weighted dissimilarity vectors, generalizing the classic tree-metric theorem for two-dissimilarity vectors.

Theorem 1.3. Fix $2 \leq r \leq n - 2$ . A vector $w = {(w_{I})}_{I \in (\binom{[n]}{r})} \in R^{(\binom{n}{r})}$ is a weighted $r$ -dissimilarity vector if and only if the following two conditions hold:

1) for each 4-tuple $\{i, j, k, l\} \subseteq [n]$ , there exists an $A \subseteq [n] \setminus \{i, j, k, l\}$ of size $r - 2$ such that two of the following expressions equal each other and are greater than or equal to the third:

w_{i j A} + w_{k l A}, w_{i k A} + w_{j l A}, w_{i l A} + w_{j k A};

2) for each $I \in (\binom{[n]}{6})$ , $J \in (\binom{[n] \setminus I}{r - 3})$ , and for each cube $C$ on $I$ (see section 5) with corresponding bipartition $B, W$ we have

\sum_{K \in B} w_{J \sqcup K} = \sum_{K \in W} w_{J \sqcup K} .

(See also Corollary 5.1.)

The case $r = 2$ is a main result of ref. 1, and our proof relies on their result; in both this case and the case $r = n - 2$ , condition 2 here is vacuous because $(\binom{[n] \setminus I}{r - 3}) = \emptyset$ . In general, this characterization does not provide a minimal, nonredundant set of conditions, and indeed, our proof suggests an algorithmic approach for reducing the number of conditions of type 2 that need to be checked.

Remark 1.1: In ref. 1 it is shown that the quadratic Plücker relations do not form a tropical basis for the ideal of $G r (r, n)$ when $r \geq 3$ and $n \geq 7$ , and in general, the tropical Grassmannian depends on the characteristic of the base field. It is interesting to contrast with the present situation where the tropical subvariety of ${G r}^{t r o p} (r, n)$ parameterizing weighted $r$ -dissimilarity vectors, and the tropical basis we construct for it, is independent of the base field.

2. Background and Preliminaries

We begin with some conventions. We work over an algebraically closed field $𝕜$ of arbitrary characteristic, equipped with the trivial valuation. For a subvariety $X \subseteq P^{N - 1}$ of projective space, we denote by $X ° := X^{\mathrm{aff}} \cap {(𝕜^{\times})}^{N}$ the restriction of the affine cone over $X$ to the dense open torus in $A^{N}$ . Tropicalization sends subvarieties of the torus ${(𝕜^{\times})}^{N}$ to subsets of Euclidean space $R^{N}$ .

2.1. Phylogenetic Trees

For us, an $n$ -leaf phylogenetic tree is a connected graph, without cycles or vertices of degree 2, with $n$ leaves labeled by the integers $[n] := \{1, \dots, n\}$ , that is equipped with an $R$ -valued length on each edge such that all of the internal edges have nonnegative length. The set of $n$ -leaf phylogenetic trees with a fixed combinatorial tree as the underlying graph forms a half-space $R_{\geq 0}^{\# \mathrm{edges} - n} \times R^{n}$ , and by identifying trees having edges of length 0 with the trees obtained by deleting such edges these half-spaces are naturally glued together and form an abstract polyhedral complex that we shall denote by $T_{n}$ , known as the space of phylogenetic trees (15). An influential result of Speyer and Sturmfels is that the tropical Grassmannian

{G r}^{t r o p} (2, n) := T r o p (G r (2, n) °) \subseteq R^{(\binom{n}{2})}

coincides with the space of phylogenetic trees $T_{n}$ (theorem 3.4 in ref. 1).

Remark 2.1: A phylogenetic tree is sometimes defined to have edge lengths only on its internal edges. The space of such phylogenetic trees is the quotient of ${G r}^{t r o p} (2, n)$ by a linear subspace of dimension $n$ , and it coincides with the moduli space of tropical $n$ -pointed stable rational curves $M_{0, n}^{t r o p}$ somewhat analogous to Kapranov’s construction (16) of ${\bar{M}}_{0, n}$ as a (Chow) quotient of the Grassmannian $G r (2, n)$ by the maximal torus ${(𝕜^{\times})}^{n}$ (indeed, the linear subspace $R^{n}$ acting on ${G r}^{t r o p} (2, n)$ is the tropicalization of Kapranov’s torus action). Throughout this paper we include the noninternal edge lengths and hence work in $R^{(\binom{n}{2})}$ without taking this linear subspace quotient.

2.2. Dissimilarity Vectors and Maps

The map

d_{2} : T_{n} \to R^{(\binom{n}{2})},

sending each phylogenetic tree $T$ to the vector whose $(i < j)$ entry is the sum of edge lengths along the unique path in $T$ connecting leaf $i$ to leaf $j$ , is known as the dissimilarity map, and the output $d_{2} (T)$ is a dissimilarity vector. This map is injective (17), with image equal to ${G r}^{t r o p} (2, n)$ ; it identifies phylogenetic trees with dissimilarity vectors or, equivalently, points of the tropical Grassmannian (1).

The higher dissimilarity map

d_{r} : T_{n} \to R^{(\binom{n}{r})},

introduced in ref. 2 for $r \geq 3$ , sends $T$ to the higher dissimilarity vector whose $I$ entry, for $I \in (\binom{[n]}{r})$ , is the sum of edge lengths among all edges in the subtree spanned by the $r$ leaves indexed by $I$ ; it is injective for $2 \leq r \leq \frac{n + 1}{2}$ (theorem in section 2 in ref. 2). Since a tree spanned by two leaves is a path, the $r = 2$ case of this map coincides with the dissimilarity map in the preceding paragraph.
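The two definitions above can be made concrete with a small sketch. It assumes a hypothetical encoding that is not taken from the paper: each edge is stored as one side of the leaf bipartition it induces together with its length, so an edge belongs to the subtree spanned by $I$ exactly when $I$ has leaves on both sides. The five-leaf tree and its edge lengths are made up for illustration.

```python
from itertools import combinations

N = 5
# Hypothetical example tree: cherries {1,2} and {4,5}, with leaf 3 in the middle.
# Each edge is (one side of its leaf split, length); this encoding is an assumption.
EDGES = [
    (frozenset({1}), 1.0), (frozenset({2}), 2.0), (frozenset({3}), 1.5),
    (frozenset({4}), 0.5), (frozenset({5}), 1.0),
    (frozenset({1, 2}), 3.0),   # internal edge separating {1,2} from {3,4,5}
    (frozenset({4, 5}), 2.0),   # internal edge separating {4,5} from {1,2,3}
]

def d(r, edges=EDGES, n=N):
    """r-dissimilarity vector: total edge length of the subtree spanned by each r-subset."""
    out = {}
    for I in combinations(range(1, n + 1), r):
        s = set(I)
        # An edge lies in the spanned subtree iff I has leaves on both sides of it.
        out[I] = sum(length for side, length in edges if (s & side) and (s - side))
    return out

d2 = d(2)   # path lengths between pairs of leaves
d3 = d(3)   # subtree lengths for triples of leaves
print(d2[(1, 2)], d2[(1, 4)], d3[(1, 2, 3)])
```

For $r = 2$ the spanned subtree is the path between the two leaves, so `d(2)` is the ordinary dissimilarity vector, in line with the remark above.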

Pachter and Speyer asked two questions about these higher dissimilarity maps (problems 3 and 4 in ref. 2): 1) is the image of $d_{r}$ contained in ${G r}^{t r o p} (r, n)$ , and if so, then 2) is there a rational map $G r (2, n) \dashrightarrow G r (r, n)$ whose image, viewed as a subvariety of ${(𝕜^{\times})}^{(\binom{n}{r})}$ by taking the affine cone over the Plücker embedding and then intersecting with the big torus in $A^{(\binom{n}{r})}$ , tropicalizes to the space of higher dissimilarity vectors $d_{r} (T_{n})$ ? Various authors, cited above in the Introduction, resolved the first of these questions in the affirmative. For the second question, there has been progress in characterizing the image of $d_{r}$ (13), even in terms of a piecewise linear map that appears related to tropical geometry (8), but the only case that had been fully resolved is $r = 3$ , where in section 3 in ref. 2, it is observed that the rational map $G r (2, n) \dashrightarrow G r (3, n)$ induced by applying the second Veronese map to the columns of a $2 \times n$ matrix achieves the desired goal. This Pachter and Speyer map can be generalized as follows.

Definition 2.1: The matrix morphism $A^{2 n} \to A^{r n}$ ,

(\begin{matrix} x_{1} & x_{2} & \dots & x_{n} \\ y_{1} & y_{2} & \dots & y_{n} \end{matrix}) \mapsto (\begin{matrix} x_{1}^{r - 1} & x_{2}^{r - 1} & \dots & x_{n}^{r - 1} \\ x_{1}^{r - 2} y_{1} & x_{2}^{r - 2} y_{2} & \dots & x_{n}^{r - 2} y_{n} \\ x_{1}^{r - 3} y_{1}^{2} & x_{2}^{r - 3} y_{2}^{2} & \dots & x_{n}^{r - 3} y_{n}^{2} \\ \vdots & \vdots & & \vdots \\ y_{1}^{r - 1} & y_{2}^{r - 1} & \dots & y_{n}^{r - 1} \end{matrix}) \qquad [1]

given by applying the $(r - 1)$ -Veronese map to each column, descends to a rational map of Grassmannians $G r (2, n) \dashrightarrow G r (r, n)$ that we call the column-wise $(r - 1)$ -Veronese Grassmannian map, or simply Veronese Grassmannian map.

The fact that this matrix map descends to the Grassmannians follows from the elementary observation that the image of each ${G L}_{2}$ orbit is contained in a ${G L}_{r}$ orbit. Note, however, that the image of a full-rank matrix need not be a full-rank matrix, so at the level of Grassmannians this really is just a rational map and not a regular morphism; for instance, the full-rank matrix

(\begin{matrix} 1 & 0 & 0 & \dots & 0 \\ 0 & 1 & 0 & \dots & 0 \end{matrix})

is sent to the following nonfull-rank matrix:

(\begin{matrix} 1 & 0 & 0 & \dots & 0 \\ 0 & 0 & 0 & \dots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 1 & 0 & \dots & 0 \end{matrix}) .

This column-wise Veronese Grassmannian map will play a central role in our paper.

3. Resolution of Pachter and Speyer’s Second Question

Recall that for each tree $G$ underlying an $n$ -leaf phylogenetic tree (meaning $G$ has leaves labeled by $1, \dots, n$ but the edges do not carry weights), there is a polyhedral cone in the space of phylogenetic trees $T_{n}$ , let us call it $T_{n}^{G}$ , parameterizing phylogenetic trees on $G$ . The restriction of the dissimilarity map $d_{r}$ to each such polyhedral cone is linear, and so the image $d_{r} (T_{n}^{G})$ is a polyhedral cone in $R^{(\binom{n}{r})}$ . By varying $G$ , the polyhedral cones $d_{r} (T_{n}^{G})$ provide a polyhedral decomposition of the space of $r$ -dissimilarity vectors.

For each edge $E$ of $G$ , let $T_{E}$ be the phylogenetic tree on $G$ where $E$ has length 1 and all of the other edges have length 0. In phylogenetics, such trees $T_{E}$ are called split metrics. Then the polyhedral cone $d_{r} (T_{n}^{G})$ consists of all $R$ -linear combinations of the vectors $d_{r} (T_{E})$ such that the coefficient on $d_{r} (T_{E})$ is nonnegative whenever $E$ is an internal edge.

To show that the space of $r$ -dissimilarity vectors $d_{r} (T_{n})$ for $r > 3$ is not the tropicalization of the image of a map $G r (2, n) \dashrightarrow G r (r, n)$ , we show a stronger result: $d_{r} (T_{n})$ is not even a tropical variety in general. This is because as a polyhedral complex, $d_{r} (T_{n})$ is not balanced for $r \geq 4$ (for the definition of balanced see definition 3.3.1 in ref. 18), and tropical varieties are balanced by theorem 3.3.5 in ref. 18. We check this explicitly in the first nontrivial case.

Theorem 3.1. The 11-dimensional polyhedral complex $d_{4} (T_{7}) \subseteq R^{(\binom{7}{4})} = R^{35}$ is not balanced for any choice of facet weights, and hence, it is not a tropical variety.

Proof: Consider the graph $G$ in Fig. 1 with a unique vertex of degree 4. This corresponds to a codimension-one cone $σ : = d_{4} (T_{7}^{G})$ that is the common face of three maximal-dimensional cones, call them $τ_{1}, τ_{2}, τ_{3}$ . Each $τ_{i}$ corresponds to the graph obtained by inserting an edge $E_{i}$ separating the four incident edges in $G$ into two pairs of coincident edges (Fig. 2). Since the edge $E_{i}$ is internal, the cone $τ_{i}$ is the $R_{\geq 0}$ span of $σ$ and the vector $ρ_{i} : = d_{4} (T_{E_{i}})$ . For $d_{4} (T_{7})$ to be balanced along $σ$ it is necessary that $ρ_{1}, ρ_{2}, ρ_{3}$ are linearly dependent modulo the subspace $⟨ σ ⟩$ spanned by $σ$ .

Fig. 2. — The three graphs whose corresponding 11-dimensional cones $τ_{1}, τ_{2}, τ_{3}$ meet along the common face $σ$ .

Consider the $13 \times 35$ matrix whose columns are indexed by the size 4 subsets of [7] (ordered lexicographically: $\{1,2,3,4\} < \{1,2,3,5\} < \dots < \{4,5,6,7\}$ ), whose first 10 rows are the images under $d_{4}$ of the split metrics defined by the edges in $G$ , and whose last three rows are the images of the split metrics defined by the edges $E_{i}$ :

(\begin{matrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 & 1 & 1 \end{matrix})

The last three rows are the vectors $ρ_{i}$ (the only entry of $ρ_{i}$ not equal to 1 is in the column indexed by the unique 4-tuple of vertices whose induced subgraph does not contain $E_{i}$ ). With computer assistance we check that this matrix has full rank. This implies that the last three rows are linearly independent modulo the subspace $⟨ σ ⟩$ spanned by the first 10 rows, which shows that tropical balancing is not possible at $σ$ . $□$
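The rank computation can be reproduced with a short script; the following is a sketch under stated assumptions, not the authors' own computation. The shape of $G$ is inferred here from the zero pattern of the matrix above (cherries $\{1,2\}$, $\{3,4\}$, $\{5,6\}$ and leaf 7 meeting at the degree-4 vertex), since Fig. 1 is not reproduced in this text, and each edge is encoded by one side of its leaf split. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

LEAVES = range(1, 8)
SUBSETS = list(combinations(LEAVES, 4))   # 35 columns in lexicographic order

# One side of the leaf split for every edge (assumed tree shape, see lead-in).
EDGES_G = [{i} for i in LEAVES] + [{5, 6}, {3, 4}, {1, 2}]   # rows 1-10
EDGES_E = [{1, 2, 3, 4}, {1, 2, 5, 6}, {3, 4, 5, 6}]         # rows 11-13

def split_row(side):
    """d_4 of a split metric: 1 if the 4-subset has leaves on both sides of the edge."""
    return [1 if (set(I) & side and set(I) - side) else 0 for I in SUBSETS]

def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            if factor != 0:
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

matrix = [split_row(side) for side in EDGES_G + EDGES_E]
print(rank(matrix[:10]), rank(matrix))
```

Exact rational arithmetic is used so that the rank does not depend on a floating-point tolerance.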

4. Weighted Dissimilarity Vectors

In this section we tropicalize the Veronese Grassmannian map from Definition 2.1 and show that the image of the tropicalized map is the space of phylogenetic trees, embedded by the weighted dissimilarity vectors that we introduce in this paper. One of the main steps is to recognize the Veronese Grassmannian map as the restriction of a monomial map of tori; this crucially avails us of functoriality of tropicalization.

4.1. Coordinatizing the Veronese Grassmannian Map

Recall that the Veronese Grassmannian map $G r (2, n) \dashrightarrow G r (r, n)$ in Definition 2.1 is expressed in terms of a matrix map of affine spaces $A^{2 n} \to A^{r n}$ . In order to tropicalize it we need to coordinatize the induced map on Grassmannians in their Plücker embeddings. Since these Grassmannians are obtained as $G L$ quotients, this means expressing the matrix map in terms of homogeneous collections of $S L$ invariants, i.e., maximal minors. We do this by defining a morphism of tori ${(𝕜^{\times})}^{(\binom{n}{2})} \to {(𝕜^{\times})}^{(\binom{n}{r})}$ that restricts to the Veronese Grassmannian map.

Remark 4.1: Since technically, in this paper we tropicalize projective varieties by first lifting to affine cones and then restricting to dense tori, by a slight abuse of terminology we shall use the term Veronese Grassmannian map to refer to the rational map $G r (2, n) \dashrightarrow G r (r, n)$ in Definition 2.1 as well as the induced morphism $G r (2, n) ° \to G r (r, n) °$ (the fact that the latter is indeed a regular morphism follows from Proposition 4.1 below); the context will always make clear which meaning is intended.

Definition 4.1: For each $2 \leq r \leq n$ , let

φ_{r} : {(𝕜^{\times})}^{(\binom{n}{2})} \to {(𝕜^{\times})}^{(\binom{n}{r})}

be the group scheme morphism induced from the $𝕜$ -algebra homomorphism

\begin{matrix} φ_{r}^{*} : 𝕜 [x_{I}^{\pm}] & \to & 𝕜 [x_{i j}^{\pm}] \\ x_{I} & \mapsto & \prod_{i, j \in I, i < j} x_{i j} . \end{matrix}

Proposition 4.1. The monomial morphism $φ_{r}$ restricts to the Veronese Grassmannian map

G r (2, n) ° \to G r (r, n) ° .

Proof: By the first fundamental theorem of invariant theory, we need to see how the maximal minors of the right-hand matrix in Eq. 1 depend on the maximal minors of the left-hand matrix. However, the right-hand matrix is just a Vandermonde matrix where the columns have been homogenized, so for any collection $I \in (\binom{[n]}{r})$ of columns the corresponding maximal minor is

\prod_{i, j \in I, i < j} (x_{i} y_{j} - x_{j} y_{i}) = \prod_{i, j \in I, i < j} m_{i j},

where $m_{i j}$ denotes the $i j$ -maximal minor of the left-hand matrix. This shows the restricted morphism $φ_{r} |_{G r (2, n) °}$ is indeed induced by the column-wise Veronese map. $□$
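The homogenized Vandermonde identity used in this proof can be spot-checked on integer data. The sketch below uses an arbitrary sample $2 \times 5$ matrix and $r = 4$ (both chosen only for illustration) and a Leibniz-formula determinant, which is exact for integers and adequate for small $r$; the function names are hypothetical.

```python
from itertools import combinations, permutations
from math import prod

def det(m):
    """Exact determinant via the Leibniz formula (only sensible for small matrices)."""
    k = len(m)
    total = 0
    for p in permutations(range(k)):
        inversions = sum(p[a] > p[b] for a in range(k) for b in range(a + 1, k))
        total += (-1) ** inversions * prod(m[row][p[row]] for row in range(k))
    return total

def veronese_columns(xs, ys, r):
    """Rows x^(r-1), x^(r-2) y, ..., y^(r-1), with one column per (x_i, y_i)."""
    return [[x ** (r - 1 - k) * y ** k for x, y in zip(xs, ys)] for k in range(r)]

xs, ys, r = [1, 2, 3, 5, 7], [4, 1, -2, 3, 6], 4   # arbitrary sample input
big = veronese_columns(xs, ys, r)

checks = []
for I in combinations(range(len(xs)), r):
    minor = det([[row[i] for i in I] for row in big])
    # Product of the 2x2 minors m_ij = x_i y_j - x_j y_i over pairs i < j in I.
    product = prod(xs[i] * ys[j] - xs[j] * ys[i] for i, j in combinations(I, 2))
    checks.append((minor, product))

print(all(minor == product for minor, product in checks))
```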

Since $φ_{r}$ is a toric morphism, we can now apply functoriality of tropicalization for toric morphisms (corollary 3.2.13 in ref. 18) which tells us that the tropicalization of the closure (in ${(𝕜^{\times})}^{(\binom{n}{r})}$ ) of the image of the Veronese Grassmannian map coincides with the image of the tropicalized map $T r o p (φ_{r})$ restricted to the tropical Grassmannian ${G r}^{t r o p} (2, n)$ . As we discussed earlier, ${G r}^{t r o p} (2, n)$ is the space of two-dissimilarity vectors, and $T r o p (φ_{r})$ is the linear map described explicitly in the following proposition (whose proof is trivial). Our next steps are to go through this functoriality argument in detail and to interpret $T r o p (φ_{r})$ as sending two-dissimilarity vectors to the weighted dissimilarity vectors that we introduce next.

Proposition 4.2. The monomial morphism

φ_{r} : {(𝕜^{\times})}^{(\binom{n}{2})} \to {(𝕜^{\times})}^{(\binom{n}{r})}

tropicalizes to the linear map

T r o p (φ_{r}) : R^{(\binom{n}{2})} \to R^{(\binom{n}{r})}

whose $I$ component, for $I \in (\binom{[n]}{r})$ , is $\sum_{i, j \in I, i < j} x_{i j}$ .

4.2. Weighted Dissimilarity Maps

We begin this section with a weighted variant of the dissimilarity map.

Definition 4.2: For each $2 \leq r \leq n$ , let

d_{r}^{w t} : T_{n} \to R^{(\binom{n}{r})}

be the weighted dissimilarity map sending a phylogenetic tree $T$ to the weighted dissimilarity vector defined as follows. For each $I \in (\binom{[n]}{r})$ , let $T (I)$ be the $r$ -leaf subtree of $T$ spanned by the leaves indexed by $I$ , and let the $I$ component of $d_{r}^{w t} (T)$ be the sum of the entries of the dissimilarity vector $d_{2} (T (I))$ .

In other words, $d_{r}^{w t}$ records for each $r$ -leaf subtree the sum of all $(\binom{r}{2})$ path lengths in the subtree. The usual $r$ -dissimilarity map $d_{r}$ records for each $r$ -leaf subtree the sum of all edge lengths in the subtree, whereas $d_{r}^{w t}$ is a weighted variant because it counts each edge with multiplicity equal to the number of leaf-to-leaf paths in the subtree in which the edge occurs.

Remark 4.2: Note that $d_{3}^{w t} = 2 d_{3}$ since in a three-leaf tree, every edge is traversed exactly twice among the $(\binom{3}{2}) = 3$ possible leaf-to-leaf paths, whereas for $r > 3$ the usual and weighted dissimilarity maps are, in general, not simply scalar multiples of each other.

We will later show that the image of the weighted dissimilarity map is a tropical variety (Theorem 4.1) and hence in particular is a balanced polyhedral complex. Before getting to that general proof, one might be curious to see how the matrix used to establish nonbalancing in the proof of Theorem 3.1 changes when using the weighted dissimilarity map.

Example 4.1: Replacing every instance of the dissimilarity map $d_{4}$ with the weighted dissimilarity map $d_{4}^{w t}$ in the construction of the $13 \times 35$ matrix in the proof of Theorem 3.1 yields the following:

(\begin{matrix} 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 0 & 0 & 0 & 0 \\ 3 & 3 & 3 & 3 & 0 & 0 & 0 & 0 & 0 & 0 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 0 & 0 & 0 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 0 & 0 & 0 & 3 & 3 & 3 & 3 & 0 \\ 3 & 0 & 0 & 0 & 3 & 3 & 3 & 0 & 0 & 0 & 3 & 3 & 3 & 0 & 0 & 0 & 3 & 3 & 3 & 0 & 3 & 3 & 3 & 0 & 0 & 0 & 3 & 3 & 3 & 0 & 3 & 3 & 3 & 0 & 3 \\ 0 & 3 & 0 & 0 & 3 & 0 & 0 & 3 & 3 & 0 & 3 & 0 & 0 & 3 & 3 & 0 & 3 & 3 & 0 & 3 & 3 & 0 & 0 & 3 & 3 & 0 & 3 & 3 & 0 & 3 & 3 & 3 & 0 & 3 & 3 \\ 0 & 0 & 3 & 0 & 0 & 3 & 0 & 3 & 0 & 3 & 0 & 3 & 0 & 3 & 0 & 3 & 3 & 0 & 3 & 3 & 0 & 3 & 0 & 3 & 0 & 3 & 3 & 0 & 3 & 3 & 3 & 0 & 3 & 3 & 3 \\ 0 & 0 & 0 & 3 & 0 & 0 & 3 & 0 & 3 & 3 & 0 & 0 & 3 & 0 & 3 & 3 & 0 & 3 & 3 & 3 & 0 & 0 & 3 & 0 & 3 & 3 & 0 & 3 & 3 & 3 & 0 & 3 & 3 & 3 & 3 \\ 0 & 3 & 3 & 0 & 3 & 3 & 0 & 4 & 3 & 3 & 3 & 3 & 0 & 4 & 3 & 3 & 4 & 3 & 3 & 4 & 3 & 3 & 0 & 4 & 3 & 3 & 4 & 3 & 3 & 4 & 4 & 3 & 3 & 4 & 4 \\ 4 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 0 & 0 & 4 & 4 & 4 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 4 & 4 & 4 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 4 & 4 & 4 & 3 & 3 \\ 4 & 4 & 4 & 4 & 4 & 4 & 4 & 4 & 4 & 4 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 3 & 0 & 0 & 0 & 0 & 0 \\ 0 & 3 & 3 & 3 & 3 & 3 & 3 & 4 & 4 & 4 & 3 & 3 & 3 & 4 & 4 & 4 & 4 & 4 & 4 & 3 & 3 & 3 & 3 & 4 & 4 & 4 & 4 & 4 & 4 & 3 & 4 & 4 & 4 & 3 & 3 \\ 4 & 3 & 3 & 4 & 3 & 3 & 4 & 0 & 3 & 3 & 4 & 4 & 3 & 3 & 4 & 4 & 3 & 4 & 4 & 3 & 4 & 4 & 3 & 3 & 4 & 4 & 3 & 4 & 4 & 3 & 4 & 3 & 3 & 4 & 4 \\ 4 & 4 & 4 & 3 & 4 & 4 & 3 & 4 & 3 & 3 & 3 & 3 & 4 & 3 & 4 & 4 & 3 & 4 & 4 & 4 & 3 & 3 & 4 & 3 & 4 & 4 & 3 & 4 & 4 & 4 & 0 & 3 & 3 & 3 & 3 \end{matrix}) .

Recall that the first 10 rows are the images of the split metrics defined by the graph $G$ in Fig. 1, and the last 3 rows are the images of the split metrics defined by the edges $E_{i}$ in Fig. 2. Note that $d_{4}^{w t}$ and $d_{4}$ indeed are not scalar multiples of each other, but as expected, the locations of the zero entries in this matrix are the same as in the previous matrix. For this matrix, the first 10 rows are linearly independent, but the whole matrix has rank 12. Hence, the last three rows are linearly dependent modulo the linear subspace spanned by the previous 10, and this is what allows for balancing to hold here. Explicitly, the one-dimensional left kernel is spanned by the vector

(0,0,0,0,0,0,1,1,1,1, - 1, - 1, - 1),

which tells us that the sum of the images of the split metrics given by the three edges $E_{i}$ equals the sum of the images of the split metrics given by the four coincident edges in the graph $G$ .
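The relation encoded by this kernel vector can be checked directly. The sketch below again assumes the tree shape inferred from the zero pattern of the matrices (cherries $\{1,2\}$, $\{3,4\}$, $\{5,6\}$ and leaf 7 at the degree-4 vertex) and the encoding of an edge by one side of its leaf split; under that encoding the weighted entry of a split metric is the number of leaf pairs in $I$ separated by the edge.

```python
from itertools import combinations

SUBSETS = list(combinations(range(1, 8), 4))              # 35 columns, lexicographic
STAR = [{7}, {5, 6}, {3, 4}, {1, 2}]                      # rows 7-10: edges at the degree-4 vertex
INSERTED = [{1, 2, 3, 4}, {1, 2, 5, 6}, {3, 4, 5, 6}]     # rows 11-13: the edges E_i

def weighted_row(side):
    """Weighted d_4 of a split metric: number of leaf pairs of I separated by the edge."""
    return [len(set(I) & side) * len(set(I) - side) for I in SUBSETS]

def plain_row(side):
    """Unweighted d_4 of a split metric: 1 if I has leaves on both sides of the edge."""
    return [int(bool(set(I) & side) and bool(set(I) - side)) for I in SUBSETS]

def column_sums(rows):
    return [sum(col) for col in zip(*rows)]

weighted_balanced = (column_sums(map(weighted_row, STAR))
                     == column_sums(map(weighted_row, INSERTED)))
plain_balanced = (column_sums(map(plain_row, STAR))
                  == column_sums(map(plain_row, INSERTED)))
print(weighted_balanced, plain_balanced)
```

The same row sums computed with the unweighted map disagree, for instance in the column indexed by $\{1,3,5,7\}$, which is consistent with Theorem 3.1.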

The following proposition, whose proof follows immediately from the definition and Proposition 4.2, plays a fundamental role in this paper (indeed, we were led to the definition of the weighted dissimilarity map primarily so that this holds).

Proposition 4.3. The weighted dissimilarity map factors as follows:

d_{r}^{w t} = T r o p (φ_{r}) ○ d_{2} .
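The factorization can be illustrated numerically. In the sketch below the six-leaf caterpillar tree, its integer edge lengths, and the encoding of each edge by one side of its leaf split are all assumptions made for the example. The weighted vector is computed edge by edge and compared with the linear map of Proposition 4.2 applied to the two-dissimilarity vector.

```python
from itertools import combinations

N = 6
# Hypothetical caterpillar tree, stored as (one side of the edge's leaf split, length).
EDGES = [
    ({1}, 2), ({2}, 1), ({3}, 4), ({4}, 3), ({5}, 1), ({6}, 5),   # pendant edges
    ({1, 2}, 2), ({1, 2, 3}, 1), ({5, 6}, 3),                     # internal edges
]

def d2():
    """Two-dissimilarity vector: path length between each pair of leaves."""
    return {
        (i, j): sum(w for side, w in EDGES if (i in side) != (j in side))
        for i, j in combinations(range(1, N + 1), 2)
    }

def trop_phi(r, x):
    """Linear map of Proposition 4.2: sum the pair coordinates inside each r-subset."""
    return {I: sum(x[p] for p in combinations(I, 2))
            for I in combinations(range(1, N + 1), r)}

def d_weighted(r):
    """Definition 4.2 computed edge by edge: each edge is counted once for every
    leaf-to-leaf path of the spanned subtree that crosses it."""
    return {I: sum(w * len(set(I) & side) * len(set(I) - side) for side, w in EDGES)
            for I in combinations(range(1, N + 1), r)}

factorization_holds = all(d_weighted(r) == trop_phi(r, d2()) for r in range(2, N + 1))
print(factorization_holds)
```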

Accordingly, in order to better understand the weighted dissimilarity map, we need to first establish a key property of the linear map $T r o p (φ_{r})$ .

Lemma 4.1. For $r \leq n - 2$ the map $T r o p (φ_{r})$ is injective, and for $r \in \{2, n - 2\}$ it is bijective.

Proof: This is trivial for $r = 2$ since $φ_{2}$ is the identity map, so assume $r \geq 3$ . Let $M$ be the matrix associated to $T r o p (φ_{r})$ , namely,

M_{I J} = \begin{cases} 1 & \text{if } J \subseteq I \\ 0 & \text{otherwise} . \end{cases} \qquad [2]

We will construct an explicit left-inverse of $M$ . Define the $(\binom{n}{2}) \times (\binom{n}{r})$ matrix $M^{+}$ by

M_{J I}^{+} = {(- 1)}^{i} \frac{r - 2}{r - i} \cdot \frac{1}{(\binom{n - 2}{r - i})},

where $i = | I \cap J |$ . That is, for $J \in (\binom{[n]}{2})$ and $I \in (\binom{[n]}{r})$ we have

M_{J I}^{+} = \begin{cases} \frac{1}{(\binom{n - 2}{r - 2})} & \text{if } J \subseteq I \\ - \frac{r - 2}{r - 1} \cdot \frac{1}{(\binom{n - 2}{r - 1})} & \text{if } | J \cap I | = 1 \\ \frac{r - 2}{r} \cdot \frac{1}{(\binom{n - 2}{r})} & \text{if } J \cap I = \emptyset . \end{cases}

We will show that $M^{+} M = I d$ by directly calculating its entries. First of all,

{(M^{+} M)}_{J J} = \sum_{I} M_{J I}^{+} M_{I J} = \sum_{I \supset J} M_{J I}^{+} = \sum_{I \supset J} \frac{1}{(\binom{n - 2}{r - 2})} = \frac{1}{(\binom{n - 2}{r - 2})} (\binom{n - 2}{r - 2}) = 1 .

For $J, K \in (\binom{[n]}{2})$ with $J \neq K$ , we have

{(M^{+} M)}_{J K} = \sum_{I} M_{J I}^{+} M_{I K} = \sum_{I \supset J, K} \frac{1}{(\binom{n - 2}{r - 2})} - \sum_{| I \cap J | = 1, I \supset K} \frac{r - 2}{r - 1} \cdot \frac{1}{(\binom{n - 2}{r - 1})} + \sum_{I \cap J = \emptyset, I \supset K} \frac{r - 2}{r} \cdot \frac{1}{(\binom{n - 2}{r})} .

If $| J \cap K | = 1$ , then the condition in the third summation is impossible since $J \cap K \neq \emptyset$ , so

{(M^{+} M)}_{J K} = \frac{(\binom{n - 3}{r - 3})}{(\binom{n - 2}{r - 2})} - \frac{r - 2}{r - 1} \cdot \frac{(\binom{n - 3}{r - 2})}{(\binom{n - 2}{r - 1})} = 0,

where the last equality follows from an elementary calculation. If instead $J \cap K = \emptyset$ , then

{(M^{+} M)}_{J K} = \frac{(\binom{n - 4}{r - 4})}{(\binom{n - 2}{r - 2})} - \frac{r - 2}{r - 1} \cdot \frac{2 (\binom{n - 4}{r - 3})}{(\binom{n - 2}{r - 1})} + \frac{r - 2}{r} \cdot \frac{(\binom{n - 4}{r - 2})}{(\binom{n - 2}{r})} = 0,

where again the last equality is an elementary calculation.

The equality $\dim R^{(\binom{n}{2})} = \dim R^{(\binom{n}{n - 2})}$ then implies surjectivity when $r = n - 2$ . $□$
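The identity $M^{+} M = I d$ can be confirmed for small parameters with exact rational arithmetic. This is a brute-force sketch rather than part of the proof; the function name and the sample values of $(n, r)$ are chosen only for illustration, and the coefficients follow the three-case formula for $M^{+}$ given above.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def left_inverse_check(n, r):
    """Return True if M^+ M is the identity, for 3 <= r <= n - 2."""
    pairs = list(combinations(range(n), 2))
    subsets = list(combinations(range(n), r))
    # Entry of M^+ as a function of |I ∩ J|, following the three-case formula.
    coeff = {
        2: Fraction(1, comb(n - 2, r - 2)),
        1: -Fraction(r - 2, r - 1) / comb(n - 2, r - 1),
        0: Fraction(r - 2, r) / comb(n - 2, r),
    }
    for J in pairs:
        for K in pairs:
            # (M^+ M)_{JK}: sum of M^+_{JI} over the r-subsets I that contain K.
            entry = sum(coeff[len(set(I) & set(J))]
                        for I in subsets if set(K) <= set(I))
            if entry != (1 if J == K else 0):
                return False
    return True

for n, r in [(6, 3), (6, 4), (7, 4), (7, 5)]:
    print(n, r, left_inverse_check(n, r))
```

The check is restricted to $r \leq n - 2$ because the coefficient for $J \cap I = \emptyset$ divides by $\binom{n - 2}{r}$ , which vanishes for larger $r$ .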

Remark 4.3: By Proposition 4.3, the matrix $M^{+}$ constructed in the preceding proof, when viewed as a linear map $R^{(\binom{n}{r})} \to R^{(\binom{n}{2})}$ , sends the weighted $r$ -dissimilarity vector of a phylogenetic tree to the corresponding two-dissimilarity vector.

Corollary 4.1. For $r \leq n - 2$ , the weighted dissimilarity map $d_{r}^{w t} : T_{n} \to R^{(\binom{n}{r})}$ is injective.

Proof: Lemma 4.1 and Proposition 4.3, together with the fact that the two-dissimilarity map is injective, show that $d_{r}^{w t}$ is a composition of injective maps and hence is injective. $□$

Corollary 4.2. For $r \leq n - 2$ , the space of phylogenetic trees $T_{n}$ and the space of weighted $r$ -dissimilarity vectors are isomorphic as combinatorial polyhedral complexes. Furthermore, if $r \leq \frac{n + 1}{2}$ , then they are also isomorphic to the space of $r$ -dissimilarity vectors.

Proof: This follows from the injectivity of $d_{r}^{w t}$ in Corollary 4.1, the additional injectivity of $d_{r}$ when $r \leq \frac{n + 1}{2}$ , and the observation that both maps are linear on each polyhedral stratum of $T_{n}$ . $□$

Although both the dissimilarity map and the weighted dissimilarity map provide Euclidean embeddings of the space of phylogenetic trees, we have seen in section 3 that the former embedding is not a tropical variety; we show in the following subsection that the latter embedding is tropical, and we use the Veronese Grassmannian map to produce an algebraic variety realizing it as a tropicalization.

4.3. Back to Pachter and Speyer’s Second Question

Recall that the second question of Pachter and Speyer, whether the space of $r$ -dissimilarity vectors is the tropicalization of the image of a rational map of Grassmannians, turned out to have a negative answer for the plain reason that the space of higher dissimilarity vectors is not a balanced polyhedral complex and hence cannot be a tropical variety. We now establish a positive answer to the variant of Pachter and Speyer’s second question where dissimilarity vectors are replaced with weighted dissimilarity vectors.

Theorem 4.1. For $r \leq n$ , the space of weighted $r$ -dissimilarity vectors is the tropicalization of the image of the Veronese Grassmannian map $G r (2, n) ° \to G r (r, n) °$ .

Proof: By functoriality of tropicalization with respect to toric morphisms (corollary 3.2.13 in ref. 18), we have that

T r o p (φ_{r}) ({G r}^{t r o p} (2, n)) = T r o p (\bar{φ_{r} (G r (2, n) °)}) .

By Proposition 4.1, $φ_{r} (G r (2, n) °)$ is the image of the Veronese Grassmannian map; by Lemma 4.2, below, this image is closed in the torus so we can ignore the Zariski closure in the right-hand side of this equality; and by Proposition 4.3, the left-hand side is $d_{r}^{w t} (T_{n})$ . $□$

Lemma 4.2. For $r \leq n$ , the image $φ_{r} (G r (2, n) °)$ is closed in ${(𝕜^{\times})}^{(\binom{n}{r})}$ .

Proof: Let $x \in \bar{φ_{r} (G r (2, n) °)} \subseteq {(𝕜^{\times})}^{(\binom{n}{r})}$ , and let $R$ be a DVR with field of fractions $K$ and residue field $𝕜$ such that we have a map $S p e c (R) \to \bar{φ_{r} (G r (2, n) °)}$ with $S p e c (K)$ mapping to $φ_{r} (G r (2, n) °)$ and $S p e c (𝕜)$ mapping to $x$ . Let $U \subseteq A^{r n}$ be the open locus of matrices all of whose maximal minors are nonzero. The ${S L}_{r}$ -quotient morphism $U \to G r (r, n) °$ is a locally trivial bundle in the Zariski topology, so we can lift $S p e c (R) \to \bar{φ_{r} (G r (2, n) °)}$ to a map $S p e c (R) \to U$ ; fix a choice of lift. This is a matrix over $R$ all of whose maximal minors are nonzero—so in particular, none of the columns of this matrix is the zero vector—and whose restriction to $S p e c (K)$ is, up to the ${S L}_{r}$ action, a matrix in the form shown in the right-hand side of Eq. 1.

Because none of the columns of this matrix is 0, it descends to an $R$ point of the ${(𝕜^{\times})}^{n}$ -quotient ${(P_{R}^{r - 1})}^{n}$ . The restriction of this latter $R$ point to $S p e c (K)$ is a configuration of $n$ points in $P_{K}^{r - 1}$ that lie on a rational normal curve because the map in Eq. 1 simply applies the $(r - 1)$ -Veronese map to each column, and the ${S L}_{r}$ action preserves the property of the configuration lying on a rational normal curve. Therefore, the induced $𝕜$ point is in the Zariski closure of the locus of $n$ points lying on a rational normal curve, and it is nondegenerate by the nonvanishing of maximal minors. So by proposition 2.7 in ref. 14 this $𝕜$ point is a configuration of $n$ points on a quasi-Veronese curve (a nondegenerate flat limit of rational normal curves; definition 2.5 in ref. 14), which we denote by $C \subseteq P^{r - 1}$ . We claim there is an actual rational normal curve $C^{'} \subseteq P^{r - 1}$ containing this $n$ -point configuration.

If $C$ is irreducible, then it is a rational normal curve, and we may set $C^{'} = C$ . Suppose not, i.e., $C$ is a reducible quasi-Veronese curve. We can then write $C = C_{1} \cup C_{2}$ where, by lemma 2.6 in ref. 14, $C_{1}$ and $C_{2}$ are connected, possibly reducible, quasi-Veronese curves of positive degrees $d_{1}$ and $d_{2}$ , respectively, with $d_{1} + d_{2} = r - 1$ . The same lemma shows that the projective linear subspace spanned by a degree $d_{i}$ quasi-Veronese curve is of dimension $d_{i}$ . It follows that the number of points lying on $C_{i}$ is at most $d_{i} + 1$ , for $i = 1,2$ , since otherwise, the points on $C_{i}$ would be linearly dependent and so any set of $r$ points containing these points would also be linearly dependent, contradicting the fact that all maximal minors of the corresponding matrix are nonzero. Consequently,

n \leq d_{1} + d_{2} + 2 = r + 1 .

Thus, we have at most $r + 1$ points in $P^{r - 1}$ , and they are in general linear position by the nonzero maximal minors condition, so Castelnuovo’s lemma (theorem 1.18 in ref. 19) implies the existence of a rational normal curve $C^{'}$ through all $n$ points, as claimed.

Any rational normal curve in $P^{r - 1}$ is in the ${G L}_{r}$ orbit of the standard Veronese rational normal curve $P^{1} ↪ P^{r - 1}$ . So, up to acting on the lift $S p e c (R) \to U$ by ${S L}_{r}$ , we can assume that $C^{'}$ is the standard Veronese rational normal curve. This implies that the corresponding limiting $𝕜$ point in $U$ is in the form shown in the right-hand side of Eq. 1, so its image $x$ under the ${S L}_{r}$ quotient $U \to G r (r, n) °$ is indeed in the image of $φ_{r}$ . $□$

Remark 4.4: In theorem 3.2 in ref. 8, Bocci and Cools introduce a piecewise linear map

ϕ^{(r)} : R^{(\binom{n}{2})} \to R^{(\binom{n}{r})}

that provides a factorization of the $r$ -dissimilarity map, namely, $d_{r} = ϕ^{(r)} ○ d_{2}$ . On the other hand, as shown in Proposition 4.3, our linear map $T r o p (φ_{r})$ provides a factorization of our weighted $r$ -dissimilarity map, namely, $d_{r}^{w t} = T r o p (φ_{r}) ○ d_{2}$ . Since $T r o p (φ_{r})$ is injective, we can choose a left inverse for it (such as the one explicitly constructed in the proof of Lemma 4.1), and then composing this with $ϕ^{(r)}$ yields a piecewise linear map $g_{r} : R^{(\binom{n}{r})} \to R^{(\binom{n}{r})}$ such that the following diagram commutes:

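[Commutative diagram (pnas.2021244118fx01): the maps $T r o p (φ_{r})$ and $ϕ^{(r)}$ out of $R^{(\binom{n}{2})}$ and the map $g_{r}$ between the two copies of $R^{(\binom{n}{r})}$ , with $g_{r} ○ T r o p (φ_{r}) = ϕ^{(r)}$ .]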

In particular, we obtain a factorization $d_{r} = g_{r} ○ d_{r}^{w t}$ . As we have seen, $d_{r}^{w t} (T_{n})$ is a tropical variety in $R^{(\binom{n}{r})}$ , whereas $d_{r} (T_{n})$ is not. Intuitively, the map $g_{r}$ tilts rays in the space of weighted dissimilarity vectors in such a way that certain collections of rays go from being linearly dependent to being linearly independent, and this is what destroys the balancing condition needed to be a tropical variety.

5. Tropical Bases and a Generalized Tree-Metric Theorem

Recall that $φ_{r} (G r (2, n) °) \subseteq G r (r, n) °$ is a closed subvariety (in the ambient torus ${(𝕜^{\times})}^{(\binom{n}{r})}$ ) whose tropicalization is the space of weighted $r$ -dissimilarity vectors $d_{r}^{w t} (T_{n}) \subseteq R^{(\binom{n}{r})}$ . In order to find tropical equations for the tropicalization of this subvariety—and hence a characterization of weighted dissimilarity vectors—we need to first find equations for the subvariety $φ_{r} (G r (2, n) °)$ itself.

5.1. Gelfand–MacPherson Correspondence

The proof of Lemma 4.2 shows that points of $φ_{r} (G r (2, n) °)$ correspond to configurations of $n$ points in $P^{r - 1}$ that lie on a rational normal curve. This correspondence is in essence the Gelfand–MacPherson correspondence, which identifies generic ${G L}_{r}$ orbits in ${(P^{r - 1})}^{n}$ with generic ${(𝕜^{\times})}^{n}$ orbits in $G r (r, n)$ and vice versa (cf. section 2.2 in ref. 16). In fact, we have the following.

Proposition 5.1. For $r \leq n$ , $φ_{r} (G r (2, n) °)$ corresponds under Gelfand–MacPherson to the open locus in ${(P^{r - 1})}^{n}$ of configurations of $n$ distinct points that lie on a rational normal curve.

Proof: The proof of Lemma 4.2 shows that each point of $φ_{r} (G r (2, n) °)$ corresponds to a configuration of $n$ points on a rational normal curve, and these points must be distinct since otherwise two columns in the matrix of coordinates would be proportional, and hence, any maximal minor containing these columns would be zero, contradicting the fact that all maximal minors are nonzero. Conversely, it is a classical fact (coming from the Vandermonde determinant) that any $r$ distinct points on a rational normal curve in $P^{r - 1}$ are linearly independent, so any configuration of $n$ distinct points on such a curve corresponds to a matrix all of whose maximal minors are nonzero, and as noted in the proof of Lemma 4.2, such a matrix yields a point of $φ_{r} (G r (2, n) °)$ . $□$

In particular, any ${S L}_{r}$ -invariant polynomial that vanishes on the locus in ${(P^{r - 1})}^{n}$ of configurations lying on a rational normal curve corresponds to a ${(𝕜^{\times})}^{n}$ -invariant polynomial that vanishes on $φ_{r} (G r (2, n) °)$ . In other words, to find the ideal defining $φ_{r} (G r (2, n) °)$ , a natural place to look is the ideal defining the Zariski closure in ${(P^{r - 1})}^{n}$ of the locus of points lying on a rational normal curve. This latter closed subvariety, and the ideal defining it, was the focus of ref. 14, where it is denoted $V_{r - 1, n} \subseteq {(P^{r - 1})}^{n}$ (since it parameterizes configurations on a quasi-Veronese curve).

Two potential issues arise with this strategy: 1) generators for the ideal of $V_{r - 1, n}$ are not fully known in general and 2) not all of the generators for this ideal are ${S L}_{r}$ -invariant (remark 4.24 in ref. 14). However, we will establish in this section that the generators that are known from ref. 14 (all of which are ${S L}_{r}$ -invariant) suffice to cut out the tropicalization of $φ_{r} (G r (2, n) °)$ . We begin by reviewing these equations.

5.2. Equations for Points to Lie on a Rational Normal Curve

The closure $V_{r - 1, n} \subseteq {(P^{r - 1})}^{n}$ of the locus of $n$ points lying on a rational normal curve in $P^{r - 1}$ is the whole space if $r = 2$ or $r \geq n - 2$ . Thus, we will assume $3 \leq r \leq n - 3$ from now on. The first nontrivial example of $V_{r - 1, n} \subseteq {(P^{r - 1})}^{n}$ is $V_{2,6}$ , which parametrizes 6-tuples of points in $P^{2}$ that lie on a conic. This is an irreducible hypersurface in ${(P^{2})}^{6}$ defined by the vanishing of the following ${S L}_{3}$ -invariant polynomial expressed as a quartic binomial in bracket notation (equation 3.4.9 in ref. 20 and remark 3.3 in ref. 14):

ϕ = | 123 | | 145 | | 246 | | 356 | - | 124 | | 135 | | 236 | | 456 | .

The notation $| i j k |$ here denotes the determinant of the $3 \times 3$ submatrix, of a $3 \times 6$ matrix of coordinates on ${(P^{2})}^{6}$ , with columns $i j k$ . This bracket expression is not fully $S_{6}$ -symmetric, because brackets satisfy many nontrivial Plücker relations. Indeed, up to obvious sign changes, there are 15 different presentations of $ϕ$ , as we next describe.

Let $G$ be the graph with vertex set $(\binom{[6]}{3})$ where vertices $I$ and $J$ are connected if $| I \cap J | = 2$ . A straightforward combinatorial argument shows that $G$ has 15 subgraphs isomorphic to the three-dimensional cube, and these form a single orbit under the natural $S_{6}$ action. A cube is a bipartite graph, so for each cube subgraph we can uniquely divide the vertex set into black and white subsets, which we label $B$ and $W$ , respectively, where we adopt the convention that the smallest triplet in the lexicographic order is black. For each vertex $I = {i, j, k}$ , we have the associated bracket $m_{I} : = | i j k |$ , and for each cube $C$ in $G$ we may define a polynomial

ϕ_{C} : = \prod_{I \in B} m_{I} - \prod_{J \in W} m_{J} .

Example 5.1: The subgraph $C$ generated by

{1,2,3}, {1,2,4}, {1,3,5}, {1,4,5},

{2,3,6}, {2,4,6}, {3,5,6}, {4,5,6}

is a cube, and the corresponding black–white bipartition is

B : = {{1,2,3}, {1,4,5}, {2,4,6}, {3,5,6}},

W : = {{1,2,4}, {1,3,5}, {2,3,6}, {4,5,6}},

so here $ϕ_{C}$ coincides with the polynomial $ϕ$ presented above.
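The cubes and their bipartitions can be listed by a short computation. The sketch below relies on an observation that is not stated in the text and should be read as an assumption of this example: the cube of Example 5.1 consists of the triples that pick one element from each of the pairs {1,6}, {2,5}, {3,4}, so its $S_{6}$ orbit is indexed by the 15 perfect matchings of $[6]$ . The function names are hypothetical.

```python
from itertools import product

def perfect_matchings(elems):
    """Yield all ways to split a sorted list of even length into unordered pairs."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail

def cube_of(matching):
    """Vertices of the cube: choose one element from each pair of the matching."""
    return sorted(tuple(sorted(choice)) for choice in product(*matching))

def bipartition(vertices):
    """Black vertices are those at even graph distance from the smallest triplet.

    Distance from the smallest vertex equals 3 minus the size of the intersection,
    so black vertices are exactly those with odd intersection size.
    """
    smallest = set(vertices[0])
    black = [v for v in vertices if len(set(v) & smallest) % 2 == 1]
    white = [v for v in vertices if len(set(v) & smallest) % 2 == 0]
    return black, white

def edge_count(vertices):
    """Number of pairs of vertices sharing exactly two elements (edges of G)."""
    return sum(1 for i, u in enumerate(vertices) for v in vertices[i + 1:]
               if len(set(u) & set(v)) == 2)

cubes = [cube_of(m) for m in perfect_matchings([1, 2, 3, 4, 5, 6])]
example_B, example_W = bipartition(cube_of([(1, 6), (2, 5), (3, 4)]))
```

Each of the resulting vertex sets spans 12 edges of $G$ , as a three-dimensional cube should, and the matching {1,6}, {2,5}, {3,4} reproduces the sets $B$ and $W$ of Example 5.1.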

Lemma 5.1. For each cube $C$ , we have $V (ϕ_{C}) = V (ϕ)$ as subvarieties of ${(P^{2})}^{6}$ .

Proof: As noted above, $ϕ = ϕ_{C}$ , where $C$ is the cube in Example 5.1, so it suffices to show that if $C^{'}$ is another cube, then $V (ϕ_{C}) = V (ϕ_{C^{'}})$ . By geometric considerations, the irreducible hypersurface $V_{2,6} = V (ϕ_{C})$ is invariant under the natural $S_{6}$ action on ${(P^{2})}^{6}$ . This implies that any $S_{6}$ permutation of $ϕ_{C}$ must be a polynomial whose vanishing locus is also $V_{2,6}$ . The transitive $S_{6}$ action on the set of cubes is compatible with the action on bracket polynomials induced from the permutation action on ${(P^{2})}^{6}$ . Therefore, for any cube $C^{'}$ , there exists a permutation $σ \in S_{6}$ for which $σ \cdot C = C^{'}$ and

V (ϕ_{C}) = V (σ \cdot ϕ_{C}) = V (ϕ_{σ \cdot C}) = V (ϕ_{C^{'}}),

as desired. $□$
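One half of Lemma 5.1 is easy to test numerically: every $ϕ_{C}$ should vanish on six points of a conic. The Python sketch below evaluates all 15 polynomials at six points $(1, t, t^{2})$ of the standard conic, using integer arithmetic. It assumes, as an observation not stated in the text, that the cubes are indexed by the perfect matchings of $[6]$ ; the parameter values and function names are arbitrary choices made for this example.

```python
from itertools import product

def matchings(elems):
    """Yield all perfect matchings of a sorted list of even length."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for idx, partner in enumerate(rest):
        for tail in matchings(rest[:idx] + rest[idx + 1:]):
            yield [(first, partner)] + tail

def det3(cols):
    """Determinant of the 3x3 matrix whose columns are the three given vectors."""
    (a, b, c), (d, e, f), (g, h, i) = cols
    return a * (e * i - f * h) - d * (b * i - c * h) + g * (b * f - c * e)

def phi_C(matching, points):
    """Product of black brackets minus product of white brackets for one cube."""
    verts = sorted(tuple(sorted(ch)) for ch in product(*matching))
    smallest = set(verts[0])
    black = white = 1
    for v in verts:
        bracket = det3([points[i] for i in v])
        if len(set(v) & smallest) % 2 == 1:
            black *= bracket
        else:
            white *= bracket
    return black - white

# Six distinct points on the conic parametrized by t -> (1, t, t^2).
conic_points = {i: (1, t, t * t) for i, t in zip(range(1, 7), [0, 1, 2, 3, 5, 8])}
values = [phi_C(m, conic_points) for m in matchings([1, 2, 3, 4, 5, 6])]
```

On such points each bracket is a Vandermonde determinant, and the black and white products contain the same twelve differences $t_{b} - t_{a}$ , which is why every entry of `values` is zero.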

Remark 5.1: Even though all 15 polynomials $ϕ_{C}$ define the hypersurface $V_{2,6}$ (and so this discussion of cubes and bipartitions did not arise in ref. 14), when we turn attention to tropicalization later in this section, we will need the extra flexibility provided by the choice of cube $C$ .

For $n > 6$ , $V_{2, n}$ is defined scheme-theoretically by the $(\binom{n}{6})$ polynomials obtained by pulling $ϕ$ back along the projection maps ${(P^{2})}^{n} \to {(P^{2})}^{6}$ (theorem 3.6 in ref. 14).

For $r > 3$ , things get trickier; the polynomials found in ref. 14 were obtained as follows. The idea is to take the polynomial for $V_{2,6}$ , pull it back to ${(P^{2})}^{r + 3}$ , apply the Gale transformation which, up to a constant, in bracket form is simply taking the complement of each index set (proposition 4.5 in ref. 14) to get a polynomial on ${(P^{r - 1})}^{r + 3}$ , then pull this back to ${(P^{r - 1})}^{n}$ . More formally:

i) Choose $S \in (\binom{[n]}{r + 3})$ , $T \in (\binom{[r + 3]}{6})$ , and a cube $C$ in $(\binom{[6]}{3})$ .
ii) Take the pull-back $π_{T}^{*} (ϕ_{C})$ along the projection $π_{T} : {(P^{2})}^{r + 3} \to {(P^{2})}^{6}$ .
iii) Take the Gale transform $\hat{π_{T}^{*} (ϕ_{C})}$ .
iv) Take the pull-back $π_{S}^{*} (\hat{π_{T}^{*} (ϕ_{C})})$ along the projection $π_{S} : {(P^{r - 1})}^{n} \to {(P^{r - 1})}^{r + 3}$ .

In slightly different notation, by using proposition 4.1 and remark 4.2 in ref. 21, we can rewrite the resulting polynomials explicitly as follows. For each

I = {i_{1} < \dots < i_{6}} \in (\binom{[n]}{6}) and J \in (\binom{[n] \ I}{r - 3}),

let $C$ be a cube in $I$ , and let $B, W$ be the corresponding bipartition. For instance, the choice of cube in Example 5.1 yields

\begin{matrix} B = {{i_{1}, i_{2}, i_{3}}, {i_{1}, i_{4}, i_{5}}, {i_{2}, i_{4}, i_{6}}, {i_{3}, i_{5}, i_{6}}}, \\ W = {{i_{1}, i_{2}, i_{4}}, {i_{1}, i_{3}, i_{5}}, {i_{2}, i_{3}, i_{6}}, {i_{4}, i_{5}, i_{6}}} . \end{matrix}

Then let

ψ_{C, I, J} : = \prod_{K \in B} m_{J ⊔ K} - \prod_{K \in W} m_{J ⊔ K} .

Each $ψ_{C, I, J}$ vanishes on $V_{r - 1, n}$ by lemma 4.17 in ref. 14.

5.3. Tropical Basis

Since these ${S L}_{r}$ -invariant polynomials $ψ_{C, I, J}$ are expressed in bracket form (i.e., they are written as polynomials in the maximal minors), they can immediately be interpreted as polynomial functions on the Grassmannian $G r (r, n)$ ; this is done simply by viewing each minor as the corresponding Plücker coordinate function. These are quartic binomials on the Grassmannian, and the choice of cube $C$ corresponds to all 15 possible ways of lifting this to a quartic binomial on the ambient $P^{(\binom{n}{r}) - 1}$ .

Definition 5.1: Let $S_{r - 1, n}$ be the set of bracket binomials $ψ_{C, I, J}$ from section 5.2, and let $J_{r, n} \subseteq 𝕜 {[x_{I}^{\pm}]}_{I \in (\binom{[n]}{r})}$ be the ideal generated by $S_{r - 1, n}$ and the Plücker relations for $G r (r, n)$ .

Note that if $r = 2$ or $r = n - 2$ , then $(\binom{[n] \ I}{r - 3}) = \emptyset$ for any $I \in (\binom{[n]}{6})$ , so it is safe to extend the preceding definition by setting $S_{r - 1, n} = \emptyset$ in these cases.

Proposition 5.2. The set-theoretic vanishing locus in ${(𝕜^{\times})}^{(\binom{n}{r})}$ of the ideal $J_{r, n}$ is $φ_{r} (G r (2, n) °)$ .

Proof: The result is trivial for $r = 2$ , so let $r \geq 3$ . First, we shall establish the set-theoretic containment $φ_{r} (G r (2, n) °) \subseteq V (J_{r, n})$ . By definition the left-hand side is contained in $G r (r, n) °$ , so all of the Plücker relations vanish on it. On the other hand, since each $ψ_{C, I, J}$ vanishes on $V_{r - 1, n}$ , when viewed as a Grassmannian polynomial it vanishes on $φ_{r} (G r (2, n) °)$ by Proposition 5.1. So it suffices to establish the reverse set-theoretic containment.

Let $p \in V (J_{r, n})$ , and let $M (p)$ be any matrix (necessarily with nonzero maximal minors) in the corresponding ${G L}_{r}$ orbit. Let $p^{'}$ be any collection of $r + 3$ columns in $M (p)$ , viewed as a configuration of $r + 3$ points in $P^{r - 1}$ . Due to the nonzero maximal minors, $p^{'}$ is in general linear position, so we can choose a Gale dual configuration $q^{'} \in {(P^{2})}^{r + 3}$ , and it too is in general linear position by proposition 4.5 in ref. 14. Now $ψ_{C, I, J} (p^{'}) = 0$ for all $I, J$ involving the labels of the points in $p^{'}$ , so by theorem 3.6 in ref. 14, $q^{'}$ must lie on a conic. This conic must be smooth since $q^{'}$ is in general linear position. It now follows from a classical result of Goppa (corollary 3.2 in ref. 22) that the configuration $p^{'}$ also lies on a rational normal curve, call it $X$ . Now replace a single point of $p^{'}$ with one of the other columns of $M (p)$ and apply the same argument to deduce that this new configuration lies on a rational normal curve $X^{'}$ . However, these two rational normal curves have $r + 2$ points in common, so by Castelnuovo’s lemma we have $X = X^{'}$ . Repeating this for the remaining columns shows that the full configuration given by $M (p)$ lies on a rational normal curve, and hence, $p \in φ_{r} (G r (2, n) °)$ as desired. $□$

Remark 5.2: We expect that $V (J_{r, n}) = φ_{r} (G r (2, n) °)$ as subschemes of ${(𝕜^{\times})}^{(\binom{n}{r})}$ , not just subvarieties, but we have not been able to establish this.

By viewing $ψ_{C, I, J}$ as a polynomial on $A^{(\binom{n}{r})}$ , we can tropicalize it to obtain a tropical polynomial $T r o p (ψ_{C, I, J})$ on $R^{(\binom{n}{r})}$ . Moreover, since $ψ_{C, I, J}$ is a binomial, the corresponding tropical hypersurface is a classical hyperplane. Concretely, for coordinates $x_{S}$ on $R^{(\binom{n}{r})}$ , where $S \in (\binom{[n]}{r})$ , the tropical hypersurface $V^{t r o p} (T r o p (ψ_{C, I, J}))$ is given by

\sum_{K \in B} x_{J ⊔ K} - \sum_{K \in W} x_{J ⊔ K} = 0 .

[3]

We first show that the above classical hyperplanes cut out the image of the injective classically linear map $T r o p (φ_{r}) : R^{(\binom{n}{2})} ↪ R^{(\binom{n}{r})}$ (recall Lemma 4.1).

Proposition 5.3. For $2 \leq r \leq n - 2$ , we have

⋂_{ψ_{C, I, J} \in S_{r - 1, n}} V^{t r o p} (T r o p (ψ_{C, I, J})) = T r o p (φ_{r}) (R^{(\binom{n}{2})}) .

Proof: If $r = 2$ or $n - 2$ , then $S_{r - 1, n} = \emptyset$ so the left-hand side is $R^{(\binom{n}{r})}$ , but so is the right-hand side due to the bijectivity of $T r o p (φ_{r})$ in these cases established in Lemma 4.1. So assume that $3 \leq r \leq n - 3$ .

Let $N$ be the $((\binom{n}{6}) \cdot (\binom{n - 6}{r - 3}) \cdot 15) \times (\binom{n}{r})$ matrix whose rows encode the coefficients of the linear forms in Eq. 3, so that $\ker N$ is the intersection on the left side of the proposition statement. Let $M$ be the matrix associated to $T r o p (φ_{r})$ , which was described explicitly in the proof of Lemma 4.1. Our task is thus to prove $\ker N = i m M$ .

We shall first show that $N M = 0$ , i.e., $i m M \subseteq \ker N$ . From the definition of $M$ , this is equivalent to the following: for each $ψ_{C, I, J}$ and each $A \in (\binom{[n]}{2})$ , the number of terms $x_{S}$ in Eq. 3 with a positive coefficient for which $A \subseteq S$ equals the number of such terms with a negative coefficient. If we write the bipartition corresponding to the cube $C$ as

B = {B_{1}, B_{2}, B_{3}, B_{4}}, W = {W_{1}, W_{2}, W_{3}, W_{4}},

then the positive terms of $ψ_{C, I, J}$ are $x_{J ⊔ B_{j}}$ for $j = 1,2,3,4$ , and the negative terms are $x_{J ⊔ W_{j}}$ for $j = 1,2,3,4$ . So we need to show that the number of $j$ for which $A \subseteq J ⊔ B_{j}$ equals the number of $j$ for which $A \subseteq J ⊔ W_{j}$ . This follows immediately from the observations that 1) each element of $I$ occurs in exactly two $B_{j}$ and two $W_{j}$ and 2) if a pair of elements of $I$ occurs in a $B_{j}$ or a $W_{j}$ , then it occurs in exactly one $B_{j}$ and one $W_{j}$ .
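The counting statement behind $N M = 0$ can be confirmed by brute force for small parameters. The Python sketch below loops over every $ψ_{C, I, J}$ and every pair $A$ and compares the number of positive and negative terms whose index set contains $A$ . It assumes, as an observation not made in the text, that the cubes on a 6-element set are indexed by its perfect matchings; the function names are hypothetical.

```python
from itertools import combinations, product

def matchings(elems):
    """Yield all perfect matchings of a sorted list of even length."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for idx, partner in enumerate(rest):
        for tail in matchings(rest[:idx] + rest[idx + 1:]):
            yield [(first, partner)] + tail

def cube_bipartitions(six):
    """Yield (B, W) for each cube on the sorted 6-element tuple `six`."""
    for m in matchings(list(six)):
        verts = sorted(tuple(sorted(ch)) for ch in product(*m))
        smallest = set(verts[0])
        B = [v for v in verts if len(set(v) & smallest) % 2 == 1]
        W = [v for v in verts if len(set(v) & smallest) % 2 == 0]
        yield B, W

def rows_are_balanced(n, r):
    """True if, for every psi and every pair A, positive and negative counts agree."""
    ground = range(1, n + 1)
    for I in combinations(ground, 6):
        rest = [x for x in ground if x not in I]
        for J in combinations(rest, r - 3):
            for B, W in cube_bipartitions(I):
                for A in combinations(ground, 2):
                    pos = sum(set(A) <= set(J) | set(K) for K in B)
                    neg = sum(set(A) <= set(J) | set(K) for K in W)
                    if pos != neg:
                        return False
    return True
```

For instance, `rows_are_balanced(7, 4)` examines all 105 rows of $N$ for $n = 7$ and $r = 4$ .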

Having shown that $i m M \subseteq \ker N$ , since $r a n k M = (\binom{n}{2})$ (Lemma 4.1), it now suffices to show that $\dim (\ker N) \leq (\binom{n}{2})$ or, equivalently, $r a n k N \geq (\binom{n}{r}) - (\binom{n}{2})$ . To do this, we will find $(\binom{n}{r}) - (\binom{n}{2})$ linearly independent rows in $N$ . Order the columns of $N$ according to the lexicographic order on $(\binom{[n]}{r})$ . We will first find a collection of rows where the leftmost nonzero entries all occur in distinct columns since such rows are necessarily linearly independent, and then we will show that this collection has $(\binom{n}{r}) - (\binom{n}{2})$ elements in it.

Consider a column $I \in (\binom{[n]}{r})$ . Let $K = {a < b < c} \subseteq I$ be the subset comprising the three smallest elements and let $K^{c} = I \ K$ be the remaining $r - 3$ elements. Choose another set of three elements $K^{'} = {d < e < f}$ in $[n] \ I$ satisfying $a < d$ , $b < e$ , and $c < f$ . Consider the following cube $C$ on $K ⊔ K^{'}$ :

{a, b, c}, {a, b, f}, {a, e, c}, {a, e, f},

{b, d, c}, {b, d, f}, {c, d, e}, {d, e, f} .

Notice that $K$ is the smallest vertex of the cube in lexicographic order. Let $B ⊔ W$ be the usual bipartition of $C$ , so in particular, $K \in B$ . Then the vector $(a_{J})$ , where

a_{J} = \begin{cases} 1 & \text{if } J = T ⊔ K^{c}, T \in B \\ - 1 & \text{if } J = T ⊔ K^{c}, T \in W \\ 0 & \text{otherwise,} \end{cases}

is a row of $N$ such that the first nonzero entry is $a_{I} = 1$ . Let $p_{n, r}$ be the number of columns $I$ for which we can construct a row $(a_{J})$ by the above description. We will show that $p_{n, r} = (\binom{n}{r}) - (\binom{n}{2})$ by using induction on $n$ . It is obvious that $p_{r + 2, r} = 0$ .

Now we count the possibilities. First of all, in $K = {a < b < c}$ we have that $a = 1$ or $a > 1$ . The number of cases with $a > 1$ is precisely $p_{n - 1, r}$ , which by inductive assumption is equal to $(\binom{n - 1}{r}) - (\binom{n - 1}{2})$ . Thus, we only have to count the cases with $a = 1$ .

The possible range of $c$ is $3 \leq c \leq n - r + 2$ as we need at least $r - 2$ elements in $[n]$ larger than $c$ , namely, $K^{c} \cup {f}$ . When $c < n - r + 2$ , then $b$ can be any number between 2 and $c - 1$ , so the number of possibilities for $b$ is $c - 2$ . In this case, the number of ways to choose $K^{c}$ is $(\binom{n - c}{r - 3})$ . When $c = n - r + 2$ , then $b$ cannot be $c - 1$ , because we need two elements of $[n] \ I$ larger than $b$ (for $e$ and $f$ ) to make a cube where the smallest term is $K$ , but in this case $[n] \ I$ contains only one element larger than $b$ . So for $b$ we have $c - 3 = n - r - 1$ possibilities and $(\binom{n - c}{r - 3}) = (\binom{r - 2}{r - 3})$ possibilities for $K^{c}$ .

In summary, the number of ways to make such a construction is

\sum_{c = 3}^{n - r + 1} (c - 2) (\binom{n - c}{r - 3}) + (n - r - 1) (\binom{r - 2}{r - 3}) =

\sum_{c = 3}^{n - r + 2} (c - 2) (\binom{n - c}{r - 3}) - (\binom{r - 2}{r - 3}) = \sum_{i = 1}^{n - r} i (\binom{n - 2 - i}{r - 3}) - (r - 2) .

Thus, we obtain a recursive formula

p_{n, r} = \sum_{i = 1}^{n - r} i (\binom{n - 2 - i}{r - 3}) - (r - 2) + p_{n - 1, r} .

From the inductive assumption and the lemma below, we obtain that $p_{n, r} = (\binom{n}{r}) - (\binom{n}{2})$ . $□$
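The value of $p_{n, r}$ can also be obtained by direct enumeration for small $n$ , which gives an independent check on the count. The Python sketch below tests, for every column $I$ , whether a triple $d < e < f$ outside $I$ with $a < d$ , $b < e$ , $c < f$ exists; the function names are hypothetical and the code is only an illustration of the counting argument.

```python
from itertools import combinations
from math import comb

def has_pivot_row(I, n):
    """True if some d < e < f outside I satisfies a < d, b < e, c < f,
    where a < b < c are the three smallest elements of I."""
    a, b, c = sorted(I)[:3]
    outside = [x for x in range(1, n + 1) if x not in I]
    return any(a < d and b < e and c < f for d, e, f in combinations(outside, 3))

def p(n, r):
    """Number of columns I for which the construction of a row (a_J) is possible."""
    return sum(has_pivot_row(I, n) for I in combinations(range(1, n + 1), r))
```

For example, `p(6, 3)` counts the 3-subsets of $[6]$ whose complement dominates them entry by entry, and `p(r + 2, r)` is zero because the complement then has only two elements.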

Lemma 5.2.

\begin{array}{l} \sum_{i = 1}^{n - r} i (\binom{n - 2 - i}{r - 3}) - (r - 2) \\ = ((\binom{n}{r}) - (\binom{n}{2})) - ((\binom{n - 1}{r}) - (\binom{n - 1}{2})) . \end{array}

Proof: First, note that $(\binom{n}{r}) - (\binom{n - 1}{r}) = (\binom{n - 1}{r - 1})$ and $(\binom{n}{2}) - (\binom{n - 1}{2}) = n - 1$ , so the right-hand side in the formula equals $(\binom{n - 1}{r - 1}) - (n - 1)$ . Thus, the identity we need to show is equivalent to

\sum_{i = 1}^{n - r} i (\binom{n - 2 - i}{r - 3}) = (\binom{n - 1}{r - 1}) - (n - r + 1),

which, in turn, is equivalent to

\sum_{i = 1}^{n - r + 1} i (\binom{n - 2 - i}{r - 3}) = (\binom{n - 1}{r - 1}) .

By using the substitution $m = n - 1$ and $s = r - 1$ , this is equivalent to

\sum_{i = 1}^{m - s + 1} i (\binom{m - 1 - i}{s - 2}) = (\binom{m}{s}) .

This last form of the identity can be established by combinatorial considerations: the term $i (\binom{m - 1 - i}{s - 2})$ is precisely the number of ways one can choose a subset of $[m]$ of cardinality $s$ whose second smallest entry is $i + 1$ . $□$
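Both the identity of Lemma 5.2 and the combinatorial interpretation used in its proof are easy to test for small values. The following Python sketch compares the two sides of the lemma and counts subsets by their second smallest entry; the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def lhs(n, r):
    """Left-hand side of Lemma 5.2."""
    return sum(i * comb(n - 2 - i, r - 3) for i in range(1, n - r + 1)) - (r - 2)

def rhs(n, r):
    """Right-hand side of Lemma 5.2."""
    return (comb(n, r) - comb(n, 2)) - (comb(n - 1, r) - comb(n - 1, 2))

def count_by_second_smallest(m, s, i):
    """Number of s-subsets of [m] whose second smallest entry is i + 1."""
    return sum(1 for S in combinations(range(1, m + 1), s) if S[1] == i + 1)
```

Summing `count_by_second_smallest(m, s, i)` over $i$ partitions the $s$ -subsets of $[m]$ , which is the content of the final identity.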

Remark 5.3: The analogue of Proposition 5.3 for the unweighted dissimilarity map does not hold. As we discussed in Remark 4.4, for the $r$ -dissimilarity map we have that $T r o p (φ_{r})$ is replaced by the Bocci–Cools piecewise linear map $ϕ^{(r)}$ . Hence, in general $ϕ^{(r)} (R^{(\binom{n}{2})})$ is not equal to the intersection of tropical hypersurfaces $V^{t r o p} (T r o p (ψ_{C, I, J}))$ , which is instead a linear subspace of $R^{(\binom{n}{r})}$ as Proposition 5.3 shows.

The three-term Plücker relations are the polynomials

x_{i j A} x_{k l A} - x_{i k A} x_{j l A} + x_{i l A} x_{j k A},

1 \leq i < j < k < l \leq n, A \in (\binom{[n] \ {i, j, k, l}}{r - 2})

with the standard convention that index sets are permuted to be increasing and the corresponding permutation signs are included when doing so. These do not generate the full ideal of Plücker relations in general, but they do so when passing to the Laurent polynomial ring so they define $G r (r, n) °$ in the torus ${(𝕜^{\times})}^{(\binom{n}{r})}$ . In particular, in Definition 5.1 we could have defined the same ideal $J_{r, n}$ by using only the three-term Plücker relations rather than all of the Plücker relations.

For $r = 2$ , the three-term Plücker relations form a tropical basis for the ideal of Plücker relations (corollary 4.3.12 in ref. 18), which means 1) as already noted, they generate the ideal of Plücker relations in the Laurent polynomial ring and 2) the intersection of the tropical hypersurfaces

V^{t r o p} (T r o p (x_{i j} x_{k l} - x_{i k} x_{j l} + x_{i l} x_{j k}))

for $1 \leq i < j < k < l \leq n$ equals ${G r}^{t r o p} (2, n)$ in $R^{(\binom{n}{2})}$ .

Theorem 5.1. Fix $2 \leq r \leq n - 2$ . The three-term Plücker relations together with the bracket binomials $ψ_{C, I, J}$ form a tropical basis for the ideal $J_{r, n}$ .

Proof: We already noted above that these polynomials generate $J_{r, n}$ since we are working in the Laurent polynomial ring. So, we just need to show that the intersection of the tropical hypersurfaces defined by these polynomials coincides with the tropicalization of the vanishing locus of the ideal $J_{r, n}$ . By Proposition 5.2 we have $V (J_{r, n}) = φ_{r} (G r (2, n) °)$ , and by Theorem 4.1 we have $T r o p (φ_{r} (G r (2, n) °)) = d_{r}^{w t} (T_{n})$ . Thus, our task is reduced to showing that the intersection of the tropical hypersurfaces associated to the polynomials in the theorem statement equals the space of weighted $r$ -dissimilarity vectors $d_{r}^{w t} (T_{n})$ .

Proposition 4.3 shows that $d_{r}^{w t} (T_{n}) = T r o p (φ_{r}) ({G r}^{t r o p} (2, n))$ , and Proposition 5.3 shows that the intersection of the tropical hypersurfaces associated to the $ψ_{C, I, J}$ is $T r o p (φ_{r}) (R^{(\binom{n}{2})})$ . Thus, all that remains is to show that the pull-backs along $φ_{r}$ of the three-term Plücker relations define ${G r}^{t r o p} (2, n)$ ; indeed, this suffices since $φ_{r}$ is a monomial map; hence, pulling back along it commutes with tropicalization. From the definition of $φ_{r}$ we have

φ_{r}^{*} (x_{i j A} x_{k l A} - x_{i k A} x_{j l A} + x_{i l A} x_{j k A}) =

(x_{i j} x_{k l} - x_{i k} x_{j l} + x_{i l} x_{j k}) \prod_{B \in (\binom{A}{2})} x_{B}^{2} \prod_{t \in A} x_{i t} x_{j t} x_{k t} x_{l t},

so in the Laurent polynomial ring the three-term Plücker relations for $G r (r, n) °$ pull back to the three-term Plücker relations for $G r (2, n) °$ , which as we noted above are a tropical basis. $□$
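The monomial bookkeeping in the displayed pull-back can be checked by comparing exponent vectors. The Python sketch below assumes that $φ_{r}^{*} (x_{S})$ is, up to sign, the product of $x_{a b}$ over all pairs ${a, b} \subseteq S$ , which is consistent with the formula above; signs are not tracked, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pullback(S):
    """Exponent vector of the pull-back of x_S: one factor x_{ab} per pair in S."""
    return Counter(combinations(sorted(S), 2))

def product_of(*index_sets):
    """Exponent vector of the pull-back of a product of Pluecker coordinates."""
    total = Counter()
    for S in index_sets:
        total += pullback(S)
    return total

def common_factor(quad, A):
    """Exponents of prod_{B in (A choose 2)} x_B^2 * prod_{t in A} x_it x_jt x_kt x_lt."""
    factor = Counter()
    for B in combinations(sorted(A), 2):
        factor[B] += 2
    for t in A:
        for q in quad:
            factor[tuple(sorted((q, t)))] += 1
    return factor

def check(quad, A):
    """Compare both sides of the pull-back formula, monomial by monomial."""
    i, j, k, l = quad
    terms = [((i, j), (k, l)), ((i, k), (j, l)), ((i, l), (j, k))]
    for first, second in terms:
        left = product_of(set(first) | set(A), set(second) | set(A))
        right = Counter([tuple(sorted(first)), tuple(sorted(second))]) + common_factor(quad, A)
        if left != right:
            return False
    return True
```

Because the common factor is the same for all three monomials, it is a unit in the Laurent polynomial ring and can be divided out, which is the step used in the proof.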

Remark 5.4: It follows from this proof that not all of the bracket binomials $ψ_{C, I, J}$ are needed to form this tropical basis. Indeed, the only role they play is cutting out the codimension $(\binom{n}{r}) - (\binom{n}{2})$ linear subspace that is the image of $T r o p (φ_{r})$ , so this codimension is the number that is actually needed if they are chosen correctly. Similarly, not all of the three-term Plücker relations are needed: for each 4-tuple $i, j, k, l$ , only a single choice of $A \in (\binom{[n] \ {i, j, k, l}}{r - 2})$ is needed (and any such choice will do).

As an immediate corollary, by spelling out explicitly the conditions defining the tropical hypersurfaces for each polynomial in this tropical basis we obtain a characterization of weighted dissimilarity vectors, generalizing the classic tree-metric theorem for two-dissimilarity vectors.

Corollary 5.1. A vector $w = {(w_{I})}_{I \in (\binom{[n]}{r})} \in R^{(\binom{n}{r})}$ is a weighted $r$ -dissimilarity vector if and only if the following two conditions hold:

i) for each 4-tuple ${i, j, k, l} \subseteq [n]$ , there exists an $A \subseteq [n] \ {i, j, k, l}$ of size $r - 2$ such that two of the following expressions equal each other and are greater than or equal to the third:

w_{i j A} + w_{k l A}, w_{i k A} + w_{j l A}, w_{i l A} + w_{j k A};

ii) for each $I \in (\binom{[n]}{6})$ , $J \in (\binom{[n] \ I}{r - 3})$ , and for each cube $C$ in $I$ with corresponding bipartition $B, W$ we have

\sum_{K \in B} w_{J ⊔ K} = \sum_{K \in W} w_{J ⊔ K} .
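The two conditions of Corollary 5.1 can be tested on a concrete example. The Python sketch below is an illustration under stated assumptions, not a definitive implementation: it uses a hypothetical caterpillar tree with 7 leaves and integer edge lengths, takes $r = 4$ , and computes the weighted vector as $w_{S} = \sum_{{a, b} \subseteq S} d_{a b}$ , which is how $T r o p (φ_{r})$ is read here from the description of $M$ . Cubes are indexed by perfect matchings, an observation not stated in the text, and all names are hypothetical.

```python
from itertools import combinations, product

# Hypothetical tree: leaves 1..7, internal nodes "a".."e", integer edge lengths.
EDGES = {("a", 1): 1, ("a", 2): 2, ("a", "b"): 1, ("b", 3): 3, ("b", "c"): 2,
         ("c", 4): 1, ("c", "d"): 1, ("d", 5): 2, ("d", "e"): 3, ("e", 6): 1, ("e", 7): 4}

def leaf_distances(edges, n):
    """Two-dissimilarity vector: path length between every pair of leaves."""
    adj = {}
    for (u, v), length in edges.items():
        adj.setdefault(u, []).append((v, length))
        adj.setdefault(v, []).append((u, length))
    dist = {}
    for src in range(1, n + 1):
        seen, stack = {src: 0}, [src]
        while stack:
            node = stack.pop()
            for nxt, length in adj[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + length
                    stack.append(nxt)
        for dst in range(src + 1, n + 1):
            dist[(src, dst)] = seen[dst]
    return dist

def weighted_vector(dist, n, r):
    """Assumed form of the weighted vector: sum of pairwise distances inside S."""
    return {S: sum(dist[pair] for pair in combinations(S, 2))
            for S in combinations(range(1, n + 1), r)}

def key(*parts):
    """Sorted tuple indexing the disjoint union of the given index sets."""
    return tuple(sorted(x for part in parts for x in part))

def matchings(elems):
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for idx, partner in enumerate(rest):
        for tail in matchings(rest[:idx] + rest[idx + 1:]):
            yield [(first, partner)] + tail

def condition_i(w, n, r):
    """For each 4-tuple, some A makes the two largest of the three sums equal."""
    ground = range(1, n + 1)
    for i, j, k, l in combinations(ground, 4):
        rest = [x for x in ground if x not in (i, j, k, l)]
        found = False
        for A in combinations(rest, r - 2):
            sums = sorted([w[key((i, j), A)] + w[key((k, l), A)],
                           w[key((i, k), A)] + w[key((j, l), A)],
                           w[key((i, l), A)] + w[key((j, k), A)]])
            if sums[1] == sums[2]:
                found = True
                break
        if not found:
            return False
    return True

def condition_ii(w, n, r):
    """For each I, J and cube, the black and white sums agree."""
    ground = range(1, n + 1)
    for I in combinations(ground, 6):
        rest = [x for x in ground if x not in I]
        for J in combinations(rest, r - 3):
            for m in matchings(list(I)):
                verts = sorted(tuple(sorted(ch)) for ch in product(*m))
                smallest = set(verts[0])
                signed = sum((1 if len(set(K) & smallest) % 2 else -1) * w[key(J, K)]
                             for K in verts)
                if signed != 0:
                    return False
    return True

w = weighted_vector(leaf_distances(EDGES, 7), 7, 4)
perturbed = dict(w)
perturbed[(1, 2, 3, 4)] += 1  # no longer in the image of the linear map
```

The vector `w` satisfies both conditions, while `perturbed` violates condition ii) because the coordinate indexed by {1, 2, 3, 4} appears in at least one of the linear equations.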

Acknowledgments

N.G. was supported in part by NSF DMS-1802263 and thanks the members of the Spring 2016 University of Georgia Vertical Integration of Research and Education graduate student tropical research group: Natalie Hobson, Andrew Maurer, Xian Wu, Matt Zawodniak, and Nate Zbacnik. We also thank the anonymous referee for the valuable comments and suggestions.

Footnotes

The authors declare no competing interest.

This article is a PNAS Direct Submission.

Data Availability

There are no data underlying this work.

References

1.Speyer D., Sturmfels B., The tropical Grassmannian. Adv. Geom. 4, 389–411 (2004).
2.Pachter L., Speyer D., Reconstructing trees from subtree weights. Appl. Math. Lett. 17, 615–621 (2004).
3.Cools F., On the relation between weighted trees and tropical Grassmannians. J. Symbolic Comput. 44, 1079–1086 (2009).
4.Giraldo B., Dissimilarity vectors of trees are contained in the tropical Grassmannian. Electron. J. Comb. 17, 7 (2010).
5.Manon C., Dissimilarity maps on trees and the representation theory of ${S L}_{m} (C)$ . J. Algebr. Comb. 33, 199–213 (2011).
6.Baldisserri A., Rubei E., On graphlike $k$ -dissimilarity vectors. Ann. Comb. 18, 365–381 (2014).
7.Baldisserri A., Rubei E., A characterization of dissimilarity families of trees. Discrete Appl. Math. 220, 35–45 (2017).
8.Bocci C., Cools F., A tropical interpretation of $m$ -dissimilarity maps. Appl. Math. Comput. 212, 349–356 (2009).
9.Evans S., Lanoue D., Recovering a tree from the lengths of subtrees spanned by a randomly chosen sequence of leaves. Adv. Appl. Math. 96, 39–75 (2018).
10.Herrmann S., Moulton V., The split decomposition of a $k$ -dissimilarity map. Adv. Appl. Math. 49, 39–56 (2012).
11.Levy D., Yoshida R., Pachter L., Beyond pairwise distances: Neighbor-joining with phylogenetic diversity estimates. Mol. Biol. Evol. 23, 491–498 (2006).
12.Rubei E., Sets of double and triple weights of trees. Ann. Comb. 15, 723–734 (2007). [Google Scholar]
13.Rubei E., On dissimilarity vectors of general weighted trees. Discrete Math. 312, 2872–2880 (2012). [Google Scholar]
14.Caminata A., Giansiracusa N., Moon H.-B., Schaffler L., Equations for point configurations to lie on a rational normal curve. Adv. Math. 340, 653–683 (2018). [Google Scholar]
15.Billera L., Holmes S., Vogtmann K., Geometry of the space of phylogenetic trees. Adv. Appl. Math. 27, 733–767 (2001). [Google Scholar]
16.Kapranov M., “Chow quotients of Grassmannians I” in I. M. Gelfand Seminar, S. Gelfand, S. Gindikin, Eds. (Adv. Soviet Math., American Mathematical Society, Providence, RI, 1993), vol. 16, pp. 29–110. [Google Scholar]
17.Buneman P., “The recovery of trees from measures of dissimilarity” in Mathematics in the Archaeological and Historical Sciences, Hodson F., Kendall D., Tautu P., Eds. (Edinburgh University Press, Edinburgh, 1971), pp. 387–395. [Google Scholar]
18.Maclagan D., Sturmfels B., Introduction to Tropical Geometry (Graduate Studies in Mathematics, American Mathematical Society, Providence, RI, 2015), vol. 161, pp. xii+363. [Google Scholar]
19.Harris J., Algebraic Geometry. A First Course (Graduate Texts in Mathematics, Corrected Reprint, Springer-Verlag, New York, 1995), vol. 133. [Google Scholar]
20.Sturmfels Bernd., Algorithms in Invariant Theory (Texts and Monographs in Symbolic Computation, Springer-Verlag, Vienna, ed. 2, 2008). [Google Scholar]
21.Caminata A., Schaffler L., A Pascal’s theorem for rational normal curves. arXiv:1903.00460 (1 March 2019).
22.Eisenbud D., Popescu S., The projective geometry of the Gale transform. J. Algebra 230, 127–173 (2000). [Google Scholar]
