Abstract
Path homology, proposed by S.-T. Yau and his co-workers, provides a new mathematical model for directed graphs and networks. Persistent path homology (PPH) extends path homology with a filtration to deal with asymmetric structures. However, PPH is constrained to purely topological persistence and cannot track the homotopic shape evolution of data during filtration. To overcome this limitation, the persistent path Laplacian (PPL) is introduced to capture the shape evolution of data. PPL’s harmonic spectra fully recover PPH’s topological persistence, and its non-harmonic spectra reveal the homotopic shape evolution of data during filtration.
2020 Mathematics Subject Classification. Primary: 62R40, Secondary: 55N31
Key words and phrases. Persistent homology, persistent Laplacian, spectral graph theory, topological data analysis, spectral data analysis, simultaneous geometric and topological analyses
1. Introduction.
Recent years have witnessed the emergence of a variety of advanced mathematical tools in topological data analysis (TDA) [13]. As the main workhorse of TDA, persistent homology (PH) [2, 9, 43, 10] pioneered a new branch in algebraic topology, offering a powerful tool to decode the topological structures of data during filtration in terms of persistent Betti numbers. Persistent homology has had tremendous success in many areas of science and technology, such as biology [41], chemistry [35], drug discovery [29], and 3D shape analysis [33].
Inspired by the success of PH, other mathematical tools have been given due attention. One of them is de Rham-Hodge theory in differential geometry, which uses differential forms to represent the cohomology of an oriented closed Riemannian manifold with boundary in terms of a topological Laplacian, namely the Hodge Laplacian [8]. The de Rham-Hodge theory has been applied to computational biology [42], graphics [34], and robotics [28]. However, like homology, the de Rham-Hodge theory does not offer an in-depth analysis of data, which is a famous problem in spectral geometry [24]. To overcome this drawback, the evolutionary de Rham-Hodge theory [5] was introduced in terms of the persistent Hodge Laplacian to offer a multiscale analysis of the de Rham-Hodge theory. Defined on a family of evolutionary manifolds, the evolutionary de Rham-Hodge theory gives a new answer to, or at least reopens, the famous 55-year-old question “can one hear the shape of a drum” [24]. The persistent Hodge Laplacian captures both the topological persistence and the homotopic shape evolution of data during filtration.
Nevertheless, the evolutionary de Rham-Hodge theory is set up on Riemannian manifolds, which may be computationally demanding for large datasets. Hence, a similar multiscale topological Laplacian, called persistent spectral graph (PSG) [37], was proposed by introducing a filtration to combinatorial graph Laplacians. PSG, also known as persistent Laplacian (PL) [26], extends persistent homology to the non-harmonic analysis of data and has shown clear advantages in sophisticated applications [27, 4, 39]. Dealing with point cloud data instead of manifolds, PL encodes a point cloud as a family of simplicial complexes generated from a filtration and analyzes both harmonic and non-harmonic spectra. It is worth noting that the harmonic spectra from the null spaces of PLs reveal the same topological persistence as that of persistent homology, whereas the non-harmonic spectra of PLs capture the homotopic shape evolution of data during filtration. Meanwhile, open-source software called HERMES [38] was developed for the simultaneous topological and geometric analysis of data. However, like persistent homology, PSG treats all data points equally. That is to say, the points do not carry any labeled information such as type, mass, or color. Therefore, an extension of PSG, called persistent sheaf Laplacian (PSL), was proposed to generalize cellular sheaves [32, 22] for the multiscale analysis of point cloud data with attached labeled information [40]. PSL is also a topological Laplacian that carries topological information in its null space and tracks homotopic shape evolution during filtration. It is worth mentioning that eigenvectors computed from Hodge Laplacians defined on manifolds [42] are sharply different from those computed from combinatorial Laplacians defined on simplicial complexes [39]. The minor similarity and fundamental difference of these Laplacians were discussed in the literature [31].
Another interesting development is the persistent Dirac Laplacian (PDL) by Ameneyro, Maroulas, and Siopsis [1]. PDL offers an efficient quantum computation of persistent Betti numbers across different scales. The above-mentioned approaches have great potential to deal with complex data in science and engineering.
Note that the aforementioned homologies and topological Laplacians are insensitive to asymmetric or directed relations, which limits their representational power in encoding structures that carry directional information. For example, in gene regulation data, the directions of gene regulations are indicated by arrowheads or perpendicular edges in systems biology [25]. Therefore, a technique that can deal with directed graphs (digraphs) is of vital importance to inferring gene regulation relationships. Notably, the path homology [16] proposed by Grigor’yan, Lin, Muranov, and Yau provides a powerful tool to analyze datasets with asymmetric structures using the path complex. Particular cases of homologies of digraphs and their path cohomology were also discussed [16, 18, 20]. The notion of path homology of digraphs has a richer mathematical structure than the earlier homology and Laplacian, opening new directions for both pure and applied mathematics. For example, path homology theory was extended to various objects such as quivers, multigraphs, digraph pairs, cylinders, cones, and hypergraphs [21, 15, 14]. Path homology has drawn much attention from researchers in the TDA community. To encode richer information, Chowdhury and Mémoli extended path homology to a persistent framework on a directed network [6]. Wang, Ren, and Wu constructed a weighted path homology for weighted digraphs and proved a persistent version of a Künneth-type formula for joins of weighted digraphs [36]. Recently, Dey, Li, and Wang designed an efficient algorithm for 1-dimensional persistent path homology [7], which is useful in real applications.
Similar to persistent homology, persistent path homology cannot track the homotopic shape evolution of data during filtration. To overcome this limitation, we introduce the path Laplacian as a new topological Laplacian to analyze the spectral geometry of data, in addition to its topology. Moreover, we introduce a filtration to the path Laplacian to obtain the persistent path Laplacian (PPL), a new framework that captures both the topological persistence and shape evolution of directed graphs and networks. By varying the filtration parameter, one can construct a series of digraphs, which results in a family of persistent path Laplacian matrices. The harmonic spectra of the persistent path Laplacian recover all the topological invariants of the digraphs, while the non-harmonic spectra provide additional geometric information, which can distinguish two systems when they are homotopy equivalent but geometrically different. PPL has potential applications in science, engineering, industry, and technology. This work is organized as follows: Section 2 reviews the necessary background on path homology. Section 3 describes the path Laplacian and the persistent path Laplacian. Detailed PPL matrix constructions are illustrated with various examples for interested readers in Section 3 and Section 4.
2. Background on path homology and directed graph.
Graph structures offer a powerful and versatile data representation that encodes inter-dependencies among constituents, and their study has been driven by widespread applications in various fields such as graph theory, topological data analysis, science, and engineering. In this section, we first recap basic concepts in path homology, including paths on a finite set, the boundary operator on the path complex, and homologies of path complexes. Then, we briefly review the concept of directed graphs (digraphs) and discuss path homologies on directed graphs without self-loops. Such concepts and notations, due to Yau and coworkers, form a basis for us to introduce the path Laplacian and the persistent path Laplacian in Section 3.
2.1. Paths on a finite set.
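Let $V$ be an arbitrary nonempty finite set, whose elements are called vertices. For $p \in \mathbb{Z}_{\ge 0}$ (i.e., an integer $p \ge 0$), an elementary $p$-path on $V$ is any sequence $i_0 i_1 \cdots i_p$ of $p+1$ vertices in $V$. For $p = -1$, the elementary $p$-path is the empty set $\emptyset$. For a fixed field $\mathbb{K}$, the vector space that consists of all formal linear combinations of elementary $p$-paths with coefficients in $\mathbb{K}$ is called the $\mathbb{K}$-linear space generated by the elementary $p$-paths, denoted as $\Lambda_p = \Lambda_p(V)$. The elements of $\Lambda_p$ are called $p$-paths on $V$, and an elementary $p$-path is denoted by $e_{i_0 \cdots i_p}$. By definition, every $v \in \Lambda_p$ has a unique representation in the basis of elementary $p$-paths:
$$v = \sum_{i_0, \dots, i_p \in V} v^{i_0 \cdots i_p} \, e_{i_0 \cdots i_p}, \tag{1}$$
where $v^{i_0 \cdots i_p} \in \mathbb{K}$ is the coefficient of $e_{i_0 \cdots i_p}$. For instance, $\Lambda_0$ contains all linear combinations of $e_i$ with $i \in V$, $\Lambda_1$ contains all linear combinations of $e_{ij}$ with $i, j \in V$, and so on. Since $\Lambda_{-1}$ consists of all multiples of the empty path $e$, one has $\Lambda_{-1} \cong \mathbb{K}$.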
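Additionally, for any $p \ge 0$, the linear boundary operator from $\Lambda_p$ to $\Lambda_{p-1}$ is defined by its action on elementary paths:
$$\partial : \Lambda_p \to \Lambda_{p-1} \tag{2}$$
with
$$\partial e_{i_0 \cdots i_p} = \sum_{q=0}^{p} (-1)^q \, e_{i_0 \cdots \widehat{i_q} \cdots i_p}, \tag{3}$$
where $\widehat{i_q}$ denotes the omission of index $i_q$ from the elementary $p$-path $e_{i_0 \cdots i_p}$. One sets $\Lambda_{-2} = \{0\}$ and, for $p = -1$, defines $\partial : \Lambda_{-1} \to \Lambda_{-2}$ to be the zero map. Following Lemma 2.1 in [19], one has $\partial^2 = 0$, which indicates that the boundary operators and the spaces $\Lambda_p$ form a chain complex of $V$, denoted as $\Lambda_*(V)$:
$$\cdots \xrightarrow{\partial} \Lambda_p \xrightarrow{\partial} \Lambda_{p-1} \xrightarrow{\partial} \cdots \xrightarrow{\partial} \Lambda_0 \xrightarrow{\partial} \mathbb{K} \xrightarrow{\partial} 0. \tag{4}$$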
Next, the concepts of regular path and non-regular path are introduced according to [19]. An elementary path on a set is regular if , and non-regular if for . For any , let be the subspace of spanned by all regular elementary paths, and be the subspace of spanned by all non-regular elementary paths. Therefore, one has
Note that for integers .
Then , . Therefore,
According to Section 2.4 in [19], the boundary operator is well-defined on the quotient space . Moreover, and the product rules are satisfied in the quotient space as well. One has an induced regular boundary operator:
(5) |
where the regular boundary operator satisfies (3) except that all non-regular terms on the right hand side should be treated as 0. Then a chain complex of , denoted as and equipped with , can be expressed as:
(6) |
It can be verified that is an isomorphism of chain complexes [18]. In the following sections, for simplicity, we use to denote the boundary operator of Eq. (6) unless specified differently.
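As an illustration of the boundary operator in Eq. (3) and of its regular variant in Eq. (5), the following Python sketch represents a path as a dictionary that maps elementary paths (tuples of vertices) to coefficients. The names `boundary`, `is_regular`, and the `regular` flag are hypothetical choices made for this example rather than notation from [19], and integer coefficients are assumed for simplicity.

```python
from collections import defaultdict
from itertools import product

def is_regular(path):
    """An elementary path is regular if no two consecutive vertices coincide."""
    return all(a != b for a, b in zip(path, path[1:]))

def boundary(chain, regular=False):
    """Boundary of a chain given as {elementary path (tuple): coefficient}.

    Each elementary path i_0...i_p is sent to the alternating sum of the
    paths obtained by omitting one index. With regular=True, non-regular
    terms on the right-hand side are treated as 0.
    """
    out = defaultdict(int)
    for path, coeff in chain.items():
        for q in range(len(path)):
            face = path[:q] + path[q + 1:]
            if regular and not is_regular(face):
                continue
            out[face] += (-1) ** q * coeff
    # Drop terms whose coefficients cancelled
    return {p: c for p, c in out.items() if c != 0}

# Apply the boundary twice to every elementary 3-path over {1, 2, 3}
vertices = (1, 2, 3)
all_zero = all(boundary(boundary({p: 1})) == {}
               for p in product(vertices, repeat=4))
regular_zero = all(
    boundary(boundary({p: 1}, regular=True), regular=True) == {}
    for p in product(vertices, repeat=4) if is_regular(p)
)
print(all_zero, regular_zero)
```

Both checks exercise the identity $\partial^2 = 0$: the first on all elementary 3-paths, the second on regular 3-paths with the regular boundary operator.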
2.2. Path complex.
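A path complex over a set $V$ is a nonempty collection $P$ of elementary paths on $V$ such that, for any $n \ge 0$,
$$\text{if } i_0 \cdots i_n \in P, \text{ then } i_0 \cdots i_{n-1} \in P \text{ and } i_1 \cdots i_n \in P. \tag{7}$$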
For a fixed path complex, all the paths from are called allowed (i.e. for any ), while the elementary paths on that are not in are non-allowed. We say a path complex is perfect if any subsequence of any path from is also in . We choose to denote all -paths from . Then the set has a single empty path , the set consists of all the vertices of , and clearly, . To be noted, a path complex is a collection satisfying Eq. (7). Let be an abstract simplicial complex defined over a finite vertex set , satisfying
The collection of elementary paths on is denoted by . Following Ref. [19] (cf. Example 3.2), the family is a path complex.
2.3. Path homology.
For any , the -linear space is spanned by all the elementary -paths from a given path complex over a finite set , i.e.,
We call the elements of the allowed -paths. By the definition of , , and for . It is natural that the boundary operator defined on can be introduced to under certain condition: . For example, for perfect path complexes, we can obtain a chain complex:
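Next, we consider a general path complex $P$ (i.e., $\partial \mathcal{A}_n$ does not have to be a subspace of $\mathcal{A}_{n-1}$, where $\mathcal{A}_n$ denotes the space of allowed $n$-paths). For any $n \ge -1$, we define a subspace of $\mathcal{A}_n$:
$$\Omega_n = \Omega_n(P) = \{ v \in \mathcal{A}_n : \partial v \in \mathcal{A}_{n-1} \}. \tag{8}$$
The elements of $\Omega_n$ are called $\partial$-invariant $n$-paths. To be noted, $\partial \Omega_n \subset \Omega_{n-1}$ always holds. Moreover, $\partial^2 = 0$ has been established in the previous section. Therefore, the augmented chain complex of $\partial$-invariant paths can be denoted as
$$\cdots \xrightarrow{\partial} \Omega_n \xrightarrow{\partial} \Omega_{n-1} \xrightarrow{\partial} \cdots \xrightarrow{\partial} \Omega_0 \xrightarrow{\partial} \mathbb{K} \xrightarrow{\partial} 0. \tag{9}$$
The homology groups of the chain complex in Eq. (9) are called the reduced path homology groups of the path complex $P$, defined for $n \ge -1$. The truncated version of the chain complex in Eq. (9), for $n \ge 0$, is:
$$\cdots \xrightarrow{\partial} \Omega_n \xrightarrow{\partial} \Omega_{n-1} \xrightarrow{\partial} \cdots \xrightarrow{\partial} \Omega_0 \xrightarrow{\partial} 0. \tag{10}$$
The homology groups of the chain complex in Eq. (10) are called the path homology groups of the path complex $P$.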
2.4. Path homology of directed graphs.
A directed graph $G$ is an ordered pair $(V, E)$, where $V$ is the set of all vertices and $E \subseteq V \times V$ is a set of ordered pairs of vertices (i.e., directed edges). If $G$ contains neither loops nor multiple edges, it is called a simple directed graph. For the path homology of multigraphs or quivers, one can refer to Ref. [3]. In the remainder of this work, $G$ denotes a simple directed graph unless specified otherwise.
The path complex is regular if is a simple directed graph. In this section, we mainly discuss the regular spaces and their associated regular homology groups . Similar to the discussion in subsection 2.3, given a simple digraph , for any , the space of -invariant -paths on is defined by the subspace of :
with and . Since (as ), then we have the following chain complex of denoted as ,
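and the associated $n$-dimensional path homology groups of $G$ are defined as:
$$H_n(G) = \ker \partial|_{\Omega_n} \big/ \operatorname{Im} \partial|_{\Omega_{n+1}}, \tag{11}$$
where $\Omega_n = \Omega_n(G)$ is the space of $\partial$-invariant $n$-paths on $G$. To be noted, the elements of $\ker \partial|_{\Omega_n}$ are called $n$-cycles, and the elements of $\operatorname{Im} \partial|_{\Omega_{n+1}}$ are referred to as $n$-boundaries. For simplicity, we define $\partial_n := \partial|_{\Omega_n}$, and the chain complex of $\partial$-invariant paths is written as
$$\cdots \xrightarrow{\partial_{n+1}} \Omega_n \xrightarrow{\partial_n} \Omega_{n-1} \xrightarrow{\partial_{n-1}} \cdots \xrightarrow{\partial_1} \Omega_0 \xrightarrow{\partial_0} 0.$$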
Notably, the path cohomology, introduced in Refs. [18, 12], is isomorphic to the dual space of path homology when the coefficient ring is a field. The associated $n$-dimensional path cohomology groups of digraphs are defined as:
(12) |
where is called coboundary operator.
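The definitions of $\partial$-invariant paths and path homology can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: it works over the reals with NumPy, assumes a simple digraph (no self-loops or multiple edges), and uses hypothetical function names (`allowed_paths`, `omega_dim_and_rank`, `path_betti`). It computes $\Omega_n$ as the null space of the part of the boundary that falls on non-allowed paths, and then obtains Betti numbers from $\beta_n = \dim \Omega_n - \operatorname{rank} \partial_n - \operatorname{rank} \partial_{n+1}$.

```python
import numpy as np

def allowed_paths(vertices, edges, n):
    """Allowed elementary n-paths: vertex sequences that follow directed edges."""
    edge_set = set(edges)
    paths = [(v,) for v in vertices]
    for _ in range(n):
        paths = [p + (w,) for p in paths for w in vertices if (p[-1], w) in edge_set]
    return paths

def omega_dim_and_rank(vertices, edges, n, tol=1e-10):
    """Return (dim Omega_n, rank of the boundary operator restricted to Omega_n)."""
    edge_set = set(edges)
    paths = allowed_paths(vertices, edges, n)
    if n == 0 or not paths:
        return len(paths), 0      # boundary of a vertex is 0 in the truncated complex
    rows, entries = {}, []
    for col, p in enumerate(paths):
        for q in range(n + 1):
            face = p[:q] + p[q + 1:]
            if any(a == b for a, b in zip(face, face[1:])):
                continue          # non-regular terms are treated as 0
            entries.append((rows.setdefault(face, len(rows)), col, (-1) ** q))
    D = np.zeros((len(rows), len(paths)))
    for r, c, s in entries:
        D[r, c] += s
    # A face is allowed if each consecutive pair of vertices is a directed edge
    allowed = np.array([all(e in edge_set for e in zip(f, f[1:])) for f in rows])
    bad = D[~allowed]             # boundary components on non-allowed paths
    if bad.shape[0] == 0:
        basis = np.eye(len(paths))
    else:
        _, s, vt = np.linalg.svd(bad)
        basis = vt[int((s > tol).sum()):].T   # orthonormal basis of the null space
    if basis.shape[1] == 0:
        return 0, 0
    restricted = D[allowed] @ basis
    return basis.shape[1], int(np.linalg.matrix_rank(restricted))

def path_betti(vertices, edges, max_n):
    """Betti numbers beta_0, ..., beta_max_n of the truncated chain complex."""
    data = [omega_dim_and_rank(vertices, edges, n) for n in range(max_n + 2)]
    return [data[n][0] - data[n][1] - data[n + 1][1] for n in range(max_n + 1)]

print(path_betti([1, 2, 3], [(1, 2), (2, 3), (3, 1)], 2))   # directed 3-cycle
print(path_betti([1, 2, 3], [(1, 2), (2, 3), (1, 3)], 2))   # transitive triangle
```

The two test digraphs are hypothetical examples chosen for this sketch: the directed 3-cycle keeps a 1-dimensional cycle, while in the transitive triangle the 2-path $e_{123}$ is $\partial$-invariant and fills it.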
Consider two simple digraphs $G = (V, E)$ and $G' = (V', E')$. According to Definition 2.2 in [17], a morphism of digraphs (or digraph map) from $G$ to $G'$ is a map $f : V \to V'$ such that, for any directed edge $a \to b$ in $G$, either $f(a) \to f(b)$ is a directed edge in $G'$ or $f(a) = f(b)$.
Let be a digraph map from to . For , one defines a map such that:
(13) |
Assume and are the boundary operators of chain complexes and , then for , one has
(14) |
(15) |
(16) |
Hence is a chain map. By the definition of digraph map, maps non-regular elementary -paths on to non-regular elementary -paths on . Therefore, one has , and then descends to a quotient homomorphism of chain complexes:
(17) |
It can be verified that is an isomorphism of chain complexes [18], then the map in (17) induces a morphism of chain complexes:
(18) |
Since maps non-regular paths to non-regular, then similarly to what Eq. (14) shows, is also a chain map that follows:
(19) |
Following Theorem 2.10 in [17], the induced map induces a morphism of chain complexes:
(20) |
and consequently induces a homomorphism between the path homology groups:
(21) |
2.5. Homologies of directed subgraphs.
Some interesting propositions on the homologies of subgraphs provide a way to simplify complicated digraphs to relatively simple ones. Following Section 4.2 [19], three propositions are discussed.
Proposition 2.1. Given a simple digraph that has a vertex with outgoing arrows . Note that does not have any incoming arrows. Assume that for all , one has . Let be the subgraph of obtained by removing the vertex together with all adjacent edges (i.e. and ). Then, one has (see Figure 1a).
Figure 1.
Homologies of directed subgraphs. a, b, and c illustrate three subgraphs whose homology groups or homology group dimensions are related to the original digraphs.
Proposition 2.2. Given a simple digraph that has a vertex with incoming arrows . Note that does not have any outgoing arrows. Assume that for all , one has . Let be the subgraph of obtained by removing the vertex together with all adjacent edges (i.e. and ). Then, one has (see Figure 1b).
Proposition 2.3. Given a simple digraph that has a vertex with only one outgoing arrow and only one incoming arrow , where . Let be the subgraph of (see Figure 1c) obtained by removing the vertex and the adjacent edges and (i.e. and ). Then
for or for if is an edge/semi-edge in .
If is neither an edge nor a semi-edge in , but and are in the same connected component of , then , and .
If and are not in the same connected component of , then and .
3. Path Laplacian and persistent path Laplacian.
One can extract topological invariants by introducing the persistent Betti numbers from the homology groups along the filtration of a simplicial complex [43]. However, persistent Betti numbers do not capture homotopic geometric changes during filtration. Therefore, persistent topological Laplacians, including the persistent Laplacian [37, 38] (persistent spectral graph) and the persistent Hodge Laplacian [5], were introduced to reveal additional geometric information. Similarly, the constructions of the path Laplacian and the persistent path Laplacian are motivated by the earlier persistent spectral graphs [37, 38]. In this section, we first discuss the construction of the path Laplacian. Then, we introduce a filtration to the path complex to generate a series of digraphs, which gives rise to the persistent path Laplacian.
3.1. Path Laplacian.
Recall that a chain complex of -invariant paths is given by
where and . Alternatively, assume to be the set of -th elementary paths in , then we define an inner product
such that for any , the following satisfies
(22) |
Let be a matrix representation of with respect to the standard basis of and . Define an inclusion map , then the matrix representation of with respect to the basis of (i.e., the standard basis of with the removal of generators that are not in ) and the standard basis of is denoted as . Denote the boundary matrix representation of as , then we have
(23) |
If is a square matrix, then is actually an identity matrix, and we have
(24) |
where is with the removal of rows that their basis are not elementary -paths in . Otherwise, is the least-square solution to Eq. (23).
Note that is the matrix representation of with respect to the basis of and . Dual space of is equipped with dual maps to form a cochain complex
where is called a coboundary operator. The inner product on induces an inner product on such that
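We denote the adjoint operator of $\partial_n$ by $\partial_n^*$. Note that a similar inner product was defined in the literature [23]. Hence, the coboundary operator is consistent with the adjoint operator $\partial_n^*$. Then, for integers $n \ge 1$, the $n$-th path Laplacian operator is a linear operator $\Delta_n : \Omega_n \to \Omega_n$ given by
$$\Delta_n = \partial_{n+1} \partial_{n+1}^* + \partial_n^* \partial_n, \tag{25}$$
and $\Delta_0 = \partial_1 \partial_1^*$. The $n$-th path Laplacian matrix corresponding to $\Delta_n$ is expressed by
$$L_n = B_{n+1} B_{n+1}^T + B_n^T B_n. \tag{26}$$
Since $L_n$ is positive semi-definite and symmetric, its eigenvalues are all real and non-negative. Additionally, recall that the Betti number of the path complex satisfies
$$\beta_n = \dim \ker \partial_n - \dim \operatorname{Im} \partial_{n+1}. \tag{27}$$
It is easy to show that
$$\beta_n = \dim \ker L_n = \operatorname{nullity}(L_n), \tag{28}$$
i.e., the number of zero eigenvalues of $L_n$.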
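For completeness, one standard way to see why the Betti number equals the nullity of the Laplacian is sketched below. The notation is an assumption of this sketch: $\Delta_n$ stands for the $n$-th path Laplacian operator, $\partial_n$ for the boundary operator on $\Omega_n$, $\partial_n^*$ for its adjoint, and $\langle \cdot, \cdot \rangle$ for the inner product on $\Omega_n$, which is finite-dimensional.

```latex
\begin{align*}
\langle \Delta_n x, x \rangle
  &= \langle \partial_{n+1}\partial_{n+1}^{*} x, x \rangle
   + \langle \partial_n^{*}\partial_n x, x \rangle
   = \|\partial_{n+1}^{*} x\|^2 + \|\partial_n x\|^2, \\
\ker \Delta_n
  &= \ker \partial_n \cap \ker \partial_{n+1}^{*}
   = \ker \partial_n \cap \left(\operatorname{Im} \partial_{n+1}\right)^{\perp}, \\
\dim \ker \Delta_n
  &= \dim \ker \partial_n - \dim \operatorname{Im} \partial_{n+1}
   = \beta_n .
\end{align*}
```

The last step uses $\operatorname{Im} \partial_{n+1} \subset \ker \partial_n$, which follows from $\partial^2 = 0$, so the orthogonal complement is taken inside $\ker \partial_n$.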
Moreover, assume the dimension of is , then the set of spectra of is denoted as
Figure 2 shows six digraphs with multiple vertices and directed edges. Here, we take them as examples to give a detailed illustration of matrix constructions.
Figure 2.
Six digraphs. a and b Digraphs with 3 vertices and 3 directed edges. c and d Digraphs with 4 vertices and 4 directed edges. e A digraph with 6 vertices and 8 directed edges. f A digraph with 6 vertices and 8 directed edges.
Construction of − Figure 2a Since , then we first construct , where according to Eq. (24), we have , , and . Since are all elementary 0-paths (vertices), . We have . Then , which gives Spectra and thus, one finally has .
Construction of − Figure 2a We have , where has been formed, so we focus on the construction of according to Eq. (24). Since , , and is an empty matrix since . Therefore, is an empty matrix. Additionally, , where and thus, one finally has .
Construction of − Figure 2a We have , where is an empty matrix. Hence, we focus on the construction of according to Eq. (24). We have and . Note that where is not in . Hence, is not in . The same conclusion can be deduced for and . Therefore, we have , and it is straightforward to get that is an empty matrix.
Construction of − Figure 2b Since , then we should first construct , where according to Eq. (24). Since , , and . Since , and are all elementary 0-paths (vertices). Therefore, , and we have . Then , which gives the and thus, one finally has .
Construction of − Figure 2b We have , where has been formed, so we focus on the construction of according to Eq. (24). First, and . Note that where , and are all in . Hence, . Note that , , and . The , and are not elementary 1-paths in . Hence, , and then . Therefore, , where and thus, we finally have .
Construction of − Figure 2b According to Eq. (26), we have and . Since no 3-path exists, and are both empty matrices. Hence , and thus, one has .
In the following, we omit the detailed construction steps of the boundary matrix . Table 1, Table 2, Table 3, and Table 4 list the boundary matrix and the -th path Laplacian matrix for , together with the corresponding Betti numbers and spectra, for Figure 2 c, d, e, and f. It is worth mentioning that can distinguish the same graph with different paths assigned. For example, Figure 2 c and d have the same undirected graph structure with different paths assigned. We have for Figure 2 c and for Figure 2 d.
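The matrix construction of Eq. (26) can be checked numerically on a small case. The sketch below assumes a transitive triangle with directed edges 1→2, 2→3, and 1→3, a hypothetical example that is not claimed to coincide with any panel of Figure 2. The boundary matrices are entered by hand in the elementary-path bases, which are orthonormal here, and `path_laplacian` and `spectrum` are names introduced only for this illustration.

```python
import numpy as np

def path_laplacian(B_n, B_next):
    """L_n = B_{n+1} B_{n+1}^T + B_n^T B_n for boundary matrices in orthonormal bases."""
    return B_next @ B_next.T + B_n.T @ B_n

def spectrum(L):
    """Eigenvalues of a symmetric matrix in ascending order, rounded for display."""
    return np.round(np.linalg.eigvalsh(L), 6)

# Transitive triangle 1->2, 2->3, 1->3 (hypothetical example)
# B1: rows e1, e2, e3; columns e12, e23, e13 (boundary of e_ij is e_j - e_i)
B1 = np.array([[-1.0,  0.0, -1.0],
               [ 1.0, -1.0,  0.0],
               [ 0.0,  1.0,  1.0]])
# B2: rows e12, e23, e13; column e123 (boundary is e23 - e13 + e12)
B2 = np.array([[1.0], [1.0], [-1.0]])
B0 = np.zeros((0, 3))   # boundary of vertices is 0 in the truncated complex
B3 = np.zeros((1, 0))   # there are no allowed 3-paths

L0 = path_laplacian(B0, B1)
L1 = path_laplacian(B1, B2)
L2 = path_laplacian(B2, B3)
for n, L in enumerate((L0, L1, L2)):
    eig = spectrum(L)
    print(n, eig, "beta =", int(np.sum(np.abs(eig) < 1e-6)))
```

Empty matrices are represented by arrays with a zero-length axis, so the same formula covers the lowest and highest dimensions without special cases. In this example the number of zero eigenvalues gives Betti numbers 1, 0, and 0 for $n = 0, 1, 2$.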
Table 1.
Illustration of digraph c in Figure 2
$n$ | 0 | 1 | 2 |
$B_{n+1}$ | | | empty matrix |
$L_n$ | | | (4) |
$\beta_n$ | 1 | 0 | 0 |
Spectra of $L_n$ | {0, 2, 2, 4} | {2, 2, 4, 4} | {4} |
Table 2.
Illustration of digraph d in Figure 2
$n$ | 0 | 1 | 2 |
$\Omega_n$ | | | {0} |
$B_{n+1}$ | | empty matrix | / |
$L_n$ | | | / |
$\beta_n$ | 1 | 1 | 0 |
Spectra of $L_n$ | {0, 2, 2, 4} | {0, 2, 4, 4} | / |
Table 3.
Illustration of digraph e in Figure 2.
$n$ | 0 | 1 | 2 |
$B_{n+1}$ | | | empty matrix |
$\beta_n$ | 1 | 1 | 0 |
Spectra of $L_n$ | {0, 1.4384, 3, 3, 3, 5} | {0, 1.4384, 2, 3, 3, 3, 5.5616, 6} | {2, 6} |
Table 4.
Illustration of digraph f in Figure 2.
$n$ | 0 | 1 | 2 |
$B_{n+1}$ | | | empty matrix |
$\beta_n$ | 1 | 0 | 1 |
Spectra of $L_n$ | {0, 2, 2, 2, 4, 6} | {2, 2, 2, 4, 4, 4, 6, 8} | {0, 4, 4, 8} |
3.2. Persistent path Laplacian.
From Section 3.1, the way to calculate both harmonic spectra (topological invariants) and non-harmonic spectra of -th path Laplacian matrix is genuinely free of metrics or coordinates, which contains too little information to fully describe the object. Therefore, inspired by the idea of the persistent spectral graph (PSG), persistent path Laplacian (PPL) is proposed to create a sequence of digraphs induced by varying a filtration parameter to encode more geometric or structural information.
First, we consider a filtration of digraphs , which is a morphism from the category of real numbers to the category of digraphs that satisfies:
where and . Consider a sequence of finitely many positive integers , we have a sequence of digraphs
For each digraph , we denote its corresponding chain group to be , and the -boundary operator of is denoted by .
Similarly, as in persistent homology, a sequence of chain complexes can be denoted as
(29) |
For the sake of simplicity, we use to represent . Suppose a subset of whose boundary is in as:
(30) |
The persistent -boundary operator is denoted as , and its corresponding adjoint operator is . Therefore, the persistent -th path Laplacian operator defined along the filtration is:
(31) |
Since inherits the inner product from , the adjoint map is well defined. Intuitively, the matrix representation of is
(32) |
where is the associated inner product matrix of with arbitrary basis. Moreover, assume the dimension of is , then the spectra of that are arranged in ascending order can be displayed as:
Note that the smallest nonzero eigenvalue of is denoted as . We call the multiplicity of the zero eigenvalue of the persistent -th Betti number from to :
(33) |
Distance-based filtration
Specifically, suppose is a weighted digraph, where is the set of the vertices and is the set of the directed edges. Assume is a weight function . For example, if is in the Euclidean space, then a digraph is a geometric digraph (a geometric digraph is a digraph in which the vertices are embedded as points in the Euclidean space, and the edges are embedded as non-crossing directed line segments). For any where , we define , where is a Euclidean metric. Hence, for every , a digraph can be described as , and a filtration of digraphs can be described as .
Therefore, the persistent -th path Laplacian matrix defined on the filtration is
(34) |
where its corresponding Betti numbers and spectra can be expressed as:
(35) |
(36) |
Notably, the Fiedler value (i.e., spectral gap) of is widely used in many other areas such as physics and geography, which is denoted as . As shown below, it is sensitive to both topological and geometric changes.
Moreover, it is worth mentioning that isolated points (vertices) can be either included in the digraphs (under the distance-based filtration) or removed from the digraphs (under the distance-based filtration with removal of isolated points).
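The distance-based filtration described above can be sketched in a few lines. This example rests on several assumptions: vertices are given as a dictionary of Euclidean coordinates, the digraph is simple, and only the case $n = 0$ at a single filtration value is treated (where $\Omega_1$ is spanned by the directed edges, so the 0-th path Laplacian is $B_1 B_1^T$). The function names and the `drop_isolated` flag are hypothetical.

```python
import numpy as np

def filtered_edges(points, edges, d):
    """Directed edges (u, v) whose Euclidean length is at most d."""
    return [(u, v) for u, v in edges
            if np.linalg.norm(np.asarray(points[u], float) - np.asarray(points[v], float)) <= d]

def laplacian0(vertices, edges):
    """0-th path Laplacian B_1 B_1^T of a simple digraph."""
    index = {v: i for i, v in enumerate(vertices)}
    B1 = np.zeros((len(vertices), len(edges)))
    for col, (u, v) in enumerate(edges):
        B1[index[u], col] = -1.0   # boundary of e_uv is e_v - e_u
        B1[index[v], col] = 1.0
    return B1 @ B1.T

def betti0_and_gap(points, edges, d, drop_isolated=False, tol=1e-9):
    """Return (beta_0, smallest nonzero eigenvalue or None) at filtration value d."""
    kept = filtered_edges(points, edges, d)
    vertices = sorted(points)
    if drop_isolated:
        used = {v for e in kept for v in e}
        vertices = [v for v in vertices if v in used]
    if not vertices:
        return 0, None
    eig = np.linalg.eigvalsh(laplacian0(vertices, kept))
    nonzero = eig[eig > tol]
    gap = float(nonzero[0]) if nonzero.size else None
    return int((eig <= tol).sum()), gap

# Hypothetical geometric digraph on three points
points = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (1.0, 1.0)}
edges = [(1, 2), (2, 3), (1, 3)]
for d in (0.5, 1.0, 1.5):
    print(d, betti0_and_gap(points, edges, d), betti0_and_gap(points, edges, d, drop_isolated=True))
```

With `drop_isolated=True`, vertices that have no incident edge at the current filtration value are removed before the Laplacian is formed, which changes the count of zero eigenvalues at small $d$ but not the nonzero spectrum.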
One can get both topological information (revealed by Betti numbers) and geometric information (revealed by non-harmonic spectra) from digraphs along the filtration. For instance, Figure 3 illustrates the filtration on two tetrahedrons. The top panel is a tetrahedron (Tetra 1) with edge lengths , and . The bottom panel is another tetrahedron (Tetra 2) with edge lengths , and , and . We say , and . Figure 4 shows the changes of and of the persistent -th path Laplacian along the filtration. It can be seen that, as the filtration parameter varies from 0 to 1, the Betti numbers in dimensions 1 and 2 are always 0. However, the smallest nonzero eigenvalues of Tetra 1 and Tetra 2 change along the filtration parameter . Additionally, when , the can distinguish Tetra 1 and Tetra 2, while cannot. This indicates that the non-harmonic spectra of the persistent path Laplacian can reveal more geometric information than the persistent Betti numbers in distinguishing similar topological structures. Notably, we remove all the isolated points from each digraph for simplicity of calculation.
Figure 3.
Illustration of filtration on a tetrahedron. Here, 1,2,3, and 4 represent four elementary 0-paths , and . The top panel is a tetrahedron that has edge lengths and . The bottom panel is a tetrahedron that has edge lengths , , , and .
Figure 4.
Comparison of Betti numbers and non-harmonic spectra of when on tetrahedrons Tetra 1 and Tetra 2. Note that since and for Tetra 1 and Tetra 2, topological invariants from persistent path homology cannot discriminate Tetra 1 and Tetra 2. However, and show the differences between Tetra 1 and Tetra 2.
Moreover, a more complicated example is illustrated in Figure 5 to describe the filtration on two pyramids. The top panel is a pyramid (Pyra 1) with edge lengths , and . The bottom panel is a pyramid (Pyra 2) with edge lengths , and , and . We say , and . Figure 6 depicts the changes of and of the persistent -th path Laplacian for Pyra 1 and Pyra 2 along the filtration. For Pyra 1 and Pyra 2, when and , their corresponding digraphs form, which results in and for both Pyra 1 and Pyra 2. When , we have for Pyra 1 due to the introduction of a new directed edge . When , we have for Pyra 2 since the introduction of a new directed edge kills the 1-cycle formed by , and . Furthermore, although Pyra 1 and Pyra 2 do not have exactly the same geometric structure, they share the same value from to . However, Pyra 1 and Pyra 2 can be distinguished by the along the filtration. Therefore, similar to the PSG, one can use the non-harmonic spectra from the persistent path Laplacian to reveal the intrinsic geometric information of a given point-cloud dataset by varying the filtration parameter. In addition, the detailed calculations of can be found in the Appendix.
Figure 5.
Illustration of filtration on a pyramid. Here, 1, 2, 3, 4, and 5 represent five elementary 0-paths , and . The top panel is a pyramid that has edge lengths , , and . The bottom panel is a pyramid that has edge lengths , , and .
Figure 6.
Comparison of Betti number and non-harmonic spectra of when and 2 on pyramids Pyra 1 and Pyra 2. Note that since , it cannot distinguish Pyra 1 and Pyra 2. But can tell the difference.
4. Application.
In this section, we apply the persistent path Laplacian to the analysis of the cucurbit[n]uril system. Cucurbiturils are macrocyclic molecules made of glycoluril (=C6H2N4O2=) monomers linked by methylene bridges (−CH2−). CBn is commonly used as an abbreviation of cucurbit[n]uril. Here, n is the number of glycoluril units. In this work, we consider CB7 as an example. The molecular formula of CB7 is C42H14N28O14. The molecular structure of CB7 is obtained from the Supporting Information of Ref. [11].
Figure 7 illustrates how PPL is employed for a molecular system to extract its rich topological and geometric information. The first two charts of Figure 7a describe the three-dimensional (3D) top view and side view of CB7. The green, blue, red, and gray colors represent C,N,O, and H atoms, respectively. The third chart of Figure 7a is a basic “Octagon-pentagon” unit that consists of two glycolurils. It can be seen that 7 glycolurils exist in CB7. The last chart of Figure 7a demonstrates the path direction assignment to pairs of atoms based on atomic electronegativity. The periodic table of electronegativity is given by the Pauling scale [30], in which the electronegativities of C,N,O, and H are 2.55, 3.04, 3.44, and 2.20, respectively. Then, we set the directions of edges following the order “H → C → N → O”.
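The electronegativity-based direction assignment can be sketched as follows. The Pauling values are those quoted above; the function name `orient_edges`, the input format (a list of element symbols and a list of undirected index pairs), and the decision to skip pairs of identical elements are assumptions of this sketch, since the treatment of same-element pairs is not specified here.

```python
# Pauling electronegativities of the four elements in CB7, as quoted in the text
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44}

def orient_edges(elements, pairs):
    """Orient each undirected pair from the less to the more electronegative atom.

    elements: list of element symbols indexed by atom.
    pairs: iterable of (i, j) atom-index pairs.
    Pairs of identical elements are skipped (an assumption of this sketch).
    """
    directed = []
    for i, j in pairs:
        xi, xj = PAULING[elements[i]], PAULING[elements[j]]
        if xi < xj:
            directed.append((i, j))
        elif xj < xi:
            directed.append((j, i))
    return directed

# Hypothetical five-atom fragment, not taken from the CB7 structure file
elements = ["H", "C", "N", "O", "C"]
pairs = [(0, 1), (2, 1), (3, 2), (1, 4)]
print(orient_edges(elements, pairs))
```

Orienting by increasing electronegativity reproduces the order “H → C → N → O” for every pair of distinct elements among these four.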
Figure 7.
a The 3D structures of CB7, two glycolurils, and path direction assignment. From left to right: the side view of CB7, the top view of CB7, the structure of two glycoluril units (=C10H4N8O4=), and the electronegativity-based path direction assignment. b Illustration of filtration-induced geometries of CB7. Eight digraphs are constructed under filtration parameter . c Illustration of filtration-induced path complexes within two glycoluril units. Path directions can be inferred from their colors as shown in the last chart of a. d Betti numbers and non-harmonic spectra of persistent path Laplacians (when , and 2) for CB7.
Figure 7b depicts the distance-based filtration of CB7. Here, structures were obtained at the filtration radii of 0.200, 0.565, 0.710, 0.745, 0.800, 1.210, 1.315, and 1.800Å, respectively. In our digraph notation, we denote these structures as , and . Note that, in the present formulation, all of the isolated points were removed from these digraphs.
Figure 7c illustrates the filtration-induced path complexes in the aforementioned . To clearly show the topological and geometric changes, only the path complexes in one “Octagon-pentagon” unit (or two glycolurils) are considered and depicted for each structure. For simplicity, only edges are presented. However, their path directions can be easily assigned based on their color map as shown in the last chart of Figure 7a.
Figure 7d depicts the PPL spectra of CB7. We can see that at the initial state (when ), a total of 98 atoms are isolated from one another. When radius , C atoms on each pentagon are connected with their neighboring H atoms. Therefore, four isolated components are formed in each glycoluril, which makes . At , C atoms on each pentagon are connected with their neighboring N and O atoms. At this stage, two more connected components are involved in one glycoluril structure, which makes . Only one connected structure is formed once all of the atoms are connected with their neighboring atoms. Therefore, (see ). Notably, the and provide rich topological and geometric information as the filtration parameter increases.
This example shows that PPL can decode topological persistence and the shape evolution of a given molecular system with chemical- or biological-based directional assignment. Specifically, can still offer geometric information when does not change for large radii. Therefore, PPL keeps revealing homotopic shape evolution when the topological invariant from persistent path homology does not change.
Additionally, unlike the persistent Laplacian, high-order PPL operators provide rich topological information. For instance, when the filtration parameter increases to , from PPL goes up dramatically. In contrast, in the persistent Laplacian, the value of Betti 2 is quite limited since the CB7 system can barely form 2-cycles at a similar filtration parameter using either the Rips complex or the alpha complex. This trait endows PPL with a better ability to characterize the geometry and topology of an object at large scales.
5. Conclusion.
Path homology, a rich mathematical concept introduced by Grigor’yan, Lin, Muranov, and Yau, has stimulated a variety of new developments in pure and applied mathematics, including much attention from the topological data analysis (TDA) community. Unlike original homology or persistent homology, path homology enables the treatment of directed graphs and networks. Persistent path homology bridges path homology with multiscale analysis, making it a powerful tool for practical applications. Nonetheless, these formulations are insensitive to homotopic shape evolution during filtration.
Topological Laplacians, including the Hodge Laplacian, graph Laplacian, sheaf Laplacian, and Dirac Laplacian, are versatile mathematical tools that not only preserve all topological invariants but also describe geometric shapes. This work introduces a new topological Laplacian, the persistent path Laplacian, as a tool for the multiscale analysis of directed graphs and networks. For a given dataset, the proposed persistent path Laplacian fully recovers the topological persistence of persistent path homology in its harmonic spectra while capturing the homotopic shape evolution of the data during filtration in its non-harmonic spectra.
Acknowledgments
This work was supported in part by NIH grants R01GM126189 and R01AI164266, NSF grants DMS-2052983, DMS-1761320, and IIS-1900473, NASA grant 80NSSC21M0023, Michigan Economic Development Corporation, MSU Foundation, Bristol-Myers Squibb 65109, and Pfizer.
Appendix.
In Tables 5–19, we present the detailed matrix constructions, Betti numbers, and spectra for various digraphs shown in Figure 5 top and bottom panels.
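As a rough illustration of how such a table entry can be produced, the sketch below builds a 0-th order Laplacian from a boundary matrix for a hypothetical directed 4-cycle, which is not claimed to be one of the graphs in Figure 5. It assumes the standard combinatorial construction, in which the 0-th Laplacian is the product of the first boundary matrix with its transpose and the boundary of a directed edge from i to j is "head minus tail". All names (`edges`, `B1`, `L0`, `matvec`) are illustrative. Plain Python lists are used so that the example has no dependencies.

```python
# Hypothetical digraph: directed 4-cycle 0 -> 1 -> 2 -> 3 -> 0.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
num_vertices = 4

# Boundary matrix B1 (rows: vertices, columns: edges).
# Assumed convention: the boundary of edge (i, j) is e_j - e_i.
B1 = [[0] * len(edges) for _ in range(num_vertices)]
for col, (tail, head) in enumerate(edges):
    B1[tail][col] = -1
    B1[head][col] = 1

# 0-th order Laplacian L0 = B1 * B1^T.
L0 = [
    [sum(B1[r][k] * B1[c][k] for k in range(len(edges))) for c in range(num_vertices)]
    for r in range(num_vertices)
]


def matvec(matrix, vector):
    """Multiply a list-of-lists matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


# Four mutually orthogonal eigenvectors, so the spectrum of L0 is {0, 2, 2, 4}.
eigenpairs = [
    (0, [1, 1, 1, 1]),
    (2, [1, 0, -1, 0]),
    (2, [0, 1, 0, -1]),
    (4, [1, -1, 1, -1]),
]
for eigenvalue, vector in eigenpairs:
    assert matvec(L0, vector) == [eigenvalue * x for x in vector]

for row in L0:
    print(row)
```

The single zero eigenvalue corresponds to one connected component. The multiset {0, 2, 2, 4} happens to coincide with the 0-th order spectra listed in Tables 14 and 16, although the graphs behind those tables are not reconstructed here.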
Table 5.
Matrix construction of graph (with isolated points included) in the top panel of Figure 5.
{0} | {0} | ||
empty matrix | / | / | |
zero matrix | / | / | |
5 | / | / | |
{0, 0, 0, 0, 0} | / | / |
Table 6.
Matrix construction of graph (without isolated points) in the top panel of Figure 5.
{0} | {0} | {0} | |
/ | / | / | |
/ | / | / | |
/ | / | / | |
/ | / | / |
Table 7.
Matrix construction of graph in the top panel of Figure 5.
{0} | |||
empty matrix | (/) | ||
(/) | |||
1 | 1 | 0 | |
{0, 0.8299, 2, 2.6889, 4.4812} | {0, 0.8299, 2, 2.6889, 4.4812} | / |
Table 8.
Matrix construction of graph in the top panel of Figure 5.
empty matrix | |||
1 | 1 | 0 | |
{0, 2, 3, 4, 5} | {0, 2, 2, 3, 4, 4, 5} | {2, 4} |
Table 9.
Matrix construction of graph in the top panel of Figure 5.
empty matrix | |||
1 | 1 | 0 | |
{0, 3, 3, 5, 5} | {1, 3, 3, 3, 3, 5, 5, 5} | {1, 3, 3, 5} |
Table 10.
Matrix construction of graph in the top panel of Figure 5.
empty matrix | |||
1 | 0 | 0 | |
{0, 3, 3, 5, 5} | {1, 3, 3, 3, 3, 5, 5, 5} | {1, 3, 3, 5} |
Table 11.
Matrix construction of graph (with isolated points included) in the bottom panel of Figure 5.
/ | / | ||
empty matrix | / | / | |
zero matrix | / | / | |
5 | / | / | |
{0, 0, 0, 0, 0} | / | / |
Table 12.
Matrix construction of graph (without isolated points) in the bottom panel of Figure 5.
{0} | {0} | {0} | |
/ | / | / | |
/ | / | / | |
/ | / | / | |
/ | / | / |
Table 13.
Matrix construction of graph (with isolated points included) in the bottom panel of Figure 5.
{0} | |||
empty matrix | (/) | ||
(/) | |||
2 | 1 | 0 | |
{0, 0, 0.6571, 2.5293, 4.8136} | {0, 0.6571, 2.5293, 4.8136} | / |
Table 14.
Matrix construction of graph (without isolated points) in the bottom panel of Figure 5.
{0} | |||
empty matrix | (/) | ||
(/) | |||
1 | 1 | 0 | |
{0, 2, 2, 4} | {0, 2, 2, 4} | / |
Table 15.
Matrix construction of graph (with isolated points included) in the bottom panel of Figure 5.
{0} | |||
empty matrix | (/) | ||
(/) | |||
2 | 1 | 0 | |
{0, 0, 0.6571, 2.5293, 4.8136} | {0, 0.6571, 2.5293, 4.8136} | / |
Table 16.
Matrix construction of graph (without isolated points) in the bottom panel of Figure 5.
{0} | |||
empty matrix | (/) | ||
(/) | |||
1 | 1 | 0 | |
{0, 2, 2, 4} | {0, 2, 2, 4} | / |
Table 17.
Matrix construction of graph in the bottom panel of Figure 5.
{0} | |||
empty matrix | (/) | ||
(/) | |||
1 | 1 | 0 | |
{0, 2, 2, 4} | {0, 2, 4, 4} | / |
Table 18.
Matrix construction of graph in the bottom panel of Figure 5.
{0} | |||
empty matrix | (/) | ||
(/) | |||
1 | 1 | 0 | |
{0, 0.8299, 2, 2.6889, 4.4812} | {0, 0.8299, 2, 2.6889, 4.4812} | / |
Table 19.
Matrix construction of graph in the bottom panel of Figure 5.
empty matrix | |||
1 | 0 | 0 | |
{0, 3, 3, 5, 5} | {1, 3, 3, 3, 3, 5, 5, 5} | {1, 3, 3, 5} |