The Generalized Entropy Ergodic Theorem for Nonhomogeneous Bifurcating Markov Chains Indexed by a Binary Tree

Zhiyan Shi; Zhongzhi Wang; Pingping Zhong; Yan Fan

doi:10.1007/s10959-021-01117-1

2021 Aug 3;35(3):1367–1390. doi: 10.1007/s10959-021-01117-1


PMCID: PMC8329646 PMID: 34366565

Abstract

In this paper, we study the generalized entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree. Firstly, by constructing a class of random variables with a parameter and the mean value of one, we establish a strong limit theorem for delayed sums of the bivariate functions of such chains using the Borel–Cantelli lemma. Secondly, we prove the strong law of large numbers for the frequencies of occurrence of states of delayed sums and the generalized entropy ergodic theorem. As corollaries, we generalize some known results.

Keywords: Binary tree, Nonhomogeneous bifurcating Markov chains, Entropy ergodic theorem

Introduction

Let $\{X_{n}, n \geq 0\}$ be an arbitrary information source taking values in a finite alphabet on a probability space $(Ω, F, P)$. Set $f_{n} (ω) = - \frac{1}{n} ln P (X_{0}, X_{1}, \dots, X_{n})$; $f_{n} (ω)$ is called the relative entropy density of $\{X_{0}, X_{1}, \dots, X_{n}\}$ in information theory. The convergence of $f_{n} (ω)$ to a constant in some sense ($L_{1}$ convergence, convergence in probability, a.e. convergence) is known as the entropy ergodic theorem, the asymptotic equipartition property (AEP), or the Shannon–McMillan–Breiman theorem, and is a fundamental theorem of information theory. Shannon [22] first proved the entropy ergodic theorem for convergence in probability for stationary ergodic information sources with a finite alphabet. The entropy ergodic theorem in $L_{1}$ and a.e. convergence, respectively, for stationary ergodic information sources was established by McMillan [20] and Breiman [6]. Chung [8] considered the case of a countable alphabet, and Billingsley [5] extended the result to stationary nonergodic sources. Gray and Kieffer [14] extended it to asymptotically stationary measure processes. The entropy ergodic theorem for general stochastic processes can be found, for example, in Barron [2] and Algoet and Cover [1]. Yang and Liu [19, 30] obtained the entropy ergodic theorem for nonhomogeneous Markov information sources with finite state space. Yang and Liu [32] studied the entropy ergodic theorem for mth-order nonhomogeneous Markov information sources.

Let $S_{n} = \sum_{k = 1}^{n} X_{k}$ , $S_{0} = 0$ and for any $m, n \in N^{+}$ , define

\begin{matrix} T_{m, n} = S_{m + n} - S_{m} = \sum_{k = m + 1}^{m + n} X_{k}, \end{matrix}

$T_{m, n}$ is called a delayed sum and $\frac{T_{m, n}}{n}$ the corresponding moving average in probability theory. Moving averages have been studied extensively. Shepp [23] studied the limiting values of the averages $[S_{n + f (n)} - S_{n}] / f (n)$ for i.i.d. random variables. Gaposhkin [12] established the law of large numbers for moving averages of independent random variables. Lanzinger [17] studied an almost sure limit theorem for moving averages of random variables between the strong law of large numbers and the Erdős–Rényi law. Lai [16] gave a review of limit theorems for moving averages and described some recent developments motivated by applications to signal detection and change point problems. Recently, Wang and Yang [27] considered the entropy ergodic theorem in moving average form and obtained the generalized entropy ergodic theorem for nonhomogeneous Markov chains.
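As a small illustration of the delayed sum defined above, the following Python sketch computes $T_{m, n} / n$ from prefix sums. The function name `delayed_average` and the sample data are assumptions made for this example only.

```python
from itertools import accumulate

def delayed_average(xs, m, n):
    """Return T_{m,n} / n = (S_{m+n} - S_m) / n.

    xs[0], xs[1], ... hold X_1, X_2, ...; S_0 = 0 by convention.
    """
    if m < 0 or n <= 0 or m + n > len(xs):
        raise ValueError("need m >= 0, n >= 1 and m + n <= len(xs)")
    prefix = [0] + list(accumulate(xs))  # prefix[k] = S_k
    return (prefix[m + n] - prefix[m]) / n

xs = [1, 0, 1, 1, 0, 1, 0, 0]       # hypothetical observations X_1..X_8
print(delayed_average(xs, 2, 4))    # (X_3 + X_4 + X_5 + X_6) / 4 = 0.75
```

With $m = 0$ the same function returns the ordinary average $S_{n} / n$.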

Tree-indexed stochastic processes have been an active research topic in recent years. They include tree-indexed random walks, random trees (such as Galton–Watson trees), tree-indexed Markov chains, and so on. There is a large body of work on probability limit theorems for tree-indexed stochastic processes, which we briefly list as follows. Chen [7] studied the average properties of random walks on Galton–Watson trees. Telcs and Wormald [26] studied the strong recurrence of tree-indexed random walks determined by the resistance properties of spherically symmetric graphs. Dembo et al. [10] extended the notions of shift-invariance and specific relative entropy, as typically understood for Markov fields on deterministic graphs such as $Z^{d}$, to Markov fields on random trees, and also developed single-generation empirical measure large deviation principles for a more general class of random trees. Le Gall [18] considered Galton–Watson trees associated with a critical offspring distribution and conditioned to have exactly n vertices, and proved that these conditioned spatial trees converge as $n \to \infty$, modulo an appropriate rescaling, towards the conditioned Brownian tree under suitable assumptions on the offspring distribution and the spatial displacements. Guyon [13] studied the law of large numbers and central limit theorems for bifurcating Markov chains indexed by a binary tree, and applied these results to detect cellular aging in Escherichia coli, using the data of Stewart et al. and a bifurcating autoregressive model. Yamamoto [34] established a large deviation theorem for the number of branches of each order in a random binary tree, where the rate function associated with a large deviation was given by asymptotic forms of the rate function.

A significant line of progress on tree-indexed Markov chains concerns their entropy ergodic theorem. Benjamini and Peres [3] gave the definition of tree-indexed Markov chains and studied their recurrence and ray-recurrence. Berger and Ye [4] studied the existence of the entropy rate for some stationary random fields on a homogeneous tree. Ye and Berger [35, 36], by using Pemantle’s [21] result and a combinatorial approach, obtained the entropy ergodic theorem in probability for a PPG-invariant and ergodic random field on a homogeneous tree. Yang and Liu [30] established the strong law of large numbers for the frequency of state occurrence for Markov chains indexed by a homogeneous tree (in fact, this is a special case of tree-indexed Markov chains and PPG-invariant random fields). Yang [31, 33] obtained the strong law of large numbers and the entropy ergodic theorem for tree-indexed Markov chains. Huang and Yang [15] studied the strong law of large numbers and the entropy ergodic theorem for Markov chains indexed by a uniformly bounded tree. Shi and Yang [25] studied the entropy ergodic theorem for mth-order nonhomogeneous Markov chains indexed by a tree. Recently, Dang et al. [9] defined a discrete form of nonhomogeneous bifurcating Markov chains indexed by a binary tree and discussed their equivalent properties; they also studied the strong law of large numbers and the entropy ergodic theorem for these Markov chains with finite state space. Shi et al. [24] studied the strong law of large numbers and the entropy ergodic theorem for Markov chains indexed by a Cayley tree in a Markovian environment with countable state space.

Inspired by Dang et al. [9] and Wang and Yang [27], and with some new ideas, in this paper we study the generalized entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree. Firstly, we prove a strong limit theorem for moving averages of bivariate functions of such chains. Secondly, we prove the strong law of large numbers for the frequencies of occurrence of states in moving average form and the generalized entropy ergodic theorem. As corollaries, we generalize some known results. The main novelty of this paper is the generalization of the entropy ergodic theorem to the moving average form. As the classical Doob martingale convergence theorem cannot be employed, the core technique in this paper is to construct a class of random variables with a parameter and mean value one, and to use the Borel–Cantelli lemma to prove the a.e. convergence of certain random variables.

The rest of this paper is organized as follows. Section 2 presents some preliminaries and reviews some concepts and properties of Markov chains indexed by a tree and of the entropy density. The main results of this article, i.e. the strong law of large numbers for the frequencies of occurrence of states and the generalized entropy ergodic theorem for finite nonhomogeneous bifurcating Markov chains indexed by a binary tree, are stated in Sect. 3. Finally, the proofs of the main results in Sect. 3 are provided in Sect. 4.

Preliminaries

A tree is a graph T which is connected and contains no circuits. Given any two vertices $α \neq β \in T$, let $\overline{α β}$ be the unique path connecting $α$ and $β$, and define the distance $d (α, β)$ to be the number of edges contained in the path $\overline{α β}$. Select a vertex as the root (denoted by o). For any two vertices $σ$ and t of the tree T, we write $σ \leq t$ if $σ$ is on the unique path from the root o to t. We denote by $σ \land t$ the vertex farthest from o satisfying $σ \land t \leq t$ and $σ \land t \leq σ$. The set of all vertices at distance n from the root o is called the n-th level of T. We denote by $L_{n}$ the set of all vertices on level n (so $L_{0} = \{o\}$), by $L_{m}^{n}$ the set of all vertices on the mth to nth levels of T, and in particular by $T^{(n)}$ the set of all vertices on level 0 (the root o) to level n. Let T be any tree and $t \in T \setminus \{o\}$. The vertex that lies on the unique path from the root o to t and is nearest to t is called the predecessor of t and is denoted by $1_{t}$; we also call t a successor of $1_{t}$. If the root of a tree has N neighboring vertices and every other vertex has $N + 1$ neighboring vertices, we call this type of tree a Cayley tree and denote it by $T_{C, N}$. That is, every vertex t of the Cayley tree $T_{C, N}$ has N successors on the next level. In this paper, we mainly investigate the binary tree $T_{C, 2}$, on which each vertex has two successors on the next level. For simplicity, we denote $T_{C, 2}$ by $T_{2}$ (see Fig. 1). For any vertex t of the binary tree $T_{2}$, we denote by $t^{1}$ and $t^{2}$ the two successors of t, and call them the first successor and the second successor of t, respectively.
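The tree notation above can be made concrete with a short Python sketch. It assumes a heap-style labelling that is not part of the paper: the root o is vertex 1 and the successors $t^{1}, t^{2}$ of vertex t are $2t$ and $2t + 1$. All function names are hypothetical.

```python
def successors(t):
    """First and second successor (t^1, t^2) of vertex t."""
    return 2 * t, 2 * t + 1

def predecessor(t):
    """Predecessor 1_t of a non-root vertex t."""
    if t == 1:
        raise ValueError("the root has no predecessor")
    return t // 2

def level(t):
    """Distance d(o, t) from the root."""
    return t.bit_length() - 1

def levels(m, n):
    """Vertices of L_m^n (levels m through n); levels(0, n) is T^(n)."""
    return range(2 ** m, 2 ** (n + 1))

def meet(s, t):
    """s ∧ t: the vertex farthest from the root lying on both root paths."""
    while s != t:
        if s > t:
            s //= 2
        else:
            t //= 2
    return s

print(len(levels(3, 3)))   # |L_3| = 2^3 = 8
print(len(levels(0, 3)))   # |T^(3)| = 2^4 - 1 = 15
print(meet(4, 5))          # 2, the common predecessor of vertices 4 and 5
```

Under this labelling $|L_{m}^{n}| = 2^{n + 1} - 2^{m}$, which is the normalising count that appears in the entropy densities later on.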

Let $(Ω, F, P)$ be a probability space, T any tree, and $\{X_{t}, t \in T\}$ a tree-indexed stochastic process defined on $(Ω, F, P)$. Let A be a subgraph of T and $X^{A} = \{X_{t}, t \in A\}$. We denote by |A| the number of vertices of A and by $x^{A}$ a realization of $X^{A}$. Dang et al. [9] defined the discrete form of nonhomogeneous bifurcating Markov chains indexed by a binary tree. First we review the definition of this process.

Definition 2.1

(Dang et al. [9]) Let $T_{2}$ be a binary tree, G a countable state space, and $\{X_{t}, t \in T_{2}\}$ a collection of G-valued random variables defined on the probability space $(Ω, F, P)$. Let

\begin{matrix} p = \{p (x), x \in G\} \qquad (1) \end{matrix}

be a distribution on G, and

\begin{matrix} P_{t} = (P_{t} (y_{1}, y_{2} | x)), \quad x, y_{1}, y_{2} \in G, \ t \in T_{2} \qquad (2) \end{matrix}

be a collection of stochastic matrices on $G \times G^{2}$ (that is, $P_{t} (y_{1}, y_{2} | x) \geq 0, \forall y_{1}, y_{2}, x \in G$, and $\sum_{(y_{1}, y_{2}) \in G^{2}} P_{t} (y_{1}, y_{2} | x) = 1, \forall x \in G$). If $\forall n \geq 1$,

\begin{matrix} P (X^{L_{n}} = x^{L_{n}} | X^{T^{(n - 1)}} = x^{T^{(n - 1)}}) = \prod_{t \in L_{n - 1}} P_{t} (x_{t^{1}}, x_{t^{2}} | x_{t}), \qquad (3) \end{matrix}

and

\begin{matrix} P (X_{o} = x) = p (x), \quad \forall x \in G, \qquad (4) \end{matrix}

then $\{X_{t}, t \in T_{2}\}$ will be called a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ with the initial distribution (1) and stochastic matrices (2). If $\forall t \in T_{2}, P_{t} = P$, where $P = \{P (y_{1}, y_{2} | x), x, y_{1}, y_{2} \in G\}$ is a stochastic matrix on $G \times G^{2}$, then $\{X_{t}, t \in T_{2}\}$ will be called a G-valued homogeneous bifurcating Markov chain indexed by a binary tree.
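To make Definition 2.1 concrete, here is a minimal simulation sketch in Python. It assumes a heap-style labelling (root 1, successors $2t$ and $2t + 1$), and the kernel `example_kernel` is a hypothetical choice on $G = \{0, 1\}$ invented for this example; neither is taken from Dang et al. [9].

```python
import random

def example_kernel(t, x):
    """Hypothetical P_t(., . | x) on G = {0, 1}, returned as {(y1, y2): prob}."""
    eps = 0.5 + 0.25 / t               # depends on the vertex t: nonhomogeneous
    law = {(y1, y2): eps / 4 for y1 in (0, 1) for y2 in (0, 1)}
    law[(x, x)] += 1.0 - eps           # extra mass on both successors copying x
    return law

def simulate_bmc(depth, p0, kernel, seed=0):
    """Sample X^{T^(depth)} level by level, following the definition.

    p0[x] is the initial distribution p(x); kernel(t, x) gives P_t(., . | x).
    """
    rng = random.Random(seed)
    x = {1: rng.choices(range(len(p0)), weights=p0)[0]}   # X_o ~ p
    for t in range(1, 2 ** depth):                        # vertices on levels 0..depth-1
        law = kernel(t, x[t])
        pairs = list(law)
        weights = [law[pair] for pair in pairs]
        # The two successors are drawn jointly, given X_t only.
        x[2 * t], x[2 * t + 1] = rng.choices(pairs, weights=weights)[0]
    return x

sample = simulate_bmc(3, [0.5, 0.5], example_kernel)
print(len(sample))   # 15 vertices in T^(3)
```

Drawing the pair $(X_{t^{1}}, X_{t^{2}})$ jointly is the point of the bifurcating model: the two successors need not be conditionally independent given $X_{t}$.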

Dang et al. [9] presented the following equivalent properties for nonhomogeneous bifurcating Markov chains indexed by a binary tree.

Property 2.1

(Dang et al. [9]) Let $T_{2}$ be a binary tree, G a countable state space, and $\{X_{t}, t \in T_{2}\}$ a collection of G-valued random variables defined on the probability space $(Ω, F, P)$. Then the three propositions below are equivalent:

(i)
$\{X_{t}, t \in T_{2}\}$ is a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ with the initial distribution (1) and stochastic matrices (2) defined by Definition 2.1;
(ii)
For $\forall n \geq 1$ and $\forall x^{T^{(n)}} \in G^{T^{(n)}}$, we have
$\begin{matrix} P (X^{T^{(n)}} = x^{T^{(n)}}) = p (x_{o}) \prod_{t \in T^{(n - 1)}} P_{t} (x_{t^{1}}, x_{t^{2}} | x_{t}) ; \qquad (5) \end{matrix}$
(iii)
For $\forall n \geq 1$ and $t, t_{1}, t_{2}, \dots, t_{n} \in T_{2}$ satisfying $t_{i} \land t^{1} \leq t, t_{i} \land t^{2} \leq t, 1 \leq i \leq n$, we have
$\begin{matrix} P (X_{t^{1}} = y_{1}, X_{t^{2}} = y_{2} | X_{t} = x, X_{t_{1}} = x_{t_{1}}, \dots, X_{t_{n}} = x_{t_{n}}) \\ = P_{t} (y_{1}, y_{2} | x) = P (X_{t^{1}} = y_{1}, X_{t^{2}} = y_{2} | X_{t} = x), \quad \forall x, y_{1}, y_{2} \in G, \qquad (6) \end{matrix}$
and
$\begin{matrix} P (X_{o} = x) = p (x), \quad \forall x \in G . \end{matrix}$

Remark 2.1

It is a consequence of the Kolmogorov extension theorem that there exists a collection of G-valued random variables $\{X_{t}, t \in T_{2}\}$ on some probability space such that (5) holds.

Remark 2.2

By (5), we can easily obtain that for $\forall m, n \geq 1, n \geq m$ and $\forall x^{L_{m}^{n}} \in G^{L_{m}^{n}}$,

\begin{matrix} P (X^{L_{m}^{n}} = x^{L_{m}^{n}}) = P (X^{L_{m}} = x^{L_{m}}) \prod_{t \in L_{m}^{n - 1}} P_{t} (x_{t^{1}}, x_{t^{2}} | x_{t}) . \qquad (7) \end{matrix}

Remark 2.3

Let $\{X_{t}, t \in T_{2}\}$ be a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ with the stochastic matrices (2) defined by Definition 2.1. From the second equality of (6), we have that for any $t \in T_{2}$,

\begin{matrix} P (X_{t^{1}} = y_{1}, X_{t^{2}} = y_{2} | X_{t} = x) = P_{t} (y_{1}, y_{2} | x) . \end{matrix}

Below we will recall the definition of tree indexed nonhomogeneous Markov chains.

Definition 2.2

(Dong et al. [11]) Let T be a locally finite, infinite tree, G a countable state space, and $\{X_{t}, t \in T\}$ a collection of G-valued random variables defined on the probability space $(Ω, F, P)$. Let

\begin{matrix} p = \{p (x), x \in G\} \qquad (8) \end{matrix}

be a distribution on G, and

\begin{matrix} Q_{t} = (Q_{t} (y | x)), \quad x, y \in G, \ t \in T \setminus \{o\} \qquad (9) \end{matrix}

be a collection of transition matrices on $G^{2}$. If $\forall n \geq 1$ and $t, t_{1}, t_{2}, \dots, t_{n} \in T$ satisfying $t_{i} \land t \leq 1_{t}, 1 \leq i \leq n$, we have

\begin{matrix} P (X_{t} = y | X_{1_{t}} = x, X_{t_{1}} = x_{t_{1}}, \dots, X_{t_{n}} = x_{t_{n}}) \\ = P (X_{t} = y | X_{1_{t}} = x) = Q_{t} (y | x), \quad \forall x, y \in G, \qquad (10) \end{matrix}

and

\begin{matrix} P (X_{o} = x) = p (x), \quad \forall x \in G, \qquad (11) \end{matrix}

then $\{X_{t}, t \in T\}$ will be called a G-valued nonhomogeneous Markov chain indexed by the tree T with the initial distribution (8) and transition matrices (9), or a tree-indexed nonhomogeneous Markov chain with state space G.

The above definition is the natural generalization of the definition of homogeneous Markov chains indexed by tree T (see Benjamini and Peres [3]). Similar to the equivalent property of nonhomogeneous bifurcating Markov chains indexed by a binary tree, by Property 2.1, we can immediately obtain the equivalent property of nonhomogeneous Markov chains indexed by a tree.

Property 2.2

(Dang et al. [9]) Let T be a locally finite, infinite tree, G a countable state space, and $\{X_{t}, t \in T\}$ a collection of G-valued random variables defined on the probability space $(Ω, F, P)$. Then $\{X_{t}, t \in T\}$ is a tree-indexed nonhomogeneous Markov chain taking values in G defined by Definition 2.2 if and only if $\forall n \geq 1$ and $\forall x^{T^{(n)}} \in G^{T^{(n)}}$,

\begin{matrix} P (X^{T^{(n)}} = x^{T^{(n)}}) = p (x_{o}) \prod_{t \in T^{(n)} \setminus \{o\}} Q_{t} (x_{t} | x_{1_{t}}) . \qquad (12) \end{matrix}

Remark 2.4

From Property 2.2, we know that $\{X_{t}, t \in T_{2}\}$ is a tree-indexed nonhomogeneous Markov chain if and only if, $\forall n \geq 1$ and $\forall x^{T^{(n)}} \in G^{T^{(n)}}$,

\begin{matrix} P (X^{T^{(n)}} = x^{T^{(n)}}) = p (x_{o}) \prod_{t \in T^{(n - 1)}} Q_{t^{1}} (x_{t^{1}} | x_{t}) Q_{t^{2}} (x_{t^{2}} | x_{t}) . \end{matrix}

Thus a nonhomogeneous bifurcating Markov chain indexed by a binary tree is a nonhomogeneous Markov chain indexed by a binary tree if and only if, for $\forall t \in T_{2}$ and $\forall x, y_{1}, y_{2} \in G$,

\begin{matrix} P_{t} (y_{1}, y_{2} | x) = Q_{t^{1}} (y_{1} | x) Q_{t^{2}} (y_{2} | x), \end{matrix}

that is $\forall t \in T_{2}$ ,

\begin{matrix} P (X_{t^{1}} = y_{1}, X_{t^{2}} = y_{2} | X_{t} = x) = P (X_{t^{1}} = y_{1} | X_{t} = x) P (X_{t^{2}} = y_{2} | X_{t} = x) . \end{matrix}

The above equality means that a nonhomogeneous bifurcating Markov chain indexed by a binary tree is a nonhomogeneous Markov chain indexed by a binary tree if and only if, for any $t \in T_{2}$, the two successors $X_{t^{1}}$ and $X_{t^{2}}$ are conditionally independent given $X_{t}$.
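The conditional independence criterion of Remark 2.4 can be checked mechanically for a single kernel. The sketch below is illustrative only: the dictionary layout `P[(y1, y2, x)]` and the two example kernels are assumptions, not objects from the paper. It compares $P_{t} (y_{1}, y_{2} | x)$ with the product of its two marginals, which is the only possible factorisation.

```python
def factorizes(P, states, tol=1e-12):
    """True if P(y1, y2 | x) = P(y1 | x) P(y2 | x) for all x, y1, y2.

    P maps (y1, y2, x) to the probability P_t(y1, y2 | x).
    """
    for x in states:
        q1 = {y: sum(P[(y, z, x)] for z in states) for y in states}  # law of X_{t^1}
        q2 = {y: sum(P[(z, y, x)] for z in states) for y in states}  # law of X_{t^2}
        for y1 in states:
            for y2 in states:
                if abs(P[(y1, y2, x)] - q1[y1] * q2[y2]) > tol:
                    return False
    return True

G = (0, 1)
Q = {(y, x): 0.7 if y == x else 0.3 for y in G for x in G}   # hypothetical Q(y | x)

# Successors drawn independently from Q: an ordinary tree-indexed Markov chain.
independent = {(y1, y2, x): Q[(y1, x)] * Q[(y2, x)] for y1 in G for y2 in G for x in G}
# Successors forced to be equal: bifurcating, but not a tree-indexed Markov chain.
coupled = {(y1, y2, x): (Q[(y1, x)] if y1 == y2 else 0.0) for y1 in G for y2 in G for x in G}

print(factorizes(independent, G), factorizes(coupled, G))   # True False
```

The second kernel shows why the bifurcating class is strictly larger: it is a valid stochastic matrix on $G \times G^{2}$ that does not factorise.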

Let T be a tree and $\{X_{t}, t \in T\}$ a stochastic process indexed by the tree T taking values in a countable state space G. Denote $P (x^{L_{m}^{n}}) = P (X^{L_{m}^{n}} = x^{L_{m}^{n}})$. Let $\{a_{n}, n \geq 0\}$ and $\{ϕ (n), n \geq 0\}$ be two sequences of nonnegative integers such that ${lim}_{n \to \infty} ϕ (n) = \infty$. Define

\begin{matrix} f_{a_{n}, ϕ (n)} (ω) = - \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} ln P (X^{L_{a_{n}}^{a_{n} + ϕ (n)}}), \qquad (15) \end{matrix}

$f_{a_{n}, ϕ (n)} (ω)$ will be called the generalized entropy density of $X^{L_{a_{n}}^{a_{n} + ϕ (n)}}$ . Particularly, if $a_{n} \equiv 0$ and $ϕ (n) = n$ , $f_{a_{n}, ϕ (n)} (ω)$ will become the classical entropy density of $X^{T^{(n)}}$ defined as follows

\begin{matrix} f_{n} (ω) ≐ f_{0, n} (ω) = - \frac{1}{| T^{(n)} |} ln P (X^{T^{(n)}}) . \end{matrix}

Obviously, if $\{X_{t}, t \in T_{2}\}$ is a nonhomogeneous bifurcating Markov chain indexed by a binary tree defined by Definition 2.1, it follows from (7) that

\begin{matrix} f_{a_{n}, ϕ (n)} (ω) = - \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} [ln P (X^{L_{a_{n}}}) + \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} ln P_{t} (X_{t^{1}}, X_{t^{2}} | X_{t})], \end{matrix}

and

\begin{matrix} f_{n} (ω) = - \frac{1}{|T^{(n)}|} [ln P (X_{o}) + \sum_{t \in T^{(n - 1)}} ln P_{t} (X_{t^{1}}, X_{t^{2}} | X_{t})] . \end{matrix}
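For the classical case $a_{n} \equiv 0$, $ϕ (n) = n$, the last formula can be evaluated directly on a realisation. The Python sketch below is an illustration under stated assumptions: vertices are heap-labelled (root 1, successors $2t$ and $2t + 1$), `kernel(t, x)` returns a dictionary `{(y1, y2): P_t(y1, y2 | x)}`, and the uniform kernel is a hypothetical test case. It does not cover $a_{n} > 0$, which would also require the law of $X^{L_{a_{n}}}$.

```python
import math

def entropy_density(x, depth, p0, kernel):
    """f_n(ω) for n = depth, given a realisation x = {vertex: state} of X^{T^(n)}."""
    total = math.log(p0[x[1]])                       # ln P(X_o)
    for t in range(1, 2 ** depth):                   # t runs over T^(n-1)
        total += math.log(kernel(t, x[t])[(x[2 * t], x[2 * t + 1])])
    return -total / (2 ** (depth + 1) - 1)           # divide by |T^(n)|

def uniform_kernel(t, x):
    """Hypothetical kernel: every pair of successor states has probability 1/4."""
    return {(y1, y2): 0.25 for y1 in (0, 1) for y2 in (0, 1)}

x = {t: 0 for t in range(1, 8)}                      # any realisation of X^{T^(2)}
print(entropy_density(x, 2, [0.5, 0.5], uniform_kernel), math.log(2))
```

For the uniform kernel every realisation has probability $2^{- | T^{(n)} |}$, so the two printed values agree up to rounding.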

Property 2.3

(Yang and Yang [28]) Let $T_{2}$ be a binary tree, $G = \{0, 1, \dots, b - 1\}$ a finite state space and $\{X_{t}, t \in T_{2}\}$ a tree-indexed stochastic process taking values in G. Let $f_{a_{n}, ϕ (n)} (ω)$ be defined by (15). Then $f_{a_{n}, ϕ (n)} (ω)$ are uniformly integrable.

Main Results

Let $G = \{0, 1, \dots, b - 1\}$ be a finite state space and $\{X_{t}, t \in T_{2}\}$ a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree defined as before. For $k \in G$, let $S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)})$ be the number of vertices $t \in L_{a_{n}}^{a_{n} + ϕ (n)}$ with $X_{t} = k$, let $S_{k} (L_{a_{n}})$ be the number of vertices $t \in L_{a_{n}}$ with $X_{t} = k$, and let $S_{k}^{i} (L_{a_{n}}^{a_{n} + ϕ (n) - 1})$, $i = 1, 2$, be the number of vertices $t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}$ whose ith successor satisfies $X_{t^{i}} = k$. That is,

\begin{matrix} S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)}) = | \{t \in L_{a_{n}}^{a_{n} + ϕ (n)} : X_{t} = k\} | ; \\ S_{k} (L_{a_{n}}) = | \{t \in L_{a_{n}} : X_{t} = k\} | ; \qquad (19) \end{matrix}

and

\begin{matrix} S_{k}^{i} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) = | \{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1} : X_{t^{i}} = k\} |, \quad i = 1, 2 . \end{matrix}

It follows that

\begin{matrix} S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)}) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n)}} I_{k} (X_{t}) ; \end{matrix}

\begin{matrix} S_{k} (L_{a_{n}}) = \sum_{t \in L_{a_{n}}} I_{k} (X_{t}) ; \end{matrix}

\begin{matrix} S_{k}^{1} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} I_{k} (X_{t^{1}}) ; \end{matrix}

\begin{matrix} S_{k}^{2} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} I_{k} (X_{t^{2}}) ; \end{matrix}

and

\begin{matrix} S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)}) = S_{k}^{1} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) + S_{k}^{2} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) + S_{k} (L_{a_{n}}), \end{matrix}

where

\begin{matrix} I_{k} (i) = \begin{cases} 1, & i = k, \\ 0, & i \neq k . \end{cases} \end{matrix}
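The decomposition of $S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)})$ into the two successor counts plus the count on level $a_{n}$ holds for any labelling of the tree, because every vertex on levels $a_{n} + 1$ to $a_{n} + ϕ (n)$ is the first or second successor of exactly one vertex on levels $a_{n}$ to $a_{n} + ϕ (n) - 1$. The Python sketch below checks this on arbitrary random labels. The heap-style labelling (root 1, successors $2t$ and $2t + 1$) and all names are assumptions of the example.

```python
import random

def count_state(x, k, m, n):
    """S_k(L_m^n): number of vertices t on levels m..n with X_t = k."""
    return sum(1 for t in range(2 ** m, 2 ** (n + 1)) if x[t] == k)

def count_successor_state(x, k, m, n, i):
    """S_k^i(L_m^n): number of t on levels m..n whose i-th successor is in state k."""
    return sum(1 for t in range(2 ** m, 2 ** (n + 1)) if x[2 * t + i - 1] == k)

rng = random.Random(1)
a, phi, b = 1, 3, 3                                   # hypothetical a_n, ϕ(n), |G|
x = {t: rng.randrange(b) for t in range(1, 2 ** (a + phi + 1))}

for k in range(b):
    lhs = count_state(x, k, a, a + phi)
    rhs = (count_successor_state(x, k, a, a + phi - 1, 1)
           + count_successor_state(x, k, a, a + phi - 1, 2)
           + count_state(x, k, a, a))
    assert lhs == rhs                                 # the identity holds for each state k
```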

In this section, we will establish the strong law of large numbers for the frequencies of occurrence of states and the generalized entropy ergodic theorem for finite nonhomogeneous bifurcating Markov chains indexed by a binary tree. Firstly, we give the strong law of large numbers for the frequencies of occurrence of states for these chains with finite state space.

Theorem 3.1

Let $G = \{0, 1, \dots, b - 1\}$ be a finite state space, $\{X_{t}, t \in T_{2}\}$ a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ with stochastic matrices $\{P_{t}, t \in T_{2}\}$ defined by Definition 2.1, and $S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)})$ be defined by (19). Let $P = (P (y_{1}, y_{2} | x)), x, y_{1}, y_{2} \in G$ be another stochastic matrix, and let $P_{1} (y_{1} | x) = \sum_{y_{2} \in G} P (y_{1}, y_{2} | x)$, $P_{2} (y_{2} | x) = \sum_{y_{1} \in G} P (y_{1}, y_{2} | x)$, $P_{1} = (P_{1} (y | x))$, $P_{2} = (P_{2} (y | x))$. Let $Q = \frac{1}{2} (P_{1} + P_{2})$, and assume that the transition matrix Q is ergodic. Let $\{a_{n}, n \geq 0\}$ and $\{ϕ (n), n \geq 0\}$ be two sequences of nonnegative integers such that for any positive integers n, m

\begin{matrix} ϕ (m + n) - ϕ (n) \geq m . \end{matrix}

If $\forall x, y_{1}, y_{2} \in G$ ,

\begin{matrix} lim_{n \to \infty} \frac{1}{| T^{(n)} |} \sum_{t \in T^{(n - 1)}} | P_{t} (y_{1}, y_{2} | x) - P (y_{1}, y_{2} | x) | = 0, \end{matrix}

then

\begin{matrix} lim_{n \to \infty} \frac{S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)})}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} = π (k) \quad \text{a.e.}, \quad \forall k \in G, \end{matrix}

where $π = \{π (0), π (1), \dots, π (b - 1)\}$ is the unique stationary distribution determined by the transition matrix Q.

The proof of the above theorem will be given in Sect. 4.
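The limiting frequencies $π (k)$ in Theorem 3.1 depend only on $Q = \frac{1}{2} (P_{1} + P_{2})$. As a rough numerical sketch, the Python code below builds Q from a kernel stored as the nested list `P[x][y1][y2]` and approximates π by power iteration. The layout, the function names and the numerical kernel are all assumptions for this example, and power iteration is one convenient method that relies on Q being ergodic.

```python
def mixed_transition_matrix(P):
    """Q = (P_1 + P_2) / 2, where P[x][y1][y2] = P(y1, y2 | x)."""
    b = len(P)
    Q = []
    for x in range(b):
        row = []
        for y in range(b):
            p1 = sum(P[x][y][y2] for y2 in range(b))   # P_1(y | x)
            p2 = sum(P[x][y1][y] for y1 in range(b))   # P_2(y | x)
            row.append(0.5 * (p1 + p2))
        Q.append(row)
    return Q

def stationary_distribution(Q, iterations=1000):
    """Approximate π with π Q = π by repeated multiplication from a point mass."""
    b = len(Q)
    pi = [1.0] + [0.0] * (b - 1)
    for _ in range(iterations):
        pi = [sum(pi[x] * Q[x][y] for x in range(b)) for y in range(b)]
    return pi

# Hypothetical limiting kernel P on G = {0, 1}.
P = [[[0.5, 0.2], [0.2, 0.1]],
     [[0.2, 0.2], [0.2, 0.4]]]
Q = mixed_transition_matrix(P)       # approximately [[0.7, 0.3], [0.4, 0.6]]
pi = stationary_distribution(Q)
print(pi)                            # close to [4/7, 3/7]
```

Here $π (0) \cdot 0.3 = π (1) \cdot 0.4$ gives $π = (4/7, 3/7)$, which the iteration reproduces up to rounding.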

In the following, we will study the generalized entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree with finite state space $G = {0, 1, \dots, b - 1}$ .

Theorem 3.2

Under the conditions of Theorem 3.1, let $f_{a_{n}, ϕ (n)} (ω)$ be as defined in (17) and let $\{a_{n}, n \geq 0\}$ be a bounded sequence of nonnegative integers. Then

\begin{matrix} lim_{n \to \infty} f_{a_{n}, ϕ (n)} (ω) = - \frac{1}{2} \sum_{l = 0}^{b - 1} π (l) \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} P (k_{1}, k_{2} | l) ln P (k_{1}, k_{2} | l) \quad \text{a.e.} \qquad (27) \end{matrix}

The proof of the above theorem will be presented in Sect. 4.
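The limit in Theorem 3.2 is a finite sum and is easy to evaluate once π and P are known. The Python sketch below is illustrative only: the nested-list layout `P[l][k1][k2]`, the function name and the two test kernels are assumptions, and terms with $P (k_{1}, k_{2} | l) = 0$ are skipped under the usual convention $0 \, ln \, 0 = 0$.

```python
import math

def entropy_limit(pi, P):
    """-(1/2) * sum_l π(l) * sum_{k1,k2} P(k1,k2|l) ln P(k1,k2|l)."""
    b = len(P)
    total = 0.0
    for l in range(b):
        for k1 in range(b):
            for k2 in range(b):
                p = P[l][k1][k2]
                if p > 0.0:                     # convention: 0 ln 0 = 0
                    total += pi[l] * p * math.log(p)
    return -0.5 * total

# Hypothetical kernels on G = {0, 1}.
uniform = [[[0.25, 0.25], [0.25, 0.25]] for _ in range(2)]
copying = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]   # successors copy the parent

print(entropy_limit([0.5, 0.5], uniform))   # equals ln 2 up to rounding
print(entropy_limit([0.5, 0.5], copying))   # zero: no randomness beyond the root
```

The factor $\frac{1}{2}$ reflects that roughly half of the vertices in $L_{a_{n}}^{a_{n} + ϕ (n)}$ are parents contributing a term to the sum in (17).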

Remark 3.1

Taking $a_{n} \equiv 0, ϕ (n) = n$ in Theorem 3.2, we immediately get the entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree with finite state space G (see Dang et al. [9]).

Remark 3.2

From Property 2.3, we know that $f_{a_{n}, ϕ (n)} (ω)$ are uniformly integrable. Thus (27) also holds with $L_{1}$ convergence.

We denote by $g_{a_{n}, ϕ (n)} (ω)$ the generalized entropy density of nonhomogeneous Markov chains indexed by a tree with the initial distribution (8) and transition matrices (9). From (12), it is easy to see that

\begin{matrix} g_{a_{n}, ϕ (n)} (ω) = - \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} [ln P (X^{L_{a_{n}}}) + \sum_{t \in L_{a_{n} + 1}^{a_{n} + ϕ (n)}} ln Q_{t} (X_{t} | X_{1_{t}})] . \end{matrix}

By Theorem 3.2, we can establish the generalized entropy ergodic theorem for nonhomogeneous Markov chains indexed by a binary tree.

Corollary 3.1

Let $T_{2}$ be a binary tree, $G = \{0, 1, 2, \dots, b - 1\}$ a finite state space, and $\{X_{t}, t \in T_{2}\}$ a G-valued nonhomogeneous Markov chain indexed by $T_{2}$ with the transition matrices (9) defined by Definition 2.2. Let $Q = (Q (k | l)), k, l \in G$ be another transition matrix, and assume that Q is ergodic. Let $\{a_{n}, n \geq 0\}$ be a bounded sequence of nonnegative integers and $\{ϕ (n), n \geq 0\}$ a sequence of nonnegative integers such that for any positive integers n, m,

\begin{matrix} ϕ (m + n) - ϕ (n) \geq m . \end{matrix}

If

\begin{matrix} lim_{n \to \infty} \frac{1}{| T^{(n)} |} \sum_{t \in T^{(n)} \setminus \{o\}} | Q_{t} (k | l) - Q (k | l) | = 0, \quad \forall k, l \in G, \end{matrix}

then

\begin{matrix} lim_{n \to \infty} g_{a_{n}, ϕ (n)} (ω) = - \sum_{l = 0}^{b - 1} \sum_{k = 0}^{b - 1} π (l) Q (k | l) ln Q (k | l) \quad \text{a.e.}, \end{matrix}

where $π = \{π (0), \dots, π (b - 1)\}$ is the unique stationary distribution determined by the transition matrix Q.

The proof of the above corollary will be given in Sect. 4.

Remark 3.3

Taking $a_{n} \equiv 0, ϕ (n) = n$ in Corollary 3.1, we obtain the entropy ergodic theorem for nonhomogeneous Markov chains indexed by a Cayley tree $T_{C, 2}$ with finite state space G. The result is a special case of Dong, Yang and Bai [11] for $N = 2$.

If each vertex of the tree has only one successor, nonhomogeneous Markov chains indexed by a binary tree degenerate into nonhomogeneous Markov chains. Similarly, we denote by $h_{a_{n}, ϕ (n)} (ω)$ the generalized entropy density of a nonhomogeneous Markov chain with the initial distribution $\{μ_{0} (0), \dots, μ_{0} (b - 1)\}$ and transition matrices $P_{n} = (p_{n} (i, j)), i, j \in G$. It easily follows that

\begin{matrix} h_{a_{n}, ϕ (n)} (ω) = - \frac{1}{ϕ (n)} \{log μ_{a_{n}} (X_{a_{n}}) + \sum_{k = a_{n} + 1}^{a_{n} + ϕ (n)} log p_{k} (X_{k - 1}, X_{k})\}, \end{matrix}

where $μ_{a_{n}} (x)$ is the distribution of $X_{a_{n}}$ . Thus we can get the generalized entropy ergodic theorem for nonhomogeneous Markov chains.

Corollary 3.2

Suppose $\{X_{n}, n \geq 0\}$ is a nonhomogeneous Markov chain taking values in a finite state space $G = \{0, 1, \dots, b - 1\}$ with the initial distribution $\{μ_{0} (0), \dots, μ_{0} (b - 1)\}$ and the transition matrices $\{P_{n} = (p_{n} (i, j)), i, j \in G, n = 1, 2, \dots\}$, where $p_{n} (i, j) = P (X_{n} = j | X_{n - 1} = i)$. Let $\{a_{n}, n \geq 0\}$ be a bounded sequence of nonnegative integers and $\{ϕ (n), n \geq 0\}$ a sequence of nonnegative integers such that for any positive integers n, m

\begin{matrix} ϕ (m + n) - ϕ (n) \geq m . \end{matrix}

Let $P = (p (i, j))$ be another transition matrix, and assume that P is irreducible. If

\begin{matrix} lim_{n \to \infty} \frac{1}{n} \sum_{k = 1}^{n} | p_{k} (i, j) - p (i, j) | = 0, \end{matrix}

then

\begin{matrix} lim_{n \to \infty} h_{a_{n}, ϕ (n)} (ω) = - \sum_{i = 0}^{b - 1} \sum_{j = 0}^{b - 1} π_{i} p (i, j) log p (i, j) \quad \text{a.e.} \end{matrix}

Proof

The corollary is a special case of Corollary 3.1, where $T_{2}$ is the set of nonnegative integers $N$ . $□$

Remark 3.4

Note that

\begin{matrix} \frac{1}{ϕ (n)} \sum_{k = a_{n} + 1}^{a_{n} + ϕ (n)} | p_{k} (i, j) - p (i, j) | \\ \leq (1 + \frac{a_{n}}{ϕ (n)}) \frac{1}{a_{n} + ϕ (n)} \sum_{k = 1}^{a_{n} + ϕ (n)} | p_{k} (i, j) - p (i, j) |, \end{matrix}

Since $\{a_{n}\}$ is bounded, it follows from (32) that

\begin{matrix} lim_{n \to \infty} \frac{1}{ϕ (n)} \sum_{k = a_{n} + 1}^{a_{n} + ϕ (n)} | p_{k} (i, j) - p (i, j) | = 0 . \end{matrix}

Thus, we can immediately obtain the results of Wang and Yang [27] on the generalized entropy ergodic theorem for delayed sums of nonhomogeneous Markov chains.

Remark 3.5

If $a_{n} \equiv 0, ϕ (n) = n$ in Corollary 3.2, we get the entropy ergodic theorem for nonhomogeneous Markov chains (see Yang [29]).

The Proofs

Before providing the proofs of the main results in Sect. 3, we begin with some lemmas.

Lemma 4.1

Let $T_{2}$ be a binary tree and G a countable state space. Let $\{X_{t}, t \in T_{2}\}$ be a G-valued nonhomogeneous bifurcating Markov chain indexed by a binary tree $T_{2}$ defined by Definition 2.1, and let $\{g_{t} (x, y_{1}, y_{2}), t \in T_{2}\}$ be a collection of functions defined on $G^{3}$. Suppose that there exists $α > 0$ such that $E [e^{α | g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) |}] < \infty, \forall t \in T_{2}$. Let $\{a_{n}, n \geq 0\}$ and $\{ϕ (n), n \geq 0\}$ be two sequences of nonnegative integers such that $ϕ (n)$ converges to infinity as $n \to \infty$. Assume that for $\forall ε > 0$,

\begin{matrix} \sum_{n = 1}^{\infty} exp (- | L_{a_{n}}^{a_{n} + ϕ (n)} | ε) < \infty . \end{matrix}

Let

\begin{matrix} H_{a_{n}, ϕ (n)} (ω) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}), \end{matrix}

and

\begin{matrix} G_{a_{n}, ϕ (n)} (ω) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} E [g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}] . \end{matrix}

Let $α > 0$ , and set

\begin{matrix} D (α) = \{ω : \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} E [g_{t}^{2} (X_{t}, X_{t^{1}}, X_{t^{2}}) e^{α | g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) |} | X_{t}] = M (α ; ω) < \infty\} . \qquad (37) \end{matrix}

Then we have

\begin{matrix} lim_{n \to \infty} \frac{H_{a_{n}, ϕ (n)} (ω) - G_{a_{n}, ϕ (n)} (ω)}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} = 0 \quad \text{a.e.} \ ω \in D (α) . \end{matrix}

Remark 4.1

It is easy to see that if $\{g_{t} (x, y_{1}, y_{2}), t \in T_{2}\}$ is a collection of uniformly bounded functions, then for any $α > 0, D (α) = Ω$, and thus we get

\begin{matrix} lim_{n \to \infty} \frac{H_{a_{n}, ϕ (n)} (ω) - G_{a_{n}, ϕ (n)} (ω)}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} = 0 \quad \text{a.e.} \end{matrix}

Remark 4.2

Let $a_{n} = 0$ and $ϕ (n) = [{log}_{2} n^{α}] (α > 0)$ . Since $T_{2}$ is a binary tree, we have

\begin{matrix} | L_{a_{n}}^{a_{n} + ϕ (n)} | = 2^{[{log}_{2} n^{α}] + 1} - 1 \geq 2^{{log}_{2} n^{α} - 1 + 1} - 1 = n^{α} - 1, \end{matrix}

where $[\cdot]$ is the usual greatest integer function. In this case (34) holds.
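The bound in Remark 4.2 and the resulting summability in (34) can be checked numerically. The Python sketch below assumes $a_{n} = 0$ and $ϕ (n) = [{log}_{2} n^{α}]$ as in the remark. The function names, the tested values of α and ε, and the floating-point tolerance are choices made for this example.

```python
import math

def window_size(n, alpha):
    """|L_0^{ϕ(n)}| = 2^{ϕ(n)+1} - 1 with ϕ(n) = floor(alpha * log2(n))."""
    phi = math.floor(alpha * math.log2(n))
    return 2 ** (phi + 1) - 1

def partial_sum(eps, alpha, n_max):
    """Partial sum of the series in (34): sum_{n=1}^{n_max} exp(-eps * |L|)."""
    return sum(math.exp(-eps * window_size(n, alpha)) for n in range(1, n_max + 1))

# The lower bound |L| >= n^alpha - 1, up to a small rounding tolerance.
bound_ok = all(
    window_size(n, alpha) >= n ** alpha - 1 - 1e-9
    for alpha in (0.5, 1.0, 2.0)
    for n in range(1, 2001)
)
print(bound_ok)

# The partial sums stabilise quickly, consistent with convergence of (34).
print(partial_sum(0.1, 1.0, 1000), partial_sum(0.1, 1.0, 2000))
```

A finite computation cannot prove convergence, but it shows how fast the terms $exp (- ε | L_{a_{n}}^{a_{n} + ϕ (n)} |)$ decay once the window grows polynomially in n.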

Proof

Let $λ$ be a nonzero real number, for fixed n, define

\begin{matrix} t_{a_{n}, m} (λ, ω) = \frac{e^{λ \sum_{t \in L_{a_{n}}^{a_{n} + m - 1}} g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})}}{\prod_{t \in L_{a_{n}}^{a_{n} + m - 1}} E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}]}, m = 1, 2, \dots, ϕ (n) . \end{matrix}

Note that

\begin{matrix} E [e^{λ \sum_{t \in L_{a_{n} + ϕ (n) - 1}} g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X^{T^{(a_{n} + ϕ (n) - 1)}}] \\ = \sum_{t \in L_{a_{n} + ϕ (n) - 1}, (x_{t^{1}}, x_{t^{2}}) \in G^{2}} e^{λ \sum_{t \in L_{a_{n} + ϕ (n) - 1}} g_{t} (X_{t}, x_{t^{1}}, x_{t^{2}})} \\ \cdot P (X^{L_{a_{n} + ϕ (n)}} = x^{L_{a_{n} + ϕ (n)}} | X^{T^{(a_{n} + ϕ (n) - 1)}}) \\ = \sum_{t \in L_{a_{n} + ϕ (n) - 1}, (x_{t^{1}}, x_{t^{2}}) \in G^{2}} e^{λ \sum_{t \in L_{a_{n} + ϕ (n) - 1}} g_{t} (X_{t}, x_{t^{1}}, x_{t^{2}})} \cdot \prod_{t \in L_{a_{n} + ϕ (n) - 1}} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \\ = \prod_{t \in L_{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} e^{λ g_{t} (X_{t}, x_{t^{1}}, x_{t^{2}})} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \\ = \prod_{t \in L_{a_{n} + ϕ (n) - 1}} E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}] . \qquad (40) \end{matrix}

It is easy to see that $E [t_{a_{n}, 1} (λ, ω)] = 1$ . Hence by (40),

\begin{matrix} E [t_{a_{n}, ϕ (n)} (λ, ω)] \\ = E [E [t_{a_{n}, ϕ (n)} (λ, ω) | X^{T^{(a_{n} + ϕ (n) - 1)}}]] \\ = E [E [t_{a_{n}, ϕ (n) - 1} (λ, ω) \frac{e^{λ \sum_{t \in L_{a_{n} + ϕ (n) - 1}} g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})}}{\prod_{t \in L_{a_{n} + ϕ (n) - 1}} E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}]} | X^{T^{(a_{n} + ϕ (n) - 1)}}]] \\ = E [t_{a_{n}, ϕ (n) - 1} (λ, ω) \cdot \frac{E [e^{λ \sum_{t \in L_{a_{n} + ϕ (n) - 1}} g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X^{T^{(a_{n} + ϕ (n) - 1)}}]}{\prod_{t \in L_{a_{n} + ϕ (n) - 1}} E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}]}] \\ = E [t_{a_{n}, ϕ (n) - 1} (λ, ω)] = \dots = E [t_{a_{n}, 1} (λ, ω)] = 1 . \qquad (41) \end{matrix}
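The mean-one property can be verified by exact enumeration in the smallest case. The Python sketch below treats only a single parent vertex (the root, i.e. $a_{n} = 0$ and $m = 1$), where $t_{0, 1} (λ, ω) = e^{λ g_{o} (X_{o}, X_{o^{1}}, X_{o^{2}})} / E [e^{λ g_{o} (X_{o}, X_{o^{1}}, X_{o^{2}})} | X_{o}]$. The kernel, the initial distribution and the function g are hypothetical values chosen for illustration.

```python
import math

def expected_t(p0, P, g, lam):
    """E[t_{0,1}(λ, ω)] by summing over all (x, y1, y2) in G^3.

    p0[x] = p(x), P[x][y1][y2] = P_o(y1, y2 | x), g(x, y1, y2) is the function g_o.
    """
    b = len(p0)
    total = 0.0
    for x in range(b):
        # Normalising factor E[exp(λ g_o) | X_o = x].
        norm = sum(P[x][y1][y2] * math.exp(lam * g(x, y1, y2))
                   for y1 in range(b) for y2 in range(b))
        for y1 in range(b):
            for y2 in range(b):
                weight = p0[x] * P[x][y1][y2]          # P(X_o = x, X_{o^1} = y1, X_{o^2} = y2)
                total += weight * math.exp(lam * g(x, y1, y2)) / norm
    return total

p0 = [0.3, 0.7]
P = [[[0.5, 0.2], [0.2, 0.1]],
     [[0.2, 0.2], [0.2, 0.4]]]
g = lambda x, y1, y2: x + 2 * y1 - y2

for lam in (-1.5, 0.5, 2.0):
    print(lam, expected_t(p0, P, g, lam))    # each value equals 1 up to rounding
```

Conditioning on $X_{o}$ cancels the numerator against the normalising factor, which is the one-step version of the induction carried out in (41).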

By Markov's inequality, (34) and (41), for any $ε > 0$ , we have

\begin{matrix} \sum_{n = 1}^{\infty} P [\frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} ln t_{a_{n}, ϕ (n)} (λ, ω) \geq ε] \\ = \sum_{n = 1}^{\infty} P [t_{a_{n}, ϕ (n)} (λ, ω) \geq exp (| L_{a_{n}}^{a_{n} + ϕ (n)} | \cdot ε)] \\ \leq \sum_{n = 1}^{\infty} exp (- | L_{a_{n}}^{a_{n} + ϕ (n)} | \cdot ε) < \infty . \end{matrix}

By the Borel–Cantelli lemma and the arbitrariness of $ε$ , we have

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} ln t_{a_{n}, ϕ (n)} (λ, ω) \leq 0 \quad \text{a.e.} \end{matrix}

Noticing that

\begin{matrix} \frac{ln t_{a_{n}, ϕ (n)} (λ, ω)}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} & = \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) \\ - ln E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}]} . \end{matrix}

By (43) and (44), we have

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) \\ - ln E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}]} \leq 0 \quad \text{a.e.} \end{matrix}

Let $0 < λ \leq α$ . Dividing both sides of (45) by $λ$ , we have

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) \\ - \frac{ln E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}]}{λ}} \leq 0 \quad \text{a.e.} \end{matrix}

By (37), (46), and inequalities $ln x \leq x - 1 (x > 0)$ and $0 \leq e^{x} - 1 - x \leq \frac{1}{2} x^{2} e^{| x |}$ , we get

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) - E [g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}]} \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {\frac{ln E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}]}{λ} \\ - E [g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}]} \\ \leq \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {\frac{E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} | X_{t}] - 1}{λ} \\ - \frac{E [λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}]}{λ}} \\ = \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \\ \frac{E [e^{λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}})} - 1 - λ g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}]}{λ} \\ \leq \frac{λ}{2} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} E [g_{t}^{2} (X_{t}, X_{t^{1}}, X_{t^{2}}) e^{| λ | \cdot | g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) |} | X_{t}] \\ \leq \frac{λ}{2} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} E [g_{t}^{2} (X_{t}, X_{t^{1}}, X_{t^{2}}) e^{| α | \cdot | g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) |} | X_{t}] \\ = \frac{λ}{2} M (α ; ω) a . e . ω \in D (α) . \end{matrix}
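The chain of inequalities in (47) rests on two elementary bounds, $ln x \leq x - 1$ and $0 \leq e^{x} - 1 - x \leq \frac{1}{2} x^{2} e^{| x |}$ . A minimal numerical sketch of both, and of the resulting per-node bound, is given here; the grid, the three-point distribution and the function names are illustrative assumptions only.

```python
import math


def log_bound_holds(x):
    """Check ln x <= x - 1 for a given x > 0."""
    return math.log(x) <= x - 1.0


def exp_bound_holds(x):
    """Check 0 <= e^x - 1 - x <= x^2 e^{|x|} / 2 for a given real x."""
    r = math.exp(x) - 1.0 - x
    return 0.0 <= r <= 0.5 * x * x * math.exp(abs(x))


def gap_and_bound(values, probs, lam):
    """Return (ln E[e^{lam g}] / lam - E[g], lam/2 * E[g^2 e^{lam |g|}]) for lam > 0."""
    mean = sum(p * v for v, p in zip(values, probs))
    mgf = sum(p * math.exp(lam * v) for v, p in zip(values, probs))
    gap = math.log(mgf) / lam - mean
    bound = 0.5 * lam * sum(p * v * v * math.exp(lam * abs(v))
                            for v, p in zip(values, probs))
    return gap, bound


grid = [k / 4.0 for k in range(-40, 41)]          # coarse grid on [-10, 10]
print(all(exp_bound_holds(x) for x in grid))
print(all(log_bound_holds(x) for x in grid if x > 0))

# Hypothetical three-point law for g; the bound shrinks as lam decreases to 0.
for lam in (1.0, 0.5, 0.1):
    print(lam, gap_and_bound([-1.5, 0.2, 2.0], [0.3, 0.5, 0.2], lam))
```

The grid deliberately avoids very small nonzero $x$ , where the naive evaluation of $e^{x} - 1 - x$ loses precision to cancellation; this is a limitation of the sketch, not of the inequality.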

Letting $λ \to 0^{+}$ in (47) we have

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) - E [g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}]} \\ \leq 0 \quad \text{a.e. } ω \in D (α) . \end{matrix}

For $- α \leq λ < 0$ , we similarly get

\begin{matrix} \underset{n \to \infty}{lim inf} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} {g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) - E [g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}]} \\ \geq 0 \quad \text{a.e. } ω \in D (α) . \end{matrix}

Combining (48) and (49), we obtain (38) directly. $□$

Lemma 4.2

Let $T_{2}$ be a binary tree, and let ${a_{n}, n \geq 0}$ and ${ϕ (n), n \geq 0}$ be defined as in Lemma 4.1. Let ${a_{t}, t \in T_{2}}$ be a collection of real numbers, and let $a$ be a real number. If

\begin{matrix} lim_{n \to \infty} \frac{1}{| T^{(n)} |} \sum_{t \in T^{(n - 1)}} | a_{t} - a | = 0, \end{matrix}

then

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} | a_{t} - a | = 0 . \end{matrix}

Proof

Noticing that

\begin{matrix} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} | a_{t} - a | \leq \frac{| T^{(a_{n} + ϕ (n))} |}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \frac{1}{| T^{(a_{n} + ϕ (n))} |} \sum_{t \in T^{(a_{n} + ϕ (n) - 1)}} | a_{t} - a | . \end{matrix}

Since $T_{2}$ is a binary tree, we have

\begin{matrix} lim_{n \to \infty} \frac{| T^{(a_{n} + ϕ (n))} |}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} = lim_{n \to \infty} \frac{2^{a_{n} + ϕ (n) + 1} - 1}{2^{a_{n}} (2^{ϕ (n) + 1} - 1)} = 1 . \end{matrix}

Equation (51) immediately follows from (50), (52) and (53). $□$
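The ratio in (53) is easy to tabulate. The sketch below assumes the level-counting convention that can be read off from (53), namely $| T^{(n)} | = 2^{n + 1} - 1$ and $| L_{a}^{a + m} | = 2^{a} (2^{m + 1} - 1)$ ; the function names are hypothetical.

```python
def tree_size(n):
    """|T^(n)|: number of nodes on levels 0..n of the binary tree."""
    return 2 ** (n + 1) - 1


def block_size(a, m):
    """|L_a^{a+m}|: number of nodes on levels a..a+m, summed level by level."""
    return sum(2 ** k for k in range(a, a + m + 1))


def ratio(a, m):
    """|T^(a+m)| / |L_a^{a+m}|, the quantity whose limit is taken in (53)."""
    return tree_size(a + m) / block_size(a, m)


for a in (0, 3, 10):
    print(a, [ratio(a, m) for m in (1, 5, 10, 20)])
```

Under this convention the excess of the ratio over 1 equals $(2^{a} - 1) / (2^{a} (2^{m + 1} - 1))$ , which is below $2^{- m}$ whatever the value of $a$ , so only the growth of the block length $m = ϕ (n)$ matters.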

Now, we present the proof of Theorem 3.1 as follows.

Proof of Theorem 3.1

It is easy to see from (24) that $lim_{n \to \infty} ϕ (n) = \infty$ and (34) is satisfied. By (25) and Lemma 4.2, we have

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} | P_{t} (y_{1}, y_{2} | x) - P (y_{1}, y_{2} | x) | = 0 . \end{matrix}

Let $g_{t} (x, y_{1}, y_{2}) = I_{k} (y_{1})$ in Lemma 4.1. Obviously, ${g_{t} (x, y_{1}, y_{2}), t \in T_{2}}$ are uniformly bounded. Note that

\begin{matrix} H_{a_{n}, ϕ (n)} (ω) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} I_{k} (X_{t^{1}}) = S_{k}^{1} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}), \end{matrix}

and

\begin{matrix} G_{a_{n}, ϕ (n)} (ω) = & \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} E [g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) | X_{t}] \\ = & \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} g_{t} (X_{t}, x_{t^{1}}, x_{t^{2}}) \cdot P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \\ = & \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} I_{k} (x_{t^{1}}) \cdot P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) . \end{matrix}

From Lemma 4.1, we have

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} {S_{k}^{1} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) \\ - \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} I_{k} (x_{t^{1}}) P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t})} = 0 a . e . . \end{matrix}

From (54), it can be easily verified that

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} {\sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} I_{k} (x_{t^{1}}) P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \\ - \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} I_{k} (x_{t^{1}}) P (x_{t^{1}}, x_{t^{2}} | X_{t})} = 0 . \end{matrix}

Since $\sum_{x_{t^{2}} \in G} P (x_{t^{1}}, x_{t^{2}} | X_{t}) = P_{1} (x_{t^{1}} | X_{t})$ , we have

\begin{matrix} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} I_{k} (x_{t^{1}}) P (x_{t^{1}}, x_{t^{2}} | X_{t}) \\ = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} P_{1} (k | X_{t}) \\ = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{l = 0}^{b - 1} I_{l} (X_{t}) P_{1} (k | l) \\ = \sum_{l = 0}^{b - 1} P_{1} (k | l) S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) . \end{matrix}

By (57)–(59), we have

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} {S_{k}^{1} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) - \sum_{l = 0}^{b - 1} P_{1} (k | l) S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 1})} = 0 \quad \text{a.e.} \end{matrix}

Letting $g_{t} (x, y_{1}, y_{2}) = I_{k} (y_{2})$ in Lemma 4.1, we similarly obtain

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} {S_{k}^{2} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) - \sum_{l = 0}^{b - 1} P_{2} (k | l) S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 1})} = 0 \quad \text{a.e.} \end{matrix}

Adding (60) and (61), and noticing that

\begin{matrix} 0 \leq lim_{n \to \infty} \frac{S_{k} (L_{a_{n}})}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \leq lim_{n \to \infty} \frac{| L_{a_{n}} |}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} = lim_{n \to \infty} \frac{2^{a_{n}}}{2^{a_{n}} (2^{ϕ (n) + 1} - 1)} = 0, \end{matrix}

$lim_{n \to \infty} \frac{| L_{a_{n}}^{a_{n} + ϕ (n)} |}{| L_{a_{n}}^{a_{n} + ϕ (n) - 1} |} = 2$ , and $Q = \frac{1}{2} (P_{1} + P_{2})$ . By (23), we have

\begin{matrix} lim_{n \to \infty} {\frac{S_{k} (L_{a_{n}}^{a_{n} + ϕ (n)})}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} - \sum_{l = 0}^{b - 1} Q (k | l) \frac{S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 1})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 1} |}} = 0 \quad \text{a.e.} \end{matrix}

Letting $ϕ^{'} (n) = ϕ (n) - 1$ , it is easy to see that ${ϕ^{'} (n), n \geq 0}$ also satisfies (34). Using the same argument as that used to derive (62), we can prove that

\begin{matrix} lim_{n \to \infty} {\frac{S_{k} (L_{a_{n}}^{a_{n} + ϕ (n) - 1})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 1} |} - \sum_{l = 0}^{b - 1} Q (k | l) \frac{S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 2})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 2} |}} = 0 \quad \text{a.e.} \end{matrix}

Multiplying the k-th equality of (63) by Q(j|k), adding them together and using (62), we have

\begin{matrix} 0 = & lim_{n \to \infty} [\sum_{k = 0}^{b - 1} \frac{S_{k} (L_{a_{n}}^{a_{n} + ϕ (n) - 1})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 1} |} Q (j | k) - \sum_{k = 0}^{b - 1} \sum_{l = 0}^{b - 1} \frac{S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 2})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 2} |} Q (k | l) Q (j | k)] \\ = & lim_{n \to \infty} {[\sum_{k = 0}^{b - 1} \frac{S_{k} (L_{a_{n}}^{a_{n} + ϕ (n) - 1})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 1} |} Q (j | k) - \frac{S_{j} (L_{a_{n}}^{a_{n} + ϕ (n)})}{| L_{a_{n}}^{a_{n} + ϕ (n)} |}] \\ + [\frac{S_{j} (L_{a_{n}}^{a_{n} + ϕ (n)})}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} - \sum_{k = 0}^{b - 1} \sum_{l = 0}^{b - 1} \frac{S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 2})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 2} |} Q (k | l) Q (j | k)]} \\ = & lim_{n \to \infty} [\frac{S_{j} (L_{a_{n}}^{a_{n} + ϕ (n)})}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} - \sum_{l = 0}^{b - 1} \frac{S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 2})}{| L_{a_{n}}^{a_{n} + ϕ (n) - 2} |} Q^{(2)} (j | l)] a . e . . \end{matrix}

where $Q^{(N)} (j | l)$ is the N-step transition probability determined by Q. By induction, we have

\begin{matrix} lim_{n \to \infty} [\frac{S_{j} (L_{a_{n}}^{a_{n} + ϕ (n)})}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} - \sum_{l = 0}^{b - 1} \frac{S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - N})}{| L_{a_{n}}^{a_{n} + ϕ (n) - N} |} Q^{(N)} (j | l)] = 0 \quad \text{a.e.} \end{matrix}

Noticing that

\begin{matrix} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n) - N} |} \sum_{l = 0}^{b - 1} S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - N}) = 1, \end{matrix}

and

\begin{matrix} lim_{N \to \infty} Q^{(N)} (j | l) = π (j), j \in G . \end{matrix}

(26) follows from (65), (66) and (67). This completes the proof of Theorem 3.1.

$□$
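The last step of the proof uses the convergence (67) of the $N$-step transition probabilities of $Q = \frac{1}{2} (P_{1} + P_{2})$ to $π$ . A small sketch of this step follows; the joint kernel `P`, with `P[l][(y1, y2)]` standing for $P (y_{1}, y_{2} | l)$ on a two-letter alphabet, is a made-up example and the helper names are hypothetical.

```python
def marginals(P, b):
    """Marginal matrices P1(k | l) and P2(k | l) of a joint kernel P(y1, y2 | l)."""
    P1 = [[sum(P[l][(k, y2)] for y2 in range(b)) for k in range(b)] for l in range(b)]
    P2 = [[sum(P[l][(y1, k)] for y1 in range(b)) for k in range(b)] for l in range(b)]
    return P1, P2


def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(Q, N):
    """N-step transition matrix: entry [l][j] is Q^{(N)}(j | l)."""
    n = len(Q)
    R = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(N):
        R = mat_mul(R, Q)
    return R


b = 2
P = [  # hypothetical joint kernel; each row sums to 1 and the children are dependent
    {(0, 0): 0.5, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.2},
    {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4},
]
P1, P2 = marginals(P, b)
Q = [[0.5 * (P1[l][k] + P2[l][k]) for k in range(b)] for l in range(b)]

print(Q)
for N in (1, 5, 60):
    print(N, mat_pow(Q, N))
```

For this particular `P` the matrix `Q` is symmetric, so its stationary distribution is uniform and every row of `mat_pow(Q, N)` approaches it as `N` grows.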

Before presenting the proof of Theorem 3.2, we cite a lemma which will be used.

Lemma 4.3

(Dong et al. [11]) Let $T_{2}$ be a binary tree, $φ (x)$ be a bounded function defined on interval $△$ , and $φ$ be continuous at $x = b (b \in △)$ . Let ${b_{t}, t \in T_{2}}$ be a collection of real numbers. If

\begin{matrix} lim_{n \to \infty} \frac{1}{| T^{(n)} |} \sum_{t \in T^{(n - 1)}} | b_{t} - b | = 0, \end{matrix}

then

\begin{matrix} lim_{n \to \infty} \frac{1}{| T^{(n)} |} \sum_{t \in T^{(n - 1)}} | φ (b_{t}) - φ (b) | = 0 . \end{matrix}

Proof of Theorem 3.2

Since ${a_{n}, n \geq 0}$ is bounded, there exists $M \geq 0$ such that $| a_{n} | \leq M$ for all $n \geq 0$ . Note that

\begin{matrix} E [e^{| ln P (X^{L_{a_{n}}}) |}] = \sum_{x^{L_{a_{n}}}} e^{- ln P (X^{L_{a_{n}}} = x^{L_{a_{n}}})} P (X^{L_{a_{n}}} = x^{L_{a_{n}}}) \leq b^{| L_{a_{n}} |} . \end{matrix}
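The bound in (69) holds because each atom $x$ with $P (x) > 0$ contributes $e^{- ln P (x)} P (x) = 1$ to the sum, so the expectation simply counts the atoms, of which there are at most $b^{| L_{a_{n}} |}$ . A minimal sketch, with a hypothetical alphabet size, level and probability mass function:

```python
import math


def mean_exp_abs_log(pmf):
    """E[exp(|ln p(X)|)] for X distributed according to the mass function `pmf`."""
    return sum(p * math.exp(abs(math.log(p))) for p in pmf if p > 0)


b, level = 2, 2                       # hypothetical alphabet size and level a_n
sites = 2 ** level                    # |L_{a_n}| = 2^{a_n} nodes on that level
weights = [1.0 + (i % 5) for i in range(b ** sites)]   # arbitrary positive weights
total = sum(weights)
pmf = [w / total for w in weights]    # a law on the b^{|L_{a_n}|} configurations

print(mean_exp_abs_log(pmf), b ** sites)
print(mean_exp_abs_log([0.5, 0.5, 0.0, 0.0]))   # zero-probability atoms do not count
```

Since $| L_{a_{n}} | = 2^{a_{n}} \leq 2^{M}$ when $a_{n} \leq M$ , this count is bounded uniformly in $n$ .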

It is easy to see from (24) that ${ϕ (n), n \geq 0}$ satisfies (34). By Markov inequality and (34), we have for every $ε > 0$ ,

\begin{matrix} \sum_{n = 1}^{\infty} P [\frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} | ln P (X^{L_{a_{n}}}) | \geq ε] \leq b^{2^{M}} \sum_{n = 1}^{\infty} exp {- ε | L_{a_{n}}^{a_{n} + ϕ (n)} |} < \infty . \end{matrix}

By the Borel–Cantelli lemma, we get

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} ln P (X^{L_{a_{n}}}) = 0 \quad \text{a.e.} \end{matrix}

Let $φ (x) = x ln x$ (with $φ (0) = 0$ ). It is easy to see that $φ (x)$ is a continuous function on the interval [0, 1]. By Lemmas 4.2, 4.3 and (25), we have, for all $k_{1}, k_{2}, l \in G$ ,

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} | P_{t} (k_{1}, k_{2} | l) ln P_{t} (k_{1}, k_{2} | l) \\ - P (k_{1}, k_{2} | l) ln P (k_{1}, k_{2} | l) | = 0 . \end{matrix}

Let $g_{t} (x, y_{1}, y_{2}) = ln P_{t} (y_{1}, y_{2} | x)$ for all $t \in T_{2}$ in Lemma 4.1. By (35) and (36), we have

\begin{matrix} H_{a_{n}, ϕ (n)} (ω) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} ln P_{t} (X_{t^{1}}, X_{t^{2}} | X_{t}), \end{matrix}

\begin{matrix} G_{a_{n}, ϕ (n)} (ω) = \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) ln P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) . \end{matrix}

Let $α = \frac{1}{2}$ . For any $t \in T_{2}$ , we have

\begin{matrix} E [g_{t}^{2} (X_{t}, X_{t^{1}}, X_{t^{2}}) e^{α | g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) |} | X_{t}] \\ = \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} {ln}^{2} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \cdot e^{- \frac{1}{2} ln P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t})} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \\ = \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} {ln}^{2} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) {[P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t})]}^{\frac{1}{2}} \\ \leq 16 b^{2} e^{- 2} , \end{matrix}

and, for all $t \in T_{2}$ ,

\begin{matrix} E [e^{\frac{1}{2} | g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) |} | X_{t}] < \infty . \end{matrix}

Thus

\begin{matrix} \underset{n \to \infty}{lim sup} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} E [g_{t}^{2} (X_{t}, X_{t^{1}}, X_{t^{2}}) \cdot e^{\frac{1}{2} | g_{t} (X_{t}, X_{t^{1}}, X_{t^{2}}) |} | X_{t}] \\ \leq 16 b^{2} e^{- 2} . \end{matrix}
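The constant $16 b^{2} e^{- 2}$ in (74) and (76) comes from maximizing $h (x) = x^{1 / 2} {ln}^{2} x$ on $(0, 1]$ : the derivative vanishes at $ln x = - 4$ , where $h (e^{- 4}) = 16 e^{- 2}$ , and the sum in (74) has $b^{2}$ terms. A short numerical sketch of this step (the grid resolution and the names are arbitrary choices):

```python
import math


def h(x):
    """h(x) = sqrt(x) * (ln x)^2 on (0, 1]."""
    return math.sqrt(x) * math.log(x) ** 2


bound = 16.0 * math.exp(-2.0)                       # claimed maximum of h
peak = h(math.exp(-4.0))                            # value at the critical point
grid_max = max(h(k / 100000.0) for k in range(1, 100001))

print(peak, bound, grid_max)

# Row sum in (74) for a hypothetical uniform kernel on b^2 = 4 child pairs.
b = 2
row = [1.0 / b ** 2] * (b ** 2)
print(sum(h(p) for p in row), b ** 2 * bound)
```

The grid maximum stays just below `bound`, and any probability row on $G^{2}$ gives a sum of at most $b^{2}$ times that value.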

By (73)–(76) and Lemma 4.1, we have

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} {\sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} ln P_{t} (X_{t^{1}}, X_{t^{2}} | X_{t}) \\ - \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \cdot ln P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t})} = 0 a . e . . \end{matrix}

Now, we have

\begin{matrix} | \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{(x_{t^{1}}, x_{t^{2}}) \in G^{2}} P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \cdot ln P_{t} (x_{t^{1}}, x_{t^{2}} | X_{t}) \\ - \frac{1}{2} \sum_{l = 0}^{b - 1} π (l) \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} P (k_{1}, k_{2} | l) ln P (k_{1}, k_{2} | l) | \\ \leq | \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{l = 0}^{b - 1} \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} I_{l} (X_{t}) P_{t} (k_{1}, k_{2} | l) \cdot ln P_{t} (k_{1}, k_{2} | l) \\ - \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{l = 0}^{b - 1} \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} I_{l} (X_{t}) P (k_{1}, k_{2} | l) \cdot ln P (k_{1}, k_{2} | l) | \\ + | \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} \sum_{l = 0}^{b - 1} \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} I_{l} (X_{t}) P (k_{1}, k_{2} | l) \cdot ln P (k_{1}, k_{2} | l) \\ - \frac{1}{2} \sum_{l = 0}^{b - 1} π (l) \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} P (k_{1}, k_{2} | l) ln P (k_{1}, k_{2} | l) | \\ \leq \sum_{l = 0}^{b - 1} \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} | P_{t} (k_{1}, k_{2} | l) \cdot ln P_{t} (k_{1}, k_{2} | l) \\ - P (k_{1}, k_{2} | l) \cdot ln P (k_{1}, k_{2} | l) | + \sum_{l = 0}^{b - 1} \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} | P (k_{1}, k_{2} | l) \cdot ln P (k_{1}, k_{2} | l) | \\ \cdot | \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} I_{l} (X_{t}) - \frac{1}{2} π (l) | \\ \leq \sum_{l = 0}^{b - 1} \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} | P_{t} (k_{1}, k_{2} | l) \cdot ln P_{t} (k_{1}, k_{2} | l) \\ - P (k_{1}, k_{2} | l) \cdot ln P (k_{1}, k_{2} | l) | + \sum_{l = 0}^{b - 1} \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} | P (k_{1}, k_{2} | l) \cdot ln P (k_{1}, k_{2} | l) | \\ \cdot | \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} S_{l} (L_{a_{n}}^{a_{n} + ϕ (n) - 1}) - \frac{1}{2} π (l) | \quad \text{a.e.} \end{matrix}

By Theorem 3.1, (72), (77) and (78), and noticing that $lim_{n \to \infty} \frac{| L_{a_{n}}^{a_{n} + ϕ (n) - 1} |}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} = \frac{1}{2}$ , we have

\begin{matrix} lim_{n \to \infty} \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} ln P_{t} (X_{t^{1}}, X_{t^{2}} | X_{t}) \\ = \frac{1}{2} \sum_{l = 0}^{b - 1} π (l) \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} P (k_{1}, k_{2} | l) ln P (k_{1}, k_{2} | l) \quad \text{a.e.} \end{matrix}

(27) can be obtained from (17), (71) and (79), which completes the proof of Theorem 3.2. $□$

Proof of Corollary 3.1

For all $t \in T_{2}$ and all $x, y_{1}, y_{2} \in G$ , let $P_{t} (y_{1}, y_{2} | x) = Q_{t^{1}} (y_{1} | x) Q_{t^{2}} (y_{2} | x)$ . From Remark 2.4 we know that the nonhomogeneous Markov chain indexed by a binary tree given in this corollary is a nonhomogeneous bifurcating Markov chain indexed by a binary tree with the stochastic matrices ${P_{t} = (P_{t} (y_{1}, y_{2} | x)), t \in T_{2}}$ , and

\begin{matrix} g_{a_{n}, ϕ (n)} (ω) = - \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} [ln P (X^{L_{a_{n}}}) + \sum_{t \in L_{a_{n} + 1}^{a_{n} + ϕ (n)}} ln Q_{t} (X_{t} | X_{1_{t}})] \\ = - \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} [ln P (X^{L_{a_{n}}}) + \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} ln Q_{t^{1}} (X_{t^{1}} | X_{t}) \\ + \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} ln Q_{t^{2}} (X_{t^{2}} | X_{t})] \\ = - \frac{1}{| L_{a_{n}}^{a_{n} + ϕ (n)} |} [ln P (X^{L_{a_{n}}}) + \sum_{t \in L_{a_{n}}^{a_{n} + ϕ (n) - 1}} ln P_{t} (X_{t^{1}}, X_{t^{2}} | X_{t})] \\ = f_{a_{n}, ϕ (n)} (ω) . \end{matrix}

Let $P (y_{1}, y_{2} | x) = Q (y_{1} | x) Q (y_{2} | x)$ . It is easy to see that $P_{1} = Q, P_{2} = Q, \frac{1}{2} (P_{1} + P_{2}) = Q$ , and Q is ergodic. Since

\begin{matrix} | P_{t} (k_{1}, k_{2} | l) - P (k_{1}, k_{2} | l) | \\ = | Q_{t^{1}} (k_{1} | l) Q_{t^{2}} (k_{2} | l) - Q (k_{1} | l) Q (k_{2} | l) | \\ \leq | Q_{t^{1}} (k_{1} | l) Q_{t^{2}} (k_{2} | l) - Q (k_{1} | l) Q_{t^{2}} (k_{2} | l) | + | Q (k_{1} | l) Q_{t^{2}} (k_{2} | l) \\ - Q (k_{1} | l) Q (k_{2} | l) | \\ \leq | Q_{t^{1}} (k_{1} | l) - Q (k_{1} | l) | + | Q_{t^{2}} (k_{2} | l) - Q (k_{2} | l) |, \end{matrix}

and by (29), for $i = 1, 2,$

\begin{matrix} lim_{n \to \infty} \frac{1}{| T^{(n)} |} \sum_{t \in T^{(n - 1)} \setminus \{o\}} | Q_{t^{i}} (k_{1} | l) - Q (k_{1} | l) | \\ \leq lim_{n \to \infty} \frac{1}{| T^{(n)} |} \sum_{t \in T^{(n)} \setminus \{o\}} | Q_{t} (k_{1} | l) - Q (k_{1} | l) | = 0 . \end{matrix}

Thus (25) follows from (81) and (82). By Theorem 3.2 and (80),

\begin{matrix} lim_{n \to \infty} g_{a_{n}, ϕ (n)} (ω) \\ = lim_{n \to \infty} f_{a_{n}, ϕ (n)} (ω) \\ = - \frac{1}{2} \sum_{l = 0}^{b - 1} π (l) \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} P (k_{1}, k_{2} | l) ln P (k_{1}, k_{2} | l) \\ = - \frac{1}{2} \sum_{l = 0}^{b - 1} π (l) \sum_{k_{1} = 0}^{b - 1} \sum_{k_{2} = 0}^{b - 1} Q (k_{1} | l) Q (k_{2} | l) \cdot [ln Q (k_{1} | l) + ln Q (k_{2} | l)] \\ = - \sum_{l = 0}^{b - 1} \sum_{k = 0}^{b - 1} π (l) Q (k | l) ln Q (k | l) a . e . . \end{matrix}

Thus, (30) holds. $□$
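The last equality in (83) uses only $\sum_{k} Q (k | l) = 1$ : each of the two logarithmic terms contributes $\sum_{k} Q (k | l) ln Q (k | l)$ , which cancels the factor $\frac{1}{2}$ . A brief numerical sketch of this reduction, with a hypothetical ergodic matrix `Q` (stored as `Q[l][k]` for $Q (k | l)$ ) and its stationary distribution `PI`:

```python
import math

Q = [[0.9, 0.1], [0.4, 0.6]]   # hypothetical ergodic matrix, Q[l][k] = Q(k | l)
PI = [0.8, 0.2]                 # its stationary distribution, PI Q = PI


def product_form_entropy(Q, pi):
    """-(1/2) sum_l pi(l) sum_{k1,k2} P(k1,k2|l) ln P(k1,k2|l), P = Q(k1|l) Q(k2|l)."""
    b = len(Q)
    total = 0.0
    for l in range(b):
        for k1 in range(b):
            for k2 in range(b):
                p = Q[l][k1] * Q[l][k2]
                total += pi[l] * p * math.log(p)
    return -0.5 * total


def chain_entropy(Q, pi):
    """-sum_l sum_k pi(l) Q(k|l) ln Q(k|l), the right-hand side of (83)."""
    b = len(Q)
    return -sum(pi[l] * Q[l][k] * math.log(Q[l][k])
                for l in range(b) for k in range(b))


print(product_form_entropy(Q, PI), chain_entropy(Q, PI))
```

The two printed values agree up to rounding, which is the entropy rate of the chain with transition matrix $Q$ .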

Acknowledgements

The authors sincerely thank the editor and reviewers for their helpful and important comments, especially during the COVID-19 pandemic. The authors are also very thankful to Professor Keyue Ding, who helped greatly to improve the English of this paper. This work is supported by the National Natural Science Foundation of China (11971197, 11601191).

References

1. Algoet PH, Cover TM. A sandwich proof of the Shannon–McMillan–Breiman theorem. Ann. Probab. 1988;16(2):899–909. doi: 10.1214/aop/1176991794.
2. Barron AR. The strong ergodic theorem for densities: generalized Shannon–McMillan–Breiman theorem. Ann. Probab. 1985;13(4):1292–1303. doi: 10.1214/aop/1176992813.
3. Benjamini I, Peres Y. Markov chains indexed by trees. Ann. Probab. 1994;22:219–243. doi: 10.1214/aop/1176988857.
4. Berger T, Ye Z. Entropic aspects of random fields on trees. IEEE Trans. Inform. Theory. 1990;36:1006–1018. doi: 10.1109/18.57200.
5. Billingsley P. Ergodic Theory and Information. New York: Wiley; 1965.
6. Breiman L. The individual ergodic theorem of information theory. Ann. Math. Stat. 1957;28(3):809–811. doi: 10.1214/aoms/1177706899.
7. Chen DY. Average properties of random walks on Galton-Watson trees. Ann. Inst. Henri Poincare. 1997;33(3):359–369.
8. Chung KL. The ergodic theorem of information theory. Ann. Math. Stat. 1961;32(2):612–614. doi: 10.1214/aoms/1177705069.
9. Dang H, Yang WG, Shi ZY. The strong law of large numbers and the entropy ergodic theorem for nonhomogeneous bifurcating Markov chains indexed by a binary tree. IEEE Trans. Inf. Theory. 2015;61(4):1640–164. doi: 10.1109/TIT.2015.2404310.
10. Dembo A, Mörters P, Sheffield S. Large deviations of Markov chains indexed by random trees. Ann. I. H. Poincare-Fr. 2005;41(6):971–996. doi: 10.1016/j.anihpb.2004.09.005.
11. Dong Y, Yang WG, Bai JF. The strong law of large numbers and the Shannon-McMillan theorem for nonhomogeneous Markov chains indexed by a Cayley tree. Stat. Probab. Lett. 2011;81(12):1883–1890. doi: 10.1016/j.spl.2011.06.021.
12. Gaposhkin VF. The law of large numbers for moving averages of independent random variables. Mathematical notes of the Academy of Sciences of the USSR. 1987;42:579–583.
13. Guyon J. Limit theorems for bifurcating Markov chains. Application to the detection of cellular aging. Ann. Appl. Probab. 2007;17(5-6):1538–1569.
14. Gray RM, Kieffer JC. Asymptotically mean stationary measures. Ann. Probab. 1980;8:962–973. doi: 10.1214/aop/1176994624.
15. Huang HL, Yang WG. Strong law of large number for Markov chains indexed by an infinite tree with uniformly bounded degree. Sci. China Ser. A. 2008;51(2):195–202. doi: 10.1007/s11425-008-0015-1.
16. Lai TL. Limit theorems for moving averages. Probab. Finance Insur. 2004;25(4):1–14.
17. Lanzinger H. An almost sure limit theorem for moving averages of random variables between the strong law of large numbers and the Erdös–Rényi law. ESAIM-Probab. Stat. 1998;2:163–183. doi: 10.1051/ps:1998106.
18. Le Gall JF. A conditional limit theorem for tree-indexed random walk. Stoch. Proc. Appl. 2006;116:539–567. doi: 10.1016/j.spa.2005.11.008.
19. Liu W, Yang WG. An extension of Shannon–McMillan theorem and some limit properties for nonhomogeneous Markov chains. Stoch. Proc. Appl. 1996;61:129–145. doi: 10.1016/0304-4149(95)00068-2.
20. McMillan B. The basic theorems of information theory. Ann. Math. Statist. 1953;24:196–219. doi: 10.1214/aoms/1177729028.
21. Pemantle R. Automorphism invariant measure on trees. Ann. Probab. 1992;20:1549–1566. doi: 10.1214/aop/1176989706.
22. Shannon C. A mathematical theory of communication. Bell. Syst. Tech. J. 1948;27(379–423):623–656. doi: 10.1002/j.1538-7305.1948.tb00917.x.
23. Shepp LA. A limit law concerning moving averages. Ann. Math. Statist. 1964;35(1):424–428. doi: 10.1214/aoms/1177703767.
24. Shi ZY, Zhong PP, Fan Y. The Shannon–McMillan theorem for Markov chains indexed by a Cayley tree in random environment. Probab. Eng. Inform. Sc. 2018;32(4):626–639. doi: 10.1017/S0269964817000444.
25. Shi ZY, Yang WG. Some limit properties for the mth-order nonhomogeneous Markov chains indexed by an rooted Cayley tree. Stat. Probab. Lett. 2010;80:1223–1233. doi: 10.1016/j.spl.2010.03.020.
26. Telcs A, Wormald N. Branching and tree indexed random walks on fractals. J. Appl. Prob. 1999;36:999–1011. doi: 10.1239/jap/1032374750.
27. Wang ZZ, Yang WG. The generalized entropy ergodic theorem for nonhomogeneous Markov chains. J. Theor. Probab. 2016;29:761–775. doi: 10.1007/s10959-015-0597-9.
28. Yang J, Yang WG. The generalized entropy ergodic theorem for nonhomogeneous Markov chains indexed by a Cayley tree. Chin. Ann. Math. A. 2020;41(1):99–114. (In Chinese)
29. Yang WG. The asymptotic equipartition property for nonhomogeneous Markov information sources. Probab. Eng. Inform. Sci. 1998;12:509–518. doi: 10.1017/S0269964800005350.
30. Yang WG, Liu W. Strong law of large numbers for Markov chains fields on a Bethe tree. Statist. Probab. Lett. 2000;49:245–250. doi: 10.1016/S0167-7152(00)00053-5.
31. Yang WG. Some limit properties for Markov chains indexed by a homogeneous tree. Statist. Probab. Lett. 2003;65:241–250. doi: 10.1016/j.spl.2003.04.001.
32. Yang WG, Liu W. The asymptotic equipartition property for mth-order nonhomogeneous Markov information sources. IEEE Trans. Inform. Theory. 2004;50(12):3326–3330. doi: 10.1109/TIT.2004.838339.
33. Yang WG, Ye Z. The asymptotic equipartition property for Markov chains indexed by a Homogeneous tree. IEEE Trans. Inf. Theory. 2007;53(9):3275–3280. doi: 10.1109/TIT.2007.903134.
34. Yamamoto K. Large deviation theorem for branches of the random binary tree in the Horton–Strahler analysis. SIAM J. Discrete Math. 2020;34(1):938–949. doi: 10.1137/18M1192810.
35. Ye Z, Berger T. Ergodic, regularity and asymptotic equipartition property of random fields on trees. J Combin. Inform. System Sci. 1996;21:157–184.
36. Ye Z, Berger T. Information measures for discrete random fields. Beijing: Science; 1998.
