Entropy. 2024 Apr 19;26(4):346. doi: 10.3390/e26040346

Evaluating the Gilbert–Varshamov Bound for Constrained Systems

Keshav Goyal 1, Han Mao Kiah 1,*
Editor: T. Aaron Gulliver
PMCID: PMC11049528  PMID: 38667900

Abstract

We revisit the well-known Gilbert–Varshamov (GV) bound for constrained systems. In 1991, Kolesnik and Krachkovsky showed that the GV bound can be determined via the solution of an optimization problem. Later, in 1992, Marcus and Roth modified the optimization problem and improved the GV bound in many instances. In this work, we provide explicit numerical procedures to solve these two optimization problems and, hence, compute the bounds. We then show that the procedures can be further simplified when we plot the respective curves. In the case where the graph presentation comprises a single state, we provide explicit formulas for both bounds.

Keywords: Gilbert–Varshamov bound, constrained codes, asymptotic rates, sliding window constrained codes

1. Introduction

From early applications in magnetic recording systems to recent applications in DNA-based data storage [1,2,3,4] and energy-harvesting [5,6,7,8,9,10], constrained codes have played a central role in enhancing reliability in many data storage and communications systems (see also [11] for an overview). Specifically, for most data storage systems, certain substrings are more prone to errors than others. Thus, by forbidding the appearance of such strings, that is, by imposing constraints on the codewords, the user is able to reduce the likelihood of error. We refer to the collection of words that satisfy the constraints as the constrained space S.

To further reduce the error probability, one can impose certain distance constraints on the codebook. In this work, we focus on the Hamming metric and consider the maximum size of a codebook whose words belong to the constrained space S and whose pairwise distance is at least of a certain value d. Specifically, we study one of the most well-known and fundamental lower bounds of this quantity—the Gilbert–Varshamov (GV) bound.

To determine the GV bound, one requires two quantities: the size of the constrained space, |S|, and the ball volume, that is, the number of words at distance at most d−1 from a “center” word. In the case where the space is unconstrained, i.e., S = {0,1}^n, the ball volume does not depend on the center. Then, the GV bound is simply |S|/V, where V is the ball volume of a center. However, for most constrained systems, the ball volume varies with the center. Nevertheless, Kolesnik and Krachkovsky showed that the GV lower bound can be generalized to |S|/(4V̄), where V̄ is the average ball volume [12]. This was further improved by Gu and Fuja to |S|/V̄ in [13] (see pp. 242–243 in [11] for additional details). In the same paper [12], they showed that the asymptotic rate of the average ball volume can be computed via an optimization problem. Later, Marcus and Roth modified the optimization problem by including an additional constraint and variable [14], and the resulting bound, which we refer to as the GV-MR bound, improves on the usual GV bound. Furthermore, in most cases, the improvement is strictly positive.

In the roughly three decades since, however, very few works have evaluated these bounds for specific constrained systems. To the best of our knowledge, in all works that numerically computed the GV bound and/or the GV-MR bound, the constrained systems of interest have at most eight states [15]. In [15], the authors wrote that “evaluation of the bound required considerable computation”, referring to the GV-MR bound.

In this paper, we revisit the optimization problems defined by Kolesnik and Krachkovsky [12] and Marcus and Roth [14] and develop a suite of explicit numerical procedures that solve these problems. In particular, to demonstrate the feasibility of our methods, we evaluated and plotted the GV and GV-MR bounds for a constrained system involving 120 states in Figure 1b.

Figure 1.


Lower bounds for optimal asymptotic code rates R(δ;S) for the class of sliding-window constrained codes.

We provide a high-level description of our approach. For both optimization problems, we first characterized the optimal solutions as roots of certain equations. Then, using the celebrated Newton–Raphson iterative procedure, we proceeded to find the roots of these equations. However, as the latter equations involved the largest eigenvalues of certain matrices, each Newton–Raphson iteration required the (partial) derivatives of these eigenvalues (in some variables). To resolve this, we made modifications to another celebrated iterative procedure, the power iteration method, and the resulting procedures computed the GV and GV-MR bounds efficiently for a specific relative distance δ. Interestingly, if we plot the bounds for 0 ≤ δ ≤ 1, the numerical procedure can be further simplified. Specifically, by exploiting certain properties of the optimal solutions, we provide procedures that use fewer Newton–Raphson iterations.

Parts of this paper were presented at the IEEE International Symposium on Information Theory (ISIT 2022) [16]. In the next section, we provide the formal definitions and state the optimization problems that compute the GV bound.

2. Preliminaries

Let Σ = {0,1} be the binary alphabet and let Σ^n denote the set of all words of length n over Σ. A labeled graph G = (V, E, L) is a finite directed graph with states V, edges E ⊆ V × V, and an edge labeling L : E → Σ^s for some s ≥ 1. Here, we write v_i →^{σ} v_j to mean that there is an edge from v_i to v_j with label σ. The labeled graph G is deterministic if, for each state, the outgoing edges have distinct labels.

A constrained system S is, then, the set of all words obtained by reading the labels of paths in a labeled graph G. We say that G is a graph presentation of S. We further denote the set of all n-length words in S by S_n. Alternatively, S_n is the set of all words obtained by reading the labels of (n/s)-length paths in G. Then, the capacity of S, denoted by Cap(S), is given by Cap(S) ≜ lim sup_{n→∞} (log|S_n|)/n. It is well known that Cap(S) corresponds to the logarithm of the largest eigenvalue of the adjacency matrix AG (see, for example, [11]). Here, AG is a (|V| × |V|)-matrix whose rows and columns are indexed by V. For each entry (u,v) ∈ V × V, we set the corresponding entry to be one if (u,v) is an edge, and zero otherwise.
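Since Cap(S) is the logarithm of the dominant eigenvalue of AG, it can be approximated with a plain power iteration. The following sketch is our own illustration (not code from the paper); the golden-mean shift used below (binary words with no substring 11) is a standard example and not one of this paper's running examples.

```python
import math

def dominant_eigenvalue(A, iters=1000):
    """Power iteration for a primitive nonnegative matrix A (pure Python)."""
    n = len(A)
    v = [1.0] * n
    lam = 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)                  # sup-norm normalization
        v = [x / lam for x in w]
    return lam

def capacity(A):
    """Cap(S) = log2 of the dominant eigenvalue of the adjacency matrix A_G."""
    return math.log2(dominant_eigenvalue(A))

# Golden-mean shift (binary words with no substring "11"):
# capacity equals log2 of the golden ratio.
A = [[1, 1],
     [1, 0]]
print(round(capacity(A), 4))  # 0.6942
```

For the small matrices in this paper, a few hundred iterations already give the eigenvalue to near machine precision, since the error decays geometrically in the ratio of the two largest eigenvalue moduli.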

Every constrained system can be presented by a deterministic graph G. Furthermore, any deterministic graph G can be transformed into a primitive deterministic graph H such that the capacity of G is the same as the capacity of the constrained system presented by some irreducible component (maximal irreducible subgraph) of H (see, for example, Marcus et al. [11]). It should be noted that a graph G is primitive if there exists a positive integer ℓ such that (AG)^ℓ is strictly positive. Therefore, we henceforth assume that our graphs are deterministic and primitive. When |V| = 1, we call this a single-state graph presentation and study these graphs in Section 5.

For x, y ∈ S, d_H(x,y) is the Hamming distance between x and y. We fix 1 ≤ d ≤ n, and a fundamental problem in coding theory is finding the largest subset C of S_n such that d_H(x,y) ≥ d for all distinct x, y ∈ C. Let A(n,d;S) denote the size of the largest such subset C.

In terms of asymptotic rates, we fix 0 ≤ δ ≤ 1, and our task is to find the highest attainable rate, denoted by R(δ;S), which is given by R(δ;S) ≜ lim sup_{n→∞} (log A(n, δn; S))/n.

2.1. Review of Gilbert–Varshamov Bound

To define the GV bound, we need to determine the total ball size. Specifically, for x ∈ S_n and 0 ≤ r ≤ n, we define V(x,r;S) ≜ |{y ∈ S_n : d_H(x,y) ≤ r}|. We further define T(n,d;S) ≜ Σ_{x∈S_n} V(x, d−1; S). Then, the GV bound, as given by Gu and Fuja [13,17], states that there exists an (n,d;S) code of size at least |S_n|^2 / T(n,d;S).

In terms of asymptotic rates, there exists a family of (n,δn;S) codes such that their rates approach

RGV(δ) = 2Cap(S) − T(δ), (1)

where T(δ) ≜ lim sup_{n→∞} (log T(n, δn; S))/n.

In this paper, our main task is to determine RGV(δ) efficiently. We observe that, since Cap(S) = T(0), it suffices to find efficient ways of determining T(δ). It turns out that T(δ) can be found via the solution of a convex optimization problem. Specifically, given a labeled graph G = (V, E, L), we define its product graph G × G = (V', E', D) as follows:

  • V' ≜ V × V.

  • For (v_i,v_j), (v_k,v_ℓ) ∈ V', and (σ_1,σ_2) ∈ Σ^s × Σ^s, we draw an edge (v_i,v_j) →^{(σ_1,σ_2)} (v_k,v_ℓ) in E' if and only if both v_i →^{σ_1} v_k and v_j →^{σ_2} v_ℓ belong to E.

  • Then, we label the edges in E' with the function D : E' → {0, 1/s, …, 1}, where D((v_i,v_j) →^{(σ_1,σ_2)} (v_k,v_ℓ)) = d_H(σ_1,σ_2)/s.

A stationary Markov chain P on a graph G = (V, E, L) is a probability distribution function P : E → [0,1] such that Σ_{e∈E} P(e) = 1 and, for any state u of G, the sum of the probabilities of the outgoing edges equals the sum of the probabilities of the incoming edges. We denote by M(G) the set of all stationary Markov chains on G. For a state u ∈ V, let E_u denote the set of outgoing edges from u in G. The state vector π^T = (π_u)_{u∈V} of a stationary Markov chain P on G is defined by π_u = Σ_{e∈E_u} P(e). The entropy rate of a stationary Markov chain is defined by

H(P) = −Σ_{u∈V} Σ_{e∈E_u} P(e) log( P(e)/π_u ).

Furthermore, T(δ) can be obtained by solving the following optimization problem [12,14]:

T(δ) = sup{ H(P) : P ∈ M(G×G), Σ_{e∈E'} P(e)D(e) ≤ δ }. (2)

To this end, we consider the dual problem of (2). Specifically, we define a (|V|^2 × |V|^2)-distance matrix TG×G(y) whose rows and columns are indexed by V'. For each entry indexed by e ∈ V' × V', we set the entry to be zero if e ∉ E', and we set it to be y^{D(e)} if e ∈ E'. Then, the dual problem can be stated in terms of the dominant eigenvalue of the matrix TG×G(y).
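As a concrete illustration (our own sketch; the (tail, head, label) edge-list format is hypothetical and s = 1), the matrix TG×G(y) can be assembled directly from the edges of G. The check at the end verifies the identity TG×G(1) = AG ⊗ AG, which is also used in the proof of Lemma 2.

```python
# Assemble T_{GxG}(y) from an edge list (tail, head, label); with s = 1 the
# label Hamming distance is 0 or 1, and the product-edge entry is y^d.
def product_distance_matrix(n, edges, y):
    T = [[0.0] * (n * n) for _ in range(n * n)]
    for (u1, v1, a) in edges:
        for (u2, v2, b) in edges:
            d = 0 if a == b else 1
            T[u1 * n + u2][v1 * n + v2] += y ** d
    return T

# Golden-mean shift presentation (illustrative; not one of the paper's examples):
# state 0 = "last bit was 0", state 1 = "last bit was 1"; forbid "11".
edges = [(0, 0, '0'), (0, 1, '1'), (1, 0, '0')]

# Sanity check: at y = 1 every edge pair contributes 1, so T(1) = A_G (x) A_G.
A = [[1, 1], [1, 0]]
kron = [[A[i][k] * A[j][l] for k in range(2) for l in range(2)]
        for i in range(2) for j in range(2)]
T1 = product_distance_matrix(2, edges, 1.0)
assert T1 == [[float(x) for x in row] for row in kron]
```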

By applying the reduction techniques from [14], we can reduce the problem size by a factor of two. Formally, in the case of s = 1, we define a ((|V|+1 choose 2) × (|V|+1 choose 2))-reduced distance matrix BG×G(y) whose rows and columns are indexed by V^(2) ≜ {(v_i,v_j) : 1 ≤ i ≤ j ≤ |V|} using the following procedure.

Two states s_1 = (v_i,v_j) and s_2 = (v_k,v_ℓ) in G × G are said to be equivalent if v_i = v_ℓ and v_j = v_k. The matrix BG×G(y) is then obtained by merging all pairs of equivalent states s_1 and s_2. That is, we add the column indexed by s_2 to the column indexed by s_1 and then remove the row and column indexed by s_2. It should be noted that it may be possible to reduce the size of the matrix BG×G(y) further. However, for ease of exposition, we do not consider this case in this work.

Following this procedure, we observe that the entries in the matrix BG×G(y) can be described by the rules in Table 1. Moreover, the dominant eigenvalue of BG×G(y) is the same as that of TG×G(y). Then, by strong duality, computing (2) is equivalent to solving the following dual problem [18,19] (see also, [20]):

T(δ) = inf{ −δ log y + log Λ(BG×G(y)) : 0 ≤ y ≤ 1 }. (3)

Here, we use Λ(M) to denote the dominant eigenvalue of a matrix M. To simplify notation further, we write Λ(y;B) ≜ Λ(BG×G(y)).

Table 1.

We set the ((v_i,v_j),(v_k,v_ℓ)) entry of the matrix BG×G(y) according to the subgraph induced by the states v_i, v_j, v_k, and v_ℓ. Here, σ̄ denotes the complement of σ.

BG×G(y) at Entry ((v_i,v_j),(v_k,v_ℓ)) | Subgraph Induced by the States {v_i,v_j,v_k,v_ℓ}
0 | (five induced subgraphs; diagrams not reproduced)
1 | (three induced subgraphs; diagrams not reproduced)
y | (three induced subgraphs; diagrams not reproduced)
2y | (one induced subgraph; diagram not reproduced)

Since the objective function in (3) is convex, it follows from standard calculus that any local minimum solution y* in the interval [0,1] is also a global minimum solution. Furthermore, y* is a zero of the first derivative of the objective function. If we consider the numerator of this derivative, then y* is a root of the function

F(y) ≜ yΛ'(y;B) − δΛ(y;B). (4)

In Corollary 1, we show that there is only one y* such that F(y*) = 0 and that F'(y) is strictly positive for all values of y. Therefore, to evaluate the GV bound for a fixed δ, it suffices to determine y*.

Later, Marcus and Roth [14] improved the GV bound (1) by considering certain subsets of the constrained space S. This entails an additional constraint in the optimization problem (2) and, correspondingly, an additional variable in the dual problem (3). Specifically, they considered certain subsets S(p) ⊆ S where each symbol in the words of S(p) appears with a certain frequency dependent on the parameter p. We describe this in more detail in Section 4.

2.2. Our Contributions

  • (A)

    In Section 3, we develop the numerical procedures to compute T(δ) for a fixed δ and, hence, determine the GV bound (1). Our procedure modifies the well-known power iteration method to compute the derivatives of Λ(y;B). After that, using these derivatives, we apply the classical Newton–Raphson method to determine the root of (4). In the same section, we also study procedures to plot the GV curve, that is, the set {(δ, RGV(δ)) : 0 ≤ δ ≤ 1}. Here, we demonstrate that the GV curve can be plotted without any Newton–Raphson iterations.

  • (B)

    In Section 4, we then develop similar power iteration methods and numerical procedures to compute the GV-MR bound. As with the GV curve, we also provide a plotting procedure that uses significantly fewer Newton–Raphson iterations.

  • (C)

    In Section 5, we provide explicit formulas for the computation of the GV bound and GV-MR bound for graph presentations that have exactly one state but multiple parallel edges.

  • (D)

    In Section 6, we validate our methods by computing the GV and the GV-MR bounds for some specific constrained systems. For comparison purposes, we also plot a simple lower bound that is obtained by using an upper estimate of the ball size. From the plots in Figure 1, Figure 2 and Figure 3, it is also clear that the GV and GV-MR bounds are significantly better. We also observe that the GV bound and GV-MR bound for subblock energy-constrained codes (SECCs) obtained through our procedures improve the GV-type bound given by Tandon et al. (Proposition 12 in [21]).

Figure 2.


Lower bounds for optimal asymptotic code rates R(δ;S) for the class of runlength limited codes.

Figure 3.


Lower bounds for optimal asymptotic code rates R(δ;S) where S is the class of (3,2)-SECCs (subblock energy-constrained codes).

3. Evaluating the Gilbert–Varshamov Bound

In this section, we first describe a numerical procedure that solves (3) and, hence, determines RGV(δ) for fixed values of δ. Then, we show that the procedure can be simplified when we compute the GV curve, that is, the set of points {(δ, RGV(δ)) : δ ∈ [0,1]}. Here, we abuse notation and use [a,b] to denote the interval {x : a ≤ x ≤ b} if a < b, and the interval {x : b ≤ x ≤ a} otherwise.

Below, we provide a formal description of our procedure to obtain the GV bound for a fixed relative distance δ.

Procedure 1 (GV bound for fixed relative distance).

Input: Adjacency matrix AG, reduced distance matrix BG×G(y), and relative minimum distance δ

Output: GV bound, that is, RGV(δ) as defined in (1)

  • (1)

    Apply the Newton–Raphson method to obtain y* such that F(y*) is approximately zero.

    • Fix the tolerance value ϵ.

    • Set t = 0 and pick an initial guess 0 ≤ y_t ≤ 1.

    • While |y_t − y_{t−1}| > ϵ,

      • Compute the next guess y_{t+1} as follows:
        y_{t+1} = y_t − F(y_t)/F'(y_t) = y_t − (y_t Λ'(y_t;B) − δ Λ(y_t;B)) / ((1−δ) Λ'(y_t;B) + y_t Λ''(y_t;B)).
      • In this step, apply the power iteration method to compute Λ(y_t;B), Λ'(y_t;B), and Λ''(y_t;B).

      • Increment t by one.

    • Set y* ← y_t.

  • (2)

    Determine RGV(δ) using y*. Specifically, compute T(δ) ← −δ log y* + log Λ(y*;B), Cap(S) ← log Λ(AG), and RGV(δ) ← 2Cap(S) − T(δ).

Throughout Section 3 and Section 4, we illustrate our numerical procedures via a running example using the class of sliding window-constrained codes (SWCCs). Formally, we fix a window length L and a window weight w, and say that a binary word satisfies the (L,w)-sliding window weight constraint if the number of ones in every L consecutive bits is at least w. We refer to the collection of words that meet this constraint as an (L,w)-SWCC constrained system. The class of SWCCs was introduced by Tandon et al. for the application of simultaneous energy and information transfer [7,10]. Later, Immink and Cai [8,9] studied encoders for this constrained system and provided a simple graph presentation that uses only (L choose w) states.

In the next example, we illustrate how the numerical procedure can be used to compute the GV bound for δ = 0.1.

Example 1.

Let L = 3 and w = 2, and consider the (3,2)-SWCC constrained system. From [8], we have the following graph presentation with states x11, 101, and 110.

(Graph presentation figure not reproduced.)

Then, the corresponding adjacency and reduced distance matrices are as follows:

AG =
[1 1 0]
[0 0 1]
[1 0 0],   BG×G(y) =
[1 2y 0 1 0 0]
[0 0  1 0 y 0]
[1 y  0 0 0 0]
[0 0  0 0 0 1]
[0 0  1 0 0 0]
[1 0  0 0 0 0].

To determine the GV bound at δ = 0.1, we first approximate the optimal point y* for which −δ log y + log Λ(y;B) is minimized.

We apply the Newton–Raphson method to find a zero of the function F(y). Now, with the initial guess y_0 = 0.3, we apply the power iteration method to determine

Λ(0.3;B) = 1.659,  Λ'(0.3;B) = 0.694,  Λ''(0.3;B) = 0.183.

Then, we compute that y_1 ≈ 0.238. Repeating the computations, we have that y_2 ≈ 0.238. Since |y_2 − y_1| is less than the tolerance value 10^−5, we set y* = 0.238. Hence, we have that T(0.1) = 0.9. Applying the power iteration method to either AG or BG×G(0), we compute the capacity of the (3,2)-SWCC constrained system to be Cap(S) = 0.551. Then, the GV bound is given by RGV(0.1) = 2(0.551) − 0.9 = 0.202.
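Example 1 can be reproduced with a short script. One deliberate simplification, flagged here: instead of the authors' modified power iteration for the derivatives Λ'(y;B) and Λ''(y;B) (Appendix A), this sketch estimates them by central finite differences on a power-iteration estimate of Λ, which is simpler but less efficient.

```python
import math

def dominant(M, iters=1500):
    """Power iteration for the dominant eigenvalue of a primitive nonnegative matrix."""
    n = len(M)
    v = [1.0] * n
    est = 1.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        est = max(w)
        v = [x / est for x in w]
    return est

def B(y):
    """Reduced distance matrix B_{GxG}(y) of the (3,2)-SWCC system (Example 1)."""
    return [[1, 2 * y, 0, 1, 0, 0],
            [0, 0,     1, 0, y, 0],
            [1, y,     0, 0, 0, 0],
            [0, 0,     0, 0, 0, 1],
            [0, 0,     1, 0, 0, 0],
            [1, 0,     0, 0, 0, 0]]

A_G = [[1, 1, 0],
       [0, 0, 1],
       [1, 0, 0]]

def gv_bound(delta, y0=0.3, h=1e-4, eps=1e-8):
    """Procedure 1, with Lambda' and Lambda'' by central finite differences."""
    y = y0
    for _ in range(50):
        L = dominant(B(y))
        Lph = dominant(B(y + h))
        Lmh = dominant(B(y - h))
        Lp = (Lph - Lmh) / (2 * h)              # ~ Lambda'(y; B)
        Lpp = (Lph - 2 * L + Lmh) / (h * h)     # ~ Lambda''(y; B)
        # Newton-Raphson step for F(y) = y*Lambda' - delta*Lambda:
        step = (y * Lp - delta * L) / ((1 - delta) * Lp + y * Lpp)
        y -= step
        if abs(step) < eps:
            break
    T = -delta * math.log2(y) + math.log2(dominant(B(y)))
    cap = math.log2(dominant(A_G))
    return 2 * cap - T
```

With δ = 0.1, this lands close to the values reported in Example 1 (y* ≈ 0.238, Cap(S) ≈ 0.551, and RGV(0.1) ≈ 0.202).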

We discuss the convergence issues arising from Procedure 1. We observe that there are two different iterative processes in Step 1, namely, (a) the power iteration method to compute the values Λ(y_t;B), Λ'(y_t;B), and Λ''(y_t;B), and (b) the Newton–Raphson method that determines the zero of F(y).

  • (a)

    We recall that Λ(y;B) is the largest eigenvalue of the reduced distance matrix BG×G(y). If we apply naive methods to compute this dominant eigenvalue, the computational complexity increases very rapidly with the matrix size. Specifically, if G has M states, then the reduced distance matrix has dimensions Θ(M^2) × Θ(M^2), and finding its characteristic polynomial takes O(M^6) time. Even then, determining the exact roots of characteristic polynomials of degree at least five is generally impossible. Therefore, we turn to numerical procedures like the ubiquitous power iteration method [22]. However, the standard power iteration method is only able to compute the dominant eigenvalue Λ(y;B). Nevertheless, we can modify the power iteration method to compute Λ'(y;B) and its higher order derivatives. In Appendix A, we demonstrate that, under certain mild assumptions, the modified power iteration method always converges. Moreover, using the sparsity of the reduced distance matrix, we have that each iteration can be completed in O(M^2) time.

  • (b)

    Next, we discuss whether we can guarantee that y_t converges to y* as t approaches infinity. Even though the Newton–Raphson method converges in all our numerical experiments, we are unable to demonstrate that it always converges for F(y). Nevertheless, we can circumvent this issue if we are interested in plotting the GV curve. Specifically, if our objective is to determine the curve {(δ, RGV(δ)) : δ ∈ [0,1]}, it turns out that we do not need to implement the Newton–Raphson iterations, and we discuss this next.
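The modified power iteration itself is deferred to Appendix A (not reproduced here). As a hedge, we note that the first derivative can also be estimated from standard eigenvalue perturbation theory: Λ'(y) = u^T B'(y) v / (u^T v), where u and v are the left and right Perron vectors of B(y). A self-contained sketch on a toy matrix (not from the paper):

```python
def perron_vector(M, iters=500):
    """Sup-norm-normalized Perron vector of a primitive nonnegative matrix."""
    n = len(M)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        m = max(w)
        v = [x / m for x in w]
    return v

def eigenvalue_derivative(M, dM):
    """Lambda' = u^T M' v / (u^T v), with u and v the left/right Perron vectors."""
    n = len(M)
    v = perron_vector(M)
    Mt = [[M[j][i] for j in range(n)] for i in range(n)]  # transpose gives u
    u = perron_vector(Mt)
    num = sum(u[i] * dM[i][j] * v[j] for i in range(n) for j in range(n))
    den = sum(u[i] * v[i] for i in range(n))
    return num / den

# Toy check (not from the paper): M(y) = [[y, 1], [1, 0]] has
# Lambda(y) = (y + sqrt(y^2 + 4)) / 2, so Lambda'(1) = (1 + 1/sqrt(5)) / 2.
M  = [[1.0, 1.0], [1.0, 0.0]]
dM = [[1.0, 0.0], [0.0, 0.0]]
print(round(eigenvalue_derivative(M, dM), 4))  # 0.7236
```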

We fix some constrained system S. Let us define its corresponding GV curve to be the set of points GV(S) ≜ {(δ, RGV(δ)) : δ ∈ [0,1]}. Here, we demonstrate that the GV curve can be plotted without any Newton–Raphson iterations.

To this end, we observe that, when F(y*) = 0, we have that δ = y*Λ'(y*;B)/Λ(y*;B). Hence, we abuse notation and define the function

δ(y) ≜ yΛ'(y;B)/Λ(y;B). (5)

We further define δmax ≜ δ(1) = Λ'(1;B)/Λ(1;B). In this section, we prove the following theorem.

Theorem 1.

Let G be the graph presentation for the constrained system S. If we define the function

ρGV(y) ≜ 2Cap(S) + δ(y) log y − log Λ(y;B), (6)

then the corresponding GV curve is given by

GV(S) = {(δ(y), ρGV(y)) : y ∈ [0,1]} ∪ {(δ, 0) : δ ≥ δmax}. (7)

Before we prove Theorem 1, we discuss its implications. It should be noted that, to compute δ(y) and ρGV(y), it suffices to determine Λ(y;B) and Λ'(y;B) using the modified power iteration methods described in Appendix A. In other words, no Newton–Raphson iterations are required. We also have additional computational savings, as we do not need to apply the power iteration method to compute the second derivative Λ''(y;B).

Example 2.

We continue our example and plot the GV curve for the (3,2)-SWCC constrained system in Figure 1a. Before plotting, we observe that, when y = 0, we have (δ(0), ρ(0)) = (0, 0.551) = (0, Cap(S)), as expected. When y = 1, we have δ(1) = δmax = 0.313. Indeed, both ρ(1) and RGV(δmax) are equal to zero, and we have that RGV(δ) = 0 for δ ≥ δmax.

Next, we compute a set of 100 points on the GV curve. If we apply Procedure 1 to compute RGV(δ) for 100 values of δ in the interval [0, δmax], we require 275 Newton–Raphson iterations and 6900 power iterations to find these points. In contrast, applying Theorem 1, we compute (δ(y), ρ(y)) for 100 values of y in the interval [0,1]. This does not require any Newton–Raphson iterations and involves only 2530 power iterations.
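The plotting strategy of Theorem 1 can be sketched in a few lines. The example below is our own stand-alone illustration on the golden-mean shift (not one of the paper's systems), using the unreduced product matrix TG×G(y), which shares its dominant eigenvalue with BG×G(y); Λ' is estimated by finite differences rather than by the modified power iteration.

```python
import math

def dominant(M, iters=600):
    """Power iteration for the dominant eigenvalue of a primitive nonnegative matrix."""
    n = len(M)
    v = [1.0] * n
    est = 1.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        est = max(w)
        v = [x / est for x in w]
    return est

def T(y):
    """Product distance matrix T_{GxG}(y) for the golden-mean shift (no "11")."""
    return [[1, y, y, 1],
            [1, 0, y, 0],
            [1, y, 0, 0],
            [1, 0, 0, 0]]

CAP = math.log2((1 + math.sqrt(5)) / 2)      # capacity of the golden-mean shift

def delta(y, h=1e-5):
    """delta(y) = y * Lambda'(y)/Lambda(y), cf. Eq. (5); Lambda' by central differences."""
    Lp = (dominant(T(y + h)) - dominant(T(y - h))) / (2 * h)
    return y * Lp / dominant(T(y))

def rho(y):
    """rho_GV(y) = 2 Cap(S) + delta(y) log y - log Lambda(y), cf. Eq. (6)."""
    return 2 * CAP + delta(y) * math.log2(y) - math.log2(dominant(T(y)))

# Trace the GV curve on a grid of y values -- no Newton-Raphson needed.
curve = [(delta(y), rho(y)) for y in [0.1 * k for k in range(1, 11)]]
```

Each curve point costs only a few eigenvalue evaluations and no Newton–Raphson iterations; at y = 1, the code recovers ρ(1) = 0, consistent with Lemma 2.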

To prove Theorem 1, we demonstrate the following lemmas. Our first lemma is immediate from the definitions of RGV, δ, and ρ in (1), (5), and (6), respectively.

Lemma 1.

RGV(δ(y)) = ρ(y) for all y ∈ [0,1].

The next lemma studies the behaviour of both δ and ρ as functions in y.

Lemma 2.

In terms of y, the functions δ(y) and ρ(y) are monotone increasing and monotone decreasing, respectively. Furthermore, we have that (δ(0), ρ(0)) = (0, Cap(S)), (δ(1), ρ(1)) = (δmax, 0), and RGV(δ) = 0 for δ ≥ δmax.

Proof. 

To simplify notation, we write Λ(y;B), Λ'(y;B), and Λ''(y;B) as Λ, Λ', and Λ'', respectively.

First, we show that δ'(y) is positive for 0 ≤ y < 1. Differentiating the expression in (5), we have that δ'(y) > 0 is equivalent to

Λ(Λ' + yΛ'') − y(Λ')^2 > 0. (8)

We recall that (3) is a convex minimization problem. Hence, the second order derivative of the objective function is always positive. In other words,

δ/y^2 + (Λ''Λ − (Λ')^2)/Λ^2 > 0.

Substituting δ with yΛ'/Λ and multiplying by yΛ^2, we obtain (8), as desired.

Next, we show that ρ is monotone decreasing. We recall that ρ(y) = RGV(δ(y)) = 2Cap(S) − T(δ(y)). Since T(δ) yields the asymptotic rate of the total ball size, we have that, as y increases, δ(y) increases, and so T(δ(y)) increases. Therefore, ρ(y) decreases, as desired.

Next, we show that ρ(1) = 0. When y = 1, we have from (6) that ρ(1) = 2Cap(S) − log Λ(1;B). Now, we recall that BG×G(y) shares the same dominant eigenvalue as the matrix TG×G(y) [12]. Furthermore, it can be verified that, when y = 1, TG×G(1) is the tensor product of AG with itself. That is, TG×G(1) = AG ⊗ AG. It then follows from standard linear algebra that Λ(1;B) = Λ(1;T) = Λ(AG)^2. Thus, log Λ(1;B) = 2Cap(S) and ρ(1) = 0. In this instance, we also have that T(δmax) = 2Cap(S).

Finally, for δ ≥ δmax, we have that T(δ) = T(δmax) = 2Cap(S) and, thus, RGV(δ) = 0, as required.    □

Theorem 1 is then immediate from Lemmas 1 and 2.

We have the following corollary that immediately follows from Lemma 2. This corollary then implies that y* yields the global minimum for the optimization problem.

Corollary 1.

When 0 ≤ δ ≤ δmax = Λ'(1;B)/Λ(1;B), the function F(y) ≜ yΛ'(y;B) − δΛ(y;B) has a unique zero in [0,1]. Furthermore, F'(y) is strictly positive for all y ∈ [0,1].

4. Evaluating Marcus and Roth’s Improvement of the Gilbert–Varshamov Bound

In [14], Marcus and Roth improved the GV lower bound for most constrained systems by considering subsets S(p) of S, where p is some parameter. Here, we focus on the case s = 1 and set p to be the normalized frequency of edges whose labels correspond to one. Specifically, we set S(p) ≜ {x ∈ S : wt(x) = p|x|}.

Next, let S_n(p) be the set of all words/paths of length n in S(p), and we define S(p) ≜ lim sup_{n→∞} (1/n) log|S_n(p)|.

Similar to before, we define T(p,δ) ≜ lim sup_{n→∞} (1/n) log T(n, δn; S(p)). Since S_n(p) is a subset of S_n, it follows from the usual GV argument that there exists a family of (n, δn; S) codes whose rates approach 2S(p) − T(p,δ) for all 0 ≤ p ≤ 1. Therefore, we have the following lower bound on asymptotic achievable code rates:

RMR(δ) = sup{ 2S(p) − T(p,δ) : 0 ≤ p ≤ 1 }. (9)

Now, a key result from [14] is that both S(p) and T(p,δ) can be obtained via two different convex optimization problems. For succinctness, we state the dual formulations of these optimization problems.

First, S(p) can be obtained from the following problem:

S(p) = inf{ −p log z + log Λ(CG(z)) : z ≥ 0 }. (10)

Here, CG(z) is the (|V| × |V|)-matrix whose rows and columns are indexed by V. For each entry indexed by e ∈ V × V, we set (CG(z))_e to be zero if e ∉ E, and z^{L(e)} if e ∈ E.

As before, we simplify notation by writing Λ(z;C) ≜ Λ(CG(z)). Again, following the convexity of (10), we are interested in finding the zero of the following function:

G1(z) ≜ zΛ'(z;C) − pΛ(z;C). (11)

Next, T(p,δ) can be obtained via the following optimization:

T(p,δ) = inf{ −2p log x − δ log y + log Λ(DG×G(x,y)) : x ≥ 0, 0 ≤ y ≤ 1 }. (12)

Here, DG×G(x,y) is a ((|V|+1 choose 2) × (|V|+1 choose 2))-reduced distance matrix indexed by V^(2). To define the entry of the matrix DG×G(x,y) indexed by ((v_i,v_j),(v_k,v_ℓ)), we look at the vertices v_i, v_j, v_k, and v_ℓ and follow the rules given in Table 2.

Table 2.

We set the ((v_i,v_j),(v_k,v_ℓ)) entry of the matrix DG×G(x,y) according to the subgraph induced by the states v_i, v_j, v_k, and v_ℓ.

DG×G(x,y) at Entry ((v_i,v_j),(v_k,v_ℓ)) | Subgraph Induced by the States {v_i,v_j,v_k,v_ℓ}
0 | (five induced subgraphs; diagrams not reproduced)
1 | (three induced subgraphs; diagrams not reproduced)
x^2 | (three induced subgraphs; diagrams not reproduced)
xy | (three induced subgraphs; diagrams not reproduced)
2xy | (one induced subgraph; diagram not reproduced)

Again, we write Λ(x,y;D) ≜ Λ(DG×G(x,y)). Furthermore, following the convexity of (12), we have that, if the optimal solution is attained at x and y, then

G2(x,y) ≜ xΛ_x(x,y;D) − 2pΛ(x,y;D) = 0, (13)
G3(x,y) ≜ yΛ_y(x,y;D) − δΛ(x,y;D) = 0. (14)

To this end, we consider the function Δ(x) ≜ Λ_y(x,1;D)/Λ(x,1;D) for x > 0 and set δmax ≜ sup{Δ(x) : x > 0}. As in the previous section, we develop a numerical procedure to solve the optimization problem (9). We then have the following critical observation.

Theorem 2.

For a given δ<δmax, consider the optimization problem

sup{ −2p log z + 2 log Λ(z;C) + 2p log x + δ log y − log Λ(x,y;D) : G1(z) = G2(x,y) = G3(x,y) = 0 }.

If (p*, x*, y*, z*) is an optimal solution, then x* = z*. Furthermore, if 0 ≤ p* ≤ 1, then x*, z* ≥ 0 and 0 ≤ y* ≤ 1.

Proof. 

Let λ_1, λ_2, and λ_3 be real-valued variables, and let G(p,x,y,z) denote the objective function of the optimization problem. We define L(p,x,y,z,λ_1,λ_2,λ_3) ≜ G(p,x,y,z) + λ_1 G1(z) + λ_2 G2(x,y) + λ_3 G3(x,y). Using the Lagrangian multiplier theorem, we have that ∂L/∂p = ∂L/∂x = ∂L/∂y = ∂L/∂z = 0 for any optimal solution. Solving these equations together with the constraints G1(z) = G2(x,y) = G3(x,y) = 0, we have that λ_1 = λ_2 = λ_3 = 0 and x = z for any optimal solution.

Now, when p* ∈ [0,1], using G1(z) = 0, we can write p(z) ≜ zΛ'(z;C)/Λ(z;C); let z(p) denote its inverse. Then, proceeding as in the proof of Lemma 2, we see that z(p) is monotone increasing with z(0) = 0. Therefore, z* = z(p*) ≥ 0.

Similarly, given p* and x*, we use G3(x*,y) = 0 to define δ(y) ≜ yΛ_y(x*,y;D)/Λ(x*,y;D). Again, we can proceed as in the proof of Lemma 2 to show that δ(y) is monotone increasing. Furthermore, since δ(y*) < δmax = δ(1), we have that y* ∈ [0,1].    □

Therefore, to determine RMR(δ) for any fixed δ, it suffices to find x, y, z, and p such that G1(z)=G2(x,y)=G3(x,y)=0 and x=z.

Now, the optimization in Theorem 2 does not constrain the values of p. Furthermore, for certain constrained systems, there are instances where p falls outside the interval [0,1]. In this case, instead of solving the optimization problem (9), we set p to be either zero or one, and we solve the corresponding optimization problems (10) and (12). Specifically, if we have p* < 0, then we set p* = 0 and x* = 0, while if p* > 1, then we set p* = 1 and x* = ∞. Hence, the resulting rates that we obtain are a lower bound for the GV-MR bound.

Procedure 2 (RMR(δ) for fixed δ < δmax).

Input: Matrices CG(z) and DG×G(x,y), and relative minimum distance δ

Output: RMR(δ) or RLB(δ), where RLB(δ) ≤ RMR(δ).

  • (1)
    Apply the Newton–Raphson method to obtain p*, x*, and y* such that G1(x*), G2(x*,y*), and G3(x*,y*) are approximately zero. Specifically, do the following:
    • Fix a tolerance value ϵ.
    • Set t = 0 and pick initial guesses p_t ≥ 0, x_t ≥ 0, and 0 ≤ y_t ≤ 1.
    • While |p_t − p_{t−1}| + |x_t − x_{t−1}| + |y_t − y_{t−1}| > ϵ,
      • Compute the next guess (p_{t+1}, x_{t+1}, y_{t+1}) via the Newton step
        (p_{t+1}, x_{t+1}, y_{t+1})^T = (p_t, x_t, y_t)^T − J^{−1} (G1(x_t), G2(x_t,y_t), G3(x_t,y_t))^T,
        where J is the Jacobian matrix of (G1, G2, G3) with respect to (p, x, y), evaluated at (p_t, x_t, y_t).
      • Here, apply the power iteration method to compute Λ(x_t;C), Λ'(x_t;C), Λ''(x_t;C), Λ(x_t,y_t;D), Λ_x(x_t,y_t;D), Λ_y(x_t,y_t;D), Λ_xx(x_t,y_t;D), Λ_yy(x_t,y_t;D), and Λ_xy(x_t,y_t;D).
      • Increment t by one.
    • Set p* ← p_t, x* ← x_t, y* ← y_t.
  • (2A)

    If 0 ≤ p* ≤ 1, set RMR(δ) ← 2 log Λ(x*;C) + δ log y* − log Λ(x*,y*;D).

  • (2B)

    Otherwise,

    • If p* < 0, set p* ← 0, x* ← 0, and y* ← the solution of G3(0,y) = 0.

    • If p* > 1, set p* ← 1, x* ← ∞, and y* ← the solution of G3(∞,y) = 0.

    Finally, set RLB(δ) ← 2 log Λ(x*;C) + δ log y* − log Λ(x*,y*;D).
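The three-variable Newton step in Procedure 2 can be sketched generically. The helper below is our own illustration: it solves a square system G(v) = 0 using a forward-difference Jacobian and Gaussian elimination with partial pivoting, and it is exercised on a toy system. In Procedure 2, the role of G would be played by (G1, G2, G3) in the variables (p, x, y), with the Λ-derivatives supplied by the power iteration method.

```python
def newton_system(G, v0, h=1e-6, eps=1e-10, max_iter=100):
    """Solve G(v) = 0 for a list-valued G by Newton's method.
    The Jacobian is approximated by forward finite differences."""
    v = list(v0)
    n = len(v)
    for _ in range(max_iter):
        g = G(v)
        # Finite-difference Jacobian J[i][j] ~ dG_i / dv_j.
        J = [[0.0] * n for _ in range(n)]
        for j in range(n):
            vp = list(v)
            vp[j] += h
            gp = G(vp)
            for i in range(n):
                J[i][j] = (gp[i] - g[i]) / h
        # Solve J * step = g by Gaussian elimination with partial pivoting.
        M = [J[i] + [g[i]] for i in range(n)]
        for c in range(n):
            p = max(range(c, n), key=lambda r: abs(M[r][c]))
            M[c], M[p] = M[p], M[c]
            for r in range(c + 1, n):
                f = M[r][c] / M[c][c]
                for k in range(c, n + 1):
                    M[r][k] -= f * M[c][k]
        step = [0.0] * n
        for i in range(n - 1, -1, -1):
            step[i] = (M[i][n] - sum(M[i][k] * step[k]
                                     for k in range(i + 1, n))) / M[i][i]
        v = [v[i] - step[i] for i in range(n)]
        if max(abs(s) for s in step) < eps:
            break
    return v

# Toy check: intersect the circle x^2 + y^2 = 4 with the line x = y;
# from this starting point, the root is (sqrt(2), sqrt(2)).
root = newton_system(lambda v: [v[0] ** 2 + v[1] ** 2 - 4, v[0] - v[1]],
                     [1.0, 0.5])
```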

Remark 1.

Let p* be the value computed at Step 1. When p* falls outside the interval [0,1], we set p* ∈ {0,1}, and we argued earlier that the value RLB(δ) returned at Step 2B is at most RMR(δ). Nevertheless, we conjecture that RLB(δ) = RMR(δ).

As before, we develop a plotting procedure that minimizes the use of Newton–Raphson iterations.

We note that there are three scenarios for Δ(x). If Δ(x) is monotone decreasing, then δmax = lim_{x→0} Δ(x) and we set x# = 0. If Δ(x) is monotone increasing, then δmax = lim_{x→∞} Δ(x) and we set x# = ∞. Otherwise, Δ(x) is maximized at some positive value and we set x# to be this value. Next, to obtain the GV-MR curve (see Remark 2), we iterate over x ∈ [1, x#]. It should be noted that, if y(x#) < 1 or, equivalently, δ(x#) < δmax, we obtain a lower bound on the GV-MR curve by iterating over y ∈ [y(x#), 1]. Similar to Theorem 1, we define

ρMR(x) ≜ 2 log Λ(x;C) + δ(x) log y(x) − log Λ(x, y(x); D), (15)

and

ρLB(y) ≜ 2 log Λ(x#;C) + δ(y) log y − log Λ(x#, y; D). (16)

Finally, we state the following analogue of Theorem 1.

Theorem 3.

We define δmax and x# as before. For x ∈ [1, x#], we set

p(x) ≜ xΛ'(x;C)/Λ(x;C),   y(x) ≜ the solution of G2(x,y) = 0,   δ(x) ≜ y(x)Λ_y(x, y(x); D)/Λ(x, y(x); D),

If y(x#) < 1, then, for y ∈ [y(x#), 1], we set

δ(y) ≜ yΛ_y(x#, y; D)/Λ(x#, y; D),

then, the corresponding GV-MR curve is given by

{(δ(x), ρMR(x)) : x ∈ [1, x#]} ∪ {(δ(y), ρLB(y)) : y ∈ [y(x#), 1]} ∪ {(δ, 0) : δ ≥ δmax}, (17)

where ρMR and ρLB are defined in (15) and (16), respectively.

Example 3.

We continue our example and evaluate the GV-MR bound for the (3,2)-SWCC constrained system. In this case, the matrices of interest are

CG(z) =
[z 1 0]
[0 0 z]
[z 0 0]   and   DG×G(x,y) =
[x^2 2xy 0   1 0  0  ]
[0   0   x^2 0 xy 0  ]
[x^2 xy  0   0 0  0  ]
[0   0   0   0 0  x^2]
[0   0   x^2 0 0  0  ]
[x^2 0   0   0 0  0  ].

Here, we observe that Δ(x) is a monotone decreasing function, and so we set x# = 0.01 (a small positive proxy for x# = 0) and δmax = lim_{x→0} Δ(x) ≈ 0.426. If we apply Procedure 2 to compute RMR(δ) for 100 points in [0, δmax], we require 437 Newton–Raphson iterations and 85,500 power iterations. In contrast, we use Theorem 3 to compute (δ(x), ρMR(x)) for 100 values of x in the interval [1, x#]. This requires 323 Newton–Raphson iterations and involves 22,296 power iterations. The resulting GV-MR curve is given in Figure 1a.

Remark 2.

Strictly speaking, the GV-MR curve described by (17) may not be equal to the curve defined by the optimization problem (9). Nevertheless, the curve provides a lower bound for the optimal asymptotic code rates, and we conjecture that the GV-MR curve described by (17) is a lower bound for the curve defined by the optimization problem (9).

5. Single-State Graph Presentation

In this section, we focus on graph presentations that have exactly one state. Here, we allow these single-state graph presentations to contain parallel edges, and we allow their labels to be binary strings of length possibly greater than one. For these constrained systems, the procedures to evaluate the GV bound and its MR improvements can be greatly simplified. This is because the matrices B_{G×G}(y), C_G(z), and D_{G×G}(x, y) are all 1 × 1. Therefore, determining their respective dominant eigenvalues is straightforward and does not require the power iteration method. The results in this section follow directly from the previous sections, and our objective is to provide explicit formulas whenever possible.

Formally, let S be the constrained system with graph presentation G = (V, E, L) such that |V| = 1 and L : E → Σ^s with s ≥ 1. (Existing methods that determine the GV bound for constrained systems with |V| ≥ 1 assume that the edge labels are single letters, i.e., s = 1. In other words, the previous methods developed in [12,14] do not apply.)

We further define α_t ≜ #{(x, y) ∈ L(E)² : dH(x, y) = t} for 0 ≤ t ≤ s. Then, the corresponding adjacency and reduced distance matrices are as follows:

A_G = |E|  and  B_{G×G}(y) = Σ_{t≥0} α_t y^t.

Then, we compute the capacity using its definition as Cap(S)=(log|E|)/s.
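For a concrete single-state presentation, these quantities can be tabulated directly from the list of edge labels. The following sketch (the helper names are our own; labels are assumed to be equal-length binary strings) computes the coefficients α_t and the capacity:

```python
import math
from itertools import product

def alpha_counts(labels):
    """alpha_t = #{(x, y) in L(E)^2 : d_H(x, y) = t} for 0 <= t <= s."""
    s = len(labels[0])
    alpha = [0] * (s + 1)
    for u, v in product(labels, repeat=2):
        # Hamming distance between the two edge labels.
        alpha[sum(a != b for a, b in zip(u, v))] += 1
    return alpha

def capacity(labels):
    """Cap(S) = (log2 |E|) / s for a single-state presentation."""
    return math.log2(len(labels)) / len(labels[0])
```

For instance, the four binary labels of length 3 and weight at least 2 give α = (4, 6, 6, 0) and Cap(S) = 2/3, matching Example 4 below.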

To compute T(δ), we consider the following extension of the optimization problem (3) for the case s ≥ 1:

T(δ) = (1/s) inf{ −δs log y + log λ(y; B) : 0 < y ≤ 1 } = (1/s) inf{ −δs log y + log Σ_{t≥0} α_t y^t : 0 < y ≤ 1 }. (18)

As before, following the convexity of the objective function in (18), we have that the optimal y is the zero (in the interval [0,1]) of the function

F(y) ≜ Σ_{t≥0} (t − δs) α_t y^t. (19)

So, for fixed values of δ, we can use the Newton–Raphson procedure to compute the root y of (19), and, hence, evaluate RGV(δ). It should be noted that the power iteration method is not required in this case.
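As an illustration, the Newton–Raphson step just described can be sketched in a few lines (a minimal sketch; the function name, the starting point y = 1/2, and the use of base-2 logarithms are our own choices):

```python
import math

def gv_T(alpha, s, delta, tol=1e-12):
    """Solve F(y) = sum_t (t - delta*s) * alpha[t] * y^t = 0 by
    Newton-Raphson and return (y, T(delta)), following (18) and (19)."""
    F = lambda y: sum((t - delta * s) * a * y**t for t, a in enumerate(alpha))
    dF = lambda y: sum(t * (t - delta * s) * a * y**(t - 1)
                       for t, a in enumerate(alpha) if t >= 1)
    y = 0.5  # starting point in (0, 1]
    for _ in range(100):
        step = F(y) / dF(y)
        y -= step
        if abs(step) < tol:
            break
    T = (-delta * s * math.log2(y)
         + math.log2(sum(a * y**t for t, a in enumerate(alpha)))) / s
    return y, T
```

No eigenvalue computation is involved: the 1 × 1 "matrix" B(y) is evaluated directly.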

On the other hand, to plot the GV curve, we have the following corollary of Theorem 1.

Corollary 2.

Let G be the single-state graph presentation for a constrained system S. Then, the corresponding GV curve is given by

GV(S) ≜ {(δ, RGV(δ)) : δ ∈ [0, 1]} = {(δ(y), ρ(y)) : y ∈ [0, 1]} ∪ {(δ, 0) : δ ≥ δmax}, (20)

where

δmax = Σ_{t≥0} t α_t / ( s |E|² ),  δ(y) = Σ_{t≥0} t α_t y^t / ( s Σ_{t≥0} α_t y^t ),  ρ(y) = (1/s) ( log( |E|² / Σ_{t≥0} α_t y^t ) + ( Σ_{t≥0} t α_t y^t / Σ_{t≥0} α_t y^t ) log y ).

We illustrate this evaluation procedure via an example from the class of subblock energy-constrained codes (SECCs). Formally, we fix a subblock length L and an energy constraint w. A binary word x of length mL is said to satisfy the (L, w)-subblock energy constraint if, when we partition x into m subblocks of length L, the number of ones in every subblock is at least w. We refer to the collection of words that meet this constraint as the (L, w)-SECC constrained system. The class of SECCs was introduced by Tandon et al. for the application of simultaneous energy and information transfer [7]. Later, in [21], a GV-type bound was introduced (see Proposition 12 in [21] and also (28)), and we make comparisons with the GV bound (20) in the following example.

Example 4.

Let L = 3 and w = 2, and consider the (3,2)-SECC constrained system. It is straightforward to observe that the graph presentation is as follows, with the single state x. Here, s = L = 3.


Then, the corresponding adjacency and reduced distance matrices are as follows:

A_G = 4  and  B_{G×G}(y) = 4 + 6y + 6y².

First, we determine the GV bound at δ = 1/3. We observe that F(y) = −4 + 6y² and, so, the optimal point y for (18) is √(2/3) (the unique root of F in the interval [0, 1]). Hence, we have that T(1/3) ≈ 1.327. On the other hand, the capacity of the (3,2)-SECC constrained system is Cap(S) = 2/3. Therefore, the GV bound is given by RGV(1/3) = 2Cap(S) − T(1/3) ≈ 0.006.

In contrast, the GV-type lower bound given by Proposition 12 in [21] is zero for δ > 0.174. Hence, the evaluation of the GV bound yields a significantly better lower bound. In fact, we can show that RGV(δ) > 0 for all δ < δmax = 3/8.

To plot the GV curve, using the fact that δmax=3/8, we have that

GV(S) = { ( (y + 2y²)/(2 + 3y + 3y²), (1/3)( log( 8/(2 + 3y + 3y²) ) + ( (3y + 6y²)/(2 + 3y + 3y²) ) log y ) ) : y ∈ [0, 1] } ∪ { (δ, 0) : δ ≥ 3/8 }.

We plot the curve in Section 6.
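Before doing so, we note that the displayed parametrization can be evaluated directly; a small sketch (the function name is our own, logarithms base 2):

```python
import math

def delta_rho(y):
    """One point (delta(y), rho(y)) of the GV curve of the (3,2)-SECC system."""
    q = 2 + 3 * y + 3 * y * y
    delta = (y + 2 * y * y) / q
    rho = (math.log2(8 / q) + ((3 * y + 6 * y * y) / q) * math.log2(y)) / 3
    return delta, rho
```

At y = 1, the curve reaches the endpoint (δmax, 0) = (3/8, 0).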

From this example, we see that our methods yield better lower bounds on the asymptotic coding rates for a specific pair (L, w). It remains open to determine how much improvement can be achieved for general pairs (L, w).

Next, we evaluate the GV-MR bound. To this end, we consider some proper subset P of E and define

α_t ≜ #{(x, y) ∈ L(E)² : dH(x, y) = t, x, y ∈ P},
β_t ≜ #{(x, y) ∈ L(E)² : dH(x, y) = t, (x ∈ P, y ∉ P) or (x ∉ P, y ∈ P)},
γ_t ≜ #{(x, y) ∈ L(E)² : dH(x, y) = t, x, y ∉ P}.

Then, we consider the following matrices:

C_G(z) = |E| − |P| + |P|z  and  D_{G×G}(x, y) = Σ_{t≥0} ( α_t x² + β_t x + γ_t ) y^t.

Setting p to be the normalized frequency of edges in P, we obtain S(p) by solving the optimization problem (10).

Specifically, we have that

S(p) = (1/s) ( H(p) + p log|P| + (1 − p) log(|E| − |P|) ), (21)

and this value is achieved when

z = p (|E| − |P|) / ( (1 − p) |P| ). (22)
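As a quick consistency check, the closed-form minimizer (22) can be compared against the objective of the optimization problem (10), which for a single state reads (−p log z + log(|E| − |P| + |P|z))/s under our reading; the helper names below are our own:

```python
import math

def S_objective(z, p, E, P, s):
    """Objective (-p*log2(z) + log2(|E| - |P| + |P|z)) / s."""
    return (-p * math.log2(z) + math.log2(E - P + P * z)) / s

def z_opt(p, E, P):
    """Closed-form minimizer, Equation (22)."""
    return p * (E - P) / ((1 - p) * P)

def S_closed(p, E, P, s):
    """Closed-form value of S(p), Equation (21), in bits."""
    H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    return (H + p * math.log2(P) + (1 - p) * math.log2(E - P)) / s
```

Evaluating the objective at z_opt reproduces the closed-form value (21), while nearby values of z give a strictly larger objective.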

To compute T(p, δ), we consider the following extension of the optimization problem (12) for the case s ≥ 1:

T(p, δ) = (1/s) inf{ −2p log x − δs log y + log λ(x, y; D) : x > 0, 0 < y ≤ 1 } = (1/s) inf{ −2p log x − δs log y + log Σ_{t≥0} ( α_t x² + β_t x + γ_t ) y^t : x > 0, 0 < y ≤ 1 }. (23)

As before, following the convexity of the objective function in (23), we have that the optimal x and y are the zeroes (with y in the interval [0, 1]) of the functions

G2(x, y) ≜ 2(1 − p)( Σ_{t≥0} α_t y^t ) x² + (1 − 2p)( Σ_{t≥0} β_t y^t ) x − 2p ( Σ_{t≥0} γ_t y^t ),
G3(x, y) ≜ Σ_{t≥0} (t − δs)( α_t x² + β_t x + γ_t ) y^t. (24)

So, for fixed values of p and δ, we can use the Newton–Raphson procedure to compute the roots x and y of (24) and, hence, evaluate RMR(p, δ). It should be noted that the power iteration method is not required in this case. We then find x# as defined in Section 4 and set

ρMR(x) ≜ (1/s) ( 2 log(|E| − |P| + |P|x) − log Σ_{t≥0} ( α_t x² + β_t x + γ_t ) y(x)^t ) + δ(x) log y(x). (25)

Furthermore, if y(x#)<1, we set

ρLB(y) ≜ (1/s) ( 2 log(|E| − |P| + |P|x#) − log Σ_{t≥0} ( α_t (x#)² + β_t x# + γ_t ) y^t ) + δ(y) log y. (26)
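The fixed-(p, δ) Newton–Raphson solve for the zeroes of (24) can be sketched as follows (a sketch with our own helper name; the Jacobian is approximated by finite differences, and the starting point is an assumption that may need adjusting per instance):

```python
import numpy as np

def solve_G24(alpha, beta, gamma, p, delta, s, v0=(1.9, 0.3)):
    """Find (x, y) with G2(x, y) = G3(x, y) = 0, as in (24)."""
    A = lambda y: sum(a * y**t for t, a in enumerate(alpha))
    B = lambda y: sum(b * y**t for t, b in enumerate(beta))
    C = lambda y: sum(c * y**t for t, c in enumerate(gamma))
    def G(v):
        x, y = v
        g2 = 2 * (1 - p) * A(y) * x**2 + (1 - 2 * p) * B(y) * x - 2 * p * C(y)
        g3 = sum((t - delta * s) * (a * x**2 + b * x + c) * y**t
                 for t, (a, b, c) in enumerate(zip(alpha, beta, gamma)))
        return np.array([g2, g3])
    v, h = np.array(v0, dtype=float), 1e-7
    for _ in range(100):
        g = G(v)
        if np.linalg.norm(g) < 1e-10:
            break
        # Finite-difference Jacobian, one column per variable.
        J = np.column_stack([(G(v + h * np.eye(2)[j]) - g) / h for j in range(2)])
        v = v - np.linalg.solve(J, g)
    return v
```

With the returned (x, y), the value T(p, δ) of (23) follows by direct substitution.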

Next, to plot the GV-MR curve, we have the following corollary of Theorem 3.

Corollary 3.

Let G be the single-state graph presentation for a constrained system S. For x ∈ [1, x#], we set

p(x) = |P|x / ( (|E| − |P|) + |P|x ),  δ(x) = Σ_{t≥1} t ( α_t x² + β_t x + γ_t ) y(x)^t / ( s Σ_{t≥0} ( α_t x² + β_t x + γ_t ) y(x)^t ),

where y(x) is the smallest root of the equation

2(|E| − |P|)( Σ_{t≥0} α_t y^t ) x + (|E| − |P| − |P|x)( Σ_{t≥0} β_t y^t ) − 2|P| ( Σ_{t≥0} γ_t y^t ) = 0.

If y(x#) < 1, then, for y ∈ [y(x#), 1], we set

δ(y) = Σ_{t≥1} t ( α_t (x#)² + β_t x# + γ_t ) y^t / ( s Σ_{t≥0} ( α_t (x#)² + β_t x# + γ_t ) y^t ),

Then, the corresponding GV-MR curve is given by

{(δ(x), ρMR(x)) : x ∈ [1, x#]} ∪ {(δ(y), ρLB(y)) : y ∈ [y(x#), 1]} ∪ {(δ, 0) : δ ≥ δmax}, (27)

where ρMR and ρLB are defined in (25) and (26), respectively.

Example 5.

We continue our example and evaluate the GV-MR bound for the (3,2)-SECC constrained system, with the single-state graph presentation given in Example 4.


Then, the matrices of interest are:

C_G(z) = 1 + 3z  and  D_{G×G}(x, y) = (3 + 6y²)x² + 6xy + 1.

Since C_G(z) and D_{G×G}(x, y) are both singleton matrices, we have Λ(z; C) = 1 + 3z and Λ(x, y; D) = (3 + 6y²)x² + 6xy + 1. Then, G1(z) = −p(1 + 3z) + 3z, G2(x, y) = 3(1 − p)(1 + 2y²)x² + 3(1 − 2p)xy − p, and G3(x, y) = 4x²y² − 3δ(1 + 2y²)x² + 2xy(1 − 3δ) − δ. Now, we apply Corollary 3 and express p, y, and δ in terms of x, where x ∈ [1, x#] with x# = ∞.

p = 3x/(1 + 3x),  y = (x − 1)/(2x),  δ = 2x(x − 1)/(9x² − 1).
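These closed forms can be checked mechanically; the following sketch (under our reading of the signs in G2) verifies that y(x) satisfies G2 = 0 and that δ(x) matches the ratio in Corollary 3:

```python
def closed_forms(x):
    """Parametrization of the (3,2)-SECC GV-MR data in terms of x."""
    p = 3 * x / (1 + 3 * x)
    y = (x - 1) / (2 * x)
    delta = 2 * x * (x - 1) / (9 * x**2 - 1)
    return p, y, delta

def G2(x, y, p):
    """Stationarity condition of the example: 3(1-p)(1+2y^2)x^2 + 3(1-2p)xy - p."""
    return 3 * (1 - p) * (1 + 2 * y**2) * x**2 + 3 * (1 - 2 * p) * x * y - p
```

For instance, x = 2 gives p = 6/7, y = 1/4, and δ = 4/35, and this triple indeed annihilates G2.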

Now, we observe that y(x#) = 1/2. Since we can still increase y to 1, once we reach the boundary p = 1, we apply the bound with p = 1 and x = z = x#. Hence, at the boundary, we solve the following problem:

S(1) = (1/3) log 3,
T(1, δ) = (1/3) inf{ −2 log x − 3δ log y + log( 3(1 + 2y²)x² + 6xy + 1 ) : 1/2 ≤ y ≤ 1, x = x# } = (1/3) inf{ −3δ log y + log 3 + log(1 + 2y²) : 1/2 ≤ y ≤ 1 },
RMR(δ) = 2S(1) − T(1, δ).

By setting F(y) = −3δ(1 + 2y²) + 4y² = 0, we get δ = 4y²/(3(1 + 2y²)), where y ∈ [1/2, 1], and we plot the respective curve.

6. Numerical Plots

In this section, we apply our numerical procedures to compute the GV and GV-MR bounds for some specific constrained systems. In particular, we consider the (L, w)-SWCC constrained systems defined in Section 3, the ubiquitous (d, k)-runlength limited systems (see, for example, p. 3 in [11]), and the (L, w)-subblock energy-constrained codes recently introduced in [7]. In addition to the GV and GV-MR curves, we also plot a simple lower bound. For each δ ∈ [0, 1/2], the size of any ball of radius δn is at most 2^{H(δ)n}. So, for any constrained system S, we have that T̃(δ) ≤ Cap(S) + H(δ). Therefore, we have that

R(δ; S) ≥ Cap(S) − H(δ). (28)

From the plots in Figure 1, Figure 2 and Figure 3, it is also clear that the computations of (7) and (17) yield a significantly better lower bound.

6.1. (L,w)-Sliding Window Constrained Codes

We fix L and w. We recall from Section 3 that a binary word satisfies the (L, w)-sliding window weight constraint if the number of ones in every window of L consecutive bits is at least w, and the (L, w)-SWCC constrained system refers to the collection of words that meet this constraint. From [8,9], we have a simple graph presentation that uses only C(L, w) states. To validate our methods, we choose (L, w) ∈ {(3, 2), (10, 7)}; the corresponding graph presentations have 3 and 120 states, respectively. Applying the plotting procedures described in Theorems 1 and 3, we obtain Figure 1.

6.2. (d,k)-Runlength Limited Codes

Next, we revisit the ubiquitous runlength constraint. We fix d and k. We say that a binary word satisfies the (d, k)-RLL constraint if each run of zeroes in the word has a length of at least d and at most k. Here, we allow the first and last runs of zeroes to have a length of less than d. We refer to the collection of words that meet this constraint as the (d, k)-RLL constrained system. It is well known that a (d, k)-RLL constrained system has a graph presentation with k + 1 states (see, for example, [11]). Here, we choose (d, k) ∈ {(1, 3), (3, 7)} to validate our methods and apply Theorems 1 and 3 to obtain Figure 2. For (d, k) = (3, 7), we corroborate our results with those derived in [15]. Specifically, Winick and Yang determined the GV bound (1) for the (3,7)-RLL constraint and remarked that the “evaluation of the (GV-MR) bound required considerable computation” for “a small improvement”. In Table 3, we verify this statement.

Table 3.

Comparison of the GV-MR bound with the lower bound of [15] for the (3,7)-RLL constrained system.

δ GV-MR Bound (15) GV Bound [15] (see Equation (1))
0 0.406 0.406
0.05 0.255 0.225
0.1 0.163 0.163
0.15 0.095 0.094
0.2 0.048 0.044
0.25 0.018 0.012

6.3. (L,w)-Subblock Energy-Constrained Codes

We fix L and w. We recall from Section 5 that a binary word satisfies the (L, w)-subblock energy constraint if each subblock of length L has a weight of at least w, and the (L, w)-SECC constrained system refers to the collection of words that meet this constraint. Then, the corresponding graph presentation has a single state x with Σ_{i=w}^{L} C(L, i) edges, where each edge is labeled by a word of length L and weight at least w. We apply the methods in Section 5 to determine the GV and GV-MR bounds.

For the GV bound, we provide the explicit formula for α_t (writing C(n, k) for the binomial coefficient) and proceed as in Example 4.

α_t = C(L, t) ( |E| − Σ_{j=1}^{t} Σ_{k=0}^{⌈j/2⌉−1} C(L−t, w−j+k) C(t, k) ). (29)

Similarly, for the GV-MR bound, we provide the explicit formulas for α_t, β_t, and γ_t and proceed as in Example 5.

α_t = C(L, w) C(L−w, t/2) C(w, t/2) if t is even; otherwise, α_t = 0. (30)
β_t = 2 C(L, w) Σ_{b=0}^{⌈t/2⌉−1} C(w, b) C(L−w, t−b). (31)
γ_t = C(L, t) ( |E| − Σ_{j=1}^{t} Σ_{k=0}^{⌈j/2⌉−1} C(L−t, w−j+k) C(t, k) ) − α_t − β_t. (32)
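These counts can be sanity-checked against the (3,2)-SECC example, whose reduced distance matrix has coefficients α = (3, 0, 6), β = (0, 6, 0), and γ = (1, 0, 0). A sketch (using our reading of the binomial sums; C(n, k) is taken to be zero outside 0 ≤ k ≤ n):

```python
from math import comb

def C(n, k):
    """Binomial coefficient, zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

L, w = 3, 2
E = sum(C(L, i) for i in range(w, L + 1))  # |E| = 4 edges

def total_pairs(t):
    """All ordered label pairs at distance t, following Equation (29)."""
    inner = sum(C(L - t, w - j + k) * C(t, k)
                for j in range(1, t + 1) for k in range((j + 1) // 2))
    return C(L, t) * (E - inner)

def alpha(t):
    """Equation (30): both labels of weight exactly w."""
    return C(L, w) * C(L - w, t // 2) * C(w, t // 2) if t % 2 == 0 else 0

def beta(t):
    """Equation (31): exactly one label of weight w."""
    return 2 * C(L, w) * sum(C(w, b) * C(L - w, t - b) for b in range((t + 1) // 2))

def gamma(t):
    """Equation (32): total minus the two preceding counts."""
    return total_pairs(t) - alpha(t) - beta(t)
```

The three families partition all ordered label pairs, so they sum to the GV count (29) for every t.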

In Figure 3, we plot the GV bound and GV-MR bounds. We remark that the simple lower bound (28) corresponds to Proposition 12 in [21].

Acknowledgments

The authors would like to thank the assistant editor for her skillful handling and the anonymous reviewers for their valuable suggestions.

Appendix A. Power Iteration Method for Derivatives of Dominant Eigenvalues

Throughout this appendix, we assume that A is a diagonalizable matrix with dominant eigenvalue λ1 and whose corresponding eigenspace has dimension one. Let e1 be the unit eigenvector whose entries are positive in this space. Then, the power iteration method is a well-known numerical procedure that finds the dominant eigenvalue λ1 and the corresponding eigenvector e1 efficiently.

Now, in the preceding sections, the entries of the matrix A are functions in either one or two variables and, thus, the dominant eigenvalue λ1 is a function in the same variables. Moreover, the numerical procedures in those sections require us to compute higher-order (partial) derivatives of this dominant eigenvalue function λ1. To the best of our knowledge, there are no existing algorithms or numerical procedures that estimate the values of these derivatives. Hence, in this appendix, we modify the power iteration method to compute these estimates.

Formally, let A be an irreducible nonnegative diagonalizable square matrix with dominant eigenvalue λ1 and corresponding unit eigenvector e1. Since A is diagonalizable, A has n eigenvectors e1, e2, …, en that form an orthonormal basis for R^n. Let λ1, λ2, …, λn be the corresponding eigenvalues and, so, we have that

A e_i = λ_i e_i for all i = 1, 2, …, n. (A1)

Since A is irreducible, the dominant eigenspace has dimension one and, also, the dominant eigenvalue is real and positive. Therefore, we can assume that λ1 > |λ2| ≥ ⋯ ≥ |λn|.

We first assume that the entries of A are functions in the variable z. Hence, the λ_i and the entries of the e_i are functions in z too. Power Iteration I then evaluates both λ1 and λ1′ for some fixed value of z, while Power Iteration II additionally evaluates the second-order derivative λ1″.

The case where the entries of A are functions in two variables x and y is discussed at the end of the appendix. Here, Power Iteration III evaluates higher-order partial derivatives of λ1 for certain fixed values of x and y. For ease of exposition, we provide detailed proofs of the correctness of Power Iteration I; the proofs can be extended to Power Iteration II and Power Iteration III.

We continue our discussion for the case where the entries of A are univariate functions in z. We differentiate each entry of A with respect to z to obtain the matrix A′. Furthermore, for all 1 ≤ i ≤ n, we differentiate the entries of the eigenvector e_i and the eigenvalue λ_i to obtain e_i′ and λ_i′, respectively. Specifically, it follows from (A1) that

A′ e_i + A e_i′ = λ_i′ e_i + λ_i e_i′ for all i = 1, 2, …, n. (A2)

Then, the following procedure computes both λ1 and λ1′.

Power Iteration I.

Input: Irreducible nonnegative diagonalizable matrix A

Output: Estimates of λ1 and λ1′

  • (1)
    Initialize q(0) such that all its entries are strictly positive.
    • Fix a tolerance value ϵ.
    • While ‖q(k) − q(k−1)‖ > ϵ,
      • Set
        λ(k) = ‖A q(k−1)‖,  q(k) = A q(k−1)/λ(k),
        μ(k) = ‖A′ q(k−1) + A r(k−1) − λ(k) r(k−1)‖,  r(k) = ( A r(k−1) + A′ q(k−1) − μ(k) q(k−1) )/λ(k).
      • Increment k by one.
  • (2)
    Set λ1 ← λ(k) and λ1′ ← μ(k).
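A minimal Python sketch of Power Iteration I follows. The matrix A(z) = [[z², z], [z, 1]] is our own test case: it is symmetric and nonnegative with dominant eigenvalue 1 + z², so at z = 2 we expect λ1 = 5 and λ1′ = 4.

```python
import numpy as np

def power_iteration_I(A, dA, q0, tol=1e-10, max_iter=500):
    """Estimate the dominant eigenvalue lam1 of A and its derivative
    lam1', where dA is the entrywise derivative A'. q0 must be positive."""
    q = q0 / np.linalg.norm(q0)
    r = np.zeros_like(q)  # running estimate related to e1'
    lam = mu = 0.0
    for _ in range(max_iter):
        qp, rp = q, r
        lam = np.linalg.norm(A @ qp)
        q = (A @ qp) / lam
        mu = np.linalg.norm(dA @ qp + A @ rp - lam * rp)
        r = (A @ rp + dA @ qp - mu * qp) / lam
        if max(np.linalg.norm(q - qp), np.linalg.norm(r - rp)) < tol:
            break
    return lam, mu

A = np.array([[4.0, 2.0], [2.0, 1.0]])   # A(z) at z = 2
dA = np.array([[4.0, 1.0], [1.0, 0.0]])  # A'(z) at z = 2
lam, mu = power_iteration_I(A, dA, np.array([1.0, 1.0]))
```

Note that only matrix-vector products are used, so the cost per iteration stays linear in the number of nonzero entries of A.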

Theorem A1.

If A is an irreducible nonnegative diagonalizable matrix and q(0) has positive components with unit norm, then, as k → ∞, we have

λ(k) → λ1,  q(k) → e1,  μ(k) → λ1′.

Here, q(k) → e1 means that ‖q(k) − e1‖ → 0 as k → ∞.

Before we present the proof of Theorem A1, we remark that the usual power iteration method computes only λ(k) and q(k). Then, it is well-known (see, for example, [22]) that λ(k) and q(k) tend to λ1 and e1, respectively.

Now, since the e_i span R^n, we can write q(0) = Σ_{i=1}^n α_i e_i for any initial vector q(0). The next technical lemma provides closed-form expressions for λ(k), q(k), μ(k), and r(k) in terms of the λ_i, e_i, and α_i.

Lemma A1.

Let q(0) = Σ_{i=1}^n α_i e_i. Then,

q(k) = Σ_{i=1}^n α_i λ_i^k e_i / ‖ Σ_{i=1}^n α_i λ_i^k e_i ‖, (A3)
λ(k) = ‖ Σ_{i=1}^n α_i λ_i^k e_i ‖ / ‖ Σ_{i=1}^n α_i λ_i^{k−1} e_i ‖, (A4)
r(k) = Σ_{i=1}^n [ (α_i′ e_i + α_i e_i′) λ_i^k + ( kλ_i′ − Σ_{j=1}^k μ(j) ) α_i λ_i^{k−1} e_i ] / ‖ Σ_{i=1}^n α_i λ_i^k e_i ‖, (A5)
μ(k) = ‖ Σ_{i=1}^n [ (α_i′ e_i + α_i e_i′) λ_i^{k−1} (λ_i − λ(k)) + α_i λ_i^{k−1} λ_i′ e_i + ( (k−1)λ_i′ − Σ_{j=1}^{k−1} μ(j) ) α_i λ_i^{k−2} (λ_i − λ(k)) e_i ] ‖ / ‖ Σ_{i=1}^n α_i λ_i^{k−1} e_i ‖. (A6)

Proof. 

Since q(k) is defined recursively as q(k) = A q(k−1)/λ(k) = A q(k−1)/‖A q(k−1)‖, we have that

q(k) = A^k q(0) / ‖ A^k q(0) ‖.

Then, it follows from Equation (A1) that

A^k q(0) = A^k Σ_{i=1}^n α_i e_i = Σ_{i=1}^n α_i (A^k e_i) = Σ_{i=1}^n α_i λ_i^k e_i, (A7)

and, so, we obtain (A3). Similarly, from (A1), we have that

λ(k) = ‖A q(k−1)‖ = ‖A^k q(0)‖ / ‖A^{k−1} q(0)‖ = ‖ Σ_{i=1}^n α_i λ_i^k e_i ‖ / ‖ Σ_{i=1}^n α_i λ_i^{k−1} e_i ‖,

as required for (A4).

Next, we note that r(0) = Σ_{i=1}^n α_i′ e_i + Σ_{i=1}^n α_i e_i′. Then, using the recursive definition of r(k), we have

r(k) = ( A^k r(0) + Σ_{j=0}^{k−1} A^j A′ A^{k−j−1} q(0) − ( Σ_{j=1}^k μ(j) ) A^{k−1} q(0) ) / ‖ A^k q(0) ‖. (A8)

Then, from (A1), we have

A^k r(0) = A^k ( Σ_{i=1}^n α_i′ e_i + Σ_{i=1}^n α_i e_i′ ) = Σ_{i=1}^n α_i′ λ_i^k e_i + Σ_{i=1}^n α_i (A^k e_i′), (A9)

and, from (A2),

A′ Σ_{i=1}^n α_i λ_i^{k−j−1} e_i = Σ_{i=1}^n α_i λ_i^{k−j−1} (A′ e_i) = Σ_{i=1}^n α_i λ_i^{k−j−1} ( λ_i′ e_i + λ_i e_i′ − A e_i′ ).

Therefore, using (A1) again,

Σ_{j=0}^{k−1} A^j A′ Σ_{i=1}^n α_i λ_i^{k−j−1} e_i = Σ_{j=0}^{k−1} A^j Σ_{i=1}^n α_i λ_i^{k−j−1} ( λ_i′ e_i + λ_i e_i′ − A e_i′ ) = k Σ_{i=1}^n α_i λ_i^{k−1} λ_i′ e_i + Σ_{i=1}^n α_i λ_i^k e_i′ − Σ_{i=1}^n α_i (A^k e_i′).

Therefore, we obtain (A5).

Finally, we recall that μ(k) is defined as

μ(k) = ‖ A′ q(k−1) + A r(k−1) − λ(k) r(k−1) ‖.

Then, by replacing r(k1) and q(k1) from (A5) and (A3), respectively, and then using Equation (A2), we obtain (A6). □

Finally, we are ready to demonstrate the correctness of Power Iteration I.

Proof of Theorem A1.

Since A is an irreducible nonnegative diagonalizable matrix, λ1 is real and positive, and there exists 0 < ϵ < 1 such that |λ_i|/λ1 < ϵ for all i = 2, 3, …, n (see, for example, [11]). For the purposes of brevity, we write

Φ_k = Σ_{i=1}^n α_i λ_i^k e_i (A10)

and, so, we can rewrite (A3) as

q(k) = Φ_k/‖Φ_k‖ = ( λ1^k/‖Φ_k‖ ) ( α1 e1 + Σ_{i=2}^n α_i (λ_i^k/λ1^k) e_i ).

Now, since |λ_i|^k/λ1^k ≤ ϵ^k for all i = 2, …, n, we have that

‖ Φ_k/λ1^k − α1 e1 ‖ ≤ C1 ϵ^k for some constant C1. (A11)

Then, using the triangle inequality, we have that, as k → ∞, ‖Φ_k‖/λ1^k → α1 and, thus, λ1^k/‖Φ_k‖ → 1/α1. Therefore, ‖q(k) − e1‖ → 0, as required.

It should be noted that since λ1^k/‖Φ_k‖ tends to a finite limit, we have that λ1^k/‖Φ_k‖ is bounded above by some constant. In other words, we have that

λ1^k/‖Φ_k‖ ≤ C2 for some constant C2. (A12)

Next, we show the following inequality:

|λ(k) − λ1| ≤ C3 ϵ^{k−1} for some constant C3. (A13)

Using (A4), we have that

|λ(k) − λ1| = | ‖Φ_k‖ − λ1 ‖Φ_{k−1}‖ | / ‖Φ_{k−1}‖ ≤ ‖ Φ_k − λ1 Φ_{k−1} ‖ / ‖Φ_{k−1}‖ = ( λ1^{k−1}/‖Φ_{k−1}‖ ) · λ1 · ‖ Σ_{i=2}^n α_i ( λ_i^k/λ1^k − λ_i^{k−1}/λ1^{k−1} ) e_i ‖.

Now, observe that |λ_i^k/λ1^k − λ_i^{k−1}/λ1^{k−1}| ≤ 2ϵ^{k−1} for i = 2, …, n. Since λ1^{k−1}/‖Φ_{k−1}‖ ≤ C2, we obtain (A13) after applying the triangle inequality.

Again, to reduce clutter, we introduce the following abbreviations:

D_k = Σ_{i=1}^n (α_i′ e_i + α_i e_i′) λ_i^{k−1} (λ_i − λ(k)),  E_k = Σ_{i=1}^n α_i λ_i^{k−1} λ_i′ e_i,  F_k = Σ_{i=1}^n ( (k−1)λ_i′ − Σ_{j=1}^{k−1} μ(j) ) α_i λ_i^{k−2} (λ_i − λ(k)) e_i.

Thus, we can rewrite (A6) as

μ(k) = ‖ D_k + E_k + F_k ‖ / ‖Φ_{k−1}‖ ≤ λ1′ + ‖D_k‖/‖Φ_{k−1}‖ + ‖ E_k − λ1′ Φ_{k−1} ‖/‖Φ_{k−1}‖ + ‖F_k‖/‖Φ_{k−1}‖.

Next, we bound each of the summands on the right-hand side. Specifically, we show the following inequalities:

‖D_k‖/‖Φ_{k−1}‖ + ‖ E_k − λ1′ Φ_{k−1} ‖/‖Φ_{k−1}‖ ≤ C4 ϵ^{k−1} for some constant C4, (A14)
‖F_k‖/‖Φ_{k−1}‖ ≤ C5 (k−1) ϵ^{k−1} + C5 ( Σ_{j=1}^{k−1} μ(j) ) ϵ^{k−1} for some constant C5. (A15)

To demonstrate (A14), we consider

‖D_k‖/λ1^{k−1} = ‖ Σ_{i=1}^n (α_i′ e_i + α_i e_i′) (λ_i^{k−1}/λ1^{k−1}) (λ_i − λ(k)) ‖ ≤ ‖α1′ e1 + α1 e1′‖ |λ1 − λ(k)| + ϵ^{k−1} Σ_{i=2}^n ‖α_i′ e_i + α_i e_i′‖ |λ_i − λ(k)|.

We use (A13) to bound the first summand by some constant multiple of ϵ^{k−1}. On the other hand, we have |λ_i − λ(k)| ≤ |λ_i − λ1| + |λ1 − λ(k)| ≤ max{ |λ_i − λ1| : 2 ≤ i ≤ n } + C3 ϵ^{k−1} for 2 ≤ i ≤ n. In other words, the second summand is also bounded by some constant multiple of ϵ^{k−1}. Next, we consider

‖ E_k − λ1′ Φ_{k−1} ‖ / λ1^{k−1} = ‖ Σ_{i=1}^n α_i (λ_i^{k−1}/λ1^{k−1}) (λ_i′ − λ1′) e_i ‖ ≤ ϵ^{k−1} Σ_{i=2}^n | α_i (λ_i′ − λ1′) |,

and, so, ‖E_k − λ1′ Φ_{k−1}‖/λ1^{k−1} is also bounded by a multiple of ϵ^{k−1}. Therefore, since λ1^{k−1}/‖Φ_{k−1}‖ ≤ C2, we have (A14). Using similar methods, we can establish (A15).

Next, we apply (A14) and then recursively apply (A15) until the right-hand side is free of the μ(i)'s. Then, it follows that

μ(k) ≤ λ1′ + C4 ϵ^{k−1} + C5 (k−1) ϵ^{k−1} + C5 ϵ^{k−1} ( Π_{j=2}^{k−1} (1 + C5 ϵ^{k−j}) + Σ_{i=1}^{k−1} ( λ1′ + C4 ϵ^{k−i−1} + C5 (k−i−1) ϵ^{k−i−1} ) Π_{j=2}^{i} (1 + C5 ϵ^{k−j}) ). (A16)

Furthermore, since Π_{j=2}^{i} (1 + C5 ϵ^{k−j}) ≤ Π_{j=2}^{k−1} (1 + C5 ϵ^{k−j}) for all i ≤ k − 1, we can rewrite (A16) as

μ(k) ≤ λ1′ + C4 ϵ^{k−1} + C5 (k−1) ϵ^{k−1} + C5 ϵ^{k−1} Π_{j=2}^{k−1} (1 + C5 ϵ^{k−j}) ( 1 + Σ_{i=1}^{k−1} ( λ1′ + C4 ϵ^{k−i−1} + C5 (k−i−1) ϵ^{k−i−1} ) ). (A17)

Next, it follows from standard calculus that Π_{j=2}^{k−1} (1 + C5 ϵ^{k−j}) < e^{C5/(1−ϵ)}. Furthermore, since ϵ < 1, we have Σ_{j=0}^{k−2} ϵ^j < 1/(1 − ϵ) and Σ_{j=0}^{k−2} j ϵ^j < 1/(1 − ϵ)². Putting everything together, we have

μ(k) ≤ λ1′ + C4 ϵ^{k−1} + C5 (k−1) ϵ^{k−1} + C5 ϵ^{k−1} e^{C5/(1−ϵ)} ( 1 + (k−1) λ1′ + C4/(1−ϵ) + C5/(1−ϵ)² ). (A18)

As k → ∞, since ϵ < 1, we have ϵ^k → 0 and kϵ^k → 0. Therefore, lim_{k→∞} μ(k) ≤ λ1′. Using similar methods, we have that lim_{k→∞} μ(k) ≥ λ1′ and, so, lim_{k→∞} μ(k) = λ1′, as required. □

Next, we modify Power Iteration I so as to compute the higher order derivatives. We omit a detailed proof as it is similar to the proof of Theorem A1.

Power Iteration II.

Input: Irreducible nonnegative diagonalizable matrix A

Output: Estimates of λ1, λ1′, and λ1″

  • (1)
    Initialize q(0) such that all its entries are strictly positive.
    • Fix a tolerance value ϵ.
    • While ‖q(k) − q(k−1)‖ > ϵ,
      • Set
        λ(k) = ‖A q(k−1)‖,  q(k) = A q(k−1)/λ(k),
        μ(k) = ‖A′ q(k−1) + A r(k−1) − λ(k) r(k−1)‖,  r(k) = ( A r(k−1) + A′ q(k−1) − μ(k) q(k−1) )/λ(k),
        ν(k) = ‖A″ q(k−1) + 2A′ r(k−1) + A s(k−1) − λ(k) s(k−1) − 2μ(k) r(k−1)‖,
        s(k) = ( A″ q(k−1) + 2A′ r(k−1) + A s(k−1) − 2μ(k) r(k−1) − ν(k) q(k−1) )/λ(k).
      • Increment k by one.
  • (2)
    Set λ1 ← λ(k), λ1′ ← μ(k), and λ1″ ← ν(k).
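Extending the Power Iteration I sketch with the ν and s updates gives an estimate of the second derivative as well. We again use our own test matrix A(z) = [[z², z], [z, 1]], whose dominant eigenvalue is 1 + z²; at z = 2 we therefore expect λ1 = 5, λ1′ = 4, and λ1″ = 2.

```python
import numpy as np

def power_iteration_II(A, dA, ddA, q0, tol=1e-10, max_iter=500):
    """Estimate lam1, lam1', lam1'' given A, A' = dA, and A'' = ddA."""
    q = q0 / np.linalg.norm(q0)
    r = np.zeros_like(q)  # estimate related to e1'
    s = np.zeros_like(q)  # estimate related to e1''
    lam = mu = nu = 0.0
    for _ in range(max_iter):
        qp, rp, sp = q, r, s
        lam = np.linalg.norm(A @ qp)
        q = (A @ qp) / lam
        mu = np.linalg.norm(dA @ qp + A @ rp - lam * rp)
        r = (A @ rp + dA @ qp - mu * qp) / lam
        nu = np.linalg.norm(ddA @ qp + 2 * dA @ rp + A @ sp
                            - lam * sp - 2 * mu * rp)
        s = (ddA @ qp + 2 * dA @ rp + A @ sp - 2 * mu * rp - nu * qp) / lam
        if max(np.linalg.norm(q - qp), np.linalg.norm(r - rp),
               np.linalg.norm(s - sp)) < tol:
            break
    return lam, mu, nu

A = np.array([[4.0, 2.0], [2.0, 1.0]])    # A(z) at z = 2
dA = np.array([[4.0, 1.0], [1.0, 0.0]])   # A'(z) at z = 2
ddA = np.array([[2.0, 0.0], [0.0, 0.0]])  # A''(z) at z = 2
lam, mu, nu = power_iteration_II(A, dA, ddA, np.array([1.0, 1.0]))
```

The ν update mirrors the second derivative of (A1): differentiating A e = λ e twice gives A″e + 2A′e′ + Ae″ = λ″e + 2λ′e′ + λe″.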

Theorem A2.

If A is an irreducible nonnegative diagonalizable matrix and q(0) has positive components with unit norm, then, as k → ∞, we have

λ(k) → λ1,  q(k) → e1,  μ(k) → λ1′,  ν(k) → λ1″.

Finally, we end this appendix with a power iteration method that computes the partial derivatives when the elements of the given matrix are bivariate functions.

Power Iteration III.

Input: Irreducible nonnegative diagonalizable matrix A

Output: Estimates of λ1, (λ1)_x, (λ1)_y, (λ1)_xx, (λ1)_yy, and (λ1)_xy

  • (1)
    Initialize q(0) such that all its entries are strictly positive.
    • Fix a tolerance value ϵ.
    • While ‖q(k) − q(k−1)‖ > ϵ,
      • Set
        λ(k) = ‖A q(k−1)‖,  q(k) = A q(k−1)/λ(k),
        λ_x(k) = ‖A_x q(k−1) + A q_x(k−1) − λ(k) q_x(k−1)‖,  q_x(k) = ( A_x q(k−1) + A q_x(k−1) − λ_x(k) q(k−1) )/λ(k),
        λ_y(k) = ‖A_y q(k−1) + A q_y(k−1) − λ(k) q_y(k−1)‖,  q_y(k) = ( A_y q(k−1) + A q_y(k−1) − λ_y(k) q(k−1) )/λ(k),
        λ_xx(k) = ‖A_xx q(k−1) + 2A_x q_x(k−1) + A q_xx(k−1) − λ(k) q_xx(k−1) − 2λ_x(k) q_x(k−1)‖,
        q_xx(k) = ( A_xx q(k−1) + 2A_x q_x(k−1) + A q_xx(k−1) − 2λ_x(k) q_x(k−1) − λ_xx(k) q(k−1) )/λ(k),
        λ_yy(k) = ‖A_yy q(k−1) + 2A_y q_y(k−1) + A q_yy(k−1) − λ(k) q_yy(k−1) − 2λ_y(k) q_y(k−1)‖,
        q_yy(k) = ( A_yy q(k−1) + 2A_y q_y(k−1) + A q_yy(k−1) − 2λ_y(k) q_y(k−1) − λ_yy(k) q(k−1) )/λ(k),
        λ_xy(k) = ‖A_xy q(k−1) + A_x q_y(k−1) + A_y q_x(k−1) + A q_xy(k−1) − λ(k) q_xy(k−1) − λ_x(k) q_y(k−1) − λ_y(k) q_x(k−1)‖,
        q_xy(k) = ( A_xy q(k−1) + A_x q_y(k−1) + A_y q_x(k−1) + A q_xy(k−1) − λ_x(k) q_y(k−1) − λ_y(k) q_x(k−1) − λ_xy(k) q(k−1) )/λ(k).
      • Increment k by one.
  • (2)
    Set λ1 ← λ(k), (λ1)_x ← λ_x(k), (λ1)_y ← λ_y(k), (λ1)_xx ← λ_xx(k), (λ1)_yy ← λ_yy(k), and (λ1)_xy ← λ_xy(k).

Theorem A3.

If A is an irreducible nonnegative diagonalizable matrix and q(0) has positive components with unit norm, then, as k → ∞, we have λ_xx(k) → (λ1)_xx, λ_yy(k) → (λ1)_yy, and λ_xy(k) → (λ1)_xy.

Author Contributions

Conceptualization, K.G. and H.M.K.; software, K.G.; writing—original draft preparation, K.G.; writing—review and editing, K.G. and H.M.K. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Funding Statement

The work of Han Mao Kiah was supported by the Ministry of Education, Singapore, under its MOE AcRF Tier 2 Award under Grant MOE-T2EP20121-0007 and MOE AcRF Tier 1 Award under Grant RG19/23.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1.Yazdi S.M.H.T., Kiah H.M., Garcia-Ruiz E., Ma J., Zhao H., Milenkovic O. DNA-Based Storage: Trends and Methods. IEEE Trans. Mol. Biol. Multi-Scale Commun. 2015;1:230–248. doi: 10.1109/TMBMC.2016.2537305. [DOI] [Google Scholar]
  • 2.Immink K.A.S., Cai K. Efficient balanced and maximum homopolymer-run restricted block codes for DNA-based data storage. IEEE Commun. Lett. 2019;23:1676–1679. doi: 10.1109/LCOMM.2019.2930970. [DOI] [Google Scholar]
  • 3.Nguyen T.T., Cai K., Immink K.A.S., Kiah H.M. Capacity-Approaching Constrained Codes with Error Correction for DNA-Based Data Storage. IEEE Trans. Inf. Theory. 2021;67:5602–5613. doi: 10.1109/TIT.2021.3066430. [DOI] [Google Scholar]
  • 4.Kovačević M., Vukobratović D. Asymptotic Behavior and Typicality Properties of Runlength-Limited Sequences. IEEE Trans. Inf. Theory. 2022;68:1638–1650. doi: 10.1109/TIT.2021.3134871. [DOI] [Google Scholar]
  • 5.Popovski P., Fouladgar A.M., Simeone O. Interactive joint transfer of energy and information. IEEE Trans. Commun. 2013;61:2086–2097. doi: 10.1109/TCOMM.2013.031213.120723. [DOI] [Google Scholar]
  • 6.Fouladgar A.M., Simeone O., Erkip E. Constrained codes for joint energy and information transfer. IEEE Trans. Commun. 2014;62:2121–2131. doi: 10.1109/TCOMM.2014.2317480. [DOI] [Google Scholar]
  • 7.Tandon A., Motani M., Varshney L.R. Subblock-constrained codes for real-time simultaneously energy and information transfer. IEEE Trans. Inf. Theory. 2016;62:4212–4227. doi: 10.1109/TIT.2016.2559504. [DOI] [Google Scholar]
  • 8.Immink K.A.S., Cai K. Block Codes for Energy-Harvesting Sliding- Window Constrained Channels. IEEE Commun. Lett. 2020;24:2383–2386. doi: 10.1109/LCOMM.2020.3012301. [DOI] [Google Scholar]
  • 9.Immink K.A.S., Cai K. Properties and Constructions of Energy-Harvesting Sliding-Window Constrained Codes. IEEE Commun. Lett. 2020;24:1890–1893. doi: 10.1109/LCOMM.2020.2993467. [DOI] [Google Scholar]
  • 10.Wu T.Y., Tandon A., Varshney L.R., Motani M. Skip-sliding window codes. IEEE Trans. Commun. 2021;69:2824–2836. doi: 10.1109/TCOMM.2021.3058965. [DOI] [Google Scholar]
  • 11.Marcus B.H., Roth R.M., Siegel P.H. An Introduction to Coding for Constrained Systems. 2001. [(accessed on 1 October 2020)]. Lecture Notes. Available online: https://ronny.cswp.cs.technion.ac.il/wp-content/uploads/sites/54/2016/05/chapters1-9.pdf.
  • 12.Kolesnik V.D., Krachkovsky V.Y. Generating functions and lower bounds on rates for limiting error-correcting codes. IEEE Trans. Inf. Theory. 1991;37:778–788. doi: 10.1109/18.79947. [DOI] [Google Scholar]
  • 13.Gu J., Fuja T. A generalized Gilbert-Varshamov bound derived via analysis of a code-search algorithm. IEEE Trans. Inf. Theory. 1993;39:1089–1093. doi: 10.1109/18.256522. [DOI] [Google Scholar]
  • 14.Marcus B.H., Roth R.M. Improved Gilbert-Varshamov bound for constrained systems. IEEE Trans. Inf. Theory. 1992;38:1213–1221. doi: 10.1109/18.144702. [DOI] [Google Scholar]
  • 15.Winick K.A., Yang S.H. Upper bounds on the size of error-correcting runlength-limited codes. Eur. Trans. Telecommun. 1996;37:273–283. doi: 10.1002/ett.4460070309. [DOI] [Google Scholar]
  • 16.Goyal K., Kiah H.M. Evaluating the Gilbert-Varshamov Bound for Constrained Systems; Proceedings of the 2022 IEEE International Symposium on Information Theory (ISIT); Espoo, Finland, 26 June–1 July 2022; pp. 1348–1353. [Google Scholar]
  • 17.Tolhuizen L.M.G.M. The generalized Gilbert-Varshamov bound is implied by Turan’s theorem. IEEE Trans. Inf. Theory. 1997;43:1605–1606. doi: 10.1109/18.623158. [DOI] [Google Scholar]
  • 18.Luenberger D.G. Introduction to Linear and Nonlinear Programming. Addison-Wesley; Reading, MA, USA: 1973. [Google Scholar]
  • 19.Rockafellar R.T. Convex Analysis. Princeton University Press; Princeton, NJ, USA: 1970. [Google Scholar]
  • 20.Kashyap N., Roth R.M., Siegel P.H. The Capacity of Count-Constrained ICI-Free Systems; Proceedings of the 2019 IEEE International Symposium on Information Theory (ISIT); Paris, France. 7–12 July 2019; pp. 1592–1596. [Google Scholar]
  • 21.Tandon A., Kiah H.M., Motani M. Bounds on the size and asymptotic rate of subblock-constrained codes. IEEE Trans. Inf. Theory. 2018;64:6604–6619. doi: 10.1109/TIT.2018.2864137. [DOI] [Google Scholar]
  • 22.Stewart G.W. Introduction to Matrix Computations. Academic Press; New York, NY, USA: 1973. Computer Science and Applied Mathematics. [Google Scholar]
