Entropy. 2021 Dec 24;24(1):29. doi: 10.3390/e24010029

The Listsize Capacity of the Gaussian Channel with Decoder Assistance

Amos Lapidoth 1,*, Yiming Yan 1
Editors: Luca Barletta1, Alex Dytso1
PMCID: PMC8774540  PMID: 35052055

Abstract

The listsize capacity is computed for the Gaussian channel with a helper that—cognizant of the channel-noise sequence but not of the transmitted message—provides the decoder with a rate-limited description of said sequence. This capacity is shown to equal the sum of the cutoff rate of the Gaussian channel without help and the rate of help. In particular, zero-rate help raises the listsize capacity from zero to the cutoff rate. This is achieved by having the helper provide the decoder with a sufficiently fine quantization of the normalized squared Euclidean norm of the noise sequence.

Keywords: bit pipe, cutoff rate, decoder assistance, Gaussian channel, helper, listsize capacity

1. Introduction

The order-ρ listsize capacity Clist(ρ) of a channel is the supremum of the coding rates for which there exist codes guaranteeing the large-blocklength convergence to one of the ρ-th moment of the cardinality of the list of messages that, given the received output sequence, have positive a posteriori probability. It is zero for the Gaussian channel because, on this channel, no codeword is ruled out by any received sequence so said list contains all the messages. Here we derive this capacity for the Gaussian channel with a helper that observes the noise sequence and describes it to the decoder using a rate-limited noise-free bit pipe; see Figure 1.

Figure 1. Gaussian channel with decoder assistance.

We show that the listsize capacity Clist(ρ)(Rh) is then the sum of the bit pipe's rate Rh and the order-ρ cutoff rate Rcutoff(ρ) of the Gaussian channel without a helper:

Clist(ρ)(Rh)=Rcutoff(ρ)+Rh. (1)

The latter's definition is similar to that of the listsize capacity, but with the list now comprising only those messages that are a posteriori at least as likely as the transmitted one. As we shall see, for the Gaussian channel with average power P, noise-variance N, and corresponding signal-to-noise ratio (SNR) A ≜ P/N,

Rcutoff(ρ)=R0(ρ), (2)

where

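$$R_0(\rho) = \frac{1}{2}\ln\left(\frac{1}{2}\left(1+\frac{A}{1+\rho}+\sqrt{\left(1-\frac{A}{1+\rho}\right)^2+\frac{4A}{(1+\rho)^2}}\right)\right) + \frac{1+\rho}{2\rho}\cdot\frac{1}{2}\left(1+\frac{A}{1+\rho}-\sqrt{\left(1-\frac{A}{1+\rho}\right)^2+\frac{4A}{(1+\rho)^2}}\right) + \frac{1}{2\rho}\ln\left(\frac{1}{2}\left(1-\frac{A}{1+\rho}+\sqrt{\left(1-\frac{A}{1+\rho}\right)^2+\frac{4A}{(1+\rho)^2}}\right)\right) \tag{3}$$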

(in nats) is a function that plays a prominent role in the analysis of the Reliability Function of said channel (Section 7.4 in [1]), [2]. That analysis does not, however, carry over directly to our setting because it deals with error exponents and not lists.
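As a numerical illustration, the following minimal sketch evaluates R0(ρ) of (3) through the quantity β that the paper introduces later in (58). The function names `beta` and `r0` and the argument name `snr` (standing for A) are choices of this sketch, not of the paper.

```python
import math


def beta(rho: float, snr: float) -> float:
    """The quantity beta of (58), with snr playing the role of A = P/N."""
    a = snr / (1.0 + rho)
    return 0.5 * (1.0 - a + math.sqrt((1.0 - a) ** 2 + 4.0 * snr / (1.0 + rho) ** 2))


def r0(rho: float, snr: float) -> float:
    """R0(rho) of (3), in nats, for rho > 0 and SNR A = snr >= 0."""
    if rho <= 0.0 or snr < 0.0:
        raise ValueError("need rho > 0 and snr >= 0")
    a = snr / (1.0 + rho)
    b = beta(rho, snr)
    # The three terms of (3): beta + a and 1 - beta are the arguments of the
    # first logarithm and of the middle term, respectively.
    return (0.5 * math.log(b + a)
            + (1.0 + rho) / (2.0 * rho) * (1.0 - b)
            + math.log(b) / (2.0 * rho))


if __name__ == "__main__":
    for rho in (0.5, 1.0, 2.0):
        print(rho, r0(rho, snr=2.0))
```

For ρ = 1 the expression reduces to the classical cutoff rate of the Gaussian channel, and for small ρ it approaches the Shannon capacity (1/2) ln(1 + A); both make convenient sanity checks for an implementation.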

It is interesting to note that (1) also holds when the help rate Rh is zero: the number of help bits required to increase the listsize capacity from zero to Rcutoff(ρ) is sublinear in the blocklength. In fact, as we shall see, all it takes is a sufficiently fine quantization of the normalized squared Euclidean norm of the noise sequence.

The relation (1) is reminiscent of the analogous result on the erasures-only capacity Ce-o(Rh) of the Gaussian channel with a rate-Rh helper (Remark 10 in [3]), namely, that

Ce-o(Rh)=C+Rh, (4)

where C denotes the Shannon capacity of the Gaussian channel (without help) (Theorem 9.1.1 in [4]), and Ce-o(Rh) is the erasures-only capacity, which is defined like Clist(ρ)(Rh) but with the requirement on the ρ-th moment of the list replaced by the requirement that the list be of size 1 with probability tending to one. (The Gaussian erasures-only capacity with a helper is given by the RHS of (4) irrespective of whether the assistance is provided to the encoder or decoder.) The latter result in turn is reminiscent of the analogous result on the Shannon capacity with a helper C(Rh) [5,6,7,8]

C(Rh)=C+Rh. (5)

In proving (1), we shall focus on the “direct part,” i.e., that the right-hand side (RHS) of (1) is achievable. The “converse,” that no rate exceeding the RHS of (1) is achievable, is omitted because it follows directly from (Remark 4 in [3]): There it is shown that this is true even if, given the received sequence and the provided help, the list contains only a subset of the messages that are of positive a posteriori probability, namely, those that are a posteriori at least as likely as the transmitted message.

The listsize capacity is relevant, for example, when the message set corresponds to tasks [9] and the transmitted message corresponds to one that must be performed by the decoder with absolute certainty. To ensure this, the decoder must perform all the tasks in the list of tasks that are not ruled out by the received sequence. (In addition to the transmitted task, other tasks need not but may be performed.) The ρ-th moment of the list’s size then measures the receiver’s average effort.

Results on the listsize capacity and the erasures-only capacity of general discrete memoryless channels (DMCs) in the absence of help are scarce. Noteworthy exceptions are the results of Pinsker and Sheverdjaev [10], Csiszár and Narayan [11], and Telatar [12], that provide sufficient conditions for the erasures-only capacity to equal the Shannon capacity and for the listsize capacity to equal the cutoff rate. Asymptotic results on the erasures-only capacity in the low-noise regime can be found in [13,14]. Once noiseless feedback is introduced, the problems become more tractable [15,16,17].

The rest of the paper is organized as follows. Section 2 describes our set-up and presents the main result. Section 3 contains some classical and some new observations regarding Gallager’s E0 function and its modification. Section 4 derives the cutoff rate of the Gaussian channel without help and proves (2). Section 5 describes and analyzes a coding scheme that proves the direct part of (1).

2. The Main Result

A power-P blocklength-n encoder $f^{(n)}$ for a message set $\mathcal{M}$ is a mapping

$$f^{(n)}\colon \mathcal{M} \to \mathbb{R}^n \tag{6}$$

that maps each message $m \in \mathcal{M}$ to an n-tuple $f^{(n)}(m)$ whose Euclidean norm $\|f^{(n)}(m)\|$ satisfies

$$\|f^{(n)}(m)\|^2 \le nP, \quad m \in \mathcal{M}. \tag{7}$$

We sometimes use $\mathbf{x}_m$ to denote $f^{(n)}(m)$, and $x_{m,k}$ to denote the k-th component of $\mathbf{x}_m$, so

$$f^{(n)}(m) = \mathbf{x}_m = (x_{m,1}, \ldots, x_{m,n}). \tag{8}$$

The encoder is said to be of rate R if the cardinality of $\mathcal{M}$ is $e^{nR}$, in which case we often assume that $\mathcal{M} = \{1, \ldots, e^{nR}\}$. (We ignore the fact that $e^{nR}$ need not be an integer; this issue washes out in the large-n asymptotics we study.)

When a message $m \in \mathcal{M}$ is sent over the discrete-time additive Gaussian noise channel using the encoder $f^{(n)}$, the channel produces the random vector $\mathbf{Y} \in \mathbb{R}^n$ whose k-th component $Y_k$ is

$$Y_k = x_{m,k} + Z_k, \quad k = 1, \ldots, n, \tag{9}$$

where $\{Z_k\}$ are independent and identically distributed (IID) zero-mean Gaussians of variance N. We assume that N is positive and use w(y|x) to denote the density of the channel's output when its input is x, i.e., the mean-x variance-N Gaussian density

$$w(y|x) = \frac{1}{\sqrt{2\pi N}}\, e^{-\frac{(y-x)^2}{2N}}, \quad x, y \in \mathbb{R}, \tag{10}$$

which we extend to n-tuples in a memoryless fashion:

$$w(\mathbf{y}|\mathbf{x}) = \prod_{k=1}^{n} w(y_k|x_k), \quad \mathbf{x}, \mathbf{y} \in \mathbb{R}^n. \tag{11}$$

For convenience, we define

$$A = \frac{P}{N}. \tag{12}$$

Given an output sequence $\mathbf{y}$ and a message m, we define the “at-least-as-likely list”

$$\mathcal{L}(m,\mathbf{y}) = \bigl\{ m' \in \mathcal{M} : w(\mathbf{y}|\mathbf{x}_{m'}) \ge w(\mathbf{y}|\mathbf{x}_m) \bigr\}. \tag{13}$$

Assuming, as we do, that the messages are a priori equally likely, this list comprises the messages that, given the output sequence $\mathbf{y}$, are a posteriori at least as likely as m.
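To make the definition (13) concrete, here is a small sketch that computes the at-least-as-likely list for a toy codebook. The function names, the zero-based message indices, and the three-codeword example are assumptions made for illustration only.

```python
import math


def log_likelihood(y, x, noise_var=1.0):
    """ln w(y|x) for the memoryless Gaussian channel of (10) and (11)."""
    n = len(y)
    sq_dist = sum((yk - xk) ** 2 for yk, xk in zip(y, x))
    return -0.5 * n * math.log(2.0 * math.pi * noise_var) - sq_dist / (2.0 * noise_var)


def at_least_as_likely_list(m, y, codebook, noise_var=1.0):
    """Messages m' whose likelihood given y is at least that of message m, as in (13)."""
    reference = log_likelihood(y, codebook[m], noise_var)
    return [mp for mp, x in enumerate(codebook)
            if log_likelihood(y, x, noise_var) >= reference]


if __name__ == "__main__":
    toy_codebook = [[1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]]  # hypothetical example
    received = [0.9, 0.2]
    for m in range(len(toy_codebook)):
        print(m, at_least_as_likely_list(m, received, toy_codebook))
```

The list always contains m itself, so its cardinality is at least one; the closer the codeword of m is to the received sequence, the shorter the list.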

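If a message M, drawn equiprobably from $\mathcal{M}$, is transmitted over the channel with a resulting received sequence $\mathbf{Y}$, then the cardinality of the at-least-as-likely list is a random positive integer, and we denote its ρ-th moment $\mathrm{E}\bigl[|\mathcal{L}(M,\mathbf{Y})|^\rho\bigr]$:

$$\mathrm{E}\bigl[|\mathcal{L}(M,\mathbf{Y})|^\rho\bigr] = \frac{1}{|\mathcal{M}|} \sum_{m \in \mathcal{M}} \int w(\mathbf{y}|\mathbf{x}_m)\, |\mathcal{L}(m,\mathbf{y})|^\rho \, d\nu(\mathbf{y}), \tag{14}$$

where ν(·) denotes the Lebesgue measure on $\mathbb{R}^n$.

For a given ρ>0, we define the order-ρ cutoff rate Rcutoff(ρ) as the supremum of the rates R for which there exists a sequence of rate-R power-P blocklength-n encoders $\{f^{(n)}\}$ satisfying

$$\lim_{n\to\infty} \mathrm{E}\bigl[|\mathcal{L}(M,\mathbf{Y})|^\rho\bigr] = 1. \tag{15}$$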

Theorem 1.

The order-ρ cutoff rate Rcutoff(ρ) of the additive Gaussian noise channel equals R0(ρ) of (3).

Proof. 

See Section 4. □

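A $\mathcal{T}_n$-valued description of the noise sequence $\mathbf{Z} = (Z_1, \ldots, Z_n)$ is a mapping

$$\phi^{(n)}\colon \mathbb{R}^n \to \mathcal{T}_n \tag{16}$$

with the understanding that $\phi^{(n)}(\mathbf{Z})$, which we denote T, is the description of $\mathbf{Z}$. We say that a sequence $\{\phi^{(n)}\}$ of descriptions is of rate Rh (nats) if

$$\lim_{n\to\infty} \frac{1}{n} \ln |\mathcal{T}_n| = R_h. \tag{17}$$

Suppose now that, in addition to the received sequence $\mathbf{Y}$, the receiver is also presented with the description $T = \phi^{(n)}(\mathbf{Z})$ of the noise, and that, based on the two, it forms the “remotely-plausible list” $\mathcal{L}(\mathbf{Y},T)$ comprising the messages that have positive a posteriori probability given the two:

$$\mathcal{L}(\mathbf{y},t) = \bigl\{ m \in \mathcal{M} : \phi^{(n)}(\mathbf{y} - \mathbf{x}_m) = t \bigr\}. \tag{18}$$

Given ρ>0, the listsize capacity Clist(ρ)(Rh) with rate-Rh decoder assistance is the supremum of the rates R for which there exists a sequence of rate-R power-P blocklength-n encoders $\{f^{(n)}\}$ and a sequence $\{\phi^{(n)}\}$ of descriptions of rate Rh such that

$$\lim_{n\to\infty} \mathrm{E}\Bigl[\bigl|\mathcal{L}\bigl(\mathbf{Y}, \phi^{(n)}(\mathbf{Z})\bigr)\bigr|^\rho\Bigr] = 1. \tag{19}$$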

Theorem 2.

On the Gaussian channel, the listsize capacity with rate-Rh decoder assistance Clist(ρ)(Rh) is given by

Clist(ρ)(Rh)=Rcutoff(ρ)+Rh (20)

where Rcutoff(ρ) is the order-ρ cutoff rate of the channel (without assistance) as given in (2) and (3).

Proof. 

The “converse,” that (19) cannot be achieved when the rate exceeds the RHS of (20), follows from (Remark 4 in [3]). The “direct part,” describing a coding scheme that achieves (19) with rates approaching the RHS of (20), is proved in Section 5. □

3. Preliminaries

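Given ρ ≥ 0 and any probability measure Q on $\mathbb{R}$, Gallager's E0 function for our channel is defined as [1]

$$E_0(\rho,Q) = -\ln \int_{y \in \mathbb{R}} \left( \int_{x \in \mathbb{R}} w(y|x)^{\frac{1}{1+\rho}} \, dQ(x) \right)^{1+\rho} d\nu(y), \tag{21}$$

where ν(·) is now the Lebesgue measure on $\mathbb{R}$. The result of maximizing $E_0(\rho,Q)$ over all Q under which $\mathrm{E}[X^2] \le P$ is denoted $E_0^*(\rho)$:

$$E_0^*(\rho) = \sup_{Q \colon \int x^2 \, dQ(x) \le P} E_0(\rho,Q). \tag{22}$$

The multi-letter extension of E0 is

$$E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) = -\frac{1}{n} \ln \int_{\mathbf{y} \in \mathbb{R}^n} \left( \int_{\mathbf{x} \in \mathbb{R}^n} w(\mathbf{y}|\mathbf{x})^{\frac{1}{1+\rho}} \, dQ^{(n)}(\mathbf{x}) \right)^{1+\rho} d\nu(\mathbf{y}), \tag{23}$$

where $Q^{(n)}$ is a probability measure on $\mathbb{R}^n$; the integrals are over $\mathbb{R}^n$; the channel $w(\mathbf{y}|\mathbf{x})$ is defined in (11). Similarly,

$$E_0^{(n),*}(\rho) = \sup_{Q^{(n)} \colon \int \|\mathbf{x}\|^2 \, dQ^{(n)}(\mathbf{x}) \le nP} E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr). \tag{24}$$

Given probability measures $Q^{(m)}$ on $\mathbb{R}^m$ and $Q^{(n)}$ on $\mathbb{R}^n$ that satisfy the power constraints $\mathrm{E}[\|\mathbf{X}\|^2] \le mP$ and $\mathrm{E}[\|\mathbf{X}\|^2] \le nP$ respectively, the product measure $Q^{(m)} \times Q^{(n)}$ on $\mathbb{R}^{m+n}$ satisfies the power constraint $\mathrm{E}[\|\mathbf{X}\|^2] \le (m+n)P$ and

$$(m+n)\, E_0^{(m+n)}\bigl(\rho, Q^{(m)} \times Q^{(n)}\bigr) = m\, E_0^{(m)}\bigl(\rho,Q^{(m)}\bigr) + n\, E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) \tag{25}$$

because

\begin{align}
(m+n)\, E_0^{(m+n)}\bigl(\rho, Q^{(m)} \times Q^{(n)}\bigr)
&= -\ln \int_{\mathbf{y} \in \mathbb{R}^{m+n}} \left( \int_{\mathbf{x} \in \mathbb{R}^{m+n}} w(\mathbf{y}|\mathbf{x})^{\frac{1}{1+\rho}} \, d\bigl(Q^{(m)} \times Q^{(n)}\bigr)(\mathbf{x}) \right)^{1+\rho} d\nu(\mathbf{y}) \tag{26}\\
&= m\, E_0^{(m)}\bigl(\rho,Q^{(m)}\bigr) + n\, E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr). \tag{27}
\end{align}

The sequence $\{ n E_0^{(n),*}(\rho) \}$ is thus superadditive, and Fekete's subadditive lemma implies that $E_0^{(n),*}(\rho)$ converges to its supremum:

$$\lim_{n\to\infty} E_0^{(n),*}(\rho) = \sup_{n} E_0^{(n),*}(\rho). \tag{28}$$

We shall later see (cf. (55) ahead) that

$$\frac{1}{\rho} \sup_{n} E_0^{(n),*}(\rho) = R_0(\rho), \tag{29}$$

where R0(ρ) is defined in (3).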

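We shall also need Gallager's modified E0 function. To highlight its relation to the unmodified function, which is quite general, we shall use $g(x)$ for $x^2$ and $g(\mathbf{x})$ for $\|\mathbf{x}\|^2$. We shall also replace P with Γ.

Given some ρ ≥ 0, some probability distribution Q on $\mathbb{R}$ under which $\mathrm{E}[g(X)] \le \Gamma$, and some r ≥ 0, the modified Gallager's E0 function $E_{0,\mathrm{m}}(\rho,Q,r)$ is defined as

$$E_{0,\mathrm{m}}(\rho,Q,r) = -\ln \int_{y \in \mathbb{R}} \left( \int_{x \in \mathbb{R}} e^{r(g(x)-\Gamma)}\, w(y|x)^{\frac{1}{1+\rho}} \, dQ(x) \right)^{1+\rho} d\nu(y). \tag{30}$$

We shall also be interested in the maximum of $E_{0,\mathrm{m}}(\rho,Q,r)$ over both Q and r. We distinguish between two cases depending on whether $\mathrm{E}[g(X)] \le \Gamma$ holds strictly or not. In the former case we only allow r to be zero, whereas in the latter case it can be any non-negative number. We thus define

$$E_{0,\mathrm{m}}^{*}(\rho,Q) = \begin{cases} \sup_{r \ge 0} E_{0,\mathrm{m}}(\rho,Q,r), & \text{if } \int g(x)\, dQ(x) = \Gamma, \\ E_0(\rho,Q), & \text{if } \int g(x)\, dQ(x) < \Gamma, \end{cases} \tag{31}$$

and

$$E_{0,\mathrm{m}}^{**}(\rho) = \sup_{Q \colon \int g(x)\, dQ(x) \le \Gamma} E_{0,\mathrm{m}}^{*}(\rho,Q). \tag{32}$$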

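The next proposition provides a lower bound on $\lim_{n\to\infty} E_0^{(n),*}(\rho)$.

Proposition 1.

Any probability distribution Q on $\mathbb{R}$ under which g(X) is of finite second moment and of expectation Γ provides the lower bound

$$\lim_{n\to\infty} E_0^{(n),*}(\rho) \ge E_{0,\mathrm{m}}^{*}(\rho,Q). \tag{33}$$

Proof. 

Let Q be any input distribution under which g(X) is of finite second moment and $\mathrm{E}[g(X)] = \Gamma$. For each $n \in \mathbb{N}$, let $Q^{(n)}$ be the conditional distribution of the n-fold product distribution $Q^{\times n}$ given the event $\{\mathbf{X} \in \mathcal{A}_n\}$, where

$$\mathcal{A}_n = \bigl\{ \mathbf{x} \in \mathbb{R}^n : n\Gamma - \delta < g(\mathbf{x}) \le n\Gamma \bigr\} \tag{34}$$

and δ>0 is some positive constant. Thus, for every Borel measurable subset B of $\mathbb{R}^n$,

$$Q^{(n)}(B) = \frac{1}{\mu}\, Q^{\times n}(B \cap \mathcal{A}_n), \quad B \in \mathcal{B}(\mathbb{R}^n) \tag{35}$$

with

$$\mu = Q^{\times n}(\mathcal{A}_n). \tag{36}$$

For any r ≥ 0, we can upper-bound the Radon–Nikodym derivative of $Q^{(n)}$ with respect to the product distribution $Q^{\times n}$ as follows:

\begin{align}
\frac{dQ^{(n)}}{dQ^{\times n}}(\mathbf{x}) &= \frac{1}{\mu}\, I\{\mathbf{x} \in \mathcal{A}_n\} \tag{37}\\
&\le \frac{1}{\mu}\, e^{r(g(\mathbf{x}) - n\Gamma + \delta)} \tag{38}\\
&= \frac{1}{\mu}\, e^{r\delta}\, e^{r(g(\mathbf{x}) - n\Gamma)}, \tag{39}
\end{align}

where I{statement} equals 1 if the statement is true and 0 otherwise. Using this bound on the Radon–Nikodym derivative we obtain:

\begin{align}
E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) &= -\frac{1}{n} \ln \int_{\mathbf{y} \in \mathbb{R}^n} \left( \int_{\mathbf{x} \in \mathbb{R}^n} w(\mathbf{y}|\mathbf{x})^{\frac{1}{1+\rho}} \, dQ^{(n)}(\mathbf{x}) \right)^{1+\rho} d\nu(\mathbf{y}) \tag{40}\\
&\ge -\frac{1}{n} \ln \int_{\mathbf{y} \in \mathbb{R}^n} \left( \int_{\mathbf{x} \in \mathbb{R}^n} w(\mathbf{y}|\mathbf{x})^{\frac{1}{1+\rho}} \cdot \frac{1}{\mu}\, e^{r\delta}\, e^{r(g(\mathbf{x}) - n\Gamma)} \, dQ^{\times n}(\mathbf{x}) \right)^{1+\rho} d\nu(\mathbf{y}) \tag{41}\\
&= -\frac{1+\rho}{n} \ln \frac{e^{r\delta}}{\mu} - \ln \int_{y \in \mathbb{R}} \left( \int_{x \in \mathbb{R}} e^{r(g(x)-\Gamma)}\, w(y|x)^{\frac{1}{1+\rho}} \, dQ(x) \right)^{1+\rho} d\nu(y) \tag{42}\\
&= -\frac{1+\rho}{n} \ln \frac{e^{r\delta}}{\mu} + E_{0,\mathrm{m}}(\rho,Q,r). \tag{43}
\end{align}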

By the Central Limit Theorem, μ tends to 1/2 as n tends to infinity, so (43) implies that

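$$\liminf_{n\to\infty} E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) \ge E_{0,\mathrm{m}}(\rho,Q,r). \tag{44}$$

Taking the supremum of the RHS over all r ≥ 0 establishes that

$$\liminf_{n\to\infty} E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) \ge E_{0,\mathrm{m}}^{*}(\rho,Q) \tag{45}$$

and hence, by (24), proves (33). □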

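We next turn to upper-bounding $\lim_{n\to\infty} E_0^{(n),*}(\rho)$.

Proposition 2.

If the probability distribution $Q^{(n)}$ on $\mathbb{R}^n$ is such that $\mathrm{E}[g(\mathbf{X})] \le n\Gamma$, and if $f_R$ is any density on $\mathbb{R}$, then

$$E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) \le \sup_{P \colon \int g(x)\, dP(x) \le \Gamma} -(1+\rho) \int_{x \in \mathbb{R}} \ln \int_{y \in \mathbb{R}} w(y|x)^{\frac{1}{1+\rho}} f_R(y)^{\frac{\rho}{1+\rho}} \, dy \, dP(x) \tag{46}$$

and, consequently,

$$\lim_{n\to\infty} E_0^{(n),*}(\rho) \le \sup_{P \colon \int g(x)\, dP(x) \le \Gamma} -(1+\rho) \int_{x \in \mathbb{R}} \ln \int_{y \in \mathbb{R}} w(y|x)^{\frac{1}{1+\rho}} f_R(y)^{\frac{\rho}{1+\rho}} \, dy \, dP(x). \tag{47}$$

Proof. 

The proof is based on Proposition 2 in [18], which implies that for every density $f_R^{(n)}$ on $\mathbb{R}^n$ and any probability measure $Q^{(n)}$ on $\mathbb{R}^n$,

$$n E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) \le -(1+\rho) \int_{\mathbf{x} \in \mathbb{R}^n} \ln \int_{\mathbf{y} \in \mathbb{R}^n} w(\mathbf{y}|\mathbf{x})^{\frac{1}{1+\rho}} f_R^{(n)}(\mathbf{y})^{\frac{\rho}{1+\rho}} \, d\mathbf{y} \, dQ^{(n)}(\mathbf{x}). \tag{48}$$

Applying this inequality to the product density

$$f_R^{(n)}(\mathbf{y}) = \prod_{i=1}^{n} f_R(y_i), \tag{49}$$

where $f_R$ is a density on $\mathbb{R}$, and using the product form of the channel (11), we obtain that for any density $f_R$ on $\mathbb{R}$

\begin{align}
E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) &\le -\frac{1}{n}(1+\rho) \sum_{i=1}^{n} \int_{x_i \in \mathbb{R}} \ln \int_{y \in \mathbb{R}} w(y|x_i)^{\frac{1}{1+\rho}} f_R(y)^{\frac{\rho}{1+\rho}} \, dy \, dQ_i^{(n)}(x_i) \tag{50}\\
&= -(1+\rho) \int_{x \in \mathbb{R}} \ln \int_{y \in \mathbb{R}} w(y|x)^{\frac{1}{1+\rho}} f_R(y)^{\frac{\rho}{1+\rho}} \, dy \, d\bar{Q}(x), \tag{51}
\end{align}

where $Q_i^{(n)}$ is the i-th marginal of $Q^{(n)}$, and $\bar{Q}$ is the probability measure on $\mathbb{R}$ defined by

$$\bar{Q} = \frac{1}{n} \sum_{i=1}^{n} Q_i^{(n)}. \tag{52}$$

Observe that if $\mathrm{E}[g(\mathbf{X})] \le n\Gamma$ under $Q^{(n)}$, then $\mathrm{E}[g(X)] \le \Gamma$ under $\bar{Q}$. This observation and (51) establish (46). Since (46) holds for all n, (47) must also hold. □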

4. The Cutoff Rate of the Gaussian Channel

In this section, we prove Theorem 1. Since scaling the output does not change the cutoff rate, we will assume WLOG that the noise variance is 1 and the transmit power is A; see (12). Thus,

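$$w(y|x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(y-x)^2}{2}}, \quad x, y \in \mathbb{R}, \tag{53}$$

and each codeword $\mathbf{x}_m$ satisfies

$$\|\mathbf{x}_m\|^2 \le nA. \tag{54}$$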

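4.1. Computing $\lim_{n\to\infty} E_0^{(n),*}(\rho)$

Here we shall establish that on the Gaussian channel (53)

$$\lim_{n\to\infty} E_0^{(n),*}(\rho) = \rho R_0(\rho) = E_{0,\mathrm{m}}^{*}(\rho,Q_G), \tag{55}$$

where R0(ρ) is defined in (3), and $Q_G$ is the zero-mean variance-A Gaussian distribution. To this end, we shall derive matching upper and lower bounds on the limit. We begin with the former.

4.1.1. Upper-Bounding $\lim_{n\to\infty} E_0^{(n),*}(\rho)$

We show that on the channel (10)

$$\lim_{n\to\infty} E_0^{(n),*}(\rho) \le \rho R_0(\rho). \tag{56}$$

The proof is based on Proposition 2 with the density $f_R$ corresponding to a centered Gaussian of variance $\sigma^2$, where

$$\sigma^2 = \frac{A}{(1+\rho)\beta} + 1 \tag{57}$$

and

$$\beta = \frac{1}{2}\left(1 - \frac{A}{1+\rho} + \sqrt{\left(1 - \frac{A}{1+\rho}\right)^2 + \frac{4A}{(1+\rho)^2}}\right). \tag{58}$$

Evaluating the RHS of (47) for this density, we obtain

\begin{align}
&\sup_{P \colon \mathrm{E}[X^2] \le A} -(1+\rho) \int_{x \in \mathbb{R}} \ln \int_{y \in \mathbb{R}} w(y|x)^{\frac{1}{1+\rho}} f_R(y)^{\frac{\rho}{1+\rho}} \, dy \, dP(x) \tag{59}\\
&\quad= \sup_{P \colon \mathrm{E}[X^2] \le A} -(1+\rho) \int_{x \in \mathbb{R}} \ln \int_{y \in \mathbb{R}} \frac{1}{(\sqrt{2\pi})^{\frac{1}{1+\rho}}}\, e^{-\frac{(y-x)^2}{2(1+\rho)}}\, \frac{1}{(\sqrt{2\pi\sigma^2})^{\frac{\rho}{1+\rho}}}\, e^{-\frac{y^2 \rho}{2\sigma^2(1+\rho)}} \, dy \, dP(x) \tag{60}\\
&\quad= \sup_{P \colon \mathrm{E}[X^2] \le A} -(1+\rho) \int_{x \in \mathbb{R}} \ln \left( \sqrt{2\pi \frac{(1+\rho)^2}{\rho} (\sigma^2)^{\frac{1}{1+\rho}}}\; \frac{1}{\sqrt{2\pi\sigma_1^2}}\, e^{-\frac{x^2}{2\sigma_1^2}} \right) dP(x) \tag{61}\\
&\quad= \sup_{P \colon \mathrm{E}[X^2] \le A} -(1+\rho) \ln \sqrt{\frac{(1+\rho)^2}{\rho} (\sigma^2)^{\frac{1}{1+\rho}} \frac{1}{\sigma_1^2}} + (1+\rho) \int_{x \in \mathbb{R}} \frac{x^2}{2\sigma_1^2} \, dP(x) \tag{62}\\
&\quad= -(1+\rho) \ln \sqrt{\frac{(1+\rho)^2}{\rho} (\sigma^2)^{\frac{1}{1+\rho}} \frac{1}{\sigma_1^2}} + (1+\rho) \frac{A}{2\sigma_1^2} \tag{63}\\
&\quad= \frac{1+\rho}{2} \frac{A}{\sigma_1^2} + \frac{1+\rho}{2} \ln \sigma_1^2 - \frac{1}{2} \ln \sigma^2 - \frac{1+\rho}{2} \ln \frac{(1+\rho)^2}{\rho}, \tag{64}
\end{align}

where in (61) we defined

\begin{align}
\sigma_1^2 &\triangleq 1 + \rho + \frac{\sigma^2 (1+\rho)}{\rho} \tag{65}\\
&= \frac{A}{\rho\beta} + \frac{(1+\rho)^2}{\rho}. \tag{66}
\end{align}

To conclude the proof, it remains to show that the RHS of (64) coincides with ρR0(ρ). To this end, observe that some basic algebra reveals that

$$\beta \left( \beta - 1 + \frac{A}{1+\rho} \right) = \frac{A}{(1+\rho)^2} \tag{67}$$

and

$$\left( \beta + \frac{A}{1+\rho} \right) (1 - \beta) = \frac{A\rho}{(1+\rho)^2}. \tag{68}$$

Therefore, the first term in (64) can be rewritten as

\begin{align}
\frac{1+\rho}{2} \frac{A}{\sigma_1^2} &= \frac{1+\rho}{2} \cdot \frac{A}{\frac{A}{\rho\beta} + \frac{(1+\rho)^2}{\rho}} = \frac{1+\rho}{2} \cdot \frac{A\rho}{(1+\rho)^2} \cdot \frac{\beta}{\frac{A}{(1+\rho)^2} + \beta} \tag{69}\\
&= \frac{A\rho}{2(1+\rho)} \cdot \frac{1}{\beta + \frac{A}{1+\rho}} = \frac{(1+\rho)(1-\beta)}{2}, \tag{70}
\end{align}

and the remaining terms rewritten as

\begin{align}
&\frac{1+\rho}{2} \ln \sigma_1^2 - \frac{1}{2} \ln \sigma^2 - \frac{1+\rho}{2} \ln \frac{(1+\rho)^2}{\rho} \notag\\
&\quad= \frac{1+\rho}{2} \ln \left( \frac{A}{\rho\beta} + \frac{(1+\rho)^2}{\rho} \right) - \frac{1}{2} \ln \left( \frac{A}{(1+\rho)\beta} + 1 \right) - \frac{1+\rho}{2} \ln \frac{(1+\rho)^2}{\rho} \tag{71}\\
&\quad= \frac{1+\rho}{2} \ln \left( \frac{A}{(1+\rho)^2 \beta} + 1 \right) - \frac{1}{2} \ln \left( \frac{A}{(1+\rho)\beta} + 1 \right) \tag{72}\\
&\quad= \frac{\rho}{2} \ln \left( \beta + \frac{A}{1+\rho} \right) + \frac{1}{2} \ln \beta. \tag{73}
\end{align}

The sum of (70) and (73) equals ρR0(ρ).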
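The algebra of this subsection can be cross-checked numerically. The sketch below evaluates the RHS of (64) from the definitions (57), (58), and (65), compares it with the sum of the closed forms (70) and (73), and also returns the residuals of the identities (67) and (68). The function name `upper_bound_check` and the argument name `snr` (for A) are choices of this sketch.

```python
import math


def upper_bound_check(rho: float, snr: float):
    """Return (RHS of (64), (70) + (73), residual of (67), residual of (68))."""
    a = snr / (1.0 + rho)
    c = snr / (1.0 + rho) ** 2
    b = 0.5 * (1.0 - a + math.sqrt((1.0 - a) ** 2 + 4.0 * c))      # beta of (58)
    sigma2 = snr / ((1.0 + rho) * b) + 1.0                         # (57)
    sigma1_sq = 1.0 + rho + sigma2 * (1.0 + rho) / rho             # (65)

    rhs_64 = ((1.0 + rho) / 2.0 * snr / sigma1_sq
              + (1.0 + rho) / 2.0 * math.log(sigma1_sq)
              - 0.5 * math.log(sigma2)
              - (1.0 + rho) / 2.0 * math.log((1.0 + rho) ** 2 / rho))

    closed_form = ((1.0 + rho) * (1.0 - b) / 2.0                   # (70)
                   + rho / 2.0 * math.log(b + a)                   # (73), first term
                   + 0.5 * math.log(b))                            # (73), second term

    residual_67 = b * (b - 1.0 + a) - c
    residual_68 = (b + a) * (1.0 - b) - rho * c
    return rhs_64, closed_form, residual_67, residual_68


if __name__ == "__main__":
    for rho, snr in [(0.5, 1.0), (1.0, 1.0), (2.0, 10.0)]:
        print(rho, snr, upper_bound_check(rho, snr))
```

Up to floating-point rounding, the first two returned values agree and the two residuals vanish, which is what the derivation asserts.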

4.1.2. Lower-Bounding $\lim_{n\to\infty} E_0^{(n),*}(\rho)$

To lower-bound $\lim_{n\to\infty} E_0^{(n),*}(\rho)$, we shall use Proposition 1 with Q chosen as a centered variance-A Gaussian distribution $Q_G$. For this probability distribution Gallager calculated $E_{0,\mathrm{m}}^{*}(\rho,Q_G)$ (Section 7.4 in [1]). He showed that for any ρ>0,

$$E_{0,\mathrm{m}}^{*}(\rho,Q_G) = \rho R_0(\rho), \tag{74}$$

where R0(ρ) is defined in (3). Using this result and Proposition 1 we obtain

\begin{align}
\lim_{n\to\infty} E_0^{(n),*}(\rho) &\ge E_{0,\mathrm{m}}^{*}(\rho,Q_G) \tag{75}\\
&= \rho R_0(\rho). \tag{76}
\end{align}

4.2. The Mapping $\rho \mapsto R_0(\rho)$ Is Monotonically Decreasing

For the purpose of proving the achievability of R0(ρ), we will need the fact that it is monotonically decreasing in ρ. In view of (55), it suffices to show that, for every $n \in \mathbb{N}$, the mapping $\rho \mapsto \rho^{-1} E_0^{(n),*}(\rho)$ is monotonically decreasing. In view of (24), the latter will follow once we establish the monotonicity of $\rho \mapsto \rho^{-1} E_0^{(n)}(\rho,Q^{(n)})$ for any fixed $Q^{(n)}$. Since $E_0^{(n)}(\rho,Q^{(n)})$ evaluates to zero at ρ=0, this monotonicity can be established by showing that the mapping $\rho \mapsto E_0^{(n)}(\rho,Q^{(n)})$ is concave. This is established in (Appendix 5.B in [1]). (That appendix deals with finite alphabets, but the proof carries over to our case.)

4.3. Achievability of R0(ρ)

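The achievability of R0(ρ) will be proved using a random-coding argument. Let Q be the zero-mean variance-A Gaussian distribution, let δ>0 be a positive constant, and let $Q^{(n)}$ be the distribution on $\mathbb{R}^n$ defined in (35) and (36). Draw the codewords $\{\mathbf{X}_m\}_{m=1,\ldots,e^{nR}}$ of a blocklength-n random codebook independently, each according to $Q^{(n)}$, so $\|\mathbf{X}_m\|^2 \le nA$ with probability 1 for every $m \in \mathcal{M}$. By symmetry, $\mathrm{E}[|\mathcal{L}(m,\mathbf{Y})|^\rho]$ (where the expectation is over the random choice of codebook and over the channel behavior) does not depend on m. Consequently,

$$\mathrm{E}\left[ e^{-nR} \sum_{m \in \mathcal{M}} |\mathcal{L}(m,\mathbf{Y})|^\rho \right] = \mathrm{E}\bigl[ |\mathcal{L}(1,\mathbf{Y})|^\rho \bigr], \tag{77}$$

and if we establish that $\mathrm{E}[|\mathcal{L}(1,\mathbf{Y})|^\rho]$ tends to 1, it will follow by the random-coding argument that there exists a codebook for which the LHS of (77), with the expectation now over the channel behavior only, tends to 1.

Defining

$$B_m(\mathbf{x}_1,\mathbf{y}) = \mathbb{1}\bigl\{ w(\mathbf{y}|\mathbf{X}_m) \ge w(\mathbf{y}|\mathbf{x}_1) \bigr\}, \quad \mathbf{x}_1, \mathbf{y} \in \mathbb{R}^n, \tag{78}$$

we can express the RHS of (77) as

$$\mathrm{E}\bigl[ |\mathcal{L}(1,\mathbf{Y})|^\rho \bigr] = \mathrm{E}\left[ \left( 1 + \sum_{m \ne 1} B_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \right], \tag{79}$$

and we seek to show that

$$\lim_{n\to\infty} \mathrm{E}\left[ \left( 1 + \sum_{m \ne 1} B_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \right] = 1. \tag{80}$$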

To this end, we shall need the following lemma.

Lemma 1.

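Let $\{Z_n\}$ be a sequence of random variables taking values in $\mathbb{N}$, and let ρ>0 be fixed. The following two conditions are then equivalent:

  • (i) $\mathrm{E}[(1+Z_n)^\rho] = 1 + o(1)$

  • (ii) $\mathrm{E}[Z_n^\rho] = o(1)$

where o(1) tends to zero as n tends to infinity. Thus

$$\lim_{n\to\infty} \mathrm{E}[(1+Z_n)^\rho] = 1 \iff \lim_{n\to\infty} \mathrm{E}[Z_n^\rho] = 0. \tag{81}$$

Proof. 

The implication (ii) ⟹ (i) follows by noting that for any $z \in \mathbb{N}$ and ρ>0

$$(1+z)^\rho \le 1 + 2^\rho z^\rho, \tag{82}$$

so

$$\mathrm{E}[(1+Z_n)^\rho] \le 1 + 2^\rho\, \mathrm{E}[Z_n^\rho]. \tag{83}$$

As for the implication (i) ⟹ (ii), note that for any $y \in \mathbb{N}$ and ρ>0

$$(1+y)^\rho \ge y^\rho + \mathbb{1}\{y = 0\}, \tag{84}$$

so

$$\mathrm{E}[(1+Z_n)^\rho] \ge \mathrm{E}[Z_n^\rho] + \Pr[Z_n = 0]. \tag{85}$$

The implication is now established by noting that (i) implies that $\Pr[Z_n = 0] \to 1$ because, by Markov's inequality (and the strict positivity of ρ),

\begin{align}
\Pr[Z_n \ne 0] &= \Pr\bigl[ (1+Z_n)^\rho - 1 \ge 2^\rho - 1 \bigr] \tag{86}\\
&\le \frac{\mathrm{E}[(1+Z_n)^\rho] - 1}{2^\rho - 1}. \tag{87}
\end{align}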
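The two elementary inequalities (82) and (84) that drive the proof can be spot-checked numerically. The following sketch does so over a grid of non-negative integers and a few positive values of ρ; the function name `lemma1_bounds_hold` and the small tolerance used to absorb floating-point rounding are choices of this sketch.

```python
def lemma1_bounds_hold(z: int, rho: float, tol: float = 1e-12) -> bool:
    """Check (84) and (82) for one non-negative integer z and one rho > 0."""
    middle = (1 + z) ** rho
    lower = z ** rho + (1.0 if z == 0 else 0.0)   # RHS of (84)
    upper = 1.0 + (2.0 ** rho) * (z ** rho)       # RHS of (82)
    return lower <= middle * (1.0 + tol) and middle <= upper * (1.0 + tol)


if __name__ == "__main__":
    ok = all(lemma1_bounds_hold(z, rho)
             for z in range(0, 500)
             for rho in (0.1, 0.5, 1.0, 2.0, 3.5))
    print("inequalities (82) and (84) hold on the grid:", ok)
```

Both bounds are tight at z = 0, which is why the probability of the event {Zn = 0} appears in (85).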

In light of the above lemma, to establish (80) it suffices to show that

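$$\lim_{n\to\infty} \mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \right] = 0, \tag{88}$$

i.e., that

$$\lim_{n\to\infty} \mathrm{E}\left[ \mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \right)^{\rho} \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right] = 0, \tag{89}$$

where the outer expectation is over $\mathbf{X}_1$ and $\mathbf{Y}$.

A related expectation, but one where it is the conditional expectation that is raised to the ρ-th power, is studied in the following lemma:

Lemma 2.

If ρ>0 and R<R0(ρ), then

$$\lim_{n\to\infty} \mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] = 0. \tag{90}$$

Proof. 

See Appendix A. □

To establish (88) using this lemma, we distinguish between two cases depending on whether 0<ρ≤1 or ρ>1. In the former case $x \mapsto x^\rho$ is concave, so Jensen's inequality implies that

$$\mathrm{E}\left[ \mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \right)^{\rho} \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right] \le \mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right], \tag{91}$$

which, together with Lemma 2, implies (88) whenever R<R0(ρ).

Suppose now that ρ>1. Conditional on the transmitted codeword $\mathbf{x}_1$ and the output $\mathbf{y}$, the random variables $\{B_m\}_{m \ne 1}$ are IID Bernoulli, with $B_m$ determined by $\mathbf{X}_m$. We can thus use Rosenthal's technique (Lemma 5.10 in [19]), [20] to obtain

\begin{align}
&\mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \right)^{\rho} \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \notag\\
&\quad\le 2^{\rho^2} \max\left\{ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho},\; \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right\} \tag{92}\\
&\quad\le 2^{\rho^2} \left( \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} + \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right). \tag{93}
\end{align}

Taking the expectation over $\mathbf{X}_1$ and $\mathbf{Y}$ yields

\begin{align}
&\mathrm{E}\left[ \mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \right)^{\rho} \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right] \tag{94}\\
&\quad\le 2^{\rho^2}\, \mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{y}) \,\middle|\, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] + 2^{\rho^2}\, \mathrm{E}\left[ \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{y}) \,\middle|\, \mathbf{Y} = \mathbf{y} \right] \right] \tag{95}\\
&\quad= 2^{\rho^2}\, \mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] + 2^{\rho^2}\, \mathrm{E}\left[ \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right]. \tag{96}
\end{align}

The first term on the RHS can be treated using the lemma. The second, but for the $2^{\rho^2}$ constant, is the one encountered when ρ is 1. Since by Section 4.2, R0(ρ) ≤ R0(1) (because ρ>1 for the case at hand), it too tends to zero when R<R0(ρ).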

4.4. No Rate Exceeding R0(ρ) Is Achievable

To show the converse, we need Arıkan’s lower bound on guessing [21].

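Fix any sequence of rate-R blocklength-n codebooks $\{\mathcal{C}_n\}$ satisfying the cost constraint. For any $n \in \mathbb{N}$, let

$$Q^{(n)}(\mathbf{x}) = \begin{cases} \dfrac{1}{|\mathcal{C}_n|} & \text{if } \mathbf{x} \in \mathcal{C}_n, \\ 0 & \text{otherwise} \end{cases} \tag{97}$$

be the induced probability distribution on $\mathbb{R}^n$. Since the codebook satisfies the cost constraint, $\mathrm{E}[\|\mathbf{X}\|^2] \le nA$ under $Q^{(n)}$.

Given $\mathbf{y}$, list the messages $m \in \mathcal{M}$ in decreasing order of likelihood $w(\mathbf{y}|\mathbf{x}_m)$ (resolving ties arbitrarily, e.g., ranking low numerical values of m higher), and let $G(m|\mathbf{y})$ denote the ranking of the message m in this list. Note that

$$|\mathcal{L}(m,\mathbf{y})| \ge G(m|\mathbf{y}), \tag{98}$$

where the inequality can be strict because there may be messages that are in $\mathcal{L}(m,\mathbf{y})$ because they have the same likelihood as m, and that are yet ranked lower than m by $G(\cdot|\mathbf{y})$ because of the way ties are resolved. It follows from this inequality that the ρ-th moment of $|\mathcal{L}(M,\mathbf{Y})|$ cannot tend to one unless the ρ-th moment of $G(M|\mathbf{Y})$ does. By Arıkan's guessing inequality [21],

$$\mathrm{E}\bigl[ G(M|\mathbf{Y})^\rho \bigr] \ge (1+nR)^{-\rho} \cdot \exp\Bigl( n\rho R - n E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) \Bigr), \tag{99}$$

so the ρ-th moment of $G(M|\mathbf{Y})$ can tend to one only if

$$\rho R \le \liminf_{n\to\infty} E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr). \tag{100}$$

From this, the converse now follows using (24) and (55) because

\begin{align}
\liminf_{n\to\infty} E_0^{(n)}\bigl(\rho,Q^{(n)}\bigr) &\le \lim_{n\to\infty} E_0^{(n),*}(\rho) \tag{101}\\
&= \rho R_0(\rho). \tag{102}
\end{align}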
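The relation (98) between the guessing rank and the list size can be illustrated on a toy codebook. In the sketch below, likelihoods are compared through squared Euclidean distances (for equal noise variances a larger likelihood is equivalent to a smaller distance), ties are broken in favor of the smaller index as in the example given in the text, and messages are indexed from zero; the function names and the codebook are hypothetical.

```python
def squared_distance(y, x):
    return sum((yk - xk) ** 2 for yk, xk in zip(y, x))


def guessing_rank(m, y, codebook):
    """G(m|y): position of m when messages are sorted by decreasing likelihood,
    with ties resolved in favor of the smaller message index."""
    d = [squared_distance(y, x) for x in codebook]
    ahead = sum(1 for mp in range(len(codebook))
                if d[mp] < d[m] or (d[mp] == d[m] and mp < m))
    return 1 + ahead


def list_size(m, y, codebook):
    """|L(m, y)|: number of messages at least as likely as m."""
    d = [squared_distance(y, x) for x in codebook]
    return sum(1 for mp in range(len(codebook)) if d[mp] <= d[m])


if __name__ == "__main__":
    # Three codewords equidistant from the received point: a three-way tie.
    toy_codebook = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
    received = [0.0, 0.0]
    for m in range(3):
        print(m, guessing_rank(m, received, toy_codebook),
              list_size(m, received, toy_codebook))
```

In the tie example every list has size three while the ranks are one, two, and three, so (98) holds with strict inequality for the first two messages.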

5. The Direct Part of Theorem 2

In this section we prove the direct part of Theorem 2: when the decoder can be provided with a rate-Rh description of the noise, the convergence (19) can be achieved at all transmission rates below R0(ρ)+Rh. As noted earlier, the converse follows directly from (Remark 4 in [3]).

Our proof treats the cases Rh=0 and Rh>0 separately. As in Section 4, we assume that the channel is normalized to having noise variance 1 and transmit power A.

5.1. Case 1: Rh=0

The analogous result for the modulo-additive channel was proved in [3] by having the helper provide the decoder with a lossless description of the type of the noise sequence. Since this type fully specifies the a posteriori probability of the transmitted message, the decoder’s remotely-plausible-with-this-help list L(Y,T) contains only messages whose a posteriori probability is equal to that of the correct message. It is therefore a subset of the at-least-as-likely list L(M,Y) (without help) and hence of smaller-or-equal ρ-th moment. Consequently, any rate that allows the latter to tend to one, also allows the former to tend to one.

On the Gaussian channel the likelihood $w(\mathbf{y}|\mathbf{x}_m)$ is specified by the normalized squared Euclidean norm of the noise sequence $\|\mathbf{z}\|^2/n$. The latter, however, cannot be described at zero rate with infinite precision. This motivates us to quantize it and have the quantized version be the zero-rate help. The result will then follow by considering the high-resolution limit of the achievable rates. For this purpose, a uniform quantizer will do.

Given some large M>0 (which determines the overload region) and some large K (corresponding to the number of quantization cells), we partition the interval [0,M] into K subintervals, each of length Δ=M/K. The helper, upon observing the noise sequence $\mathbf{Z}$, produces

$$T = \phi^{(n)}(\mathbf{Z}) = \begin{cases} \left\lfloor \|\mathbf{Z}\|^2 / (n\Delta) \right\rfloor & \text{if } \|\mathbf{Z}\|^2 / n < M, \\ K & \text{otherwise}. \end{cases} \tag{103}$$
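A minimal sketch of the helper's quantizer (103), together with the remotely-plausible list (18) it induces, might look as follows. The names `overload_level` (for M) and `num_cells` (for K), the zero-based message indices, and the clamp that guards against floating-point rounding at the last cell boundary are assumptions of this sketch rather than parts of the paper's construction.

```python
def helper_description(z, overload_level, num_cells):
    """Index of the width-(M/K) cell containing ||z||^2 / n, or K on overload; cf. (103)."""
    n = len(z)
    delta = overload_level / num_cells
    normalized_energy = sum(zk * zk for zk in z) / n
    if normalized_energy >= overload_level:
        return num_cells
    # The min() is a numerical guard only: mathematically the floor is at most K - 1 here.
    return min(int(normalized_energy // delta), num_cells - 1)


def remotely_plausible_list(y, t, codebook, overload_level, num_cells):
    """Messages m whose implied noise y - x_m has description t; cf. (18)."""
    plausible = []
    for m, x in enumerate(codebook):
        implied_noise = [yk - xk for yk, xk in zip(y, x)]
        if helper_description(implied_noise, overload_level, num_cells) == t:
            plausible.append(m)
    return plausible


if __name__ == "__main__":
    toy_codebook = [[1.0, 1.0], [-1.0, -1.0]]   # hypothetical two-codeword example
    noise = [0.1, -0.2]
    received = [xk + zk for xk, zk in zip(toy_codebook[0], noise)]
    t = helper_description(noise, overload_level=4.0, num_cells=8)
    print(t, remotely_plausible_list(received, t, toy_codebook, 4.0, 8))
```

Because the description takes at most K+1 values regardless of the blocklength, the number of help bits does not grow with n.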

The constant M, which does not depend on the blocklength n, is chosen large enough to guarantee that the large-deviation probability of overload $\Pr[\|\mathbf{Z}\|^2/n \ge M]$ decays sufficiently fast in n so that the contribution of the overload to the ρ-th moment of the list is negligible, even if an overload results in the list containing all $e^{nR}$ codewords:

$$\lim_{n\to\infty} e^{n\rho R} \cdot \Pr\bigl[ n^{-1} \|\mathbf{Z}\|^2 \ge M \bigr] = 0. \tag{104}$$

(Upper bounds on the tail of the $\chi^2$ distribution show, for example, that for R<R0(ρ), the choice $M = \max\{2, 20\rho R_0(\rho)\}$ will do.) Since the help takes values in the finite set $\mathcal{T}_n = \{0, 1, \ldots, K\}$, where K does not depend on the blocklength, it is of zero rate.
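One way to check the parenthetical claim is through the standard Chernoff bound for a sum of n squared unit-variance Gaussians, Pr[‖Z‖²/n ≥ M] ≤ exp(−n(M − 1 − ln M)/2) for M ≥ 1. The paper does not say which tail bound it has in mind, so this particular bound, like the function names below, is an assumption of the sketch. Under it, (104) holds as soon as the exponent (M − 1 − ln M)/2 exceeds ρR, and since R<R0(ρ) it suffices that it exceed ρR0(ρ).

```python
import math


def chi_square_chernoff_exponent(level: float) -> float:
    """Exponent e(M) in Pr[||Z||^2 / n >= M] <= exp(-n e(M)), valid for M >= 1."""
    if level < 1.0:
        raise ValueError("the bound is stated for levels M >= 1")
    return 0.5 * (level - 1.0 - math.log(level))


def suggested_overload_level(rho_times_r0: float) -> float:
    """The choice M = max{2, 20 rho R0(rho)} mentioned in the text."""
    return max(2.0, 20.0 * rho_times_r0)


def overload_is_negligible(rho_times_r0: float) -> bool:
    """True if the Chernoff exponent at the suggested M exceeds rho R0(rho)."""
    level = suggested_overload_level(rho_times_r0)
    return chi_square_chernoff_exponent(level) > rho_times_r0


if __name__ == "__main__":
    for s in (0.01, 0.1, 1.0, 10.0):
        M = suggested_overload_level(s)
        print(s, M, chi_square_chernoff_exponent(M), overload_is_negligible(s))
```

The check succeeds for every positive value of ρR0(ρ) on the grid, consistent with the claim that this choice of M makes the overload term in the list-size moment vanish.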

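As in Section 4.3, we consider a random codebook $\{\mathbf{X}_m\}_{m=1,\ldots,e^{nR}}$ whose codewords are drawn independently from the conditional Gaussian distribution, i.e., from $Q^{(n)}$ defined in (35) and (36) with Q being $Q_G$, the centered variance-A Gaussian distribution. Using the same symmetry arguments, we also assume that the transmitted message is m=1 and study the ρ-th moment of the list under this assumption. Defining

$$V_m(\mathbf{x}_1,\mathbf{y}) = \mathbb{1}\bigl\{ \phi^{(n)}(\mathbf{y} - \mathbf{X}_m) = \phi^{(n)}(\mathbf{y} - \mathbf{x}_1) \bigr\}, \quad \mathbf{x}_1, \mathbf{y} \in \mathbb{R}^n, \tag{105}$$

we can express the ρ-th moment of the remotely-plausible list when m=1 as

$$\mathrm{E}\bigl[ |\mathcal{L}(\mathbf{Y},T)|^\rho \bigr] = \mathrm{E}\left[ \left( 1 + \sum_{m \ne 1} V_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \right]. \tag{106}$$

In view of Lemma 1, we need to prove that

$$\lim_{n\to\infty} \mathrm{E}\left[ \left( \sum_{m \ne 1} V_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \right] = 0, \tag{107}$$

where the expectation is over both the random choice of the codebook and the channel behavior.

To analyze the LHS of (107), we define for every $\mathbf{x}_1, \mathbf{y} \in \mathbb{R}^n$ and every message $m \ne 1$ the binary random variable

$$B_m(\mathbf{x}_1,\mathbf{y};\Delta) = \mathbb{1}\bigl\{ w(\mathbf{y}|\mathbf{X}_m) \ge w(\mathbf{y}|\mathbf{x}_1) \cdot e^{-\frac{n\Delta}{2}} \bigr\}. \tag{108}$$

Our analysis of $V_m(\mathbf{x}_1,\mathbf{y})$ depends on whether $\phi^{(n)}(\mathbf{y} - \mathbf{x}_1)$ differs from K (no overload) or equals K (corresponding to quantizer overload). In the former case, the random variable $V_m(\mathbf{x}_1,\mathbf{y})$ can be upper bounded by $B_m(\mathbf{x}_1,\mathbf{y};\Delta)$ because

\begin{align}
V_m(\mathbf{x}_1,\mathbf{y}) &= \mathbb{1}\bigl\{ \phi^{(n)}(\mathbf{y} - \mathbf{X}_m) = \phi^{(n)}(\mathbf{y} - \mathbf{x}_1) \bigr\} \tag{109}\\
&\le \mathbb{1}\bigl\{ \bigl| \|\mathbf{y} - \mathbf{X}_m\|^2 - \|\mathbf{y} - \mathbf{x}_1\|^2 \bigr| < n\Delta \bigr\} \tag{110}\\
&\le \mathbb{1}\bigl\{ \|\mathbf{y} - \mathbf{X}_m\|^2 \le \|\mathbf{y} - \mathbf{x}_1\|^2 + n\Delta \bigr\} \tag{111}\\
&= \mathbb{1}\Bigl\{ e^{-\frac{\|\mathbf{y} - \mathbf{X}_m\|^2}{2}} \ge e^{-\frac{\|\mathbf{y} - \mathbf{x}_1\|^2 + n\Delta}{2}} \Bigr\} \tag{112}\\
&= B_m(\mathbf{x}_1,\mathbf{y};\Delta), \tag{113}
\end{align}

where (110) holds because, for the case at hand, the equality of the helper's descriptions implies that $\|\mathbf{y} - \mathbf{X}_m\|^2$ and $\|\mathbf{y} - \mathbf{x}_1\|^2$ lie in the same interval of length nΔ. In the latter case, which is exponentially rare when M exceeds the noise variance, we simply upper bound $V_m(\mathbf{x}_1,\mathbf{y})$ by 1.

The ρ-th moment of the list can now be expressed using the law of total expectation as

\begin{align}
&\mathrm{E}\left[ \left( \sum_{m \ne 1} V_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \right] \notag\\
&\quad= \mathrm{E}\left[ \left( \sum_{m \ne 1} V_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \,\middle|\, T \ne K \right] \Pr[T \ne K] + \mathrm{E}\left[ \left( \sum_{m \ne 1} V_m(\mathbf{X}_1,\mathbf{Y}) \right)^{\rho} \,\middle|\, T = K \right] \Pr[T = K] \tag{114}\\
&\quad\le \mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{X}_1,\mathbf{Y};\Delta) \right)^{\rho} \,\middle|\, T \ne K \right] \Pr[T \ne K] + e^{n\rho R} \Pr[T = K] \tag{115}\\
&\quad\le \mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{X}_1,\mathbf{Y};\Delta) \right)^{\rho} \right] + e^{n\rho R} \Pr[T = K]. \tag{116}
\end{align}

The second term on the RHS of (116) tends to zero by (104). The first term is studied in the following lemma:

Lemma 3.

If ρ>0, Δ>0, and R<R0(ρ)−Δ, then

$$\lim_{n\to\infty} \mathrm{E}\left[ \left( \sum_{m \ne 1} B_m(\mathbf{X}_1,\mathbf{Y};\Delta) \right)^{\rho} \right] = 0. \tag{117}$$

Proof. 

See Appendix B. □

For a given R<R0(ρ), achievability is thus established using this lemma and (116) by picking M sufficiently large for (104) to hold, and then picking K large enough to guarantee that R<R0(ρ)−M/K so that, by Lemma 3, the first term on the RHS of (116) will also tend to zero.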

5.2. Case 2: Rh>0

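The key to proving the achievability of Rcutoff(ρ)+Rh is in showing that rate-Rh help can be utilized to increase the data rate by Rh, and that this can be done losslessly, with arbitrarily small (positive) power, and in one channel use. To show how this can be done, we show that, by using the channel once to send a single input that is bounded by $\sqrt{A}$ (with A any prespecified positive number) and using help taking values in the set $\mathcal{T} = \{0, \ldots, \kappa-1\}$, we can send error-free a message taking values in said set. To transmit $m \in \{0, \ldots, \kappa-1\}$, the encoder sends

$$x = m \cdot \frac{\sqrt{A}}{\kappa}, \tag{118}$$

which is upper-bounded by $\sqrt{A}$. Upon observing the noise Z, the helper produces the description T by quantizing the normalized noise and taking modulo, i.e.,

$$T = \left\lfloor Z \cdot \frac{\kappa}{\sqrt{A}} \right\rfloor \bmod \kappa, \tag{119}$$

which is an element of $\{0, \ldots, \kappa-1\}$. Based on Y and T, the decoder can calculate

$$\hat{m} = \left( \left\lfloor Y \cdot \frac{\kappa}{\sqrt{A}} \right\rfloor - T \right) \bmod \kappa, \tag{120}$$

which equals m, because

\begin{align}
\hat{m} &= \left( \left\lfloor (x + Z) \cdot \frac{\kappa}{\sqrt{A}} \right\rfloor - T \right) \bmod \kappa \tag{121}\\
&= \left( \left\lfloor m + Z \cdot \frac{\kappa}{\sqrt{A}} \right\rfloor - T \right) \bmod \kappa \tag{122}\\
&= \left( m + \left\lfloor Z \cdot \frac{\kappa}{\sqrt{A}} \right\rfloor - T \right) \bmod \kappa \tag{123}\\
&= m, \tag{124}
\end{align}

where (123) holds because m and T are both integers.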
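The one-shot building block of (118) to (124) is easy to simulate. The sketch below uses exact rational arithmetic (`fractions.Fraction`) so that the floor operations are not affected by floating-point rounding; that choice, the parameter name `amplitude` for the bound on the input magnitude, and the function names are assumptions of this sketch.

```python
import math
import random
from fractions import Fraction


def encode(m: int, kappa: int, amplitude: Fraction) -> Fraction:
    """Channel input of (118): m scaled into [0, amplitude)."""
    return Fraction(m) * amplitude / kappa


def help_symbol(z: Fraction, kappa: int, amplitude: Fraction) -> int:
    """Helper's description of (119): quantized normalized noise, reduced modulo kappa."""
    return math.floor(z * kappa / amplitude) % kappa


def decode(y: Fraction, t: int, kappa: int, amplitude: Fraction) -> int:
    """Decoder's estimate of (120)."""
    return (math.floor(y * kappa / amplitude) - t) % kappa


if __name__ == "__main__":
    rng = random.Random(0)
    kappa, amplitude = 16, Fraction(1, 10)
    for m in range(kappa):
        z = Fraction(rng.gauss(0.0, 1.0))   # exact rational value of the sampled float
        y = encode(m, kappa, amplitude) + z
        assert decode(y, help_symbol(z, kappa, amplitude), kappa, amplitude) == m
    print("all", kappa, "messages decoded without error")
```

The amplitude can be made as small as desired without affecting correctness, because the helper's description removes the noise's contribution to the quantizer index exactly; this is what lets the second phase of the scheme consume negligible power.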

Using this building-block, we can now prove the achievability of Rcutoff(ρ)+Rh by employing two-phase time sharing. Specifically, we propose the following blocklength-(n+1) scheme. In the first n channel uses, the helper operates at rate zero as in Section 5.1. By the achievability result proved in Section 5.1, for any R<R0(ρ), there exists a sequence of blocklength-n rate-R codebooks $\{\mathbf{x}_m\}_{m=1,\ldots,e^{nR}}$, with $\|\mathbf{x}_m\|^2 \le (n-1)A$ for every m, and zero-rate helpers $\phi(\mathbf{Z}^n)$, such that the remotely-plausible-list $\mathcal{L}(\mathbf{Y}^n, \phi(\mathbf{Z}^n))$ satisfies

$$\lim_{n\to\infty} \mathrm{E}\bigl[ |\mathcal{L}(\mathbf{Y}^n, \phi(\mathbf{Z}^n))|^\rho \bigr] = 1. \tag{125}$$

In the (n+1)-th channel-use we use the aforementioned coding scheme with κ being $e^{nR_h}$. Since that scheme is error-free, the overall remotely-plausible-list for the two phases has the same cardinality as that of the first phase, namely $|\mathcal{L}(\mathbf{Y}^n, \phi(\mathbf{Z}^n))|$, and hence, its ρ-th moment tends to 1 by (125).

The achievability now follows by verifying that the power of the transmitted input sequence $\mathbf{x}$ satisfies

$$\|\mathbf{x}\|^2 = \|\mathbf{x}^n\|^2 + x_{n+1}^2 \le nA + A = (n+1)A; \tag{126}$$

the rate of the helper is

$$\frac{1}{n+1} \bigl( 0 + nR_h \bigr) \tag{127}$$

and the rate achieved by the scheme is

$$\frac{1}{n+1} \bigl( nR_0(\rho) + nR_h \bigr) \tag{128}$$

which tend to Rh and R0(ρ)+Rh, respectively, as n tends to infinity.

Appendix A. Proof of Lemma 2

We shall establish that the expectation

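\begin{align}
&\mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] \tag{A1}\\
&\quad= \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} w(\mathbf{y}|\mathbf{x}_1) \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A2}
\end{align}

tends to zero as n tends to infinity whenever R<R0(ρ).

First notice that conditional on the transmitted codeword $\mathbf{x}_1$ and the channel output $\mathbf{y}$, the random variables $\{B_m\}_{m \ne 1}$ are IID Bernoulli, with $B_m$ determined by $\mathbf{X}_m$ and being of probability of success

\begin{align}
p(\mathbf{x}_1,\mathbf{y}) &= \Pr\bigl[ w(\mathbf{y}|\mathbf{X}_m) \ge w(\mathbf{y}|\mathbf{x}_1) \bigr] \tag{A3}\\
&= \Pr\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \ge w(\mathbf{y}|\mathbf{x}_1)^{\frac{1}{1+\rho}} \Bigr] \tag{A4}\\
&\le w(\mathbf{y}|\mathbf{x}_1)^{-\frac{1}{1+\rho}}\, \mathrm{E}\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \Bigr], \tag{A5}
\end{align}

where the last inequality follows from Markov's inequality. Thus

\begin{align}
&\mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] \notag\\
&\quad= \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} w(\mathbf{y}|\mathbf{x}_1) \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A6}\\
&\quad\le e^{n\rho R} \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} p(\mathbf{x}_1,\mathbf{y})^{\rho}\, w(\mathbf{y}|\mathbf{x}_1) \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A7}\\
&\quad\le e^{n\rho R} \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} \Bigl( \mathrm{E}\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \Bigr] \Bigr)^{\rho}\, w(\mathbf{y}|\mathbf{x}_1)^{\frac{1}{1+\rho}} \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A8}\\
&\quad= e^{n\rho R} \int_{\mathbf{y} \in \mathbb{R}^n} \Bigl( \mathrm{E}\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \Bigr] \Bigr)^{\rho} \int_{\mathbf{x}_1 \in \mathbb{R}^n} w(\mathbf{y}|\mathbf{x}_1)^{\frac{1}{1+\rho}} \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A9}\\
&\quad= e^{n\rho R} \int_{\mathbf{y} \in \mathbb{R}^n} \left( \int_{\mathbf{x} \in \mathbb{R}^n} w(\mathbf{y}|\mathbf{x})^{\frac{1}{1+\rho}} \, dQ^{(n)}(\mathbf{x}) \right)^{1+\rho} d\nu(\mathbf{y}) \tag{A10}\\
&\quad\le \left( \frac{e^{r\delta}}{\mu} \right)^{1+\rho} e^{n\rho R} \left( \int_{y \in \mathbb{R}} \left( \int_{x \in \mathbb{R}} e^{r(x^2 - A)}\, w(y|x)^{\frac{1}{1+\rho}} \, dQ_G(x) \right)^{1+\rho} d\nu(y) \right)^{n} \tag{A11}\\
&\quad= \left( \frac{e^{r\delta}}{\mu} \right)^{1+\rho} e^{n\rho R}\, e^{-n E_{0,\mathrm{m}}(\rho,Q_G,r)}, \tag{A12}
\end{align}

where (A11) follows from the upper bound (39) on the Radon–Nikodym derivative and holds for every r ≥ 0. Choosing r to be the value that achieves $E_{0,\mathrm{m}}^{*}(\rho,Q_G)$ (cf. (31)), we obtain

\begin{align}
\mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y}) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] &\le \left( \frac{e^{r\delta}}{\mu} \right)^{1+\rho} e^{n\rho R}\, e^{-n E_{0,\mathrm{m}}^{*}(\rho,Q_G)} \tag{A13}\\
&= \left( \frac{e^{r\delta}}{\mu} \right)^{1+\rho} e^{n\rho (R - R_0(\rho))}, \tag{A14}
\end{align}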

where the last equality follows from (74).

The Central Limit Theorem guarantees that, as n tends to infinity, μ approaches 1/2. Consequently, the RHS of (A14) tends to zero whenever R<R0(ρ).

Appendix B. Proof of Lemma 3

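To prove the lemma, we shall establish that, whenever R<R0(ρ)−Δ,

$$\lim_{n\to\infty} \mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y};\Delta) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] = 0, \tag{A15}$$

where the outer expectation is over $\mathbf{X}_1$ and $\mathbf{Y}$. From this (117) will follow in much the same way that (88) followed from (90) in Section 4.3.

To establish (A15), first note that, conditional on the transmitted codeword $\mathbf{x}_1$ and the channel output $\mathbf{y}$, the random variables $\{B_m(\mathbf{x}_1,\mathbf{y};\Delta)\}_{m \ne 1}$ are IID Bernoulli, with $B_m$ determined by $\mathbf{X}_m$ and being of probability of success

\begin{align}
p(\mathbf{x}_1,\mathbf{y};\Delta) &= \Pr\Bigl[ w(\mathbf{y}|\mathbf{X}_m) \ge w(\mathbf{y}|\mathbf{x}_1)\, e^{-\frac{n\Delta}{2}} \Bigr] \tag{A16}\\
&= \Pr\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \ge w(\mathbf{y}|\mathbf{x}_1)^{\frac{1}{1+\rho}}\, e^{-\frac{n\Delta}{2(1+\rho)}} \Bigr] \tag{A17}\\
&\le w(\mathbf{y}|\mathbf{x}_1)^{-\frac{1}{1+\rho}}\, e^{\frac{n\Delta}{2(1+\rho)}}\, \mathrm{E}\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \Bigr], \tag{A18}
\end{align}

where the last inequality follows from Markov's inequality. Consequently,

\begin{align}
&\mathrm{E}\left[ \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y};\Delta) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} \right] \notag\\
&\quad= \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} \left( \mathrm{E}\left[ \sum_{m \ne 1} B_m(\mathbf{x}_1,\mathbf{y};\Delta) \,\middle|\, \mathbf{X}_1 = \mathbf{x}_1, \mathbf{Y} = \mathbf{y} \right] \right)^{\rho} w(\mathbf{y}|\mathbf{x}_1) \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A19}\\
&\quad\le e^{n\rho R} \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} p(\mathbf{x}_1,\mathbf{y};\Delta)^{\rho}\, w(\mathbf{y}|\mathbf{x}_1) \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A20}\\
&\quad\le e^{n\rho R}\, e^{\frac{n\rho\Delta}{2(1+\rho)}} \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} \Bigl( \mathrm{E}\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \Bigr] \Bigr)^{\rho}\, w(\mathbf{y}|\mathbf{x}_1)^{\frac{1}{1+\rho}} \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}) \tag{A21}\\
&\quad< e^{n\rho\Delta}\, e^{n\rho R} \int_{\mathbf{y} \in \mathbb{R}^n} \int_{\mathbf{x}_1 \in \mathbb{R}^n} \Bigl( \mathrm{E}\Bigl[ w(\mathbf{y}|\mathbf{X}_m)^{\frac{1}{1+\rho}} \Bigr] \Bigr)^{\rho}\, w(\mathbf{y}|\mathbf{x}_1)^{\frac{1}{1+\rho}} \, dQ^{(n)}(\mathbf{x}_1) \, d\nu(\mathbf{y}), \tag{A22}
\end{align}

where (A22) holds because ρ,Δ>0 so $n\rho\Delta/(2(1+\rho)) < n\rho\Delta$.

Except for the $e^{n\rho\Delta}$ factor, the RHS of (A22) is identical to the RHS of (A8), which was shown to decay at least as fast as $e^{n\rho(R - R_0(\rho))}$; see (A14). It follows that the RHS of (A22) tends to zero whenever R+Δ<R0(ρ).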

Author Contributions

Writing—original draft preparation, A.L. and Y.Y.; writing—review and editing, A.L. and Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1. Gallager R.G. Information Theory and Reliable Communication. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 1968.
  • 2. Verdú S. Error exponents and α-mutual information. Entropy. 2021;23:199. doi: 10.3390/e23020199.
  • 3. Lapidoth A., Marti G., Yan Y. Other helper capacities; Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT); Victoria, Australia. 12–20 July 2021; pp. 1272–1277.
  • 4. Cover T.M. Elements of Information Theory. 2nd ed. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 2006.
  • 5. Kim Y. Capacity of a class of deterministic relay channels. IEEE Trans. Inf. Theory. 2008;54:1328–1329. doi: 10.1109/TIT.2007.915921.
  • 6. Bross S.I., Lapidoth A., Marti G. Decoder-assisted communications over additive noise channels. IEEE Trans. Commun. 2020;68:4150–4161. doi: 10.1109/TCOMM.2020.2984215.
  • 7. Lapidoth A., Marti G. Encoder-assisted communications over additive noise channels. IEEE Trans. Inf. Theory. 2020;66:6607–6616. doi: 10.1109/TIT.2020.3012629.
  • 8. Merhav N. On error exponents of encoder-assisted communication systems. IEEE Trans. Inf. Theory. 2021;67:7019–7029. doi: 10.1109/TIT.2021.3111541.
  • 9. Bunte C., Lapidoth A. Encoding tasks and Rényi entropy. IEEE Trans. Inf. Theory. 2014;60:5065–5076. doi: 10.1109/TIT.2014.2329490.
  • 10. Pinsker M.S., Sheverdjaev A.Y. Transmission capacity with zero error and erasure. Probl. Peredachi Informatsii. 1970;6:20–24.
  • 11. Csiszar I., Narayan P. Channel capacity for a given decoding metric. IEEE Trans. Inf. Theory. 1995;41:35–43. doi: 10.1109/18.370120.
  • 12. Telatar I.E. Zero-error list capacities of discrete memoryless channels. IEEE Trans. Inf. Theory. 1997;43:1977–1982. doi: 10.1109/18.641560.
  • 13. Ahlswede R., Cai N., Zhang Z. Erasure, list, and detection zero-error capacities for low noise and a relation to identification. IEEE Trans. Inf. Theory. 1996;42:55–62. doi: 10.1109/18.481778.
  • 14. Bunte C., Lapidoth A., Samorodnitsky A. The zero-undetected-error capacity approaches the Sperner capacity. IEEE Trans. Inf. Theory. 2014;60:3825–3833. doi: 10.1109/TIT.2014.2322624.
  • 15. Nakiboğlu B., Zheng L. Errors-and-erasures decoding for block codes with feedback. IEEE Trans. Inf. Theory. 2012;58:24–49. doi: 10.1109/TIT.2011.2169529.
  • 16. Bunte C., Lapidoth A. The zero-undetected-error capacity of discrete memoryless channels with feedback; Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton); Monticello, IL, USA. 1–5 October 2012; pp. 1838–1842.
  • 17. Bunte C., Lapidoth A. On the listsize capacity with feedback. IEEE Trans. Inf. Theory. 2014;60:6733–6748. doi: 10.1109/TIT.2014.2355815.
  • 18. Lapidoth A., Miliou N. Duality bounds on the cutoff rate with applications to Ricean fading. IEEE Trans. Inf. Theory. 2006;52:3003–3018. doi: 10.1109/TIT.2006.876349.
  • 19. Pfister C. Ph.D. Thesis. ETH Zurich; Zurich, Switzerland: 2019. On Rényi Information Measures and Their Applications.
  • 20. Rosenthal H.P. On the subspaces of Lp (p > 2) spanned by sequences of independent random variables. Isr. J. Math. 1970;8:273–303. doi: 10.1007/BF02771562.
  • 21. Arıkan E. An inequality on guessing and its application to sequential decoding. IEEE Trans. Inf. Theory. 1996;42:99–105. doi: 10.1109/18.481781.

Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
