The Listsize Capacity of the Gaussian Channel with Decoder Assistance

Amos Lapidoth; Yiming Yan

doi:10.3390/e24010029

2021 Dec 24;24(1):29. doi: 10.3390/e24010029


Editors: Luca Barletta, Alex Dytso

PMCID: PMC8774540 PMID: 35052055

Abstract

The listsize capacity is computed for the Gaussian channel with a helper that—cognizant of the channel-noise sequence but not of the transmitted message—provides the decoder with a rate-limited description of said sequence. This capacity is shown to equal the sum of the cutoff rate of the Gaussian channel without help and the rate of help. In particular, zero-rate help raises the listsize capacity from zero to the cutoff rate. This is achieved by having the helper provide the decoder with a sufficiently fine quantization of the normalized squared Euclidean norm of the noise sequence.

Keywords: bit pipe, cutoff rate, decoder assistance, Gaussian channel, helper, listsize capacity

1. Introduction

The order-$ρ$ listsize capacity $C_{\mathrm{list}}^{(ρ)}$ of a channel is the supremum of the coding rates for which there exist codes guaranteeing that, as the blocklength tends to infinity, the $ρ$-th moment of the cardinality of the list of messages that, given the received output sequence, have positive a posteriori probability converges to one. It is zero for the Gaussian channel because, on this channel, no codeword is ruled out by any received sequence, so said list contains all the messages. Here we derive this capacity for the Gaussian channel with a helper that observes the noise sequence and describes it to the decoder using a rate-limited noise-free bit pipe; see Figure 1.

Figure 1. Gaussian channel with decoder assistance.

We show that the listsize capacity $C_{\mathrm{list}}^{(ρ)} (R_{h})$ is then the sum of the bit pipe’s rate $R_{h}$ and the order-$ρ$ cutoff rate $R_{\mathrm{cutoff}} (ρ)$ of the Gaussian channel without a helper:

$C_{\mathrm{list}}^{(ρ)} (R_{h}) = R_{\mathrm{cutoff}} (ρ) + R_{h} .$ (1)

The latter’s definition is similar to that of the listsize capacity, but with the list now comprising only those messages that are a posteriori at least as likely as the transmitted one. As we shall see, for the Gaussian channel with average power $P$, noise variance $N$, and corresponding signal-to-noise ratio (SNR) $A ≜ P / N$,

$R_{\mathrm{cutoff}} (ρ) = R_{0} (ρ),$ (2)

where

$\begin{array}{rcl} R_{0} (ρ) & = & \frac{1}{2} \ln \left[ \frac{1}{2} \left( 1 + \frac{A}{1 + ρ} + \sqrt{\left( 1 - \frac{A}{1 + ρ} \right)^{2} + \frac{4 A}{(1 + ρ)^{2}}} \right) \right] \\ & & + \, \frac{1 + ρ}{2 ρ} \cdot \frac{1}{2} \left( 1 + \frac{A}{1 + ρ} - \sqrt{\left( 1 - \frac{A}{1 + ρ} \right)^{2} + \frac{4 A}{(1 + ρ)^{2}}} \right) \\ & & + \, \frac{1}{2 ρ} \ln \left[ \frac{1}{2} \left( 1 - \frac{A}{1 + ρ} + \sqrt{\left( 1 - \frac{A}{1 + ρ} \right)^{2} + \frac{4 A}{(1 + ρ)^{2}}} \right) \right] \end{array}$ (3)

(in nats) is a function that plays a prominent role in the analysis of the Reliability Function of said channel (Section 7.4 in [1]), [2]. That analysis does not, however, carry over directly to our setting because it deals with error exponents and not lists.
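The closed form (3) is easy to mistype, so a minimal numerical sketch may help. The function names `r0` and `cutoff_rate_rho1` are illustrative and not from the paper. The $ρ = 1$ expression used as a cross-check is obtained by combining the two logarithms of (3) algebraically, and the small-$ρ$ comparison with $\frac{1}{2} \ln (1 + A)$ is a limit that can be checked from (3) but is not stated in the text.

```python
import math


def r0(rho: float, snr: float) -> float:
    """Evaluate R_0(rho) of (3), in nats, for SNR A = P/N (here `snr`)."""
    if rho <= 0 or snr < 0:
        raise ValueError("need rho > 0 and snr >= 0")
    a = snr / (1.0 + rho)
    # The square root shared by all three terms of (3).
    root = math.sqrt((1.0 - a) ** 2 + 4.0 * snr / (1.0 + rho) ** 2)
    term1 = 0.5 * math.log(0.5 * (1.0 + a + root))
    term2 = (1.0 + rho) / (2.0 * rho) * 0.5 * (1.0 + a - root)
    term3 = 1.0 / (2.0 * rho) * math.log(0.5 * (1.0 - a + root))
    return term1 + term2 + term3


def cutoff_rate_rho1(snr: float) -> float:
    """Simplified form of (3) at rho = 1 (derived here, not quoted from the paper)."""
    s = math.sqrt(1.0 + snr ** 2 / 4.0)
    return 0.5 * (1.0 + snr / 2.0 - s) + 0.5 * math.log(0.5 * (1.0 + s))


if __name__ == "__main__":
    for rho in (0.5, 1.0, 2.0):
        print(f"rho={rho}: R_0 = {r0(rho, 3.0):.4f} nats")
```

In this sketch the values decrease as $ρ$ grows, in line with the monotonicity established in Section 4.2.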

It is interesting to note that (1) also holds when the help rate $R_{h}$ is zero: the number of help bits required to increase the listsize capacity from zero to $R_{\mathrm{cutoff}} (ρ)$ is sublinear in the blocklength. In fact, as we shall see, all it takes is a sufficiently fine quantization of the normalized squared Euclidean norm of the noise sequence.

The relation (1) is reminiscent of the analogous result on the erasures-only capacity $C_{\mathrm{e\text{-}o}} (R_{h})$ of the Gaussian channel with a rate-$R_{h}$ helper (Remark 10 in [3]), namely, that

$C_{\mathrm{e\text{-}o}} (R_{h}) = C + R_{h},$ (4)

where $C$ denotes the Shannon capacity of the Gaussian channel (without help) (Theorem 9.1.1 in [4]), and $C_{\mathrm{e\text{-}o}} (R_{h})$ is the erasures-only capacity, which is defined like $C_{\mathrm{list}}^{(ρ)} (R_{h})$ but with the requirement on the $ρ$-th moment of the list replaced by the requirement that the list be of size 1 with probability tending to one. (The Gaussian erasures-only capacity with a helper is given by the RHS of (4) irrespective of whether the assistance is provided to the encoder or decoder.) The latter result in turn is reminiscent of the analogous result on the Shannon capacity with a helper $C (R_{h})$ [5,6,7,8]

$C (R_{h}) = C + R_{h} .$ (5)

In proving (1), we shall focus on the “direct part,” i.e., that the right-hand side (RHS) of (1) is achievable. The “converse,” that no rate exceeding the RHS of (1) is achievable, is omitted because it follows directly from (Remark 4 in [3]): There it is shown that this is true even if, given the received sequence and the provided help, the list contains only a subset of the messages that are of positive a posteriori probability, namely, those that are a posteriori at least as likely as the transmitted message.

The listsize capacity is relevant, for example, when the message set corresponds to tasks [9] and the transmitted message corresponds to one that must be performed by the decoder with absolute certainty. To ensure this, the decoder must perform all the tasks in the list of tasks that are not ruled out by the received sequence. (In addition to the transmitted task, other tasks need not but may be performed.) The $ρ$ -th moment of the list’s size then measures the receiver’s average effort.

Results on the listsize capacity and the erasures-only capacity of general discrete memoryless channels (DMCs) in the absence of help are scarce. Noteworthy exceptions are the results of Pinsker and Sheverdjaev [10], Csiszár and Narayan [11], and Telatar [12], that provide sufficient conditions for the erasures-only capacity to equal the Shannon capacity and for the listsize capacity to equal the cutoff rate. Asymptotic results on the erasures-only capacity in the low-noise regime can be found in [13,14]. Once noiseless feedback is introduced, the problems become more tractable [15,16,17].

The rest of the paper is organized as follows. Section 2 describes our set-up and presents the main result. Section 3 contains some classical and some new observations regarding Gallager’s $E_{0}$ function and its modification. Section 4 derives the cutoff rate of the Gaussian channel without help and proves (2). Section 5 describes and analyzes a coding scheme that proves the direct part of (1).

2. The Main Result

A power-$P$ blocklength-$n$ encoder $f^{(n)}$ for a message set $\mathcal{M}$ is a mapping

$f^{(n)} : \mathcal{M} \to \mathbb{R}^{n}$ (6)

that maps each message $m \in \mathcal{M}$ to an $n$-tuple $f^{(n)} (m)$ whose Euclidean norm $∥ f^{(n)} (m) ∥$ satisfies

$∥ f^{(n)} (m) ∥^{2} \leq n P, \quad m \in \mathcal{M} .$ (7)

We sometimes use $x_{m}$ to denote $f^{(n)} (m)$, and $x_{m, k}$ to denote the $k$-th component of $x_{m}$, so

$f^{(n)} (m) = x_{m} = (x_{m, 1}, \dots, x_{m, n}) .$ (8)

The encoder is said to be of rate $R$ if the cardinality of $\mathcal{M}$ is $e^{n R}$, in which case we often assume that $\mathcal{M} = \{1, \dots, e^{n R}\}$. (We ignore the fact that $e^{n R}$ need not be an integer; this issue washes out in the large-$n$ asymptotics we study.)

When a message $m \in \mathcal{M}$ is sent over the discrete-time additive Gaussian noise channel using the encoder $f^{(n)}$, the channel produces the random vector $Y \in \mathbb{R}^{n}$ whose $k$-th component $Y_{k}$ is

$Y_{k} = x_{m, k} + Z_{k}, \quad k = 1, \dots, n,$ (9)

where $\{Z_{k}\}$ are independent and identically distributed (IID) zero-mean Gaussians of variance $N$. We assume that $N$ is positive and use $w (y | x)$ to denote the density of the channel’s output when its input is $x$, i.e., the mean-$x$ variance-$N$ Gaussian density

$w (y | x) = \frac{1}{\sqrt{2 π N}} e^{- \frac{(y - x)^{2}}{2 N}}, \quad x, y \in \mathbb{R},$ (10)

which we extend to $n$-tuples in a memoryless fashion:

$w (y | x) = \prod_{k = 1}^{n} w (y_{k} | x_{k}), \quad x, y \in \mathbb{R}^{n} .$ (11)

For convenience, we define

$A = \frac{P}{N} .$ (12)

Given an output sequence $y$ and a message $m$, we define the “at-least-as-likely list”

$L (m, y) = \{m^{\prime} \in \mathcal{M} : w (y | x_{m^{\prime}}) \geq w (y | x_{m})\} .$ (13)

Assuming, as we do, that the messages are a priori equally likely, this list comprises the messages that, given the output sequence $y$, are a posteriori at least as likely as $m$.
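To make definition (13) concrete, the following minimal sketch evaluates the at-least-as-likely list for a toy four-codeword codebook of blocklength 2. The codebook, the received vector, and the helper names `log_likelihood` and `at_least_as_likely_list` are illustrative assumptions, not taken from the paper. On the Gaussian channel, comparing likelihoods amounts to comparing Euclidean distances to the received sequence.

```python
import math


def log_likelihood(y, x, noise_var=1.0):
    """ln w(y|x) from (10)-(11) for real n-tuples y and x."""
    n = len(y)
    sq = sum((yk - xk) ** 2 for yk, xk in zip(y, x))
    return -0.5 * n * math.log(2.0 * math.pi * noise_var) - sq / (2.0 * noise_var)


def at_least_as_likely_list(m, y, codebook, noise_var=1.0):
    """L(m, y) of (13): messages whose likelihood is at least that of m."""
    ref = log_likelihood(y, codebook[m], noise_var)
    return sorted(
        mp for mp, x in codebook.items()
        if log_likelihood(y, x, noise_var) >= ref
    )


# Toy codebook (hypothetical): four codewords at the corners of a square.
codebook = {1: (1.0, 1.0), 2: (-1.0, 1.0), 3: (-1.0, -1.0), 4: (1.0, -1.0)}
y = (0.2, 0.9)  # received sequence, closest to codeword 1

if __name__ == "__main__":
    for m in codebook:
        print(m, at_least_as_likely_list(m, y, codebook))
```

The list always contains $m$ itself, and it grows as $m$ becomes less likely given $y$.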

If a message $M$, drawn equiprobably from $\mathcal{M}$, is transmitted over the channel with a resulting received sequence $Y$, then the cardinality of the at-least-as-likely list is a random positive integer, and we denote its $ρ$-th moment $E [| L (M, Y) |^{ρ}]$:

$E [| L (M, Y) |^{ρ}] = \frac{1}{| \mathcal{M} |} \sum_{m \in \mathcal{M}} \int w (y | x_{m}) \, | L (m, y) |^{ρ} \, d ν (y),$ (14)

where $ν (\cdot)$ denotes the Lebesgue measure on $\mathbb{R}^{n}$.

For a given $ρ > 0$, we define the order-$ρ$ cutoff rate $R_{\mathrm{cutoff}} (ρ)$ as the supremum of the rates $R$ for which there exists a sequence of rate-$R$ power-$P$ blocklength-$n$ encoders $\{f^{(n)}\}$ satisfying

$\lim_{n \to \infty} E [| L (M, Y) |^{ρ}] = 1 .$ (15)

Theorem 1.

The order-$ρ$ cutoff rate $R_{\mathrm{cutoff}} (ρ)$ of the additive Gaussian noise channel equals $R_{0} (ρ)$ of (3).

Proof.

See Section 4. □

A $\mathcal{T}_{n}$-valued description of the noise sequence $Z = (Z_{1}, \dots, Z_{n})$ is a mapping

$ϕ^{(n)} : \mathbb{R}^{n} \to \mathcal{T}_{n}$ (16)

with the understanding that $ϕ^{(n)} (Z)$, which we denote $T$, is the description of $Z$. We say that a sequence $\{ϕ^{(n)}\}$ of descriptions is of rate $R_{h}$ (nats) if

$\lim_{n \to \infty} \frac{1}{n} \ln | \mathcal{T}_{n} | = R_{h} .$ (17)

Suppose now that, in addition to the received sequence $Y$, the receiver is also presented with the description $T = ϕ^{(n)} (Z)$ of the noise, and that, based on the two, it forms the “remotely-plausible list” $L (Y, T)$ comprising the messages that have positive a posteriori probability given the two:

$L (y, t) = \{m \in \mathcal{M} : ϕ^{(n)} (y - x_{m}) = t\} .$ (18)

Given $ρ > 0$, the listsize capacity $C_{\mathrm{list}}^{(ρ)} (R_{h})$ with rate-$R_{h}$ decoder assistance is the supremum of the rates $R$ for which there exists a sequence of rate-$R$ power-$P$ blocklength-$n$ encoders $\{f^{(n)}\}$ and a sequence $\{ϕ^{(n)}\}$ of descriptions of rate $R_{h}$ such that

$\lim_{n \to \infty} E [| L (Y, ϕ^{(n)} (Z)) |^{ρ}] = 1 .$ (19)

Theorem 2.

On the Gaussian channel, the listsize capacity with rate-$R_{h}$ decoder assistance $C_{\mathrm{list}}^{(ρ)} (R_{h})$ is given by

$C_{\mathrm{list}}^{(ρ)} (R_{h}) = R_{\mathrm{cutoff}} (ρ) + R_{h},$ (20)

where $R_{\mathrm{cutoff}} (ρ)$ is the order-$ρ$ cutoff rate of the channel (without assistance) as given in (2) and (3).

Proof.

The “converse,” that (19) cannot be achieved when the rate exceeds the RHS of (20), follows from (Remark 4 in [3]). The “direct part,” describing a coding scheme that achieves (19) with rates approaching the RHS of (20), is proved in Section 5. □

3. Preliminaries

Given $ρ \geq 0$ and any probability measure $Q$ on $\mathbb{R}$, Gallager’s $E_{0}$ function for our channel is defined as [1]

$E_{0} (ρ, Q) = - \ln \int_{y \in \mathbb{R}} \left( \int_{x \in \mathbb{R}} w (y | x)^{\frac{1}{1 + ρ}} \, d Q (x) \right)^{1 + ρ} d ν (y),$ (21)

where $ν (\cdot)$ is now the Lebesgue measure on $\mathbb{R}$. The result of maximizing $E_{0} (ρ, Q)$ over all $Q$ under which $E [X^{2}] \leq P$ is denoted $E_{0}^{*} (ρ)$:

$E_{0}^{*} (ρ) = \sup_{Q : \int x^{2} d Q (x) \leq P} E_{0} (ρ, Q) .$ (22)
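As a sanity check on definition (21), the following worked example evaluates $E_{0}$ for one specific input distribution. It is not taken from the text: the choice of $Q_{G}$ as the zero-mean variance-$P$ Gaussian and the auxiliary symbol $V$ are assumptions made here for illustration, and only standard Gaussian integrals are used.

```latex
% Worked example (assumption: Q_G is the zero-mean variance-P Gaussian input,
% which meets the constraint E[X^2] <= P with equality; w is the density (10)).
\begin{aligned}
\int_{\mathbb{R}} w(y|x)^{\frac{1}{1+\rho}} \, dQ_G(x)
  &= (2\pi N)^{-\frac{1}{2(1+\rho)}} \sqrt{2\pi N(1+\rho)} \;
     \frac{1}{\sqrt{2\pi V}} \, e^{-\frac{y^2}{2V}},
  \qquad V = N(1+\rho) + P, \\
\int_{\mathbb{R}} \Bigl( \int_{\mathbb{R}} w(y|x)^{\frac{1}{1+\rho}} \, dQ_G(x) \Bigr)^{1+\rho} dy
  &= \Bigl( \frac{N(1+\rho)}{V} \Bigr)^{\rho/2}, \\
E_0(\rho, Q_G)
  &= \frac{\rho}{2} \ln\Bigl( 1 + \frac{A}{1+\rho} \Bigr),
  \qquad A = P/N .
\end{aligned}
```

Since $Q_{G}$ is feasible in (22), this computation gives the lower bound $E_{0}^{*} (ρ) \geq \frac{ρ}{2} \ln (1 + \frac{A}{1 + ρ})$.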

The multi-letter extension of $E_{0}$ is

$E_{0}^{(n)} (ρ, Q^{(n)}) = - \frac{1}{n} \ln \int_{y \in \mathbb{R}^{n}} \left( \int_{x \in \mathbb{R}^{n}} w (y | x)^{\frac{1}{1 + ρ}} \, d Q^{(n)} (x) \right)^{1 + ρ} d ν (y),$ (23)

where $Q^{(n)}$ is a probability measure on $\mathbb{R}^{n}$, the integrals are over $\mathbb{R}^{n}$, and the channel $w (y | x)$ is defined in (11). Similarly,

$E_{0}^{(n), *} (ρ) = \sup_{Q^{(n)} : \int ∥ x ∥^{2} d Q^{(n)} (x) \leq n P} E_{0}^{(n)} (ρ, Q^{(n)}) .$ (24)

Given probability measures $Q^{(m)}$ on $\mathbb{R}^{m}$ and $Q^{(n)}$ on $\mathbb{R}^{n}$ that satisfy the power constraints $E [∥ X ∥^{2}] \leq m P$ and $E [∥ X ∥^{2}] \leq n P$ respectively, the product measure $Q^{(m)} \times Q^{(n)}$ on $\mathbb{R}^{m + n}$ satisfies the power constraint $E [∥ X ∥^{2}] \leq (m + n) P$ and

$(m + n) E_{0}^{(m + n)} (ρ, Q^{(m)} \times Q^{(n)}) = m E_{0}^{(m)} (ρ, Q^{(m)}) + n E_{0}^{(n)} (ρ, Q^{(n)})$ (25)

because

$\begin{array}{rcll} (m + n) E_{0}^{(m + n)} (ρ, Q^{(m)} \times Q^{(n)}) & = & - \ln \int_{y \in \mathbb{R}^{m + n}} \left( \int_{x \in \mathbb{R}^{m + n}} w (y | x)^{\frac{1}{1 + ρ}} \, d (Q^{(m)} \times Q^{(n)}) (x) \right)^{1 + ρ} d ν (y) & (26) \\ & = & m E_{0}^{(m)} (ρ, Q^{(m)}) + n E_{0}^{(n)} (ρ, Q^{(n)}), & (27) \end{array}$

where (27) holds because the channel (11) is memoryless and the input distribution is a product, so the integral in (26) factorizes into an $m$-dimensional and an $n$-dimensional integral. The sequence $\{n E_{0}^{(n), *} (ρ)\}$ is thus superadditive, and Fekete’s lemma (in its superadditive form) implies that $\{E_{0}^{(n), *} (ρ)\}$ converges to its supremum:

$\lim_{n \to \infty} E_{0}^{(n), *} (ρ) = \sup_{n} E_{0}^{(n), *} (ρ) .$ (28)
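The role of Fekete's lemma in (28) can be illustrated on a toy sequence. The sketch below does not compute $E_{0}^{(n), *} (ρ)$; it uses the hypothetical superadditive sequence $a_{n} = n - \sqrt{n}$, chosen here only because its normalized version $a_{n} / n$ visibly increases to its supremum 1.

```python
import math


def a(n: int) -> float:
    """Toy superadditive sequence (illustrative only): a_n = n - sqrt(n)."""
    return n - math.sqrt(n)


def is_superadditive(seq, n_max: int) -> bool:
    """Check seq(m + n) >= seq(m) + seq(n) for 1 <= m, n <= n_max."""
    return all(
        seq(m + n) >= seq(m) + seq(n) - 1e-12  # small slack for rounding
        for m in range(1, n_max + 1)
        for n in range(1, n_max + 1)
    )


# a_n / n = 1 - 1/sqrt(n) approaches sup_n a_n / n = 1, as Fekete's lemma predicts.
ratios = [a(n) / n for n in (1, 10, 100, 10_000, 1_000_000)]

if __name__ == "__main__":
    print(is_superadditive(a, 60), ratios)
```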

We shall later see (cf. (55) ahead) that

\frac{1}{ρ} \sup_{n} E_{0}^{(n), *} (ρ) = R_{0} (ρ),

(29)

where $R_{0} (ρ)$ is defined in (3).

We shall also need Gallager’s modified $E_{0}$ function. To highlight its relation to the unmodified function, which is quite general, we shall use $g (x)$ for $x^{2}$ when $x$ is a scalar and $g (\mathbf{x})$ for $∥ \mathbf{x} ∥^{2}$ when $\mathbf{x}$ is an $n$-tuple. We shall also replace $P$ with $Γ$.

Given some $ρ \geq 0$, some probability distribution $Q$ on $\mathbb{R}$ under which $E [g (X)] \leq Γ$, and some $r \geq 0$, the modified Gallager’s $E_{0}$ function $E_{0, m} (ρ, Q, r)$ is defined as

$E_{0, m} (ρ, Q, r) = - \ln \int_{y \in \mathbb{R}} \left( \int_{x \in \mathbb{R}} e^{r (g (x) - Γ)} w (y | x)^{\frac{1}{1 + ρ}} \, d Q (x) \right)^{1 + ρ} d ν (y) .$ (30)

We shall also be interested in the maximum of $E_{0, m} (ρ, Q, r)$ over both $Q$ and $r$. We distinguish between two cases depending on whether $E [g (X)] \leq Γ$ holds with strict inequality or with equality. In the former case we only allow $r$ to be zero, whereas in the latter case it can be any non-negative number. We thus define

$E_{0, m}^{*} (ρ, Q) = \begin{cases} \sup_{r \geq 0} E_{0, m} (ρ, Q, r), & \text{if } \int g (x) \, d Q (x) = Γ, \\ E_{0} (ρ, Q), & \text{if } \int g (x) \, d Q (x) < Γ, \end{cases}$ (31)

and

$E_{0, m}^{* *} (ρ) = \sup_{Q : \int g (x) d Q (x) \leq Γ} E_{0, m}^{*} (ρ, Q) .$ (32)

The next proposition provides a lower bound on $\lim_{n \to \infty} E_{0}^{(n), *} (ρ)$.

Proposition 1.

Any probability distribution $Q$ on $\mathbb{R}$ under which $g (X)$ is of finite second moment and of expectation $Γ$ provides the lower bound

$\lim_{n \to \infty} E_{0}^{(n), *} (ρ) \geq E_{0, m}^{*} (ρ, Q) .$ (33)

Proof.

Let $Q$ be any input distribution under which $g (X)$ is of finite second moment and $E [g (X)] = Γ$. For each $n \in \mathbb{N}$, let $Q^{(n)}$ be the conditional distribution of the $n$-fold product distribution $Q^{\times n}$ given the event $\{X \in A_{n}\}$, where

$A_{n} = \{x \in \mathbb{R}^{n} : n Γ - δ < g (x) \leq n Γ\}$ (34)

and $δ > 0$ is some positive constant. Thus, for every Borel measurable subset $B$ of $\mathbb{R}^{n}$,

$Q^{(n)} (B) = \frac{1}{μ} Q^{\times n} (B \cap A_{n}), \quad B \in \mathcal{B} (\mathbb{R}^{n}),$ (35)

with

$μ = Q^{\times n} (A_{n}) .$ (36)

For any $r \geq 0$, we can upper-bound the Radon–Nikodym derivative of $Q^{(n)}$ with respect to the product distribution $Q^{\times n}$ as follows:

$\begin{array}{rcll} \frac{d Q^{(n)}}{d Q^{\times n}} (x) & = & \frac{1}{μ} I \{x \in A_{n}\} & (37) \\ & \leq & \frac{1}{μ} e^{r (g (x) - n Γ + δ)} & (38) \\ & = & \frac{1}{μ} e^{r δ} e^{r (g (x) - n Γ)}, & (39) \end{array}$

where $I \{\text{statement}\}$ equals 1 if the statement is true and 0 otherwise, and where (38) holds because $g (x) - n Γ + δ$ is positive on $A_{n}$. Using this bound on the Radon–Nikodym derivative we obtain:

$\begin{array}{rcll} E_{0}^{(n)} (ρ, Q^{(n)}) & = & - \frac{1}{n} \ln \int_{y \in \mathbb{R}^{n}} \left( \int_{x \in \mathbb{R}^{n}} w (y | x)^{\frac{1}{1 + ρ}} \, d Q^{(n)} (x) \right)^{1 + ρ} d ν (y) & (40) \\ & \geq & - \frac{1}{n} \ln \int_{y \in \mathbb{R}^{n}} \left( \int_{x \in \mathbb{R}^{n}} w (y | x)^{\frac{1}{1 + ρ}} \cdot \frac{1}{μ} e^{r δ} e^{r (g (x) - n Γ)} \, d Q^{\times n} (x) \right)^{1 + ρ} d ν (y) & (41) \\ & = & - \frac{1 + ρ}{n} \ln \frac{e^{r δ}}{μ} - \ln \int_{y \in \mathbb{R}} \left( \int_{x \in \mathbb{R}} e^{r (g (x) - Γ)} w (y | x)^{\frac{1}{1 + ρ}} \, d Q (x) \right)^{1 + ρ} d ν (y) & (42) \\ & = & - \frac{1 + ρ}{n} \ln \frac{e^{r δ}}{μ} + E_{0, m} (ρ, Q, r) . & (43) \end{array}$

By a local form of the Central Limit Theorem, $μ$ decays only polynomially in $n$ (on the order of $n^{- 1 / 2}$ when, as in the Gaussian case used later, $g (X)$ has a density), so $n^{- 1} \ln μ$ tends to zero and (43) implies that

$\liminf_{n \to \infty} E_{0}^{(n)} (ρ, Q^{(n)}) \geq E_{0, m} (ρ, Q, r) .$ (44)

Taking the supremum of the RHS over all $r \geq 0$ establishes that

$\liminf_{n \to \infty} E_{0}^{(n)} (ρ, Q^{(n)}) \geq E_{0, m}^{*} (ρ, Q)$ (45)

and hence, because $Q^{(n)}$ satisfies the cost constraint in (24), proves (33). □

We next turn to upper-bounding $lim E_{0}^{(n), *} (ρ)$ .

Proposition 2.

If the probability distribution $Q^{(n)}$ on $R^{n}$ is such that $E [g (X)] \leq n Γ$ , and if $f_{R}$ is any density on $R$ , then

$\begin{matrix} E_{0}^{(n)} (ρ, Q^{(n)}) & \leq & \sup_{P : \int g (x) d P (x) \leq Γ} - (1 + ρ) \int_{x \in R} \ln (\int_{y \in R} w {(y | x)}^{\frac{1}{1 + ρ}} f_{R} {(y)}^{\frac{ρ}{1 + ρ}} d y) d P (x) \end{matrix}$ (46)

and, consequently,

$\begin{matrix} lim_{n \to \infty} E_{0}^{(n), *} (ρ) & \leq & \sup_{P : \int g (x) d P (x) \leq Γ} - (1 + ρ) \int_{x \in R} \ln (\int_{y \in R} w {(y | x)}^{\frac{1}{1 + ρ}} f_{R} {(y)}^{\frac{ρ}{1 + ρ}} d y) d P (x) . \end{matrix}$ (47)

Proof.

The proof is based on Proposition 2 in [18], which implies that for every density $f_{R}^{(n)}$ on $R^{n}$ and any probability measure $Q^{(n)}$ on $R^{n}$ ,

$n E_{0}^{(n)} (ρ, Q^{(n)}) \leq - (1 + ρ) \int_{x \in R^{n}} \ln (\int_{y \in R^{n}} w {(y | x)}^{\frac{1}{1 + ρ}} f_{R}^{(n)} {(y)}^{\frac{ρ}{1 + ρ}} d y) d Q^{(n)} (x) .$ (48)

Applying this inequality to the product density

$f_{R}^{(n)} (y) = \prod_{i = 1}^{n} f_{R} (y_{i}),$ (49)

where $f_{R}$ is a density on $R$ , and using the product form of the channel (11), we obtain that for any density $f_{R}$ on $R$

$\begin{array}{rcll} E_{0}^{(n)} (ρ, Q^{(n)}) & \leq & - \frac{1}{n} (1 + ρ) \sum_{i = 1}^{n} \int_{x_{i} \in \mathbb{R}} \ln \left( \int_{y \in \mathbb{R}} w (y | x_{i})^{\frac{1}{1 + ρ}} f_{R} (y)^{\frac{ρ}{1 + ρ}} \, d y \right) d Q_{i}^{(n)} (x_{i}) & (50) \\ & = & - (1 + ρ) \int_{x \in \mathbb{R}} \ln \left( \int_{y \in \mathbb{R}} w (y | x)^{\frac{1}{1 + ρ}} f_{R} (y)^{\frac{ρ}{1 + ρ}} \, d y \right) d \bar{Q} (x), & (51) \end{array}$

where $Q_{i}^{(n)}$ is the i-th marginal of $Q^{(n)}$ , and $\bar{Q}$ is the probability measure on $R$ defined by

$\bar{Q} = \frac{1}{n} \sum_{i = 1}^{n} Q_{i}^{(n)} .$ (52)

Observe that if $E [g (X)] \leq n Γ$ under $Q^{(n)}$ , then $E [g (X)] \leq Γ$ under $\bar{Q}$ . This observation and (51) establish (46). Since (46) holds for all n, (47) must also hold. □

4. The Cutoff Rate of the Gaussian Channel

In this section, we prove Theorem 1. Since scaling the output does not change the cutoff rate, we will assume WLOG that the noise variance is 1 and the transmit power is $A$ ; see (12). Thus,

w (y | x) = \frac{1}{\sqrt{2 π}} e^{- \frac{{(y - x)}^{2}}{2}}, x, y \in R,

(53)

and each codeword $x_{m}$ satisfies

\begin{matrix} ∥ x_{m} ∥^{2} \leq n A . \end{matrix}

(54)

4.1. Computing $lim E_{0}^{(n), *} (ρ)$

Here we shall establish that on the Gaussian channel (53)

lim_{n \to \infty} E_{0}^{(n), *} (ρ) = ρ R_{0} (ρ) = E_{0, m}^{*} (ρ, Q_{G}),

(55)

where $R_{0} (ρ)$ is defined in (3), and $Q_{G}$ is the zero-mean variance- $A$ Gaussian distribution. To this end, we shall derive matching upper and lower bounds on the limit. We begin with the former.

4.1.1. Upper-Bounding $lim E_{0}^{(n), *} (ρ)$

We show that on the channel (53)

$\lim_{n \to \infty} E_{0}^{(n), *} (ρ) \leq ρ R_{0} (ρ) .$ (56)

The proof is based on Proposition 2 with the density $f_{R}$ corresponding to a centered Gaussian of variance $σ^{2}$ , where

\begin{matrix} σ^{2} = \frac{A}{(1 + ρ) β} + 1 \end{matrix}

(57)

and

\begin{matrix} β = \frac{1}{2} (1 - \frac{A}{1 + ρ} + \sqrt{{(1 - \frac{A}{1 + ρ})}^{2} + \frac{4 A}{{(1 + ρ)}^{2}}}) . \end{matrix}

(58)

Evaluating the RHS of (47) for this density, we obtain

$\begin{array}{ll} \sup_{P : E [X^{2}] \leq A} - (1 + ρ) \int_{x \in \mathbb{R}} \ln \left( \int_{y \in \mathbb{R}} w (y | x)^{\frac{1}{1 + ρ}} f_{R} (y)^{\frac{ρ}{1 + ρ}} \, d y \right) d P (x) & (59) \\ \quad = \sup_{P : E [X^{2}] \leq A} - (1 + ρ) \int_{x \in \mathbb{R}} \ln \left( \int_{y \in \mathbb{R}} \frac{1}{(\sqrt{2 π})^{\frac{1}{1 + ρ}}} e^{- \frac{(y - x)^{2}}{2 (1 + ρ)}} \frac{1}{(\sqrt{2 π σ^{2}})^{\frac{ρ}{1 + ρ}}} e^{- \frac{y^{2} ρ}{2 σ^{2} (1 + ρ)}} \, d y \right) d P (x) & (60) \\ \quad = \sup_{P : E [X^{2}] \leq A} - (1 + ρ) \int_{x \in \mathbb{R}} \ln \left( \sqrt{\frac{2 π (1 + ρ)^{2}}{ρ}} \, (\sqrt{σ^{2}})^{\frac{1}{1 + ρ}} \frac{1}{\sqrt{2 π σ_{1}^{2}}} e^{- \frac{x^{2}}{2 σ_{1}^{2}}} \right) d P (x) & (61) \\ \quad = \sup_{P : E [X^{2}] \leq A} \left\{ - (1 + ρ) \ln \left( \sqrt{\frac{(1 + ρ)^{2}}{ρ}} \, (\sqrt{σ^{2}})^{\frac{1}{1 + ρ}} \frac{1}{\sqrt{σ_{1}^{2}}} \right) + (1 + ρ) \int_{x \in \mathbb{R}} \frac{x^{2}}{2 σ_{1}^{2}} \, d P (x) \right\} & (62) \\ \quad = - (1 + ρ) \ln \left( \sqrt{\frac{(1 + ρ)^{2}}{ρ}} \, (\sqrt{σ^{2}})^{\frac{1}{1 + ρ}} \frac{1}{\sqrt{σ_{1}^{2}}} \right) + (1 + ρ) \frac{A}{2 σ_{1}^{2}} & (63) \\ \quad = \frac{1 + ρ}{2} \frac{A}{σ_{1}^{2}} + \frac{1 + ρ}{2} \ln σ_{1}^{2} - \frac{1}{2} \ln σ^{2} - \frac{1 + ρ}{2} \ln \frac{(1 + ρ)^{2}}{ρ}, & (64) \end{array}$

where in (61) we defined

$\begin{array}{rcll} σ_{1}^{2} & ≜ & 1 + ρ + \frac{σ^{2} (1 + ρ)}{ρ} & (65) \\ & = & \frac{A}{ρ β} + \frac{(1 + ρ)^{2}}{ρ} . & (66) \end{array}$

To conclude the proof, it remains to show that the RHS of (64) coincides with $ρ R_{0} (ρ)$ . To this end, observe that some basic algebra reveals that

\begin{matrix} β (β - 1 + \frac{A}{1 + ρ}) & = & \frac{A}{{(1 + ρ)}^{2}} \end{matrix}

(67)

and

\begin{matrix} (β + \frac{A}{1 + ρ}) (1 - β) & = & \frac{A ρ}{{(1 + ρ)}^{2}} . \end{matrix}

(68)

Therefore, the first term in (64) can be rewritten as

$\begin{array}{rcll} \frac{1 + ρ}{2} \frac{A}{σ_{1}^{2}} & = & \frac{1 + ρ}{2} \frac{A}{\frac{A}{ρ β} + \frac{(1 + ρ)^{2}}{ρ}} = \frac{1 + ρ}{2} \frac{A ρ}{(1 + ρ)^{2}} \frac{β}{\frac{A}{(1 + ρ)^{2}} + β} & (69) \\ & = & \frac{A ρ}{2 (1 + ρ)} \frac{1}{β + \frac{A}{1 + ρ}} = \frac{(1 + ρ) (1 - β)}{2}, & (70) \end{array}$

where the two equalities in (70) follow from (67) and (68), respectively. The remaining terms can be rewritten as

$\begin{array}{ll} \frac{1 + ρ}{2} \ln σ_{1}^{2} - \frac{1}{2} \ln σ^{2} - \frac{1 + ρ}{2} \ln \frac{(1 + ρ)^{2}}{ρ} & \\ \quad = \frac{1 + ρ}{2} \ln \left( \frac{A}{ρ β} + \frac{(1 + ρ)^{2}}{ρ} \right) - \frac{1}{2} \ln \left( \frac{A}{(1 + ρ) β} + 1 \right) - \frac{1 + ρ}{2} \ln \frac{(1 + ρ)^{2}}{ρ} & (71) \\ \quad = \frac{1 + ρ}{2} \ln \left( \frac{A}{(1 + ρ)^{2} β} + 1 \right) - \frac{1}{2} \ln \left( \frac{A}{(1 + ρ) β} + 1 \right) & (72) \\ \quad = \frac{ρ}{2} \ln \left( β + \frac{A}{1 + ρ} \right) + \frac{1}{2} \ln β . & (73) \end{array}$

By (58) and (3), the sum of the right-hand sides of (70) and (73) equals $ρ R_{0} (ρ)$.
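The algebra leading from (64) to $ρ R_{0} (ρ)$ can be spot-checked numerically. The sketch below is only a sanity check under the definitions (57), (58), and (65); the function names `beta`, `rhs_64`, and `rho_r0` are hypothetical, and `rho_r0` evaluates $ρ R_{0} (ρ)$ as the sum of (70) and (73).

```python
import math


def beta(rho: float, snr: float) -> float:
    """beta of (58) for SNR A (here `snr`)."""
    a = snr / (1.0 + rho)
    return 0.5 * (1.0 - a + math.sqrt((1.0 - a) ** 2 + 4.0 * snr / (1.0 + rho) ** 2))


def rhs_64(rho: float, snr: float) -> float:
    """Right-hand side of (64), with sigma^2 from (57) and sigma_1^2 from (65)."""
    b = beta(rho, snr)
    sigma2 = snr / ((1.0 + rho) * b) + 1.0
    sigma1_2 = 1.0 + rho + sigma2 * (1.0 + rho) / rho
    return (
        0.5 * (1.0 + rho) * snr / sigma1_2
        + 0.5 * (1.0 + rho) * math.log(sigma1_2)
        - 0.5 * math.log(sigma2)
        - 0.5 * (1.0 + rho) * math.log((1.0 + rho) ** 2 / rho)
    )


def rho_r0(rho: float, snr: float) -> float:
    """rho * R_0(rho), written as the sum of (70) and (73)."""
    b = beta(rho, snr)
    return (
        0.5 * (1.0 + rho) * (1.0 - b)
        + 0.5 * rho * math.log(b + snr / (1.0 + rho))
        + 0.5 * math.log(b)
    )


if __name__ == "__main__":
    for rho, snr in [(0.5, 1.0), (1.0, 3.0), (2.0, 10.0)]:
        print(rho, snr, rhs_64(rho, snr), rho_r0(rho, snr))
```

Agreement at a handful of $(ρ, A)$ pairs does not replace the proof, but it catches sign or exponent slips when the formulas are transcribed.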

4.1.2. Lower-Bounding $lim E_{0}^{(n), *} (ρ)$

To lower-bound $lim E_{0}^{(n), *} (ρ)$ , we shall use Proposition 1 with Q chosen as a centered variance- $A$ Gaussian distribution $Q_{G}$ . For this probability distribution Gallager calculated $E_{0, m}^{*} (ρ, Q_{G})$ (Section 7.4 in [1]). He showed that for any $ρ > 0$ ,

E_{0, m}^{*} (ρ, Q_{G}) = ρ R_{0} (ρ),

(74)

where $R_{0} (ρ)$ is defined in (3). Using this result and Proposition 1 we obtain

$\begin{array}{rcll} \lim_{n \to \infty} E_{0}^{(n), *} (ρ) & \geq & E_{0, m}^{*} (ρ, Q_{G}) & (75) \\ & = & ρ R_{0} (ρ) . & (76) \end{array}$

4.2. The Mapping $ρ \mapsto R_{0} (ρ)$ Is Monotonically Decreasing

For the purpose of proving the achievability of $R_{0} (ρ)$, we will need the fact that it is monotonically decreasing in $ρ$. In view of (55), it suffices to show that, for every $n \in \mathbb{N}$, the mapping $ρ \mapsto ρ^{- 1} E_{0}^{(n), *} (ρ)$ is monotonically decreasing. In view of (24), the latter will follow once we establish the monotonicity of $ρ \mapsto ρ^{- 1} E_{0}^{(n)} (ρ, Q^{(n)})$ for any fixed $Q^{(n)}$. Since $E_{0}^{(n)} (ρ, Q^{(n)})$ evaluates to zero at $ρ = 0$, this monotonicity can be established by showing that the mapping $ρ \mapsto E_{0}^{(n)} (ρ, Q^{(n)})$ is concave. This is established in (Appendix 5.B in [1]). (That appendix deals with finite alphabets, but the proof carries over to our case.)

4.3. Achievability of $R_{0} (ρ)$

The achievability of $R_{0} (ρ)$ will be proved using a random-coding argument. Let $Q$ be the zero-mean variance-$A$ Gaussian distribution, let $δ > 0$ be a positive constant, and let $Q^{(n)}$ be the distribution on $\mathbb{R}^{n}$ defined in (35) and (36) (with $Γ = A$). Draw the codewords $\{X_{m}\}_{m = 1, \dots, e^{n R}}$ of a blocklength-$n$ random codebook independently, each according to $Q^{(n)}$, so $∥ X_{m} ∥^{2} \leq n A$ with probability 1 for every $m \in \mathcal{M}$. By symmetry, $E [| L (m, Y) |^{ρ}]$ (where the expectation is over the random choice of the codebook and over the channel behavior) does not depend on $m$. Consequently,

$E \left[ e^{- n R} \sum_{m \in \mathcal{M}} | L (m, Y) |^{ρ} \right] = E [| L (1, Y) |^{ρ}],$ (77)

and if we establish that $E [| L (1, Y) |^{ρ}]$ tends to 1, it will follow by the random-coding argument that there exists a sequence of codebooks for which the LHS of (77), with the expectation now over the channel behavior only, tends to 1.

Defining

\begin{matrix} B_{m} (x_{1}, y) = 𝟙 \{w (y | X_{m}) \geq w (y | x_{1})\}, x_{1}, y \in R^{n}, \end{matrix}

(78)

we can express the RHS of (77) as

\begin{matrix} E [{| L (1, Y) |}^{ρ}] & = & E [{(1 + \sum_{m \neq 1} B_{m} (X_{1}, Y))}^{ρ}], \end{matrix}

(79)

and we seek to show that

\begin{matrix} lim_{n \to \infty} E [{(1 + \sum_{m \neq 1} B_{m} (X_{1}, Y))}^{ρ}] = 1 . \end{matrix}

(80)

To this end, we shall need the following lemma.

Lemma 1.

Let $\{Z_{n}\}$ be a sequence of random variables taking values in $\mathbb{N}$, and let $ρ > 0$ be fixed. The following two conditions are then equivalent:

(i) $E [(1 + Z_{n})^{ρ}] = 1 + o (1)$

(ii) $E [Z_{n}^{ρ}] = o (1)$

where $o (1)$ tends to zero as $n$ tends to infinity. Thus

$\left( \lim_{n \to \infty} E [(1 + Z_{n})^{ρ}] = 1 \right) \Leftrightarrow \left( \lim_{n \to \infty} E [Z_{n}^{ρ}] = 0 \right) .$ (81)

Proof.

The implication (ii) ⟹ (i) follows by noting that for any $z \in \mathbb{N}$ and $ρ > 0$

$(1 + z)^{ρ} \leq 1 + 2^{ρ} z^{ρ},$ (82)

so

$E [(1 + Z_{n})^{ρ}] \leq 1 + 2^{ρ} E [Z_{n}^{ρ}] .$ (83)

As for the implication (i) ⟹ (ii), note that for any $y \in \mathbb{N}$ and $ρ > 0$

$(1 + y)^{ρ} \geq y^{ρ} + 𝟙 \{y = 0\},$ (84)

so

$E [(1 + Z_{n})^{ρ}] \geq E [Z_{n}^{ρ}] + \Pr [Z_{n} = 0] .$ (85)

The implication is now established by noting that (i) implies that $\Pr [Z_{n} = 0] \to 1$ because, by Markov’s inequality (and the strict positivity of $ρ$),

$\begin{array}{rcll} \Pr [Z_{n} \neq 0] & = & \Pr [(1 + Z_{n})^{ρ} - 1 \geq 2^{ρ} - 1] & (86) \\ & \leq & \frac{E [(1 + Z_{n})^{ρ}] - 1}{2^{ρ} - 1} . & (87) \end{array}$

□
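The two elementary inequalities (82) and (84) that drive the proof of Lemma 1 can be checked by brute force over a range of non-negative integers. This is only a finite numerical check under the assumption that $\mathbb{N}$ includes zero (as the indicator in (84) suggests); the function name `check_lemma1_inequalities` and the range `z_max` are illustrative.

```python
def check_lemma1_inequalities(rho: float, z_max: int = 200) -> bool:
    """Check (82) and (84) for all integers 0 <= z <= z_max and a given rho > 0."""
    for z in range(z_max + 1):
        lhs = (1 + z) ** rho
        z_pow = z ** rho  # equals 0 at z = 0 because rho > 0
        tol = 1e-9 * lhs  # relative slack for floating-point rounding
        # (82): (1 + z)^rho <= 1 + 2^rho * z^rho
        if lhs > 1 + 2 ** rho * z_pow + tol:
            return False
        # (84): (1 + z)^rho >= z^rho + 1{z = 0}
        if lhs < z_pow + (1 if z == 0 else 0) - tol:
            return False
    return True


if __name__ == "__main__":
    print(all(check_lemma1_inequalities(r) for r in (0.25, 1.0, 2.5, 5.0)))
```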

In light of the above lemma, to establish (80) it suffices to show that

\begin{matrix} lim_{n \to \infty} E [{(\sum_{m \neq 1} B_{m} (X_{1}, Y))}^{ρ}] = 0, \end{matrix}

(88)

i.e., that

\begin{matrix} lim_{n \to \infty} E [E [{(\sum_{m \neq 1} B_{m} (x_{1}, y))}^{ρ} | X_{1} = x_{1}, Y = y]] = 0, \end{matrix}

(89)

where the outer expectation is over $X_{1}$ and $Y$ .

A related expectation—but one where it is the conditional expectation that is raised to the $ρ$ -th power—is studied in the following lemma:

Lemma 2.

If $ρ > 0$ and $R < R_{0} (ρ)$ , then

$\begin{matrix} lim_{n \to \infty} E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y) | X_{1} = x_{1}, Y = y]}^{ρ}] = 0 . \end{matrix}$ (90)

Proof.

See Appendix A. □

To establish (88) using this lemma, we distinguish between two cases depending on whether $0 < ρ \leq 1$ or $ρ > 1$ . In the former case $x \mapsto x^{ρ}$ is concave, so Jensen’s inequality implies that

\begin{matrix} E [E [{(\sum_{m \neq 1} B_{m} (x_{1}, y))}^{ρ} | X_{1} = x_{1}, Y = y]] \leq E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y) | X_{1} = x_{1}, Y = y]}^{ρ}], \end{matrix}

(91)

which, together with Lemma 2, implies (88) whenever $R < R_{0} (ρ)$ .

Suppose now that $ρ > 1$ . Conditional on the transmitted codeword $x_{1}$ and the output $y$ , the random variables ${B_{m}}_{m \neq 1}$ are IID Bernoulli, with $B_{m}$ determined by $X_{m}$ . We can thus use Rosenthal’s technique (Lemma 5.10 in [19]), [20] to obtain

$\begin{array}{ll} E \left[ \left( \sum_{m \neq 1} B_{m} (x_{1}, y) \right)^{ρ} \, \Big| \, X_{1} = x_{1}, Y = y \right] & \\ \quad \leq 2^{ρ^{2}} \max \left\{ E \left[ \sum_{m \neq 1} B_{m} (x_{1}, y) \, \Big| \, X_{1} = x_{1}, Y = y \right]^{ρ}, \; E \left[ \sum_{m \neq 1} B_{m} (x_{1}, y) \, \Big| \, X_{1} = x_{1}, Y = y \right] \right\} & (92) \\ \quad \leq 2^{ρ^{2}} \left( E \left[ \sum_{m \neq 1} B_{m} (x_{1}, y) \, \Big| \, X_{1} = x_{1}, Y = y \right]^{ρ} + E \left[ \sum_{m \neq 1} B_{m} (x_{1}, y) \, \Big| \, X_{1} = x_{1}, Y = y \right] \right) . & (93) \end{array}$

Taking the expectation over $X_{1}$ and $Y$ yields

$\begin{array}{ll} E \left[ E \left[ \left( \sum_{m \neq 1} B_{m} (x_{1}, y) \right)^{ρ} \, \Big| \, X_{1} = x_{1}, Y = y \right] \right] & (94) \\ \quad \leq 2^{ρ^{2}} E \left[ E \left[ \sum_{m \neq 1} B_{m} (y) \, \Big| \, Y = y \right]^{ρ} \right] + 2^{ρ^{2}} E \left[ E \left[ \sum_{m \neq 1} B_{m} (y) \, \Big| \, Y = y \right] \right] & (95) \\ \quad \leq 2^{ρ^{2}} E \left[ E \left[ \sum_{m \neq 1} B_{m} (x_{1}, y) \, \Big| \, X_{1} = x_{1}, Y = y \right]^{ρ} \right] + 2^{ρ^{2}} E \left[ E \left[ \sum_{m \neq 1} B_{m} (x_{1}, y) \, \Big| \, X_{1} = x_{1}, Y = y \right] \right] . & (96) \end{array}$

The first term on the RHS can be treated using Lemma 2. The second, but for the $2^{ρ^{2}}$ constant, is the one encountered when $ρ$ is 1. Since, by Section 4.2, $R_{0} (ρ) \leq R_{0} (1)$ (because $ρ > 1$ in the case at hand), it too tends to zero when $R < R_{0} (ρ)$.

4.4. No Rate Exceeding $R_{0} (ρ)$ Is Achievable

To show the converse, we need Arıkan’s lower bound on guessing [21].

Fix any sequence of rate-R blocklength-n codebooks ${C_{n}}$ satisfying the cost constraint. For any $n \in N$ , let

$Q^{(n)} (x) = \begin{cases} \frac{1}{| C_{n} |} & \text{if } x \in C_{n}, \\ 0 & \text{otherwise} \end{cases}$ (97)

be the induced probability distribution on $R^{n}$ . Since the codebook satisfies the cost constraint, $E [∥ X ∥^{2}] \leq n A$ under $Q^{(n)}$ .

Given $y$ , list the messages $m \in M$ in decreasing order of likelihood $w (y | x_{m})$ (resolving ties arbitrarily, e.g., ranking low numerical values of m higher), and let $G (m | y)$ denote the ranking of the message m in this list. Note that

\begin{matrix} | L (m, y) | \geq G (m | y), \end{matrix}

(98)

where the inequality can be strict because there may be messages that are in $L (m, y)$ because they have the same likelihood as m, and that are yet ranked lower than m by $G (\cdot | y)$ because of the way ties are resolved. It follows from this inequality that the $ρ$ -th moment of $| L (M, Y) |$ cannot tend to one unless the $ρ$ -th moment of $G (M | Y)$ does. By Arıkan’s guessing inequality [21],

$E [G (M | Y)^{ρ}] \geq (1 + n R)^{- ρ} \cdot \exp \left( n ρ R - n E_{0}^{(n)} (ρ, Q^{(n)}) \right),$ (99)

so the $ρ$-th moment of $G (M | Y)$ can tend to one only if

$ρ R \leq \liminf_{n \to \infty} E_{0}^{(n)} (ρ, Q^{(n)}) .$ (100)

From this, the converse now follows using (24) and (55) because

$\begin{array}{rcll} \liminf_{n \to \infty} E_{0}^{(n)} (ρ, Q^{(n)}) & \leq & \lim_{n \to \infty} E_{0}^{(n), *} (ρ) & (101) \\ & = & ρ R_{0} (ρ) . & (102) \end{array}$
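The relation (98) between the list size and the guessing rank, including the way ties can make the inequality strict, can be seen on a toy example. The three-codeword codebook, the received vector, and the function names `sq_dist`, `list_size`, and `guessing_rank` below are illustrative assumptions; ties are broken in favor of the lower message index, as suggested in the text.

```python
def sq_dist(y, x):
    """Squared Euclidean distance; on the Gaussian channel, smaller means more likely."""
    return sum((yk - xk) ** 2 for yk, xk in zip(y, x))


def list_size(m, y, codebook):
    """|L(m, y)|: number of messages at least as likely as m given y."""
    ref = sq_dist(y, codebook[m])
    return sum(1 for x in codebook.values() if sq_dist(y, x) <= ref)


def guessing_rank(m, y, codebook):
    """G(m | y): rank of m when messages are sorted by decreasing likelihood,
    with ties resolved in favor of the lower message index."""
    order = sorted(codebook, key=lambda mp: (sq_dist(y, codebook[mp]), mp))
    return order.index(m) + 1


# Messages 1 and 2 are equidistant from y, so they tie in likelihood.
codebook = {1: (1.0, 0.0), 2: (-1.0, 0.0), 3: (0.0, 2.0)}
y = (0.0, 0.0)
sizes = {m: list_size(m, y, codebook) for m in codebook}
ranks = {m: guessing_rank(m, y, codebook) for m in codebook}

if __name__ == "__main__":
    print(sizes, ranks)
```

Here message 1 has rank 1 but list size 2, which is exactly the strict case of (98) caused by tie-breaking.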

5. The Direct Part of Theorem 2

In this section we prove the direct part of Theorem 2: when the decoder can be provided with a rate- $R_{h}$ description of the noise, the convergence (19) can be achieved at all transmission rates below $R_{0} (ρ) + R_{h}$ . As noted earlier, the converse follows directly from (Remark 4 in [3]).

Our proof treats the cases $R_{h} = 0$ and $R_{h} > 0$ separately. As in Section 4, we assume that the channel is normalized to having noise variance 1 and transmit power $A$ .

5.1. Case 1: $R_{h} = 0$

The analogous result for the modulo-additive channel was proved in [3] by having the helper provide the decoder with a lossless description of the type of the noise sequence. Since this type fully specifies the a posteriori probability of the transmitted message, the decoder’s remotely-plausible-with-this-help list $L (Y, T)$ contains only messages whose a posteriori probability is equal to that of the correct message. It is therefore a subset of the at-least-as-likely list $L (M, Y)$ (without help) and hence of smaller-or-equal $ρ$ -th moment. Consequently, any rate that allows the latter to tend to one, also allows the former to tend to one.

On the Gaussian channel the likelihood $w (y | x_{m})$ is specified by the normalized squared Euclidean norm of the noise sequence ${∥ z ∥}^{2} / n$ . The latter, however, cannot be described at zero rate with infinite precision. This motivates us to quantize it and have the quantized version be the zero-rate help. The result will then follow by considering the high-resolution limit of the achievable rates. For this purpose, a uniform quantizer will do.

Given some large $M > 0$ (which determines the overload region) and some large K (corresponding to the number of quantization cells), we partition the interval $[0, M]$ into K subintervals, each of length $Δ = M / K$ . The helper, upon observing the noise sequence $Z$ , produces

\begin{matrix} T = ϕ^{(n)} (Z) = \begin{cases} ⌊{∥ Z ∥}^{2} / (n Δ)⌋ & \text{if } {∥ Z ∥}^{2} / n < M, \\ K & \text{otherwise.} \end{cases} \end{matrix}

(103)
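The helper map (103) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `helper_description` and the parameter names `big_m` and `k` (standing for M and K) are hypothetical, and the clamp to `k - 1` is a floating-point safeguard that is not part of (103).

```python
import math

def helper_description(z, big_m, k):
    """Zero-rate help T of (103) for a noise sequence z.

    big_m is the overload threshold M and k the number of cells K
    (hypothetical parameter names, chosen for this sketch).
    """
    n = len(z)
    delta = big_m / k                       # cell length, Delta = M / K
    normalized = sum(v * v for v in z) / n  # ||z||^2 / n
    if normalized < big_m:
        # Index of the cell containing ||z||^2 / n. The clamp guards against
        # floating-point rounding at the upper cell boundary.
        return min(math.floor(normalized / delta), k - 1)
    return k                                # overload symbol

# Usage: whatever the blocklength, T takes one of only K + 1 values.
print(helper_description([1.0, 1.0, 1.0, 1.0], big_m=4.0, k=8))
```

Because the range of the map is `{0, ..., k}` whatever the length of `z`, the description costs a number of bits that does not grow with the blocklength.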

The constant M, which does not depend on the blocklength n, is chosen large enough to guarantee that the large-deviation probability of overload $\Pr [∥ Z ∥^{2} / n \geq M]$ decay sufficiently fast in n so that the contribution of the overload to the $ρ$ -th moment of the list be negligible, even if an overload results in the list containing all $e^{n R}$ codewords:

\begin{matrix} \lim_{n \to \infty} e^{n ρ R} \cdot \Pr [n^{- 1} {∥ Z ∥}^{2} \geq M] & = & 0 . \end{matrix}

(104)

(Upper bounds on the tail of the $χ^{2}$ distribution show, for example, that for $R < R_{0} (ρ)$ , the choice $M = \max \{2, 20 ρ R_{0} (ρ)\}$ will do.) Since the help takes values in the finite set $\mathcal{T}_{n} = \{0, 1, \dots, K\}$ , where K does not depend on the blocklength, it is of zero rate.
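One way to sanity-check the parenthetical claim is via the Chernoff bound $\Pr [{∥ Z ∥}^{2} / n \geq M] \leq e^{- n (M - 1 - \ln M) / 2}$ , valid for $M > 1$ . The text does not say which tail bound it has in mind, so the use of this particular bound is an assumption of the sketch below, and the function names are hypothetical. With this bound, (104) holds as soon as $ρ R$ is smaller than the exponent $(M - 1 - \ln M) / 2$ .

```python
import math

def chernoff_exponent(big_m):
    """Exponent e(M) in Pr[||Z||^2 / n >= M] <= exp(-n e(M)), for M > 1.

    This is the Chernoff bound for a sum of n squared standard Gaussians;
    using it here is an assumption of this sketch.
    """
    if big_m <= 1.0:
        raise ValueError("the bound requires M > 1")
    return (big_m - 1.0 - math.log(big_m)) / 2.0

def overload_is_negligible(rho_times_rate, big_m):
    """Sufficient condition for (104): rho * R < e(M)."""
    return rho_times_rate < chernoff_exponent(big_m)

# Usage: c plays the role of rho * R_0(rho); any R < R_0(rho) gives rho * R < c.
for c in (0.01, 0.1, 0.5, 1.0, 5.0, 50.0):
    big_m = max(2.0, 20.0 * c)
    print(c, big_m, overload_is_negligible(c, big_m))
```

For every value of `c` in the loop, the threshold `max(2, 20 c)` yields an exponent larger than `c`, which is consistent with the choice of M suggested in the text.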

As in Section 4.3, we consider a random codebook $\{X_{m}\}_{m = 1, \dots, e^{n R}}$ whose codewords are drawn independently from the conditional Gaussian distribution, i.e., from $Q^{(n)}$ defined in (35) and (36) with Q being $Q_{G}$ , the centered variance- $A$ Gaussian distribution. Using the same symmetry arguments, we also assume that the transmitted message is $m = 1$ and study the $ρ$ -th moment of the list under this assumption. Defining

\begin{matrix} V_{m} (x_{1}, y) & = & 𝟙 \{ϕ^{(n)} (y - X_{m}) = ϕ^{(n)} (y - x_{1})\}, x_{1}, y \in R^{n}, \end{matrix}

(105)

we can express the $ρ$ -th moment of the remotely-plausible list when $m = 1$ as

\begin{matrix} E [{| L (Y, T) |}^{ρ}] & = & E [{(1 + \sum_{m \neq 1} V_{m} (X_{1}, Y))}^{ρ}] . \end{matrix}

(106)

In view of Lemma 1, we need to prove that

\begin{matrix} \lim_{n \to \infty} E [{(\sum_{m \neq 1} V_{m} (X_{1}, Y))}^{ρ}] = 0, \end{matrix}

(107)

where the expectation is over both the random choice of the codebook and the channel behavior.

To analyze the LHS of (107), we define for every $x_{1}, y \in R^{n}$ and every message $m \neq 1$ the binary random variable

\begin{matrix} B_{m} (x_{1}, y; Δ) & = & 𝟙 \{w (y | X_{m}) \geq w (y | x_{1}) \cdot e^{- \frac{n Δ}{2}}\} . \end{matrix}

(108)

Our analysis of $V_{m} (x_{1}, y)$ depends on whether $ϕ^{(n)} (y - x_{1})$ differs from K (no overload) or equals K (corresponding to quantizer overload). In the former case, the random variable $V_{m} (x_{1}, y)$ can be upper bounded by $B_{m} (x_{1}, y; Δ)$ because

\begin{array}{rcll} V_{m} (x_{1}, y) & = & 𝟙 \{ϕ^{(n)} (y - X_{m}) = ϕ^{(n)} (y - x_{1})\} & (109) \\ & \leq & 𝟙 \{| ∥ y - X_{m} ∥^{2} - {∥ y - x_{1} ∥}^{2} | < n Δ\} & (110) \\ & \leq & 𝟙 \{∥ y - X_{m} ∥^{2} \leq {∥ y - x_{1} ∥}^{2} + n Δ\} & (111) \\ & = & 𝟙 \{e^{- \frac{∥ y - X_{m} ∥^{2}}{2}} \geq e^{- \frac{∥ y - x_{1} ∥^{2} + n Δ}{2}}\} & (112) \\ & = & B_{m} (x_{1}, y; Δ), & (113) \end{array}

where (110) holds because, in the case at hand, equality of the helper’s descriptions implies that $∥ y - X_{m} ∥^{2}$ and $∥ y - x_{1} ∥^{2}$ lie in the same interval of length $n Δ$ . In the latter case, which is exponentially rare when M exceeds the noise variance, we simply upper bound $V_{m} (x_{1}, y)$ by 1.
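The chain (109) to (113) can be checked numerically with a small sketch: whenever two squared distances fall in the same quantization cell (no overload), the log-likelihood ratio $\ln (w (y | X_{m}) / w (y | x_{1})) = ({∥ y - x_{1} ∥}^{2} - {∥ y - X_{m} ∥}^{2}) / 2$ is at least $- n Δ / 2$ . The function names and the test parameters below are hypothetical and chosen only for illustration; the squared distances are drawn uniformly rather than from the actual channel.

```python
import math
import random

def quantization_cell(sq_norm, n, delta):
    """Cell index floor(||.||^2 / (n * Delta)), as in the no-overload branch of (103)."""
    return math.floor(sq_norm / (n * delta))

def log_likelihood_ratio(sq_norm_m, sq_norm_1):
    """ln of w(y|X_m) / w(y|x_1) for unit-variance Gaussian noise."""
    return (sq_norm_1 - sq_norm_m) / 2.0

def check_same_cell_bound(n=8, delta=0.5, big_m=8.0, trials=10_000, seed=0):
    """Return True if every same-cell pair satisfies the bound behind (113)."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.uniform(0.0, n * big_m)  # stands for ||y - X_m||^2
        b = rng.uniform(0.0, n * big_m)  # stands for ||y - x_1||^2
        if quantization_cell(a, n, delta) == quantization_cell(b, n, delta):
            if log_likelihood_ratio(a, b) < -n * delta / 2.0:
                return False
    return True

print(check_same_cell_bound())
```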

The $ρ$ -th moment of the list can now be expressed using the law of total expectation as

\begin{array}{ll} E [{(\sum_{m \neq 1} V_{m} (X_{1}, Y))}^{ρ}] & \\ \quad = E [{(\sum_{m \neq 1} V_{m} (X_{1}, Y))}^{ρ} | T \neq K] \Pr [T \neq K] + E [{(\sum_{m \neq 1} V_{m} (X_{1}, Y))}^{ρ} | T = K] \Pr [T = K] & (114) \\ \quad \leq E [{(\sum_{m \neq 1} B_{m} (X_{1}, Y; Δ))}^{ρ} | T \neq K] \Pr [T \neq K] + e^{n ρ R} \Pr [T = K] & (115) \\ \quad \leq E [{(\sum_{m \neq 1} B_{m} (X_{1}, Y; Δ))}^{ρ}] + e^{n ρ R} \Pr [T = K] . & (116) \end{array}

The second term on the RHS of (116) tends to zero by (104). The first term is studied in the following lemma:

Lemma 3.

If $ρ > 0$ , $Δ > 0$ , and $R < R_{0} (ρ) - Δ$ , then

$\begin{matrix} \lim_{n \to \infty} E [{(\sum_{m \neq 1} B_{m} (X_{1}, Y; Δ))}^{ρ}] = 0 . \end{matrix}$ (117)

Proof.

See Appendix B. □

For a given $R < R_{0} (ρ)$ , achievability is thus established using this lemma and (116) by picking M sufficiently large for (104) to hold, and then picking K large enough to guarantee that $R < R_{0} (ρ) - M / K$ so that, by Lemma 3, the first term on the RHS of (116) will also tend to zero.

5.2. Case 2: $R_{h} > 0$

The key to proving the achievability of $R_{c u t o f f} (ρ) + R_{h}$ is in showing that rate- $R_{h}$ help can be utilized to increase the data rate by $R_{h}$ , and that this can be done losslessly, with arbitrarily small (positive) power, and in one channel use. To this end, we show that, by using the channel once to send a single input that is bounded by $\sqrt{A}$ (with $A$ any prespecified positive number) and using help taking values in the set $\mathcal{T} = \{0, \dots, κ - 1\}$ , we can send error-free a message taking values in said set. To transmit $m \in \{0, \dots, κ - 1\}$ , the encoder sends

\begin{matrix} x = m \cdot \frac{\sqrt{A}}{κ}, \end{matrix}

(118)

which is upper-bounded by $\sqrt{A}$ . Upon observing the noise Z, the helper produces the description T by quantizing the normalized noise and taking modulo, i.e.,

\begin{matrix} T = ⌊Z \cdot \frac{κ}{\sqrt{A}}⌋ \mod κ, \end{matrix}

(119)

which is an element of $\{0, \dots, κ - 1\}$ . Based on Y and T, the decoder can calculate

\begin{matrix} \hat{m} = ⌊Y \cdot \frac{κ}{\sqrt{A}} - T⌋ \mod κ, \end{matrix}

(120)

which equals m, because

\begin{array}{rcll} \hat{m} & = & ⌊(x + Z) \cdot \frac{κ}{\sqrt{A}} - T⌋ \mod κ & (121) \\ & = & ⌊m + Z \cdot \frac{κ}{\sqrt{A}} - T⌋ \mod κ & (122) \\ & = & (m + ⌊Z \cdot \frac{κ}{\sqrt{A}}⌋ - T) \mod κ & (123) \\ & = & m, & (124) \end{array}

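where (123) holds because m and T are both integers, and (124) holds because, by (119), $⌊Z \cdot \frac{κ}{\sqrt{A}}⌋ - T$ is an integer multiple of $κ$ and $0 \leq m < κ$ .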
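The single-use scheme (118) to (120) is easy to simulate. The sketch below is a minimal illustration, not the authors' implementation: the function names are hypothetical, and it uses exact rational arithmetic (`fractions.Fraction`) with a rational value for $\sqrt{A}$ , an assumption made so that floating-point rounding cannot push a value across a floor boundary.

```python
import math
import random
from fractions import Fraction

def encode(m, kappa, sqrt_a):
    """Channel input (118): x = m * sqrt(A) / kappa."""
    return m * sqrt_a / kappa

def helper(z, kappa, sqrt_a):
    """Help (119): T = floor(z * kappa / sqrt(A)) mod kappa."""
    return math.floor(z * kappa / sqrt_a) % kappa

def decode(y, t, kappa, sqrt_a):
    """Decoder (120): m_hat = floor(y * kappa / sqrt(A) - T) mod kappa."""
    return math.floor(y * kappa / sqrt_a - t) % kappa

def round_trip(m, z, kappa, sqrt_a):
    """Send m over Y = x + Z with help T and return the decoded message."""
    y = encode(m, kappa, sqrt_a) + z
    return decode(y, helper(z, kappa, sqrt_a), kappa, sqrt_a)

# Usage: kappa = 7 messages, sqrt(A) = 1/2, one Gaussian noise sample.
rng = random.Random(1)
z = Fraction(rng.gauss(0.0, 1.0))  # exact rational value of the float sample
print([round_trip(m, z, 7, Fraction(1, 2)) for m in range(7)])
```

Python's `%` operator returns a nonnegative result for a positive modulus, which matches the convention that T and the decoded message lie in $\{0, \dots, κ - 1\}$ even when the noise is negative.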

Using this building block, we can now prove the achievability of $R_{c u t o f f} (ρ) + R_{h}$ by employing two-phase time sharing. Specifically, we propose the following blocklength- $(n + 1)$ scheme. In the first n channel uses, the helper operates at rate zero as in Section 5.1. By the achievability result proved in Section 5.1, for any $R < R_{0} (ρ)$ , there exists a sequence of blocklength-n rate-R codebooks $\{x_{m}\}_{m = 1, \dots, e^{n R}}$ , with $∥ x_{m} ∥^{2} \leq (n - 1) A$ for every m, and zero-rate helpers $ϕ (Z^{n})$ , such that the remotely-plausible-list $L (Y^{n}, ϕ (Z^{n}))$ satisfies

\begin{matrix} \lim_{n \to \infty} E [| L (Y^{n}, ϕ (Z^{n})) |^{ρ}] = 1 . \end{matrix}

(125)

In the $(n + 1)$ -th channel-use we use the aforementioned coding scheme with $κ$ being $⌈ e^{n R_{h}} ⌉$ . Since that scheme is error-free, the overall remotely-plausible-list for the two phases has the same cardinality as that of the first phase, namely $| L (Y^{n}, ϕ (Z^{n})) |$ , and hence, its $ρ$ -th moment tends to 1 by (125).

The achievability now follows by verifying that the power of the transmitted input sequence $x$ satisfies

\begin{matrix} {∥ x ∥}^{2} = ∥ x^{n} ∥^{2} + {∥ x_{n + 1} ∥}^{2} \leq n A + A = (n + 1) A; \end{matrix}

(126)

the rate of the helper is

\begin{matrix} \frac{1}{n + 1} (0 + n R_{h}) \end{matrix}

(127)

and the rate achieved by the scheme is

\begin{matrix} \frac{1}{n + 1} (n R_{0} (ρ) + n R_{h}) \end{matrix}

(128)

which tend to $R_{h}$ and $R_{0} (ρ) + R_{h}$ , respectively, as n tends to infinity.
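The convergence of the two rates can be illustrated numerically. The sketch below makes a few assumptions that go slightly beyond (127) and (128): it counts the help exactly as $\ln (K + 1) + \ln κ$ nats with $κ = ⌈ e^{n R_{h}} ⌉$ (the first term is what (127) treats as zero), it uses a first-phase rate `rate` that may be any $R < R_{0} (ρ)$ , and the function and parameter names are hypothetical.

```python
import math

def time_sharing_rates(n, rate, help_rate, k_cells):
    """Helper rate and data rate of the blocklength-(n + 1) scheme, in nats.

    rate      : first-phase code rate R (assumed below R_0(rho))
    help_rate : target help rate R_h
    k_cells   : number K of quantization cells used in the first phase
    """
    kappa = math.ceil(math.exp(n * help_rate))         # second-phase alphabet size
    help_nats = math.log(k_cells + 1) + math.log(kappa)
    data_nats = n * rate + math.log(kappa)             # e^{nR} * kappa messages
    return help_nats / (n + 1), data_nats / (n + 1)

# Usage: both rates approach R_h and R + R_h as n grows.
for n in (10, 100, 400):
    print(n, time_sharing_rates(n, rate=0.3, help_rate=0.5, k_cells=100))
```

The constant $\ln (K + 1)$ and the single extra channel use both vanish after division by $n + 1$ , which is why they do not affect the limiting rates.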

Appendix A. Proof of Lemma 2

We shall establish that the expectation

\begin{array}{ll} E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y) | X_{1} = x_{1}, Y = y]}^{ρ}] & (A1) \\ \quad = \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} E {[\sum_{m \neq 1} B_{m} (x_{1}, y) | X_{1} = x_{1}, Y = y]}^{ρ} w (y | x_{1}) d Q^{(n)} (x_{1}) d ν (y) & (A2) \end{array}

tends to zero as n tends to infinity whenever $R < R_{0} (ρ)$ .

First notice that conditional on the transmitted codeword $x_{1}$ and the channel output $y$ , the random variables $\{B_{m}\}_{m \neq 1}$ are IID Bernoulli, with $B_{m}$ determined by $X_{m}$ and being of probability of success

\begin{array}{rcll} p (x_{1}, y) & = & \Pr [w (y | X_{m}) \geq w (y | x_{1})] & (A3) \\ & = & \Pr [w {(y | X_{m})}^{\frac{1}{1 + ρ}} \geq w {(y | x_{1})}^{\frac{1}{1 + ρ}}] & (A4) \\ & \leq & w {(y | x_{1})}^{- \frac{1}{1 + ρ}} E [w {(y | X_{m})}^{\frac{1}{1 + ρ}}], & (A5) \end{array}

where the last inequality follows from Markov’s inequality. Thus

\begin{array}{ll} E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y) | X_{1} = x_{1}, Y = y]}^{ρ}] & \\ \quad = \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} E {[\sum_{m \neq 1} B_{m} (x_{1}, y) | X_{1} = x_{1}, Y = y]}^{ρ} w (y | x_{1}) d Q^{(n)} (x_{1}) d ν (y) & (A6) \\ \quad \leq e^{n ρ R} \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} p {(x_{1}, y)}^{ρ} w (y | x_{1}) d Q^{(n)} (x_{1}) d ν (y) & (A7) \\ \quad \leq e^{n ρ R} \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} E {[w {(y | X_{m})}^{\frac{1}{1 + ρ}}]}^{ρ} w {(y | x_{1})}^{\frac{1}{1 + ρ}} d Q^{(n)} (x_{1}) d ν (y) & (A8) \\ \quad = e^{n ρ R} \int_{y \in R^{n}} E {[w {(y | X_{m})}^{\frac{1}{1 + ρ}}]}^{ρ} (\int_{x_{1} \in R^{n}} w {(y | x_{1})}^{\frac{1}{1 + ρ}} d Q^{(n)} (x_{1})) d ν (y) & (A9) \\ \quad = e^{n ρ R} \int_{y \in R^{n}} {(\int_{x \in R^{n}} w {(y | x)}^{\frac{1}{1 + ρ}} d Q^{(n)} (x))}^{1 + ρ} d ν (y) & (A10) \\ \quad \leq {(\frac{e^{r δ}}{μ})}^{1 + ρ} e^{n ρ R} {(\int_{y \in R} {(\int_{x \in R} e^{r (x^{2} - A)} w {(y | x)}^{\frac{1}{1 + ρ}} d Q_{G} (x))}^{1 + ρ} d ν (y))}^{n} & (A11) \\ \quad = {(\frac{e^{r δ}}{μ})}^{1 + ρ} e^{n ρ R} e^{- n E_{0, m} (ρ, Q_{G}, r)}, & (A12) \end{array}

where (A11) follows from the upper bound (39) on the Radon–Nikodym derivative and holds for every $r \geq 0$ . Choosing r as $r^{⋆}$ that achieves $E_{0, m}^{*} (ρ, Q_{G})$ (cf. (31)), we obtain

\begin{array}{rcll} E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y) | X_{1} = x_{1}, Y = y]}^{ρ}] & \leq & {(\frac{e^{r^{⋆} δ}}{μ})}^{1 + ρ} e^{n ρ R} e^{- n E_{0, m}^{*} (ρ, Q_{G})} & (A13) \\ & = & {(\frac{e^{r^{⋆} δ}}{μ})}^{1 + ρ} e^{n ρ (R - R_{0} (ρ))}, & (A14) \end{array}

where the last equality follows from (74).

The Central Limit Theorem guarantees that, as n tends to infinity, $μ$ approaches $1 / 2$ . Consequently, the RHS of (A14) tends to zero whenever $R < R_{0} (ρ)$ .

Appendix B. Proof of Lemma 3

To prove the lemma, we shall establish that, whenever $R < R_{0} (ρ) - Δ$ ,

\begin{matrix} \lim_{n \to \infty} E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y; Δ) | X_{1} = x_{1}, Y = y]}^{ρ}] = 0, \end{matrix}

(A15)

where the outer expectation is over $X_{1}$ and $Y$ . From this, (117) will follow in much the same way that (88) followed from (90) in Section 4.3.

To establish (A15), first note that, conditional on the transmitted codeword $x_{1}$ and the channel output $y$ , the random variables $\{B_{m} (x_{1}, y; Δ)\}_{m \neq 1}$ are IID Bernoulli, with $B_{m}$ determined by $X_{m}$ and being of probability of success

\begin{array}{rcll} p (x_{1}, y; Δ) & = & \Pr [w (y | X_{m}) \geq w (y | x_{1}) e^{- \frac{n Δ}{2}}] & (A16) \\ & = & \Pr [w {(y | X_{m})}^{\frac{1}{1 + ρ}} \geq w {(y | x_{1})}^{\frac{1}{1 + ρ}} e^{- \frac{n Δ}{2 (1 + ρ)}}] & (A17) \\ & \leq & w {(y | x_{1})}^{- \frac{1}{1 + ρ}} e^{\frac{n Δ}{2 (1 + ρ)}} E [w {(y | X_{m})}^{\frac{1}{1 + ρ}}], & (A18) \end{array}

where the last inequality follows from Markov’s inequality. Consequently,

\begin{array}{ll} E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y; Δ) | X_{1} = x_{1}, Y = y]}^{ρ}] & \\ \quad = \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} E {[\sum_{m \neq 1} B_{m} (x_{1}, y; Δ) | X_{1} = x_{1}, Y = y]}^{ρ} w (y | x_{1}) d Q^{(n)} (x_{1}) d ν (y) & (A19) \\ \quad \leq e^{n ρ R} \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} p {(x_{1}, y; Δ)}^{ρ} w (y | x_{1}) d Q^{(n)} (x_{1}) d ν (y) & (A20) \\ \quad \leq e^{n ρ R} e^{\frac{n ρ Δ}{2 (1 + ρ)}} \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} E {[w {(y | X_{m})}^{\frac{1}{1 + ρ}}]}^{ρ} w {(y | x_{1})}^{\frac{1}{1 + ρ}} d Q^{(n)} (x_{1}) d ν (y) & (A21) \\ \quad < e^{n ρ Δ} e^{n ρ R} \int_{y \in R^{n}} \int_{x_{1} \in R^{n}} E {[w {(y | X_{m})}^{\frac{1}{1 + ρ}}]}^{ρ} w {(y | x_{1})}^{\frac{1}{1 + ρ}} d Q^{(n)} (x_{1}) d ν (y), & (A22) \end{array}

where (A22) holds because $ρ, Δ > 0$ so $n ρ Δ / (2 (1 + ρ)) < n ρ Δ$ .

Except for the $e^{n ρ Δ}$ factor, the RHS of (A22) is identical to the RHS of (A8), which was shown to decay at least as fast as $e^{n ρ (R - R_{0} (ρ))}$ ; see (A14). It follows that the RHS of (A22) tends to zero whenever $R + Δ < R_{0} (ρ)$ .
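Spelled out, and using only bounds already displayed, multiplying the chain (A8) to (A14) by the extra factor $e^{n ρ Δ}$ of (A22) gives

\begin{matrix} E [E {[\sum_{m \neq 1} B_{m} (x_{1}, y; Δ) | X_{1} = x_{1}, Y = y]}^{ρ}] & < & {(\frac{e^{r^{⋆} δ}}{μ})}^{1 + ρ} e^{n ρ (R + Δ - R_{0} (ρ))}, \end{matrix}

where the prefactor is the same one that appears in (A14), and the exponent $n ρ (R + Δ - R_{0} (ρ))$ is negative exactly when $R + Δ < R_{0} (ρ)$ .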

Author Contributions

Writing—original draft preparation, A.L. and Y.Y.; writing—review and editing, A.L. and Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Gallager R.G. Information Theory and Reliable Communication. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 1968.
2. Verdú S. Error exponents and α-mutual information. Entropy. 2021;23:199. doi: 10.3390/e23020199.
3. Lapidoth A., Marti G., Yan Y. Other helper capacities; Proceedings of the 2021 IEEE International Symposium on Information Theory (ISIT); Victoria, Australia. 12–20 July 2021; pp. 1272–1277.
4. Cover T.M. Elements of Information Theory. 2nd ed. John Wiley & Sons, Inc.; Hoboken, NJ, USA: 2006.
5. Kim Y. Capacity of a class of deterministic relay channels. IEEE Trans. Inf. Theory. 2008;54:1328–1329. doi: 10.1109/TIT.2007.915921.
6. Bross S.I., Lapidoth A., Marti G. Decoder-assisted communications over additive noise channels. IEEE Trans. Commun. 2020;68:4150–4161. doi: 10.1109/TCOMM.2020.2984215.
7. Lapidoth A., Marti G. Encoder-assisted communications over additive noise channels. IEEE Trans. Inf. Theory. 2020;66:6607–6616. doi: 10.1109/TIT.2020.3012629.
8. Merhav N. On error exponents of encoder-assisted communication systems. IEEE Trans. Inf. Theory. 2021;67:7019–7029. doi: 10.1109/TIT.2021.3111541.
9. Bunte C., Lapidoth A. Encoding tasks and Rényi entropy. IEEE Trans. Inf. Theory. 2014;60:5065–5076. doi: 10.1109/TIT.2014.2329490.
10. Pinsker M.S., Sheverdjaev A.Y. Transmission capacity with zero error and erasure. Probl. Peredachi Informatsii. 1970;6:20–24.
11. Csiszar I., Narayan P. Channel capacity for a given decoding metric. IEEE Trans. Inf. Theory. 1995;41:35–43. doi: 10.1109/18.370120.
12. Telatar I.E. Zero-error list capacities of discrete memoryless channels. IEEE Trans. Inf. Theory. 1997;43:1977–1982. doi: 10.1109/18.641560.
13. Ahlswede R., Cai N., Zhang Z. Erasure, list, and detection zero-error capacities for low noise and a relation to identification. IEEE Trans. Inf. Theory. 1996;42:55–62. doi: 10.1109/18.481778.
14. Bunte C., Lapidoth A., Samorodnitsky A. The zero-undetected-error capacity approaches the Sperner capacity. IEEE Trans. Inf. Theory. 2014;60:3825–3833. doi: 10.1109/TIT.2014.2322624.
15. Nakiboğlu B., Zheng L. Errors-and-erasures decoding for block codes with feedback. IEEE Trans. Inf. Theory. 2012;58:24–49. doi: 10.1109/TIT.2011.2169529.
16. Bunte C., Lapidoth A. The zero-undetected-error capacity of discrete memoryless channels with feedback; Proceedings of the 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton); Monticello, IL, USA. 1–5 October 2012; pp. 1838–1842.
17. Bunte C., Lapidoth A. On the listsize capacity with feedback. IEEE Trans. Inf. Theory. 2014;60:6733–6748. doi: 10.1109/TIT.2014.2355815.
18. Lapidoth A., Miliou N. Duality bounds on the cutoff rate with applications to Ricean fading. IEEE Trans. Inf. Theory. 2006;52:3003–3018. doi: 10.1109/TIT.2006.876349.
19. Pfister C. Ph.D. Thesis. ETH Zurich; Zurich, Switzerland: 2019. On Rényi Information Measures and Their Applications.
20. Rosenthal H.P. On the subspaces of Lp (p > 2) spanned by sequences of independent random variables. Isr. J. Math. 1970;8:273–303. doi: 10.1007/BF02771562.
21. Arıkan E. An inequality on guessing and its application to sequential decoding. IEEE Trans. Inf. Theory. 1996;42:99–105. doi: 10.1109/18.481781.
