Entropy. 2019 Mar 19;21(3):298. doi: 10.3390/e21030298

Guessing with Distributed Encoders

Annina Bracher 1, Amos Lapidoth 2, Christoph Pfister 2,*
PMCID: PMC7514780  PMID: 33267013

Abstract

Two correlated sources emit a pair of sequences, each of which is observed by a different encoder. Each encoder produces a rate-limited description of the sequence it observes, and the two descriptions are presented to a guessing device that repeatedly produces sequence pairs until correct. The number of guesses until correct is random, and it is required that it have a moment (of some prespecified order) that tends to one as the length of the sequences tends to infinity. The description rate pairs that allow this are characterized in terms of the Rényi entropy and the Arimoto–Rényi conditional entropy of the joint law of the sources. This solves the guessing analog of the Slepian–Wolf distributed source-coding problem. The achievability is based on random binning, which is analyzed using a technique by Rosenthal.

Keywords: Arimoto–Rényi conditional entropy, distributed source coding, guessing, Rényi entropy

1. Introduction

In the Massey–Arıkan guessing problem [1,2], a random variable $X$ is drawn from a finite set $\mathcal{X}$ according to some probability mass function (PMF) $P_X$, and it has to be determined by making guesses of the form “Is $X$ equal to $x$?” until the guess is correct. The guessing order is determined by a guessing function $G$, which is a bijective function from $\mathcal{X}$ to $\{1,\ldots,|\mathcal{X}|\}$. Guessing according to $G$ proceeds as follows: the first guess is the element $\hat{x}_1 \in \mathcal{X}$ satisfying $G(\hat{x}_1)=1$; the second guess is the element $\hat{x}_2 \in \mathcal{X}$ satisfying $G(\hat{x}_2)=2$, and so on. Consequently, $G(X)$ is the number of guesses needed to guess $X$. Arıkan [2] showed that for any $\rho>0$, the $\rho$th moment of the number of guesses required by an optimal guesser $G$ to guess $X$ is bounded by:

$\frac{1}{(1+\ln|\mathcal{X}|)^{\rho}}\, 2^{\rho H_{1/(1+\rho)}(X)} \le \mathrm{E}\!\left[G(X)^{\rho}\right] \le 2^{\rho H_{1/(1+\rho)}(X)},$ (1)

where $\ln(\cdot)$ denotes the natural logarithm, and $H_{1/(1+\rho)}(X)$ denotes the Rényi entropy of order $\frac{1}{1+\rho}$, which is defined in Section 3 ahead (refinements of (1) were recently derived in [3]).
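As a quick sanity check of (1), the following sketch computes both sides for a small example; the PMF, the choice $\rho = 1$, and all names are our own illustrative assumptions, and the guesser simply proceeds in decreasing order of probability, which is optimal [2].

import numpy as np

P = np.array([0.5, 0.25, 0.125, 0.125])   # toy PMF (assumed for illustration)
rho = 1.0
alpha = 1 / (1 + rho)

# Rényi entropy of order 1/(1+rho), cf. (19) ahead.
H = np.log2(np.sum(P ** alpha)) / (1 - alpha)

# Optimal guessing order: decreasing probability; G[i] is the rank of outcome i.
G = np.argsort(np.argsort(-P)) + 1
moment = np.sum(P * G.astype(float) ** rho)   # E[G(X)^rho]

lower = 2 ** (rho * H) / (1 + np.log(len(P))) ** rho
upper = 2 ** (rho * H)
print(lower <= moment <= upper, lower, moment, upper)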

Guessing with an encoder is depicted in Figure 1. Here, prior to guessing $X$, the guesser is provided some side information about $X$ in the form of $f(X)$, where $f\colon \mathcal{X} \to \{1,\ldots,M\}$ is a function taking on at most $M$ different values (“labels”). Accordingly, a guessing function $G(\cdot|\cdot)$ is a function from $\mathcal{X} \times \{1,\ldots,M\}$ to $\{1,\ldots,|\mathcal{X}|\}$ such that for every label $m \in \{1,\ldots,M\}$, $G(\cdot|m)\colon \mathcal{X} \to \{1,\ldots,|\mathcal{X}|\}$ is bijective. If, among all encoders, $f$ minimizes the $\rho$th moment of the number of guesses required by an optimal guesser to guess $X$ after observing $f(X)$, then [4] (Corollary 7):

$\frac{1}{(1+\ln|\mathcal{X}|)^{\rho}}\, 2^{\rho[H_{1/(1+\rho)}(X)-\log M]} \le \mathrm{E}\!\left[G(X|f(X))^{\rho}\right] \le 1 + 2^{\rho[H_{1/(1+\rho)}(X)-\log M+1]}.$ (2)

Figure 1. Guessing with an encoder $f$.

Thus, in guessing a sequence of independent and identically distributed (IID) random variables, a description rate of approximately H1/(1+ρ)(X) bits per symbol is needed to drive the ρth moment of the number of guesses to one as the sequence length tends to infinity [4,5] (see Section 2 for more related work).

In this paper, we generalize the single-encoder setting from Figure 1 to the setting with distributed encoders depicted in Figure 2, which is the analog of Slepian–Wolf coding [6] for guessing: A source generates a sequence of pairs $\{(X_i,Y_i)\}_{i=1}^{n}$ over a finite alphabet $\mathcal{X} \times \mathcal{Y}$. The sequence $X^n$ is described by one of $2^{nR_X}$ labels and the sequence $Y^n$ by one of $2^{nR_Y}$ labels using functions:

$f_n\colon \mathcal{X}^n \to \{1,\ldots,2^{nR_X}\},$ (3)
$g_n\colon \mathcal{Y}^n \to \{1,\ldots,2^{nR_Y}\},$ (4)

where $R_X \ge 0$ and $R_Y \ge 0$. Based on $f_n(X^n)$ and $g_n(Y^n)$, a guesser repeatedly produces guesses of the form $(\hat{x}^n,\hat{y}^n)$ until $(\hat{x}^n,\hat{y}^n) = (X^n,Y^n)$.

Figure 2. Guessing with distributed encoders $f_n$ and $g_n$.

For a fixed $\rho>0$, a rate pair $(R_X,R_Y) \in \mathbb{R}_{\ge 0}^{2}$ is called achievable if there exists a sequence of encoders and guessing functions $\{(f_n,g_n,G_n)\}_{n=1}^{\infty}$ such that the $\rho$th moment of the number of guesses tends to one as $n$ tends to infinity, i.e.,

$\lim_{n\to\infty} \mathrm{E}\!\left[G_n\!\left(X^n,Y^n \,\middle|\, f_n(X^n),g_n(Y^n)\right)^{\rho}\right] = 1.$ (5)

Our main contribution is Theorem 1, which characterizes the achievable rate pairs. For a fixed $\rho>0$, let the region R(ρ) comprise all rate pairs $(R_X,R_Y) \in \mathbb{R}_{\ge 0}^{2}$ satisfying the following inequalities simultaneously:

$R_X \ge \limsup_{n\to\infty} \frac{H_{\tilde{\rho}}(X^n|Y^n)}{n},$ (6)
$R_Y \ge \limsup_{n\to\infty} \frac{H_{\tilde{\rho}}(Y^n|X^n)}{n},$ (7)
$R_X + R_Y \ge \limsup_{n\to\infty} \frac{H_{\tilde{\rho}}(X^n,Y^n)}{n},$ (8)

where the Rényi entropy Hα(·) and the Arimoto–Rényi conditional entropy Hα(·|·) of order α are both defined in Section 3 ahead, and throughout the paper,

$\tilde{\rho} \triangleq \frac{1}{1+\rho}.$ (9)

Theorem 1.

For any $\rho>0$, all rate pairs in the interior of R(ρ) are achievable, while those outside R(ρ) are not. If $\{(X_i,Y_i)\}_{i=1}^{\infty}$ are IID according to $P_{XY}$, then (6)–(8) reduce to:

$R_X \ge H_{\tilde{\rho}}(X|Y),$ (10)
$R_Y \ge H_{\tilde{\rho}}(Y|X),$ (11)
$R_X + R_Y \ge H_{\tilde{\rho}}(X,Y).$ (12)

Proof. 

The converse follows from Corollary 1 in Section 4; the achievability follows from Corollary 2 in Section 5; and the reduction of (6)–(8) to (10)–(12) in the IID case follows from (19) and (20) ahead. ☐
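For completeness, here is the short computation behind the reduction of (8) to (12) in the IID case (a worked consequence of definition (19) ahead, not an additional claim of the paper); the constraints (6) and (7) reduce to (10) and (11) analogously, with the conditional entropy (20) in place of (19):

\begin{align*}
H_{\tilde{\rho}}(X^n,Y^n)
&= \frac{1}{1-\tilde{\rho}} \log \sum_{x^n,y^n} \prod_{i=1}^{n} P_{XY}(x_i,y_i)^{\tilde{\rho}} \\
&= \frac{1}{1-\tilde{\rho}} \log \Bigl(\sum_{x,y} P_{XY}(x,y)^{\tilde{\rho}}\Bigr)^{n}
= n\, H_{\tilde{\rho}}(X,Y),
\end{align*}

so $\limsup_{n\to\infty} H_{\tilde{\rho}}(X^n,Y^n)/n = H_{\tilde{\rho}}(X,Y)$.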

The rate region defined by (10)–(12) resembles the rate region of Slepian–Wolf coding [6] (Theorem 15.4.1); the difference is that the Shannon entropy and conditional entropy are replaced by their Rényi counterparts. The rate regions are related as follows:

Remark 1.

For memoryless sources and ρ>0, the region R(ρ) is contained in the Slepian–Wolf region. Typically, the containment is strict.

Proof. 

The containment follows from the monotonicity of the Arimoto–Rényi conditional entropy: (9) implies that $\tilde{\rho} \in (0,1)$, so, by [7] (Proposition 5), $H_{\tilde{\rho}}(X|Y) \ge H(X|Y)$, $H_{\tilde{\rho}}(Y|X) \ge H(Y|X)$, and $H_{\tilde{\rho}}(X,Y) \ge H(X,Y)$. As for the strict containment, first note that the Slepian–Wolf region contains at least one rate pair $(R_X,R_Y)$ satisfying $R_X + R_Y = H(X,Y)$. Consequently, if $H_{\tilde{\rho}}(X,Y) > H(X,Y)$, then the containment is strict. Because $H_{\tilde{\rho}}(X,Y) > H(X,Y)$ unless $(X,Y)$ is distributed uniformly over its support [8], the containment is typically strict.

The claim can also be shown operationally: The probability of error is equal to the probability that more than one guess is needed, and for every ρ>0,

$\Pr\!\left[G_n\!\left(X^n,Y^n \,\middle|\, f_n(X^n),g_n(Y^n)\right) \ge 2\right] = \Pr\!\left[G_n\!\left(X^n,Y^n \,\middle|\, f_n(X^n),g_n(Y^n)\right)^{\rho} - 1 \ge 2^{\rho} - 1\right]$ (13)
$\le \frac{\mathrm{E}\!\left[G_n\!\left(X^n,Y^n \,\middle|\, f_n(X^n),g_n(Y^n)\right)^{\rho}\right] - 1}{2^{\rho} - 1},$ (14)

where (14) follows from Markov’s inequality. Thus, the probability of error tends to zero if the ρth moment of the number of guesses tends to one.  ☐

Despite the resemblance between (10)–(12) and the Slepian–Wolf region, there is an important difference: while Slepian–Wolf coding allows separate encoding with the same sum rate as with joint encoding, this is not necessarily true in our setting:

Remark 2.

Although the sum rate constraint (12) is the same as in single-source guessing [5], separate encoding of Xn and Yn may require a larger sum rate than joint encoding of Xn and Yn.

Proof. 

If $H_{\tilde{\rho}}(X|Y) + H_{\tilde{\rho}}(Y|X) > H_{\tilde{\rho}}(X,Y)$, then (10) and (11) together impose a stronger constraint on the sum rate than (12). For example, if:

$P_{XY}(x,y)$:
             y=0      y=1
    x=0      0.65     0.17
    x=1      0.17     0.01

and $\rho=1$, then $H_{1/2}(X|Y) + H_{1/2}(Y|X) \approx 1.61$ bits, so separate (distributed) encoding requires a sum rate exceeding 1.61 bits as opposed to joint encoding, which is possible with $H_{1/2}(X,Y) \approx 1.58$ bits (in Slepian–Wolf coding, this cannot happen because $H(X,Y) - H(X|Y) - H(Y|X) = I(X;Y) \ge 0$).  ☐
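The two figures quoted in the proof can be reproduced by evaluating definitions (19) and (20) ahead for the example PMF; the following minimal sketch (with function names of our own choosing) does exactly that.

import numpy as np

# Joint PMF of the example; rows are indexed by x, columns by y.
P = np.array([[0.65, 0.17],
              [0.17, 0.01]])
alpha = 0.5   # rho = 1, so the relevant order is rho~ = 1/(1+rho) = 1/2

def renyi_entropy(pmf, a):
    # Rényi entropy of order a in bits, cf. (19).
    return np.log2(np.sum(pmf ** a)) / (1 - a)

def arimoto_conditional(Pxy, a, condition_on_y):
    # Arimoto-Rényi conditional entropy of order a in bits, cf. (20):
    # H_a(X|Y) sums over x inside (axis 0); H_a(Y|X) sums over y inside (axis 1).
    inner = np.sum(Pxy ** a, axis=0 if condition_on_y else 1) ** (1 / a)
    return (a / (1 - a)) * np.log2(np.sum(inner))

print(renyi_entropy(P, alpha))                 # H_1/2(X,Y): about 1.58 bits
print(arimoto_conditional(P, alpha, True)
      + arimoto_conditional(P, alpha, False))  # H_1/2(X|Y) + H_1/2(Y|X): about 1.61 bits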

The guessing problem is related to the task-encoding problem, where based on fn(Xn) and gn(Yn), the decoder outputs a list that is guaranteed to contain (Xn,Yn), and the ρth moment of the list size is required to tend to one as n tends to infinity. While, in the single-source setting, the guessing problem and the task-encoding problem have the same asymptotics [4], this is not the case in the distributed setting:

Remark 3.

For memoryless sources, the task-encoding region from [9] is strictly smaller than the guessing region R(ρ) unless X and Y are independent.

Proof. 

In the IID case, the task-encoding region is the set of all rate pairs $(R_X,R_Y) \in \mathbb{R}_{\ge 0}^{2}$ satisfying the following inequalities [9] (Theorem 1):

$R_X \ge H_{\tilde{\rho}}(X),$ (15)
$R_Y \ge H_{\tilde{\rho}}(Y),$ (16)
$R_X + R_Y \ge H_{\tilde{\rho}}(X,Y) + K_{\tilde{\rho}}(X;Y),$ (17)

where $K_{\alpha}(X;Y)$ is a Rényi measure of dependence studied in [10] (when $\alpha$ is one, $K_{\alpha}(X;Y)$ is the mutual information). The claim now follows from the following observations: By [7] (Theorem 2), $H_{\tilde{\rho}}(X) \ge H_{\tilde{\rho}}(X|Y)$ with equality if and only if X and Y are independent; similarly, $H_{\tilde{\rho}}(Y) \ge H_{\tilde{\rho}}(Y|X)$ with equality if and only if X and Y are independent; and by [10] (Theorem 2), $K_{\tilde{\rho}}(X;Y) \ge 0$ with equality if and only if X and Y are independent.  ☐

The rest of this paper is structured as follows: in Section 2, we review other guessing settings; in Section 3, we recall the Rényi information measures and prove some auxiliary lemmas; in Section 4, we prove the converse theorem; and in Section 5, we prove the achievability theorem, which is based on random binning and, in the case ρ>1, is analyzed using a technique by Rosenthal [11].

2. Related Work

Tighter versions of (1) can be found in [3,12]. The large deviation behavior of guessing was studied in [13,14]. The relation between guessing and variable-length lossless source coding was explored in [3,15,16].

Mismatched guessing, where the assumed distribution of X does not match its actual distribution, was studied in [17], along with guessing under source uncertainty, where the PMF of X belongs to some known set, and a guesser was sought with good worst-case performance over that set. Guessing subject to distortion, where instead of guessing X, it suffices to guess an X^ that is close to X according to some distortion measure, was treated in [18].

If the guesser observes some side information Y, then the ρth moment of the number of guesses required by an optimal guesser is bounded by [2]:

$\frac{1}{(1+\ln|\mathcal{X}|)^{\rho}}\, 2^{\rho H_{\tilde{\rho}}(X|Y)} \le \mathrm{E}\!\left[G(X|Y)^{\rho}\right] \le 2^{\rho H_{\tilde{\rho}}(X|Y)},$ (18)

where $H_{\tilde{\rho}}(X|Y)$ denotes the Arimoto–Rényi conditional entropy of order $\tilde{\rho} = \frac{1}{1+\rho}$, which is defined in Section 3 ahead (refinements of (18) were recently derived in [3]). Guessing is related to the cutoff rate of a discrete memoryless channel, which is the supremum over all rates for which the $\rho$th moment of the number of guesses needed by the decoder to guess the message can be driven to one as the block length tends to infinity. In [2,19], the cutoff rate was expressed in terms of Gallager's $E_0$ function [20]. Joint source-channel guessing was considered in [21].

Guessing with an encoder, i.e., the situation where the side information can be chosen, was studied in [4], where it was also shown that guessing and task encoding [22] have the same asymptotics. With distributed encoders, however, task encoding [9] and guessing no longer have the same asymptotics; see Remark 3. Lower and upper bounds for guessing with a helper, i.e., an encoder that does not observe X, but has access to a random variable that is correlated with X, can be found in [5].

3. Preliminaries

Throughout the paper, $\log(\cdot)$ denotes the base-two logarithm. When clear from the context, we often omit sets and subscripts; for example, we write $\sum_{x}$ for $\sum_{x \in \mathcal{X}}$ and $P(x)$ for $P_X(x)$. The Rényi entropy [23] of order $\alpha$ is defined for positive $\alpha$ other than one as:

$H_{\alpha}(X) \triangleq \frac{1}{1-\alpha} \log \sum_{x} P(x)^{\alpha}.$ (19)

In the limit as $\alpha$ tends to one, the Shannon entropy is recovered, i.e., $\lim_{\alpha\to 1} H_{\alpha}(X) = H(X)$. The Arimoto–Rényi conditional entropy [24] of order $\alpha$ is defined for positive $\alpha$ other than one as:

$H_{\alpha}(X|Y) \triangleq \frac{\alpha}{1-\alpha} \log \sum_{y} \Bigl( \sum_{x} P(x,y)^{\alpha} \Bigr)^{\frac{1}{\alpha}}.$ (20)

In the limit as $\alpha$ tends to one, the Shannon conditional entropy is recovered, i.e., $\lim_{\alpha\to 1} H_{\alpha}(X|Y) = H(X|Y)$. The properties of the Arimoto–Rényi conditional entropy were studied in [7,24,25].
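The limiting statement for (19) can be verified directly; the short calculation below is a standard application of L'Hôpital's rule, added here only for completeness:

\begin{align*}
\lim_{\alpha\to 1} H_{\alpha}(X)
= \lim_{\alpha\to 1} \frac{\log \sum_{x} P(x)^{\alpha}}{1-\alpha}
= \lim_{\alpha\to 1} \frac{\sum_{x} P(x)^{\alpha} \ln P(x)}{-\ln 2\, \sum_{x} P(x)^{\alpha}}
= -\sum_{x} P(x) \log P(x) = H(X),
\end{align*}

where the second equality is L'Hôpital's rule (numerator and denominator both vanish at $\alpha = 1$) and the sums are restricted to the support of $P$.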

In the rest of this section, we recall some properties of the Arimoto–Rényi conditional entropy that will be used in Section 4 (Lemmas 1–3), and we prove auxiliary results for Section 5 (Lemmas 4–7).

Lemma 1 

([7], Theorem 2). Let α>0, and let PXYZ be a PMF over the finite set X×Y×Z. Then,

$H_{\alpha}(X|Y,Z) \le H_{\alpha}(X|Z)$ (21)

with equality if and only if $X \to Z \to Y$ form a Markov chain.

Lemma 2 

([7], Proposition 4). Let α>0, and let PXYZ be a PMF over the finite set X×Y×Z. Then,

$H_{\alpha}(X,Y|Z) \ge H_{\alpha}(X|Z)$ (22)

with equality if and only if Y is uniquely determined by X and Z.

Lemma 3 

([7], Theorem 3). Let α>0, and let PXYZ be a PMF over the finite set X×Y×Z. Then,

$H_{\alpha}(X|Y,Z) \ge H_{\alpha}(X|Z) - \log|\mathcal{Y}|.$ (23)

Lemma 4 

([20], Problem 4.15(f)). Let $\mathcal{Y}$ be a finite set, and let $f\colon \mathcal{Y} \to \mathbb{R}_{\ge 0}$. Then, for all $p \in (0,1]$,

$\Bigl(\sum_{y} f(y)\Bigr)^{p} \le \sum_{y} f(y)^{p}.$ (24)

Proof. 

If $\sum_{y} f(y) = 0$, then (24) holds because the left-hand side (LHS) and the right-hand side (RHS) are both zero. If $\sum_{y} f(y) > 0$, then:

$\sum_{y} f(y)^{p} = \Bigl(\sum_{y'} f(y')\Bigr)^{p} \sum_{y} \Bigl(\frac{f(y)}{\sum_{y'} f(y')}\Bigr)^{p}$ (25)
$\ge \Bigl(\sum_{y'} f(y')\Bigr)^{p} \sum_{y} \frac{f(y)}{\sum_{y'} f(y')}$ (26)
$= \Bigl(\sum_{y} f(y)\Bigr)^{p},$ (27)

where (26) holds because $p \in (0,1]$ and $f(y)/\sum_{y'} f(y') \in [0,1]$ for every $y \in \mathcal{Y}$.  ☐

Lemma 5.

Let a, b, and c be nonnegative integers. Then, for all p>0,

$(1+a+b+c)^{p} \le 1 + 4^{p}(a^{p}+b^{p}+c^{p})$ (28)

(the restriction to integers cannot be omitted; for example, (28) does not hold if a=b=c=0.1 and p=2).

Proof. 

If $p \in (0,1]$, then (28) follows from Lemma 4 because $4^{p} \ge 1$. If $p>1$, then the cases with $a+b+c \in \{0,1,2\}$ can be checked individually. For $a+b+c \ge 3$,

$(1+a+b+c)^{p} = \Bigl(\frac{3}{a+b+c} + 3\Bigr)^{p} \cdot \Bigl(\frac{a+b+c}{3}\Bigr)^{p}$ (29)
$\le 4^{p} \cdot \Bigl(\frac{a+b+c}{3}\Bigr)^{p}$ (30)
$\le 4^{p} \cdot \frac{a^{p}+b^{p}+c^{p}}{3}$ (31)
$\le 1 + 4^{p}(a^{p}+b^{p}+c^{p}),$ (32)

where (30) holds because $a+b+c \ge 3$, and (31) follows from Jensen's inequality because $z \mapsto z^{p}$ is convex on $\mathbb{R}_{\ge 0}$ since $p>1$.  ☐
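A brute-force check of (28), together with the non-integer counterexample mentioned after the lemma, can be run in a few lines; the exponent and the tested ranges are arbitrary choices of ours.

import itertools

p = 2.0
# (28) for nonnegative integers a, b, c (tested here up to 10).
holds = all((1 + a + b + c) ** p <= 1 + 4 ** p * (a ** p + b ** p + c ** p)
            for a, b, c in itertools.product(range(11), repeat=3))
print(holds)   # True

# The counterexample from the remark: a = b = c = 0.1 and p = 2 violates (28).
a = b = c = 0.1
print((1 + a + b + c) ** p, 1 + 4 ** p * (a ** p + b ** p + c ** p))   # 1.69 vs. 1.48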

Lemma 6.

Let a, b, c, and d be nonnegative real numbers. Then, for all p>0,

$(a+b+c+d)^{p} \le 4^{p}(a^{p}+b^{p}+c^{p}+d^{p}).$ (33)

Proof. 

If $p \in (0,1]$, then (33) follows from Lemma 4 because $4^{p} \ge 1$. If $p>1$, then:

$(a+b+c+d)^{p} = 4^{p} \cdot \Bigl(\frac{a+b+c+d}{4}\Bigr)^{p}$ (34)
$\le 4^{p} \cdot \frac{a^{p}+b^{p}+c^{p}+d^{p}}{4}$ (35)
$\le 4^{p}(a^{p}+b^{p}+c^{p}+d^{p}),$ (36)

where (35) follows from Jensen's inequality because $z \mapsto z^{p}$ is convex on $\mathbb{R}_{\ge 0}$ since $p>1$. ☐

Lemma 7 

(Rosenthal). Let $p>1$, and let $X_1,\ldots,X_n$ be independent random variables that are either zero or one. Then, $X \triangleq \sum_{i=1}^{n} X_i$ satisfies:

$\mathrm{E}[X^{p}] \le 2^{p^{2}} \max\{\mathrm{E}[X],\ \mathrm{E}[X]^{p}\}.$ (37)

Proof. 

This is a special case of [11] (Lemma 1). For convenience, we also provide a self-contained proof:

$\mathrm{E}[X^{p}] = \mathrm{E}\Bigl[\sum_{i \in \{1,\ldots,n\}} X_i \cdot \Bigl(\sum_{j \in \{1,\ldots,n\}} X_j\Bigr)^{p-1}\Bigr]$ (38)
$= \mathrm{E}\Bigl[\sum_{i \in \{1,\ldots,n\}} X_i \cdot \Bigl(1 + \sum_{j \in \{1,\ldots,n\} \setminus \{i\}} X_j\Bigr)^{p-1}\Bigr]$ (39)
$= \sum_{i \in \{1,\ldots,n\}} \mathrm{E}\Bigl[X_i \cdot \Bigl(1 + \sum_{j \in \{1,\ldots,n\} \setminus \{i\}} X_j\Bigr)^{p-1}\Bigr]$ (40)
$= \sum_{i \in \{1,\ldots,n\}} \mathrm{E}[X_i] \cdot \mathrm{E}\Bigl[\Bigl(1 + \sum_{j \in \{1,\ldots,n\} \setminus \{i\}} X_j\Bigr)^{p-1}\Bigr]$ (41)
$\le \sum_{i \in \{1,\ldots,n\}} \mathrm{E}[X_i] \cdot \mathrm{E}\Bigl[\Bigl(1 + \sum_{j \in \{1,\ldots,n\}} X_j\Bigr)^{p-1}\Bigr]$ (42)
$= \mathrm{E}[X] \cdot \mathrm{E}\bigl[(1+X)^{p-1}\bigr]$ (43)
$\le \mathrm{E}[X] \cdot 2^{p-1} \cdot \bigl(1 + \mathrm{E}[X^{p-1}]\bigr)$ (44)
$= 2^{p-1}\bigl(\mathrm{E}[X] + \mathrm{E}[X]\,\mathrm{E}[X^{p-1}]\bigr)$ (45)
$\le 2^{p-1}\Bigl(\mathrm{E}[X] + \mathrm{E}[X]\,\mathrm{E}[X^{p}]^{\frac{p-1}{p}}\Bigr)$ (46)
$\le 2^{p} \max\Bigl\{\mathrm{E}[X],\ \mathrm{E}[X]\,\mathrm{E}[X^{p}]^{\frac{p-1}{p}}\Bigr\},$ (47)

where (39) holds because each $X_i$ is either zero or one; (41) holds because $X_1,\ldots,X_n$ are independent; (42) holds because $z \mapsto z^{p-1}$ is increasing on $\mathbb{R}_{\ge 0}$ for $p>1$; (44) holds because for real numbers $a \ge 0$, $b \ge 0$, and $r>0$, we have $(a+b)^{r} \le (2\max\{a,b\})^{r} = 2^{r}\max\{a^{r},b^{r}\} \le 2^{r}(a^{r}+b^{r})$; and (46) follows from Jensen's inequality because $z \mapsto z^{(p-1)/p}$ is concave on $\mathbb{R}_{\ge 0}$ for $p>1$.

We now consider two cases depending on which term on the RHS of (47) achieves the maximum: If the maximum is achieved by $\mathrm{E}[X]$, then $\mathrm{E}[X^{p}] \le 2^{p}\,\mathrm{E}[X]$, which implies (37) because $2^{p} \le 2^{p^{2}}$ since $p>1$. If the maximum is achieved by $\mathrm{E}[X]\,\mathrm{E}[X^{p}]^{(p-1)/p}$, then:

$\mathrm{E}[X^{p}] \le 2^{p}\,\mathrm{E}[X]\,\mathrm{E}[X^{p}]^{\frac{p-1}{p}}.$ (48)

Rearranging (48), we obtain:

$\mathrm{E}[X^{p}] \le 2^{p^{2}}\,\mathrm{E}[X]^{p},$ (49)

so (37) holds also in this case. ☐
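A small Monte-Carlo illustration of (37) for a sum of independent {0,1}-valued random variables; the Bernoulli parameters, the order p, and the sample size below are arbitrary choices for this sketch.

import numpy as np

rng = np.random.default_rng(0)
p = 2.5
q = rng.uniform(0.0, 0.2, size=40)         # success probabilities of X_1, ..., X_n

# Sample X = X_1 + ... + X_n repeatedly and estimate E[X^p].
samples = (rng.uniform(size=(200_000, q.size)) < q).sum(axis=1)
lhs = np.mean(samples.astype(float) ** p)  # Monte-Carlo estimate of E[X^p]

EX = q.sum()                               # E[X] is the sum of the parameters
rhs = 2 ** (p * p) * max(EX, EX ** p)      # right-hand side of (37)
print(lhs, "<=", rhs, lhs <= rhs)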

4. Converse

In this section, we prove a nonasymptotic and an asymptotic converse result (Theorem 2 and Corollary 1, respectively).

Theorem 2.

Let $U \to X \to Y \to V$ form a Markov chain over the finite set $\mathcal{U} \times \mathcal{X} \times \mathcal{Y} \times \mathcal{V}$, and let $\tau \triangleq 1 + \ln|\mathcal{X} \times \mathcal{Y}|$. Then, for every $\rho>0$ and for every guesser, the $\rho$th moment of the number of guesses it takes to guess the pair $(X,Y)$ based on the side information $(U,V)$ satisfies:

$\mathrm{E}\bigl[G(X,Y|U,V)^{\rho}\bigr] \ge \max\Bigl\{2^{\rho(H_{\tilde{\rho}}(X|Y) - \log|\mathcal{U}| - \log\tau)},\ 2^{\rho(H_{\tilde{\rho}}(Y|X) - \log|\mathcal{V}| - \log\tau)},\ 2^{\rho(H_{\tilde{\rho}}(X,Y) - \log|\mathcal{U} \times \mathcal{V}| - \log\tau)}\Bigr\}.$ (50)

Proof. 

We view (50) as three lower bounds corresponding to the three terms in the maximization on its RHS. The lower bound involving Hρ˜(X,Y) holds because:

$\mathrm{E}\bigl[G(X,Y|U,V)^{\rho}\bigr] \ge 2^{\rho(H_{\tilde{\rho}}(X,Y|U,V) - \log\tau)}$ (51)
$\ge 2^{\rho(H_{\tilde{\rho}}(X,Y) - \log|\mathcal{U} \times \mathcal{V}| - \log\tau)},$ (52)

where (51) follows from (18) and (52) follows from Lemma 3. The lower bound involving Hρ˜(X|Y) holds because:

$\mathrm{E}\bigl[G(X,Y|U,V)^{\rho}\bigr] \ge 2^{\rho(H_{\tilde{\rho}}(X,Y|U,V) - \log\tau)}$ (53)
$\ge 2^{\rho(H_{\tilde{\rho}}(X,Y|U,V,Y) - \log\tau)}$ (54)
$= 2^{\rho(H_{\tilde{\rho}}(X|U,V,Y) - \log\tau)}$ (55)
$= 2^{\rho(H_{\tilde{\rho}}(X|U,Y) - \log\tau)}$ (56)
$\ge 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log|\mathcal{U}| - \log\tau)},$ (57)

where (53) follows from (18); (54) follows from Lemma 1; (55) follows from Lemma 2; (56) follows from Lemma 1 because $X \to (U,Y) \to V$ form a Markov chain; and (57) follows from Lemma 3. The lower bound involving $H_{\tilde{\rho}}(Y|X)$ is analogous to the one with $H_{\tilde{\rho}}(X|Y)$. ☐

Corollary 1.

For any ρ>0, rate pairs outside R(ρ) are not achievable.

Proof. 

We first show that (8) is necessary for a rate pair $(R_X,R_Y) \in \mathbb{R}_{\ge 0}^{2}$ to be achievable. Indeed, if (8) does not hold, then there exists an $\epsilon > 0$ such that for infinitely many $n$,

$\frac{H_{\tilde{\rho}}(X^n,Y^n)}{n} \ge R_X + R_Y + \epsilon.$ (58)

Using Theorem 2 with $\mathcal{X} \leftarrow \mathcal{X}^n$, $\mathcal{Y} \leftarrow \mathcal{Y}^n$, $\mathcal{U} \leftarrow \{1,\ldots,2^{nR_X}\}$, $\mathcal{V} \leftarrow \{1,\ldots,2^{nR_Y}\}$, $P_{XY} \leftarrow P_{X^nY^n}$, $U \leftarrow f_n(X^n)$, $V \leftarrow g_n(Y^n)$, and $\tau_n = 1 + n\ln|\mathcal{X} \times \mathcal{Y}|$ leads to:

$\mathrm{E}\bigl[G(X^n,Y^n|U,V)^{\rho}\bigr] \ge 2^{\rho(H_{\tilde{\rho}}(X^n,Y^n) - \log|\mathcal{U} \times \mathcal{V}| - \log\tau_n)}$ (59)
$= 2^{\rho n\left(\frac{1}{n}H_{\tilde{\rho}}(X^n,Y^n) - R_X - R_Y - \frac{1}{n}\log\tau_n\right)}.$ (60)

It follows from (60), (58), and the fact that $\frac{1}{n}\log\tau_n$ tends to zero as $n$ tends to infinity that the LHS of (59) cannot tend to one as $n$ tends to infinity, so $(R_X,R_Y)$ is not achievable if (8) does not hold. The necessity of (6) and (7) can be shown in the same way. ☐

5. Achievability

In this section, we prove a nonasymptotic and an asymptotic achievability result (Theorem 3 and Corollary 2, respectively).

Theorem 3.

Let $\mathcal{X}$, $\mathcal{Y}$, $\mathcal{U}$, and $\mathcal{V}$ be finite nonempty sets; let $P_{XY}$ be a PMF over $\mathcal{X} \times \mathcal{Y}$; let $\rho>0$; and let $\epsilon>0$ be such that:

$\log|\mathcal{U}| \ge H_{\tilde{\rho}}(X|Y) + \epsilon,$ (61)
$\log|\mathcal{V}| \ge H_{\tilde{\rho}}(Y|X) + \epsilon,$ (62)
$\log|\mathcal{U} \times \mathcal{V}| \ge H_{\tilde{\rho}}(X,Y) + \epsilon.$ (63)

Then, there exist functions $f\colon \mathcal{X} \to \mathcal{U}$ and $g\colon \mathcal{Y} \to \mathcal{V}$ and a guesser such that the $\rho$th moment of the number of guesses needed to guess the pair $(X,Y)$ based on the side information $(f(X),g(Y))$ satisfies:

$\mathrm{E}\bigl[G(X,Y|f(X),g(Y))^{\rho}\bigr] \le \begin{cases} 1 + 4^{\rho+1} \cdot 2^{-\rho\epsilon} & \text{if } \rho \in (0,1], \\ 1 + 4^{(\rho+1)^{2}} \cdot 2^{-\epsilon} & \text{if } \rho > 1. \end{cases}$ (64)

Proof. 

Our achievability result relies on random binning: we map each $x \in \mathcal{X}$ uniformly at random to some $u \in \mathcal{U}$ and each $y \in \mathcal{Y}$ uniformly at random to some $v \in \mathcal{V}$. We then show that the $\rho$th moment of the number of guesses averaged over all such mappings $f\colon \mathcal{X} \to \mathcal{U}$ and $g\colon \mathcal{Y} \to \mathcal{V}$ is upper bounded by the RHS of (64). From this, we conclude that there exist $f$ and $g$ that satisfy (64).
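To make the scheme concrete, the following sketch simulates random binning followed by guessing in decreasing order of probability for an IID source; the PMF (reused from Remark 2), the block length, the rates, and the seed are our own illustrative choices and are not part of the proof.

import itertools
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.65, 0.17],        # joint PMF from Remark 2 (rows: x, columns: y)
              [0.17, 0.01]])
rho, n = 1.0, 6                    # moment order and block length (kept small)
RX = RY = 0.9                      # rates in bits per symbol, inside the region for rho = 1
MU, MV = int(2 ** (n * RX)) + 1, int(2 ** (n * RY)) + 1   # numbers of labels

xs = list(itertools.product(range(2), repeat=n))
ys = list(itertools.product(range(2), repeat=n))
f = {x: rng.integers(MU) for x in xs}   # random binning of the x-sequences
g = {y: rng.integers(MV) for y in ys}   # random binning of the y-sequences

def prob(x, y):                         # probability of (x^n, y^n) under the IID source
    return float(np.prod([P[a, b] for a, b in zip(x, y)]))

# The guesser knows the labels (u, v): within each label pair, it guesses the
# sequence pairs in decreasing order of probability.
groups = {}
for x in xs:
    for y in ys:
        groups.setdefault((f[x], g[y]), []).append((prob(x, y), x, y))

moment = 0.0
for pairs in groups.values():
    pairs.sort(key=lambda t: -t[0])
    for rank, (pr, _, _) in enumerate(pairs, start=1):
        moment += pr * rank ** rho
print(moment)   # should approach 1 as n grows for rate pairs inside the region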

Let the guessing function $G$ correspond to guessing in decreasing order of probability [2] (ties can be resolved arbitrarily). Let $f$ and $g$ be distributed as described above, and denote by $\mathrm{E}_{f,g}[\cdot]$ the expectation with respect to $f$ and $g$. Then,

$\mathrm{E}_{f,g}\bigl[\mathrm{E}[G(X,Y|f(X),g(Y))^{\rho}]\bigr] = \sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}\bigl[G(x,y|f(x),g(y))^{\rho}\bigr]$ (65)
$\le \sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}\Bigl[\Bigl(\sum_{x',y'} \psi(x',y')\, \phi_f(x')\, \phi_g(y')\Bigr)^{\rho}\Bigr]$ (66)
$= \sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}\bigl[(1 + \beta_1 + \beta_2 + \beta_3)^{\rho}\bigr]$ (67)
$\le 1 + 4^{\rho} \sum_{x,y} P(x,y)\, \bigl(\mathrm{E}_{f,g}[\beta_1^{\rho}] + \mathrm{E}_{f,g}[\beta_2^{\rho}] + \mathrm{E}_{f,g}[\beta_3^{\rho}]\bigr)$ (68)

with:

$\psi(x',y') = \psi(x',y',x,y) \triangleq \mathbb{1}\{P(x',y') \ge P(x,y)\},$ (69)
$\phi_f(x') = \phi_f(x',x) \triangleq \mathbb{1}\{f(x') = f(x)\},$ (70)
$\phi_g(y') = \phi_g(y',y) \triangleq \mathbb{1}\{g(y') = g(y)\},$ (71)
$\beta_1 = \beta_1(x,y,f) \triangleq \sum_{x' \neq x} \psi(x',y)\, \phi_f(x'),$ (72)
$\beta_2 = \beta_2(x,y,g) \triangleq \sum_{y' \neq y} \psi(x,y')\, \phi_g(y'),$ (73)
$\beta_3 = \beta_3(x,y,f,g) \triangleq \sum_{x' \neq x,\, y' \neq y} \psi(x',y')\, \phi_f(x')\, \phi_g(y'),$ (74)

where $\mathbb{1}\{\cdot\}$ is the indicator function that is one if the condition comprising its argument is true and zero otherwise; (65) holds because $(f,g)$ and $(X,Y)$ are independent; (66) holds because the number of guesses is upper bounded by the number of pairs $(x',y')$ that are at least as likely as $(x,y)$ and that are mapped to the same labels $(u,v)$ as $(x,y)$; (67) follows from splitting the sum depending on whether $x' = x$ or not and whether $y' = y$ or not and from the fact that $\psi(x,y) = \phi_f(x) = \phi_g(y) = 1$; and (68) follows from Lemma 5 because $\beta_1$, $\beta_2$, and $\beta_3$ are nonnegative integers. As indicated in (69)–(74), the dependence of $\psi$, $\phi_f$, $\phi_g$, $\beta_1$, $\beta_2$, and $\beta_3$ on $x$, $y$, $f$, and $g$ is implicit in our notation.

We first treat the case $\rho \in (0,1]$. We bound the terms on the RHS of (68) as follows:

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_1^{\rho}] \le \sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_1]^{\rho}$ (75)
$= \sum_{x,y} P(x,y) \Bigl(\sum_{x' \neq x} \psi(x',y)\, \frac{1}{|\mathcal{U}|}\Bigr)^{\rho}$ (76)
$\le \sum_{x,y} P(x,y) \Bigl(\sum_{x'} \Bigl[\frac{P(x',y)}{P(x,y)}\Bigr]^{\tilde{\rho}}\, \frac{1}{|\mathcal{U}|}\Bigr)^{\rho}$ (77)
$= \frac{1}{|\mathcal{U}|^{\rho}} \sum_{x,y} P(x,y)^{\tilde{\rho}} \Bigl(\sum_{x'} P(x',y)^{\tilde{\rho}}\Bigr)^{\rho}$ (78)
$= \frac{1}{|\mathcal{U}|^{\rho}} \sum_{y} \sum_{x} P(x,y)^{\tilde{\rho}} \Bigl(\sum_{x'} P(x',y)^{\tilde{\rho}}\Bigr)^{\rho}$ (79)
$= \frac{1}{|\mathcal{U}|^{\rho}} \sum_{y} \Bigl(\sum_{x} P(x,y)^{\tilde{\rho}}\Bigr)^{1+\rho}$ (80)
$= 2^{\rho(H_{\tilde{\rho}}(X|Y) - \log|\mathcal{U}|)}$ (81)
$\le 2^{-\rho\epsilon},$ (82)

where (75) follows from Jensen's inequality because $z \mapsto z^{\rho}$ is concave on $\mathbb{R}_{\ge 0}$ since $\rho \in (0,1]$; (76) holds because the expectation operator is linear and because $\mathrm{E}_{f,g}[\phi_f(x')] = 1/|\mathcal{U}|$ since $x' \neq x$; in (77), we extended the inner summation and used that $\psi(x',y) \le [P(x',y)/P(x,y)]^{\tilde{\rho}}$; and (82) follows from (61). In the same way, we obtain:

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_2^{\rho}] \le 2^{-\rho\epsilon}.$ (83)

Similarly,

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_3^{\rho}] \le \sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_3]^{\rho}$ (84)
$= \sum_{x,y} P(x,y) \Bigl(\sum_{x' \neq x,\, y' \neq y} \psi(x',y')\, \frac{1}{|\mathcal{U} \times \mathcal{V}|}\Bigr)^{\rho}$ (85)
$\le \sum_{x,y} P(x,y) \Bigl(\sum_{x',y'} \Bigl[\frac{P(x',y')}{P(x,y)}\Bigr]^{\tilde{\rho}}\, \frac{1}{|\mathcal{U} \times \mathcal{V}|}\Bigr)^{\rho}$ (86)
$= \frac{1}{|\mathcal{U} \times \mathcal{V}|^{\rho}} \sum_{x,y} P(x,y)^{\tilde{\rho}} \Bigl(\sum_{x',y'} P(x',y')^{\tilde{\rho}}\Bigr)^{\rho}$ (87)
$= \frac{1}{|\mathcal{U} \times \mathcal{V}|^{\rho}} \Bigl(\sum_{x,y} P(x,y)^{\tilde{\rho}}\Bigr)^{1+\rho}$ (88)
$= 2^{\rho(H_{\tilde{\rho}}(X,Y) - \log|\mathcal{U} \times \mathcal{V}|)}$ (89)
$\le 2^{-\rho\epsilon}.$ (90)

From (68), (82), (83), and (90), we obtain:

$\mathrm{E}_{f,g}\bigl[\mathrm{E}[G(X,Y|f(X),g(Y))^{\rho}]\bigr] \le 1 + 3 \cdot 4^{\rho} \cdot 2^{-\rho\epsilon}$ (91)
$\le 1 + 4^{\rho+1} \cdot 2^{-\rho\epsilon}$ (92)

and hence infer the existence of $f\colon \mathcal{X} \to \mathcal{U}$ and $g\colon \mathcal{Y} \to \mathcal{V}$ satisfying (64).

We now consider (68) when $\rho > 1$. Unlike in the case $\rho \in (0,1]$, we cannot use Jensen's inequality as we did in (75). Instead, for fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, we upper-bound the first expectation on the RHS of (68) by:

$\mathrm{E}_{f,g}[\beta_1^{\rho}] \le 2^{\rho^{2}} \max\bigl\{\mathrm{E}_{f,g}[\beta_1],\ \mathrm{E}_{f,g}[\beta_1]^{\rho}\bigr\}$ (93)
$\le 2^{\rho^{2}} \bigl(\mathrm{E}_{f,g}[\beta_1]^{\rho} + \mathrm{E}_{f,g}[\beta_1]\bigr),$ (94)

where (93) follows from Lemma 7 because ρ>1 and because β1 is a sum of independent random variables taking values in {0,1}. By the same steps as in (76)–(82),

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_1]^{\rho} \le 2^{-\rho\epsilon}.$ (95)

As to the expectation of the other term on the RHS of (94),

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_1] \le \Bigl(\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_1]^{\rho}\Bigr)^{\frac{1}{\rho}}$ (96)
$\le 2^{-\epsilon},$ (97)

where (96) follows from Jensen's inequality because $z \mapsto z^{\frac{1}{\rho}}$ is concave on $\mathbb{R}_{\ge 0}$ since $\rho > 1$, and (97) follows from (95). From (94), (95), and (97), we obtain:

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_1^{\rho}] \le 2^{\rho^{2}} \bigl(2^{-\rho\epsilon} + 2^{-\epsilon}\bigr)$ (98)
$\le 2^{\rho^{2}+1} \cdot 2^{-\epsilon},$ (99)

where (99) holds because $2^{-\rho\epsilon} \le 2^{-\epsilon}$ since $\rho > 1$ and $\epsilon > 0$. In the same way, we obtain for the second expectation on the RHS of (68):

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_2^{\rho}] \le 2^{\rho^{2}+1} \cdot 2^{-\epsilon}.$ (100)

Bounding $\mathrm{E}_{f,g}[\beta_3^{\rho}]$, i.e., the third expectation on the RHS of (68), is more involved because $\beta_3$ is not a sum of independent random variables. Our approach builds on the ideas used by Rosenthal [11] (Proof of Lemma 1); compare (47) and (48) with (108) and (123) ahead. For fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$,

$\mathrm{E}_{f,g}[\beta_3^{\rho}] = \mathrm{E}_{f,g}\Bigl[\sum_{x' \neq x,\, y' \neq y} \psi(x',y')\, \phi_f(x')\, \phi_g(y') \cdot \Bigl(\sum_{\tilde{x} \neq x,\, \tilde{y} \neq y} \psi(\tilde{x},\tilde{y})\, \phi_f(\tilde{x})\, \phi_g(\tilde{y})\Bigr)^{\rho-1}\Bigr]$ (101)
$= \mathrm{E}_{f,g}\Bigl[\sum_{x' \neq x,\, y' \neq y} \psi(x',y')\, \phi_f(x')\, \phi_g(y') \cdot (1 + \gamma_1 + \gamma_2 + \gamma_3)^{\rho-1}\Bigr]$ (102)
$= \sum_{x' \neq x,\, y' \neq y} \mathrm{E}_{f,g}\bigl[\psi(x',y')\, \phi_f(x')\, \phi_g(y') \cdot (1 + \gamma_1 + \gamma_2 + \gamma_3)^{\rho-1}\bigr]$ (103)
$= \sum_{x' \neq x,\, y' \neq y} \mathrm{E}_{f,g}\bigl[\psi(x',y')\, \phi_f(x')\, \phi_g(y')\bigr] \cdot \mathrm{E}_{f,g}\bigl[(1 + \gamma_1 + \gamma_2 + \gamma_3)^{\rho-1}\bigr]$ (104)
$\le \sum_{x' \neq x,\, y' \neq y} \mathrm{E}_{f,g}\bigl[\psi(x',y')\, \phi_f(x')\, \phi_g(y')\bigr] \cdot \mathrm{E}_{f,g}\bigl[(1 + \delta_1 + \delta_2 + \beta_3)^{\rho-1}\bigr]$ (105)
$\le \sum_{x' \neq x,\, y' \neq y} \mathrm{E}_{f,g}\bigl[\psi(x',y')\, \phi_f(x')\, \phi_g(y')\bigr] \cdot 4^{\rho-1} \cdot \mathrm{E}_{f,g}\bigl[1 + \delta_1^{\rho-1} + \delta_2^{\rho-1} + \beta_3^{\rho-1}\bigr]$ (106)
$= 4^{\rho-1} \Bigl\{\mathrm{E}_{f,g}[\beta_3] + \sum_{y' \neq y} \frac{1}{|\mathcal{V}|}\, \mathrm{E}_{f,g}[\delta_1]\, \mathrm{E}_{f,g}[\delta_1^{\rho-1}] + \sum_{x' \neq x} \frac{1}{|\mathcal{U}|}\, \mathrm{E}_{f,g}[\delta_2]\, \mathrm{E}_{f,g}[\delta_2^{\rho-1}] + \mathrm{E}_{f,g}[\beta_3]\, \mathrm{E}_{f,g}[\beta_3^{\rho-1}]\Bigr\}$ (107)
$\le 4^{\rho} \max\Bigl\{\mathrm{E}_{f,g}[\beta_3],\ \sum_{y' \neq y} \frac{1}{|\mathcal{V}|}\, \mathrm{E}_{f,g}[\delta_1]\, \mathrm{E}_{f,g}[\delta_1^{\rho-1}],\ \sum_{x' \neq x} \frac{1}{|\mathcal{U}|}\, \mathrm{E}_{f,g}[\delta_2]\, \mathrm{E}_{f,g}[\delta_2^{\rho-1}],\ \mathrm{E}_{f,g}[\beta_3]\, \mathrm{E}_{f,g}[\beta_3^{\rho-1}]\Bigr\}$ (108)

with:

$\gamma_1 = \gamma_1(x,y,x',y',f) \triangleq \sum_{\tilde{x} \notin \{x,x'\}} \psi(\tilde{x},y')\, \phi_f(\tilde{x}),$ (109)
$\gamma_2 = \gamma_2(x,y,x',y',g) \triangleq \sum_{\tilde{y} \notin \{y,y'\}} \psi(x',\tilde{y})\, \phi_g(\tilde{y}),$ (110)
$\gamma_3 = \gamma_3(x,y,x',y',f,g) \triangleq \sum_{\tilde{x} \notin \{x,x'\},\, \tilde{y} \notin \{y,y'\}} \psi(\tilde{x},\tilde{y})\, \phi_f(\tilde{x})\, \phi_g(\tilde{y}),$ (111)
$\delta_1 = \delta_1(x,y,y',f) \triangleq \sum_{\tilde{x} \neq x} \psi(\tilde{x},y')\, \phi_f(\tilde{x}),$ (112)
$\delta_2 = \delta_2(x,y,x',g) \triangleq \sum_{\tilde{y} \neq y} \psi(x',\tilde{y})\, \phi_g(\tilde{y}),$ (113)

where (102) follows from splitting the sum in braces depending on whether $\tilde{x} = x'$ or not and whether $\tilde{y} = y'$ or not and from assuming $\psi(x',y') = \phi_f(x') = \phi_g(y') = 1$ within the braces, which does not change the value of the expression because it is multiplied by $\psi(x',y')\, \phi_f(x')\, \phi_g(y')$; (104) holds because $(\phi_f(x'),\phi_g(y'))$ and $(\gamma_1,\gamma_2,\gamma_3)$ are independent since $\tilde{x} \neq x'$ and $\tilde{y} \neq y'$; (105) holds because $\rho - 1 > 0$, $\gamma_1 \le \delta_1$, $\gamma_2 \le \delta_2$, and $\gamma_3 \le \beta_3$; (106) follows from Lemma 6; and (107) follows from identifying $\mathrm{E}_{f,g}[\beta_3]$, $\mathrm{E}_{f,g}[\delta_1]$, and $\mathrm{E}_{f,g}[\delta_2]$ because $\phi_f(x')$ and $\phi_g(y')$ are independent, $\mathrm{E}_{f,g}[\phi_f(x')] = 1/|\mathcal{U}|$, and $\mathrm{E}_{f,g}[\phi_g(y')] = 1/|\mathcal{V}|$. As indicated in (109)–(113), the dependence of $\gamma_1$, $\gamma_2$, $\gamma_3$, $\delta_1$, and $\delta_2$ on $x$, $y$, $x'$, $y'$, $f$, and $g$ is implicit in our notation.

To bound $\mathrm{E}_{f,g}[\beta_3^{\rho}]$ further, we study some of the terms on the RHS of (108) separately, starting with the second, which involves the sum over $y'$. For fixed $x \in \mathcal{X}$, $y \in \mathcal{Y}$, and $y' \in \mathcal{Y} \setminus \{y\}$,

$\mathrm{E}_{f,g}[\delta_1]\, \mathrm{E}_{f,g}[\delta_1^{\rho-1}] \le \mathrm{E}_{f,g}[\delta_1^{\rho}]^{\frac{1}{\rho}}\, \mathrm{E}_{f,g}[\delta_1^{\rho}]^{\frac{\rho-1}{\rho}}$ (114)
$= \mathrm{E}_{f,g}[\delta_1^{\rho}]$ (115)
$\le 2^{\rho^{2}} \max\bigl\{\mathrm{E}_{f,g}[\delta_1],\ \mathrm{E}_{f,g}[\delta_1]^{\rho}\bigr\}$ (116)
$\le 2^{\rho^{2}} \bigl(\mathrm{E}_{f,g}[\delta_1] + \mathrm{E}_{f,g}[\delta_1]^{\rho}\bigr),$ (117)

where (114) follows from Jensen's inequality because $z \mapsto z^{\frac{1}{\rho}}$ and $z \mapsto z^{\frac{\rho-1}{\rho}}$ are both concave on $\mathbb{R}_{\ge 0}$ since $\rho > 1$, and (116) follows from Lemma 7 because $\rho > 1$ and because $\delta_1$ is a sum of independent random variables taking values in $\{0,1\}$. This implies that for fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$,

$\sum_{y' \neq y} \frac{1}{|\mathcal{V}|}\, \mathrm{E}_{f,g}[\delta_1]\, \mathrm{E}_{f,g}[\delta_1^{\rho-1}] \le 2^{\rho^{2}} \sum_{y' \neq y} \frac{1}{|\mathcal{V}|} \bigl(\mathrm{E}_{f,g}[\delta_1] + \mathrm{E}_{f,g}[\delta_1]^{\rho}\bigr)$ (118)
$= 2^{\rho^{2}}\, \mathrm{E}_{f,g}[\beta_3] + 2^{\rho^{2}} \sum_{y' \neq y} \frac{1}{|\mathcal{V}|}\, \mathrm{E}_{f,g}[\delta_1]^{\rho},$ (119)

where (119) follows from the definitions of δ1 and β3. Similarly, for the third term on the RHS of (108),

$\sum_{x' \neq x} \frac{1}{|\mathcal{U}|}\, \mathrm{E}_{f,g}[\delta_2]\, \mathrm{E}_{f,g}[\delta_2^{\rho-1}] \le 2^{\rho^{2}}\, \mathrm{E}_{f,g}[\beta_3] + 2^{\rho^{2}} \sum_{x' \neq x} \frac{1}{|\mathcal{U}|}\, \mathrm{E}_{f,g}[\delta_2]^{\rho}.$ (120)

With the help of (119) and (120), we now go back to (108) and argue that it implies that for fixed $x \in \mathcal{X}$ and $y \in \mathcal{Y}$,

$\mathrm{E}_{f,g}[\beta_3^{\rho}] \le 2 \cdot 4^{\rho^{2}} \Bigl[\mathrm{E}_{f,g}[\beta_3] + \sum_{y' \neq y} \frac{1}{|\mathcal{V}|}\, \mathrm{E}_{f,g}[\delta_1]^{\rho} + \sum_{x' \neq x} \frac{1}{|\mathcal{U}|}\, \mathrm{E}_{f,g}[\delta_2]^{\rho} + \mathrm{E}_{f,g}[\beta_3]^{\rho}\Bigr].$ (121)

To prove this, we consider four cases depending on which term on the RHS of (108) achieves the maximum: If $\mathrm{E}_{f,g}[\beta_3]$ achieves the maximum, then (121) holds because $4^{\rho} \le 2 \cdot 4^{\rho^{2}}$. If the LHS of (118) achieves the maximum, then (121) follows from (119) because $4^{\rho} \cdot 2^{\rho^{2}} \le 2 \cdot 4^{\rho^{2}}$. If the LHS of (120) achieves the maximum, then (121) follows similarly. Finally, if $\mathrm{E}_{f,g}[\beta_3]\, \mathrm{E}_{f,g}[\beta_3^{\rho-1}]$ achieves the maximum, then:

$\mathrm{E}_{f,g}[\beta_3^{\rho}] \le 4^{\rho}\, \mathrm{E}_{f,g}[\beta_3]\, \mathrm{E}_{f,g}[\beta_3^{\rho-1}]$ (122)
$\le 4^{\rho}\, \mathrm{E}_{f,g}[\beta_3]\, \mathrm{E}_{f,g}[\beta_3^{\rho}]^{\frac{\rho-1}{\rho}},$ (123)

where (123) follows from Jensen's inequality because $z \mapsto z^{\frac{\rho-1}{\rho}}$ is concave on $\mathbb{R}_{\ge 0}$ for $\rho > 1$. Rearranging (123), we obtain:

$\mathrm{E}_{f,g}[\beta_3^{\rho}] \le 4^{\rho^{2}}\, \mathrm{E}_{f,g}[\beta_3]^{\rho},$ (124)

so (121) holds also in this case.

Having established (121), we now average both of its sides over $(x,y)$ with respect to $P$ to obtain:

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_3^{\rho}] \le 2 \cdot 4^{\rho^{2}} \sum_{x,y} P(x,y) \Bigl[\mathrm{E}_{f,g}[\beta_3] + \sum_{y' \neq y} \frac{1}{|\mathcal{V}|}\, \mathrm{E}_{f,g}[\delta_1]^{\rho} + \sum_{x' \neq x} \frac{1}{|\mathcal{U}|}\, \mathrm{E}_{f,g}[\delta_2]^{\rho} + \mathrm{E}_{f,g}[\beta_3]^{\rho}\Bigr].$ (125)

We now study the terms on the RHS of (125) separately, starting with the fourth (last). By (85)–(90), which hold also if ρ>1,

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_3]^{\rho} \le 2^{-\rho\epsilon}.$ (126)

As for the first term on the RHS of (125),

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_3] \le 2^{-\epsilon},$ (127)

which follows from (126) in the same way as (97) followed from (95). As for the second term on the RHS of (125),

$\sum_{x,y} P(x,y) \sum_{y' \neq y} \frac{1}{|\mathcal{V}|}\, \mathrm{E}_{f,g}[\delta_1]^{\rho}$
$= \sum_{x,y} P(x,y) \sum_{y' \neq y} \frac{1}{|\mathcal{V}|} \Bigl(\sum_{x' \neq x} \psi(x',y')\, \frac{1}{|\mathcal{U}|}\Bigr)^{\rho}$ (128)
$\le \sum_{x,y} P(x,y)^{\tilde{\rho}} \sum_{y'} \frac{1}{|\mathcal{V}|} \Bigl(\sum_{x'} P(x',y')^{\tilde{\rho}}\, \frac{1}{|\mathcal{U}|}\Bigr)^{\rho}$ (129)
$= \sum_{x,y} P(x,y)^{\tilde{\rho}} \sum_{y'} \Bigl[\Bigl(\sum_{x'} P(x',y')^{\tilde{\rho}}\Bigr) \frac{1}{|\mathcal{U} \times \mathcal{V}|^{\rho}}\Bigr]^{\frac{1}{\rho}} \Bigl[\Bigl(\sum_{x'} P(x',y')^{\tilde{\rho}}\Bigr)^{1+\rho} \frac{1}{|\mathcal{U}|^{\rho}}\Bigr]^{\frac{\rho-1}{\rho}}$ (130)
$\le \Bigl[\sum_{x,y} P(x,y)^{\tilde{\rho}} \sum_{y'} \Bigl(\sum_{x'} P(x',y')^{\tilde{\rho}}\Bigr) \frac{1}{|\mathcal{U} \times \mathcal{V}|^{\rho}}\Bigr]^{\frac{1}{\rho}} \Bigl[\sum_{x,y} P(x,y)^{\tilde{\rho}} \sum_{y'} \Bigl(\sum_{x'} P(x',y')^{\tilde{\rho}}\Bigr)^{1+\rho} \frac{1}{|\mathcal{U}|^{\rho}}\Bigr]^{\frac{\rho-1}{\rho}}$ (131)
$= \Bigl[\frac{1}{|\mathcal{U} \times \mathcal{V}|^{\rho}} \Bigl(\sum_{x,y} P(x,y)^{\tilde{\rho}}\Bigr)^{1+\rho}\Bigr]^{\frac{1}{\rho}} \Bigl[\frac{1}{|\mathcal{U}|^{\rho}} \sum_{y} \Bigl(\sum_{x} P(x,y)^{\tilde{\rho}}\Bigr)^{1+\rho}\Bigr]^{\frac{\rho-1}{\rho}}$ (132)
$\le \bigl(2^{-\rho\epsilon}\bigr)^{\frac{1}{\rho}} \cdot \bigl(2^{-\rho\epsilon}\bigr)^{\frac{\rho-1}{\rho}}$ (133)
$= 2^{-\rho\epsilon},$ (134)

where in (129), we extended the inner summations and used that $\psi(x',y') \le [P(x',y')/P(x,y)]^{\tilde{\rho}}$; (131) follows from Hölder's inequality; and (133) follows from (89)–(90) and (81)–(82). In the same way, we obtain for the third term on the RHS of (125):

$\sum_{x,y} P(x,y) \sum_{x' \neq x} \frac{1}{|\mathcal{U}|}\, \mathrm{E}_{f,g}[\delta_2]^{\rho} \le 2^{-\rho\epsilon}.$ (135)

From (125), (127), (134), (135), and (126), we deduce:

$\sum_{x,y} P(x,y)\, \mathrm{E}_{f,g}[\beta_3^{\rho}] \le 2 \cdot 4^{\rho^{2}} \bigl(2^{-\epsilon} + 2^{-\rho\epsilon} + 2^{-\rho\epsilon} + 2^{-\rho\epsilon}\bigr)$ (136)
$\le 8 \cdot 4^{\rho^{2}} \cdot 2^{-\epsilon},$ (137)

where (137) holds because $2^{-\rho\epsilon} \le 2^{-\epsilon}$ since $\rho > 1$ and $\epsilon > 0$. Finally, (68), (99), (100), and (137) imply:

$\mathrm{E}_{f,g}\bigl[\mathrm{E}[G(X,Y|f(X),g(Y))^{\rho}]\bigr] \le 1 + 4^{\rho} \bigl(2 \cdot 2^{\rho^{2}+1} \cdot 2^{-\epsilon} + 8 \cdot 4^{\rho^{2}} \cdot 2^{-\epsilon}\bigr)$ (138)
$\le 1 + 4^{(\rho+1)^{2}} \cdot 2^{-\epsilon}$ (139)

and thus prove the existence of $f\colon \mathcal{X} \to \mathcal{U}$ and $g\colon \mathcal{Y} \to \mathcal{V}$ satisfying (64). ☐

Corollary 2.

For any ρ>0, rate pairs in the interior of R(ρ) are achievable.

Proof. 

Let (RX,RY) be in the interior of R(ρ). Then, (6)–(8) hold with strict inequalities, and there exists a δ>0 such that for all sufficiently large n,

$\log 2^{nR_X} \ge H_{\tilde{\rho}}(X^n|Y^n) + n\delta,$ (140)
$\log 2^{nR_Y} \ge H_{\tilde{\rho}}(Y^n|X^n) + n\delta,$ (141)
$\log 2^{nR_X} + \log 2^{nR_Y} \ge H_{\tilde{\rho}}(X^n,Y^n) + n\delta.$ (142)

Using Theorem 3 with $\mathcal{X} \leftarrow \mathcal{X}^n$, $\mathcal{Y} \leftarrow \mathcal{Y}^n$, $\mathcal{U} \leftarrow \{1,\ldots,2^{nR_X}\}$, $\mathcal{V} \leftarrow \{1,\ldots,2^{nR_Y}\}$, $P_{XY} \leftarrow P_{X^nY^n}$, and $\epsilon_n \triangleq n\delta$ shows that, for all sufficiently large $n$, there exist encoders $f_n\colon \mathcal{X}^n \to \mathcal{U}$ and $g_n\colon \mathcal{Y}^n \to \mathcal{V}$ and a guessing function $G_n$ satisfying:

$\mathrm{E}\bigl[G_n(X^n,Y^n|f_n(X^n),g_n(Y^n))^{\rho}\bigr] \le \begin{cases} 1 + 4^{\rho+1} \cdot 2^{-\rho\epsilon_n} & \text{if } \rho \in (0,1], \\ 1 + 4^{(\rho+1)^{2}} \cdot 2^{-\epsilon_n} & \text{if } \rho > 1. \end{cases}$ (143)

Because ϵn tends to infinity as n tends to infinity, the RHS of (143) tends to one as n tends to infinity, which implies that the rate pair (RX,RY) is achievable.  ☐

Author Contributions

Writing—original draft preparation, A.B., A.L. and C.P.; writing—review and editing, A.B., A.L. and C.P.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  • 1.Massey J.L. Guessing and entropy; Proceedings of the 1994 IEEE International Symposium on Information Theory (ISIT); Trondheim, Norway. 27 June–1 July 1994; p. 204. [DOI] [Google Scholar]
  • 2.Arıkan E. An inequality on guessing and its application to sequential decoding. IEEE Trans. Inf. Theory. 1996;42:99–105. doi: 10.1109/18.481781. [DOI] [Google Scholar]
  • 3.Sason I., Verdú S. Improved bounds on lossless source coding and guessing moments via Rényi measures. IEEE Trans. Inf. Theory. 2018;64:4323–4346. doi: 10.1109/TIT.2018.2803162. [DOI] [Google Scholar]
  • 4.Bracher A., Hof E., Lapidoth A. Guessing attacks on distributed-storage systems. arXiv. 2017. 1701.01981v1
  • 5.Graczyk R., Lapidoth A. Variations on the guessing problem; Proceedings of the 2018 IEEE International Symposium on Information Theory (ISIT); Vail, CO, USA. 17–22 June 2018; pp. 231–235. [DOI] [Google Scholar]
  • 6.Cover T.M., Thomas J.A. Elements of Information Theory. 2nd ed. John Wiley & Sons; Hoboken, NJ, USA: 2006. [Google Scholar]
  • 7.Fehr S., Berens S. On the conditional Rényi entropy. IEEE Trans. Inf. Theory. 2014;60:6801–6810. doi: 10.1109/TIT.2014.2357799. [DOI] [Google Scholar]
  • 8.Csiszár I. Generalized cutoff rates and Rényi’s information measures. IEEE Trans. Inf. Theory. 1995;41:26–34. doi: 10.1109/18.370121. [DOI] [Google Scholar]
  • 9.Bracher A., Lapidoth A., Pfister C. Distributed task encoding; Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT); Aachen, Germany. 25–30 June 2017; pp. 1993–1997. [DOI] [Google Scholar]
  • 10.Lapidoth A., Pfister C. Two measures of dependence; Proceedings of the 2016 IEEE International Conference on the Science of Electrical Engineering (ICSEE); Eilat, Israel. 16–18 November 2016; pp. 1–5. [DOI] [Google Scholar]
  • 11.Rosenthal H.P. On the subspaces of Lp (p > 2) spanned by sequences of independent random variables. Isr. J. Math. 1970;8:273–303. doi: 10.1007/BF02771562. [DOI] [Google Scholar]
  • 12.Boztaş S. Comments on “An inequality on guessing and its application to sequential decoding”. IEEE Trans. Inf. Theory. 1997;43:2062–2063. doi: 10.1109/18.641578. [DOI] [Google Scholar]
  • 13.Hanawal M.K., Sundaresan R. Guessing revisited: A large deviations approach. IEEE Trans. Inf. Theory. 2011;57:70–78. doi: 10.1109/TIT.2010.2090221. [DOI] [Google Scholar]
  • 14.Christiansen M.M., Duffy K.R. Guesswork, large deviations, and Shannon entropy. IEEE Trans. Inf. Theory. 2013;59:796–802. doi: 10.1109/TIT.2012.2219036. [DOI] [Google Scholar]
  • 15.Sundaresan R. Guessing based on length functions; Proceedings of the 2007 IEEE International Symposium on Information Theory (ISIT); Nice, France. 24–29 June 2007; pp. 716–719. [DOI] [Google Scholar]
  • 16.Sason I. Tight bounds on the Rényi entropy via majorization with applications to guessing and compression. Entropy. 2018;20:896. doi: 10.3390/e20120896. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17.Sundaresan R. Guessing under source uncertainty. IEEE Trans. Inf. Theory. 2007;53:269–287. doi: 10.1109/TIT.2006.887466. [DOI] [Google Scholar]
  • 18.Arıkan E., Merhav N. Guessing subject to distortion. IEEE Trans. Inf. Theory. 1998;44:1041–1056. doi: 10.1109/18.669158. [DOI] [Google Scholar]
  • 19.Bunte C., Lapidoth A. On the listsize capacity with feedback. IEEE Trans. Inf. Theory. 2014;60:6733–6748. doi: 10.1109/TIT.2014.2355815. [DOI] [Google Scholar]
  • 20.Gallager R.G. Information Theory and Reliable Communication. John Wiley & Sons; Hoboken, NJ, USA: 1968. [Google Scholar]
  • 21.Arıkan E., Merhav N. Joint source-channel coding and guessing with application to sequential decoding. IEEE Trans. Inf. Theory. 1998;44:1756–1769. doi: 10.1109/18.705557. [DOI] [Google Scholar]
  • 22.Bunte C., Lapidoth A. Encoding tasks and Rényi entropy. IEEE Trans. Inf. Theory. 2014;60:5065–5076. doi: 10.1109/TIT.2014.2329490. [DOI] [Google Scholar]
  • 23.Rényi A. On measures of entropy and information; Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability; Berkeley, CA, USA. 20 June–30 July 1960; pp. 547–561. [Google Scholar]
  • 24.Arimoto S. Information measures and capacity of order α for discrete memoryless channels. In: Csiszár I., Elias P., editors. Topics in Information Theory. North-Holland Publishing Company; Amsterdam, The Netherlands: 1977. pp. 41–52. [Google Scholar]
  • 25.Sason I., Verdú S. Arimoto–Rényi conditional entropy and Bayesian M-Ary hypothesis testing. IEEE Trans. Inf. Theory. 2018;64:4–25. doi: 10.1109/TIT.2017.2757496. [DOI] [Google Scholar]
