. 2022 Nov 19;24(11):1695. doi: 10.3390/e24111695

Are Guessing, Source Coding and Tasks Partitioning Birds of A Feather?

M Ashok Kumar 1,*, Albert Sunny 2, Ashish Thakre 3, Ashisha Kumar 3, G Dinesh Manohar 4
Editors: Nicusor Minculete, Shigeru Furuichi
PMCID: PMC9689969  PMID: 36421550

Abstract

This paper establishes a close relationship among four information-theoretic problems, namely the Campbell source coding, Arıkan guessing, Huleihel et al. memoryless guessing and Bunte–Lapidoth tasks partitioning problems, in the IID-lossless case. We first show that the aforementioned problems are mathematically related via a general moment minimization problem whose optimum solution is given in terms of Rényi entropy. We then propose a general framework for the mismatched version of these problems and establish all the asymptotic results using this framework. The unified framework further enables us to study a variant of Bunte–Lapidoth's tasks partitioning problem which is practically more appealing. In addition, this variant turns out to be a generalization of Arıkan's guessing problem. Finally, with the help of this general framework, we establish an equivalence among all these problems, in the sense that knowing an asymptotically optimal solution in one problem helps us find the same in all other problems.

Keywords: guessing, source coding, tasks partitioning, Shannon entropy, Rényi entropy, Kullback–Leibler divergence, relative α-entropy, Sundaresan’s divergence

MSC: 94A15, 94A17, 94A50, 62B10

1. Introduction

The concept of entropy is central to information theory. In source coding, the expected number of bits required (per letter) to encode a source with (finite) alphabet set $\mathcal{X}$ and probability distribution $P$ is the Shannon entropy $H(P) := -\sum_{x\in\mathcal{X}} P(x)\log P(x)$. If the compressor does not know the true distribution $P$, but assumes a distribution $Q$ (mismatch), then the number of bits required for compression is $H(P)+I(P,Q)$, where

$$I(P,Q) := -\sum_{x\in\mathcal{X}} P(x)\log Q(x) + \sum_{x\in\mathcal{X}} P(x)\log P(x), \tag{1}$$

is the entropy of P relative to Q (or the Kullback–Leibler divergence). In his seminal paper, Shannon [1] argued that H(P) can also be regarded as a measure of uncertainty. Subsequently, Rényi [2] introduced an alternate measure of uncertainty, now known as Rényi entropy of order α, as

$$H_\alpha(P) := \frac{1}{1-\alpha}\log\sum_{x\in\mathcal{X}} P(x)^\alpha, \tag{2}$$

where $\alpha>0$ and $\alpha\neq 1$. Rényi entropy can be regarded as a generalization of the Shannon entropy, since $\lim_{\alpha\to 1} H_\alpha(P) = H(P)$. Refer to Aczél and Daróczy [3] and the references therein for an extensive study of various measures of uncertainty and their characterizations.
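As a concrete numerical illustration (a small Python sketch with base-2 logarithms; the distribution $P$ below is an arbitrary example), one can check definition (2) and the limit $\lim_{\alpha\to 1}H_\alpha(P)=H(P)$:

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(P) in bits."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (alpha > 0, alpha != 1), Equation (2), in bits."""
    return math.log2(sum(px ** alpha for px in p)) / (1.0 - alpha)

P = [0.5, 0.25, 0.125, 0.125]
H = shannon_entropy(P)                    # 1.75 bits for this dyadic distribution
H_near_1 = renyi_entropy(P, 1.0 + 1e-6)   # approaches H(P) as alpha -> 1
```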

In 1965, Campbell [4] gave an operational meaning to Rényi entropy. He showed that, if one minimizes the cumulants of code lengths instead of expected code lengths, then the optimal cumulant is the Rényi entropy $H_\alpha(P)$, where $\alpha=1/(1+\rho)$ and $\rho$ is the order of the cumulant. He also showed that the optimal cumulant can be achieved by encoding sufficiently long sequences of symbols. Sundaresan (Theorem 8 of [5]) (c.f. Blumer and McEliece [6]) showed that, in the mismatched case, the optimal cumulant is $H_\alpha(P)+I_\alpha(P,Q)$, where

$$I_\alpha(P,Q) := \frac{\alpha}{1-\alpha}\log\sum_{x} P(x)\,\frac{Q(x)^{\alpha-1}}{\big(\sum_{y} Q(y)^\alpha\big)^{\frac{\alpha-1}{\alpha}}} \;-\; H_\alpha(P) \tag{3}$$

is called the α-entropy of $P$ relative to $Q$, or Sundaresan's divergence [7]. Hence, $I_\alpha(P,Q)$ can be interpreted as the penalty for not knowing the true distribution. The first term in (3) is sometimes called the Rényi cross-entropy and is analogous to the first term of (1). $I_\alpha(P,Q)\ge 0$, with equality if and only if $P=Q$. The $I_\alpha$-divergence can also be regarded as a generalization of the Kullback–Leibler divergence, as $\lim_{\alpha\to 1} I_\alpha(P,Q) = I(P,Q)$. Refer to [5,8,9] for detailed discussions on the properties of $I_\alpha$. Lutwak et al. independently identified $I_\alpha$ in the context of maximum Rényi entropy and called it an α-Rényi relative entropy (Equation (4) of [10]). $I_\alpha$, for $\alpha>1$, also arises in robust inference problems (see [11] and the references therein).
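Equation (3) can likewise be checked numerically. The sketch below (arbitrary example pair $P,Q$; base-2 logarithms) uses the algebraically equivalent expansion $\frac{\alpha}{1-\alpha}\log\sum_x P(x)Q(x)^{\alpha-1}+\log\sum_y Q(y)^\alpha-H_\alpha(P)$, and verifies nonnegativity, vanishing at $Q=P$, and the Kullback–Leibler limit as $\alpha\to 1$:

```python
import math

def renyi_entropy(p, alpha):
    return math.log2(sum(px ** alpha for px in p)) / (1.0 - alpha)

def kl_divergence(p, q):
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

def sundaresan_divergence(p, q, alpha):
    """Relative alpha-entropy I_alpha(P,Q) of Equation (3), in bits."""
    # Rényi cross-entropy term of (3), with the normalizer of Q expanded out
    cross = sum(px * qx ** (alpha - 1.0) for px, qx in zip(p, q))
    z_q = sum(qx ** alpha for qx in q)
    coef = alpha / (1.0 - alpha)
    return coef * math.log2(cross) + math.log2(z_q) - renyi_entropy(p, alpha)

P = [0.6, 0.3, 0.1]
Q = [0.4, 0.4, 0.2]
```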

In [12], Massey studied a guessing problem in which one is interested in the expected number of guesses required to guess a random variable $X$ that assumes values from an infinite set, and found a lower bound in terms of Shannon entropy. Arıkan [13] studied it for a finite alphabet set and showed that Rényi entropy arises as the optimal solution when minimizing moments of the number of guesses. Subsequently, Sundaresan [5] showed that the penalty for guessing according to a distribution $Q$ when the true distribution is $P$ is given by $I_\alpha(P,Q)$. It is interesting to note that guesswork has also been studied from a large deviations point of view [14,15,16,17,18]. Bunte and Lapidoth [8] studied a problem on the partitioning of tasks and showed that Rényi entropy and Sundaresan's divergence play a similar role in the optimal number of tasks performed. We propose, in this paper, a variant of this problem where the tasks in each subset of the partition are performed in decreasing order of their probabilities. We show that Rényi entropy and Sundaresan's divergence arise as optimal solutions in this problem too. Huleihel et al. [19,20] studied the memoryless guessing problem, a variant of Arıkan's guessing problem with i.i.d. (independent and identically distributed) guesses, and showed that the minimum attainable factorial moment of the number of guesses is given by the Rényi entropy. We show, in this paper, that the minimum factorial moment in the mismatched case is measured by Sundaresan's divergence.

We observe that, in all these problems, the objective is to minimize usual moments or factorial moments of random variables, and Rényi entropy and Sundaresan's divergence arise in the optimal solutions. The relationship between source coding and guessing is well known in the literature. Arıkan and Merhav established a close relationship between lossy source coding and guessing with distortion using large deviation techniques [14,21]. The same for the lossless case was done by Hanawal and Sundaresan [17]. In this paper, we establish a general framework for all five problems in the IID-lossless case. We then use this framework to establish upper and lower bounds for the mismatched versions of these problems. This helps us find an equivalence among all these problems, in the sense that knowing an asymptotically optimal solution in one problem helps us find the same in all other problems.

Our Contributions in the Paper:

  • (a)

    a general framework for the problems on source coding, guessing and tasks partitioning;

  • (b)

    lower and upper bounds for the general framework of these problems both in matched and mis-matched cases;

  • (c)

    a unified approach to derive bounds for the mismatched version of these problems;

  • (d)

    a generalized tasks partitioning problem; and

  • (e)

    establishing operational commonality among the problems.

Organisation of the Paper:

In Section 2, we present our unified framework, and find conditions under which lower and upper bounds are attained. In Section 3, we present four well-known information-theoretic problems, namely, Campbell’s source coding, Arıkan’s guessing, Huleihel et al.’s memoryless guessing, and Bunte–Lapidoth’s tasks partitioning, and re-establish and refine major results pertaining to these problems. In Section 4, we propose and solve a generalized tasks partitioning problem. In Section 5, we establish a connection among the aforementioned problems. Finally, we summarize and conclude the paper in Section 6.

2. A General Minimization Problem

In this section, we present a general minimization problem whose optimum solution evaluates to Rényi entropy. We will later show that all problems stated in Section 3 are particular instances of this general problem.

 Proposition 1. 

Let $\psi:\mathcal{X}\to(0,\infty)$ be such that $\sum_{x\in\mathcal{X}}\psi(x)^{-1}\le k$ for some $k>0$. Then, for $\rho\in(-1,0)\cup(0,\infty)$,

$$\frac{1}{\rho}\log E_P[\psi(X)^\rho] \ge H_\alpha(P) - \log k, \tag{4}$$

where $E_P[\cdot]$ denotes the expectation with respect to the probability distribution $P$ on $\mathcal{X}$, $H_\alpha(P)$ is the Rényi entropy of order $\alpha$, and $\alpha := \alpha(\rho) = 1/(1+\rho)$. The lower bound is achieved if and only if

$$\psi(x)^{-1} = k\cdot P(x)^\alpha/Z_{P,\alpha}\quad\text{for all } x\in\mathcal{X}, \tag{5}$$

where $Z_{P,\alpha} := \sum_{x\in\mathcal{X}} P(x)^\alpha$.

 Proof. 

Observe that

$$\operatorname{sgn}(\rho)\sum_{x\in\mathcal{X}} P(x)\psi(x)^\rho = \operatorname{sgn}(\rho)\sum_{x\in\mathcal{X}} P(x)^\alpha\left(\frac{\psi(x)^{-1}}{P(x)^\alpha}\right)^{-\rho} \overset{(a)}{\ge} \operatorname{sgn}(\rho)\sum_x P(x)^\alpha\left(\frac{\sum_x\psi(x)^{-1}}{\sum_x P(x)^\alpha}\right)^{-\rho} = \operatorname{sgn}(\rho)\Big(\sum_x P(x)^\alpha\Big)^{1+\rho}\Big(\sum_x \psi(x)^{-1}\Big)^{-\rho} \overset{(b)}{\ge} \operatorname{sgn}(\rho)\Big(\sum_x P(x)^\alpha\Big)^{1+\rho} k^{-\rho},$$

where (a) is due to the generalised log-sum inequality (Equation (4.1) of [22]) applied to the convex function $f(t)=\operatorname{sgn}(\rho)\,t^{-\rho}$; and (b) follows from the hypothesis that $\sum_x\psi(x)^{-1}\le k$. Taking logarithms and dividing by $\rho$, we obtain (4). Equality holds in (a) if and only if $\psi(x)^{-1}=\nu P(x)^\alpha$ for some constant $\nu$, and in (b) if and only if $\sum_x\psi(x)^{-1}=k$. This completes the proof. □

The left-hand side of (4) is called the normalised cumulant of $\psi(X)$ of order $\rho$. The measure $P^{(\alpha)}(x) := P(x)^\alpha/Z_{P,\alpha}$ in (5) that attains the lower bound in (4) is called an α-scaled measure or escort measure of $P$. This measure also arises in robust inference (Equation (7) of [11]) and statistical physics [23]. The above proposition can also be proved using a variational formula as follows. By a version of the Donsker–Varadhan variational formula (Proposition 4.5.1 of [24]), for any real-valued $f$ on $\mathcal{X}$, we have

$$\log E_P[2^{f(X)}] = \max_Q\big\{E_Q[f(X)] - D(Q\|P)\big\}, \tag{6}$$

where the max is over all probability distributions $Q$ on $\mathcal{X}$. Taking $\rho>0$ and $f(x)=\rho\log\psi(x)$ in (6), we have

$$\begin{aligned}
\log E_P[\psi(X)^\rho] &= \max_Q\big\{\rho E_Q[\log\psi(X)] - D(Q\|P)\big\}\\
&= \max_Q\Big\{-\rho\sum_x Q(x)\log\Big(\frac{\psi(x)^{-1}}{Q(x)}\cdot Q(x)\Big) - D(Q\|P)\Big\}\\
&= \max_Q\Big\{\rho H(Q) - \rho\sum_x Q(x)\log\frac{\psi(x)^{-1}}{Q(x)} - D(Q\|P)\Big\}\\
&\overset{(a)}{\ge} \max_Q\Big\{\rho H(Q) - \rho\log\sum_x\psi(x)^{-1} - D(Q\|P)\Big\}\\
&\overset{(b)}{\ge} \rho\max_Q\Big\{H(Q) - \tfrac{1}{\rho}D(Q\|P)\Big\} - \rho\log k,
\end{aligned}$$

where (a) is by the log-sum inequality (Equation (4.1) of [22]) and (b) is by applying the constraint $\sum_x\psi(x)^{-1}\le k$. For $\rho\in(-1,0)$, the inequalities in (a) and (b) are reversed, and the last max is replaced by a min. Hence, (4) follows, as the last max equals $H_\alpha(P)$ (Theorem 1 of [25]). Equality in (a) and (b) holds if and only if $\psi(x)^{-1} = k\cdot Q(x)$. In addition, the last max is attained when $Q(x) = P(x)^\alpha/Z_{P,\alpha}$ for all $x\in\mathcal{X}$. This completes the proof. The following is the analogous result for Shannon entropy.
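Proposition 1 can be verified numerically as follows (a sketch; the distribution, $\rho$, and the suboptimal $\psi$ are arbitrary choices): the escort choice (5) attains the bound (4) with equality, while any other feasible $\psi$ stays above it.

```python
import math

def renyi_entropy(p, alpha):
    return math.log2(sum(px ** alpha for px in p)) / (1.0 - alpha)

def cumulant(p, psi, rho):
    """Normalised cumulant (1/rho) log2 E_P[psi(X)^rho]."""
    return math.log2(sum(px * s ** rho for px, s in zip(p, psi))) / rho

P = [0.5, 0.25, 0.15, 0.10]
rho = 2.0
alpha = 1.0 / (1.0 + rho)
k = 1.0

# Optimal psi from (5): psi(x)^{-1} = k * P(x)^alpha / Z_{P,alpha}
z = sum(px ** alpha for px in P)
psi_opt = [z / (k * px ** alpha) for px in P]

# A feasible but suboptimal psi: sum of psi(x)^{-1} also equals k
psi_sub = [4.0, 4.0, 4.0, 4.0]

bound = renyi_entropy(P, alpha) - math.log2(k)
```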

 Proposition 2. 

Let $\psi:\mathcal{X}\to(0,\infty)$ be such that $\sum_{x\in\mathcal{X}}\psi(x)^{-1}\le k$. Then,

$$E_P[\log\psi(X)] \ge H(P) - \log k. \tag{7}$$

Equality in (7) is achieved if and only if $\psi(x)^{-1}=k\cdot P(x)$ for all $x\in\mathcal{X}$.

Proof. 

$$E_P[\log\psi(X)] = -\sum_x P(x)\log\Big(P(x)\cdot\frac{\psi(x)^{-1}}{P(x)}\Big) = H(P) - \sum_x P(x)\log\frac{\psi(x)^{-1}}{P(x)} \ge H(P) - \log\sum_x\psi(x)^{-1} \ge H(P) - \log k,$$

where the penultimate inequality is due to the log-sum inequality. Equality holds in both inequalities if and only if $\psi(x)^{-1}=k\cdot P(x)$ for all $x\in\mathcal{X}$. □

It is interesting to note that $\frac{1}{\rho}\log E_P[\psi(X)^\rho] \to E_P[\log\psi(X)]$ and $H_\alpha(P)\to H(P)$ as $\rho\to 0$ in (4). We now extend Propositions 1 and 2 to sequences of random variables. Let $\mathcal{X}^n$ be the set of all $n$-length sequences of elements of $\mathcal{X}$, and $P^n$ the $n$-fold product distribution of $P$ on $\mathcal{X}^n$; that is, for $x^n := (x_1,\dots,x_n)\in\mathcal{X}^n$, $P^n(x^n)=\prod_{i=1}^n P(x_i)$.

 Corollary 1. 

Given any $n\ge 1$, if $\psi_n:\mathcal{X}^n\to(0,\infty)$ is such that $\sum_{x^n\in\mathcal{X}^n}\psi_n(x^n)^{-1}\le k_n$ for some $k_n>0$, then

  • (a)
    For $\rho\in(-1,0)\cup(0,\infty)$,
    $$\liminf_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[\psi_n(X^n)^\rho] \ge H_\alpha(P) - \limsup_{n\to\infty}\frac{\log k_n}{n}.$$
  • (b)
    $$\liminf_{n\to\infty}\frac{1}{n}E_{P^n}[\log\psi_n(X^n)] \ge H(P) - \limsup_{n\to\infty}\frac{\log k_n}{n},$$

where $E_{P^n}[\cdot]$ denotes the expectation with respect to the probability distribution $P^n$ on $\mathcal{X}^n$.

 Proof. 

It is easy to see that $H_\alpha(P^n)=nH_\alpha(P)$ and $H(P^n)=nH(P)$. Applying Propositions 1 and 2, dividing throughout by $n$, and taking $\liminf_{n\to\infty}$, the results follow. □

A General Framework for Mismatched Cases

In this sub-section, we establish a unified approach for cases when there is mismatch between assumed and true distributions.

 Proposition 3. 

Let $\rho>-1$, $\alpha=1/(1+\rho)$, and let $Q$ be a probability distribution on $\mathcal{X}$. For $n\ge 1$, let $Q^n$ be the $n$-fold product distribution of $Q$ on $\mathcal{X}^n$. If $\psi_n:\mathcal{X}^n\to(0,\infty)$ is such that

$$\psi_n(x^n) \le c_n\cdot\frac{Z_{Q^n,\alpha}}{Q^n(x^n)^\alpha} \tag{8}$$

for some $c_n>0$, then

  • (a)
    for $\rho\neq 0$, we have
    $$\operatorname{sgn}(\rho)\cdot E_{P^n}[\psi_n(X^n)^\rho] \le \operatorname{sgn}(\rho)\cdot 2^{n\rho\left[H_\alpha(P)+I_\alpha(P,Q)+n^{-1}\log c_n\right]};$$
  • (b)
    for $\rho\neq 0$, we have
    $$\limsup_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[\psi_n(X^n)^\rho] \le H_\alpha(P)+I_\alpha(P,Q)+\limsup_{n\to\infty}\frac{1}{n}\log c_n;$$
  • (c)
    for $\rho=0$, we have
    $$E_{P^n}[\log\psi_n(X^n)] \le n\left[H(P)+I(P,Q)+n^{-1}\log c_n\right];$$
  • (d)
    for $\rho=0$, we have
    $$\limsup_{n\to\infty}\frac{1}{n}E_{P^n}[\log\psi_n(X^n)] \le H(P)+I(P,Q)+\limsup_{n\to\infty}\frac{1}{n}\log c_n.$$

 Proof. 

Part (a): From (8), we have

$$\begin{aligned}
\operatorname{sgn}(\rho)\cdot E_{P^n}[\psi_n(X^n)^\rho] &= \operatorname{sgn}(\rho)\cdot\sum_{x^n\in\mathcal{X}^n} P^n(x^n)\psi_n(x^n)^\rho\\
&\le \operatorname{sgn}(\rho)\cdot c_n^\rho\, Z_{Q^n,\alpha}^\rho \sum_{x^n\in\mathcal{X}^n} P^n(x^n)\,Q^n(x^n)^{-\alpha\rho}\\
&= \operatorname{sgn}(\rho)\cdot 2^{\rho\left[H_\alpha(P^n)+I_\alpha(P^n,Q^n)+\log c_n\right]}\\
&= \operatorname{sgn}(\rho)\cdot 2^{n\rho\left[H_\alpha(P)+I_\alpha(P,Q)+n^{-1}\log c_n\right]},
\end{aligned} \tag{9}$$

where the penultimate equality holds from the definition of $I_\alpha$, and the last one holds because $H_\alpha(P^n)=nH_\alpha(P)$ and $I_\alpha(P^n,Q^n)=nI_\alpha(P,Q)$.

Part (b): Taking log, dividing throughout by $n\rho$, and then applying lim sup successively on both sides of (9), the result follows.

Part (c): When $\rho=0$, we have $\alpha=1$ and (8) becomes $\psi_n(x^n)\le c_n/Q^n(x^n)$. Hence,

$$E_{P^n}[\log\psi_n(X^n)] = \sum_{x^n\in\mathcal{X}^n} P^n(x^n)\log\psi_n(x^n) \le \log c_n + \sum_{x^n\in\mathcal{X}^n} P^n(x^n)\log\frac{1}{Q^n(x^n)} = \log c_n + H(P^n)+I(P^n,Q^n) = n\big(H(P)+I(P,Q)+n^{-1}\log c_n\big), \tag{10}$$

where the last equality holds because $H(P^n)=nH(P)$ and $I(P^n,Q^n)=nI(P,Q)$.

Part (d): Dividing (10) throughout by $n$ and taking lim sup on both sides, the result follows. □

 Proposition 4. 

Let $\rho>-1$, $\alpha=1/(1+\rho)$, and let $Q$ be a probability distribution on $\mathcal{X}$. For $n\ge 1$, let $Q^n$ be the $n$-fold product distribution of $Q$ on $\mathcal{X}^n$. Suppose $\psi_n:\mathcal{X}^n\to(0,\infty)$ is such that

$$\psi_n(x^n) \ge a_n\,\frac{Z_{Q^n,\alpha}}{Q^n(x^n)^\alpha}$$

for some $a_n>0$; then

  • (a)
    for $\rho\neq 0$, we have
    $$\operatorname{sgn}(\rho)\cdot E_{P^n}[\psi_n(X^n)^\rho] \ge \operatorname{sgn}(\rho)\cdot 2^{n\rho\left(H_\alpha(P)+I_\alpha(P,Q)+n^{-1}\log a_n\right)};$$
  • (b)
    for $\rho\neq 0$, we have
    $$\liminf_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[\psi_n(X^n)^\rho] \ge H_\alpha(P)+I_\alpha(P,Q)+\liminf_{n\to\infty}\frac{1}{n}\log a_n;$$
  • (c)
    for $\rho=0$, we have
    $$E_{P^n}[\log\psi_n(X^n)] \ge n\left(H(P)+I(P,Q)+n^{-1}\log a_n\right);$$
  • (d)
    for $\rho=0$, we have
    $$\liminf_{n\to\infty}\frac{1}{n}E_{P^n}[\log\psi_n(X^n)] \ge H(P)+I(P,Q)+\liminf_{n\to\infty}\frac{1}{n}\log a_n.$$

 Proof. 

Similar to proof of Proposition 3. □

3. Problem Statements and Known Results

In this section, we discuss Campbell’s source coding problem, Arıkan’s guessing problem, Huleihel et al.’s memoryless guessing problem, and Bunte–Lapidoth’s tasks partitioning problem. Using the general framework presented in the previous section, we re-establish known results, and present a few new results relating to these problems.

3.1. Source Coding Problem

Let $X$ be a random variable that assumes values from a finite alphabet set $\mathcal{X}=\{a_1,\dots,a_m\}$ according to a probability distribution $P$. The tuple $(\mathcal{X},P)$ is usually referred to as a source. A binary code $C$ is a mapping from $\mathcal{X}$ to the set of finite-length binary strings. Let $L(C(X))$ be the length of the codeword $C(X)$. The objective is to find a uniquely decodable code that minimizes the expected code-length, that is,

$$\text{Minimize } E_P[L(C(X))]$$

over all uniquely decodable codes C. Kraft and McMillan independently proved the following relation between uniquely decodable codes and their code-lengths.

Kraft–McMillan Theorem [26]: If $C$ is a uniquely decodable code, then

$$\sum_{x\in\mathcal{X}} 2^{-L(C(x))} \le 1. \tag{11}$$

Conversely, given a length sequence that satisfies the above inequality, there exists a uniquely decodable code C with the given length sequence.

Thus, one can confine the search space for C to codes satisfying the Kraft–McMillan inequality (11).

Theorem 5.3.1 of [26]: If $C$ is a uniquely decodable code, then $E_P[L(C(X))] \ge H(P)$.

 Proof. 

Choose $\psi(x)=2^{L(C(x))}$, where $L(C(x))$ is the length of the codeword $C(x)$ assigned to letter $x$. Since $C$ is uniquely decodable, from (11), we have $\sum_{x\in\mathcal{X}}\psi(x)^{-1}\le 1$. Now, an application of Proposition 2 with $k=1$ yields the desired result. □

 Theorem 1. 

Let $X^n:=(X_1,\dots,X_n)$ be an i.i.d. sequence from $\mathcal{X}^n$ following the product distribution $P^n(x^n)=\prod_{i=1}^n P(x_i)$. Let $Q^n(x^n)=\prod_{i=1}^n Q(x_i)$, where $Q$ is another probability distribution. Let $C_n$ be a code such that $L(C_n(x^n))=\lceil -\log Q^n(x^n)\rceil$. Then, $C_n$ satisfies the Kraft–McMillan inequality and

$$\lim_{n\to\infty}\frac{E_{P^n}[L(C_n(X^n))]}{n} = H(P)+I(P,Q).$$

 Proof. 

Choose $\psi_n(x^n)=2^{L(C_n(x^n))}$, where $L(C_n(x^n))$ is the length of the codeword $C_n(x^n)$ assigned to the sequence $x^n$. Then, we have

$$\psi_n(x^n) = 2^{L(C_n(x^n))} = 2^{\lceil -\log Q^n(x^n)\rceil} \le 2\cdot 2^{-\log Q^n(x^n)} = 2/Q^n(x^n).$$

An application of Proposition 3 with $c_n=2$ yields $\limsup_{n\to\infty} E_{P^n}[L(C_n(X^n))]/n \le H(P)+I(P,Q)$. Furthermore, we also have

$$\psi_n(x^n) = 2^{\lceil -\log Q^n(x^n)\rceil} \ge 2^{-\log Q^n(x^n)} = 1/Q^n(x^n).$$

An application of Proposition 4 with $a_n=1$ gives $\liminf_{n\to\infty} E_{P^n}[L(C_n(X^n))]/n \ge H(P)+I(P,Q)$. □
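The argument of Theorem 1 can be illustrated in the one-shot setting (a sketch with arbitrary example distributions $P$ and $Q$; base-2 logarithms): the lengths $\lceil -\log Q(x)\rceil$ satisfy the Kraft–McMillan inequality, and the expected length under $P$ lies between $H(P)+I(P,Q)$ and $H(P)+I(P,Q)+1$.

```python
import math

def mismatched_lengths(q):
    """Shannon code lengths for an assumed distribution Q: L(x) = ceil(-log2 Q(x))."""
    return [math.ceil(-math.log2(qx)) for qx in q]

P = [0.5, 0.25, 0.125, 0.125]   # true distribution
Q = [0.7, 0.1, 0.1, 0.1]        # assumed (mismatched) distribution

L = mismatched_lengths(Q)
kraft_sum = sum(2.0 ** -l for l in L)      # Kraft-McMillan inequality holds

H = -sum(px * math.log2(px) for px in P)
KL = sum(px * math.log2(px / qx) for px, qx in zip(P, Q))
avg_len = sum(px * l for px, l in zip(P, L))
```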

3.2. Campbell Coding Problem

Campbell's coding problem is similar to Shannon's source coding problem except that, instead of minimizing the expected code-length, one is interested in minimizing the normalized cumulant of code lengths, that is,

$$\text{Minimize } \frac{1}{\rho}\log E_P\big[2^{\rho L(C(X))}\big],$$

over all uniquely decodable codes C, and ρ>0. This problem was shown to be equivalent to minimizing buffer overflow probability by Humblet in [27]. A lower bound for the normalized cumulants in terms of Rényi entropy was provided by Campbell [4].

Lemma 1 of [4]: Let $C$ be a uniquely decodable code. Then,

$$\frac{1}{\rho}\log E_P\big[2^{\rho L(C(X))}\big] \ge H_\alpha(P), \tag{12}$$

where $\alpha=1/(1+\rho)$.

 Proof. 

Apply Proposition 1 with $\psi(x)=2^{L(C(x))}$ and $k=1$. □

Notice that, if we ignore the integer constraint on the length function, then

$$L(C(x)) = \log\frac{Z_{P,\alpha}}{P(x)^\alpha}, \tag{13}$$

with $Z_{P,\alpha}$ as in Proposition 1, satisfies (11) and achieves the lower bound in (12). Campbell also showed that the lower bound in (12) can be achieved by encoding long sequences of symbols with code-lengths close to (13).

Theorem 1 of [4]: If $C_n$ is a uniquely decodable code such that

$$L(C_n(x^n)) = \left\lceil\log\frac{Z_{P^n,\alpha}}{P^n(x^n)^\alpha}\right\rceil, \tag{14}$$

then

$$\lim_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}\big[2^{\rho L(C_n(X^n))}\big] = H_\alpha(P).$$

 Proof. 

Choose $\psi_n(x^n)=2^{L(C_n(x^n))}$. Then, from (14), we have

$$\frac{Z_{P^n,\alpha}}{P^n(x^n)^\alpha} \le \psi_n(x^n) < 2\cdot\frac{Z_{P^n,\alpha}}{P^n(x^n)^\alpha}.$$

The result follows by applying Propositions 3 and 4 with $c_n=2$, $a_n=1$ and $Q=P$. □
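The one-shot Campbell lengths (13), with the ceiling applied, can be checked numerically. In this sketch (arbitrary $P$ and $\rho$), the lengths satisfy the Kraft–McMillan inequality and the normalized cumulant lands between $H_\alpha(P)$ and $H_\alpha(P)+1$, consistent with Propositions 1 and 3:

```python
import math

def campbell_lengths(p, alpha):
    """One-shot lengths L(x) = ceil(log2(Z_{P,alpha} / P(x)^alpha)), cf. (13)-(14)."""
    z = sum(px ** alpha for px in p)
    return [math.ceil(math.log2(z / px ** alpha)) for px in p]

def length_cumulant(p, lengths, rho):
    """Normalized cumulant (1/rho) log2 E_P[2^(rho L(X))]."""
    return math.log2(sum(px * 2.0 ** (rho * l) for px, l in zip(p, lengths))) / rho

P = [0.6, 0.2, 0.1, 0.1]
rho = 1.0
alpha = 1.0 / (1.0 + rho)   # = 1/2

L = campbell_lengths(P, alpha)
kraft = sum(2.0 ** -l for l in L)
H_alpha = math.log2(sum(px ** alpha for px in P)) / (1.0 - alpha)
```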

Mismatch Case:

Redundancy in the mismatched case of Campbell's problem was studied in [5,6]. Sundaresan showed that the difference in the normalized cumulant from the minimum, when encoding according to an arbitrary uniquely decodable code, is measured by the $I_\alpha$-divergence up to 1 bit [5]. We provide a more general version of this result in the following.

 Proposition 5. 

Let $X$ be a random variable that assumes values from the set $\mathcal{X}$ according to a probability distribution $P$. Let $\rho\in(-1,0)\cup(0,\infty)$ and let $L:\mathcal{X}\to\mathbb{Z}_+$ be an arbitrary length function that satisfies (11). Define

$$R_c(P,L,\rho) := \frac{1}{\rho}\log E_P\big[2^{\rho L(X)}\big] - \min_K\frac{1}{\rho}\log E_P\big[2^{\rho K(X)}\big], \tag{15}$$

where the minimum is over all length functions $K$ satisfying (11). Then, there exists a probability distribution $Q_L$ such that

$$I_\alpha(P,Q_L) - \log\eta - 1 \le R_c(P,L,\rho) \le I_\alpha(P,Q_L) - \log\eta, \tag{16}$$

where $\eta=\sum_x 2^{-L(x)}$.

 Proof. 

Since $K$ satisfies (11), an application of Proposition 1 with $\psi(x)=2^{K(x)}$ gives us $\frac{1}{\rho}\log E_P[2^{\rho K(X)}]\ge H_\alpha(P)$. Since $K(x)=\lceil\log(Z_{P,\alpha}/P(x)^\alpha)\rceil$ satisfies (11) and $\psi(x)=2^{K(x)}<2\cdot Z_{P,\alpha}/P(x)^\alpha$, applying Proposition 3 with $n=1$, $c_1=2$, and $Q=P$, we have

$$\frac{1}{\rho}\log E_P\big[2^{\rho K(X)}\big] \le H_\alpha(P)+1,$$

that is, the minimum in (15) lies between $H_\alpha(P)$ and $H_\alpha(P)+1$. Hence,

$$\frac{1}{\rho}\log E_P\big[2^{\rho L(X)}\big] - H_\alpha(P) - 1 \le R_c(P,L,\rho) \le \frac{1}{\rho}\log E_P\big[2^{\rho L(X)}\big] - H_\alpha(P). \tag{17}$$

Let us now define a probability distribution $Q_L$ as

$$Q_L(x) = \frac{2^{-L(x)/\alpha}}{\sum_x 2^{-L(x)/\alpha}}.$$

Then,

$$2^{L(x)} = \frac{Z_{Q_L,\alpha}}{Q_L(x)^\alpha}\cdot\frac{1}{\sum_x 2^{-L(x)}} = \frac{Z_{Q_L,\alpha}}{Q_L(x)^\alpha}\cdot\frac{1}{\eta},$$

where $\eta=\sum_x 2^{-L(x)}$. Applying Propositions 3 and 4 with $n=1$, $\psi_1(x)=2^{L(x)}$, $a_1=c_1=1/\eta$, we obtain

$$\frac{1}{\rho}\log E_P\big[2^{\rho L(X)}\big] = H_\alpha(P) + I_\alpha(P,Q_L) - \log\eta. \tag{18}$$

Combining (17) and (18), we obtain the desired result. □

We remark that the bound in (16) can be loose when $\eta$ is small. For example, for a source with two symbols, say $x$ and $y$, with code lengths $L(x)=L(y)=100$, we have $R_c(P,L,\rho)\ge I_\alpha(P,Q_L)+98$. However, if one imposes the constraint $1/2\le\eta\le 1$, then (16) simplifies to

$$|R_c(P,L,\rho) - I_\alpha(P,Q_L)| \le 1,$$

which is (Theorem 8 of [5]). $I_\alpha(P,Q_L)$ is, in a sense, the penalty when $Q_L$ does not match the true distribution $P$. In view of this, a result analogous to Proposition 5 also holds for the Shannon source coding problem.

3.3. Arıkan’s Guessing Problem

Let $\mathcal{X}$ be a set of objects with $|\mathcal{X}|=m$. Bob thinks of an object $X$ (a random variable) from $\mathcal{X}$ distributed according to a probability distribution $P$. Alice guesses it by asking questions of the form "Is $X=x$?". The objective is to minimize the average number of guesses Alice requires to guess $X$ correctly. By a guessing strategy (or guessing function), we mean a one-one map $G:\mathcal{X}\to\{1,\dots,m\}$, where $G(x)$ is to be interpreted as the number of guesses required to guess $x$ correctly. Arıkan studied the $\rho$th moment of the number of guesses and found upper and lower bounds in terms of Rényi entropy.

Theorem 1 of [13]: Let $G$ be any guessing function. Then, for $\rho\in(-1,0)\cup(0,\infty)$,

$$\frac{1}{\rho}\log E_P[G(X)^\rho] \ge H_\alpha(P) - \log(1+\ln m).$$

 Proof. 

Let $G$ be any guessing function and let $\psi(x)=G(x)$. Then, we have $\sum_{x\in\mathcal{X}}\psi(x)^{-1} = \sum_{x\in\mathcal{X}} 1/G(x) = \sum_{i=1}^m 1/i \le 1+\ln m$. An application of Proposition 1 with $k=1+\ln m$ yields the desired result. □

Arıkan showed that an optimal guessing function guesses according to the decreasing order of P-probabilities with ties broken using an arbitrary but fixed rule [13]. He also showed that normalized cumulant of an optimal guessing function is bounded above by the Rényi entropy. Next, we present a proof of this using our general framework.

Proposition 4 of [13]: If $G^*$ is an optimal guessing function, then, for $\rho\in(-1,0)\cup(0,\infty)$,

$$\frac{1}{\rho}\log E_P[G^*(X)^\rho] \le H_\alpha(P).$$

 Proof. 

Let us rearrange the probabilities $\{P(x), x\in\mathcal{X}\}$ in non-increasing order, say

$$p_1\ge p_2\ge\cdots\ge p_m.$$

Then, the optimal guessing function $G^*$ is given by $G^*(x)=i$ if $P(x)=p_i$. Let us index the elements of the set $\mathcal{X}$ as $\{x_1,x_2,\dots,x_m\}$ according to the decreasing order of their probabilities. Then, for $i\in\{1,\dots,m\}$, we have

$$\frac{Z_{P,\alpha}}{P(x_i)^\alpha} = \frac{\sum_{j=1}^m p_j^\alpha}{p_i^\alpha} \ge i = G^*(x_i). \tag{19}$$

That is, $G^*(x)\le Z_{P,\alpha}/P(x)^\alpha$ for all $x\in\mathcal{X}$. Now, an application of Proposition 3 with $n=1$, $Q=P$, $\psi_1(x)=G^*(x)$ and $c_1=1$ gives us

$$\frac{1}{\rho}\log E_P[G^*(X)^\rho] = \frac{1}{\rho}\log E_P[\psi_1(X)^\rho] \le H_\alpha(P)+I_\alpha(P,P)+\log 1 = H_\alpha(P).\qquad\square$$
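The two bounds above can be illustrated together in a small sketch (arbitrary $P$ and $\rho$): the optimal guessing order yields a normalized cumulant between $H_\alpha(P)-\log(1+\ln m)$ and $H_\alpha(P)$.

```python
import math

def optimal_guessing_order(p):
    """G*(x): guess in decreasing order of probability (1-indexed ranks)."""
    order = sorted(range(len(p)), key=lambda i: -p[i])
    g = [0] * len(p)
    for rank, i in enumerate(order, start=1):
        g[i] = rank
    return g

def guessing_cumulant(p, g, rho):
    """Normalized cumulant (1/rho) log2 E_P[G(X)^rho]."""
    return math.log2(sum(px * gx ** rho for px, gx in zip(p, g))) / rho

P = [0.1, 0.5, 0.15, 0.25]
rho = 1.0
alpha = 1.0 / (1.0 + rho)
m = len(P)

G = optimal_guessing_order(P)
H_alpha = math.log2(sum(px ** alpha for px in P)) / (1.0 - alpha)
c = guessing_cumulant(P, G, rho)
```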

Arıkan also proved that the upper bound of Rényi entropy can be achieved by guessing long sequences of letters in an i.i.d. fashion.

Proposition 5 of [13]: Let $X_1,X_2,\dots,X_n$ be a sequence of i.i.d. random variables with common distribution $P$. Let $G_n^*(X_1,\dots,X_n)$ be an optimal guessing function. Then, for $\rho\in(-1,0)\cup(0,\infty)$,

$$\lim_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}\big[G_n^*(X_1,X_2,\dots,X_n)^\rho\big] = H_\alpha(P).$$

 Proof. 

Let $G_n^*$ be the optimal guessing function from $\mathcal{X}^n$ to $\{1,2,\dots,m^n\}$. An application of Corollary 1 with $\psi_n(x^n)=G_n^*(x^n)$ and $k_n=1+n\ln m$ yields

$$\liminf_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[G_n^*(X^n)^\rho] \ge H_\alpha(P) - \limsup_{n\to\infty}\frac{\log(1+n\ln m)}{n} = H_\alpha(P). \tag{20}$$

As in the proof of the previous result, we know that $G_n^*(x^n)\le Z_{P^n,\alpha}/P^n(x^n)^\alpha$ for $x^n\in\mathcal{X}^n$. Hence, an application of Proposition 3 with $Q^n=P^n$, $\psi_n(x^n)=G_n^*(x^n)$, and $c_n=1$ yields

$$\limsup_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[G_n^*(X^n)^\rho] \le H_\alpha(P). \tag{21}$$

Combining (20) and (21), we obtain the desired result. □

Henceforth, we shall denote the optimal guessing function corresponding to a probability distribution P by GP.

Mismatch Case:

Suppose Alice does not know the true underlying probability distribution P, and guesses according to some guessing function G. The following proposition tells us that the penalty for deviating from the optimal guessing function can be measured by Iα-divergence.

 Proposition 6. 

Let $G$ be an arbitrary guessing function. Then, for $\rho\in(-1,0)\cup(0,\infty)$, there exists a probability distribution $Q_G$ on $\mathcal{X}$ such that

$$\frac{1}{\rho}\log E_P[G(X)^\rho] \ge H_\alpha(P) + I_\alpha(P,Q_G) - \log(1+\ln m).$$

 Proof. 

Let $G$ be a guessing function. Define a probability distribution $Q_G$ on $\mathcal{X}$ as

$$Q_G(x) = \frac{G(x)^{-1/\alpha}}{\sum_{x\in\mathcal{X}} G(x)^{-1/\alpha}}. \tag{22}$$

Then, we have

$$\frac{Z_{Q_G,\alpha}}{Q_G(x)^\alpha} = G(x)\sum_{x\in\mathcal{X}}\frac{1}{G(x)} \le G(x)\cdot(1+\ln m).$$

Now, an application of Proposition 4 with $n=1$, $\psi_1(x)=G(x)$, and $a_1=1/(1+\ln m)$ yields the desired result. □

A converse result is the following.

Proposition 1 of [5]: Let $G_Q$ be an optimal guessing function associated with $Q$. Then, for $\rho\in(-1,0)\cup(0,\infty)$,

$$\frac{1}{\rho}\log E_P[G_Q(X)^\rho] \le H_\alpha(P) + I_\alpha(P,Q),$$

where the expectation is with respect to P.

 Proof. 

Let us rearrange the probabilities $\{Q(x), x\in\mathcal{X}\}$ in non-increasing order, say

$$q_1\ge q_2\ge\cdots\ge q_m.$$

By definition, $G_Q(x)=i$ if $Q(x)=q_i$. Then, as in (19), we have $G_Q(x)\le Z_{Q,\alpha}/Q(x)^\alpha$ for all $x\in\mathcal{X}$. Hence, an application of Proposition 3 with $n=1$, $\psi_1(x)=G_Q(x)$, and $c_1=1$ proves the result. □

Observe that, given a guessing function $G$, if we apply the above proposition to $Q=Q_G$, where $Q_G$ is as in (22), then we obtain

$$\frac{1}{\rho}\log E_P\big[G_{Q_G}(X)^\rho\big] \le H_\alpha(P) + I_\alpha(P,Q_G).$$

Thus, the above two propositions can be combined to state the following, which is analogous to Proposition 5 (refer Section 3.2).

Theorem 6 of [5]: Let $G$ be an arbitrary guessing function and $G_P$ the optimal guessing function for $P$. For $\rho\in(-1,0)\cup(0,\infty)$, let

$$R_g(P,G,\rho) := \frac{1}{\rho}\log E_P[G(X)^\rho] - \frac{1}{\rho}\log E_P[G_P(X)^\rho].$$

Then, there exists a probability distribution $Q_G$ such that

$$|R_g(P,G,\rho) - I_\alpha(P,Q_G)| \le \log(1+\ln m).$$
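A quick numerical check of this bound (a sketch; the deliberately bad guessing order $G$ and the distribution $P$ are arbitrary choices): compute $R_g$, the distribution $Q_G$ of (22), and verify that the redundancy tracks $I_\alpha(P,Q_G)$ within $\log(1+\ln m)$.

```python
import math

def guessing_cumulant(p, g, rho):
    return math.log2(sum(px * gx ** rho for px, gx in zip(p, g))) / rho

def renyi_entropy(p, alpha):
    return math.log2(sum(px ** alpha for px in p)) / (1.0 - alpha)

def sundaresan_divergence(p, q, alpha):
    cross = sum(px * qx ** (alpha - 1.0) for px, qx in zip(p, q))
    z_q = sum(qx ** alpha for qx in q)
    return (alpha / (1.0 - alpha)) * math.log2(cross) + math.log2(z_q) \
        - renyi_entropy(p, alpha)

P = [0.5, 0.25, 0.15, 0.10]   # already sorted, so G_P is the identity order
G_P = [1, 2, 3, 4]
G = [4, 3, 2, 1]              # a deliberately bad guessing order
rho, m = 1.0, len(P)
alpha = 1.0 / (1.0 + rho)

# Q_G from Equation (22): Q_G(x) proportional to G(x)^{-1/alpha}
w = [gx ** (-1.0 / alpha) for gx in G]
total = sum(w)
Q_G = [wx / total for wx in w]

R_g = guessing_cumulant(P, G, rho) - guessing_cumulant(P, G_P, rho)
gap = abs(R_g - sundaresan_divergence(P, Q_G, alpha))
```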

3.4. Memoryless Guessing

In memoryless guessing, the setup is similar to that of Arıkan's guessing problem except that this time the guesser Alice comes up with guesses independent of her previous guesses. Let $\hat{X}_1,\hat{X}_2,\dots$ be Alice's sequence of independent guesses drawn according to a distribution $\hat{P}$. The guessing function in this problem is defined as

$$G_{\hat{P}}(X) := \inf\{i\ge 1 : \hat{X}_i = X\},$$

that is, the number of guesses until a successful guess. Sundaresan [28], inspired by Arıkan's result, showed that the minimum expected number of guesses is $2^{H_{1/2}(P)}$, and the distribution that achieves it is, surprisingly, not the underlying distribution $P$, but the "tilted distribution" $\hat{P}^*(x) := \sqrt{P(x)}/\sum_y\sqrt{P(y)}$.

Unlike in Arıkan's guessing problem, Huleihel et al. [19] minimized what are called factorial moments, defined for $\rho\in\mathbb{Z}_+$ as

$$V_{\hat{P},\rho}(X) = \frac{1}{\rho!}\prod_{l=0}^{\rho-1}\big(G_{\hat{P}}(X)+l\big).$$

Huleihel et al. [19] (c.f. [20]) studied the following problem.

$$\text{Minimize } E_P\big[V_{\hat{P},\rho}(X)\big],$$

over all $\hat{P}\in\mathcal{P}$, where $\mathcal{P}$ is the probability simplex, that is, $\mathcal{P}=\{(P(x))_{x\in\mathcal{X}} : P(x)\ge 0,\ \sum_x P(x)=1\}$. Let $\hat{P}^*$ be the optimal solution of the above problem.

Theorem 1 of [19]: For any integer $\rho>0$, we have

$$\frac{1}{\rho}\log E_P\big[V_{\hat{P}^*,\rho}(X)\big] = H_\alpha(P),$$

and $\hat{P}^*(x)=P(x)^\alpha/Z_{P,\alpha}$.

 Proof. 

From [19], we know that

$$E_P\big[V_{\hat{P},\rho}(X)\big] = E_P\big[\hat{P}(X)^{-\rho}\big]. \tag{23}$$

Now, the result follows from Proposition 1 with $\psi(x)=\hat{P}(x)^{-1}$ and $k=1$. Indeed, since $\hat{P}$ is a probability distribution, we have $\sum_{x\in\mathcal{X}}\psi(x)^{-1}=\sum_{x\in\mathcal{X}}\hat{P}(x)=1$. Hence, $\frac{1}{\rho}\log E_P[V_{\hat{P},\rho}(X)]\ge H_\alpha(P)$, and the lower bound is attained by $\hat{P}^*(x)=P(x)^\alpha/Z_{P,\alpha}$. □
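For $\rho=1$, the factorial moment reduces to the expected number of guesses, and identity (23) reads $E_P[V_{\hat{P},1}]=\sum_x P(x)/\hat{P}(x)$ (given $X=x$, the guess count is geometric with success probability $\hat{P}(x)$). The following sketch (arbitrary $P$) checks that the escort distribution attains $H_{1/2}(P)$ and beats guessing with $\hat{P}=P$:

```python
import math

def expected_guesses(p, phat):
    """E_P[G] for memoryless guessing with rho = 1, via identity (23)."""
    return sum(px / qx for px, qx in zip(p, phat))

P = [0.5, 0.25, 0.15, 0.10]
rho = 1.0
alpha = 1.0 / (1.0 + rho)   # = 1/2

# Optimal memoryless guessing distribution: the escort P(x)^alpha / Z_{P,alpha}
z = sum(px ** alpha for px in P)
Phat_star = [px ** alpha / z for px in P]

H_alpha = math.log2(z) / (1.0 - alpha)
best = math.log2(expected_guesses(P, Phat_star)) / rho
```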

For a sequence of guesses, the above theorem can be stated in the following way. Let the guesses $\hat{X}_1,\hat{X}_2,\dots$ be i.i.d., each drawn from $\mathcal{X}^n$ with distribution $\hat{P}^n$, the $n$-fold product distribution of $\hat{P}$ on $\mathcal{X}^n$. If the true underlying distribution is $P^n$, then

$$\lim_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}\big[V_{\hat{P}^{n*},\rho}(X^n)\big] = H_\alpha(P),$$

where $\hat{P}^{n*}(x^n)=P^n(x^n)^\alpha/Z_{P^n,\alpha}$. For the mismatched case, we have the following result.

 Proposition 7. 

If the true underlying probability distribution is $P$, but Alice assumes it is $Q$ and guesses according to the corresponding optimal distribution, namely $\hat{Q}^*(x)=Q(x)^\alpha/Z_{Q,\alpha}$, then

$$\frac{1}{\rho}\log E_P\big[V_{\hat{Q}^*,\rho}(X)\big] = H_\alpha(P) + I_\alpha(P,Q).$$

 Proof. 

Due to (23), the result follows easily by taking $n=1$, $\psi_1(x)=\hat{Q}^*(x)^{-1}$, $c_1=1$, $a_1=1$ in Propositions 3 and 4. □

3.5. Tasks Partitioning Problem

The tasks-encoding problem studied by Bunte and Lapidoth [8] can be phrased in the following way. Let $\mathcal{X}$ be a finite set of tasks. A task $X$ is randomly drawn from $\mathcal{X}$ according to a probability distribution $P$, which may correspond to the frequency of occurrence of tasks. Suppose these tasks are associated with $M$ keys; typically, $M<|\mathcal{X}|$. Owing to the limited availability of keys, more than one task may be associated with a single key. When a task needs to be performed, the key associated with it is pressed; consequently, all tasks associated with this key are performed. The objective in this problem is to minimize the number of redundant tasks performed. Usual coding techniques suggest assigning high-probability tasks to individual keys and leaving the low-probability tasks unassigned. Some tasks may have a higher frequency of occurrence than others; however, for an individual, all tasks can be equally important. If $M\ge|\mathcal{X}|$, then one can perform tasks without any redundancy. However, Bunte and Lapidoth [8] showed that, even when $M<|\mathcal{X}|$, one can accomplish the tasks with much less redundancy on average, provided the underlying probability distribution is different from the uniform distribution.

Let $\mathcal{A}=\{A_1,A_2,\dots,A_M\}$ be a partition of $\mathcal{X}$ that corresponds to the assignment of tasks to the $M$ keys. Let $A(x)$ be the cardinality of the subset of the partition containing $x$. We shall call $A$ the partition function associated with the partition $\mathcal{A}$. We shall assume that $\rho>0$ throughout this section, though some of the results hold even when $\rho\in(-1,0)$.

Theorem I.1 of [8]: The following results hold.

  • (a)
    For any partition of $\mathcal{X}$ of size $M$ with partition function $A$, we have
    $$\frac{1}{\rho}\log E_P[A(X)^\rho] \ge H_\alpha(P) - \log M.$$
  • (b)
    If $M>\log|\mathcal{X}|+2$, then there exists a partition of $\mathcal{X}$ of size at most $M$ with partition function $A$ such that
    $$1 \le E_P[A(X)^\rho] \le 1+2^{\rho(H_\alpha(P)-\log\tilde{M})},$$
    where
    $$\tilde{M} := (M-\log|\mathcal{X}|-2)/4. \tag{24}$$

 Proof. 

Part (a): Let $\psi(x)=A(x)$. Then, we have $\sum_{x\in\mathcal{X}}\psi(x)^{-1} = \sum_{x\in\mathcal{X}} A(x)^{-1} = M$ (Proposition III.1 of [8]). Now, an application of Proposition 1 with $k=M$ gives us the desired result.

Part (b): For the proof of this part, we refer to [8]. □

Bunte and Lapidoth also proved the following limit results.

Theorem I.2 of [8]: For every $n\ge 1$, there exists a partition $\mathcal{A}_n$ of $\mathcal{X}^n$ of size at most $M^n$ with an associated partition function $A_n$ such that

$$\lim_{n\to\infty} E_{P^n}[A_n(X^n)^\rho] = \begin{cases} 1 & \text{if } \log M > H_\alpha(P),\\ \infty & \text{if } \log M < H_\alpha(P),\end{cases}$$

where $X^n := (X_1,\dots,X_n)$.

It should be noted that, in a general set-up of the tasks partitioning problem, it is not necessary that the partition size be of the form $M^n$; it can be some $M_n$ (a function of $n$). Consequently, we have the following result.

 Proposition 8. 

Let $\{M_n\}$ be a sequence of positive integers such that $M_n\ge n\log|\mathcal{X}|+3$, and suppose

$$\gamma := \lim_{n\to\infty}\frac{\log M_n}{n}$$

exists. Then, there exists a sequence of partitions of $\mathcal{X}^n$ of size at most $M_n$ with partition functions $A_n$ such that

  • (a)
    $\lim_{n\to\infty} E_{P^n}[A_n(X^n)^\rho]=1$ if $\gamma>H_\alpha(P)$;
  • (b)
    $\lim_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[A_n(X^n)^\rho]=H_\alpha(P)-\gamma$ if $\gamma<H_\alpha(P)$.

 Proof. 

Let

$$\tilde{M}_n := (M_n - n\log|\mathcal{X}| - 2)/4. \tag{25}$$

We first claim that $\lim_{n\to\infty}\frac{\log\tilde{M}_n}{n}=\gamma$. Indeed, since $\frac{\log(1/4)}{n}\le\frac{\log\tilde{M}_n}{n}<\frac{\log M_n}{n}$, when $\gamma=0$, we have $\lim_{n\to\infty}\frac{\log\tilde{M}_n}{n}=0$. On the other hand, when $\gamma>0$, we can find an $n_\gamma$ such that $M_n\ge 2^{\gamma n/2}$ for all $n\ge n_\gamma$. Thus, we have $\lim_{n\to\infty} n/M_n=0$. Consequently,

$$\lim_{n\to\infty}\frac{\log\tilde{M}_n}{n} = \lim_{n\to\infty}\frac{\log M_n}{n} + \lim_{n\to\infty}\frac{1}{n}\log\Big(1-\frac{n\log|\mathcal{X}|+2}{M_n}\Big) - \lim_{n\to\infty}\frac{2}{n} = \gamma.$$

This proves the claim. From Theorem I.1 of [8], for any $n\ge 1$ and $M_n>n\log|\mathcal{X}|+2$, there exists a partition $\mathcal{A}_n$ of $\mathcal{X}^n$ of size at most $M_n$ such that the associated partition function $A_n$ satisfies

$$E_{P^n}[A_n(X^n)^\rho] \le 1+2^{\rho(H_\alpha(P^n)-\log\tilde{M}_n)} = 1+2^{n\rho\left(H_\alpha(P)-\frac{\log\tilde{M}_n}{n}\right)}.$$

Part (a): When $\gamma>H_\alpha(P)$, let us choose $\epsilon=(\gamma-H_\alpha(P))/2>0$. Then, there exists an $n_\epsilon$ such that $\frac{\log\tilde{M}_n}{n}\ge\gamma-\epsilon$ for all $n\ge n_\epsilon$. Thus, we have

$$E_{P^n}[A_n(X^n)^\rho] \le 1+2^{n\rho(H_\alpha(P)-\gamma+\epsilon)} = 1+2^{-n\rho(\gamma-H_\alpha(P))/2}\quad\text{for all } n\ge n_\epsilon.$$

Consequently, $\limsup_{n\to\infty} E_{P^n}[A_n(X^n)^\rho]\le 1$. We also note that $A_n(x^n)\ge 1$ for all $x^n\in\mathcal{X}^n$. Thus, $\liminf_{n\to\infty} E_{P^n}[A_n(X^n)^\rho]\ge 1$.

Part (b): For any $\epsilon>0$, there exists an $n_\epsilon$ such that $\frac{\log\tilde{M}_n}{n}\ge\gamma-\epsilon$ for all $n\ge n_\epsilon$. Thus, we have

$$E_{P^n}[A_n(X^n)^\rho] \le 1+2^{n\rho(H_\alpha(P)-\gamma+\epsilon)} \le 2^{1+n\rho(H_\alpha(P)-\gamma+\epsilon)}\quad\text{for all } n\ge n_\epsilon.$$

Hence, we have

$$\limsup_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[A_n(X^n)^\rho] \le H_\alpha(P)-\gamma+\epsilon\quad\text{for all }\epsilon>0.$$

Furthermore, an invocation of Corollary 1 with $\psi_n(x^n)=A_n(x^n)$ and $k_n=\sum_{x^n\in\mathcal{X}^n} 1/A_n(x^n)=M_n$ gives us

$$\liminf_{n\to\infty}\frac{1}{n\rho}\log E_{P^n}[A_n(X^n)^\rho] \ge H_\alpha(P) - \limsup_{n\to\infty}\frac{\log M_n}{n} = H_\alpha(P)-\gamma.\qquad\square$$

 Remark 1. 

It is interesting to note that, when $\gamma<H_\alpha(P)$, in addition to the fact that $\lim_{n\to\infty} E_{P^n}[A_n(X^n)^\rho]=\infty$, we also have $E_{P^n}[A_n(X^n)^\rho]\approx 2^{n\rho(H_\alpha(P)-\gamma)}$ for large values of $n$.

Mismatch Case:

Let us now suppose that one does not know the true underlying probability distribution P, but arbitrarily partitions X. Then, the penalty due to such a partition can be measured by the Iα-divergence as stated in the following theorem.

 Proposition 9. 

Let $\mathcal{A}$ be a partition of $\mathcal{X}$ of size $M$ with partition function $A$. Then, there exists a probability distribution $Q_A$ on $\mathcal{X}$ such that

$$\frac{1}{\rho}\log E_P[A(X)^\rho] = H_\alpha(P)+I_\alpha(P,Q_A)-\log M.$$

 Proof. 

Define a probability distribution $Q_A=\{Q_A(x), x\in\mathcal{X}\}$ as

$$Q_A(x) := \frac{A(x)^{-1/\alpha}}{\sum_{x\in\mathcal{X}} A(x)^{-1/\alpha}}.$$

Then,

$$\frac{Z_{Q_A,\alpha}}{Q_A(x)^\alpha} = A(x)\sum_{x\in\mathcal{X}}\frac{1}{A(x)} = A(x)\cdot M,$$

where the last equality follows from Proposition III.1 of [8]. Rearranging terms, we have $A(x)=\frac{Z_{Q_A,\alpha}}{M\cdot Q_A(x)^\alpha}$. Hence, an application of Propositions 3 and 4 with $n=1$, $\psi_1(x)=A(x)$, $c_1=1/M$, $a_1=1/M$, and $Q=Q_A$ yields the desired result. □
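The exact identity of Proposition 9 can be verified numerically (a sketch; the partition and the distribution $P$ below are arbitrary choices):

```python
import math

def renyi_entropy(p, alpha):
    return math.log2(sum(px ** alpha for px in p)) / (1.0 - alpha)

def sundaresan_divergence(p, q, alpha):
    cross = sum(px * qx ** (alpha - 1.0) for px, qx in zip(p, q))
    z_q = sum(qx ** alpha for qx in q)
    return (alpha / (1.0 - alpha)) * math.log2(cross) + math.log2(z_q) \
        - renyi_entropy(p, alpha)

# Four tasks, M = 2 keys; A(x) is the size of the cell containing x.
P = [0.4, 0.3, 0.2, 0.1]
partition = [[0], [1, 2, 3]]
A = [0] * len(P)
for cell in partition:
    for x in cell:
        A[x] = len(cell)        # A = [1, 3, 3, 3]
M = len(partition)

rho = 1.0
alpha = 1.0 / (1.0 + rho)

lhs = math.log2(sum(px * ax ** rho for px, ax in zip(P, A))) / rho

# Q_A(x) proportional to A(x)^{-1/alpha}, as in Proposition 9
w = [ax ** (-1.0 / alpha) for ax in A]
total = sum(w)
Q_A = [wx / total for wx in w]
rhs = renyi_entropy(P, alpha) + sundaresan_divergence(P, Q_A, alpha) - math.log2(M)
```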

A converse result is the following.

 Proposition 10. 

Let $X$ be a random task from $\mathcal{X}$ following distribution $P$ and $\rho\in(0,\infty)$. Let $Q$ be another distribution on $\mathcal{X}$. If $M>\log|\mathcal{X}|+2$, then there exists a partition $\mathcal{A}_Q$ (with an associated partition function $A_Q$) of $\mathcal{X}$ of size at most $M$ such that

$$E_P[A_Q(X)^\rho] \le 1+2^{\rho(H_\alpha(P)+I_\alpha(P,Q)-\log\tilde{M})},$$

where M˜ is as in (24).

 Proof. 

Similar to proof of Theorem I.1 of [8]. □

4. Ordered Tasks Partitioning Problem

In Bunte–Lapidoth's tasks partitioning problem [8], one is interested in the average number of tasks associated with a key. However, in some scenarios, it might be more important to minimize the average number of redundant tasks performed before the intended task. To achieve this, the tasks associated with a key should be performed in decreasing order of their probabilities. With such a strategy in place, this problem draws a parallel with Arıkan's guessing problem [13].

Let $\mathcal{A}=\{A_1,A_2,\dots,A_M\}$ be a partition of $\mathcal{X}$ that corresponds to the assignment of tasks to $M$ keys. Let $N(x)$ be the number of tasks performed up to and including the intended task $x$. We refer to $N(\cdot)$ as the count function associated with the partition $\mathcal{A}$; we suppress the dependence of $N$ on $\mathcal{A}$ for notational convenience. If $X$ denotes the intended task, then we are interested in the $\rho$th moment of the number of tasks performed, that is, $E_P[N(X)^\rho]$, where $\rho>0$.

 Lemma 1. 

For any count function associated with a partition of size M, we have

$$\sum_{x\in X}\frac{1}{N(x)} \le M\left[1+\ln\frac{|X|}{M}\right]. \quad (26)$$

 Proof. 

For a partition $\mathcal{A}=\{A_1,A_2,\dots,A_M\}$ of X, observe that

$$\sum_{x\in X}\frac{1}{N(x)} = \left(1+\frac{1}{2}+\cdots+\frac{1}{|A_1|}\right)+\cdots+\left(1+\frac{1}{2}+\cdots+\frac{1}{|A_M|}\right). \quad (27)$$

Since $1+\frac{1}{2}+\cdots+\frac{1}{|A_k|} \le 1+\ln|A_k|$ for any $k\in\{1,\dots,M\}$, we have

$$\sum_{x\in X}\frac{1}{N(x)} \le M+\ln\left(|A_1|\cdots|A_M|\right) = M\left[1+\ln\left(|A_1|\cdots|A_M|\right)^{1/M}\right] \overset{(a)}{\le} M\left[1+\ln\frac{|A_1|+\cdots+|A_M|}{M}\right] = M\left[1+\ln\frac{|X|}{M}\right], \quad (28)$$

where (a) follows due to the AM–GM inequality. □
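The bound (26) can be checked empirically over random partitions. The following sketch (an illustration we add; the helper `count_function` and the specific random construction are our own assumptions) executes each cell's tasks in a fixed order and verifies the harmonic-sum bound:

```python
import math
import random

def count_function(cells):
    # N(x) = position of task x within its key's cell,
    # when tasks in a cell are executed in a fixed order
    return {x: i for cell in cells for i, x in enumerate(cell, start=1)}

random.seed(0)
X = list(range(50))
ok = True
for _ in range(200):
    M = random.randint(1, 10)
    random.shuffle(X)
    # a uniformly random ordered partition of X into M nonempty cells
    cuts = sorted(random.sample(range(1, len(X)), M - 1))
    cells = [X[i:j] for i, j in zip([0] + cuts, cuts + [len(X)])]
    N = count_function(cells)
    harmonic_sum = sum(1 / N[x] for x in X)
    ok = ok and (harmonic_sum <= M * (1 + math.log(len(X) / M)) + 1e-12)
```

In every trial the harmonic sum stays below $M[1+\ln(|X|/M)]$, as Lemma 1 guarantees.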

 Proposition 11. 

Let X be a random task from X following distribution P. Then, the following hold:

  • (a)
    For any partition of X of size M, we have
    $$\frac{1}{\rho}\log \mathbb{E}_P[N(X)^\rho] \ge H_\alpha(P) - \log\{M[1+\ln(|X|/M)]\} \quad (29)$$
  • (b)
    Let M>log|X|+2. Then, there exists a partition of X of size at most M with count function N such that
    $$\mathbb{E}_P[N(X)^\rho] \le 1 + 2^{\rho\left(H_\alpha(P)-\log\tilde{M}\right)},$$
    where $\tilde{M}$ is as in (24).

 Proof. 

Part (a): Applying Proposition 1 with $k = M[1+\ln(|X|/M)]$ and $\psi(x)=N(x)$, we obtain the desired result.

Part (b): If A and N are, respectively, the partition function and the count function of a partition $\mathcal{A}$, then we have $1 \le N(x) \le A(x)$ for $x\in X$. Once we observe this, the proof is the same as that of Theorem I.1 (b) of [8]. □

 Proposition 12. 

Let $\{M_n\}$ be a sequence of positive integers such that $M_n \ge n\log|X|+3$, and $\gamma := \lim_{n\to\infty}(\log M_n)/n$ exists. Then, there exists a sequence of partitions of $X^n$ of size at most $M_n$ with count functions $N_n$ such that

  • (a) 
    $$\lim_{n\to\infty}\mathbb{E}_{P^n}[N_n(X^n)^\rho] = 1 \quad \text{if } \gamma > H_\alpha(P),$$
  • (b) 
    $$\lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[N_n(X^n)^\rho] = H_\alpha(P) - \gamma \quad \text{if } \gamma < H_\alpha(P).$$

 Proof. 

Similar to proof of Proposition 8. □

Remark 2. 

  • (a) 
    If we choose the trivial partition, namely $\mathcal{A}_n=\{X^n\}$, then the ordered tasks partitioning problem simplifies to Arıkan’s guessing problem; that is, we have $M_n=1$, $N_n(x^n)=G_n(x^n)$, and (26) simplifies to
    $$\sum_{x^n\in X^n}\frac{1}{G_n(x^n)} \le 1+n\ln|X|.$$

    Hence, all results pertaining to Arıkan’s guessing problem can be derived from the ordered tasks partitioning problem.

  • (b) 

    Structurally, the ordered tasks partitioning problem differs from Bunte–Lapidoth’s problem only due to the factor $1+\ln(|X|/M)$ in (28). While this factor matters for one-shot results, for a sequence of i.i.d. tasks it vanishes asymptotically.

Mismatch Case:

Let us now suppose that one does not know the true underlying probability distribution P, but arbitrarily partitions X and executes tasks within each subset of this partition in an arbitrary order. Then, the penalty due to such a partition and ordering can be measured by the Iα-divergence as stated in the following propositions.

 Proposition 13. 

Let A be a partition of X of size M with count function N. Then, there exists a probability distribution QN on X such that

$$\frac{1}{\rho}\log\mathbb{E}_P[N(X)^\rho] \ge H_\alpha(P) + I_\alpha(P,Q_N) - \log\{M[1+\ln(|X|/M)]\}.$$

 Proof. 

Define a probability distribution QN={QN(x),xX} as

$$Q_N(x) := \frac{N(x)^{-1/\alpha}}{\sum_{x'\in X} N(x')^{-1/\alpha}}.$$

Then, by Lemma 1, we have

$$\frac{Z_{Q_N,\alpha}}{Q_N(x)^\alpha} = N(x)\sum_{x'\in X}\frac{1}{N(x')} \le N(x)\cdot M[1+\ln(|X|/M)].$$

Now, an application of Proposition 4 with $n=1$, $\psi_1(x)=N(x)$, $Q=Q_N$, and $a_1 = 1/\{M[1+\ln(|X|/M)]\}$ yields the desired result. □
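The lower bound of Proposition 13 can also be checked numerically. The sketch below is our own illustration (the helpers `H_alpha` and `I_alpha`, and the chosen distribution, partition, and within-cell ordering, are assumptions); it constructs $Q_N$ from an arbitrary count function and verifies the inequality:

```python
import math

def H_alpha(p, a):
    # Renyi entropy of order a (base 2)
    return math.log2(sum(px ** a for px in p)) / (1 - a)

def I_alpha(p, q, a):
    # Relative a-entropy via the normalized escort of q
    Z = sum(qx ** a for qx in q)
    s = sum(px * (qx ** a / Z) ** ((a - 1) / a) for px, qx in zip(p, q))
    return (a / (1 - a)) * math.log2(s) - H_alpha(p, a)

rho = 1.0
a = 1 / (1 + rho)
p = [0.35, 0.25, 0.15, 0.1, 0.1, 0.05]
cells = [[2, 0], [5, 1], [4, 3]]    # arbitrary partition, arbitrary order within each cell
M, size = len(cells), len(p)
N = {x: i for cell in cells for i, x in enumerate(cell, start=1)}  # count function

w = [N[x] ** (-1 / a) for x in range(size)]
qN = [v / sum(w) for v in w]        # Q_N proportional to N(x)^(-1/alpha)

lhs = math.log2(sum(p[x] * N[x] ** rho for x in range(size))) / rho
rhs = H_alpha(p, a) + I_alpha(p, qN, a) - math.log2(M * (1 + math.log(size / M)))
csum = sum(1 / N[x] for x in range(size))
```

The slack between the two sides is exactly $\log\{M[1+\ln(|X|/M)]\} - \log \sum_x 1/N(x)$, which is nonnegative by Lemma 1.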

A converse result is the following.

 Proposition 14. 

Let X be a random task from X following distribution P. Let Q be another distribution on X. If $M > \log|X| + 2$, then there exists a partition $\mathcal{A}_Q$ (with an associated count function $N_Q$) of X of size at most M such that

$$\mathbb{E}_P[N_Q(X)^\rho] \le 1 + 2^{\rho\left(H_\alpha(P)+I_\alpha(P,Q)-\log\tilde{M}\right)} \quad \text{if } \rho\in(0,\infty),$$

where $\tilde{M}$ is as in (24).

 Proof. 

Identical to the proof of Proposition 11(b). □

5. Operational Connection among the Problems

In this section, we establish an operational relationship among the five problems (see Figure 1) that we studied in the previous sections. The relationship we are interested in is: does knowing an optimal or asymptotically optimal solution to one problem help us find the same in another? In fact, we show that, under suitable conditions, all five problems form an equivalence class with respect to this relation.

Figure 1. Relationships established among the five problems. A directed arrow from problem A to problem B means that knowing an optimal or asymptotically optimal solution of A helps us find the same in B.

In this section, we assume ρ>0. First, we make the following observations:

  • Among the five problems discussed in the previous sections, only Arıkan’s guessing and Huleihel et al.’s memoryless guessing have unique optimal solutions; the others have only asymptotically optimal solutions.

  • The optimal solution of Huleihel et al.’s memoryless guessing problem is the α-scaled measure of the underlying probability distribution P. Hence, knowledge of the optimal solution of this problem implies knowledge of an optimal (or asymptotically optimal) solution of all the other problems.

  • Between Bunte–Lapidoth’s and the ordered tasks partitioning problems, an asymptotically optimal solution of one yields that of the other. The partitioning lemma (Proposition III-2 of [8]) is the key result in these two problems, as it guarantees the existence of asymptotically optimal partitions in both.

5.1. Campbell’s Coding and Arıkan’s Guessing

An attempt to find a close relationship between these two problems was made, for example, by Hanawal and Sundaresan (Section II of [17]). Here, we show the equivalence between asymptotically optimal solutions of these two problems.

 Proposition 15. 

An asymptotically optimal solution exists for Campbell’s source coding problem if and only if an asymptotically optimal solution exists for Arıkan’s guessing problem.

 Proof. 

Let {Gn*} be an asymptotically optimal sequence of guessing functions, that is,

$$\lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[G_n^*(X^n)^\rho] = H_\alpha(P).$$

Define

$$Q_{G_n^*}(x^n) := c_n^{-1}\cdot G_n^*(x^n)^{-1}, \quad (30)$$

where cn is the normalization constant. Notice that

$$c_n = \sum_{x^n} G_n^*(x^n)^{-1} \le 1+n\ln|X|. \quad (31)$$

Let us now define

$$L_{G_n^*}(x^n) := \left\lceil -\log Q_{G_n^*}(x^n)\right\rceil.$$

Then, by (Proposition 1 of [17]),

$$L_{G_n^*}(x^n) \le \log G_n^*(x^n) + 1 + \log c_n.$$

Hence,

$$2^{\rho L_{G_n^*}(x^n)} \le 2^\rho\cdot c_n^\rho\cdot G_n^*(x^n)^\rho \le 2^\rho\cdot(1+n\ln|X|)^\rho\cdot G_n^*(x^n)^\rho.$$

Thus, we have

$$\limsup_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}\left[2^{\rho L_{G_n^*}(X^n)}\right] \le \limsup_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[G_n^*(X^n)^\rho] = H_\alpha(P).$$

We observe that

$$\sum_{x^n\in X^n} 2^{-L_{G_n^*}(x^n)} = \sum_{x^n\in X^n} 2^{-\lceil -\log Q_{G_n^*}(x^n)\rceil} \le \sum_{x^n\in X^n} 2^{\log Q_{G_n^*}(x^n)} = 1.$$

Consequently, from (12), we have

$$\liminf_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}\left[2^{\rho L_{G_n^*}(X^n)}\right] \ge H_\alpha(P).$$

Thus, {LGn*} is an asymptotically optimal sequence of length functions for Campbell’s coding problem.

Conversely, given an asymptotically optimal sequence of length functions {Ln*} for Campbell’s coding problem, define

$$Q_{L_n^*}(x^n) := \frac{2^{-L_n^*(x^n)}}{\sum_{y^n} 2^{-L_n^*(y^n)}}.$$

Let GLn* be the guessing function on Xn that guesses according to the decreasing order of QLn*-probabilities. Then, by (Proposition 2 of [17]),

$$\log G_{L_n^*}(x^n) \le L_n^*(x^n).$$

Thus,

$$\limsup_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[G_{L_n^*}(X^n)^\rho] \le \lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}\left[2^{\rho L_n^*(X^n)}\right] = H_\alpha(P).$$

Furthermore, from Theorem 1 of [13], we have

$$\liminf_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[G_{L_n^*}(X^n)^\rho] \ge H_\alpha(P) - \limsup_{n\to\infty}\frac{1}{n}\log(1+n\ln|X|) = H_\alpha(P).$$

This completes the proof. □
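The forward construction in this proof, lengths from a guessing function via $Q_{G_n^*}\propto 1/G_n^*$, can be sketched for a single-letter example. This is an illustration we add (the specific distribution is an assumption); it checks the Kraft inequality and the length bound used above:

```python
import math

p = [0.4, 0.3, 0.15, 0.1, 0.05]                  # P on X = {0,...,4}
X = range(len(p))
order = sorted(X, key=lambda x: -p[x])
G = {x: i + 1 for i, x in enumerate(order)}      # guess in decreasing order of P
c = sum(1 / G[x] for x in X)                     # normalizer c_n, as in (31)
Q = {x: 1 / (c * G[x]) for x in X}               # Q proportional to 1/G, as in (30)
L = {x: math.ceil(-math.log2(Q[x])) for x in X}  # code lengths from the guessing function
```

The lengths satisfy the Kraft inequality, so a prefix-free code with these lengths exists, and each length exceeds $\log G(x)$ by at most $1+\log c$.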

5.2. Arıkan’s Guessing and Bunte–Lapidoth’s Tasks Partitioning Problem

Bracher et al. found a close connection between Arıkan’s guessing problem and Bunte–Lapidoth’s tasks partitioning problem in the context of distributed storage [29]. In this section, we establish a different relation between these problems.

 Proposition 16. 

An asymptotically optimal solution of Arıkan’s guessing problem gives rise to an asymptotically optimal solution of the tasks partitioning problem.

 Proof. 

Let {Gn*} be an asymptotically optimal sequence of guessing functions. Define

$$Q_{G_n^*}(x^n) = d_n^{-1}\, G_n^*(x^n)^{-1/\alpha},$$

where $d_n$ is the normalization constant. Let $A_{G_n^*}$ be the partition function satisfying $A_{G_n^*}(x^n) \le \left\lceil \beta_n\cdot Z_{Q_{G_n^*},\alpha}/Q_{G_n^*}(x^n)^\alpha\right\rceil$ guaranteed by (Proposition III-2 of [8]), where

$$\beta_n = \frac{2}{M_n - n\log|X| - 2}.$$

Thus, we have

$$A_{G_n^*}(x^n)^\rho \le \left\lceil \beta_n\cdot\frac{Z_{Q_{G_n^*},\alpha}}{Q_{G_n^*}(x^n)^\alpha}\right\rceil^\rho \overset{(a)}{\le} \left\lceil \beta_n\cdot(1+n\ln|X|)\cdot G_n^*(x^n)\right\rceil^\rho \overset{(b)}{\le} 1 + 2^\rho\beta_n^\rho\cdot(1+n\ln|X|)^\rho\cdot G_n^*(x^n)^\rho,$$

where (a) holds because $Z_{Q_{G_n^*},\alpha} = d_n^{-\alpha}\sum_{x^n\in X^n} G_n^*(x^n)^{-1} \le d_n^{-\alpha}(1+n\ln|X|)$; and (b) holds because $\lceil x\rceil^\rho \le 1+2^\rho x^\rho$ for $x>0$. Hence,

$$\mathbb{E}_{P^n}[A_{G_n^*}(X^n)^\rho] \le 1 + 2^\rho\beta_n^\rho\cdot(1+n\ln|X|)^\rho\cdot\mathbb{E}_{P^n}[G_n^*(X^n)^\rho] \overset{(c)}{\le} 1 + 2^{2\rho}\left[\frac{1+n\ln|X|}{M_n - n\log|X| - 2}\right]^\rho\cdot 2^{n\rho H_\alpha(P)} = 1 + 2^{n\rho\left(H_\alpha(P) + \frac{\log(1+n\ln|X|)}{n} - \frac{\log\tilde{M}_n}{n}\right)},$$

where $\tilde{M}_n$ is as in (25), and inequality (c) follows from Proposition 4 of [13], proved in Section 3. Thus, if $M_n \ge n\log|X|+3$ and if $\gamma := \lim_{n\to\infty}(\log M_n)/n$ exists with $\gamma > H_\alpha(P)$, then we have

$$\limsup_{n\to\infty}\mathbb{E}_{P^n}[A_{G_n^*}(X^n)^\rho] \le 1.$$

Since $\mathbb{E}_{P^n}[A_{G_n^*}(X^n)^\rho] \ge 1$, we have $\liminf_{n\to\infty}\mathbb{E}_{P^n}[A_{G_n^*}(X^n)^\rho] \ge 1$. When $\gamma < H_\alpha(P)$, arguing along the lines of the proof of Proposition 8(b), it can be shown that

$$\lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[A_{G_n^*}(X^n)^\rho] = H_\alpha(P) - \gamma. \qquad \square$$

The reverse implication of the above result does not always hold, due to the additional parameter $M_n$ in the tasks partitioning problem. For example, if $M_n=|X|^n$ and $A_n(x^n)=1$ for every $x^n$, the partition does not provide any information about the underlying distribution. As a consequence, we cannot conclude anything about the optimal (or asymptotically optimal) solutions of the other problems. However, if $M_n$ is such that $\log M_n$ grows sub-linearly, then the partition does help us find asymptotically optimal solutions of the other problems.

 Proposition 17. 

An asymptotically optimal sequence of partition functions $\{A_n\}$ with partition sizes $\{M_n\}$ for the tasks partitioning problem gives rise to an asymptotically optimal solution for the guessing problem if $M_n \ge n\log|X|+3$ and $\lim_{n\to\infty}(\log M_n)/n = 0$.

 Proof. 

By hypothesis,

$$\lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[A_n(X^n)^\rho] = H_\alpha(P).$$

For every An, define the probability distribution

$$Q_{A_n}(x^n) := c_n^{-1}\, A_n(x^n)^{-1},$$

where $c_n := \sum_{x^n} A_n(x^n)^{-1} = M_n$. Let $G_{A_n}^*$ be the guessing function that guesses according to the decreasing order of $Q_{A_n}$-probabilities. Then, by (Proposition 2 of [17]), we have

$$G_{A_n}^*(x^n) \le Q_{A_n}(x^n)^{-1} = c_n A_n(x^n) = M_n A_n(x^n).$$

Hence,

$$\limsup_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[G_{A_n}^*(X^n)^\rho] \le \limsup_{n\to\infty}\frac{1}{n}\log M_n + \limsup_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[A_n(X^n)^\rho] = H_\alpha(P).$$

Furthermore, an application of Theorem 1 of [13] gives us

$$\liminf_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}[G_{A_n}^*(X^n)^\rho] \ge H_\alpha(P) - \limsup_{n\to\infty}\frac{1}{n}\log(1+n\ln|X|) = H_\alpha(P).$$

This completes the proof. □
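The construction in this proof, a guessing function built from a partition function, can be sketched in one-shot form. The example below is our own illustration (the particular distribution and partition are assumptions); it checks both the identity $\sum_x 1/A(x)=M$ and the bound $G(x)\le M\,A(x)$ used above:

```python
p = [0.3, 0.25, 0.2, 0.1, 0.1, 0.05]
cells = [[0], [1, 2], [3, 4, 5]]                 # a partition of X into M = 3 keys
M = len(cells)
A = {x: len(c) for c in cells for x in c}        # partition function A(x) = |cell containing x|
c = sum(1 / A[x] for x in A)                     # equals M (Proposition III.1 of [8])
Q = {x: 1 / (c * A[x]) for x in A}               # Q proportional to 1/A
order = sorted(A, key=lambda x: -Q[x])
G = {x: i + 1 for i, x in enumerate(order)}      # guess in decreasing order of Q
```

Since guessing in decreasing order of Q guarantees $G(x)\le Q(x)^{-1}$, the bound $G(x)\le M\,A(x)$ follows directly.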

5.3. Huleihel et al.’s Memoryless Guessing and Campbell’s Coding

We already know that, if one knows the optimal solution of Huleihel et al.’s memoryless guessing problem, that is, the α-scaled measure of the underlying probability distribution P, then one knows the optimal (or asymptotically optimal) solution of Campbell’s coding problem. In this section, we prove a converse statement. We first prove the following lemma.

 Lemma 2. 

Let $L_n^*$ denote the length function corresponding to an optimal solution for Campbell’s coding problem on the alphabet set $X^n$ endowed with the product distribution $P^n$. Then, $\sum_{x^n\in X^n} 2^{-L_n^*(x^n)} \ge 1/2$.

 Proof. 

Suppose $\sum_{x^n\in X^n} 2^{-L_n^*(x^n)} < 1/2$. Then, we must have $L_n^*(x^n) \ge 2$ for every $x^n\in X^n$. Define $\hat{L}_n(x^n) := L_n^*(x^n) - 1$. We observe that $\sum_{x^n\in X^n} 2^{-\hat{L}_n(x^n)} < 1$, that is, the length function $\hat{L}_n(\cdot)$ satisfies (11). Hence, there exists a code $C_n$ for $X^n$ such that $L(C_n(x^n)) = \hat{L}_n(x^n)$. Then, for $\rho>0$, we have $\log\mathbb{E}_{P^n}[2^{\rho L_n^*(X^n)}] > \log\mathbb{E}_{P^n}[2^{\rho \hat{L}_n(X^n)}]$, a contradiction. □

 Proposition 18. 

An asymptotically optimal solution for Huleihel et al.’s memoryless guessing problem exists if an asymptotically optimal solution exists for Campbell’s coding problem.

 Proof. 

Let {Ln*,n1} denote a sequence of asymptotically optimal length functions of Campbell’s coding problem, that is,

$$\lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}\left[2^{\rho L_n^*(X^n)}\right] = H_\alpha(P). \quad (32)$$

Let us define

$$Q_{L_n^*}(x^n) := \frac{2^{-L_n^*(x^n)/\alpha}}{\sum_{\bar{x}^n\in X^n} 2^{-L_n^*(\bar{x}^n)/\alpha}}.$$

Then, we have

$$\begin{aligned}
I_\alpha(P^n,Q_{L_n^*}) &= \frac{\alpha}{1-\alpha}\log\sum_{x^n\in X^n} P^n(x^n)\left(\frac{Q_{L_n^*}(x^n)^\alpha}{\sum_{\hat{x}^n\in X^n} Q_{L_n^*}(\hat{x}^n)^\alpha}\right)^{\frac{\alpha-1}{\alpha}} - H_\alpha(P^n)\\
&= \frac{1}{\rho}\log\sum_{x^n\in X^n} P^n(x^n)\left(\frac{\sum_{\hat{x}^n\in X^n} 2^{-L_n^*(\hat{x}^n)}}{2^{-L_n^*(x^n)}}\right)^{\rho} - nH_\alpha(P)\\
&= \frac{1}{\rho}\log\mathbb{E}_{P^n}\left[2^{\rho L_n^*(X^n)}\right] + \log\zeta_n - nH_\alpha(P),
\end{aligned}$$

where $\zeta_n = \sum_{\hat{x}^n\in X^n} 2^{-L_n^*(\hat{x}^n)}$. Hence,

$$\lim_{n\to\infty}\frac{1}{n}I_\alpha(P^n,Q_{L_n^*}) = \lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}\left[2^{\rho L_n^*(X^n)}\right] + \lim_{n\to\infty}\frac{1}{n}\log\zeta_n - H_\alpha(P) \overset{(a)}{=} 0,$$

where (a) holds because $\zeta_n\in[1/2,1]$ (see Lemma 2). If we assume the underlying probability distribution to be $Q_{L_n^*}$ instead of $P^n$, and perform memoryless guessing according to the escort distribution of $Q_{L_n^*}$, namely $\hat{Q}_n^*(x^n) = Q_{L_n^*}(x^n)^\alpha / Z_{Q_{L_n^*},\alpha}$, then, by Proposition 7, we have

$$\lim_{n\to\infty}\frac{1}{n\rho}\log\mathbb{E}_{P^n}\left[V_{\hat{Q}_n^*,\rho}(X^n)\right] = H_\alpha(P) + \lim_{n\to\infty}\frac{1}{n}I_\alpha(P^n,Q_{L_n^*}) = H_\alpha(P). \qquad \square$$
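Memoryless guessing with an escort distribution can be simulated directly. The sketch below is our own illustration (the distribution, sample sizes, and the helper `num_guesses` are assumptions): guesses are drawn i.i.d. from the escort of P, so the number of guesses for task x is geometric with mean $1/\hat{Q}(x)$, and a Monte Carlo estimate of the expected number of guesses should match $\sum_x P(x)/\hat{Q}(x)$.

```python
import math
import random

random.seed(1)
p = [0.5, 0.3, 0.2]                     # true distribution P
rho = 1.0
a = 1 / (1 + rho)                       # alpha = 1/(1+rho)
Z = sum(px ** a for px in p)
q = [px ** a / Z for px in p]           # escort (alpha-scaled) guessing distribution

def num_guesses(x):
    # memoryless guessing: draw i.i.d. guesses from q until x appears
    n = 1
    while random.choices(range(len(p)), weights=q)[0] != x:
        n += 1
    return n

trials = 3000
est = sum(p[x] * sum(num_guesses(x) for _ in range(trials)) / trials
          for x in range(len(p)))
exact = sum(p[x] / q[x] for x in range(len(p)))
```

The geometric mean $1/\hat{Q}(x)$ is what drives the $H_\alpha(P)$ growth rate in the i.i.d. setting.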

6. Summary and Conclusions

This paper was motivated by the need for a unified framework for the problems of source coding, guessing and tasks partitioning. To that end, we formulated a general moment minimization problem in the IID-lossless case and observed that the optimal value of its objective function is bounded below by the Rényi entropy. We then re-established all achievable lower bounds in each of the above-mentioned problems using this generalized framework. It was interesting to note that the optimal solution did not depend on the moment function ψ, but only on the underlying probability distribution P and the order of the moment ρ (see Proposition 1). We also presented a unified framework for the mismatched version of the above-mentioned problems. This framework not only led to refinements of known theorems, but also helped us identify a few new results. We went on to extend the tasks partitioning problem by asking a more practical question and solved it using the unified theory. Finally, we established an equivalence among these problems, in the sense that an asymptotically optimal solution of one problem yields an asymptotically optimal solution of all the others. Although the relationship between source coding and guessing is well known in the literature [30,31,32,33], their connection to the tasks partitioning problem, and the connection in the mismatched version of the problems, are new.

Our unified framework also has the potential to act as a general tool-set and provide insights for similar problems in information theory. For example, in Section 4, it enabled us to solve a more general tasks partitioning problem, namely the ordered tasks partitioning problem. The presented unified approach can also be extended and explored further in several ways. These include (a) Extension to general alphabet sets: The guessing problem was originally studied for a countably infinite alphabet set by Massey [12]. Courtade and Verdú studied the source coding problem for a countably infinite alphabet set with a cumulant generating function of codeword lengths as the design criterion [31]. It would be interesting to see if the memoryless guessing and tasks partitioning problems can also be formulated for countably infinite alphabet sets, and whether the relationships among the problems extend. (b) More general sources: The relationship between source coding and guessing is well known in the literature. The relationship between guessing and source coding in the ‘with distortion’ case was established for a finite alphabet by Merhav and Arıkan [14] and for a countably infinite alphabet by Hanawal and Sundaresan [34]. The relationship between guessing and Campbell’s coding in the universal case was established by Sundaresan [35]. It would be interesting to see if these can be extended to memoryless guessing and tasks partitioning as well, possibly in a unified manner. (c) Applications: Arıkan showed an application of the guessing problem in sequential decoding [13]. Humblet showed that the cumulant of code lengths arises in minimizing the probability of buffer overflow in source coding [27]. Rezaee et al. [36], Salamatian et al. [20], and Sundaresan [28] show applications of guessing in the security analysis of password-protected systems. Our unified framework thus has the potential to help solve real-life problems that fall within its scope.

Author Contributions

Conceptualization: M.A.K., A.S., A.T., A.K. and G.D.M.; Formal analysis: M.A.K. and A.S.; Investigation: M.A.K., A.S., A.T. and G.D.M.; Writing – original draft: M.A.K., A.S., A.T., A.K. and G.D.M. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Funding Statement

This research received no external funding.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  • 1.Shannon C.E. A Mathematical Theory of Communication. Bell Syst. Tech. J. 1948;27:379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x. [DOI] [Google Scholar]
  • 2.Rényi A. On measures of entropy and information; Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics; Berkeley, CA, USA. 20 June–30 July 1960; pp. 547–561. [Google Scholar]
  • 3.Aczél J., Daróczy Z. Mathematics in Science and Engineering. Volume 115 Academic Press; Cambridge, MA, USA: 1975. On Measures of Information and Their Characterizations. [Google Scholar]
  • 4.Campbell L.L. A coding theorem and Rényi’s entropy. Inf. Control. 1965;8:423–429. doi: 10.1016/S0019-9958(65)90332-3. [DOI] [Google Scholar]
  • 5.Sundaresan R. Guessing Under Source Uncertainty. IEEE Trans. Inf. Theory. 2007;53:269–287. doi: 10.1109/TIT.2006.887466. [DOI] [Google Scholar]
  • 6.Blumer A.C., McEliece R.J. The Rényi redundancy of generalized Huffman codes. IEEE Trans. Inf. Theory. 1988;34:1242–1249. doi: 10.1109/18.21251. [DOI] [Google Scholar]
  • 7.Sundaresan R. A measure of discrimination and its geometric properties; Proceedings of the IEEE International Symposium on Information Theory; Lausanne, Switzerland. 30 June–5 July 2002; p. 264. [Google Scholar]
  • 8.Bunte C., Lapidoth A. Encoding Tasks and Rényi Entropy. IEEE Trans. Inf. Theory. 2014;60:5065–5076. doi: 10.1109/TIT.2014.2329490. [DOI] [Google Scholar]
  • 9.Kumar M.A., Sundaresan R. Minimization problems based on relative α-entropy I: Forward Projection. IEEE Trans. Inf. Theory. 2015;61:5063–5080. doi: 10.1109/TIT.2015.2449311. [DOI] [Google Scholar]
  • 10.Lutwak E., Yang D., Zhang G. Cramér-Rao and moment-entropy inequalities for Rényi entropy and generalized Fisher information. IEEE Trans. Inf. Theory. 2005;51:473–478. doi: 10.1109/TIT.2004.840871. [DOI] [Google Scholar]
  • 11.Ashok Kumar M., Sundaresan R. Minimization Problems Based on Relative α-Entropy II: Reverse Projection. IEEE Trans. Inf. Theory. 2015;61:5081–5095. doi: 10.1109/TIT.2015.2449312. [DOI] [Google Scholar]
  • 12.Massey J.L. Guessing and entropy; Proceedings of the 1994 IEEE International Symposium on Information Theory; Trondheim, Norway. 27 June–1 July 1994; p. 204. [Google Scholar]
  • 13.Arikan E. An inequality on guessing and its application to sequential decoding. IEEE Trans. Inf. Theory. 1996;42:99–105. doi: 10.1109/18.481781. [DOI] [Google Scholar]
  • 14.Arikan E., Merhav N. Guessing subject to distortion. IEEE Trans. Inf. Theory. 1998;44:1041–1056. doi: 10.1109/18.669158. [DOI] [Google Scholar]
  • 15.Pfister C., Sullivan W. Renyi entropy, guesswork moments, and large deviations. IEEE Trans. Inf. Theory. 2004;50:2794–2800. doi: 10.1109/TIT.2004.836665. [DOI] [Google Scholar]
  • 16.Malone D., Sullivan W. Guesswork and entropy. IEEE Trans. Inf. Theory. 2004;50:525–526. doi: 10.1109/TIT.2004.824921. [DOI] [Google Scholar]
  • 17.Hanawal M.K., Sundaresan R. Guessing Revisited: A Large Deviations Approach. IEEE Trans. Inf. Theory. 2011;57:70–78. doi: 10.1109/TIT.2010.2090221. [DOI] [Google Scholar]
  • 18.Christiansen M.M., Duffy K.R. Guesswork, Large Deviations, and Shannon Entropy. IEEE Trans. Inf. Theory. 2013;59:796–802. doi: 10.1109/TIT.2012.2219036. [DOI] [Google Scholar]
  • 19.Huleihel W., Salamatian S., Médard M. Guessing with limited memory; Proceedings of the 2017 IEEE International Symposium on Information Theory (ISIT); Aachen, Germany. 25–30 June 2017; pp. 2253–2257. [Google Scholar]
  • 20.Salamatian S., Huleihel W., Beirami A., Cohen A., Médard M. Why Botnets Work: Distributed Brute-Force Attacks Need No Synchronization. IEEE Trans. Inf. Forensics Secur. 2019;14:2288–2299. doi: 10.1109/TIFS.2019.2895955. [DOI] [Google Scholar]
  • 21.Arikan E., Merhav N. Joint source-channel coding and guessing with application to sequential decoding. IEEE Trans. Inf. Theory. 1998;44:1756–1769. doi: 10.1109/18.705557. [DOI] [Google Scholar]
  • 22.Csiszár I., Shields P. Information Theory and Statistics: A Tutorial, Foundations and Trends in Communications and Information Theory. Now Publishers; Delft, The Netherlands: 2004. [Google Scholar]
  • 23.Tsallis C., Mendes R.S., Plastino A.R. The role of constraints within generalized non-extensive statistics. Phys. A. 1998;261:534–554. doi: 10.1016/S0378-4371(98)00437-3. [DOI] [Google Scholar]
  • 24.Dupuis P., Ellis R.S. A Weak Convergence Approach to the Theory of Large Deviations. John Wiley & Sons; Hoboken, NJ, USA: 1997. [Google Scholar]
  • 25.Shayevitz O. On Rényi measures and hypothesis testing; Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings; St. Petersburg, Russia. 31 July–5 August 2011; pp. 894–898. [Google Scholar]
  • 26.Cover T.M., Thomas J.A. Elements of Information Theory. 2nd ed. John Wiley & Sons; Hoboken, NJ, USA: 2012. [Google Scholar]
  • 27.Humblet P. Generalization of Huffman coding to minimize the probability of buffer overflow. IEEE Trans. Inf. Theory. 1981;27:230–232. doi: 10.1109/TIT.1981.1056322. [DOI] [Google Scholar]
  • 28.Hanawal M.K., Sundaresan R. Randomised Attacks on Passwords. DRDO-IISc Programme on Advanced Research in Mathematical Engineering. 2010. [(accessed on 14 August 2022)]. Available online: https://ece.iisc.ac.in/rajeshs/reprints/TR-PME-2010-11.pdf.
  • 29.Bracher A., Hof E., Lapidoth A. Guessing Attacks on Distributed-Storage Systems. IEEE Trans. Inf. Theory. 2019;65:6975–6998. doi: 10.1109/TIT.2019.2933000. [DOI] [Google Scholar]
  • 30.Salamatian S., Liu L., Beirami A., Médard M. Mismatched guesswork and one-to-one codes; Proceedings of the 2019 IEEE Information Theory Workshop (ITW); Gotland, Sweden. 25–28 August 2019; pp. 1–5. [Google Scholar]
  • 31.Courtade T.A., Verdú S. Cumulant generating function of codeword lengths in optimal lossless compression; Proceedings of the 2014 IEEE International Symposium on Information Theory; Honolulu, HI, USA. 29 June –4 July 2014; pp. 2494–2498. [DOI] [Google Scholar]
  • 32.Kosut O., Sankar L. Asymptotics and non-asymptotics for universal fixed-to-variable source coding. IEEE Trans. Inf. Theory. 2017;63:3757–3772. doi: 10.1109/TIT.2017.2686881. [DOI] [Google Scholar]
  • 33.Beirami A., Calderbank R., Christiansen M.M., Duffy K.R., Médard M. A characterization of guesswork on swiftly tilting curves. IEEE Trans. Inf. Theory. 2018;60:2850–2871. [Google Scholar]
  • 34.Hanawal M.K., Sundaresan R. Guessing and Compression Subject to Distortion. IndraStra Global; Sheridan, WY, USA: 2010. [Google Scholar]
  • 35.Sundaresan R. Guessing Based On Length Functions; Proceedings of the 2007 IEEE International Symposium on Information Theory; Cambridge, MA, USA. 1–6 July 2007; pp. 716–719. [DOI] [Google Scholar]
  • 36.Rezaee A., Beirami A., Makhdoumi A., Médard M., Duffy K. Guesswork subject to a total entropy budget; Proceedings of the 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton); Monticello, IL, USA. 3–6 October 2017; pp. 1008–1015. [DOI] [Google Scholar]
