Abstract
The error probability of block codes sent under a non-uniform input distribution over the memoryless binary symmetric channel (BSC) and decoded via the maximum a posteriori (MAP) decoding rule is investigated. It is proved that the ratio of the probability of MAP decoder ties to the probability of error when no MAP decoding ties occur grows at most linearly in blocklength, thus showing that decoder ties do not affect the code’s error exponent. This result generalizes a similar recent result shown for the case of block codes transmitted over the BSC under a uniform input distribution.
Keywords: binary symmetric channel, block codes, non-uniformly distributed channel inputs, joint source-channel coding, maximum a posteriori (MAP) decoding, decoder ties, error probability, error exponent
1. Introduction
Consider the classical channel coding context, where we send a block code through the memoryless binary symmetric channel (BSC) with crossover probability . Given a sequence of binary codes with n being the blocklength, we denote the sequence of corresponding minimal probabilities of decoding error under maximum a posteriori (MAP) decoding by . The following result was recently shown in [1] when the channel input selects codewords from according to a uniform distribution.
Theorem 1
([1]). For any sequence of codes of blocklength n and size with , sent over the BSC with crossover probability under a uniform channel input distribution over , its minimum probability of decoding error satisfies
(1) where
(2) where is the joint input–output distribution that is sent over the BSC (via n uses) and is received.
Noting that in (2) is the probability that a decoding error occurs without inducing decoder ties (which occur when two or more codewords in are identified by the decoder as the estimated transmitted codeword, i.e., when more than one codeword in maximizes for a given received word ), the above result in (1) directly implies that decoder ties do not affect the error exponent of . The error exponent or reliability function of a block coding communication system represents the largest rate of exponential decay of the system’s probability of decoding error as the coding blocklength grows to infinity (e.g., see [2,3,4,5,6,7,8,9,10,11,12,13,14]).
It is known that uniformly distributed data achieves the largest entropy rate and leaves no room for data compression. Thus, ideally compressed data should exhibit a uniform distribution for all blocklengths n. However, this setting is often impractical due to the sub-optimality of the implemented data compression schemes. Instead, we generally have non-uniformly distributed data after compression in the form of residual redundancy, as in speech or image coding (e.g., [15,16]). Furthermore, one may have a compressed source that can be divided into several groups, within each of which the symbols are equally probable. Decoder ties can thus occur with respect to two (or more) codewords corresponding to symbols within the same group.
In this paper, we consider a non-uniform prior distribution over and prove that decoder ties, under optimal MAP decoding, still have linear and hence sub-exponential impact on the error probability , thus extending Theorem 1 established for the case of a uniform prior distribution over . Since our problem falls within the general framework of joint source-channel coding for point-to-point communication systems, we refer the reader to [14,15,16,17,18,19,20,21] (Section 4.6) and the references therein for theoretical studies on this subject as well as practical designs that outperform separate source and channel coding under complexity or delay constraints.
The proof technique used in [1] to show (1) above is based on the observation that there are two types of decoding errors. One is that the received tuple at the channel output induces no decoder ties but the corresponding decoder decision is wrong. The other is that the received tuple at the channel output causes a decoder tie, but the decoder picks the wrong codeword. As a result, the MAP error probability can be upper bounded by the sum of two terms, and , where is the probability of the first type of decoding errors as given in (2), and is the probability of decoder ties regardless of whether the tie breaker misses the correct codeword or not. Under the assumption that the channel input is uniformly distributed over block code for each blocklength n and an arbitrary sequence of codes , it was shown in [1] that flipping a properly selected bit component of a channel output that causes a decoder tie produces a unique channel output that leads to the first type of decoding errors. An analysis of this bit-flipping manipulation shows that the ratio grows at most linearly in n and hence yields the upper bound in (1). However, this flipping technique no longer works when non-uniform channel inputs are considered. To tackle this problem, we judiciously separate the channel output tuples that induce decoder ties into two groups: one group consisting of output tuples that do not fulfill the above flipping manipulation property, and the other composed of the remaining output tuples (i.e., the complement group). We then show that the probability of the former group is upper bounded by that of the latter group, and therefore the ratio still grows at most linearly in blocklength n under arbitrary channel input statistics. Note that the group that fails the flipping property is empty when the channel input is uniformly distributed over , thereby making the result of Theorem 1 a special case of the extended result in this paper.
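To make the bit-flipping manipulation of [1] concrete, the following sketch uses an illustrative codebook of our own (the paper's symbols were lost in extraction, so all names and values here are ours). Under a uniform prior, MAP decoding over the BSC reduces to minimum-distance decoding, so a decoder tie means two codewords are equidistant from the received word; flipping a bit where the output agrees with one tied codeword but disagrees with the other strictly breaks the tie.

```python
from itertools import product

def hamming(a, b):
    """Hamming distance between two binary tuples."""
    return sum(x != y for x, y in zip(a, b))

# Under a uniform prior, MAP decoding over the BSC reduces to minimum-distance
# decoding, so a decoder tie means two codewords are equidistant from y.
codebook = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)]  # illustrative code

ties_found = 0
for y in product((0, 1), repeat=4):
    dists = [hamming(c, y) for c in codebook]
    tied = [m for m in range(len(codebook)) if dists[m] == min(dists)]
    if len(tied) < 2:
        continue
    ties_found += 1
    i, j = tied[0], tied[1]
    # Flip a bit where y agrees with codeword i but disagrees with codeword j;
    # such a position always exists when two distinct codewords are tied.
    k = next(t for t in range(4)
             if y[t] == codebook[i][t] and y[t] != codebook[j][t])
    y_flipped = tuple(b ^ (1 if t == k else 0) for t, b in enumerate(y))
    # The flip moves y strictly closer to codeword j and away from codeword i,
    # turning the tie into a tie-free decision against codeword i.
    assert hamming(codebook[j], y_flipped) < hamming(codebook[i], y_flipped)

assert ties_found > 0
```

With a non-uniform prior the decoder compares weighted scores rather than distances, and such a distance-based flip need not break the tie in the right direction, which is exactly the obstacle discussed above.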
The rest of the paper is organized as follows. Section 2 presents the main result and highlights the key steps of the proof to facilitate its understanding. The proof is then provided in full detail, along with illustrative examples, in Section 3 and Appendix A and Appendix B. Finally, conclusions and future directions are given in Section 4.
Throughout the paper, we denote for positive integer M and set to be the Hamming distance between n-tuples and with the indices of the tuples restricted to . By convention, we set when and use to represent .
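As a small illustration of the restricted Hamming distance notation just described (the function and variable names below are ours, since the paper's symbols were lost in extraction):

```python
def hamming_restricted(x, y, J):
    """Hamming distance between n-tuples x and y with indices restricted to J.
    By convention, the distance is 0 when J is empty."""
    return sum(1 for i in J if x[i] != y[i])

x, y = (0, 1, 1, 1), (0, 0, 1, 0)
assert hamming_restricted(x, y, range(len(x))) == 2  # unrestricted distance
assert hamming_restricted(x, y, {1}) == 1            # restricted to index 1
assert hamming_restricted(x, y, set()) == 0          # empty-index-set convention
```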
2. Main Result
Consider a binary code with fixed blocklength n and size M to be used over the memoryless BSC with crossover probability . Denote the prior probability on by and hence . Without loss of generality, we assume that all codewords in occur with positive probability, i.e., for all ; hence, is the support of .
It is known that the minimal probability of decoding error is achieved by the MAP decoder, which, upon the reception of the channel output , estimates the codeword according to
| (3) |
where is the posterior conditional distribution of given . We can see from (3) that if more than one achieves the maximum value of for a given , a decoder tie occurs, in which case the set of these , denoted conveniently as , contains more than one element. As a result, an erroneous MAP decision is made if one of the following two situations occurs: (i) the transmitted codeword does not belong to ; (ii) the transmitted codeword belongs to and , but the tie breaker picks the wrong one from . By conveniently denoting
| (4) |
the probability of the first situation acts as a lower bound for (i.e., ), where is given in (2) and can be written as
| (5) |
It is shown in [22] that exactly equals the generalized Poor–Verdú (lower) bound [23,24] as its tilting parameter approaches infinity. The probability of the second situation is bounded above by the probability that the transmitted codeword belongs to and , disregarding whether the tie breaker picks the wrong codeword or not, and this upper bound can be expressed as
| (6) |
We thus have
| (7) |
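The sandwich relation in (7) can be checked by brute force on a toy code. In the sketch below, the codebook, prior, and crossover value are illustrative choices of ours (not from the paper); the code computes the tie-free lower bound, the MAP error probability under a fixed tie-breaker, and the tie probability, and verifies the two inequalities:

```python
from itertools import product

def bsc_likelihood(x, y, p):
    """P(y | x) over n independent uses of a BSC with crossover probability p."""
    d = sum(a != b for a, b in zip(x, y))
    return p**d * (1 - p)**(len(x) - d)

def sandwich(codebook, prior, p):
    """Return (tie-free error prob., MAP error prob., tie prob.) by enumeration."""
    n = len(codebook[0])
    pe = pe_tilde = ptie = 0.0
    for y in product((0, 1), repeat=n):
        scores = [prior[m] * bsc_likelihood(codebook[m], y, p)
                  for m in range(len(codebook))]
        best = max(scores)
        M = [m for m, s in enumerate(scores) if abs(s - best) <= 1e-12 * best]
        decoded = M[0]  # deterministic tie-breaker: lowest index wins
        for m in range(len(codebook)):
            joint = scores[m]          # prior[m] * P(y | codeword m)
            if m != decoded:
                pe += joint            # MAP decision error
            if m not in M:
                pe_tilde += joint      # error regardless of tie-breaking
            elif len(M) > 1:
                ptie += joint          # transmitted word involved in a tie
    return pe_tilde, pe, ptie

# Two equally likely codewords (a "group") plus a more likely one, so ties occur:
codebook = [(0, 0, 0, 0), (1, 1, 1, 0), (0, 1, 1, 1)]
prior = [0.5, 0.25, 0.25]
lo, pe, tie = sandwich(codebook, prior, p=0.2)
assert lo <= pe + 1e-12 and pe <= lo + tie + 1e-12 and tie > 0
```

The equal priors on the last two codewords mirror the "equally probable symbols within a group" scenario from the introduction, which is what makes decoder ties possible under a non-uniform overall prior.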
By proving the inequality
| (8) |
where
| (9) |
we have our main result as follows.
Theorem 2.
For any sequence of binary codes and prior probabilities used over the BSC, we have
(10)
Remark 1.
Theorem 2 implies that the relative deviation of from is at most linear in the blocklength n, so the impact of decoder ties in (6) on is only sub-exponential. Consequently, and must have the same error exponent. Note also that the upper bound in (10) differs from the result in Theorem 1 by an additional multiplicative factor of 2 in the term. As explained in the introduction, this is a consequence of the fact that the probability of the group of channel output tuples that cause decoder ties but fail the flipping manipulation property is upper bounded by that of the remaining tie-inducing channel outputs. The full technical details are provided in Section 3.2. Finally, we emphasize that Theorem 2 holds for arbitrary binary codes, including “bad” codes whose high-probability codewords have small Hamming distance between them. Hence, tightening the upper bound in (10) by restricting the analysis to “sufficiently good” codes, in the sense that their most likely codewords sit “sufficiently” far apart in , is an interesting future direction.
List of Main Symbols: Before providing an overview of the main steps of the proof of Theorem 2 (which is presented in full detail in the next section), we describe in Table 1 the main symbols used in the paper and indicate the equation where they are first introduced. We emphasize that sets , and are defined differently from their counterparts in [1] that use the same notation.
Table 1.
Summary of the main symbols used in this paper.
| Symbol | Description | Defined in |
|---|---|---|
| A shorthand for | ||
| The code with being the all-zero codeword | ||
| The Hamming distance between the portions of and with indices in | ||
| All terms below are functions of (this dependence is not explicitly shown to simplify notation) | ||
| The set of channel outputs inducing a decoder tie when is sent | (12) | |
| The set of channel outputs leading to a tie-free decoder decision error when is sent | (15) | |
| The set for | (21) | |
| The set of indices for which the components of and differ | ||
| The size of , i.e., | ||
| The subset of consisting of channel outputs such that j is the minimal number r in satisfying | (22a) | |
| The subset of consisting of channel outputs that satisfy and that are not included in for | (22b) | |
| The subset of consisting of channel outputs such that j is the minimal number in | (23) | |
| The subset of defined according to whether each index in is in each of , …, , , …, | (43) | |
| The union of , , …, | (48) | |
| The size of , i.e., | |
| The mapping from to used for partitioning into subsets | (49) | |
| The kth partition of for , 1, …, | (52a) | |
| The kth subset of for , 1, …, | (52b) | |
| The set of representative elements in for partitioning | ||
| The subset of associated with | (55a) | |
| The subset of associated with | (55b) | |
We also visually illustrate in Figure 1 some of the main sets defined in Table 1 under the setting of Example 1, which is presented in Section 3 below for a non-uniformly distributed binary code with codewords and blocklength given by . More specifically, we only show the non-empty component subsets in corresponding to codewords and ; refer to Table A2 in Appendix A for a detailed listing of all component subsets in (including empty ones).
Figure 1.
An illustration, based on the setting in Example 1 for a non-uniformly distributed binary code (with given by of the non-empty component subsets of defined in Table 1 and corresponding to codewords (left figure) and (right figure).
Overview of the Proof: Given that codeword is sent over the channel, , let denote the set of output tuples that result in MAP decoding ties:
| (11) |
| (12) |
where (12) holds because . Then, in (6) can be rewritten as
| (13) |
Similarly, let denote the set of output tuples which guarantee a tie-free MAP decoding error when is transmitted over the channel:
| (14) |
| (15) |
Hence, in (5) can be rewritten as:
| (16) |
Note that if , then (7) is tight and (10) holds trivially; so, without loss of generality, we assume in the proof that , which implies that there exists at least one non-empty for . Then, according to (13) and (16), we have that
| (17) |
We can upper-bound (17) by
| (18) |
where for convenience we will refer to an inequality of the form given in (18) as the ratio-sum inequality. As a result, Theorem 2 holds if we can substantiate that is an upper bound for (18). To this end, we will find a proper partition of and an equal number of disjoint subsets of , of which the individual probabilities can be evaluated. For ease of notation, we denote the individual probabilities corresponding to the K-partition of and K disjoint subsets of by and , respectively. Then, we obtain that
| (19) |
By showing that each individual ratio , is bounded above by , the ratio-sum inequality can again be applied to complete the proof.
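The ratio-sum (mediant) inequality invoked here is the elementary fact that if every individual ratio satisfies a_k / b_k ≤ c with b_k > 0, then (a_1 + … + a_K) / (b_1 + … + b_K) ≤ c. A quick randomized check with exact rational arithmetic (all values are synthetic):

```python
import random
from fractions import Fraction

random.seed(0)
# Ratio-sum (mediant) inequality: if a_k / b_k <= c for all k with b_k > 0,
# then (a_1 + ... + a_K) / (b_1 + ... + b_K) <= c.
for _ in range(1000):
    K = random.randint(1, 8)
    c = Fraction(random.randint(1, 20))
    b = [Fraction(random.randint(1, 50)) for _ in range(K)]
    a = [bk * c * Fraction(random.randint(0, 100), 100) for bk in b]  # a_k <= c * b_k
    assert all(ak <= c * bk for ak, bk in zip(a, b))
    # The aggregate ratio never exceeds the largest individual ratio:
    assert sum(a) / sum(b) <= max(ak / bk for ak, bk in zip(a, b)) <= c
```

This is exactly the step that lets the proof pass from bounds on each individual ratio to a bound on the overall ratio in (19).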
3. Proof of Theorem 2
In [1], where a uniformly distributed prior probability over is assumed, one can flip a properly selected bit in the output to convert it to a corresponding element in . In light of this connection, one can evaluate the ratio . This approach, however, no longer works when a non-uniformly distributed prior probability is considered. Therefore, we have to devise a more judicious approach to extend the result in [1] for a general prior probability.
3.1. A Partition of Non-Empty and Corresponding Disjoint Subsets of
In this section, instead of finding a disjoint covering of the set of decoder ties as in [1], we establish a proper partition of from Definitions 1 and 2. This is one of the key differences from the techniques used in [1]. Example 1 is given after Proposition 1 to illustrate Definitions 1 and 2.
Given defined in (12), there exists at least one such that
| (20) |
We collect the indices m that satisfy (20) in as follows:
| (21) |
Remark 2.
First, we note that is not empty as long as . Also, for any , we can infer from (21) that if and only if .
In Definitions 1 and 2 that follow, we will assign each to a subset indexed by . These subsets will form a partition of as stated in Proposition 1.
Definition 1.
For , denoting by the set of indices where the bit components of and differ, we define
Since there may exist satisfying for all , the collection of all elements in may not exhaust the elements in (see Example 1). We thus go on to collect the remaining elements in as follows.
Definition 2.
Define for ,
(23)
With the sets defined in Definitions 1 and 2, a partition of and disjoint subsets of are constructed as proven in the following proposition.
Proposition 1.
For non-empty , the following two properties hold.
- (i)
The collection forms a (disjoint) partition of .
- (ii)
is a collection of disjoint subsets of .
Before proving Proposition 1, we provide the following example to illustrate the above sets.
Example 1.
This example illustrates the necessity of introducing as a companion to . Suppose . Let , and . Then, satisfies
(24) where the probabilities are written in the form
(25) Note that the first equality in (24) indicates and the last two equalities and the right-most inequality jointly imply . In light of Proposition 1, this 0111 must lie in one and only one of , as shown in Table A1 and Table A2 of Appendix A. Since there exists no integer h in fulfilling , this 0111 belongs to with . Recall that in [1], an element in can be obtained if flipping a zero of can make it further away from but closer to . However, for in this example, if we flip the only zero to one, it gets further away from both and for any . Therefore, the bit-flipping manipulation fails.
With , we also have
(26) where the first equality indicates and the remaining parts in (26) jointly imply that . Proposition 1 then states that this 0111 lies in one and only one of . Since and , we have according to (22a). Thus, we can flip a bit in 0111 to get further away from and closer to simultaneously. More specifically, the bit-flipping manipulation produces either 0110 or 0011, which lies in as is in . Therefore, we can associate the element in with an element in via a single flipping operation. For completeness, a full list of the sets , , , and for and , is given in Appendix A.
Proof of Proposition 1.
First, we note that by the definitions in (22a) and (23), are disjoint and so is . Additionally, (23) implies for arbitrary . Furthermore, according to Definitions 1 and 2, for any , we have either or for some . Consequently, forms a partition of .
On the other hand, the inequality in (22b) prevents multiple inclusions of an element from the previous collections. Therefore, are a collection of disjoint subsets of . □
Remark 3.
When channel inputs are uniformly distributed as considered in [1], it follows that
(27) and for every . Therefore, (22a) is reduced to
(28) and
(29) We then have the following two remarks. First, we note that the newly defined via (22a) and reduced to (28) in the regime considered in [1] is more restrictive than the introduced in [1] (Equation (16a)). As a consequence, forms a partition of in this paper while those introduced in [1] (Equation (16a)) are a disjoint covering of under uniform channel inputs. Second, (29) shows that [1] does not need to consider a companion to , but this paper does.
Based on Proposition 1, we continue the derivation from (17) and obtain:
| (30) |
| (31) |
where (31) holds because and are disjoint, and the same applies to . An additional upper bound for (31) requires the verification of the inequality:
| (32) |
which is an immediate consequence of the proposition to be proven in the next section (Proposition 2), stating that
| (33) |
3.2. Verification of (32)
Recall that the main technique used in [1] is to associate every element in with a corresponding element in via the bit-flipping manipulation. By this bit-flipping association, the probability ratio of the elements and corresponding elements respectively in and can be evaluated. However, as Example 1 indicates, for an element in , the bit-flipping association no longer works. This reveals the challenge of generalizing the results in [1] from uniform channel inputs to arbitrarily distributed channel inputs. A solution is to subdivide the elements in into two groups and , where the bit-flipping association to works for the former group but not for the latter. The inequality in (32) can then be used to exclude the latter group with an upper bound:
| (34) |
| (35) |
Since uniform channel inputs as considered in [1] guarantee (29), it can be seen from (35) that the multiplicative factor of 2 can be reduced to 1, as observed in Remark 1. For arbitrary channel inputs, the factor of 2 remains since the set may not be empty. The validity of (32) is confirmed by the next proposition.
Proposition 2.
Suppose . Then, for every , we have
(36)
Proof.
Suppose . Then, for every . We therefore have:
(37) We can rewrite (37) as
(38) implying and . Noting that because and , we conclude that the smallest integer satisfying exists, and therefore . □
Remark 4.
Two observations can be made based on Proposition 2. First, Proposition 2 indicates that every must appear at least once in the sum , contributing the same probability mass as . Second, Proposition 2 also implies that every cannot be contained in . This observation can be substantiated as follows. For every , Proposition 2 implies for some and hence Definition 2 immediately gives for all . For , we have and therefore for all as pointed out in Remark 2. As a result, every appears exactly once in the sum . Combining the two observations leads to:
(39)
To flesh out the above inequality, we give the next example.
Example 2.
Proceeding from Example 1, we observe from Table A1 and Table A2 in Appendix A that 0111 is contained in , and . Hence, it appears once in the sum while it contributes twice in the sum . We then confirm from (A35) that:
(40)
We continue the derivation from (35) and obtain
| (41) |
| (42) |
where we add the restriction in (41) to exclude division of zero by zero in (42), and (42) follows from the ratio-sum inequality in (18).
In the next section, we introduce a number of delicate decompositions of non-empty and an equal number of disjoint subsets of to facilitate the bit-flipping association of the pairs.
3.3. Atomic Decomposition of Non-Empty and the Corresponding Disjoint Subsets of
To simplify the exposition, we assume without loss of generality that is the all-zero codeword. (It is known that we can simultaneously flip the same position of all codewords to yield a new code of equal performance over the BSC. Thus, via a number of flipping manipulations, we can transform any code into a code of equal performance whose first codeword is all-zero.) Below we present the proof for . The proof for general follows analogously.
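The equal-performance claim in the parenthetical above, that XOR-ing every codeword with the first one leaves the MAP error probability over the BSC unchanged (the map y ↦ y ⊕ c₁ is a distance-preserving bijection on outputs), can be verified exhaustively on a toy code; all concrete values below are illustrative choices of ours:

```python
from itertools import product

def bsc_likelihood(x, y, p):
    """P(y | x) over n independent uses of a BSC with crossover probability p."""
    d = sum(a != b for a, b in zip(x, y))
    return p**d * (1 - p)**(len(x) - d)

def error_prob(codebook, prior, p):
    """MAP error probability by exhaustive enumeration (lowest-index tie-breaking)."""
    n = len(codebook[0])
    pe = 0.0
    for y in product((0, 1), repeat=n):
        scores = [prior[m] * bsc_likelihood(codebook[m], y, p)
                  for m in range(len(codebook))]
        decoded = max(range(len(codebook)), key=lambda m: scores[m])
        pe += sum(scores[m] for m in range(len(codebook)) if m != decoded)
    return pe

def xor(a, b):
    return tuple(u ^ v for u, v in zip(a, b))

codebook = [(1, 0, 1, 0), (1, 1, 0, 0), (0, 0, 1, 1)]   # illustrative code
prior = [0.5, 0.3, 0.2]
shifted = [xor(c, codebook[0]) for c in codebook]        # first codeword -> all-zero
assert shifted[0] == (0, 0, 0, 0)
assert abs(error_prob(codebook, prior, 0.15) - error_prob(shifted, prior, 0.15)) < 1e-12
```

Since Hamming distances between shifted codewords and shifted outputs are unchanged, every likelihood term, and hence the error probability, is preserved, which justifies the all-zero assumption.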
Since is the all-zero codeword, is the set containing the indices of the non-zero components of . To facilitate the investigation of the structure of relative to the remaining codewords , we first partition into subsets according to whether each index in is in , …, , , …, or not, as follows:
| (43) |
where and , and each . An example of the partition is given below.
Example 3.
Suppose . For and , we obtain subsets as
(44)
As is the all-zero codeword, the components of with indices in can now be unambiguously identified and must all equal . As a result,
| (45) |
Example 4.
Proceeding from Example 3, we have
(46) and
(47)
It should be emphasized that in this paper is defined differently from that in [1]. While the one defined in [1] partitions only according to codewords with indices less than j, the one defined in this paper considers all other codewords in the partition manipulation, and hence the order of codewords becomes irrelevant.
Next, to decompose , we further define a sequence of incremental sets:
| (48) |
and set . Let and respectively denote the sizes of and and note that .
The idea behind the partition of into subsets, indexed by , is as follows. Pick one . We start by examining whether is strictly less than . If the answer is negative, we continue examining whether is strictly less than . Proceed until we reach the smallest m such that holds. Setting k to be equal to , we assign this to the subset . Notably, there exists no such number satisfying if and only if ; in this case, we find the smallest m satisfying and assign this element to as . For ease of describing the above algorithmic partition process, we introduce a mapping from to as follows:
| (49) |
We can see that for , we have . Therefore, if is assigned to for some , we must have
| (50) |
On the other hand, if is collected in , then and
| (51) |
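The exact membership conditions in the partition above are not recoverable from this extraction, but the assignment mechanism just described is generic: scan a list of conditions in order, place each element into the subset indexed by the first condition it satisfies, and route elements satisfying none to a fallback subset. A hedged sketch with placeholder predicates of our own (the real conditions involve the restricted Hamming distances of the paper):

```python
def partition_by_first_match(items, predicates, fallback_key):
    """Assign each item to the subset indexed by the first predicate it
    satisfies; items satisfying no predicate go to a keyed fallback subset.
    Every item lands in exactly one subset, so the result is a partition."""
    subsets = {}
    for item in items:
        key = next((k for k, pred in enumerate(predicates) if pred(item)), None)
        if key is None:
            key = ('fallback', fallback_key(item))
        subsets.setdefault(key, []).append(item)
    return subsets

items = list(range(1, 21))
predicates = [lambda x: x % 2 == 0, lambda x: x % 3 == 0]  # placeholder conditions
parts = partition_by_first_match(items, predicates, fallback_key=lambda x: x % 5)
assert sorted(sum(parts.values(), [])) == items            # nothing lost, nothing doubled
assert all(x % 3 == 0 and x % 2 != 0 for x in parts[1])    # first-match, not any-match
```

The "first condition that fires" rule is what guarantees disjointness of the resulting subsets, mirroring how the minimality of m in the definitions above prevents an output tuple from being assigned twice.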
A formal definition of is given next, where the corresponding subsets of are also introduced.
Definition 3.
Define for , 1, …, ,
where is defined in (49).
With Definition 3, we have the following proposition.
Proposition 3.
For non-empty , the following two properties hold.
- (i)
forms a partition of ;
- (ii)
is a collection of disjoint subsets of .
Proof.
It can be seen from the definitions of and that they are collections of mutually disjoint subsets of and , respectively. It remains to argue that every element in belongs to for some . Noting that the element in satisfies , we differentiate two cases: and . For the former case, must hold for ; hence, this will be contained in . For the latter case, will be included in . The proof is thus completed. □
In light of Proposition 3, we can apply the ratio-sum inequality to obtain
| (53) |
| (54) |
We continue to construct a fine partition of and the corresponding disjoint subsets of in Proposition 4 after giving the next definition.
Definition 4.
Define for ,
where is given in (49).
Note from Definition 4 that for one element in non-empty , we can find a group of elements that have identical bit components to with indices in . We denote this group as . We continue this grouping manipulation until all elements in are exhausted as summarized below.
Proposition 4.
For non-empty , there exists a representative subset such that the following two properties hold.
- (i)
forms a (non-empty) partition of ;
- (ii)
is a collection of (non-empty) disjoint subsets of .
Since the above proposition can be directly verified via the sequential selection of each from , we omit the proof. Interested readers can find the details in [1] (Section III-C).
From Proposition 4, using again the ratio-sum inequality, we obtain that for non-empty ,
| (56) |
| (57) |
Noting that the above derivation can be similarly carried out for general , we combine (42), (54) and (57) to conclude that
| (58) |
The final task is to evaluate in order to characterize a linear upper bound for .
3.4. Characterization of a Linear Upper Bound for
We again focus on with being the all-zero codeword for simplicity. The definitions of in (55a) and in (55b) indicate that when dealing with the ratio , we only need to consider those bits with indices in , because the remaining bits of all tuples in and take the same values as those of . Noting that all elements in have exactly k ones with indices in , and all elements in have exactly ones with indices in , we can immediately infer that
| (59) |
The cardinalities of and then determine the ratio in (59), as verified in the next proposition, based on which the proof of Theorem 2 can be completed from (58).
Proposition 5.
For , we have
(60)
Proof.
Recall from (22a), (52a) and (55a) that if and only if
Thus, the number of elements in is exactly the number of channel outputs fulfilling the above three conditions. We then examine the number of satisfying (61b) and (61c). Noting that these have either ones or ones with indices in , we know that there are at most
(62) tuples satisfying (61b) and (61c). Hence, disregarding (61a), the number of elements in is upper-bounded by (62).
On the other hand, from (22b), (52b) and (55b), we obtain that if and only if
We then claim that any satisfying (63c) and (63d) directly validates (63a) and (63b). Note that the validity of the claim, which we prove in Appendix B, immediately implies that the number of elements in can be determined by (63c) and (63d), and hence
(64) Under this claim, (62) and (64) result in
(65)
(66)
(67)
(68) where (67) holds because by (49), and (68) follows from . The proof of the proposition is thus completed by (59) and (68). □
4. Conclusions
In this paper, we analyzed the error probability of block codes sent over the memoryless BSC under an arbitrary (not necessarily uniform) input distribution and used in conjunction with (optimal) MAP decoding. We showed that decoder ties do not affect the error exponent of the probability of error, thus extending a similar result recently established in [1] for uniformly distributed channel inputs. This result was obtained by proving that the relative deviation of the error probability from the probability of error when no MAP decoding ties occur grows no more than linearly in blocklength, directly implying that decoder ties have only a sub-exponential effect on the error probability as blocklength grows without bound. Future work includes further extending this result to more general channels used under arbitrary input statistics, such as non-binary symmetric channels (note that the result of Theorem 1 can be extended to non-binary (q-ary, ) codes sent over q-ary symmetric memoryless channels under a uniform input distribution; see [25] (Theorem 2)) and binary non-symmetric channels. Studying how to sharpen the upper bound derived in (10) for “sufficiently good” codes, as highlighted in Remark 1, and for codes with small blocklengths are other worthwhile future directions.
Appendix A. Supplement to Example 1
Under distribution
| (A1) |
| (A2) |
| (A3) |
over the code , we obtain:
| (A4) |
| (A5) |
| (A6) |
| (A7) |
where (A5) follows from (25), and
| (A8) |
and
| (A9) |
| (A10) |
The above derivations are verified via Table A1. Continuing with the same setting, we obtain
| (A11) |
| (A12) |
| (A13) |
| (A14) |
| (A15) |
and
| (A16) |
where the above derivations are also confirmed via Table A1. Based on Table A1, we further have
| (A17) |
| (A18) |
| (A19) |
| (A20) |
| (A21) |
and
| (A22) |
Furthermore, we establish from Table A1 that
| (A23) |
| (A24) |
| (A25) |
| (A26) |
| (A27) |
and
| (A28) |
After summarizing all sets derived above in Table A2, we remark that
| (A29) |
Note that are disjoint as confirmed in Remark 4 such that every element appears only once in the following summation:
| (A30) |
| (A31) |
| (A32) |
Additionally,
| (A33) |
| (A34) |
| (A35) |
Finally, we have
| (A36) |
| (A37) |
| (A38) |
| (A39) |
| (A40) |
| (A41) |
| (A42) |
| (A43) |
| (A44) |
| (A45) |
| (A46) |
| (A47) |
| (A48) |
| (A49) |
| (A50) |
| (A51) |
| (A52) |
| (A53) |
| (A54) |
Table A1.
Measures used in Example 1.
| 2 | 2 | 5 | ∅ | ∅ | ∅ | ∅ | ||
| 1 | 3 | 4 | ∅ | ∅ | ∅ | ∅ | ||
| 3 | 1 | 4 | ∅ | ∅ | ∅ | ∅ | ||
| 1 | 1 | 4 | ∅ | ∅ | ∅ | ∅ | ||
| 3 | 3 | 6 | ∅ | ∅ | ∅ | ∅ | ||
| 0 | 2 | 2 | 3 | ∅ | ∅ | ∅ | ∅ | |
| 0 | 0 | 2 | 3 | ∅ | ∅ | |||
| 0 | 2 | 0 | 3 | ∅ | ∅ | |||
| 0 | 2 | 4 | 5 | ∅ | ∅ | ∅ | ∅ | |
| 0 | 4 | 2 | 5 | ∅ | ∅ | ∅ | ∅ | |
| 0 | 2 | 2 | 5 | ∅ | ∅ | ∅ | ∅ | |
| 1 | 1 | 1 | 2 | ∅ | ||||
| 1 | 3 | 3 | 4 | ∅ | ∅ | ∅ | ∅ | |
| 1 | 1 | 3 | 4 | ∅ | ∅ | |||
| 1 | 3 | 1 | 4 | ∅ | ∅ | |||
| 2 | 2 | 2 | 3 | ∅ |
Table A2.
List of , , , and for and in Example 1.
| ∅ | |||||
|---|---|---|---|---|---|
| ∅ | |||||
| ∅ | ∅ | ||||
| ∅ | ∅ | ||||
| ∅ | ∅ | ∅ | |||
| ∅ | |||||
| ∅ | ∅ | ||||
| ∅ | ∅ | ∅ | |||
| ∅ | |||||
| ∅ | ∅ | ||||
| ∅ | ∅ | ∅ | |||
| ∅ | ∅ | ∅ | |||
| ∅ | ∅ | ∅ | |||
| ∅ | ∅ | ∅ | |||
Appendix B. The Proof of the Claim Supporting Proposition 5
We validate the claim that (63c) and (63d) imply (63a) and (63b) via the construction of an auxiliary from . This auxiliary will be defined differently according to whether equals or as follows.
-
(i) : In this case, has no zero components with indices in . Moreover, indicates that
(A55) Therefore, we arbitrarily flip a zero component of with its index in to construct a such that
which implies(A56) (A57) Then, must fulfill (63a), (63c) and (63d) (with replaced by ) as satisfies (61a), (61b) and (61c). We next declare that also fulfills (63b) and will prove this declaration by contradiction.
Proof of the declaration: Suppose there exists a satisfying(A58) We then recall from (45) that is either 0 or . Thus, (A58) can be disproved by differentiating two subcases: , and (Since as can be seen from (50) and (51), we have , i.e., non-empty).
In Subcase , that is obtained by flipping a zero component of with index in must satisfy and , which is equivalent to(A59) Then, (A58) implies
Hence,(A60) (A61) A contradiction to the fact that satisfies (61a) (with replaced by ) is obtained.
In Subcase , we note that implies . Therefore, (A55) leads to(A62) The flipping manipulation on results in and , which is equivalent to(A63) Therefore, (A58) implies
which together with and (A62) result in because . This contradicts . Accordingly, must also fulfill (63b); hence, . This completes the proof of the declaration.(A64) With this auxiliary , we are ready to prove that every satisfying (63c) and (63d) also validates (63a) and (63b). Toward this end, we need to prove
Note that(A65)
where (A66a) holds because both and satisfy (63c), implying that all components of and with indices in are equal to one; (A66b) holds because, when considering only those portions with indices in (non-empty) , gives either all ones or all zeros according to (45), and both and have exactly ones according to (63c); and (A66c) is valid since both and satisfy (63d). Based on (A66a)–(A66c), we remark that for all , which implies (equivalently, ) for all .
(ii)
: In this case, there is only one zero component of with its index in . Suppose the index of this zero component lies in , where . The flipping manipulation on leads to , which has all one components with respect to . Then, must fulfill (63a), (63c), and (63d) as satisfies (61a), (61b), and (61c). With the components of with respect to (non-empty) being either all zeros or all ones, the same contradiction argument between (A58) and (A64), with replaced by h, can disprove the validity of (A58) for this and for any . Therefore, also fulfills (63b), implying . With this auxiliary , we can again verify (A66a)–(A66c) via the same argument. The claim that satisfying (63c) and (63d) validates (63a) and (63b) is thus confirmed.
Author Contributions
Conceptualization, L.-H.C., P.-N.C. and F.A.; Formal analysis, L.-H.C.; Writing—original draft, L.-H.C.; Writing—review & editing, P.-N.C. and F.A. All authors have read and agreed to the published version of the manuscript.
Conflicts of Interest
The authors declare no conflict of interest.
Funding Statement
The work of Ling-Hua Chang is supported by the Ministry of Science and Technology, Taiwan under Grant MOST 109-2221-E-155-035-MY3. The work of Po-Ning Chen is supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 110-2221-E-A49-024-MY3. The work of Fady Alajaji is supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
References
- 1.Chang L.H., Chen P.N., Alajaji F., Han Y.S. Decoder Ties Do Not Affect the Error Exponent of the Memoryless Binary Symmetric Channel. IEEE Trans. Inf. Theory. 2022;68:3501–3510. doi: 10.1109/TIT.2022.3150597.
- 2.Shannon C.E., Gallager R.G., Berlekamp E.R. Lower bounds to error probability for coding on discrete memoryless channels—I. Inf. Control. 1967;10:65–103. doi: 10.1016/S0019-9958(67)90052-6.
- 3.Shannon C.E., Gallager R.G., Berlekamp E.R. Lower bounds to error probability for coding on discrete memoryless channels—II. Inf. Control. 1967;10:522–552. doi: 10.1016/S0019-9958(67)91200-4.
- 4.McEliece R.J., Omura J.K. An improved upper bound on the block coding error exponent for binary-input discrete memoryless channels. IEEE Trans. Inf. Theory. 1977;23:611–613. doi: 10.1109/TIT.1977.1055772.
- 5.Gallager R.G. Information Theory and Reliable Communication. Wiley; New York, NY, USA: 1968.
- 6.Viterbi A.J., Omura J.K. Principles of Digital Communication and Coding. McGraw-Hill; New York, NY, USA: 1979.
- 7.Csiszár I., Körner J. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press; New York, NY, USA: 1981.
- 8.Blahut R. Principles and Practice of Information Theory. Addison-Wesley Longman Publishing Co., Inc.; Albany, NY, USA: 1988.
- 9.Barg A., McGregor A. Distance distribution of binary codes and the error probability of decoding. IEEE Trans. Inf. Theory. 2005;51:4237–4246. doi: 10.1109/TIT.2005.858977.
- 10.Haroutunian E.A., Haroutunian M.E., Harutyunyan A.N. Reliability Criteria in Information Theory and in Statistical Hypothesis Testing. Foundations and Trends in Communications and Information Theory, Volume 4. Now Publishers Inc.; Delft, The Netherlands: 2007; pp. 97–263.
- 11.Dalai M. Lower bounds on the probability of error for classical and classical-quantum channels. IEEE Trans. Inf. Theory. 2013;59:8027–8056. doi: 10.1109/TIT.2013.2283794.
- 12.Burnashev M.V. On the BSC reliability function: Expanding the region where it is known exactly. Probl. Inf. Transm. 2015;51:307–325. doi: 10.1134/S0032946015040018.
- 13.Csiszár I. Joint source-channel error exponent. Probl. Control. Inf. Theory. 1980;9:315–328.
- 14.Zhong Y., Alajaji F., Campbell L. On the joint source-channel coding error exponent for discrete memoryless systems. IEEE Trans. Inf. Theory. 2006;52:1450–1468. doi: 10.1109/TIT.2006.871608.
- 15.Alajaji F., Phamdo N., Fuja T. Channel codes that exploit the residual redundancy in CELP-encoded speech. IEEE Trans. Speech Audio Process. 1996;4:325–336. doi: 10.1109/89.536927.
- 16.Xu W., Hagenauer J., Hollmann J. Joint source-channel decoding using the residual redundancy in compressed images; Proceedings of the International Conference on Communications; Washington, DC, USA. 25–28 February 1996; pp. 142–148.
- 17.Hagenauer J. Source-controlled channel decoding. IEEE Trans. Commun. 1995;43:2449–2457. doi: 10.1109/26.412719.
- 18.Goertz N. Joint Source-Channel Coding of Discrete-Time Signals with Continuous Amplitudes. World Scientific; Singapore: 2007.
- 19.Duhamel P., Kieffer M. Joint Source-Channel Decoding: A Cross-Layer Perspective with Applications in Video Broadcasting. Academic Press; Cambridge, MA, USA: 2009.
- 20.Fresia M., Pérez-Cruz F., Poor H.V., Verdú S. Joint source and channel coding. IEEE Signal Process. Mag. 2010;27:104–113. doi: 10.1109/MSP.2010.938080.
- 21.Alajaji F., Chen P.N. An Introduction to Single-User Information Theory. Springer; Berlin/Heidelberg, Germany: 2018.
- 22.Chang L.H., Chen P.N., Alajaji F., Han Y.S. The asymptotic generalized Poor-Verdú bound achieves the BSC error exponent at zero rate; Proceedings of the IEEE International Symposium on Information Theory; Los Angeles, CA, USA. 21–26 June 2020.
- 23.Chen P.N., Alajaji F. A generalized Poor-Verdú error bound for multihypothesis testing. IEEE Trans. Inf. Theory. 2012;58:311–316. doi: 10.1109/TIT.2011.2171533.
- 24.Poor H.V., Verdú S. A lower bound on the probability of error in multihypothesis testing. IEEE Trans. Inf. Theory. 1995;41:1992–1994. doi: 10.1109/18.476322.
- 25.Chang L.H., Chen P.N., Alajaji F., Han Y.S. Tightness of the asymptotic generalized Poor-Verdú error bound for the memoryless symmetric channel. arXiv. 2020, arXiv:2007.04080v1.

