Skip to main content
Entropy logoLink to Entropy
. 2023 Apr 16;25(4):668. doi: 10.3390/e25040668

On Decoder Ties for the Binary Symmetric Channel with Arbitrarily Distributed Input

Ling-Hua Chang 1,*, Po-Ning Chen 2, Fady Alajaji 3
Editors: Jun Chen, Sadaf Salehkalaibar
PMCID: PMC10138166  PMID: 37190456

Abstract

The error probability of block codes sent under a non-uniform input distribution over the memoryless binary symmetric channel (BSC) and decoded via the maximum a posteriori (MAP) decoding rule is investigated. It is proved that the ratio of the probability of MAP decoder ties to the probability of error grows most linearly in blocklength when no MAP decoding ties occur, thus showing that decoder ties do not affect the code’s error exponent. This result generalizes a similar recent result shown for the case of block codes transmitted over the BSC under a uniform input distribution.

Keywords: binary symmetric channel, block codes, non-uniformly distributed channel inputs, joint source-channel coding, maximum a posteriori (MAP) decoding, decoder ties, error probability, error exponent

1. Introduction

Consider the classical channel coding context, where we send a block code through the memoryless binary symmetric channel (BSC) with crossover probability 0<p<1/2. Given a sequence of binary codes {Cn}n1 with n being the blocklength, we denote the sequence of corresponding minimal probabilities of decoding error under maximum a posteriori (MAP) decoding by {an}n1. The following result was recently shown in [1] when the channel input selects codewords from Cn according to a uniform distribution.

Theorem 1

([1]). For any sequence of codes {Cn}n1 of blocklength n and size |Cn|=M with Cn{0,1}n, sent over the BSC with crossover probability 0<p<1/2 under a uniform channel input distribution over Cn, its minimum probability of decoding error an satisfies

bnan1+(1p)pnbn, (1)

where

bn=PXn,Yn(xn,yn)Xn×Yn:PXn|Yn(xn|yn)<maxunCn\{xn}PXn|Yn(un|yn), (2)

where PXn,Yn is the joint input–output distribution that Xn=(X1,X2,,Xn)Xn{0,1}n is sent over the BSC (via n uses) and Yn=(Y1,Y2,,Yn)Yn{0,1}n is received.

Noting that bn in (2) is the probability that a decoding error occurs without inducing decoder ties (which occur when two or more codewords in Cn are identified by the decoder as the estimated transmitted codeword; i.e., when more than one codeword in Cn maximize PXn|Yn(·|yn) for a given received word yn), the above result in (1) directly implies that decoder ties do not affect the error exponent of an. The error exponent or reliability function of a block coding communication system represents the largest rate of exponential decay of the system’s probability of decoding error as the coding blocklength grows to infinity (e.g., see [2,3,4,5,6,7,8,9,10,11,12,13,14]).

It is known that uniformly distributed data achieves the largest entropy rate and leaves no room for data compression. Thus, ideally compressed data should exhibit uniform distribution for all blocklengths n. However, this setting is often impractical due to the sub-optimality of the implemented data compression schemes. Instead, we generally have non-uniformly distributed data after compression in the form of residual redundancy such as in speech or image coding (e.g., [15,16]). Furthermore, one may have a compressed source that can be divided into several groups, within each of which the symbols are equally probable. Decoder ties can thus occur with respect to two (or more) codewords corresponding to symbols within the same group.

In this paper, we consider a non-uniform prior distribution over Cn and prove that decoder ties, under optimal MAP decoding, still have linear and hence sub-exponential impact on the error probability an, thus extending Theorem 1 established for the case of a uniform prior distribution over Cn. Since our problem falls within the general framework of joint source-channel coding for point-to-point communication systems, we refer the reader to [14,15,16,17,18,19,20,21] (Section 4.6) and the references therein for theoretical studies on this subject as well as practical designs that outperform separate source and channel coding under complexity or delay constraints.

The proof technique used in [1] to show (1) above is based on the observation that there are two types of decoding errors. One is that the received tuple at the channel output induces no decoder ties but the corresponding decoder decision is wrong. The other is that the received tuple at the channel output causes a decoder tie, but the decoder picks the wrong codeword. As a result, the MAP error probability an can be upper bounded by the sum of two terms, bn and δn, where bn is the probability of the first type of decoding errors as given in (2), and δn is the probability of decoder ties regardless of whether the tie breaker misses the correct codeword or not. Under the assumption that the channel input is uniformly distributed over block code Cn for each blocklength n and an arbitrary sequence of codes {Cn}n1, it was shown in [1] that flipping a properly selected bit component of the channel output that causes a decoder tie can produce a unique channel output that leads to the first type of decoding errors. An analysis of this bit-flipping manipulation shows that the ratio δn/bn grows at most linearly in n and hence yields the upper bound in (1). However, this flipping technique no longer works when non-uniform channel inputs are considered. To tackle this problem, we judiciously separate the channel output tuples that induce decoder ties into two groups, one group consisting of output tuples that do not fulfill the above flipping manipulation property and the other group composed of the remaining output tuples (i.e., the complement group). We then show that the probability of the former group is upper bounded by that of the latter group, and therefore δn/bn remains growing at most linearly in blocklength n under arbitrary channel input statistics. Note that the group that fails the flipping property is an empty set when channel input is uniformly distributed over Cn, thereby making the result of Theorem 1 a special case of the extended result in this paper. The rest of the paper is organized as follows. Section 2 presents the main result and highlights the key steps of the proof to facilitate its understanding. The proof is then provided in full detail, along with illustrative examples, in Section 3 and Appendix A and Appendix B. Finally, conclusions and future directions are given in Section 4.

Throughout the paper, we denote [M]{1,2,,M} for positive integer M and set d(xn,yn|S) to be the Hamming distance between n-tuples xn=(x1,x2,,xn) and yn=(y1,y2,,yn) with the indices of the tuples restricted to S[n]. By convention, we set d(xn,yn|S)=0 when S= and use d(xn,yn) to represent d(xn,yn|[n]).

2. Main Result

Consider a binary code Cn{0,1}n with fixed blocklength n and size M to be used over the memoryless BSC with crossover probability 0<p<12. Denote the prior probability on Cn by PXn and hence PXn(Cn)=1. Without loss of generality, we assume that all codewords in Cn occur with positive probability, i.e., PXn(xn)>0 for all xnCn; hence, Cn is the support of PXn.

It is known the minimal probability of decoding error is achieved by the MAP decoder, which upon the reception of the channel output yn{0,1}n estimates the codeword xnCn according to

e(yn)=argmaxunCnPXn|Yn(un|yn), (3)

where PXn|Yn is the posterior conditional distribution of Xn given Yn. We can see from (3) that if more than one unCn achieves the maximum value of PXn|Yn(un|yn) for a given yn, a decoder tie occurs, in which case the set of these un, denoted conveniently as {e(yn)}, contains more than one element. As a result, an erroneous MAP decision is made if one of the two situations occurs: i) the transmitted codeword does not belong to {e(yn)}; ii) the transmitted codeword belong to {e(yn)} and |{e(yn)}|>1, but the tie breaker picks the wrong one from {e(yn)}. By conveniently denoting

Cn={c1,c2,,cM}, (4)

the probability of the first situation acts as a lower bound bn for an (i.e., bnan), where bn is given in (2) and can be written as

bn=i=1MPXn(ci)PYn|Xnyn{0,1}n:PXn|Yn(ci|yn)<maxr[M]\{i}PXn|Yn(cr|yn). (5)

It is shown in [22] that bn exactly equals the generalized Poor–Verdú (lower) bound [23,24] as its tilting parameter approaches infinity. The probability of the second situation is bounded above by the probability that the transmitted codeword belongs to {e(yn)} and |{e(yn)}|>1, disregarding whether the tie breaker picks the wrong codeword or not, and this upper bound can be expressed as

δni=1MPXn(ci)PYn|Xnyn{0,1}n:PXn|Yn(ci|yn)=maxr[M]\{i}PXn|Yn(cr|yn). (6)

We thus have

bnanbn+δn. (7)

By proving the inequality

δn2qnbn, (8)

where

q1pp>1, (9)

we have our main result as follows.

Theorem 2.

For any sequence of binary codes {Cn}n1 and prior probabilities {PXn}n1 used over the BSC, we have

bnan1+2qnbn. (10)

Remark 1.

Theorem 2 implies that the relative deviation of an from bn is at most linear in the blocklength n and the impact of decoder ties in (6) to an is only sub-exponential. Consequently, an and bn must have the same error exponent. Note also that the upper bound in (10) differs from the result in Theorem 1 by an additional multiplicative factor of 2 in the qn term. As explained in the introduction section, this is a consequence of the fact that the probability of the group of channel output tuples that cause decoder ties but fail the flipping manipulation property is upper bounded by that of the remaining tie-inducing channel outputs. The full technical details are provided in Section 3.2. Finally, we emphasize that Theorem 2 holds for arbitrary binary codes, including “bad” codes for which high probability codewords have small Hamming distance between them. Hence, tightening the upper bound in (10) by restricting the analysis for “sufficiently good" codes, in the sense that their most likely codewords sit “sufficiently” far apart in {0,1}n, is an interesting future direction.

List of Main Symbols: Before providing an overview of the main steps of the proof of Theorem 2 (which is presented in full detail in the next section), we describe in Table 1 the main symbols used in the paper and indicate the equation where they are first introduced. We emphasize that sets Tj|i, Nj|i and S1,j(m) are defined differently from their counterparts in [1] that use the same notation.

Table 1.

Summary of the main symbols used in this paper.

Symbol Description Defined in
[M] A shorthand for {1,2,,M}
Cn The code c1,c2,,cn with c1 being the all-zero codeword
d(un,vn|S) The Hamming distance between the portions of un and vn with indices in S
All terms below are functions of Cn (this dependence is not explicitly shown to simplify notation)
Ti The set of channel outputs yn inducing a decoder tie when ci is sent (12)
Ni The set of channel outputs yn leading to a tie-free decoder decision error when ci is sent (15)
Ii(yn) The set {m[M]\{i}:ynTm} for ynTi (21)
Si,j The set of indices for which the components of ci and cj differ
i,j The size of Si,j, i.e., |Si,j|
Tj|i The subset of Ti consisting of channel outputs yn such that j is the minimal (22a)
number r in Ii(yn) satisfying d(ci,yn|Si,r)<i,r
Nj|i The subset of Ni consisting of channel outputs yn that satisfy PXn,Yn(ci,yn)·q= (22b)
PXn,Yn(cj,yn)·1q and that are not included in Nr|i for r[j1]{i}
Tj|i The subset of Ti\h[M]\{i}Th|i consisting of channel outputs yn (23)
such that j is the minimal number in Ii(yn)
S1,j(m) The subset of S1,j defined according to whether each index in S1,j is in each (43)
of S1,2, , S1,j1, S1,j+1, , S1,M
S1,j(m) The union of S1,j(1), S1,j(2), , S1,j(m) (48)
1,j(m) The size of S1,j(m), i.e., |S1,j(m)|
ηk The mapping from k{0,1,,j1} to [2M2] used for partitioning Tj|1 into 1,j (49)
subsets {Tj|1(k)}0k<1,j
Tj|1(k) The kth partition of Tj|1 for k=0, 1, , 1,j1 (52a)
Nj|1(k) The kth subset of Nj|1 for k=0, 1, , 1,j1 (52b)
Uj|1(k) The set of representative elements in Tj|1(k) for partitioning Tj|1(k)
Tj|1(un;k) The subset of Tj|1(k) associated with unUj|1(k) (55a)
Nj|1(un;k) The subset of Nj|1(k) associated with unUj|1(k) (55b)

We also visually illustrate in Figure 1 some of the main sets defined in Table 1 under the setting of Example 1, which is presented in Section 3 below for a non-uniformly distributed binary code with M=4 codewords and blocklength n=4 given by C4={c1,c2,c3,c4}={0000,0101,0110,0111}. More specifically, we only show the non-empty component subsets in Yn={0,1}4 corresponding to codewords c1 and c2; refer to Table A2 in Appendix A for a detailed listing of all component subsets in {0,1}4 (including empty ones).

Figure 1.

Figure 1

An illustration, based on the setting in Example 1 for a non-uniformly distributed binary code (with M=n=4) given by C4={c1,c2,c3,c4}={0000,0101,0110,0111} of the non-empty component subsets of Yn defined in Table 1 and corresponding to codewords c1=0000 (left figure) and c2=0101 (right figure).

Overview of the Proof: Given that codeword ci is sent over the channel, i[M], let Ti denote the set of output tuples yn that result in MAP decoding ties:

Tiyn{0,1}n:PXn|Yn(ci|yn)=maxr[M]\{i}PXn|Yn(cr|yn) (11)
 =yn{0,1}n:PXn,Yn(ci,yn)=maxr[M]\{i}PXn,Yn(cr,yn), (12)

where (12) holds because PXn|Yn(xn|yn)=PXn,Yn(xn,yn)PYn(yn). Then, δn in (6) can be rewritten as

δn=i[M]PXn(ci)PYn|XnTi|ci=i[M]PXn,Yn(ci,Ti). (13)

Similarly, let Ni denote the set of output tuples yn which guarantee a tie-free MAP decoding error when ci is transmitted over the channel:

Niyn{0,1}n:PXn|Yn(ci|yn)<maxr[M]\{i}PXn|Yn(cr|yn) (14)
 =yn{0,1}n:PXn,Yn(ci,yn)<maxr[M]\{i}PXn,Yn(cr,yn). (15)

Hence, bn in (5) can be rewritten as:

bn=i[M]PXn(ci)PYn|Xn(Ni|ci)=i[M]PXn,Yn(ci,Ni). (16)

Note if δn=0, then (7) is tight and (10) holds trivially; so, without loss of generality, we assume in the proof that δn>0, which implies that there exists at least one non-empty Ti for i[M]. Then, according to (13) and (16), we have that

δnbn=i[M]PXn,Yn(ci,Ti)i[M]PXn,Yn(ci,Ni)i[M]:TiPXn,Yn(ci,Ti)i[M]:TiPXn,Yn(ci,Ni). (17)

We can upper-bound (17) by

i[M]:TiPXn,Yn(ci,Ti)i[M]:TiPXn,Yn(ci,Ni)max1[M]:TiPXn,Yn(ci,Ti)PXn,Yn(ci,Ni), (18)

where for convenience we will refer to an inequality of the form given in (18) as the ratio-sum inequality. As a result, Theorem 2 holds if we can substantiate that 2qn is an upper bound for (18). To this end, we will find a proper partition of Ti and an equal number of disjoint subsets of Ni, of which the individual probabilities can be evaluated. For ease of notation, we denote the individual probabilities corresponding to the K-partition of Ti and K disjoint subsets of Ni by {αk}k=1K and {βk}k=1K, respectively. Then, we obtain that

PXn,Yn(ci,Ti)PXn,Yn(ci,Ni)k=1Kαkk=1Kβk. (19)

By showing that each individual ratio αk/βk, k[K] is bounded above by 2qn, the ratio-sum inequality can again be applied to complete the proof.

3. Proof of Theorem 2

In [1], where a uniformly distributed prior probability PXn over Cn is assumed, one can flip a properly selected bit in the output ynTi to convert it to a corresponding element in Ni. In light of this connection, one can evaluate the ratio PYn|Xn(Ti|ci)PYn|Xn(Ni|ci). This approach, however, no longer works when a non-uniformly distributed prior probability is considered. Therefore, we have to devise a more judicious approach to extend the result in [1] for a general prior probability.

3.1. A Partition of Non-Empty Ti and Corresponding Disjoint Subsets of Ni

In this section, instead of finding a disjoint covering of the set of decoder ties Ti as in [1], we establish a proper partition of Ti from Definitions 1 and 2. This is one of the key differences from the techniques used in [1]. Example 1 is given after Proposition 1 to illustrate Definitions 1 and 2.

Given ynTi defined in (12), there exists at least one m[M]\{i} such that

PXn,Yn(ci,yn)=PXn,Yn(cm,yn)=maxr[M]\{i}PXn,Yn(cr,yn). (20)

We collect the indices m that satisfy (20) in Ii(yn) as follows:

Ii(yn)h[M]\{i}:PXn,Yn(ci,yn)=PXn,Yn(ch,yn)=maxr[M]\{i}PXn,Yn(cr,yn). (21)

Remark 2.

First, we note that Ii(yn) is not empty as long as ynTi. Also, for any ynTi, we can infer from (21) that hIi(yn) if and only if ynTh.

In Definitions 1 and 2 that follow, we will assign each ynTi to a subset indexed by jIi(yn). These subsets will form a partition of Ti as stated in Proposition 1.

Definition 1.

For j[M]\{i}, denoting by Si,j the set of indices where the bit components of ci and cj differ, we define

{(22a)Tj|iynTi:j=minrIi(yn):d(ci,yn|Si,r)<|Si,r|r;Nj|i{ynNi:PXn,Yn(ci,yn)·q=PXn,Yn(cj,yn)·1q(22b)andPXn,Yn(cj,yn)PXn,Yn(cr,yn)forr[j1]\{i}}.

Since there may exist ynTi satisfying d(ci,yn|Si,r)=|Si,r| for all rIi(yn), the collection of all elements in j[M]\{i}Tj|i may not exhaust the elements in Ti (see Example 1). We thus go on to collect the remaining elements in Ti\j[M]\{i}Tj|i as follows.

Definition 2.

Define for j[M]\{i},

Tj|iynTi\h[M]\{i}Th|i:j=minrIi(yn)r. (23)

With the sets defined in Definitions 1 and 2, a partition of Ti and disjoint subsets of Ni are constructed as proven in the following proposition.

Proposition 1.

For non-empty Ti, the following two properties hold.

  • (i) 

    The collection {Tj|iTj|i}j[M]\{i} forms a (disjoint) partition of Ti.

  • (ii) 

    {Nj|i}j[M]\{i} is a collection of disjoint subsets of Ni.

Before proving Proposition 1, we provide the following example to illustrate the above sets.

Example 1.

This example illustrates the necessity of introducing Tj|i as a companion to Tj|i. Suppose C4={c1,c2,c3,c4}={0000,0101,0110,0111}. Let PX4(c1)=q22+q2+q2, PX4(c2)=PX4(c3)=12+q2+q2 and PX4(c4)=q22+q2+q2. Then, y4=0111 satisfies

PX4,Y4(c1,y4)q2(2+q2+q2)p4q1=maxr[4]\{1}PX4,Y4(cr,y4)=PX4,Y4(c2,yn)1(2+q2+q2)p4q3=PX4,Y4(c3,y4)1(2+q2+q2)p4q3>PX4,Y4(c4,y4)q2(2+q2+q2)p4q4, (24)

where the probabilities PXn,Yn(xn,yn) are written in the form

PXn,Yn(xn,yn)=PXn(xn)PYn|Xn(yn|xn)=PXn(xn)pnqnd(xn,yn). (25)

Note that the first equality in (24) indicates 0111T1 and the last two equalities and the right-most inequality jointly imply I1(0111)={2,3}. In light of Proposition 1, this 0111 must lie in one and only one of {Tj|1Tj|1}j[4]\{1} as shown in Table A1 and Table A2 of Appendix A. Since there exist no integers h in I1(0111) fulfilling d(c1,0111|S1,h)<|S1,h|, this 0111 belongs to Tj|1 with j=minrI1(0111)=2. Recall that in [1], an element in Nj|1 can be obtained if flipping a zero of ynTj|1 can make it further away from c1 but closer to cj. However, for y4=0111 in this example if we flip the only zero to one, it gets further away from both c1 and ch for any h=2,3,4. Therefore, the bit-flipping manipulation fails.

With y4=0111, we also have

PX4,Y4(c2,y4)1(2+q2+q2)p4q3=maxr[4]\{2}PX4,Y4(cr,y4)=PX4,Y4(c1,y4)q2(2+q2+q2)p4q=PX4,Y4(c3,yn)1(2+q2+q2)p4q3>PX4,Y4(c4,y4)1(2+q2+q2)p4q4, (26)

where the first equality indicates 0111T2 and the remaining parts in (26) jointly imply that I2(0111)={1,3}. Proposition 1 then states that this 0111 lies in one and only one of {Tj|2Tj|2}j[4]\{2}. Since 1I2(0111)={1,3} and d(c2,0111|S2,1)=0<|S2,1|=2, we have 0111T1|2 according to (22a). Thus, we can flip a bit in 0111 to get further away from c2 and closer to cj simultaneously. More specifically, the bit-flipping manipulation produces either 0110 or 0011, which lies in N1|2 as y4=0111 is in T1|2. Therefore, we can associate the element in T1|2 with an element in N1|2 via a single flipping operation. For completeness, a full list of the sets Ti, Ni, Tj|i, Tj|i and Nj|i for i[4] and j[4]\{i}, is given in Appendix A.

Proof of Proposition 1.

First, we note that by the definitions in (22a) and (23), {Tj|i}j[M]\{i} are disjoint and so is {Tj|i}j[M]\{i}. Additionally, (23) implies Tj|iTh|i= for arbitrary j,h[M]\{i}. Furthermore, according to Definitions 1 and 2, for any ynTi, we have either ynTh|i or ynTh|i for some h[M]\{i}. Consequently, {Tj|iTj|i}j[M]\{i} forms a partition of Ti.

On the other hand, the inequality in (22b) prevents multiple inclusions of an element from the previous collections. Therefore, {Nj|i}j[M]\{i} are a collection of disjoint subsets of Ni. □

Remark 3.

When channel inputs are uniformly distributed as considered in [1], it follows that

Ii(yn)=h[M]\{i}:d(ci,yn)=d(ch,yn)=maxr[M]\{i}d(cr,yn), (27)

and d(ci,yn|Si,j)=|Si,j|2<|Si,j| for every jIi(yn). Therefore, (22a) is reduced to

Tj|i=ynTi:j=minrIi(yn)r, (28)

and

Tj|i=. (29)

We then have the following two remarks. First, we note that the Tj|i newly defined via (22a) and reduced to (28) in the regime considered in [1] is more restrictive than the Tj|i introduced in [1] (Equation (16a)). As a consequence, {Tj|i}j[M]\{i} forms a partition of Ti in this paper while those introduced in [1] (Equation (16a)) are a disjoint covering of Ti under uniform channel inputs. Second, (29) shows that [1] does not need to consider a companion Tj|i to Tj|i, but this paper does.

Based on Proposition 1, we continue the derivation from (17) and obtain:

δnbni[M]PXn,Ynci,j[M]\{i}(Tj|iTj|i)i[M]PXn,Ynci,j[M]\{i}Nj|i (30)
=i[M]j[M]\{i}PXn,Ynci,Tj|i+i[M]j[M]\{i}PXn,Ynci,Tj|ii[M]j[M]\{i}PXn,Ynci,Nj|i, (31)

where (31) holds because {Tj|i}j[M]\{i} and {Tj|i}j[M]\{i} are disjoint, and the same applies to {Nj|i}j[M]\{i}. An additional upper bound for (31) requires the verification of the inequality:

i[M]j[M]\{i}PXn,Ynci,Tj|ij[M]i[M]\{j}PXn,Yncj,Ti|j, (32)

which is an immediate consequence of the proposition to be proven in the next section (Proposition 2), stating that

ynTj|iandhIi(yn)ynT|hforsomeIh(yn)andPXn,Yn(ci,yn)=PXn,Yn(ch,yn). (33)

3.2. Verification of (32)

Recall that the main technique used in [1] is to associate every element in Ti with a corresponding element in Ni via the bit-flipping manipulation. By this bit-flipping association, the probability ratio of the elements and corresponding elements respectively in Ti and Ni can be evaluated. However, as Example 1 indicates, for an element in Tj|i, the bit-flipping association no longer works. This reveals the challenge of generalizing the results in [1] from uniform channel inputs to arbitrarily distributed channel inputs. A solution is to subdivide the elements in Ti into two groups {Tj|i}j[M]\{i} and {Tj|i}j[M]\{i}, where the bit-flipping association to {Nj|i}j[M]\{i} works for the former group but not for the latter. The inequality in (32) can then be used to exclude the latter group with an upper bound:

δnbni[M]j[M]\{i}PXn,Ynci,Tj|i+i[M]j[M]\{i}PXn,Ynci,Tj|ii[M]j[M]\{i}PXn,Ynci,Nj|i (34)
2i[M]j[M]\{i}PXn,Ynci,Tj|ii[M]j[M]\{i}PXn,Ynci,Nj|i. (35)

Since uniform channel inputs as considered in [1] guarantee (29), it can be seen from (35) that the multiplicative factor of 2 can be reduced to 1 as observed in Remark 1. For general arbitrary channel inputs, we have the factor of 2 since the set Tj|i may not be empty. The validity of (32) can be confirmed by the next proposition.

Proposition 2.

Suppose ynTj|i. Then, for every hIi(yn), we have

ynT|hforsomeIh(yn)andPXn,Yn(ci,yn)=PXn,Yn(ch,yn). (36)

Proof. 

Suppose ynTj|i. Then, d(ci,yn|Si,h)=|Si,h| for every hIi(yn). We therefore have:

PXn,Yn(ci,yn)=PXn,Yn(ch,yn)=maxr[M]\{i}PXn,Yn(cr,yn). (37)

We can rewrite (37) as

PXn,Yn(ch,yn)=PXn,Yn(ci,yn)=maxr[M]\{h}PXn,Yn(cr,yn), (38)

implying ynTh and iIh(yn). Noting that d(ch,yn|Sh,i)=0<|Sh,i| because d(ci,yn|Si,h)=|Si,h| and Sh,i=Si,h, we conclude that the smallest integer Ih(yn) satisfying d(ch,yn|Sh,)<|Sh,| exists, and therefore ynT|h. □

Remark 4.

Two observations can be made based on Proposition 2. First, Proposition 2 indicates that every ynTj|i must appear at least once in the sum h[M][M]\{h} PXn,Ynch,T|h, contributing the same probability mass PXn,Yn(ch,yn) as PXn,Yn(ci,yn). Second, Proposition 2 also implies that every ynTj|i cannot be contained in h[M]r[M]\{h}Tr|h\Tj|i. This observation can be substantiated as follows. For every hIi(yn), Proposition 2 implies ynT|h for some Ih(yn) and hence Definition 2 immediately gives ynTr|h for all r[M]\{h}. For hIi(yn), we have ynTh and therefore ynTr|hTh for all r[M]\{h} as pointed out in Remark 2. As a result, every ynTj|i appears exactly once in the sum h[M]r[M]\{h}PXn,Ynch,Tr|h. Combining the two observations leads to:

i[M]j[M]\{i}ynTj|iPXn,Ynci,ynj[M][M]\{j}ynT|jPXn,Yncj,yn. (39)

To flesh out the above inequality, we give the next example.

Example 2.

Proceeding from Example 1, we observe from Table A1 and Table A2 in Appendix A that 0111 is contained in T2|1, T1|2 and T1|3. Hence, it appears once in the sum j[4]i[4]\{j} PXn,Yncj,Ti|j while it contributes twice in the sum j[4]i[4]\{j}PXn,Yncj,Ti|j. We then confirm from (A35) that:

i[4]j[4]\{i}PX4,Y4ci,Tj|ii[4]j[4]\{i}PX4,Y4ci,Tj|i. (40)

We continue the derivation from (35) and obtain

δnbn2i[M]j[M]\{i}:Tj|iPXn,Ynci,Tj|ii[M]j[M]\{i}:Tj|iPXn,Ynci,Nj|i (41)
2maxi[M]andj[M]\{i}:Tj|iPXn,Ynci,Tj|iPXn,Ynci,Nj|i, (42)

where we add the restriction Tj|i in (41) to exclude the cases of zero dividing by zero in (42), and (42) follows the ratio-sum inequality in (18).

In the next section, we introduce a number of delicate decompositions of non-empty Tj|i and an equal number of disjoint subsets of Nj|i to facilitate the bit-flipping association of the pairs.

3.3. Atomic Decomposition of Non-Empty Tj|i and the Corresponding Disjoint Subsets of Nj|i

To simplify the exposition, we assume without loss of generality that c1 is the all-zero codeword (It is known that we can simultaneously flip the same position of all codewords to yield a new code of equal performance over the BSC. Thus, via a number of flipping manipulations, we can transform any code to a code of equal performance with the first codeword being all-zero.) Below we present the proof for i=1. The proof for general i>1 follows annalagously.

Since c1 is the all-zero codeword, S1,j is the set containing the indices of the non-zero components of cj. To facilitate the investigation of the structure of cj relative to the remaining codewords {cr}r[M]\{1,j}, we first partition S1,j into 2M2 subsets according to whether each index in S1,j is in S1,2, , S1,j1, S1,j+1, , S1,M or not, as follows:

S1,j(m)r=2j1Sr;λrr=j+1MSr;λrS1,jform1+r=2j1λr·2r2+r=j+1Mλr·2r3, (43)

where Sr;1S1,r and Sr;0[n]\S1,r=S1,rc, and each λr{0,1}. An example of the partition is given below.

Example 3.

Suppose C4={00000,11001,01111,01101}. For j=3 and S1,j={2,3,4,5}, we obtain 242=4 subsets as

S1,3(m)=S1,3(1)=S1,2cS1,4cS1,3={4},if(λ4,λ2)=(0,0);S1,3(2)=S1,2S1,4cS1,3=,if(λ4,λ2)=(0,1);S1,3(3)=S1,2cS1,4S1,3={3},if(λ4,λ2)=(1,0);S1,3(4)=S1,2S1,4S1,3={2,5},if(λ4,λ2)=(1,1). (44)

As c1 is the all-zero codeword, the components of cr with indices in S1,j(m) can now be unambiguously identified and must all equal λr. As a result,

dc1,cr|S1,j(m)=|S1,j(m)|,λr=1;0,λr=0. (45)

Example 4.

Proceeding from Example 3, we have

dc1,c2|S1,3(1)=0becauseλ2=0;dc1,c2|S1,3(2)=|S1,3(2)|=0becauseλ2=1;dc1,c2|S1,3(3)=0becauseλ2=0;dc1,c2|S1,3(4)=|S1,3(4)|=2becauseλ2=1, (46)

and

dc1,c4|S1,3(1)=0becauseλ4=0;dc1,c4|S1,3(2)=0becauseλ4=0;dc1,c4|S1,3(3)=|S1,3(3)|=1becauseλ4=1;dc1,c4|S1,3(4)=|S1,3(4)|=2becauseλ4=1. (47)

It should be emphasized that S1,j(m) in this paper is defined differently from that in [1]. While the one defined in [1] partitions S1,j only according to codewords with indices less than j, the one defined in this paper considers all other M2 codewords in the partition manipulation, and hence the order of codewords becomes irrelevant.

Next, to decompose Tj|1, we further define a sequence of incremental sets:

S1,j(m)h=1mS1,j(h),m[2M2], (48)

and set S1,j(0). Let 1,j|S1,j| and 1,j(m)|S1,j(m)| respectively denote the sizes of S1,j and S1,j(m) and note that 0=1,j(0)1,j(1)1,j(2)1,j(2M2)=1,j.

The idea behind the partition of Tj|1 into 1,j subsets, indexed by k[1,j1]{0}, is as follows. Pick one ynTj|1. We start by examining whether d(c1,yn|S1,j(1)) is strickly less than 1,j(1)1. If the answer is negative, we continue examining whether d(c1,yn|S1,j(2)) is strictly less than 1,j(2)1. Proceed until we reach the smallest m such that d(c1,yn|S1,j(m))<1,j(m)1 holds. Setting k to be equal to k=d(c1,yn|S1,j(m)), we assign this yn to the subset Tj|1(k). Notably, there exists no such number m[2M2] that satisfies d(c1,yn|S1,j(m))<1,j(m)1 if and only if d(c1,yn|S1,j)=1,j1; in this case, we find the smallest m satisfying S1,j(m)=S1,j and assign this element to Tj|1(1,j1) as d(c1,yn|S1,j(m))=1,j1. For ease of describing the above algorithmic partition process, we introduce a mapping from k[1,j1]{0} to m[2M2] as follows:

ηkminm[2M2]:k<1,j(m)1,0k<1,j1;minm[2M2]:k=1,j(m)1,k=1,j1. (49)

We can see that for 0k<1,j1, we have 1,j(ηk1)1k<1,j(ηk)1. Therefore, if ynTj|1 is assigned to Tj|1(k) for some k<1,j1, we must have

1,j(ηk1)1dc1,yn|S1,j(ηk1)dc1,yn|S1,j(ηk)=k<1,j(ηk)1. (50)

On the other hand, if ynTj|1 is collected in Tj|1(1,j1), then S1,j(ηk)=S1,j and

1,j(ηk1)1dc1,yn|S1,j(ηk1)dc1,yn|S1,j(ηk)=1,j1. (51)

A formal definition of Tj|1(k) is given next, where the corresponding subsets Nj|1(k) of Nj|1 are also introduced.

Definition 3.

Define for k=0, 1, …, 1,j1,

{(52a)Tj|1(k)ynTj|1:1,j(ηk1)1dc1,yn|S1,j(ηk1)anddc1,yn|S1,j(ηk)=k;(52b)Nj|1(k)ynNj|1:1,j(ηk1)=dc1,yn|S1,j(ηk1)anddc1,yn|S1,j(ηk)=k+1,

where ηk is defined in (49).

With Definition 3, we have the following proposition.

Proposition 3.

For non-empty Tj|1, the following two properties hold.

  • (i) 

    {Tj|1(k)}k[1,j1]{0} forms a partition of Tj|1;

  • (ii) 

    {Nj|1(k)}k[1,j1]{0} is a collection of disjoint subsets of Nj|1.

Proof. 

It can be seen from the definitions of {Tj|1(k)}k[1,j1]{0} and {Nj|1(k)}k[1,j1]{0} that they are collections of mutually disjoint subsets of Tj|1 and Nj|1, respectively. It remains to argue that every element in Tj|1 belongs to Tj|1(k) for some k[1,j1]{0}. Noting that the element yn in Tj|1 satisfies d(c1,yn|S1,j)1,j1, we differentiate two cases: d(c1,yn|S1,j)1,j2 and d(c1,yn|S1,j)=1,j1. For the former case, d(c1,yn|S1,j(m))<1,j(m)1 must hold for m=ηk; hence, this yn will be contained in Tj|1(k). For the latter case, yn will be included in Tj|1(1,j1). The proof is thus completed. □

In light of Proposition 3, we can apply the ratio-sum inequality to obtain

PXn,Yn(c1,Tj|1)PXn,Yn(c1,Nj|1)k=0:Tj|1(k)1,j1PXn,Ync1,Tj|1(k)k=0:Tj|1(k)1,j1PXn,Ync1,Nj|1(k) (53)
maxk[1,j1]{0}:Tj|1(k)PXn,Ync1,Tj|1(k)PXn,Ync1,Nj|1(k). (54)

We continue to construct a fine partition of Tj|1(k) and the corresponding disjoint subsets of Nj|1(k) in Proposition 4 after giving the next definition.

Definition 4.

Define for unTj|1(k),

{(55a)Tj|1(un;k)ynTj|1(k):dun,yn|S1,j(ηk)c=0;(55b)Nj|1(un;k)ynNj|1(k):dun,yn|(S1,j(ηk))c=0,

where ηk is given in (49).

Note from Definition 4 that for one element un in non-empty Tj|1(k), we can find a group of elements that have identical bit components to un with indices in (S1,j(ηk))c. We denote this group as Tj|1(un;k). We continue this grouping manipulation until all elements in Tj|1(k) are exhausted as summarized below.

Proposition 4.

For non-empty Tj|1(k), there exists a representative subset Uj|1(k)Tj|1(k) such that the following two properties hold.

  • (i) 

    Tj|1(un;k)unUj|1(k) forms a (non-empty) partition of Tj|1(k);

  • (ii) 

    Nj|1(un;k)unUj|1(k) is a collection of (non-empty) disjoint subsets of Nj|1(k).

Since the above proposition can be self-validated via its sequential selection manipulation of each un from Tj|1(k), we omit the proof. Interested readers can find the details in [1] (Section III-C).

From Proposition 4, using again the ratio-sum inequality, we obtain that for non-empty Tj|1(k),

PXn,Ync1,Tj|1(k)PXn,Ync1,Nj|1(k)unUj|1(k)PXn,Ync1,Tj|1(un;k)unUj|1(k)PXn,Ync1,Nj|1(un;k) (56)
maxunUj|1(k)PXn,Ync1,Tj|1(un;k)PXn,Ync1,Nj|1(un;k). (57)

Noting that the above result can be similarly conducted for general i>1, we combine (42), (54) and (57) to conclude that

δnbn2maxi[M]andj[M]\{i}:Tj|imaxk[i,j1]{0}:Tj|i(k)maxunUj|i(k)PXn,Ynci,Tj|i(un;k)PXn,Ynci,Nj|i(un;k). (58)

The final task is to evaluate PXn,Ynci,Tj|i(un;k)/PXn,Ynci,Nj|i(un;k) in order to characterize a linear upper bound for δn/bn.

3.4. Characterization of a Linear Upper Bound for δn/bn

We again focus on i=1 with c1 being the all-zero codeword for simplicity. The definitions of Tj|1(un;k) in (55a) and Nj|1(un;k) in (55b) indicate that when dealing with the ratio PXn,Yn(c1,Tj|1(un;k))/PXn,Yn(c1,Nj|1(un;k)), we only need to consider those bits with indices in S1,j(ηk) because the remaining bits of all tuples in Tj|1(un;k) and Nj|1(un;k) have identical values as un. Note that all |Tj|1(un;k)| elements in Tj|1(un;k) have exactly k ones with indices in S1,j(ηk), and all |Nj|1(un;k)| elements in Nj|1(un;k) have exactly k+1 ones with indices in S1,j(ηk), we can immediately infer that

PXn,Yn(c1,Tj|1(un;k))PXn,Yn(c1,Nj|1(un;k))=PXn(c1)·PYn|c1(Tj|1(un;k)|c1)PXn(c1)·PYn|c1(Nj|1(un;k)|c1)=(1p)p·|Tj|1(un;k)||Nj|1(un;k)|. (59)

The cardinalities of Tj|1(un;k) and Nj|1(un;k) then decide the ratio in (59) as verified in the next proposition, based on which the proof of Theorem 2 can be completed from (58).

Proposition 5.

For unTj|1(k), we have

PXn,Yn(c1,Tj|1(un;k))PXn,Yn(c1,Nj|1(un;k))(1p)pn. (60)

Proof. 

Recall from (22a), (52a) and (55a) that ynTj|1(un;k)Tj|1(k)Tj|1 if and only if

{(61a)PXn,Yn(c1,yn)=PXn,Yn(cj,yn)=maxh[M]\{1}PXn,Yn(x(h)n,yn)andd(c1,yn|S1,j)<|Si,j|;(61b)1,j(ηk1)1dc1,yn|S1,j(ηk1)anddc1,yn|S1,j(ηk)=k;(61c)dun,yn|(S1,j(ηk))c=0.

Thus, the number of elements in Tj|1(un;k) is exactly the number of channel outputs yn fulfilling the above three conditions. We then examine the number of yn satisfying (61b) and (61c). Noting that these yn have either 1,j(ηk1)1 ones or 1,j(ηk1) ones with indices in S1,j(ηk1), we know that there are at most

1,j(ηk1)1,j(ηk1)1j(ηk)1,j(ηk1)k(1,j(ηk1)1)+1,j(ηk1)1,j(ηk1)j(ηk)1,j(ηk1)k1,j(ηk1) (62)

of yn tuples satisfying (61b) and (61c). Disregarding (61a), we get that the number of elements in Tj|1(un;k) is upper-bounded by (62).

On the other hand, from (22b), (52b) and (55b), we obtain that wnNj|1(un;k)Nj|1(k)Nj|1 if and only if

{(63a)PXn,Yn(c1,wn)·q2=PXn,Yn(cj,wn);(63b)PXn,Yn(c1,wn)·q2PXn,Yn(cr,wn)forr[j1]\{1};(63c)1,j(ηk1)=dc1,wn|S1,j(ηk1)anddc1,wn|S1,j(ηk)=k+1;(63d)dun,wn|(S1,j(ηk))c=0.

We then claim that any wn satisfying (63c) and (63d) directly validate (63a) and (63b). Note that the validity of the claim, which we prove in Appendix B, immediately implies that the number of elements in Nj|1(un;k) can be determined by (63c) and (63d), and hence

|Nj|1(un;k)|=j(ηk)1,j(ηk1)k+11,j(ηk1). (64)

Under this claim, (62) and (64) result in

|Tj|1(un;k)||Nj|1(un;k)|1,j(ηk1)1,j(ηk1)1j(ηk)1,j(ηk1)k(1,j(ηk1)1)+1,j(ηk1)1,j(ηk1)j(ηk)1,j(ηk1)k1,j(ηk1)j(ηk)1,j(ηk1)k+11,j(ηk1) (65)
=1,j(ηk1)+k+11,j(ηk1)j(ηk)k (66)
1,j(ηk1)+j(ηk)1,j(ηk1)1 (67)
n, (68)

where (67) holds because kj(ηk)1 by (49), and (68) follows from 1,j(ηk)1,jn. The proof of the proposition is thus completed by (59) and (68). □

4. Conclusions

In this paper, we analyzed the error probability of block codes sent over the memoryless BSC under an arbitrary (not necessarily uniform) input distribution and used in conjunction with (optimal) MAP decoding. We showed that decoder ties do not affect the error exponent of the probability of error, thus extending a similar result recently established in [1] for uniformly distributed channel inputs. This result was obtained by proving that the relative deviation of the error probability from the probability of error grows no more than linearly in blocklength when no MAP decoding ties occur, directly implying that decoder ties have only a sub-exponential effect on the error probability as blocklength grows without bound. Future work includes further extending this result for more general channels used under arbitrary input statistics, such as non-binary symmetric channels (Note that the result of Theorem 1 can be extended for non-binary (q-ary, q>2) codes sent over q-ary symmetric memoryless channels under a uniform input distribution; see [25] (Theorem 2).) and binary non-symmetric channels. Studying how to sharpen the upper bound derived in (10) for “sufficiently good” codes as highlighted in Remark 1 and for codes with small blocklengths are other worthwhile future directions.

Appendix A. Supplement to Example 1

Under distribution

PX4(c1)=q22+q2+q2 (A1)
PX4(c2)=PX4(c3)=12+q2+q2 (A2)
PX4(c4)=q22+q2+q2 (A3)

over the code C4={c1,c2,c3,c4}={0000,0101,0110,0111}, we obtain:

T1=y4{0,1}4:PX4,Y4(c1,y4)=maxr[4]\{1}PX4,Y4(cr,y4) (A4)
=y4{0,1}4:PX4(c1)qd(c1,y4)=maxPX4(c2)qd(c2,y4),PX4(c3)qd(c3,y4),PX4(c4)qd(c4,y4) (A5)
=y4{0,1}4:d(c1,y4)2=mind(c2,y4),d(c3,y4),d(c4,y4)+2 (A6)
=0101,0110,0111,1101,1110,1111, (A7)

where (A5) follows from (25), and

Tj|1y4T1:j=minrI1(y4):d(c1,y4|S1,r)<|S1,r|r=forj=2,3,4, (A8)

and

Tj|1y4T1\h[4]\{1}Th|1:j=minrI1(yn)r (A9)
={0101,0111,1101,1111},j=2;{0110,1110},j=3;,j=4. (A10)

The above derivations are verified via Table A1. Continuing with the same setting, we obtain

T2=y4{0,1}4:PX4,Y4(c2,y4)=maxr[4]\{2}PX4,Y4(cr,y4) (A11)
=y4{0,1}4:PX4(c2)qd(c2,y4)=maxPX4(c1)qd(c1,y4),PX4(c3)qd(c3,y4),PX4(c4)qd(c4,y4) (A12)
=y4{0,1}4:d(c2,y4)=mind(c1,y4)2,d(c3,y4),d(c4,y4)+2 (A13)
=0101,0111,1101,1111, (A14)
Tj|2y4T2:j=minrI2(y4):d(c2,y4|S2,r)<|S2,r|r=T2,j=1;,j=3,4 (A15)

and

Tj|2y4T2\h[4]\{1}Th|2:j=minrI2(yn)r=forj=1,3,4, (A16)

where the above derivations are also confirmed via Table A1. Based on Table A1, we further have

T3=y4{0,1}4:PX4,Y4(c3,y4)=maxr[4]\{3}PX4,Y4(cr,y4) (A17)
=y4{0,1}4:PX4(c3)qd(c3,y4)=maxPX4(c1)qd(c1,y4),PX4(c2)qd(c2,y4),PX4(c4)qd(c4,y4) (A18)
=y4{0,1}4:d(c3,y4)=mind(c1,y4)2,d(c2,y4),d(c4,y4)+2 (A19)
=0110,0111,1110,1111, (A20)
Tj|3y4T3:j=minrI3(y4):d(c3,y4|S3,r)<|S3,r|r=T3,j=1;,j=2,4, (A21)

and

Tj|3y4T3\h[4]\{3}Th|3:j=minrI3(yn)r=forj=1,2,4. (A22)

Furthermore, we establish from Table A1 that

T4=y4{0,1}4:PX4,Y4(c4,y4)=maxr[4]\{4}PX4,Y4(cr,y4) (A23)
=y4{0,1}4:PX4(c4)qd(c4,y4)=maxPX4(c1)qd(c1,y4),PX4(c2)qd(c2,y4),PX4(c3)qd(c3,y4) (A24)
=y4{0,1}4:d(c4,y4)+2=mind(c1,y4)2,d(c2,y4),d(c3,y4) (A25)
=, (A26)
Tj|4y4T4:j=minrI4(y4):d(c4,y4|S4,r)<|S4,r|r=forj=1,2,3, (A27)

and

Tj|4y4T4\h[4]\{4}Th|4:j=minrI4(yn)r=forj=1,2,3. (A28)

After summarizing all sets derived above in Table A2, we remark that

T2|1T1|2andPX4,Y4(c1,y4)=PX4,Y4(c2,y4)foreveryy4T2|1;T3|1T1|3andPX4,Y4(c1,y4)=PX4,Y4(c2,y4)foreveryy4T3|1. (A29)

Note that {Tj|i}i[M],j[M]\{i} are disjoint as confirmed in Remark 4 such that every element ynTj|i appears only once in the following summation:

i[4]j[4]\{i}PX4,Y4ci,Tj|i=PX4,Y4c1,0101+PX4,Y4c1,0110+PX4,Y4c1,0111+PX4,Y4c1,1101+PX4,Y4c1,1110+PX4,Y4c1,1111 (A30)
=p4q22+q2+q2q2+q2+q+q+q+1 (A31)
=p4q22+q2+q22q2+3q+1. (A32)

Additionally,

i[4]j[4]\{i}PX4,Y4ci,Tj|i=PX4,Y4c2,0101+PX4,Y4c2,0111+PX4,Y4c2,1101+PX4,Y4c2,1111+PX4,Y4c3,0110+PX4,Y4c3,0111+PX4,Y4c3,1110+PX4,Y4c3,1111 (A33)
=p42+q2+q2q4+q3+q3+q2+q4+q3+q3+q2 (A34)
=i[4]j[4]\{i}PX4,Y4ci,Tj|i+p4q22+q2+q2q+1. (A35)

Finally, we have

N1=y4{0,1}4:PX4(c1)qd(c1,y4)<maxPX4(c2)qd(c2,y4),PX4(c3)qd(c3,y4),PX4(c4)qd(c4,y4) (A36)
=y4{0,1}4:d(c1,y4)2>mind(c2,y4),d(c3,y4),d(c4,y4)+2 (A37)
=, (A38)
N2=y4{0,1}4:d(c2,y4)>mind(c1,y4)2,d(c3,y4),d(c4,y4)+2={0,1}4\T2, (A39)
N3=y4{0,1}4:d(c3,y4)>mind(c2,y4),d(c1,y4)2,d(c4,y4)+2={0,1}4\T3, (A40)
N4=y4{0,1}4:d(c4,y4)+2>mind(c2,y4),d(c3,y4),d(c1,y4)2={0,1}4, (A41)
Nj|1=forj=2,3,4, (A42)
Nj|2={y4N2:PX4,Y4(c2,y4)·q=PX4,Y4(cj,y4)·1q
andPX4,Y4(cj,y4)PX4,Y4(cr,y4)forr[j1]\{2}}={y4N2:PX4(c2)qd(c2,y4)1=PX4(cj)qd(cj,y4)+1andPX4(cj)qd(cj,y4)PX4(cr)qd(cr,y4) (A43)
forr[j1]\{2}} (A44)
=y4N2:PX4(c2)qd(c2,y4)1=PX4(c1)qd(c1,y4)+1,j=1;y4N2:PX4(c2)qd(c2,y4)1=PX4(c3)qd(c3,y4)+1andPX4(c3)qd(c3,y4)PX4(c1)qd(c1,y4),j=3;{y4N2:PX4(c2)qd(c2,y4)1=PX4(c4)qd(c4,y4)+1andPX4(c4)qd(c4,y4)PX4(cr)qd(cr,y4)forr[3]\{2}},j=4 (A45)
=y4N2:d(c2,y4)=d(c1,y4),j=1;y4N2:d(c2,y4)=d(c3,y4)+2andd(c2,y4)d(c1,y4),j=3;{y4N2:d(c2,y4)=d(c4,y4)+4,d(c2,y4)d(c1,y4)andd(c2,y4)d(c3,y4)+2},j=4
=0001,0100,0011,0110,1001,1100,1011,1110,j=1;0010,1010,j=3;,j=4, (A46)
Nj|3={y4N3:PX4,Y4(c3,y4)·q=PX4,Y4(cj,y4)·1q
andPX4,Y4(cj,y4)PX4,Y4(cr,y4)forr[j1]\{3}}={y4N3:PX4(c3)qd(c3,y4)1=PX4(cj)qd(cj,y4)+1andPX4(cj)qd(cj,y4)PX4(cr)qd(cr,y4) (A47)
forr[j1]\{3}} (A48)
=y4N3:PX4(c3)qd(c3,y4)1=PX4(c1)qd(c1,y4)+1,j=1;y4N3:PX4(c3)qd(c3,y4)1=PX4(c2)qd(c2,y4)+1andPX4(c2)qd(c2,y4)PX4(c1)qd(c1,y4),j=2;{y4N3:PX4(c3)qd(c3,y4)1=PX4(c4)qd(c4,y4)+1andPX4(c4)qd(c4,y4)PX4(cr)qd(cr,y4)forr[3]\{3}},j=4=y4N3:d(c3,y4)=d(c1,y4),j=1;y4N3:d(c3,y4)=d(c2,y4)+2andd(c3,y4)d(c1,y4),j=3;{y4N3:d(c3,y4)=d(c4,y4)+4,d(c3,y4)d(c1,y4)andd(c3,y4)d(c2,y4)+2},j=4 (A49)
=0010,0100,0011,0101,1010,1100,1011,1101,j=1;0001,1001,j=3;,j=4, (A50)
Nj|4={y4N4:PX4,Y4(c4,y4)·q=PX4,Y4(cj,y4)·1q
andPX4,Y4(cj,y4)PX4,Y4(cr,y4)forr[j1]\{4}}=y4N4:PX4(c4)qd(c4,y4)1=PX4(cj)qd(cj,y4)+1andPX4(cj)qd(cj,y4)PX4(cr)qd(cr,y4)forr[j1] (A51)
=y4N4:PX4(c4)qd(c4,y4)1=PX4(c1)qd(c1,y4)+1,j=1;y4N4:PX4(c4)qd(c4,y4)1=PX4(c2)qd(c2,y4)+1andPX4(c2)qd(c2,y4)PX4(c1)qd(c1,y4),j=2;{y4N4:PX4(c4)qd(c4,y4)1=PX4(c3)qd(c3,y4)+1andPX4(c3)qd(c3,y4)PX4(cr)qd(cr,y4)forr[2]},j=3 (A52)
=y4N4:d(c4,y4)=d(c1,y4)2,j=1;y4N4:d(c4,y4)=d(c2,y4)andd(c4,y4)d(c1,y4)2,j=2;{y4N4:d(c4,y4)=d(c3,y4),d(c4,y4)d(c1,y4)2andd(c4,y4)d(c2,y4)},j=3 (A53)
=forj=1,2,3. (A54)

Table A1.

Measures used in Example 1.

d(0000,y4)2 d(0101,y4) d(0110,y4) d(0111,y4)+2 I1(y4) I2(y4) I3(y4) I4(y4)
y4=0000 2 2 2 5
y4=0001 1 1 3 4
y4=0010 1 3 1 4
y4=0100 1 1 1 4
y4=1000 1 3 3 6
y4=0011 0 2 2 3
y4=0101 0 0 2 3 {2} {1}
y4=0110 0 2 0 3 {3} {1}
y4=1001 0 2 4 5
y4=1010 0 4 2 5
y4=1100 0 2 2 5
y4=0111 1 1 1 2 {2,3} {1,3} {1,2}
y4=1011 1 3 3 4
y4=1101 1 1 3 4 {2} {1}
y4=1110 1 3 1 4 {3} {1}
y4=1111 2 2 2 3 {2,3} {1,3} {1,2}

Table A2.

List of Ti, Ni, Tj|i, Tj|i and Nj|i for i[4] and j[4]\{i} in Example 1.

T1 0101,0110,0111,1101,1110,1111 N1
T2 0101,0111,1101,1111 N2 {0,1}4\T2
T3 0110,0111,1110,1111 N3 {0,1}4\T3
T4 N4 {0,1}4
T2|1 T2|1 {0101,0111,1101,1111} N2|1
T3|1 T3|1 {0110,1110} N3|1
T4|1 T4|1 N4|1
T1|2 {0101,0111,1101,1111} T1|2 N1|2 N2\{0000,0010,1000,1010}
T3|2 T3|2 N3|2 {0010,1010}
T4|2 T4|2 N1|2
T1|3 {0110,0111,1110,1111} T1|3 N1|3 N3\{0000,0001,1000,1001}
T2|3 T2|3 N2|3 {0001,1001}
T4|3 T4|3 N4|3
T1|4 T1|4 N1|4
T2|4 T2|4 N2|4
T3|4 T3|4 N3|4

Appendix B. The Proof of the Claim Supporting Proposition 5

We validate the claim that (63c) and (63d) imply (63a) and (63b) via the construction of an auxiliary vnNj|1(un;k) from unTj|1(un;k). This auxiliary vn will be defined differently according to whether dc1,un|S1,j(ηk1) equals 1,j(ηk1) or 1,j(ηk1)1 as follows.

  • (i) 
    d(c1,un|S1,j(ηk1))=1,j(ηk1): In this case, un has no zero components with indices in S1,j(ηk1). Moreover, d(c1,un|S1,j(ηk))=kj(ηk)1 indicates that
    unhasatleastonezerocomponentwithitsindexinS1,j(ηk)\S1,j(ηk1)=S1,j(ηk). (A55)
    Therefore, we flip arbitrarily a zero component of un with its index in S1,j(ηk) to construct a vn such that
    d(c1,vn)=d(c1,un)+1andd(cj,vn)=d(cj,un)1, (A56)
    which implies
    PXn,Yn(c1,vn)=PXn,Yn(c1,un)·1qandPXn,Yn(cj,vn)=PXn,Yn(cj,un)·q. (A57)

    Then, vn must fulfill (63a), (63c) and (63d) (with wn replaced by vn) as un satisfies (61a), (61b) and (61c). We next declare that vn also fulfills (63b) and will prove this declaration by contradiction.

    Proof of the declaration: Suppose there exists a r[j1]\{1} satisfying
    PXn,Yn(c1,vn)·q2=PXn,Yn(cr,vn). (A58)

    We then recall from (45) that d(c1,cr|S1,j(ηk)) is either 0 or |S1,j(ηk)|. Thus, (A58) can be disproved by differentiating two subcases: (1) d(c1,cr|S1,j(ηk))=0, and (2) d(c1,cr|S1,j(ηk))=|S1,j(ηk)| (Since 1,j(ηk1)<1,j(ηk) as can be seen from (50) and (51), we have |S1,j(ηk)|=1,j(ηk)1,j(ηk1)>0, i.e., S1,j(ηk) non-empty).

    In Subcase (1), vn that is obtained by flipping a zero component of un with index in S1,j(ηk) must satisfy d(c1,vn)=d(c1,un)+1 and d(cr,vn)=d(cr,un)+1, which is equivalent to
    PXn,Yn(c1,vn)·q=PXn,Yn(c1,un)andPXn|Yn(cr|vn)·q=PXn|Yn(cr|un). (A59)
    Then, (A58) implies
    PXn,Yn(c1,un)·q2=PXn,Yn(cr,un). (A60)
    Hence,
    PXn,Yn(c1,un)<PXn,Yn(cr,un)maxh[M]\{1}PXn,Yn(x(h)n,un). (A61)

    A contradiction to the fact that unTj|1(un;k) satisfies (61a) (with yn replaced by un) is obtained.

    In Subcase (2), we note that d(c1,cr|S1,j(ηk))=|S1,j(ηk)| implies S1,j(ηk)S1,r. Therefore, (A55) leads to
    d(c1,un|S1,r)<|S1,r|. (A62)
    The flipping manipulation on un results in d(c1,vn)=d(c1,un)+1 and d(cr,vn)=d(cr,un)1, which is equivalent to
    PXn,Yn(c1,vn)·q=PXn,Yn(c1,un)andPXn,Yn(cr,vn)=PXn,Yn(cr,un)·q. (A63)
    Therefore, (A58) implies
    PXn,Yn(c1,un)=PXn,Yn(cr,un), (A64)
    which together with maxh[M]\{1}PXn,Yn(ch,un)=PXn,Yn(c1,un) and (A62) result in unTr|1 because r<j. This contradicts unTj|1. Accordingly, vn must also fulfill (63b); hence, vnNj|1(un;k). This completes the proof of the declaration.
    With this auxiliary vn, we are ready to prove that every wn satisfying (63c) and (63d) also validates (63a) and (63b). Toward this end, we need to prove
    PXn,Yn(cr,wn)=PXn,Yn(cr,vn)forallr[M]. (A65)
    Note that
    {(A66a)dwn,vn|S1,j(ηk1)=0;(A66b)dcr,wn|S1,j(ηk)=dcr,vn|S1,j(ηk)forallr[M];(A66c)d(wn,vn|(S1,j(ηk1))c=0,
    where (A66a) holds because both vn and wn satisfy (63c), implying that all components of vn and wn with indices in S1,j(ηk1) are equal to one; (A66b) holds because when considering only those portions with indices in (non-empty) S1,j(ηk), cr gives either all ones or all zeros according to (45), and both wn and vn have exactly k+11,j(ηk1) ones according to (63c); and (A66c) is valid since both vn and wn satisfy (63d). Based on (A66a)–(A66c), we remark that d(cr,wn)=d(cr,vn) for all r[M], which implies PYn|Xn(wn|cr)=PYn|Xn(vn|cr) (equivalently, PXn,Yn(cr,wn)=PXn,Yn(cr,vn)) for all r[M]).
  • (ii) 

    d(c1,un|S1,j(ηk1))=1,j(ηk1)1: In this case, there is only one zero component of un with its index in S1,j(ηk1). Suppose the index of such zero component lie in S1,j(h)S1,j(ηk1), where hηk1. The flipping manipulation to un leads to vn, which has all one components with respect to S1,j(ηk1). Then, vn must fulfill (63a), (63c), and (63d) as un satisfies (61a), (61b), and (61c). With the components of cr with respect to (non-empty) S1,j(h) being either all zeros or all ones, the same contradiction argument between (A58) and (A64), with ηk replaced by h, can disprove the validity of (A58) for this vn and for any r[j1]\{1}. Therefore, vn also fulfills (63b), implying vnNj|1(un;k). With this auxiliary vn, we can again verify (A66a)–(A66c) via the same argument. The claim that wn satisfying (63c) and (63d) validates (63a) and (63b) is thus confirmed.

Author Contributions

Conceptualization, L.-H.C., P.-N.C. and F.A.; Formal analysis, L.-H.C.; Writing—original draft, L.-H.C.; Writing—review & editing, P.-N.C. and F.A. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

Funding Statement

The work of Ling-Hua Chang is supported by the Ministry of Science and Technology, Taiwan under Grant MOST 109-2221-E-155-035-MY3. The work of Po-Ning Chen is supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 110-2221-E-A49-024-MY3. The work of Fady Alajaji is supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada.

Footnotes

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

References

  • 1.Chang L.H., Chen P.N., Alajaji F., Han Y.S. Decoder Ties Do Not Affect the Error Exponent of the Memoryless Binary Symmetric Channel. IEEE Trans. Inf. Theory. 2022;68:3501–3510. doi: 10.1109/TIT.2022.3150597. [DOI] [Google Scholar]
  • 2.Shannon C.E., Gallager R.G., Berlekamp E.R. Lower bounds to error probability for coding on discrete memoryless channels—I. Inf. Control. 1967;10:65–103. doi: 10.1016/S0019-9958(67)90052-6. [DOI] [Google Scholar]
  • 3.Shannon C.E., Gallager R.G., Berlekamp E.R. Lower bounds to error probability for coding on discrete memoryless channels—II. Inf. Control. 1967;10:522–552. doi: 10.1016/S0019-9958(67)91200-4. [DOI] [Google Scholar]
  • 4.McEliece R.J., Omura J.K. An improved upper bound on the block coding error exponent for binary-input discrete memoryless channels. IEEE Trans. Inf. Theory. 1977;23:611–613. doi: 10.1109/TIT.1977.1055772. [DOI] [Google Scholar]
  • 5.Gallager R.G. Information Theory and Reliable Communication. Wiley; New York, NY, USA: 1968. [Google Scholar]
  • 6.Viterbi A.J., Omura J.K. Principles of Digital Communication and Coding. McGraw-Hill; New York, NY, USA: 1979. [Google Scholar]
  • 7.Csiszár I., Körner J. Information Theory: Coding Theorems for Discrete Memoryless Systems. Academic Press; New York, NY, USA: 1981. [Google Scholar]
  • 8.Blahut R. Principles and Practice of Information Theory. Addison-Wesley Longman Publishing Co., Inc.; Albany, NY, USA: 1988. [Google Scholar]
  • 9.Barg A., McGregor A. Distance distribution of binary codes and the error probability of decoding. IEEE Trans. Inf. Theory. 2005;51:4237–4246. doi: 10.1109/TIT.2005.858977. [DOI] [Google Scholar]
  • 10.Haroutunian E.A., Haroutunian M.E., Harutyunyan A.N. Foundations and Trends in Communications and Information Theory. Volume 4. Now Publishers Inc.; Delft, The Netherlands: 2007. Reliability Criteria in Information Theory and in Statistical Hypothesis Testing; pp. 97–263. [Google Scholar]
  • 11.Dalai M. Lower bounds on the probability of error for classical and classical-quantum channels. IEEE Trans. Inf. Theory. 2013;59:8027–8056. doi: 10.1109/TIT.2013.2283794. [DOI] [Google Scholar]
  • 12.Burnashev M.V. On the BSC reliability function: Expanding the region where it is known exactly. Probl. Inf. Transm. 2015;51:307–325. doi: 10.1134/S0032946015040018. [DOI] [Google Scholar]
  • 13.Csiszár I. Joint source-channel error exponent. Probl. Control. Inf. Theory. 1980;9:315–328. [Google Scholar]
  • 14.Zhong Y., Alajaji F., Campbell L. On the joint source-channel coding error exponent for discrete memoryless systems. IEEE Trans. Inf. Theory. 2006;52:1450–1468. doi: 10.1109/TIT.2006.871608. [DOI] [Google Scholar]
  • 15.Alajaji F., Phamdo N., Fuja T. Channel codes that exploit the residual redundancy in CELP-encoded speech. IEEE Trans. Speech Audio Process. 1996;4:325–336. doi: 10.1109/89.536927. [DOI] [Google Scholar]
  • 16.Xu W., Hagenauer J., Hollmann J. Joint source-channel decoding using the residual redundancy in compressed images; Proceedings of the Proceedings of the International Conference on Communications; Washington, DC, USA. 25–28 February 1996; pp. 142–148. [Google Scholar]
  • 17.Hagenauer J. Source-controlled channel decoding. IEEE Trans. Commun. 1995;43:2449–2457. doi: 10.1109/26.412719. [DOI] [Google Scholar]
  • 18.Goertz N. Joint Source-Channel Coding of Discrete-Time Signals with Continuous Amplitudes. World Scientific; Singapore: 2007. [Google Scholar]
  • 19.Duhamel P., Kieffer M. Joint Source-Channel Decoding: A Cross-Layer Perspective with Applications in Video Broadcasting. Academic Press; Cambridge, MA, USA: 2009. [Google Scholar]
  • 20.Fresia M., Pérez-Cruz F., Poor H.V., Verdú S. Joint source and channel coding. IEEE Signal Process. Mag. 2010;27:104–113. doi: 10.1109/MSP.2010.938080. [DOI] [Google Scholar]
  • 21.Alajaji F., Chen P.N. An Introduction to Single-User Information Theory. Springer; Berlin/Heidelberg, Germany: 2018. [Google Scholar]
  • 22.Chang L.H., Chen P.N., Alajaji F., Han Y.S. The asymptotic generalized Poor-Verdú bound achieves the BSC error exponent at zero rate; Proceedings of the IEEE International Symposium on Information Theory; Los Angeles, CA, USA. 21–26 June 2020. [Google Scholar]
  • 23.Chen P.N., Alajaji F. A generalized Poor-Verdú error bound for multihypothesis testings. IEEE Trans. Inf. Theory. 2012;58:311–316. doi: 10.1109/TIT.2011.2171533. [DOI] [Google Scholar]
  • 24.Poor H.V., Verdú S. A lower bound on the probability of error in multihypothesis testing. IEEE Trans. Inf. Theory. 1995;41:1992–1994. doi: 10.1109/18.476322. [DOI] [Google Scholar]
  • 25.Chang L.H., Chen P.N., Alajaji F., Han Y.S. Tightness of the asymptotic generalized Poor-Verdú error bound for the memoryless symmetric channel. arXiv. 20202007.04080v1 [Google Scholar]

Articles from Entropy are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)

RESOURCES