Abstract
Although the entropy of a given signal-type waveform is technically zero, it is nonetheless desirable to use entropic measures to quantify the associated information. Several such prescriptions have been advanced in the literature but none are generally successful. Here, we report that the Fourier-conjugated ‘total entropy’ associated with quantum-mechanical probabilistic amplitude functions (PAFs) is a meaningful measure of information in non-probabilistic real waveforms, with either the waveform itself or its (normalized) analytic representation acting in the role of the PAF. Detailed numerical calculations are presented for both adaptations, showing the expected informatic behaviours in a variety of rudimentary scenarios. Particularly noteworthy are the sensitivity to the degree of randomness in a sequence of pulses and potential for detection of weak signals.
Keywords: information theory, information entropy, signal analysis
1. Introduction
Let α=α(t) be a real, square-integrable function of a real independent variable, t. The following analysis is restricted to cases where t is one-dimensional; the generalization for arbitrary dimensionality follows readily therefrom. For convenience, let the L2-norm, or energy, of α, written here as ∥α∥, be unity. Finally, suppose that α represents some informatic signal such as an audio waveform or digital transmission. It is appropriate to consider t to represent a time-like variable throughout, but the analysis is generic.
In the conventional probabilistic paradigm of information theory, it is meaningful to attribute a quantity of entropy to the process by which signals such as α are generated in a given system but not to any particular signal [1]. Entropy, however, is fundamentally a measure of both indeterminacy and information, and it is therefore appropriate to seek some measure of entropy to quantify the information intrinsic to a given α. The difficulty in such an endeavour is that standard entropy measures are defined in terms of probabilistic distributions or probabilistic density functions (PDF's). In principle, some special treatment is therefore required in order to adapt measures of entropy for use as measures of information in non-probabilistic waveforms like α. Existing approaches generally rely on some secondary distribution or other probabilistic construct to which a measure of indeterminacy may be attributed. See for instance [2–4]. Although a variety of such well-formulated methods have been introduced, all feature some non-trivial dependence on model or parametrization. Consequently, there is no single generally successful prescription for the informatic entropy of a waveform like α.
There exist additional difficulties in formulating a rigorous, general entropic measure of information that are inherent to the Shannon-type measures themselves. For instance, consider the differential Shannon entropy, which is defined here generally such that
1.1 | H{λ} ≡ −∫ λ(t) ln λ(t) dt,
for any given λ=λ(t) and for any independent variable. In equation (1.1) and throughout, z ln z is defined to vanish for z=0. Integrals with no explicit limits implicitly span the real axis. Naturally, equation (1.1) is meaningful only when λ is a meaningful PDF, which restricts us to real, non-negative, non-singular λ such that
1.2 | ∫ λ(t) dt = 1.
In the probabilistic context H{λ} is meant to measure the indeterminacy in a continuous, real variable whose outcomes are governed by a PDF, λ. Interpreted more broadly, in abstraction from any particular variable, H{λ} is an intrinsic informatic attribute of λ. Strictly speaking, however, H{λ} alone is a somewhat unsatisfactory measure in both cases because the argument of its logarithm is not generally dimensionless [5]. Consequently, H{λ} is sensitive to the unit of t. Specifically, for any real, positive b we may express λ equivalently as a function λ′=λ′(t′) of t′=t/b such that λ′(t′) dt′=λ(t) dt, hence λ′=bλ. The corresponding entropy, H{λ′}, reduces to H{λ}−ln b, which is to say that H{λ} is not invariant under a simple scale-transformation in t.
Moreover, H{λ} may be negative and is unbounded. While it is reasonable to expect the entropy associated with some infinitely expansive PDF to be infinitely large, the negativity and lack of a lower bound pose particularly challenging problems for H{λ} as a meaningful measure per se for an arbitrary λ.
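Both issues are easy to exhibit numerically. The following minimal Python sketch (the Gaussian form, the widths and the grid are merely illustrative choices, not taken from any particular application) estimates H{λ} by a Riemann sum for two Gaussian PDFs; narrowing the density by a factor b lowers the entropy by ln b, and a sufficiently narrow density yields a negative value.

```python
import numpy as np

def shannon_entropy(lam, dt):
    """Riemann-sum estimate of H{lambda} = -integral lambda * ln(lambda) dt."""
    lam = lam[lam > 0]                      # z ln z is taken to vanish at z = 0
    return -np.sum(lam * np.log(lam)) * dt

dt = 1e-4
t = np.arange(-5.0, 5.0, dt)
for width in (1.0, 0.05):
    lam = np.exp(-t**2 / (2.0 * width**2)) / (width * np.sqrt(2.0 * np.pi))
    print(width, shannon_entropy(lam, dt))
# width 1.0  ->  ~ +1.419  (0.5 * ln(2*pi*e))
# width 0.05 ->  ~ -1.577  (lowered by ln 20 relative to width 1.0, and negative)
```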
In the quantum mechanical context, where a given PDF is stated as the squared absolute magnitude of a probabilistic amplitude function (PAF), there exists a natural resolution to the problems cited with respect to a single measure of the form H{λ}. Specifically, let ψ=ψ(t) be a generally complex, square-integrable function of t with unit energy. As such the squared absolute magnitude of ψ, defined here as
1.3 | ρ(t) ≡ |ψ(t)|²,
is generally meaningful as a PDF and ψ is accordingly meaningful as a PAF. The Fourier counterpart to ρ is the spectral density,
1.4 | σ(ω) ≡ |ψ̂(ω)|²,
where
1.5 | ψ̂(ω) = (1/√(2π)) ∫ ψ(t) e^{−iωt} dt
is the Fourier transform (FT) of ψ and ω∈ℜ is the independent variable conjugate to t. Throughout this work, ŷ=ŷ(ω) is the FT of any y=y(t). As the ‘hat'-notation is inconvenient in some cases, let also ℱ{y} be the FT of any y. The inverse FT of ŷ is accordingly written here as ℱ^{−1}{ŷ}, having the explicit form
1.6 | y(t) = ℱ^{−1}{ŷ}(t) = (1/√(2π)) ∫ ŷ(ω) e^{iωt} dω.
Together ρ and σ form a dimensionless two-variable PDF
1.7 | f(t,ω) ≡ ρ(t) σ(ω),
with
1.8 | ∬ f(t,ω) dt dω = 1.
The total differential Shannon entropy associated with f is
1.9 | H{f} = −∬ f(t,ω) ln f(t,ω) dt dω = H{ρ} + H{σ}.
As f follows from a given ψ, it is appropriate to interpret the measure in equation (1.9) as an attribute thereof. For convenience, let us therefore define
1.10 | S{ψ} ≡ H{ρ} + H{σ}
for a given ψ.
In order to examine the sensitivity of S{ψ} to arbitrary scale-transformations of the form t→t′=t/b, we must introduce also the implicit conjugate transformation ω→ω′, where ω′=bω. Let σ′=σ′(ω′) be defined such that σ′ dω′=σ dω. It follows directly from the appropriate substitutions in the explicit form of H{σ′} that H{σ′}=H{σ}+ln b, which exactly offsets the −ln b shift in H{ρ′} noted above for any rescaled density. Consequently, S{ψ} is scale-invariant and thus insensitive to the unit with which t is measured. In addition to being unit-independent, S{ψ} is subject to the lower bound
1.11 | S{ψ} ≥ 1 + ln π
in one dimension [6]. Furthermore, S{ψ} has emerged as a fundamentally significant information-theoretic measure in quantum systems. See, for instance, [7–12].
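As a minimal numerical sketch of the measure (with an illustrative unit-width Gaussian PAF and an arbitrary grid; the FFT is scaled to mimic the symmetric 1/√(2π) convention of equation (1.5)), the total entropy may be estimated and compared with the lower bound of equation (1.11), which a Gaussian is known to saturate:

```python
import numpy as np

def total_entropy(psi, dt):
    """Riemann-sum estimate of S{psi} = H{rho} + H{sigma} for a sampled PAF."""
    n = psi.size
    rho = np.abs(psi) ** 2                                             # time-domain density
    sigma = np.abs(np.fft.fft(psi) * dt / np.sqrt(2.0 * np.pi)) ** 2   # spectral density
    dw = 2.0 * np.pi / (n * dt)                                        # conjugate grid spacing
    def H(p, d):
        p = p[p > 0]
        return -np.sum(p * np.log(p)) * d
    return H(rho, dt) + H(sigma, dw)

dt = 0.01
t = np.arange(-40.0, 40.0, dt)
psi = np.pi ** -0.25 * np.exp(-t ** 2 / 2.0)          # unit-energy Gaussian PAF
print(total_entropy(psi, dt))                         # ~ 2.1447
print(1.0 + np.log(np.pi))                            # lower bound, ~ 2.1447
```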
Although our nominal signal, α, has unit energy like ψ, it is not a PAF. Apart from being exclusively real, it simply does not represent any probabilistic process, nor does it necessarily represent any amplitude. In the conventional understanding, therefore, some secondary complex amplitude function must be constructed in association with α in order to adapt S{ψ} for waveforms like α. As in the case of secondary PDF's, such an endeavour generally introduces model-dependence and the need for some parametrization.
The purpose of this work, however, is to demonstrate that the total entropy in equation (1.10) is a sensitive, meaningful measure of information in non-probabilistic waveforms such as α without requiring any external models or parameters. Specifically, in addition to considering the simplest adaptation, S{α}, we consider S{ϕ}, where ϕ is the normalized analytic representation (AR) of a given α. Section 2 describes the normalized AR and the motivation for considering it as a characteristic amplitude function for a given α. Section 3 defines the relevant discrete quantities necessary for computation. Section 4 explores the informatic behaviours of S{α} and S{ϕ} with explicit numerical studies of their discrete analogues. Section 5 provides a brief summary and discussion of the results.
2. Analytic representation as an amplitude function
It follows from the realness of α, equation (1.5) and Euler's formula that α̂ is Hermitian, i.e. α̂(−ω)=α̂*(ω). In other words, half of the Fourier spectrum associated with α is informatically redundant. Eliminating the redundant contributions thereby produces a generally complex function that contains the same information as does α. The function so produced is proportional to the well-known AR of α, whose imaginary part is the Hilbert transform of α. According to the central thesis of the present work, the AR of a given signal, with the appropriate normalization, is a natural candidate for a PAF-like function characteristic of a given real waveform.
Before proceeding, it is instructive to review how the AR and the Hilbert transform (HT) are related generally. Let g=g(t) be a real, square-integrable function of t. By expressing g explicitly as the inverse FT of ĝ and exploiting the Hermitian symmetry of ĝ, we obtain
2.1 | g(t) = (1/√(2π)) ∫₀^∞ [ĝ(ω) e^{iωt} + ĝ*(ω) e^{−iωt}] dω,
where ĝ* is the complex conjugate of ĝ. For a given g, it is therefore natural to define
2.2 | γ(t) ≡ (1/√(2π)) ∫₀^∞ ĝ(ω) e^{iωt} dω,
such that equation (2.1) becomes
2.3 | g(t) = γ(t) + γ*(t) = 2 Re{γ(t)},
where Re{·} denotes the real part.
By construction γ represents the contributions to g from non-negative Fourier frequencies, whereas γ* represents the contributions from non-positive frequencies. The AR of a given g, written here as g_A, may be defined most simply as
2.4 | g_A(t) ≡ 2γ(t),
hence
2.5 | g(t) = Re{g_A(t)}.
It is also appropriate to define the AR of g in terms of its Fourier transform according to
2.6 | ĝ_A(ω) ≡ [1 + sgn(ω)] ĝ(ω).
The above equation ensures that the zero-frequency Fourier component contributes equally to γ and to its negative-frequency analogue, γ*.
Let h=h(t) be the imaginary part of a given g_A, hence
2.7 | h(t) = Im{g_A(t)} = −i[γ(t) − γ*(t)].
With γ expressed explicitly, substituting the dummy variable τ for t in the explicit form of ĝ, and with additional exploitation of the Hermitian symmetry of ĝ, equation (2.7) produces
2.8 | h(t) = (1/π) ∫₀^∞ ∫ g(τ) sin[ω(t − τ)] dτ dω.
Finally, integrating equation (2.8) with respect to ω in accordance with the Riemann–Lebesgue Lemma produces
2.9 | h(t) = (1/π) PV ∫ g(τ)/(t − τ) dτ,
where ‘PV' signifies the principal value of the integral. The right-hand side of equation (2.9) is identical to the Hilbert transform of g, written here as 𝓗{g}, where
2.10 | 𝓗{g}(t) ≡ (1/π) PV ∫ g(τ)/(t − τ) dτ
for any g.
Returning to the primary subject, suppose that a PAF is sought to represent a given α. In order to be a meaningful descriptor of α, the PAF should contain essentially the same information as does α and its absolute magnitude should envelop α. The above criteria are realized naturally through the analytic representation. In particular, due to the orthogonality between g and 𝓗{g}, the imaginary part of g_A is intrinsically in quadrature with the real part. Consequently, |g_A| naturally forms an envelope function for g. Furthermore, because the HT preserves the norm of square-integrable functions, we have ∥𝓗{g}∥=∥g∥. We therefore may readily scale the AR of α to create a unity-normalized analogue. Specifically, for a given α let
2.11 | ϕ(t) ≡ [α(t) + i 𝓗{α}(t)]/√2
be defined to represent the characteristic PAF associated with a given α. For convenience, we use ϕ=ϕ(t)=ϕ{α}=ϕ{α}(t). Note that we may express ϕ equivalently as
2.12 | ϕ(t) = (1/√π) ∫₀^∞ α̂(ω) e^{iωt} dω,
which is analogous to equation (2.2).
Associated with ϕ is the time-domain PDF-like function, |ϕ|². The corresponding spectral PDF is likewise |ϕ̂|², where
2.13 | ϕ̂(ω) = [1 + sgn(ω)] α̂(ω)/√2.
From a given α, it is thereby possible to construct the unitless, two-dimensional PDF,
2.14 | |ϕ(t)|² |ϕ̂(ω)|²,
which has the form of f.
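As a minimal computational sketch of this construction (with an arbitrary illustrative waveform; scipy.signal.hilbert computes the FFT-based analytic signal α + i𝓗{α}), ϕ is obtained by normalizing the analytic signal of a sampled, unit-energy α:

```python
import numpy as np
from scipy.signal import hilbert

dt = 0.01
t = np.arange(-8.0, 8.0, dt)
alpha = np.exp(-t**2 / 2.0) * np.sin(10.0 * t)        # illustrative real waveform
alpha /= np.sqrt(np.sum(alpha**2) * dt)               # enforce unit energy

analytic = hilbert(alpha)                             # alpha + i * H{alpha}
phi = analytic / np.sqrt(np.sum(np.abs(analytic)**2) * dt)

print(np.sum(np.abs(phi)**2) * dt)                    # ~ 1.0: |phi|^2 is a valid PDF
# Re(phi) is proportional to alpha, Im(phi) to its Hilbert transform, and |phi|
# traces the envelope of the waveform.
```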
As a rudimentary example suppose that we have
2.15 |
where the positive constants d and θ are external parameters, and
2.16 |
In principle, it is possible to obtain the associated PAF from equation (2.11) by calculating the HT of α directly, but the authors are unaware of any published analytical solution to 𝓗{α} for this case. We may, however, solve for ϕ alternatively by first finding the FT of α and then performing the integration in equation (2.12). With
2.17 |
we find
2.18 |
which implies
2.19 |
Owing to the properties of the error function, we have
2.20 |
for α of the form equation (2.15).
Figure 1 is a plot of the real and imaginary parts of ϕ for the case in which d=1 and θ=10. Note that the imaginary part is approximately proportional to the right-hand side of equation (2.20) in this case, as θd=10. Consequently, the corresponding PDF, |ϕ|2, very nearly recovers the asymptotic Gaussian envelope.
Figure 1.
Real and imaginary parts of ϕ, in solid and dashed lines, respectively, for α of the form equation (2.15), with d=1 and θ=10.
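The two routes to ϕ can also be cross-checked numerically. The sketch below assumes, purely for illustration, a Gaussian-windowed sinusoid consistent with the description of equation (2.15) (d=1, θ=10); it builds ϕ once from the Hilbert transform, as in equation (2.11), and once from the one-sided Fourier integral of equation (2.12), with α̂ evaluated by direct quadrature, and confirms that the two constructions agree to discretization accuracy.

```python
import numpy as np
from scipy.signal import hilbert

dt = 0.02
t = np.arange(-8.0, 8.0, dt)
d, theta = 1.0, 10.0
alpha = np.exp(-t**2 / (2.0 * d**2)) * np.sin(theta * t)     # assumed illustrative form
alpha /= np.sqrt(np.sum(alpha**2) * dt)

# Route 1: equation (2.11), via the FFT-based analytic signal.
phi_ht = hilbert(alpha)
phi_ht /= np.sqrt(np.sum(np.abs(phi_ht)**2) * dt)

# Route 2: equation (2.12), via direct quadrature of the one-sided Fourier integral.
dw = 0.02
w = np.arange(0.0, 30.0, dw)                                 # covers the positive-frequency lobe
alpha_hat = np.exp(-1j * np.outer(w, t)) @ alpha * dt / np.sqrt(2.0 * np.pi)
phi_ft = np.exp(1j * np.outer(t, w)) @ alpha_hat * dw / np.sqrt(np.pi)

print(np.max(np.abs(phi_ht - phi_ft)))                       # small: the two routes agree
```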
3. Discrete analogues
We may readily define the discrete analogue of S{ψ} in association with the discrete analogue of ψ. Specifically let the discrete amplitude,
3.1 | v ≡ (v₁, v₂, …, vₙ)
be an n-dimensional complex vector such that
3.2 | |v|² Δt = 1,
where |z|² signifies the usual ℓ2-norm of any complex vector z and Δt=Δt{v} is some positive real number representing implicitly the uniform interval between nodes on the t-axis at which v is evaluated. Because S{ψ} is insensitive to the unit of t we are free to put Δt=1 and normalize v accordingly. For clarity, however, it is useful to preserve Δt explicitly. Analogously to ρ we then define the discrete distribution
3.3 |
The discrete spectral amplitude associated with v is
3.4 |
where the discrete Fourier transform (DFT) of a given complex vector z is defined here generally such that
3.5 |
for k=1,…,n. Because the DFT does not generally preserve the norm of a vector, it is necessary to introduce the normalization factor
3.6 |
such that
3.7 |
where
3.8 |
The discrete analogue of σ is accordingly
3.9 |
Finally, analogously to S{ψ}, the total entropy associated with a given v is
3.10 |
where n=n{v}.
The characteristic amplitude in equation (2.11) and its Fourier transform may be defined analogously for discrete, or sampled, signals [13,14]. First, in analogy to α, let
3.11 | a ≡ (a₁, a₂, …, aₙ)
be an n-dimensional real vector such that
3.12 | |a|² Δt = 1
for some positive, real Δt. For convenience, and because it causes no confusion, we assign n=n{a} and Δt=Δt{a} henceforth. Let the discrete Hilbert transform (DHT) be defined generally such that the kth element of the DHT of a is
3.13 |
for k=1,…,n. As with the DFT, in general the DHT only approximately preserves the norm of a vector. Let us therefore define
3.14 |
such that
3.15 |
Finally, let us define
3.16 |
as the discrete analogue of ϕ.
It is worth noting that the discrete terms introduced above are essentially nothing more than the most elementary numerical approximations of their respective continuous analogues. They are meant to embody only those principles already embodied in the continuous terms.
It is also important to mention that calculation of the AR requires some care in the case of discrete or sampled signals. Specifically, consider some g with non-trivial compact support [t1,t2]. In general, the AR of g may be non-vanishing outside of [t1,t2]. In order to compute properly the AR for a discrete waveform, it is therefore necessary to ‘pad' the original waveform with a sufficient number of zeros. In all calculations presented in the following section, the number of added zeros is at least an order of magnitude greater than the size of the initial waveform, thereby ensuring precision.
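A minimal sketch of the padding step follows (the sizes and waveform are illustrative; whether the padded tail is subsequently retained or discarded is an implementation detail not spelled out here, so the full padded vector is kept, as in the entropy calculations of §4):

```python
import numpy as np
from scipy.signal import hilbert

n = 1024
t = np.linspace(-8.0, 8.0, n)
a = np.exp(-t**2 / 2.0) * np.sin(10.0 * t)     # illustrative compactly supported waveform
a /= np.linalg.norm(a)                         # unit energy under the Delta_t = 1 convention

n_total = 16 * n                               # padded length, more than 10x the original
a_padded = np.concatenate([a, np.zeros(n_total - n)])
u = hilbert(a_padded)                          # discrete AR computed on the padded grid
u /= np.linalg.norm(u)                         # discrete analogue of phi
```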
4. Waveform information
Because the total entropy is an effective measure of information in PAFs, and because ϕ is a complex, square-integrable function with unit energy that contains essentially the same information as does α, we propose that S{ϕ} is an appropriate, generally meaningful measure of information in real, non-probabilistic waveforms like α. This section validates the informatic utility of S{ϕ} with detailed computations of the discrete analogue, SD{u} in fundamental informatic scenarios.
Additionally, we present corresponding computations of the ostensibly naive quantity SD{a}, which is the discrete analogue of S{α}. Given that α and ϕ contain essentially the same information, if S{ϕ} is a meaningful measure of information then so too should be S{α}. Note, however, that we should not expect S{ϕ} and S{α} to be identical. In particular, the relationship between H{|α|²} and H{|ϕ|²} is not analytically reducible. In contrast, it follows directly from the explicit form of ϕ̂ that
4.1 | H{|ϕ̂|²} = H{|α̂|²} − ln 2.
We therefore expect S{ϕ} and S{α} to differ generally. Despite the expected differences, we find that the discrete analogues of S{ϕ} and S{α} are strongly correlated and that both are effective measures of information. In fact, the observed consistencies between SD{u} and SD{a} constitute a separate validation of the informatic relevance of the total entropy to non-probabilistic waveforms like α.
In each of the following calculations, the discrete vectors are padded such that the total number of points is 220, and the number of computationally introduced zeros is thus 220−n in each case. The varying number of computational zeros introduces only a negligible influence.
As a first test, let us examine the behaviours of SD{u} and SD{a} for a waveform consisting of randomly generated noise. Specifically, suppose that a has the form
4.2 | aₖ = C ν(μ, δ; k),
where ν(μ,δ;k) represents the kth sample of normally distributed noise with mean μ and variance δ, and C is a constant for a given randomly generated waveform, chosen such that a has unit energy. For convenience, we put Δt=1 here. Figure 2 shows the normalized entropies, SD{u} and SD{a}, as functions of the number of samples, n. Each data point represents the average among 10 independently randomized a. Note that both measures behave as if asymptotically approaching unity.
Figure 2.
Average normalized entropies, SD{u} and SD{a}, of n samples of normally distributed noise, plotted as functions of the number of samples in asterisks and circles, respectively.
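A sketch of how such a study might be arranged is given below (Δt=1, no zero-padding, arbitrary sample sizes and seed; the raw, unnormalized entropies are printed, so the output indicates only the growth trend underlying figure 2 rather than reproducing it):

```python
import numpy as np
from scipy.signal import hilbert

def total_entropy(v, dt=1.0):
    """t-domain plus omega-domain Shannon sums for a sampled waveform (cf. section 3)."""
    n = v.size
    rho = np.abs(v) ** 2
    sigma = np.abs(np.fft.fft(v) * dt / np.sqrt(2.0 * np.pi)) ** 2
    dw = 2.0 * np.pi / (n * dt)
    H = lambda p, d: -np.sum(p[p > 0] * np.log(p[p > 0])) * d
    return H(rho, dt) + H(sigma, dw)

rng = np.random.default_rng(0)
for n in (256, 1024, 4096, 16384):
    s_u, s_a = [], []
    for _ in range(10):                        # 10 independently randomized waveforms
        a = rng.standard_normal(n)
        a /= np.linalg.norm(a)                 # unit energy with Delta_t = 1
        u = hilbert(a)
        u /= np.linalg.norm(u)                 # discrete analogue of phi
        s_u.append(total_entropy(u))
        s_a.append(total_entropy(a))
    print(n, np.mean(s_u), np.mean(s_a))       # both entropies grow with n
```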
Let us next consider the complementary case of a correlated signal, such that a consists essentially of some regularly repeating sequence. As more and more sample points are added to such a waveform, the amount of added information decreases and the information contained in a asymptotically approaches some fixed quantity. As an example, let a have the form
4.3 | aₖ = B sin(ω0 tₖ),
where B is chosen for appropriate normalization. Figure 3 shows the entropies, SD{u} and SD{a}, for a of the form equation (4.3) for ω0=0.3 and tk=1,2,…,n. Note that both measures exhibit the expected asymptotic behaviour, but with each converging to a different value.
Figure 3.
Entropies, SD{u} and SD{a}, of n samples of a normalized sine wave of the form equation (4.3) with ω0=0.3, plotted as functions of the number of samples in asterisks and circles, respectively.
It is instructive in this case to show how the two Fourier-conjugated components constituting the total entropy behave individually. Figure 4 contains plots of the Fourier components of SD{u} for the same scenario shown in figure 3. Note how the t-domain and ω-domain components increase and decrease respectively, while their sum is asymptotically fixed, representing presumably the fixed information in the waveform. Analogous behaviour is observed in the components of SD{a}.
Figure 4.
Individual Fourier components of SD{u} along with SD{u} for scenario shown in figure 3. The t-domain contribution is plotted in diamonds and the ω-domain contribution is plotted in ‘+’. Their sum, the total entropy, is shown in asterisks, as in figure 3.
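The decomposition can be sketched in a few lines (Δt=1, no zero-padding, illustrative lengths, with B absorbed into the normalization); the t-domain and ω-domain contributions are returned separately so that their opposing trends and asymptotically fixed sum can be inspected directly:

```python
import numpy as np

def entropy_components(v, dt=1.0):
    """Return the t-domain and omega-domain contributions to the total entropy."""
    n = v.size
    rho = np.abs(v) ** 2
    sigma = np.abs(np.fft.fft(v) * dt / np.sqrt(2.0 * np.pi)) ** 2
    dw = 2.0 * np.pi / (n * dt)
    H = lambda p, d: -np.sum(p[p > 0] * np.log(p[p > 0])) * d
    return H(rho, dt), H(sigma, dw)

omega0 = 0.3
for n in (512, 2048, 8192, 32768):
    tk = np.arange(1, n + 1)
    a = np.sin(omega0 * tk)
    a /= np.linalg.norm(a)                     # B chosen for unit energy
    Ht, Hw = entropy_components(a)
    print(n, Ht, Hw, Ht + Hw)                  # t-part grows, omega-part falls, sum levels off
```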
Next let us examine the behaviours of SD{u} and SD{a} in the case of a simple, ‘bare’ signal with a varying degree of additive background noise. In general, we expect the entropy of the net waveform to increase as the energy of the noise increases in relation to the energy of the bare signal—provided that the bare signal is sufficiently simple. Let us therefore introduce a discrete waveform of the form
4.4 |
The signal-to-noise ratio (SNR), with ‘signal’ referring here to the bare signal, is
4.5 |
The constants c′ and C′ are to be chosen to effect a given SNR while ensuring that the net waveform, a, has unit energy.
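One natural way to realize this construction is sketched below (assuming the linear, energy-ratio definition of the SNR and an illustrative Gaussian-envelope bare signal; the choices of c′ and C′ shown are the obvious ones, not necessarily the authors' exact prescription):

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(-8.0, 8.0, 9001)                      # 9001 equally spaced samples
dt = t[1] - t[0]
bare = np.exp(-t**2 / 2.0) * np.sin(10.0 * t)         # assumed bare-signal form (d=1, theta=10)
noise = rng.standard_normal(t.size)

snr_db = -10.0
snr = 10.0 ** (snr_db / 10.0)                         # linear SNR
c_prime = np.linalg.norm(bare) / (np.sqrt(snr) * np.linalg.norm(noise))
a = bare + c_prime * noise                            # bare signal plus scaled noise
a /= np.sqrt(np.sum(a**2) * dt)                       # C' absorbed into the final normalization
```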
Figure 5 contains plots of SD{u} and SD{a} as functions of the SNR for a of the form equation (4.4) with d=1 and θ=10. In each calculation, the signal was constructed with n=9001 samples taken at equally spaced points spanning [−8,8]. The value of Δt follows accordingly. Each point on the plot represents the average of 10 separate calculations, each of which was performed with a separately generated background noise function. The standard deviation in each plotted point is smaller than the interval implied by the size of the symbol with which the point is plotted. The dashed line in figure 5 marks the average entropy, SD{u}, of 9001 samples of Gaussian noise, which is approximately 9.4, and the dot-dashed line represents the analogous background-noise entropy for SD{a}. The entropy of the noise is the limiting entropy as the SNR vanishes (or becomes infinitely negative as measured in dB). Conversely, in the limit as the SNR diverges, SD{u} very nearly attains the lower bound in equation (1.11) in this case, owing to the nearly Gaussian envelope of the signal, and is independent of the number of samples insofar as n is sufficiently large. Both measures evolve with respect to the SNR in broad accordance with expectations of measures of information. Also noteworthy is the sensitivity of the entropy measures to the presence of signals even below −20 dB.
Figure 5.
Average entropies, SD{u} and SD{a}, for a of the form equation (4.4), with d=1 and θ=10, plotted as a function of the SNR in asterisks and circles, respectively. The dashed and dot-dashed lines show the average entropies of the noise alone for each case.
As a fourth test of the waveform entropy, let us consider a sequence of bits represented as a sequence of binary-valued square pulses. The informatic complexity of α in such a case should depend on the arrangement of the bits. For example, the sequence ‘101010…' conveys less information than a sequence of bits arranged in an irregular manner. The informatic entropy of the former should presumably be smaller than that of the latter. Figure 6 shows plots of SD{u} and SD{a} for a sequence of 1024 bits originally ordered as ‘101010…' but with m consecutive bits somewhere in the sequence reordered randomly. The corresponding signal, a, was constructed by assigning 8 consecutive sample points of a fixed positive value to each ‘1' in the sequence and 8 consecutive sample points aₖ=0 to each ‘0'. We accordingly have n=8192, and the value assigned to the ‘1's is chosen to ensure unit energy. Each plotted point represents the average of 10 separate calculations, with the location of the scrambled sub-sequence chosen randomly each time. Note the expected broad correlation between entropy and the degree of randomization, characterized by m. It is also interesting to note that SD{u}<SD{a} in this case, whereas the opposite was observed in the three preceding studies. This does not have any significance per se, but suggests that the relationship between SD{u} and SD{a} is complex and deserving of further investigation.
Figure 6.
Average entropy of a sequence of 1024 bits, encoded in square waves, originally ordered as ‘101010…’, but with m consecutive bits somewhere in the sequence being randomly reordered.
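The pulse train itself can be assembled as follows (assuming that ‘reordered randomly' means a random permutation of the m bits in a randomly located block, and that unit energy is taken with Δt=1):

```python
import numpy as np

rng = np.random.default_rng(2)
n_bits, samples_per_bit, m = 1024, 8, 64
bits = np.tile([1, 0], n_bits // 2)                    # the ordered sequence '101010...'
start = rng.integers(0, n_bits - m + 1)                # random location of the scrambled block
block = bits[start:start + m].copy()
rng.shuffle(block)                                     # random permutation of m consecutive bits
bits[start:start + m] = block
a = np.repeat(bits, samples_per_bit).astype(float)     # 8 samples per bit, n = 8192
a /= np.linalg.norm(a)                                 # the value assigned to the '1's gives unit energy
```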
5. Summary
The foregoing analysis addresses a fundamental issue in information theory, namely the characterization of the informatic content of real waveforms. We have demonstrated that the total entropy associated with quantum-mechanical PAF's is a meaningful measure of information in waveforms like α. Specifically, the computations described in §4 validate that SD{u} and SD{a} exhibit the behaviours expected of a sensitive measure of information over a broad range of rudimentary, non-trivial scenarios.
Our approach differs notably from previous studies in that we have not sought to construct a secondary distribution or other probabilistic artifice to describe a given waveform, but rather we have simply attributed a measure of entropy to the waveform itself—either directly or through the AR. The broad correlation between SD{u} and SD{a} is noteworthy in that it is not guaranteed a priori, and α is neither complex nor probabilistic. More to the point, the observed consistencies between SD{u} and SD{a} affirm that both measures, and their continuous analogues, are meaningful measures of information. Specifically, if S{ϕ} measures information then so should S{α} because ϕ and α contain essentially the same information.
It is worth noting that the computations plotted in figure 5 indicate that the entropy prescriptions described in this work may be useful in the detection of weak signals in noisy backgrounds. Moreover, the sensitivity to the degree of randomness in the arrangement of a sequence of square pulses indicates a connection to Kolmogorov complexity that invites further investigation. Such a sensitivity may be useful in cryptological applications, for instance. Additional applications will be the subject of future work.
Data accessibility
This work did not rely on any experimental data.
Authors' contributions
S.F. conceived the basic model. W.S. and A.W. contributed to analysis and testing. A.W. contributed to mathematical analysis of the Hilbert transform.
Competing interests
We have no competing interests.
Funding
This work was funded through the Naval Innovative Science and Engineering (NISE) program as implemented at SPAWAR Systems Center Atlantic.
References
- 1. Shannon CE. 1948. A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423. (doi:10.1002/j.1538-7305.1948.tb01338.x)
- 2. Attard P, Jepps OG, Marčelja S. 1997. Information content of signals using correlation function expansions of the entropy. Phys. Rev. E 56, 4052–4067. (doi:10.1103/PhysRevE.56.4052)
- 3. Bercher J-F, Vignat C. 2000. Estimating the entropy of a signal, with applications. IEEE Trans. Signal Proc. 48, 1687–1694. (doi:10.1109/78.845926)
- 4. Baraniuk RG, Flandrin P, Janssen AJEM, Michel OJJ. 2001. Measuring time-frequency information content using Rényi entropies. IEEE Trans. Inform. Theory 47, 1391–1409. (doi:10.1109/18.923723)
- 5. Kullback S, Leibler RA. 1951. On information and sufficiency. Ann. Math. Stat. 22, 79–86. (doi:10.1214/aoms/1177729694)
- 6. Bialynicki-Birula I, Mycielski J. 1975. Uncertainty relations for information entropy in wave mechanics. Commun. Math. Phys. 44, 129–132. (doi:10.1007/BF01608825)
- 7. Gadre SR, Bendale RD. 1987. Rigorous relationships among quantum-mechanical kinetic energy and atomic information entropies: upper and lower bounds. Phys. Rev. A 36, 1932–1935. (doi:10.1103/PhysRevA.36.1932)
- 8. Lalazissis GA, Massen SE, Panos CP. 1998. Information entropy as a measure of the quality of a nuclear density distribution. Int. J. Mod. Phys. E 7, 485–494. (doi:10.1142/S0218301398000257)
- 9. Guevara NL, Sagar RP, Esquivel RO. 2003. Shannon-information entropy sum as a correlation measure in atomic systems. Phys. Rev. A 67, 012507. (doi:10.1103/PhysRevA.67.012507)
- 10. Massen SE. 2003. Application of information entropy to nuclei. Phys. Rev. C 67, 014314. (doi:10.1103/PhysRevC.67.014314)
- 11. Shi Q, Kais S. 2004. Discontinuity of Shannon information entropy for two-electron atoms. Chem. Phys. 309, 127–131. (doi:10.1016/j.chemphys.2004.08.020)
- 12. Bialynicki-Birula I, Rudnicki L. 2007. Entropic uncertainty relations in quantum physics, pp. 1–34. Heidelberg, Germany: Springer.
- 13. Gold B, Oppenheim AV, Rader CM. 1970. Theory and implementation of the discrete Hilbert transform. In Proc. of the Symp. on Computer Processing in Communications, 1969 (ed. J Fox), pp. 235–250. New York, NY: Wiley Interscience.
- 14. Marple SL. 1999. Computing the discrete-time ‘analytic' signal via FFT. IEEE Trans. Signal Proc. 47, 2600–2603. (doi:10.1109/78.782222)