Entropy. 2018 Feb 3;20(2):106. doi: 10.3390/e20020106

Point Divergence Gain and Multidimensional Data Sequences Analysis

Renata Rychtáriková 1,*, Jan Korbel 2,3,4, Petr Macháček 1, Dalibor Štys 1
PMCID: PMC7512599  PMID: 33265197

Abstract

We introduce novel information-entropic variables—a Point Divergence Gain (Ωα(lm)), a Point Divergence Gain Entropy (Iα), and a Point Divergence Gain Entropy Density (Pα)—which are derived from the Rényi entropy and describe spatio-temporal changes between two consecutive discrete multidimensional distributions. The behavior of Ωα(lm) is simulated for typical distributions and, together with Iα and Pα, applied in analysis and characterization of series of multidimensional datasets of computer-based and real images.

Keywords: point divergence gain (PDG), Rényi entropy, data processing

1. Introduction

Extracting information from raw data obtained from, e.g., a set of experiments is a challenging task. Quantifying the information gained by a single point of a time series, a pixel in an image, or a single measurement is important for understanding which points bring the most information about the underlying system. This task is especially delicate in the case of time-series and image processing, because the information is stored not only in the elements themselves but also in the interactions between successive points of the series. Similarly, when extracting information from an image, not all pixels have the same information content. This type of information is sometimes called local information, because it depends not only on the frequency of the phenomenon but also on the position of the element in the structure. The most important task is to identify the sources of information and to quantify them. Naturally, it is possible to use standard data-processing techniques based on quantities from information theory, e.g., the Kullback–Leibler divergence. On the other hand, this mathematical rigor typically comes at the cost of increased computational complexity. To this end, a simple quantity called the Point Information Gain and its relative macroscopic variables—a Point Information Gain Entropy and a Point Information Gain Entropy Density—were introduced in [1]. In [2], the mathematical properties of the Point Information Gain were extensively discussed and applications to real-image data processing were pointed out. From the mathematical point of view, the Point Information Gain represents the change of information after removing an element of a particular phenomenon from a distribution. The method is based on the Rényi entropy, which has already been used extensively in multifractal analysis and data processing (see, e.g., Refs. [2,3,4,5] and references therein).

In this article, we introduce a variable analogous to the Point Information Gain. This new variable locally determines the information change after an exchange of a given element in a discrete set. We use a simple concept of the entropy difference between the original set and the set with the exchanged element. The resulting value is called the Point Divergence Gain Ωα(l→m) [6,7]. The main idea is to describe the importance of changes in a series of images (typically a video record of an experiment) and to extract the most important information from it. Similarly to the Point Information Gain Entropy and the Point Information Gain Entropy Density, the macroscopic variables called a Point Divergence Gain Entropy Iα and a Point Divergence Gain Entropy Density Pα are defined to characterize subsequent changes in a multidimensional discrete distribution by one number. The goal of this article is to examine and demonstrate some properties of these variables and to use them for the examination of time-spatial changes of information in sets of discrete multidimensional data, namely series of images in image processing and analysis, after the exchange of a pixel of a particular intensity for the pixel at the same position in the consecutive image. The main reason for choosing the Point Divergence Gain as the relevant quantity for the analysis of spatio-temporal changes is the fact that it represents the information gain of each pixel change. One can also consider model-based approaches built on the theory of random fields, which can be more predictive in some cases. On the other hand, the model-free approach based on entropy typically gives more relevant information for real data, where it is usually difficult to find an appropriate model. For an overview of model-based approaches in random field theory, one can consult, e.g., Refs. [8,9,10].

The paper is organized as follows: in Section 2, we define the main quantity of the paper, the Point Divergence Gain, together with related quantities, and discuss their theoretical properties. In Section 3, we show applications of the Point Divergence Gain to image processing for both computer-based and real sequences of images. We show that the Point Divergence Gain can be used as a measure of difference for clustering methods and that it detects the most prominent behaviour of a system. In Section 4, we explain the presented methods and the finer technical details necessary for the analysis, including algorithms. Section 5 is dedicated to conclusions. All image data, scripts for histogram processing, and the Image Info Extractor Professional software for image processing are available via sftp://160.217.215.193:13332/pdg (user: anonymous; password: anonymous).

2. Basic Properties of Point Divergence Gain and Derived Quantities

2.1. Point Divergence Gain

Recently, a quantity called the Point Information Gain (PIG, Γα(i)) [6,7] and its generalization based on the Rényi entropy [2] have been introduced. We show how to apply the concept of the PIG to sequences of multidimensional data frames.

Let us assume a set of variables with k possible outcomes (e.g., the possible colours of each pixel). The Γα(i) is a simple variable based on an entropy difference and enables us to quantify the information gain of each phenomenon. It is defined as the difference between the entropy of an original discrete distribution

P = \{p_j\}_{j=1}^{k} = \left\{ \frac{n_1}{n}, \ldots, \frac{n_k}{n} \right\}, (1)

which typically describes a frequency histogram of possible outcomes. Let us also define a distribution, where one occurrence of the i-th phenomenon is omitted, i.e.,

P^{(i)} = \left\{ p_j^{(i)} \right\}_{j=1}^{k} = \left\{ \frac{n_1}{n-1}, \ldots, \frac{n_i - 1}{n-1}, \ldots, \frac{n_k}{n-1} \right\}. (2)

Thus, the Point Information Gain is defined as

\Gamma_\alpha^{(i)} \equiv \Gamma_\alpha^{(i)}(P) = H_\alpha\!\left( P^{(i)} \right) - H_\alpha(P), (3)

where Hα is the Rényi entropy (despite all computer implementations calculating with log2, the following derivations are written with the natural logarithm, ln):

H_\alpha(P) = \frac{1}{1-\alpha} \ln \sum_i p_i^\alpha. (4)

The Rényi entropy represents a one-parametric class of information quantities that is tightly related to multifractal dynamics and enables us to focus on certain parts of the distribution [11]. Unlike the typically used Rényi’s relative entropy [3,4,11,12,13,14,15,16,17], the Point Information Gain Γα(i) is a simple, computationally tractable quantity. Its mathematical properties have been extensively discussed in [2]. On the same basis, we can define a Point Divergence Gain (PDG, Ωα(l→m)), for which the discrete distribution P(i) is replaced by a distribution

P^{(l \to m)} = \left\{ p_j^{(l \to m)} \right\}_{j=1}^{k} = \left\{ \frac{n_1}{n}, \ldots, \frac{n_l - 1}{n}, \ldots, \frac{n_m + 1}{n}, \ldots, \frac{n_k}{n} \right\}, (5)

which is obtained from the original distribution P by removing one occurrence of the examined l-th phenomenon (n_l ∈ ℕ⁺) and adding one occurrence of the m-th phenomenon (n_m ∈ ℕ₀). The main idea behind the definition is to quantify the information change in the subsequent image if only one point is changed. Analogously to the Point Information Gain Γα(i), the Point Divergence Gain can be defined as

\Omega_\alpha^{(l \to m)} \equiv \Omega_\alpha^{(l \to m)}(P) = H_\alpha\!\left( P^{(l \to m)} \right) - H_\alpha(P). (6)

Let us first show its connection to the Point Information Gain Γα(i). Since P^{(l)} = P^{(l→m, m)}, it is possible to express the Point Divergence Gain as

\Omega_\alpha^{(l \to m)}(P) = H_\alpha\!\left( P^{(l \to m)} \right) - H_\alpha\!\left( P^{(l \to m,\, m)} \right) + H_\alpha\!\left( P^{(l)} \right) - H_\alpha(P) = \Gamma_\alpha^{(l)}(P) - \Gamma_\alpha^{(m)}\!\left( P^{(l \to m)} \right). (7)

Let us investigate mathematical properties of the PDG. The Ωα(lm) can be rewritten as

\Omega_\alpha^{(l \to m)} = H_\alpha\!\left( P^{(l \to m)} \right) - H_\alpha(P) = \frac{1}{1-\alpha} \ln \sum_{j=1}^{k} \left( p_j^{(l \to m)} \right)^\alpha - \frac{1}{1-\alpha} \ln \sum_{j=1}^{k} p_j^\alpha = \frac{1}{1-\alpha} \ln \frac{\sum_{j=1}^{k} \left( p_j^{(l \to m)} \right)^\alpha}{\sum_{j=1}^{k} p_j^\alpha}. (8)

By plugging the relative frequencies from Equations (1) and (5) into Equation (8), we obtain

\Omega_\alpha^{(l \to m)} = \frac{1}{1-\alpha} \ln \frac{(n_l - 1)^\alpha + (n_m + 1)^\alpha + \sum_{j=1,\, j \neq l,m}^{k} n_j^\alpha}{\sum_{j=1}^{k} n_j^\alpha} = \frac{1}{1-\alpha} \ln \frac{(n_l - 1)^\alpha + (n_m + 1)^\alpha + \sum_{j=1}^{k} n_j^\alpha - n_l^\alpha - n_m^\alpha}{\sum_{j=1}^{k} n_j^\alpha} = \frac{1}{1-\alpha} \ln \left[ \frac{(n_l - 1)^\alpha - n_l^\alpha + (n_m + 1)^\alpha - n_m^\alpha}{\sum_{j=1}^{k} n_j^\alpha} + 1 \right]. (9)

As seen in Equation (9), the variable Ωα(l→m) does not depend (contrary to the Γα(i)) on n, but only on the numbers of elements of each phenomenon j. In Equation (9), let us denote the denominator ∑_{j=1}^{k} n_j^α, which is constant and related to the original distribution (histogram) of elements and to the parameter α, as C_α. It gives us the final form

\Omega_\alpha^{(l \to m)} = \frac{1}{1-\alpha} \ln \left[ \frac{(n_l - 1)^\alpha - n_l^\alpha + (n_m + 1)^\alpha - n_m^\alpha}{C_\alpha} + 1 \right]. (10)

Equation (10) demonstrates that, for a particular distribution, Ωα(l→m) is a function only of the parameter α and of the frequencies of occurrence of the phenomena n_l and n_m between which the exchange of the element occurs. Equation (10) further shows that, if the exchange of the element occurs between phenomena l and m of the same (or similar) frequencies of occurrence (i.e., n_l ≈ n_m), the value of Ωα(l→m) is approximately 0. If we remove a rare point and replace it by a high-frequency point (i.e., n_l ≪ n_m), the value of Ωα(l→m) is negative, and vice versa. Low values of the parameter α separate low-frequency events, for which Ωα(l→m) ≠ 0, whereas high values of α emphasize high-frequency events, for which Ωα(l→m) ≫ 0 or Ωα(l→m) ≪ 0, and merge rare events into Ωα(l→m) = 0. With respect to the previous discussion and the practical utilization of this notion, we emphasize that, for real systems with large n, the Ωα(l→m) are rather small numbers.
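To make the behaviour of Equation (10) concrete, the following minimal Python sketch (the function name is illustrative, not from the paper's software) evaluates Ωα(l→m) directly from the occupation numbers of a histogram and reproduces the sign behaviour discussed above:

```python
import math

def point_divergence_gain(counts, l, m, alpha):
    """Omega_alpha^(l->m) of Eq. (10): the information change when one
    element of phenomenon l is exchanged for one of phenomenon m.
    `counts` holds the occupation numbers n_j of the original histogram;
    alpha = 1 requires the separate Shannon limit of Eq. (15)."""
    if l == m:                                   # distribution unchanged
        return 0.0
    c_alpha = sum(n ** alpha for n in counts)    # C_alpha = sum_j n_j^alpha
    num = ((counts[l] - 1) ** alpha - counts[l] ** alpha
           + (counts[m] + 1) ** alpha - counts[m] ** alpha)
    return math.log(num / c_alpha + 1.0) / (1.0 - alpha)

hist = [1000, 1000, 5, 1]                        # toy frequency histogram
rare_to_frequent = point_divergence_gain(hist, 3, 0, alpha=2.0)   # negative
frequent_to_rare = point_divergence_gain(hist, 0, 3, alpha=2.0)   # positive
```

Note that, as in the derivations above, the sketch uses the natural logarithm; switching to log2, as in the computer implementations, only rescales the values by a constant factor.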

In the 3D plots of Figure 1, we demonstrate Ωα(lm)-transformations of four thoroughly studied distributions—the Cauchy, Gauss (symmetrical), Lévy, and Rayleigh distribution (asymmetric; all specified in Section 4.1)—for α={0.5;1.0;2.0;4.0}, where each point presents the exchange of the element between bins l and m (Algorithm 1). In this case, the (a)symmetry of the distribution is always maintained.

Figure 1. The Ωα-transformations of the discrete (a) Cauchy; (b) Gauss; (c) Lévy; and (d) Rayleigh distribution for α = {0.5; 1.0; 2.0; 4.0} (Section 4.1).

Algorithm 1: Calculation of a point divergence gain matrix (Ωα) for typical histograms.

Now we will consider the specific case α=2 (collision entropy) for which Equation (10) can be simplified to

\Omega_2^{(l \to m)} = -\ln \left[ \frac{2}{C_2} (n_m - n_l + 1) + 1 \right] = -\ln \left[ \frac{2}{C_2} \left( \Delta n^{(l \to m)} + 1 \right) + 1 \right]. (11)

For a specific difference Δn^{(l→m)} = D, Equation (11) can be approximated by the first-order Taylor expansion

\Omega_2^{(l \to m)} \approx -\ln \left[ \frac{2}{C_2}(D+1) + 1 \right] - \frac{2}{2(D+1) + C_2} \left( \Delta n^{(l \to m)} - D \right) = -\frac{2}{2D + 2 + C_2} \, \Delta n^{(l \to m)} + \frac{2D}{2D + 2 + C_2} - \ln \left[ \frac{2D + 2}{C_2} + 1 \right]. (12)

Equations (11) and (12) show that, for each unique Δn^{(l→m)}, the Ω₂^{(l→m)} depends only on the difference between the occupations of the bins l and m, between which the exchange of the element occurs, and that this dependence is almost linear. In other words, this explains why, for all distributions in Figure 2, the dependencies Ω₂^{(l→m)} = f(n_m, n_m − n_l) are planes.
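As a quick numerical check of this simplification, the sketch below (with hypothetical helper names) evaluates Ω₂ both through the general Equation (10) at α = 2 and through Equation (11); the two agree to machine precision for any pair of bins:

```python
import math

def omega2_general(counts, l, m):
    """Omega_2 via the general Eq. (10) evaluated at alpha = 2."""
    c2 = sum(n ** 2 for n in counts)
    num = ((counts[l] - 1) ** 2 - counts[l] ** 2
           + (counts[m] + 1) ** 2 - counts[m] ** 2)
    return -math.log(num / c2 + 1.0)

def omega2_simplified(counts, l, m):
    """Omega_2 via Eq. (11): a function of Delta n = n_m - n_l only."""
    c2 = sum(n ** 2 for n in counts)
    delta_n = counts[m] - counts[l]
    return -math.log(2.0 / c2 * (delta_n + 1.0) + 1.0)
```

The agreement follows from the identity (n_l − 1)² − n_l² + (n_m + 1)² − n_m² = 2(n_m − n_l + 1).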

Figure 2. The dependencies Ωα = f(n_m, n_m − n_l) for the discrete (a) Cauchy; (b) Gauss; (c) Lévy; and (d) Rayleigh distribution at α = {0.5; 1.0; 2.0; 4.0} (Section 4.1).

For α → 1, the Rényi entropy becomes the ordinary Shannon entropy [18] and we obtain (cf. Equation (4))

H_1(P) = -\sum_{j=1}^{k} p_j \ln p_j = -\sum_{j=1}^{k} \frac{n_j}{n} \ln \frac{n_j}{n} = -\sum_{j=1,\, j \neq l,m}^{k} \frac{n_j}{n} \ln \frac{n_j}{n} - \frac{n_m}{n} \ln \frac{n_m}{n} - \frac{n_l}{n} \ln \frac{n_l}{n} (13)

and

H_1\!\left( P^{(l \to m)} \right) = -\frac{n_m + 1}{n} \ln \frac{n_m + 1}{n} - \frac{n_l - 1}{n} \ln \frac{n_l - 1}{n} - \sum_{j=1,\, j \neq l,m}^{k} \frac{n_j}{n} \ln \frac{n_j}{n}. (14)

The difference of these entropies (cf. Equation (9)) gives, step by step,

\Omega_1^{(l \to m)} = -\frac{n_m + 1}{n} \ln \frac{n_m + 1}{n} - \frac{n_l - 1}{n} \ln \frac{n_l - 1}{n} + \frac{n_m}{n} \ln \frac{n_m}{n} + \frac{n_l}{n} \ln \frac{n_l}{n} = -\frac{n_m + 1}{n} \ln(n_m + 1) + \frac{n_m + 1}{n} \ln n - \frac{n_l - 1}{n} \ln(n_l - 1) + \frac{n_l - 1}{n} \ln n + \frac{n_m}{n} \ln n_m - \frac{n_m}{n} \ln n + \frac{n_l}{n} \ln n_l - \frac{n_l}{n} \ln n = \underbrace{\left( \frac{n_m + 1}{n} + \frac{n_l - 1}{n} - \frac{n_m}{n} - \frac{n_l}{n} \right)}_{=0} \ln n - \frac{n_m}{n} \ln(n_m + 1) - \frac{1}{n} \ln(n_m + 1) - \frac{n_l}{n} \ln(n_l - 1) + \frac{1}{n} \ln(n_l - 1) + \frac{n_m}{n} \ln n_m + \frac{n_l}{n} \ln n_l = \frac{1}{n} \left( n_m \ln \frac{n_m}{n_m + 1} + n_l \ln \frac{n_l}{n_l - 1} + \ln \frac{n_l - 1}{n_m + 1} \right). (15)

One can see that relation (15) is defined for n_l ∈ ℕ∖{0, 1} and n_m ∈ ℕ⁺ and is approximately equal to 0 for n_l, n_m ≫ 0 (see the Cauchy and Rayleigh distributions for α = 1 in Figure 3).
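The closed form (15) can be verified against a direct Shannon-entropy difference; the following sketch (illustrative names, natural logarithm as in the derivations) computes both:

```python
import math

def omega1_closed_form(counts, l, m):
    """Shannon-limit PDG, the closed form of Eq. (15);
    requires n_l >= 2 and n_m >= 1."""
    n = sum(counts)
    nl, nm = counts[l], counts[m]
    return (nm * math.log(nm / (nm + 1))
            + nl * math.log(nl / (nl - 1))
            + math.log((nl - 1) / (nm + 1))) / n

def omega1_direct(counts, l, m):
    """The same quantity as the explicit entropy difference
    H_1(P^(l->m)) - H_1(P), cf. Equations (13) and (14)."""
    n = sum(counts)
    def h1(cs):
        return -sum(c / n * math.log(c / n) for c in cs if c > 0)
    moved = list(counts)
    moved[l] -= 1          # remove one occurrence of phenomenon l
    moved[m] += 1          # add one occurrence of phenomenon m
    return h1(moved) - h1(counts)
```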

Figure 3. The dependencies Ωα = f(n_l, n_m) for the discrete (a) Cauchy; (b) Gauss; (c) Lévy; and (d) Rayleigh distribution at α = {0.5; 1.0; 2.0; 4.0} (Section 4.1).

For n_l ∈ ℕ⁺ and n_m ∈ ℕ₀, Equation (10) further implies:

  1. If α = 0, then Ω_0^{(l→m)} = 0.

  2. If α → ∞, then Ω_∞^{(l→m)} → 0.

2.2. Point Divergence Gain Entropy and Point Divergence Gain Entropy Density

In this section, we introduce two new variables that help us to investigate changes between two (typically consecutive) points of a time series. A typical example is provided by video processing, where each element of a time or spatial series is represented by a frame. Let us have two data frames I_a = {a_1, …, a_n} and I_b = {b_1, …, b_n} (for simplicity, we use only one index, which corresponds to a one-dimensional frame; in the case of images, we typically have two-dimensional frames and the elements are described by two indices, e.g., the x and y positions). At each position i ∈ {1, …, n}, it is possible to replace the value a_i by the value of the following frame, i.e., b_i. The resulting Ωα^{(a_i→b_i)} then quantifies how much information is gained or lost when, at the i-th position, we exchange the value a_i for the value b_i. A Point Divergence Gain Entropy (PDGE, Iα) is defined as the sum of the absolute values of the PDGs over all pixels, i.e.,

I_\alpha(I_a; I_b) = \sum_{i=1}^{n} \left| \Omega_\alpha^{(a_i \to b_i)} \right| = \sum_{l=1}^{k} \sum_{m=1}^{k} n_{l \to m} \left| \Omega_\alpha^{(l \to m)} \right|, (16)

where n_{l→m} denotes the number of realized substitutions l → m when we transform I_a → I_b. The absolute value ensures that the contributions of the transformation of a rare point to a frequent point (negative Ωα) and of a frequent point to a rare point (positive Ωα) do not cancel each other, and both contribute to the resulting PDGE. Typically, the appearance or disappearance of a rare point (and its replacement by a frequent value, typically the background colour) carries important information about the experiment. The PDGE can be understood as an absolute information change.
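A minimal sketch of Equation (16) for two one-dimensional frames follows (illustrative names); it assumes, as in the definition of the PDG, that the histogram of the first frame plays the role of the original distribution P:

```python
import math
from collections import Counter

def pdge(frame_a, frame_b, alpha=2.0):
    """I_alpha(I_a; I_b) of Eq. (16): the sum of |Omega_alpha^(a_i -> b_i)|
    over all pixel substitutions, with the histogram taken from frame_a."""
    counts = Counter(frame_a)
    c_alpha = sum(n ** alpha for n in counts.values())
    total = 0.0
    for a, b in zip(frame_a, frame_b):
        if a == b:
            continue                     # Omega = 0, nothing to add
        num = ((counts[a] - 1) ** alpha - counts[a] ** alpha
               + (counts[b] + 1) ** alpha - counts[b] ** alpha)
        total += abs(math.log(num / c_alpha + 1.0) / (1.0 - alpha))
    return total
```

For two identical frames the PDGE is exactly zero; any pixel change adds a positive contribution.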

Moreover, it is possible to introduce another macroscopic quantity—a Point Divergence Gain Entropy Density (PDGED, Pα)—where we do not sum over all pixels but only over all realized transitions l → m. Thus, the PDGED can be defined as

P_\alpha(I_a; I_b) = \sum_{l=1}^{k} \sum_{m=1}^{k} \chi_{l \to m} \left| \Omega_\alpha^{(l \to m)} \right|, (17)

where

\chi_{l \to m} = \begin{cases} 1, & n_{l \to m} \geq 1, \\ 0, & n_{l \to m} = 0. \end{cases} (18)

Let us emphasize that two transitions a_1 → b_1 and a_2 → b_2, where the frequencies of occurrence of the phenomena a_1 and a_2 are equal and those of the phenomena b_1 and b_2 are equal as well, give the same value of Ωα^{(a_i→b_i)} but are still counted as two distinct transitions. In the computation of the PDGED, this is arranged by a hash function (Algorithm 2). We can understand the quantity PDGED as the absolute information change over all realized transitions of phenomena l → m.
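The corresponding sketch of Equations (17) and (18) counts every realized transition once; a Python set of (l, m) pairs stands in here for the hash function of Algorithm 2 (names are illustrative):

```python
import math
from collections import Counter

def pdged(frame_a, frame_b, alpha=2.0):
    """P_alpha(I_a; I_b) of Eqs. (17) and (18): every realized transition
    l -> m contributes |Omega_alpha^(l->m)| exactly once, no matter how
    many pixels realize it."""
    counts = Counter(frame_a)
    c_alpha = sum(n ** alpha for n in counts.values())
    realized = {(a, b) for a, b in zip(frame_a, frame_b) if a != b}
    total = 0.0
    for l, m in realized:
        num = ((counts[l] - 1) ** alpha - counts[l] ** alpha
               + (counts[m] + 1) ** alpha - counts[m] ** alpha)
        total += abs(math.log(num / c_alpha + 1.0) / (1.0 - alpha))
    return total
```

For example, for frames [0, 0, 1, 1] and [1, 1, 1, 1] the transition 0 → 1 is realized twice but contributes only once to the PDGED.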

Algorithm 2: Calculation of a point divergence gain matrix (Ωα) and of the values Pα and Iα for two consecutive images of a time-spatial series.

If the aim is to assess the influence of elements of a high occurrence on the time-spatial changes in the image series, it is recommended to use PDGE where each element is weighted by its number of occurrences. If the aim is to suppress the influence of these extreme values, it is better to compute PDGED.

Let us consider a time series V, where each time step contains one frame, so V = {I_1, I_2, …}. The series V can be, e.g., a sequence of images (a video) obtained from some experiment. For each time step, it is possible to calculate Iα(t) = Iα(I_t; I_{t+s}) and Pα(t) = Pα(I_t; I_{t+s}), where s is the time lag. Typically, we assume s = 1, i.e., consecutive frames with a constant time step.

3. Application of Point Divergence Gain and Its Entropies in Image Processing

The generalized Point Divergence Gain Ωα(l→m) of Equation (10) was originally used for the characterization of dynamic changes in image series, namely in z-stacks of raw RGB data of unmodified live cells obtained via scanning along the z-axis using video-enhanced digital bright-field transmission microscopy [6,7]. In these two references, this new mathematical approach utilizes the 8- and 12-bit intensity histograms of two consecutive images for a pixel-by-pixel intensity-weighted (parameterized) subtraction of these images, in order to suppress camera-based noise and to enhance the image contrast (in the case of calibrated digital camera images, where the value of each point of the image reflects the number of incident photons, or in the case of computer-based images, a simple subtraction can be sufficient for the evaluation of time-spatial changes in the image series).

For this paper, we chose other (grayscale) digital image series (Table 1) in order to demonstrate further applications of the PDG mathematical approach in image processing and analysis. Moreover, we newly introduce applications of the additive macroscopic variables, the Point Divergence Gain Entropy Iα and the Point Divergence Gain Entropy Density Pα.

Table 1.

Specifications of image series.

Series Source Bit-Depth Number of Img. Resolution Origin
Toy Vehicle [19] 8-bit 10 512 × 512 camera
Walter Cronkite [19] 8-bit 16 256 × 256 camera
Simulated BZ [20,21,22] 8-bit 10,521 1001 × 1001 computer-based a
Ring-fluorescence 12-bit 1058 548 × 720 experimental b
Ring-diffraction 8-bit c 1242 252 × 280 experimental b

a A set of noisy hotch-potch machine simulations of the Belousov–Zhabotinsky reaction [20,21,22] with 200 achievable states, an internal excitation of 10, and phase-transition, internal-excitation, and external-neighbourhood kinds of noise of 0, 0.25, and 0.15, respectively. b The microscopic series of a 6-μm standard microring (FocalCheck™, cat. No. F36909, Life Technologies™, Eugene, OR, USA) were acquired using the CellObserver microscope (Zeiss, Oberkochen, Germany) at the EMBL (Heidelberg, Germany). For both light processes, the green region of the visible spectrum was selected using an emission and a transmission optical filter, respectively. In the case of the diffraction, the point spread function was separated and the background intensities were removed using Algorithm 1 in [7]. c The 12-bit depth was reduced using a Least Information Lost algorithm [23], which, by shifting the intensity bins, fills all empty bins in the histogram obtained from the whole data series and rescales these intensities between their minimal and maximal values.

3.1. Image Origin and Specification

Owing to the relation of the Ωα(l→m) to the Rényi entropy, the Iα and Pα, as macroscopic variables, can determine the fractal origin of images by plotting the Iα = f_I(α) and Pα = f_P(α) spectra. If we deal with image multifractality, the dependency Iα = f_I(α) or Pα = f_P(α) shows a peak. In the case of unifractality, these dependencies are monotonic. This is demonstrated in Figure 4 and Figure 5. There is no doubt that the origin of the simulated Belousov–Zhabotinsky reaction (Figure 4) is multifractal. This statement is further strengthened by the courses of the dependencies Iα = f_I(α) and Pα = f_P(α), where we can see peaks with maxima at α ∈ (1, 2). In contrast, the pair of images in Figure 5 (moving toy cars) is a mixture of objects of different fractal origin. In this case, whereas the course of f_I(α) is monotonic and thus indicates a unifractal character, the dependence f_P(α) has a maximum at α = 0.6 and thus demonstrates some multifractal features in the image. This is due to the fact that, since each information contribution is counted only once, the Pα is more sensitive to phenomena which occur less frequently in the image. A monotonic course of the Pα would be achieved only if a sequence of time-evolved Euclidean objects were transformed into the values Ωα(l→m).

Figure 4. The Iα, Pα, and Ωα for a pair of multifractal grayscale images. I. The Iα and Pα spectra; II. 8-bit visualization of the Ωα-values for α = {0.99; 2.0}.

Figure 5. The Iα, Pα, and Ωα for a pair of real-life grayscale images. I. The Iα and Pα spectra; II. 8-bit visualization of the Ωα-values for α = {0.99; 2.0}.

As mentioned in Section 2.2, the variables Iα and Pα measure the absolute information change between a pair of images and characterize the similarity between these images. Therefore, these variables can find practical utilization in auto-focusing in both light and electron digital microscopy. The in-focus object can be defined as the image with the global extreme of Iα or Pα. In other words, this image fulfils the Nijboer–Zernike definition [24]: it is the smallest and darkest image in light or electron diffraction, or the smallest and brightest image in light fluorescence (Section 3.3).

3.2. Image Filtering and Segmentation

Segmentation is a type of filtering of specific features in an image. The parameter α and the related value of Ωα(l→m) enable us to filter the parts of two consecutive images which are either stable or differently variable in time. This can be employed in 3D image reconstruction, by thresholding and joining the pixels with Ωα(l→m) = 0 from two consecutive images, or in image tracking, via thresholding of the highest and lowest values of Ωα(l→m), which mark the object in the following and the first image, respectively.

This is illustrated using simple examples in Figure 4 and Figure 5 where the highest (red-coded) and lowest (blue-coded) values of the Ωα(lm) show the position of the object in the second and the first image of the image sequence, respectively. Compared with the Ω0.99(lm), the variance between the extremes of the Ω2.00(lm) is wider and the number of points Ω2.00(lm)=0 is lower.
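A toy illustration of this thresholding in Python (one-dimensional "frames", illustrative names): pixels with Ωα(l→m) = 0 are time-stable, while the extreme negative and positive values mark where a rare object left the first frame and where it appeared in the second one:

```python
import math
from collections import Counter

def pdg_map(frame_a, frame_b, alpha):
    """Per-pixel map of Omega_alpha^(a_i -> b_i) for two consecutive
    frames, with the histogram taken from the first frame."""
    counts = Counter(frame_a)
    c_alpha = sum(n ** alpha for n in counts.values())
    omegas = []
    for a, b in zip(frame_a, frame_b):
        if a == b:                       # unchanged pixel: Omega = 0
            omegas.append(0.0)
            continue
        num = ((counts[a] - 1) ** alpha - counts[a] ** alpha
               + (counts[b] + 1) ** alpha - counts[b] ** alpha)
        omegas.append(math.log(num / c_alpha + 1.0) / (1.0 - alpha))
    return omegas

# a bright pixel (value 9) moves from position 4 to position 6
fa = [0, 0, 0, 0, 9, 0, 0, 0]
fb = [0, 0, 0, 0, 0, 0, 9, 0]
om = pdg_map(fa, fb, alpha=4.0)
stable = [i for i, w in enumerate(om) if w == 0.0]   # time-stable background
```

Position 4 (rare value replaced by the frequent background) yields a negative Ω, and position 6 (frequent value replaced by the rare one) a positive Ω, matching the colour coding described above.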

In digital light transmission microscopy, this mathematical method enabled us to find time-stable intracellular objects inside live mammalian cells from consecutive pixels that fulfilled the equality Ωα(l→m) = 0 for α = 4.00 [6] or α = 5.00 [7]. In these cases, the high value of α ensured the merging of rare points in the image, which suppressed the camera noise reflected in the images and, thus, modelled the shapes of the organelles. The rest of the image escaped observation. In a follow-up paper [25], this method was extended to widefield fluorescence data.

As in the case of the Point Information Gain [2], the process of image segmentation of objects of a certain shape can be further improved by usage of the surroundings of this shape from which the intensity histogram is created for each pixel in the image.

3.3. Clustering of Image Sets

Finally, we used the Point Divergence Gain to detect the most relevant information contained in a sequence of images capturing, e.g., an experiment. To this end, we used Iα and Pα as quantities of the information change in consecutive images and applied clustering methods to them. The values of Iα and Pα are small numbers (Section 2.1). Due to the rounding of these small numbers in the computation, and for a better characterization of the image multifractality, we use in clustering the α-dependent spectra of these variables rather than a single value at one α.

The dependence of the cluster label on the order of the image in the series is smoothest for the joint vectors [Iα, Pα]. The similarity of these vectors (and thus of the images as well) is described in a space of principal components, e.g., [26], and classified by standard clustering algorithms such as the k-means++ algorithm [27]. In comparison with the entropies and entropy densities related to the Γα(i), clustering using the Iα and Pα is more sensitive to changes in the patterns (intensities) and does not require any further specification of the images by local entropies computed from a specific type of surroundings around each pixel.

The described clustering method was examined on z-stacks obtained using light microscopy. The z-stacks were classified into 2–6 clusters (groups), with the patterns of each image described by 26 numbers, i.e., by the vectors [Iα, Pα] at 13 values of α (Figure 6a and Figure 7a). These clusters were evaluated on the basis of the sizes of the intensity changes between images. The five classification graphs of the gradually splitting clusters (Figure 6a and Figure 7a, middle) further demonstrate the mutual similarity among the micrographs in each data series. The typical (middle) image of each cluster is shown in Figure 6b and Figure 7b.
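The 26-number descriptor can be sketched as follows (illustrative function name; note that α = 0.99 stands in for the Shannon point, as in the paper's α-lists, avoiding the α = 1 singularity); the resulting vectors can then be standardized and passed to any standard k-means or k-means++ implementation:

```python
import math
from collections import Counter

ALPHAS = [0.1, 0.3, 0.5, 0.7, 0.99, 1.3, 1.5, 1.7, 2.0, 2.5, 3.0, 3.5, 4.0]

def pdg_spectra(frame_a, frame_b, alphas=ALPHAS):
    """Joint [I_alpha, P_alpha] descriptor (2 x 13 = 26 numbers) of one
    frame pair, with the histogram taken from the first frame."""
    counts = Counter(frame_a)
    i_spec, p_spec = [], []
    for alpha in alphas:
        c_alpha = sum(n ** alpha for n in counts.values())
        omega = {}                  # |Omega| per realized transition (l, m)
        i_val = 0.0
        for a, b in zip(frame_a, frame_b):
            if a == b:
                continue
            if (a, b) not in omega:
                num = ((counts[a] - 1) ** alpha - counts[a] ** alpha
                       + (counts[b] + 1) ** alpha - counts[b] ** alpha)
                omega[(a, b)] = abs(math.log(num / c_alpha + 1.0)
                                    / (1.0 - alpha))
            i_val += omega[(a, b)]              # PDGE: every pixel counts
        i_spec.append(i_val)
        p_spec.append(sum(omega.values()))      # PDGED: each transition once
    return i_spec + p_spec
```

Caching |Ω| per realized transition mirrors the hash function of Algorithm 2 and yields both spectra in one pass over the frame pair.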

Figure 6. The results of clustering of a z-stack of grayscale microscopic images of a microring obtained using a fluorescence mode. (a) The dependencies of (upper) the Pα and (lower) the Iα vs. the order of the image in the z-stack for α = {0.5; 0.99; 2.0; 4.0}, and (middle) the clustering (k-means, squared Euclidean distance, 2–6 groups) of the z-stack using the connected spectra [Iα, Pα] for α = {0.1; 0.3; 0.5; 0.7; 0.99; 1.3; 1.5; 1.7; 2.0; 2.5; 3.0; 3.5; 4.0}; (b) the typical (middle) images of the groups for clustering into five groups (in (a), middle). The original 12-bit images are visualized in 8 bits using the Least Information Loss conversion [23].

Figure 7. The results of clustering of a z-stack of grayscale microscopic images of a microring obtained using a diffraction mode. (a) The dependencies of (upper) the Pα and (lower) the Iα vs. the order of the image in the z-stack for α = {0.5; 0.99; 2.0; 4.0}, and (middle) the clustering (k-means, squared Euclidean distance, 2–6 groups) of the z-stack using the connected spectra [Iα, Pα] for α = {0.1; 0.3; 0.5; 0.7; 0.99; 1.3; 1.5; 1.7; 2.0; 2.5; 3.0; 3.5; 4.0}; (b) the typical (middle) images of the groups for clustering into five groups (in (a), middle). The original 12-bit images are visualized in 8 bits using the Least Information Loss conversion [23].

Firstly, we shall deal with a z-stack of 1057 images of a microring obtained using a widefield fluorescence microscope. The results of clustering illustrate the canonically repetitive properties of the so-called point spread function as the image of the observed object passes to and from its focus. In this case, the image group containing the real focus, with the maximal Iα and Pα at low α (Section 3.1), is successfully determined by clustering into two clusters (Figure 6a). However, we will aim at a description of the results for five clusters. The central Cluster 5 (94 images) can be called the object's focal region, with image levels where parts of the object have their own focus. The in-focus cluster is asymmetrically surrounded by Cluster 4 (131 and 53 images below and above Cluster 5, respectively), which was set on the basis of the occurrence of the lower peaks of Iα and Pα at low α. Cluster 3 (190 and 150 images below and above the focus, respectively) is typical of constant Iα and Pα for all α. Cluster 2 contains img. 176–214 and the last 126 images. These images are characterized by a constant Iα and a decreasing/increasing Pα at α ≥ 2. Cluster 1 (the first 175 images) is prevalently dominated by an increasing Iα and a decreasing Pα at high α.

Before the calculation of the Iα and Pα, the undesirable background intensities were removed from the images obtained using optical transmission microscopy. The rest of each image was rescaled into 8 bits (Section 4.2). The results of clustering of these images (Figure 7a) are similar to those for the fluorescence data (Figure 6a). The light transmission point spread function is also symmetrical around its focus, but the pixels at the same x,y-positions below and above the focus have opposite, dark vs. bright, intensities. Furthermore, the transitional regions between the clusters are longer than for the fluorescence data. The central, in-focus, part of the z-stack (img. 427–561 in Cluster 4), with the highest peaks of Iα and Pα, is unambiguously separated using four clusters. The focus itself lies at the 505th image. This central part of the z-stack is surrounded by eight groups of images which were, due to their similarity, objectively classified into three clusters. Cluster 1 was formed by images 1–78, 376–426, and 562–661. These images show peaks of middle values of the Iα and Pα. Images 79–153, 292–375, and 662–703 were classified into Cluster 2 (dominated by the local minimum of the Iα at α < 1). Cluster 3 is related to the images with the lowest values of the Iα together with the lowest values and local peaks of the Pα at α < 1. This cluster contains images 154–291 and the last 537 images of the series.

Let us mention that, in the clustering process, the Iα and Pα can recognize outliers such as incorrectly saved images or images with illumination artifacts.

4. Materials and Methods

4.1. Processing of Typical Histograms

For the Cauchy, Lévy, Gauss, and Rayleigh distributions, dependences of the Ωα(lm) on the number of elements in bins l and m were calculated for α = {0.1, 0.3, 0.5, 0.7, 0.99, 1.3, 1.5, 1.7, 2.0, 2.5, 3.0, 3.5, 4.0} using a pdg_histograms.m Matlab® 2014 script (Mathworks, Natick, MA, USA). The following probability density functions f(x) were studied:

  1. Lévy distribution:
    f(x) = \mathrm{round}\left( 10^c \, \frac{\exp\left( -\frac{1}{2x} \right)}{\sqrt{2\pi x^3}} \right), \quad x \in \mathbb{N}, \quad \begin{cases} x \in [1, 256], & c \in \{5, 7\}, \\ x \in [1, 85], & c = 3, \end{cases} (19)
  2. Cauchy distribution:
    f(x) = \mathrm{round}\left( 10^c \, \frac{1}{\pi (1 + x^2)} \right), \quad x \in \mathbb{Z}, \quad \begin{cases} x \in [-127, 127], & c = 7, \\ x \in [-44, 44], & c = 3.5, \end{cases} (20)
  3. Gauss distribution:
    f(x) = \mathrm{round}\left( 10^c \, \frac{\exp\left( -\frac{x^2}{2\sigma^2} \right)}{\sigma \sqrt{2\pi}} \right), \quad x \in \mathbb{Z}, \quad \begin{cases} x \in [-4, 4], & c = 4, \; \sigma = 1, \\ x \in [-29, 29], & c = 3, \; \sigma = 10, \\ x \in [-36, 36], & c = 4, \; \sigma = 10, \\ x \in [-64, 64], & c = 10, \; \sigma = 10, \end{cases} (21)
  4. Rayleigh distribution:
    f(x) = \mathrm{round}\left( 10^c \, \frac{x}{b^2} \exp\left( -\frac{x^2}{2b^2} \right) \right), \quad x \in \mathbb{N}, \quad x \in [1, 108], \; c = 10, \; b = 16. (22)

In Figure 1, the Cauchy and Lévy distributions at c = 7 and the Gauss distribution at parameters c = 10 and σ = 10 are depicted.
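For illustration, the discrete histograms of Equations (21) and (22) can be generated as follows (hypothetical helper names, not from the pdg_histograms.m script); the rounded, scaled density values serve as the bin occupation numbers n_j that enter Algorithm 1:

```python
import math

def gauss_counts(c, sigma, xs):
    """Discrete Gauss histogram of Eq. (21)."""
    return [round(10 ** c * math.exp(-x * x / (2.0 * sigma ** 2))
                  / (sigma * math.sqrt(2.0 * math.pi))) for x in xs]

def rayleigh_counts(c, b, xs):
    """Discrete Rayleigh histogram of Eq. (22)."""
    return [round(10 ** c * x / b ** 2 * math.exp(-x * x / (2.0 * b ** 2)))
            for x in xs]

# the Gauss histogram depicted in Figure 1 (c = 10, sigma = 10)
gauss_hist = gauss_counts(c=10, sigma=10, xs=range(-64, 65))
rayleigh_hist = rayleigh_counts(c=10, b=16, xs=range(1, 109))
```

The Gauss histogram is symmetric around x = 0 and the Rayleigh histogram peaks near x = b, which is why the former preserves and the latter breaks the symmetry of the Ωα(l→m)-transformations in Figure 1.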

4.2. Image Processing and Analysis

Image analysis based on the calculation of the Ωα(l→m), Iα, and Pα is demonstrated on five standard grayscale multi-image series (Table 1). All images were processed using the Whole Image mode of the Image Info Extractor Professional software (Institute of Complex Systems, FFPW, USB, Nové Hrady, Czech Republic). A pair of images 5000–5001 of a simulated Belousov–Zhabotinsky (BZ) reaction and a pair of images motion01.512–motion02.512 were recalculated for 40 values of α = {0.1, 0.2, ..., 0.9, 0.99, 1.1, 1.2, ..., 4.0}. The rest of the series were processed for 13 values of α = {0.1, 0.3, 0.5, 0.7, 0.99, 1.3, 1.5, 1.7, 2.0, 2.5, 3.0, 3.5, 4.0}. The transformation at the 13 values of α was followed by clustering of the matrices [Pα, Iα] vs. image order by the k-means method (squared Euclidean distance metric). Due to the high data variance in the BZ simulation, the clustering was preceded by a z-score standardization of the matrices over α. The resulting cluster indices were relabeled to be consecutive (i.e., the first image of the series and the first image of the following group are classified into gr. 1 and 2, respectively, etc.).

5. Conclusions

In this paper, we derived novel variables from the Rényi entropy—a Point Divergence Gain Ωα(l→m), a Point Divergence Gain Entropy Iα, and a Point Divergence Gain Entropy Density Pα. We discussed their theoretical properties and made a brief comparison with the related quantity called the Point Information Gain Γα(i) [2]. Moreover, we have shown that the Ωα(l→m) and the related quantities can find applications in multidimensional data analysis, particularly in video processing. Due to the element-by-element computation, we can characterize time-spatial (4D) changes much more sensitively than by using, e.g., the previously derived Γα(i). The Ωα(l→m) can be considered as a microstate of the information changes in space-time. At the same time, the Ωα(l→m), Iα, and Pα show properties similar to the Γα(i) and its relative macroscopic variables. Owing to their derivation from the Rényi entropy, they are good descriptors of multifractality. Therefore, they can be utilized to characterize patterns in datasets and to classify the (sub)data into groups of similar properties. This has been successfully utilized in the clustering of multi-image sets, image filtration, and image segmentation, namely in microscopic digital imaging.

Acknowledgments

This work was supported by the Ministry of Education, Youth and Sports of the Czech Republic—projects CENAKVA (No. CZ.1.05/2.1.00/01.0024), CENAKVA II (No. LO1205 under the NPU I program), the CENAKVA Centre Development (No. CZ.1.05/2.1.00/19.0380)—and from the European Regional Development Fund in frame of the project Kompetenzzentrum MechanoBiologie (ATCZ133) in the Interreg V-A Austria—Czech Republic programme. J.K. acknowledges the financial support from the Czech Science Foundation Grant No. 17-33812L and the Austrian Science Fund, Grant No. I 3073-N32.

Author Contributions

Renata Rychtáriková is the main author of the text and tested the algorithms; Jan Korbel is responsible for the theoretical part of the article; Petr Macháček is the developer of the Image Info Extractor Professional software; Dalibor Štys is the group leader who derived the point divergence gain. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Rychtáriková R. Clustering of multi-image sets using Rényi information entropy. In: Ortuño F., Rojas I., editors. Bioinformatics and Biomedical Engineering (IWBBIO 2016). Springer; Cham, Switzerland: 2016. pp. 527–536. (Lecture Notes in Computer Science Series).
2. Rychtáriková R., Korbel J., Macháček P., Císař P., Urban J., Štys D. Point information gain and multidimensional data analysis. Entropy. 2016;18:372. doi: 10.3390/e18100372.
3. Jizba P., Kleinert H., Shefaat M. Rényi’s information transfer between financial time series. Phys. A Stat. Mech. Appl. 2012;391:2971–2989. doi: 10.1016/j.physa.2011.12.064.
4. Jizba P., Korbel J. Multifractal diffusion entropy analysis: Optimal bin width of probability histograms. Phys. A Stat. Mech. Appl. 2014;413:438–458. doi: 10.1016/j.physa.2014.07.008.
5. Jizba P., Korbel J. Modeling Financial Time Series: Multifractal Cascades and Rényi Entropy. In: Sanayei A., Zelinka I., Rössler O., editors. ISCS 2013: Interdisciplinary Symposium on Complex Systems. Springer; Berlin/Heidelberg, Germany: 2014. (Emergence, Complexity and Computation Series).
6. Rychtáriková R., Náhlík T., Smaha R., Urban J., Štys D., Jr., Císař P., Štys D. Multifractality in imaging: Application of information entropy for observation of inner dynamics inside of an unlabeled living cell in bright-field microscopy. In: Sanayei A., Zelinka I., Rössler O.E., editors. ISCS14: Interdisciplinary Symposium on Complex Systems. Springer; Berlin/Heidelberg, Germany: 2015. pp. 261–267.
7. Rychtáriková R., Náhlík T., Shi K., Malakhova D., Macháček P., Smaha R., Urban J., Štys D. Super-resolved 3-D imaging of live cells’ organelles from bright-field photon transmission micrographs. Ultramicroscopy. 2017;179:1–14. doi: 10.1016/j.ultramic.2017.03.018.
8. Chevalier C., Bect J., Ginsbourger D., Vazquez E., Picheny V., Richet Y. Fast parallel kriging-based stepwise uncertainty reduction with application to the identification of an excursion set. Technometrics. 2014;56:455–465. doi: 10.1080/00401706.2013.860918.
9. Eidsvik J., Mukerji T., Bhattacharjya D. Value of Information in the Earth Sciences: Integrating Spatial Modeling and Decision Analysis. Cambridge University Press; Cambridge, UK: 2015.
10. Helle K.B., Pebesma E. Optimising sampling designs for the maximum coverage problem of plume detection. Spat. Stat. 2015;13:21–44. doi: 10.1016/j.spasta.2015.03.004.
11. Jizba P., Arimitsu T. The world according to Rényi: Thermodynamics of multifractal systems. Ann. Phys. 2004;312:17–59. doi: 10.1016/j.aop.2004.01.002.
12. Rényi A. On measures of entropy and information. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability; Berkeley, CA, USA, 20 June–30 July 1960. pp. 547–561.
13. Kullback S., Leibler R.A. On information and sufficiency. Ann. Math. Stat. 1951;22:79–86. doi: 10.1214/aoms/1177729694.
14. Csiszár I. I-divergence geometry of probability distributions and minimization problems. Ann. Probab. 1975;3:146–158. doi: 10.1214/aop/1176996454.
15. Harremoes P. Interpretations of Rényi entropies and divergences. Phys. A Stat. Mech. Appl. 2006;365:57–62. doi: 10.1016/j.physa.2006.01.012.
16. Van Erven T., Harremoes P. Rényi divergence and Kullback-Leibler divergence. IEEE Trans. Inf. Theory. 2014;60:3797–3820. doi: 10.1109/TIT.2014.2320500.
17. Van Erven T., Harremoes P. Rényi divergence and majorization. In: Proceedings of the IEEE International Symposium on Information Theory; Austin, TX, USA, 13–18 June 2010. pp. 1335–1339.
18. Shannon C.E. A mathematical theory of communication. ACM SIGMOBILE Mob. Comput. Commun. 2001;5:3–55. doi: 10.1145/584091.584093.
19. Volume 1: Textures. Available online: http://sipi.usc.edu/database/database.php?volume=textures&image=61#top (accessed on 31 January 2018).
20. Štys D., Jizba P., Zhyrova A., Rychtáriková R., Štys K.M., Náhlík T. Multi-state stochastic hotchpotch model gives rise to the observed mesoscopic behaviour in the non-stirred Belousov-Zhabotinsky reaction. arXiv. 2016. Available online: https://arxiv.org/abs/1602.03055 (accessed on 31 January 2018).
21. Štys D., Náhlík T., Zhyrova A., Rychtáriková R., Papáček Š., Císař P. Model of the Belousov-Zhabotinsky reaction. In: Kozubek T., Blaheta R., Šístek J., editors. Proceedings of the International Conference on High Performance Computing in Science and Engineering; Soláň, Czech Republic, 25–28 May 2015. Springer; Cham, Switzerland: 2016. pp. 171–185.
22. Štys D., Štys K.M., Zhyrova A., Rychtáriková R. Optimal noise in the hodgepodge machine simulation of the Belousov-Zhabotinsky reaction. arXiv. 2016. Available online: https://arxiv.org/abs/1606.04363 (accessed on 31 January 2018).
23. Štys D., Náhlík T., Macháček P., Rychtáriková R., Saberioon M. Least Information Loss (LIL) conversion of digital images and lessons learned for scientific image inspection. In: Ortuño F., Rojas I., editors. Proceedings of the International Conference on Bioinformatics and Biomedical Engineering; Granada, Spain, 20–22 April 2016. Springer; Cham, Switzerland: 2016. pp. 527–536.
24. Braat J.J.M., Dirksen P., van Haver S., Janssen A.J.E.M. Extended Nijboer-Zernike (ENZ) Analysis & Aberration Retrieval. Available online: http://www.nijboerzernike.nl (accessed on 12 December 2017).
25. Rychtáriková R., Steiner G., Kramer G., Fischer M.B., Štys D. New insights into information provided by light microscopy: Application to fluorescently labelled tissue section. arXiv. 2017. Available online: https://arxiv.org/abs/1709.03894 (accessed on 31 January 2018).
26. Jolliffe I.T. Principal Component Analysis. 2nd ed. Springer; New York, NY, USA: 2002. (Springer Series in Statistics).
27. Arthur D., Vassilvitskii S. k-means++: The advantages of careful seeding. In: Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA); 7–9 January 2007. pp. 1027–1035.
