Abstract
This study is based on an extension of the concept of joint entropy of two random variables to continuous functions, such as backscattered ultrasound. For two continuous random variables, X and Y, the joint probability density is ordinarily a continuous function of x and y that takes on values in a two-dimensional region of the real plane. However, in the case where both are given by continuously differentiable functions, X and Y are concentrated exclusively on a curve in the plane. This concentration can only be represented using a mathematically “singular” object such as a (Schwartz) distribution. Its use for imaging requires a coarse-graining operation, which is described in this study. Subsequently, removal of the coarse-graining parameter is accomplished using the ergodic theorem. The resulting expression for joint entropy is applied to several data sets, showing the utility of the concept for both materials characterization and detection of targeted liquid nanoparticle ultrasonic contrast agents. In all cases, the sensitivity of these techniques matches or exceeds, sometimes by a factor of two, that demonstrated in previous studies that employed signal energy or alternate entropic quantities.
INTRODUCTION
Previous studies have demonstrated the utility of several different entropies for detection of subresolution backscattering structures in both materials characterization and medical ultrasonics in situations where other approaches fail. These include detection of unresolvable near-surface defects,1, 2, 3, 4, 5 diffuse accumulation of subresolution-sized nanoparticle-based ultrasound contrast agents,6, 7, 8, 9, 10 characterization of smooth muscle pathologies,11, 12, 13 and monitoring of ultrasonically induced temperature rise in tissue.14
In the first of these studies, entropy images were applied to detect defects in advanced aerospace materials.1 In particular, detection and imaging of “near-surface” defects in graphite-epoxy composite plates was accomplished using a 5 MHz center frequency transducer.1, 2, 3, 4, 5 In this application, it is not possible to separate the front wall echo from the defect reflection. This presents a significant challenge to conventional approaches for defect detection and imaging. In fact, the case presented in the following text is nearly impossible to discern using the traditional approach. However, the joint entropy image permits visualization of the defect, its boundary, and certain substructures contained in the defect that were not visible to conventional detection methods or even the previously published entropic images.
The bulk of our investigations have focused on medical imaging. These studies have demonstrated the utility of several different entropies for detection of subresolution structures of backscattering tissue: e.g., targeted accumulation of weakly scattering perfluorocarbon nanoparticles in tissues in vivo, and pathological changes in smooth muscle structure arising from muscular dystrophy. These results have been obtained in a variety of animal disease models and in patients, as shown in Table I. Several features common to these studies deserve special emphasis at the outset:
(a) Studies have been executed using several different imaging systems (Philips HDI 5000, Philips IE33, Vevo 660, Ardent Spark) with transducer arrays covering a frequency range from 2–34 MHz.
(b) In all cases but one,11 the entropic analysis demonstrated greater sensitivity than did signal energy to subresolution features of scattering architecture. In the exceptional case,11 sensitivities were essentially equivalent.
(c) The most recent version of entropy published8, 15 is amenable to real-time implementation.
(d) All results pertaining to detection of targeted nanoparticles were obtained without use of hand-drawn regions of interest. In fact, the entire analysis chain from raw data to image construction to image-based sensitivity assessment was automated and thus objective.
(e) Image subtraction has not been required to detect changes in scattering architecture.
TABLE I.
“Entropic” receiver | Heuristic meaning | Scattering structure | Experimental system |
---|---|---|---|
 | “Information” contained in backscattered RF. | Materials characterization, targeted nanoparticles, smooth muscle cells | Graphite/epoxy laminates (Ref. 5); MDA 435 (Ref. 16), B16; VX2, MDx mice (Ref. 11); precancer models (Ref. 7); humans (Ref. 12) |
 | “Weighted-information” contained in backscattered RF: (note: ). | Targeted nanoparticles, smooth muscle cells | MDA 435, B16; precancer models (Ref. 17) |
 | Limiting form of as . More sensitive than , . | Targeted nanoparticles, smooth muscle cells | MDA 435 (Ref. 8), B16 (Ref. 10); humans; precancer models (Ref. 15) |
All of these points suggest the clinical utility of this approach, which might proceed along lines similar to that employed in Doppler imaging systems, where the conventional B-mode image is color-coded according to the blood cell velocity to present a combined B-mode/velocity image; similarly, a B-mode/entropy image could be made as well.9
All of these investigations required acquisition of thousands of radio frequency (RF) signals, . For each of these captured waveforms, viewed as a sampled version of a differentiable function, the probability density function of values was computed using the digitized values of . Distributions of values (or densities of distributions of values) are typically used for analysis of random variables, and all differentiable functions are, in fact, legitimate random variables. However, in the general case, it is not possible to compute the probability density function of a random variable solely from its values. For example, to determine the probability density function for noise in an amplifier, the noise source must be sampled and the distribution of sampled values subsequently fit to an assumed functional form, e.g., Gaussian or Poisson. In our case, the knowledge that the random variable is differentiable permits direct computation of its probability density function without the need to assume and fit a model.2, 5, 7 While this can be a mathematically and computationally demanding task, previous applications of this approach to analysis of acoustic signals have repeatedly demonstrated its utility. The present study extends these results by developing joint entropy, which we will denote by , to analyze the same type of data. This approach requires computation of the joint probability density of two (differentiable) functions: , representing a segment of backscattered radio frequency ultrasound, and a reference signal, . After carefully deriving a suitable expression for , we show that it provides further improvements in sensitivity over those obtained in previous studies that were based on other entropies.
APPROACH
Our goal is to further improve the ability to detect weak scatterers in strongly scattering environments, as in the case where liquid nanoparticles are targeted to cancer molecular epitopes or in material defects in advanced aerospace materials. Although prior entropy measures have been proposed, we expect the current approach to demonstrate significant enhancement in detection in real-time applications that could be ported to existing imaging systems.
Background
This study is based on an extension of the concept of joint entropy of two random variables to continuous functions, such as backscattered RF, and a reference signal, such as a reflection from a weak reflector. For two continuous random variables, X and Y, with joint probability density function p(x, y), the joint entropy is
$$H(X, Y) = -\iint p(x, y)\,\log p(x, y)\,dx\,dy \tag{1}$$
Ordinarily the joint density is a continuous function of x and y that takes non-zero values in a two-dimensional region of the real plane. However, in the case where both are given by continuously differentiable functions, the random variables X and Y are concentrated exclusively on a curve in the plane, as shown in Fig. 1. Thus the densities of their values cannot be represented using conventional functions as was done in previous studies of only one differentiable random variable.1, 2, 3, 4, 5, 7 Instead, this concentration can only be represented using a mathematically “singular” object like a series of Dirac delta functions placed on the curve with infinitesimal spaces in between them. Technically, such an object is not a function; in precise terms, the joint density must be represented as a Schwartz distribution.16 Its use for imaging requires that the Schwartz distribution be coarse-grained, i.e., integrated, on a fine grid of small cells as shown in Fig. 2. The outcome of this coarse-graining, as shown in Appendix B, is the conversion of the probability density (which is a Schwartz distribution) into a bona fide probability distribution function such that the probability of X and Y being in a particular cell, which is the generalization of the joint density that we seek, is given by the time that the curve “spends” in the cell [Eq. B28 in the following text]. This is what we would expect intuitively if we imagined a test particle traversing the curve, , with velocity , and wanted to know the probability of finding it in a particular cell. Having this result, it is possible to extract a well-defined finite component from the coarse-grained version of Eq. 1, which we will denote by , in terms of and .
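The coarse-graining idea, that the probability of a cell is the fraction of time the curve spends in it, can be illustrated numerically. The following is a minimal sketch and not the authors' implementation: the two test waveforms, the grid covering [-1, 1] × [-1, 1], the unit time interval, and the function names `cell_occupancy` and `coarse_grained_entropy` are all assumptions made for illustration, and the occupancy time is approximated by dense uniform sampling in time rather than by exact entry and exit times.

```python
import numpy as np

def cell_occupancy(f, g, cells_per_side=32, n_samples=200_000):
    """Approximate the fraction of time the curve (f(t), g(t)) spends in each cell.

    Assumes t runs over [0, 1] and that both functions take values inside (-1, 1).
    """
    t = (np.arange(n_samples) + 0.5) / n_samples      # uniform time samples
    edges = np.linspace(-1.0, 1.0, cells_per_side + 1)
    counts, _, _ = np.histogram2d(f(t), g(t), bins=[edges, edges])
    return counts / n_samples                          # time fraction per cell

def coarse_grained_entropy(p):
    """Discrete entropy -sum p log p over the occupied cells."""
    occupied = p[p > 0]
    return float(-(occupied * np.log(occupied)).sum())

# Hypothetical smooth test waveforms (not taken from the study).
f = lambda t: 0.9 * np.sin(2 * np.pi * 3 * t)
g = lambda t: 0.9 * np.cos(2 * np.pi * 5 * t)

p = cell_occupancy(f, g)
h = coarse_grained_entropy(p)
print("occupied cells:", int((p > 0).sum()), "of", p.size)
print("coarse-grained entropy:", h)
```

Only the cells crossed by the curve receive non-zero probability, which is the discrete counterpart of the density being concentrated on a curve. The value of the coarse-grained entropy depends on the cell size, which is why the text goes on to extract a finite component that survives the limit of vanishing cell size.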
As with the entropies previously investigated, is applied in a moving window analysis. However, its computation requires two input waveforms. The first, , is exactly the same as in previous investigations. However, choice of the second reference waveform, , constitutes an additional degree of freedom that may be exploited to increase sensitivity of the entropic analysis. Discovery of strategies for identification of the optimum reference waveform is one of the eventual goals of our research. In this study, we present the results of preliminary investigations that indicate a reflection of the imaging system's interrogating pulse from a weak reflector, such as a water-agarose interface, constitutes a good initial guess for the optimum. To summarize our findings:
(a) Joint entropy analysis may be applied in the same situations as the previous entropies. In fact, all of the data for this study have been analyzed previously with the use of at least one of these entropies. In this study, we repeat several of these analyses using joint entropy images.
(b) Joint entropy analysis is amenable to real-time implementation, as was the most recent previously published entropy analysis.
(c) In all cases studied so far, joint entropy analysis has proven to be more sensitive than the best previous results obtained by other entropies. For example, preliminary application of this new entropy to the MDA 435 data shown in Figs. 7 and 8 shows a roughly twofold increase in peak confidence ratios.
Conventions and calculation
To fix terminology, suppose that we are given two functions, and , assumed to map into (i.e., , , , are all strictly less than one). We are interested in computing the joint entropy,
(2) |
where is their joint density, defined on the domain . It turns out that for the types of functions and that we will consider (intuitively thought of as “well-behaved”), is not a real-valued function but a (Schwartz) distribution so that Eq. 2 is not well-defined.
To get around this difficulty, we employ a “coarse-grained” approximation to Eq. 2, obtained by dividing Ω into cells, where as shown in Fig. 2, and integrating over each cell. We denote the cells intersecting at initial time by and index them by (M will be discussed later). For subsequent discussion, we define the distance traversed along when it enters the cell by . Figure 9 illustrates our terminology. The final step in the coarse-graining process is replacement of by the quantity defined by
(3) |
where unless passes through the cell, in which case (as mentioned in the preceding text and as we shall show in the following text), is equal to the time spends in the cell.
Our ultimate goal is to obtain an expression for in terms of and because these quantities are experimentally accessible. As is carefully shown in Appendixes A–C, the following relation is true:
(4) |
It is a remarkable fact that derivation of Eq. 4, which uses the ergodic theorem, also requires the fact that the set of points for which is rational comprises a set of measure zero (in the set of real numbers). Thus the relation is essentially derived in a noncomputable setting. Nevertheless, the end result, Eq. 4, may be evaluated numerically. The full details are provided in the appendixes.
RESULTS
Noise simulation
The effect of noise on the input waveform is shown in Fig. 3. To produce this plot, Gaussian distributed noise of different amplitude was added to and (simulations using uniformly distributed noise have also been performed, yielding essentially similar results). The signal-to-noise ratios span 20 to 120 dB. At each noise level, 1000 waveforms and reference waveforms were created, the joint entropy was computed for each pair, and the average and standard error were computed. Each is shown in the plot with its standard error bar, which in all but one case is too small to see. We notice that for signal-to-noise ratios above 80 dB, it is possible to obtain three-digit agreement with the noise-free answer [which was obtained using a separate Mathematica implementation of Eq. 4 that was also used to double-check the correctness of our C programming language implementation of the same equation]. However, for signal-to-noise ratios between 40 and 60 dB, which is the relevant range for our experimental data, agreement to at most one or two digits is achievable. This limitation shows that the numerical precision of our composite integrator is more than enough for analysis of experimental data, which always contain noise.
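The structure of such a noise study can be sketched as follows. This is an illustrative outline under stated assumptions, not the code used in the study: the signal-to-noise ratio is taken to be a power ratio expressed in dB, the clean waveform is a hypothetical windowed sinusoid, and `statistic` is a placeholder (signal energy) standing in for the joint entropy of Eq. 4, which is not reproduced here.

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise at the requested SNR (power ratio in dB, an assumption)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)

def mean_and_standard_error(values):
    """Sample mean and standard error of the mean."""
    values = np.asarray(values, dtype=float)
    return values.mean(), values.std(ddof=1) / np.sqrt(values.size)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 512)
# Hypothetical clean waveform; the study's waveforms are not available here.
clean = 0.5 * np.sin(2 * np.pi * 5 * t) * np.exp(-((t - 0.5) ** 2) / 0.02)

# Placeholder statistic standing in for the joint entropy computation.
statistic = lambda w: float(np.sum(w ** 2))

results = {}
for snr_db in (20, 40, 60, 80, 100, 120):
    trials = [statistic(add_gaussian_noise(clean, snr_db, rng)) for _ in range(1000)]
    results[snr_db] = mean_and_standard_error(trials)

for snr_db, (mean, sem) in results.items():
    print(f"{snr_db:4d} dB  mean={mean:.6f}  standard error={sem:.2e}")
```

Replacing `statistic` with an implementation of the joint entropy would give the kind of mean and standard-error curve described for Fig. 3; noise would also have to be added to the reference waveform, which this sketch omits.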
Materials characterization: Defect detection
The first application of entropy imaging was to materials characterization;17 in particular, to detection and sizing of near-surface defects in graphite/epoxy aerospace structures.1, 2, 3, 4, 5 Figure 4 shows some previously published (top row)5 and newer results (middle and bottom row) for imaging of a near-surface “resin-rich” defect in a thin (4 mm thick) graphite/epoxy composite. The bottom left corner shows a schematic of the specimen, which was fabricated by removing a 3.81 cm diameter circular section from the second ply prior to fabrication. The specimen was then autoclaved following standard manufacturing practice. This forced excess epoxy into the circular void and created the resin-rich region. During the fabrication procedure, additional resin penetrated the inter-laminar region bordering the circular void. The result is an oval shaped defect centered on the original circular cutout.
This specimen was scanned on a 101 × 101 point grid using a 5 MHz broadband transducer with the backscattered RF acquired at 8-bit digital resolution and a 100 MHz sampling rate. The interrogating pulse was therefore far too wide to permit gating of the front-wall reflection from that of the defect, which lay within one ply (0.25 mm) of the surface. While the top row results5 were originally published as part of a larger study of six different types of near-surface defects, results in the resin-rich case were less conclusive than hoped for because, as the top row shows, the best understood images failed to detect the boundaries of the defect region. While other thermodynamic receivers used in that study5 did delineate the defect boundaries, their physical meaning was not readily apparent.
The middle and bottom rows of Fig. 4 show that this situation has been rectified by the newer signal processing algorithms described in the preceding text. One reason for the improvement is the use of optimal smoothing splines, which are used for global noise suppression.18 Additionally, only a gated portion of the backscattered RF (128 points long for this figure) was used. Gating enables exclusion of baseline regions that contain no useful data. While and images show the defect with relatively good contrast, the boundary between defect and surrounding plate “fades out” at the “3 o'clock” position. This can be significant, for instance, if the images are to be used as input for automatic feature extraction software. On the other hand, at all points on its perimeter, the image has greater contrast between defect boundary and surrounding “defect-free” plate, matching the clarity of the less understood images shown in earlier work (e.g., those labeled Z, , , in Fig. 6 of that work5). We also observe a circular outline within the defect region. This corresponds to the circular region removed from the second ply to produce the resin-rich region during the curing process used to fabricate the graphite/epoxy laminate. During that process, resin accumulated not only in the circular region but was also pushed out around its boundaries to create the oblong appearance of the defect. We note that the outline of the original circle is less evident in and images and is not visible in any of the images from the previous study.5 The reference was a reflection from a stainless steel reflector placed at the 4-in. focal plane of the transducer. Because the experimental system used for data acquisition has been verified to be stable to at least the −40 decibel level, this reference was acquired once, prior to scanning of the entire collection of graphite/epoxy specimens.
We observe that this is equivalent to the use of a reference acquired from the weak reflector that we use in biological studies. The operational criterion for choice of reflector is that it roughly matches the reflection coefficient of the scatterers in the specimen trace .
Medical imaging: Targeted tumor imaging
Previously, we reported on application of entropic quantities and for detection of subtle changes in RF backscatter induced by accumulation of targeted nanoscale contrast enhancers in tumor neovasculature6, 8 or pre-cancerous tissue.7 We have used the same raw RF from these studies to prepare images, which were then analyzed for evidence of nanoparticle accumulation. As the materials and methods for these studies are described in previous publications,8 we provide only a brief summary of data acquisition and analysis in each case and then present a comparison of the and analyses.
MDA 435 tumors implanted in athymic nude mice
Human MDA 435 cancer cells were implanted in the inguinal fat pads of 15 athymic nude mice between 19 and 20 days prior to acquisition of data. Five of these animals were injected with -targeted nanoparticles, five were injected with nontargeted nanoparticles, and five were injected with saline at a whole body dose of 1 ml/kg. In addition, 15 athymic nude mice not implanted with tumors were imaged in the same region following the same imaging protocol: five were injected with -targeted nanoparticles, five were injected with nontargeted nanoparticles, and five were injected with saline.
RF data were acquired with a research ultrasound system (Vevo 660, Visualsonics, Toronto, Canada) at 0 through 60 min in 5-min intervals after injection. The tumor was imaged with a 35-MHz center frequency single element “wobbler” probe, and the digitized RF data corresponding to single frames were stored on a hard disk for later off-line analysis. The frames consisted of 384 lines of 2048 12-bit words acquired at a sampling rate of 200 MHz using a Gage 12400 digitizer card (connected to the analog-out and sync ports of the Vevo) in a controller PC. Each frame corresponds spatially to a region 1.5 cm wide and 0.8 cm deep. Further description of materials and methods may be found in previous publications.8 The reference trace was obtained by digitizing a reflection of the imaging system's interrogating pulse from a water-agarose interface using the same acquisition parameters. As for the materials study, the long-term stability of the system electronics has been verified better than the 40 dB level. Consequently, one reference trace was acquired prior to beginning the study and used for all subsequent analysis.
A moving window analysis was performed on each waveform by moving a rectangular window (128 points long, 0.64 μs) in 0.08 μs steps (16 points), resulting in 121 window positions within the output data set. A smoothing spline was fit to each window. Issues surrounding signal-to-noise ratio (SNR), SNR estimation, and robust noise suppression for these data have been discussed in a previous publication.8 The fitting routine also returned an array of first and second derivatives at the locations of any critical points in the window. The arrays were used to compute (or ). This produced an image for each time point in the experiment (i.e., 0, 5,…, 60 min).
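The window bookkeeping quoted above can be checked with a short sketch. The window length, step, line length, and sampling rate come from the text; the function names and the placeholder `pixel_value` (signal energy) are assumptions, and the smoothing-spline fit and entropy computation applied to each window in the study are not reproduced.

```python
import numpy as np

SAMPLING_RATE_HZ = 200e6   # from the text
LINE_LENGTH = 2048         # samples per RF line, from the text

def window_starts(n_samples, window_len=128, step=16):
    """Start indices of every full-length window along one RF line."""
    return np.arange(0, n_samples - window_len + 1, step)

def moving_window_map(rf_line, pixel_value, window_len=128, step=16):
    """Apply a per-window statistic at each window position along one RF line."""
    starts = window_starts(rf_line.size, window_len, step)
    return np.array([pixel_value(rf_line[s:s + window_len]) for s in starts])

starts = window_starts(LINE_LENGTH)
window_duration_us = 128 / SAMPLING_RATE_HZ * 1e6
step_duration_us = 16 / SAMPLING_RATE_HZ * 1e6

# Placeholder statistic (signal energy) on a synthetic line, for illustration only.
rng = np.random.default_rng(0)
column = moving_window_map(rng.normal(size=LINE_LENGTH), lambda w: float(np.sum(w ** 2)))

print(len(starts), window_duration_us, step_duration_us, column.shape)
```

With 2048 samples per line, a 128-point window, and a 16-point step, the count is (2048 − 128)/16 + 1 = 121 positions, in agreement with the text, and each line therefore contributes a column of 121 pixel values to the output image.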
The segment of RF within the boxcar, or moving window, is processed to produce a pixel value for an (or ) image. For each mouse used in this study, this is done using RF data acquired at 0, 5,…, 60 min post-injection to produce an image at each time point. For this study, in which the same portion of the anatomy was imaged at successive intervals, our goal was to identify and quantify the accumulation of targeted nanoparticles, which occurs preferentially at targeting sites. This suggests that segmenting the image into “targeted” and “non-targeted” regions will be required as part of the analysis.
One of the chief goals of our research has been to develop objective segmentation algorithms that do not require user selection (e.g., hand-drawn regions of interest). Figure 5 displays the steps of such an algorithm graphically. For each mouse used in this study, a histogram of the image pixel values appearing over the entire time course (i.e., 0, 5, 10,…, 60 min) was constructed and normalized to obtain the probability density function (PDF) of these values and then integrated to obtain the cumulative distribution function (CDF). This is shown in the top panel of the figure. Next, pixel values corresponding to “analysis-thresholds” at the lower 2%, 4%, …, 98% of the CDF were used to segment the images at each time point into two regions corresponding to targeted and nontargeted tissue. The figure shows the segmentation for an example analysis-threshold of 44%. The blue lines, shown in the second panel of the figure, indicate the boundary between targeted (inside the blue boundaries) and nontargeted (outside the blue boundaries) regions. Subsequently, the mean value of pixels in the targeted region was computed as a function of time post-injection. This is indicated in the bottom panel of the figure, in which the mean value of found at each time, denoted in the figure by , i = 1,…, 12, is shown immediately below the thresholded image. Subsequently, the change in these mean values,
(5) |
is computed (). The analysis diagrammed in Fig. 5 was performed for all animals in all groups and the results averaged by group. We will drop the subscript i in the remaining discussion and refer only to the .
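The segmentation steps described above can be sketched as follows. This is a hedged illustration rather than the authors' algorithm: `np.percentile` stands in for the histogram-based CDF, the "targeted" region is assumed to consist of pixels at or below the cutoff, the change in mean value is assumed to be taken relative to the first time point, and the image stack is synthetic.

```python
import numpy as np

def cdf_cutoff(pixels_all_times, percent):
    """Pixel value below which `percent` of all pixels (over all time points) fall."""
    return float(np.percentile(pixels_all_times, percent))

def region_means(image_stack, percent, targeted_is_below=True):
    """Mean pixel value inside the thresholded region at each time point.

    Which side of the cutoff counts as "targeted" is an assumption of this sketch.
    """
    cutoff = cdf_cutoff(image_stack, percent)
    means = []
    for image in image_stack:
        mask = image <= cutoff if targeted_is_below else image > cutoff
        means.append(image[mask].mean() if mask.any() else np.nan)
    return cutoff, np.array(means)

# Synthetic stand-in for entropy images at successive time points.
rng = np.random.default_rng(1)
stack = rng.normal(size=(5, 64, 64))

cutoff, means = region_means(stack, percent=44)
change = means - means[0]   # change relative to the first time point (assumed form)
print(cutoff, change)
```

Because a single cutoff is derived from the pooled pixel values of the whole time course, the same pixel-value criterion is applied at every time point, and no hand-drawn region of interest enters the analysis.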
, corresponding to a 98% analysis-threshold is plotted in Fig. 6, in panel number one. For the purposes of detection, it is actually the ratio of the mean value to the standard error that is significant. We have plotted an example of these ratios (for the MDA 435 implanted group injected with -targeted nanoparticles) in panel two of Fig. 6. To shorten subsequent discussion of our results, we will define this ratio (confidence) as19
$$c = \frac{\text{mean value}}{\text{standard error}} \tag{6}$$
As this panel shows, for an analysis-threshold of 98%, the mean value of is at all times at least four standard deviations from zero and peaks at 16 standard deviations from the mean at zero time. We have mapped each confidence value to a color scale shown along the x axis of the plot. Panel three of the figure shows the aggregate of all color maps resulting from analysis thresholds between 0% and 98%. These aggregated color maps, or confidence panels, may be used to quickly identify analysis thresholds at which accumulation of targeted nanoparticles is successfully detected by either or imaging and to quantify the sensitivity of each image type.
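The confidence ratio, the group mean divided by its standard error, can be computed as in the following minimal sketch. The sample standard deviation (ddof=1) and the five hypothetical per-animal values are assumptions; the text does not state which estimator was used.

```python
import numpy as np

def confidence_ratio(changes_by_animal):
    """Group mean divided by the standard error of the mean, per time point.

    Rows index animals; columns index time points.
    """
    x = np.asarray(changes_by_animal, dtype=float)
    mean = x.mean(axis=0)
    standard_error = x.std(axis=0, ddof=1) / np.sqrt(x.shape[0])
    return mean / standard_error

# Hypothetical changes in mean pixel value for five animals at one time point.
changes = np.array([[-1.0], [-1.2], [-0.8], [-1.1], [-0.9]])
c = confidence_ratio(changes)
print(c)
```

A ratio of large magnitude means that the group mean is many standard errors away from zero, which is the sense in which the confidence panels are read.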
An inter-receiver comparison is shown in Fig. 7, which is a five-dimensional presentation having three dimensions within each confidence panel (i.e., post-injection time in the vertical direction, analysis-threshold in the horizontal direction, and confidence c in the out-of-plane or color direction), another dimension for animal group (the vertical direction within the array), and a fifth dimension being the masking level for the absolute value of confidence ratio c (the horizontal direction within the array). In the left column are confidence panels for all six groups used in our study as indicated in the figure caption. In the right-hand column are the corresponding confidence panels made using the same RF data and smoothing spline processing parameters.
In spite of the high dimensionality of the display, rapid comparison of or processing is possible, as well as inter-group comparison. For instance, in both columns of the top row, we observe “extensive” (meaning contiguous, wider than three columns over at least half the height of the confidence panel) regions, corresponding to confidence values greater than four in magnitude. The remaining confidence panels (for the control groups), with one exception, have much lower confidence value magnitudes in the same range of analysis-thresholds. Comparison of the confidence panels in the left column () shows that the top confidence panel, corresponding to the MDA 435-implanted group injected with -targeted nanoparticles, generally exhibits confidence ratios with the greatest magnitude. There is an anomalous region of high confidence ratio for the saline group C appearing at 30 min post-injection. Because no nanoparticles were injected into this group, it might be classified as a “false-positive” for the technique. Further research, aimed at understanding and eliminating this effect, e.g., by using different reference traces , is underway. However, we remark that this region is contained in a range of analysis thresholds that does not overlap the range of analysis thresholds for which the confidence ratio of the targeted MDA 435-implanted group (row A) is large. Consequently, it is possible to define a strategy for separating the groups that eliminates false positives. Moreover, for the top row, the images produce confidence values that are more than twice those obtained using images. Consequently, the criterion “,” with analysis threshold between 60% and 98% for analysis (44%–62% for analysis), becomes a means of distinguishing the targeted/tumor-implanted group from all others. We observe also that this criterion reveals the presence of targeted nanoparticles accumulating within five minutes after injection.
While the figure shows that we may distinguish the targeted tumor-implanted group from the control groups, the issue of single animal “diagnosis” remains open due to the small cohort of animals studied. The current study is a demonstration of feasibility and suggests the utility of conducting a larger experiment with a proper receiver operating characteristic (ROC) curve analysis that would permit quantification of true positives, true negatives, false positives, and false negatives.
Transgenic K14-HPV16 mouse tumor model
The model used is the transgenic K14-HPV16 mouse that contains human papilloma virus-16 oncoproteins driven by a keratin promoter so that lesions develop in the skin, particularly in the ear. Typically the ears exhibit squamous metaplasia, a pre-cancerous condition, associated with abundant neovasculature that expresses the integrin. Eight transgenic mice20, 21 were treated with 1.0 mg/kg i.v. of either -targeted nanoparticles (n = 4) or nontargeted nanoparticles (n = 4) and imaged dynamically for 1 h using a research ultrasound imager (Vevo 660 40 MHz probe) modified to store digitized RF waveforms acquired at 0, 15, 30, and 60 min. time points. Further details may be found in reference 8. All RF data were processed in the same manner as the MDA 435 data.
The results are shown in Fig. 8, which displays the three confidence panels obtained using images on the left and images on the right. Broad, contiguous regions of large confidence ratio magnitude occur only in the right side of the panels, corresponding to analysis thresholds of 78% to 98%. Consequently, we have emphasized these portions by enclosing them with dotted lines while placing a semi-transparent layer over the complementary region (which has been displayed in this manner for completeness). Focusing attention on the 78% to 98% analysis thresholds, we see that analysis exhibits twice the sensitivity of analysis for the targeted group. Moreover, analysis, at these analysis thresholds, exhibits greater separation between the targeted group and the saline control group than does analysis. While separation between the targeted group and the non-targeted control is greater for analysis, the panels show that analysis still yields several standard deviations of separation between the two groups.
DISCUSSION AND CONCLUSIONS
In all cases, the sensitivity afforded by analysis of joint entropy images exceeds that obtained in our best previous analyses of the same data by roughly a factor of 2. The main difference between the previous approach and the current one is employment of a reference trace. The results presented in this study employed a reference, water-path only reflection from a weak reflector, which is the natural choice in many conventional ultrasonic analyses that deconvolve the transfer function of the experimental apparatus from the raw experimental data. In our case, the rationale for application of this reference is based on the relation between entropy, H(A), joint entropy, H(A, B), and conditional entropy of B given A, H(B|A), for discrete random variables A and B, which is
$$H(A, B) = H(A) + H(B \mid A) \tag{7}$$
If A is taken as a reference variable and we compute H(A, B) − H(A, B′), where B′ is another random variable, then H(A, B) − H(A, B′) = H(B|A) − H(B′|A). Because analysis of entropy images described in the preceding text is based on differences between joint entropies, the last equation shows that they can be reinterpreted as differences between conditional entropies, i.e., heuristically at least, we are looking at the differences in information contained in backscattered RF (e.g., either B or B′) given that the tissue was interrogated with a certain reference pulse (A).
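The relation between joint and conditional entropy for discrete random variables, and the cancellation of the reference entropy in differences, can be verified numerically. The two joint probability tables below are hypothetical and share the same marginal distribution for the reference variable A; the second variable is called B′ (`joint_ab_prime`) purely for illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a discrete distribution given as an array."""
    p = np.asarray(p, dtype=float).ravel()
    nonzero = p[p > 0]
    return float(-(nonzero * np.log2(nonzero)).sum())

def conditional_entropy(joint):
    """H(B|A) = sum_a p(a) H(B | A = a); rows index A, columns index B."""
    joint = np.asarray(joint, dtype=float)
    p_a = joint.sum(axis=1)
    return float(sum(pa * entropy(row / pa) for pa, row in zip(p_a, joint) if pa > 0))

# Hypothetical joint tables with the same marginal for A, namely (0.5, 0.5).
joint_ab = np.array([[0.30, 0.20], [0.10, 0.40]])
joint_ab_prime = np.array([[0.25, 0.25], [0.45, 0.05]])

h_a = entropy(joint_ab.sum(axis=1))
chain_rule_gap = entropy(joint_ab) - (h_a + conditional_entropy(joint_ab))
joint_difference = entropy(joint_ab) - entropy(joint_ab_prime)
conditional_difference = conditional_entropy(joint_ab) - conditional_entropy(joint_ab_prime)

print(chain_rule_gap, joint_difference, conditional_difference)
```

Because H(A) is common to both joint entropies, it cancels in the difference, so the difference of joint entropies equals the difference of conditional entropies given the reference.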
The reference trace is also a “knob” that we can “turn” to adjust the sensitivity of the image to specific attributes in the reference waveform. In our heuristic interpretation, we are examining the extent of differences between the information in the reference trace and the backscatter. In other words, we are estimating how much the information content of the interrogating ultrasonic pulse has been changed by the scattering events occurring in the tissue.
In spite of the fact that our choice of reference appears natural, it is conceivable that in other circumstances, a different choice of reference waveform might be more compelling. Thus the choice of reference is to some extent arbitrary. Our response to this fact has been to post-process the resulting images using all analysis thresholds and to determine if there exists a rational set of criteria for differentiating the targeted from control groups. The results of the present study, as well as previous studies,8, 15 advance one strategy for doing so. The investigation of other strategies for choice of an optimum reference given specific detection requirements is the subject of ongoing research.
ACKNOWLEDGMENTS
This study was funded by NIH EB002168, HL042950, and CO-27031 and NSF DMS 0966845. The research was carried out at the Washington University Department of Mathematics and the School of Medicine.
APPENDIX A: CALCULATION OVERVIEW
In this subsection, we derive from Eq. 2 an expression for in terms of the experimentally accessible quantities and . To do this, we claim the following assertions, which will be established in subsequent sections:
-
(a)
Claim i: if enters the cell at time and leaves it at . Otherwise, . This is Eq. B28 in the following text. Consequently, evaluation of Eq. 3 comes down to being able to compute the limiting behavior of as . We will see presently that what we really need to compute is the limiting behavior of .
This leads us to our second assertion:
(b) Claim ii: there exists an expression for the limiting behavior of in terms of and , assuming that is irrational. This is essentially Eq. A9 in the following text. Subsequently, we derive the expression for in terms of the experimentally accessible quantities and . The next two subsections will justify claims i and ii. At this point, we observe from Fig. 9 that the distance traversed, , in any member of is given by
(A1) |
The first term in the sum has the limiting behavior
(A3) |
as . It remains to evaluate the sum
(A4) |
However, we do not know the coordinates of the entry (or exit) points of into the cell, and thus it appears that we lack the necessary information to calculate either or . Nevertheless, if we assume that the portion of the curve within each cell is a straight line segment that starts at the point and has slope
(A5) |
so that it leaves the cell near the point , we may then continue the limiting process by sub-partitioning each cell further into cells as shown in Fig. 10. Given this subdivision, each term in the sum, Eq. A4, must now be replaced by the sum
(A6) |
where the n indexes the cells crossed by as it traverses the larger cell. Thus the sum appearing in Eq. A4 is replaced by a double sum
(A7) |
We further assume that the velocity, , is constant over the entire square as we further subdivide it. Thus the inner sum in Eq. A6 becomes
(A8) |
While it remains true that we still do not know where will enter each of these smaller cells, we have nevertheless made progress because if we imagine stacking all of the cells crossed by on top of each other, as shown in Fig. 11, we will observe that, in almost every case (we will make this statement precise presently), the entry/exit points will be uniformly distributed around the perimeter of the smaller cell. Given this picture, it is relatively easy to see that may be obtained by summing the transit lengths of over the smaller “stacked” cell and that in the limit where these crossings become infinite (i.e., ), the sum may be obtained as the integral of the transit lengths, starting along the boundary of the stacked cell.
The “almost every case” in the preceding text occurs when the ratio is irrational. The reader might anticipate this fact by first considering the case where the ratio is rational (and the lengths of the sides of the cell are rational, which is the case with our conventions). It is then easy to see that a small set of entry/exit points will be used over and over again, i.e., the trajectory cell crossings will exhibit periodic behavior. It is also relatively easy to see that the number of these entry/exit points grows as the decimal expansion of the ratio grows in length. Consequently, we might anticipate that for irrational , the number of these points is infinite. The ergodic theorem tells us that this is, in fact, true and that they are also uniformly distributed around the perimeter of the cell. Thus the entry point into the original (larger) cell becomes irrelevant because as the cells are made smaller, all entry values occur with equal probability. Actually, while application of the ergodic theorem eliminates the need to know where enters a cell, it also introduces a cell-dependent scaling factor into our expression for . Consequently, we have traded one missing piece of information for another. However, these scaling factors group together in the course of the calculation and can be eliminated from the final expression for [see Eq. C18]. Moreover, if , are continuously differentiable over their range, then the set of points at which is rational forms a set of measure zero, which may be ignored in any integral, so that it is only the cases where is irrational that matter. We provide the precise details in a subsequent section; the end result is:
(A9) |
Summing these over i, we obtain
(A10) |
The last term sums to because the time required to traverse the curve is one. The remaining terms become integrals in the limit where . Collecting all terms
(A11) |
as . Because the form of the singularity in this limit is independent of and , it makes sense to define
(A12) |
We observe that this expression is completely symmetric in and as expected.
Moreover, because
is equal to
(A13) |
Eq. A11 becomes
(A14) |
Before proceeding to the claims, we point out that in this study, we consider only the case where self-intersections of are isolated as shown in Figs. 1, 12, and 14. Inspection of our experimental data shows that this assumption is valid in all but a negligible number of cases, which have been “masked” as described in the following text and excluded from analysis. In the case of complete overlap, where , the joint entropy reduces effectively to the entropy published previously.6, 7 An intermediate case, with total overlap for part of and isolated self-intersections on other portions, can be handled by breaking up the integrals we will obtain into separate portions.
APPENDIX B: JUSTIFICATION OF CLAIM i
We begin by establishing the joint density as the (Schwartz) distribution obtained from the cumulative joint density function, , for and . While is a conventional function of x and y, is not, so a coarse-graining operation on using test functions must be defined. This leads to the main result of this section, Eq. B28. Although this permits computation of the coarse-grained joint entropy, we must also verify consistency of this approach with that used to obtain previous entropies, e.g., , , and , which were based on the density function, for a single function .6, 7, 8, 12, 22 This is established by showing that the marginal density functions obtained from are just the single function densities employed previously. The same results are established for the coarse-grained marginal probabilities.
Calculating using Schwartz distributions
Typically, is defined as the mixed second partial derivative (with respect to x and y) of the joint cumulative density function . However, for the types of random variables we consider (assumed to be infinitely differentiable), the joint density requires further discussion because, unlike and ,7, 12, 22 it is not a regular function (it is defined in the following text as a distribution in the sense of L. Schwartz). So we must start with the joint cumulative density function, , for , , an example of which is shown in Fig. 1. This is a regular function defined by
(B1) |
where denotes the measure (“length”) of the set. The right side of Fig. 1 has an example calculation of directly from Eq. B1.
On the left side of the figure, an alternative (and eventually more useful) way of looking at is shown. The trajectory defined by
(B2) |
parameterized by time running from , is shown. Also shown is a red box having lower left corner at the point and its upper right corner at . The portions of the trajectory within this rectangular region are colored red, and the entry and exit times of the red portions are labeled to show their correspondence with times in the right-hand side of the figure. The sum of these times is , so that measures the time that the trajectory spends in the red box.
The figure also suggests that the derivative of will exhibit jump discontinuities for certain values of x and y as the number of segments of contained in the red rectangle changes discontinuously, for instance, as the height of the red rectangle is adjusted so that its top moves from below to just above the point labeled . This result may also be anticipated by formally differentiating the integral representation of given in Eq. B4. Consequently, careful definition of will require the use of Schwartz distributions.
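The “time spent in the box” reading of the cumulative function can be sketched numerically. The example below is a minimal illustration, assuming two hypothetical smooth waveforms `f` and `g` on the unit time interval and a box that is unbounded toward the lower left; it estimates the fraction of time for which both waveforms lie at or below the given levels by sampling time uniformly.

```python
import numpy as np

def joint_cdf(f, g, x, y, n_samples=200_000):
    """Estimate the fraction of t in [0, 1] with f(t) <= x and g(t) <= y."""
    t = (np.arange(n_samples) + 0.5) / n_samples   # midpoints of a uniform time grid
    inside = (f(t) <= x) & (g(t) <= y)             # samples lying inside the box
    return float(inside.mean())

# Hypothetical waveforms, chosen only for illustration.
f = lambda t: np.sin(2 * np.pi * t)
g = lambda t: np.cos(2 * np.pi * t)

F_all = joint_cdf(f, g, 1.0, 1.0)       # box contains the whole curve
F_half = joint_cdf(f, g, 0.0, 1.0)      # sin(2*pi*t) <= 0 on half of the interval
F_quarter = joint_cdf(f, g, 0.0, 0.0)   # both conditions hold for t in [0.5, 0.75]
```

For these particular waveforms the three values can be read off directly from the time intervals on which the conditions hold, which makes the sketch easy to verify by hand.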
If is the Heaviside function
(B3) |
then
(B4) |
We note that the joint density appears only in expressions like
(B5) |
While is not a regular “point” function, we recognize that it can be defined as a Schwartz distribution. Thus we assume that we may freely interchange orders of integration and that if [recall ], we obtain
(B6) |
(B7) |
(B8) |
(B9) |
(B10) |
where the boundary terms vanished because . Using
(B11) |
with
(B12) |
Eq. B10 becomes
(B13) |
Integrating by parts again leads to
(B14) |
(B15) |
where once again the boundary terms vanish because . Using Eqs. B11, B12, Eq. B15 becomes
(B16) |
Combining this result with Eq. B5, we see that, as a Schwartz distribution, obeys the relation
(B17) |
This generalizes the expression for the density function of a single (differentiable) random variable
(B18) |
which we have employed in the definition of as described in previous studies.6, 7, 8, 12, 22
Where “lives” in terms of and
Equation B16 can be visualized using the picture shown on the right-hand panel of Fig. 12. The fact that is represented by a Schwartz distribution means that the “probability” is effectively concentrated on the curve . Thus the current situation differs significantly from typical applications of jointly distributed random variables, which are usually distributed continuously over two-dimensional regions of the real plane.
Marginal distributions
Consistency of the current approach with previous studies requires that the marginal density functions for obtained using Eq. B17 be identical to the single (differentiable) random variable density functions of Eq. B18. To check this, we let and compute
(B19) |
But also
(B20) |
(B21) |
and similarly
(B22) |
as required.
Choices of for coarse-graining
The integral equations, Eqs. B17, B18, defining and are all we need to compute the coarse-grained joint probabilities, , and the associated coarse-grained marginal probabilities, and , which we will need as intermediate quantities to obtain .
This is accomplished by choosing an appropriate function to use in Eq. B17 [or for the marginal probabilities: choosing an appropriate to use in Eq. B18]. Let and let be a parameter used to control the sharpness of the edges of test functions , which will be used to effect coarse-graining of the Schwartz distribution and the density functions , and . We partition Ω into small square cells, in size, over which we will compute integrals of products of test functions and , , or . The test function will be chosen so that as , the coarse-graining test functions approach unit height square waves that sample , , or over a small nonzero region. The aim is to simulate the operation of a digitizer that is sampling a one- or two-dimensional function. We begin by defining an infinitely differentiable smooth function that “turns on” at as x increases,
(B23) |
and a corresponding smooth function that turns off at as x increases
(B24) |
Their product defines an infinitely differentiable step function,
(B25) |
which approaches a unit height square wave, turning on at and turning off at , as . This function may be used to “sample” a function of one variable, e.g., . Similarly is defined as an infinitely smooth precursor to a square wave that turns on at and turns off at . Their product may be used to define the infinitely differentiable (in x and y) function
(B26) |
which may be used to sample functions (or Schwartz distributions) of two variables at , . Example , , [which are all ] are shown in Fig. 13.
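A smooth sampling window of this kind can be sketched as follows. This is an illustration under stated assumptions: it uses the standard exp(-1/u) construction of an infinitely differentiable step, which is not necessarily the specific form of Eqs. B23 to B26, and the names `smooth_on`, `window`, `window_2d`, and `delta` are hypothetical.

```python
import numpy as np

def _s(u):
    """exp(-1/u) for u > 0 and 0 otherwise; all derivatives vanish at u = 0."""
    u = np.asarray(u, dtype=float)
    safe = np.where(u > 0, u, 1.0)             # avoid division by zero or negatives
    return np.where(u > 0, np.exp(-1.0 / safe), 0.0)

def smooth_on(x, a, delta):
    """Rises smoothly from 0 (x <= a) to 1 (x >= a + delta)."""
    u = (np.asarray(x, dtype=float) - a) / delta
    return _s(u) / (_s(u) + _s(1.0 - u))

def window(x, a, b, delta):
    """Smooth approximation to the indicator of [a, b]; assumes b - a >= 2 * delta."""
    return smooth_on(x, a, delta) * (1.0 - smooth_on(x, b - delta, delta))

def window_2d(x, y, ax, bx, ay, by, delta):
    """Product window used to sample a function of two variables over a cell."""
    return window(x, ax, bx, delta) * window(y, ay, by, delta)

# As delta shrinks, the area under the window approaches the cell width b - a = 0.5.
x_mid = (np.arange(100_000) + 0.5) / 100_000
area_soft = float(np.mean(window(x_mid, 0.25, 0.75, 0.10)))
area_sharp = float(np.mean(window(x_mid, 0.25, 0.75, 0.01)))
```

The window is exactly zero outside the cell and exactly one on its interior plateau, so shrinking `delta` only narrows the two transition regions.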
We will also allow the two indices, k associated with the variable and j associated with the variable to run from to ∞.
Coarse-grained probabilities , , and
Now we have all we need to coarse-grain the densities , and . Suppose that passes only once through the cell, which we will denote by . To coarse-grain over , we compute,
(B27) |
where is defined by Eq. B26. For cells that do not intersect , we define for completeness. Defining to be the length along between and , Eq. B27 may be rewritten as
(B28) |
where is the velocity along the curve at time . If more than one passage through occurs , then all traversal times such that must be added, and we obtain
(B29) |
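The interpretation of the coarse-grained joint probability as total time spent in a cell can be sketched with a two-dimensional histogram of uniformly time-sampled waveform pairs. This is a minimal illustration, assuming hypothetical traces `f_vals` and `g_vals` and a hypothetical grid size; sharp-edged cells stand in for the smooth test functions, and the entropy computed here is the ordinary discrete Shannon entropy of the cell probabilities, not the limiting expression derived in this study.

```python
import numpy as np

def coarse_grained_joint(f_vals, g_vals, n_cells):
    """Fraction of time samples of (f(t), g(t)) falling in each cell of a square grid."""
    counts, _, _ = np.histogram2d(f_vals, g_vals, bins=n_cells)
    return counts / counts.sum()

def shannon_entropy(p):
    """Discrete Shannon entropy (in nats), skipping empty cells."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Uniform sampling in time, so cell counts are proportional to dwell times.
t = (np.arange(100_000) + 0.5) / 100_000
f_vals = np.sin(2 * np.pi * t)            # hypothetical reference trace
g_vals = np.sin(6 * np.pi * t + 0.3)      # hypothetical backscatter trace

p = coarse_grained_joint(f_vals, g_vals, n_cells=32)
H = shannon_entropy(p)
occupied = int(np.count_nonzero(p))       # only cells crossed by the curve are nonzero
```

Only the cells crossed by the curve carry probability, which reflects the concentration of the joint density on a curve rather than on a two-dimensional region.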
Discrete marginal probabilities: From the standard definition
The discrete marginal probabilities are defined by
(B30) |
For the coarse-grained marginal probabilities, we also set if does not intersect any cell with index j (). Similarly we define,
(B31) |
Discrete marginal probabilities: From Eqs. B21, B22, and B25
Alternatively, we may start with Eq. B21 and define
(B32) |
where is given by Eq. B25. Similarly, using Eq. B22,
(B33) |
where is also given by Eq. B25 and , are zero in the same cases described after Eqs. B30, B31. Focusing on Eq. B29, we see from Fig. 14, that the integral will reduce, in the limit where , to the sum of times spent in the cells lying between and , so that we again obtain
(B34) |
and
(B35) |
We note based on previous work that as these are asymptotic to
(B36) |
and
(B37) |
These are just the coarse-grained versions of Eqs. B21, B22 and demonstrate the consistency of computing marginals and then coarse-graining with coarse-graining and then computing marginals.
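The consistency statement can be illustrated numerically: summing a coarse-grained joint table over one index gives the same result as coarse-graining each waveform separately. The sketch below assumes hypothetical traces and sharp-edged cells with shared bin edges in place of the smooth test functions.

```python
import numpy as np

# Hypothetical traces sampled uniformly in time.
t = (np.arange(50_000) + 0.5) / 50_000
f_vals = np.sin(2 * np.pi * t)
g_vals = np.cos(4 * np.pi * t)

n_cells = 16
edges_f = np.linspace(f_vals.min(), f_vals.max(), n_cells + 1)
edges_g = np.linspace(g_vals.min(), g_vals.max(), n_cells + 1)

# Coarse-grained joint probabilities on the shared grid.
joint, _, _ = np.histogram2d(f_vals, g_vals, bins=[edges_f, edges_g])
joint /= joint.sum()

# Route 1: coarse-grain jointly, then sum over the other index.
marg_f_from_joint = joint.sum(axis=1)
marg_g_from_joint = joint.sum(axis=0)

# Route 2: coarse-grain each waveform on its own.
marg_f_direct = np.histogram(f_vals, bins=edges_f)[0] / f_vals.size
marg_g_direct = np.histogram(g_vals, bins=edges_g)[0] / g_vals.size
```

Both routes use the same bin edges, so any disagreement would indicate an inconsistency in the cell bookkeeping rather than in the sampling.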
APPENDIX C: JUSTIFICATION OF CLAIM ii
While Eq. B28 gives the probabilities needed to compute joint entropies in terms of time, , we must express it in terms of experimentally accessible quantities, which means quantities derived from the backscattered RF, , and the reference signal . The first step in this process is Eq. A1, which relates , to using the velocity in the cell . However, as discussed in the preceding text, we are still faced with an apparently unsolvable problem because we do not know where enters the cell. Moreover, we must also eliminate the coarse-graining parameter, ε, from the final expressions for joint entropy, which is accomplished by taking . In this section, we provide the details of our approach, beginning with the “stacking” process used above, which may be made rigorous using the ergodic theorem23 as we now describe. Thus we suppose that we are given a straight line segment, ℓ, of slope , and length , contained in a cell of dimensions , such as that shown in Fig. 10. We wish to subdivide the cell into sub-cells, as also shown in the figure, and compute the limiting form of
(C1) |
where
(C2) |
and the index n keeps track of the cells crossed as traverses the larger cell. We begin, as shown in Fig. 11, by imagining that we translate all the subsegments of ℓ, to the square, as shown in the left-hand side of the figure (this is equivalent to the stacking procedure described in the preceding text). The y coordinates of those line segments that start out on the y axis are shown in the figure. These are given by
(C3) |
where is the iteration of the mapping
(C4) |
Similarly the x coordinates of those line segments that start out on the x axis are given by
(C5) |
where is the iteration of the mapping
(C6) |
Thus the physical “stacking” procedure is implemented mathematically by iterating the maps and . Because is irrational, we know that both of these mappings are ergodic.23 Hence we have the following relations between sums and integrals involving continuous functions , and iterates of and ,
(C7) |
as and,
(C8) |
as . While it is possible to use Eqs. C7, C8 to obtain Eq. A9, we will not do so. Instead, it is simpler to refer back to Fig. 11 and observe that is the number of times that the line segment crosses the vertical lines , where m is a non-negative integer, and is the number of times that the line segment crosses the horizontal lines , where q is a non-negative integer. Furthermore, we observe that the ergodic theorem tells us, via Eqs. C7, C8, that as , the crossing points become uniformly distributed along the edges of the square in Fig. 11. This means that the random variable y shown in Fig. 15 is uniformly distributed. Thus the variable θ, which is a linear function of y, is also uniformly distributed over its range (the interval ). We will use this implication of the ergodic theorem to obtain Eq. A9.
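The uniform distribution of crossing points can be illustrated with a circle rotation. The sketch below assumes the map y -> (y + alpha) mod 1 as a stand-in for the stacking maps (the exact form of Eqs. C4 and C6 may differ in scaling) and compares an irrational slope with a rational one; the function names are hypothetical.

```python
import numpy as np

def rotation_orbit(alpha, y0, n):
    """First n iterates of the circle rotation y -> (y + alpha) mod 1, starting at y0."""
    return (y0 + alpha * np.arange(n)) % 1.0

def max_bin_deviation(points, n_bins=20):
    """Largest deviation of empirical bin frequencies from the uniform value 1 / n_bins."""
    counts, _ = np.histogram(points, bins=n_bins, range=(0.0, 1.0))
    return float(np.max(np.abs(counts / points.size - 1.0 / n_bins)))

n = 20_000
# sqrt(2) - 1 is irrational (represented here only to floating-point precision).
irrational = rotation_orbit(np.sqrt(2.0) - 1.0, 0.1, n)
# Slope 1/4 is rational: the orbit revisits the same four points forever.
rational = rotation_orbit(0.25, 0.1, n)

dev_irrational = max_bin_deviation(irrational)
dev_rational = max_bin_deviation(rational)
distinct_rational = len(np.unique(np.round(rational, 9)))
```

For the irrational slope the bin frequencies approach the uniform value as the number of iterates grows, whereas the rational slope concentrates all crossings on a small periodic set of entry points.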
Relation between and
Let be the horizontal distance traversed and be the vertical distance traversed (so ). Then
(C9) |
where is the largest integer , and is the smallest integer . Similarly
(C10) |
As , we get as .
Calculation of
We have
(C11) |
and
(C12) |
with an error of at most by Eq. C9. Therefore
(C13) |
where we have defined the symbol to simplify calculations in the following text.
Calculation of
We are now ready to calculate
We break the calculation into two cases: and .
Case : We note that is the number of times that the trajectory, , crosses either a vertical or a horizontal line of the sub-grid. (Because α is irrational, there is at most one occasion where it crosses both simultaneously; we can ignore this event as .)
If there are two successive vertical crossings, then
(C14) |
As , if this does not happen, then there is exactly one horizontal crossing in between vertical crossings at n and , and
(C15) |
for some . By the ergodic theorem, θ will tend to a uniformly distributed random variable on as shown in Fig. 15. So letting , the contribution from Eq. C15 is, on average,
(C16) |
or
(C17) |
Of the vertical crossings, of them will have a horizontal crossing “sandwiched” in between and contribute the quantity shown in Eq. C17 on average; the remainder will contribute . Thus we obtain
(C18) |
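The crossing count used in this case can be checked with a small numerical sketch. It assumes a unit cell, a line y = y0 + alpha * x with slope between 0 and 1, and a hypothetical sub-grid of spacing 1 / n_sub; these normalizations are illustrative and may differ from the cell dimensions used in the derivation. The line is cut at every vertical and horizontal sub-grid line it crosses, so the number of sub-segments is one more than the total number of crossings.

```python
import numpy as np

def subsegments(alpha, y0, n_sub):
    """Split the line y = y0 + alpha * x, 0 <= x <= 1, at every sub-grid line it crosses.

    The sub-grid has spacing 1 / n_sub. Returns (lengths, n_vertical, n_horizontal).
    """
    h = 1.0 / n_sub
    x_vertical = np.arange(1, n_sub) * h                    # interior lines x = m * h
    q_lo = int(np.floor(y0 / h)) + 1                        # first line y = q * h above y0
    q_hi = int(np.ceil((y0 + alpha) / h)) - 1               # last line below y0 + alpha
    x_horizontal = (np.arange(q_lo, q_hi + 1) * h - y0) / alpha
    cuts = np.sort(np.concatenate(([0.0], x_vertical, x_horizontal, [1.0])))
    lengths = np.diff(cuts) * np.hypot(1.0, alpha)          # arc length of each piece
    return lengths, x_vertical.size, x_horizontal.size

alpha = np.sqrt(2.0) - 1.0      # irrational slope (to floating-point precision)
n_sub = 200
lengths, n_vertical, n_horizontal = subsegments(alpha, y0=0.0137, n_sub=n_sub)

total_length = float(lengths.sum())
mean_length = total_length / lengths.size
```

The number of horizontal crossings is close to alpha times the number of vertical sub-divisions, and the longest pieces are those lying between two successive vertical crossings.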
Case : If , we repeat the calculation with x and y “switched.” The calculation proceeds exactly as before although α must be replaced by .
Calculation summary: These results may be concisely written as
(C19) |
or, recalling that ,
(C20) |
which is Eq. A9. This completes the derivation of .
References
- Hughes M., “Analysis of ultrasonic waveforms using Shannon entropy,” Proc.-IEEE Ultrason. Symp. 2, 1205–1209 (1992). 10.1109/ULTSYM.1992.275884
- Hughes M., “Analysis of digitized waveforms using Shannon entropy,” J. Acoust. Soc. Am. 93, 892–906 (1993). 10.1121/1.405451
- Hughes M., “NDE imaging of flaws using rapid computation of Shannon entropy,” Proc.-IEEE Ultrason. Symp. 2, 697–700 (1993).
- Hughes M., “Analysis of digitized waveforms using Shannon entropy. II. High-speed algorithms based on Green's functions,” J. Acoust. Soc. Am. 95, 2582–2588 (1994). 10.1121/1.409828
- Hughes M. S., Marsh J. N., Hall C. S., Savy D., Scott M. J., Allen J. S., Lacy E. K., Carradine C., Lanza G. M., and Wickline S. A., “Characterization of digital waveforms using thermodynamic analogs: Applications to detection of materials defects,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control 52, 1555–1564 (2005). 10.1109/TUFFC.2005.1516028
- Hughes M., Marsh J., Woodson A., Lacey E., Carradine C., Lanza G. M., and Wickline S. A., “Characterization of digital waveforms using thermodynamic analogs: Detection of contrast targeted tissue in MDA 435 tumors implanted in athymic nude mice,” Proc.-IEEE Ultrason. Symp. 1, 373–376 (2005).
- Hughes M. S., McCarthy J. E., Marsh J. N., Arbeit J. M., Neumann R. G., Fuhrhop R. W., Wallace K. D., Znidersic D. R., Maurizi B. N., Baldwin S. L., Lanza G. M., and Wickline S. A., “Properties of an entropy-based signal receiver with an application to ultrasonic molecular imaging,” J. Acoust. Soc. Am. 121, 3542–3557 (2007). 10.1121/1.2722050
- Hughes M., Marsh J., Lanza G. M., Wickline S. A., McCarthy J., Maurizi B., Wickerhauser V., and Wallace K., “Improved signal processing to detect cancer by ultrasonic molecular imaging of targeted nanoparticles,” J. Acoust. Soc. Am. 129, 3756–3767 (2011). 10.1121/1.3578459
- Marsh J. N., McCarthy J. E., Wickerhauser M., Arbeit J. M., Fuhrhop R. W., Wallace K. D., Lanza G. M., Wickline S. A., and Hughes M. S., “Application of real-time calculation of a limiting form of the Renyi entropy for molecular imaging of tumors,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control 57, 1890–1895 (2010). 10.1109/TUFFC.2010.1630
- Marsh J. N., Wallace K. D., Lanza G. M., Wickline S. A., McCarthy J. E., and Hughes M. S., “Application of a limiting form of the Renyi entropy for molecular imaging of tumors using a clinically relevant protocol,” Proc.-IEEE Ultrason. Symp. 1, 53–56 (2010).
- Wallace K. D., Marsh J., Baldwin S. L., Connolly A. M., Richard K., Lanza G. M., Wickline S. A., and Hughes M. S., “Sensitive ultrasonic delineation of steroid treatment in living dystrophic mice with energy-based and entropy-based radio frequency signal processing,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control 54, 2291–2299 (2007). 10.1109/TUFFC.2007.533
- Hughes M., Marsh J., Wallace K., Donahue T., Connolly A., Lanza G. M., and Wickline S. A., “Sensitive ultrasonic detection of dystrophic skeletal muscle in patients with Duchenne's muscular dystrophy using an entropy-based signal receiver,” Ultrasound Med. Biol. 33, 1236–1243 (2007). 10.1016/j.ultrasmedbio.2007.02.007
- Hughes M., Marsh J., Agyem K., McCarthy J., Maurizi B., Wickerhauser M., Wallace K. D., Lanza G., and Wickline S., “Use of smoothing splines for analysis of backscattered ultrasonic waveforms: Application to monitoring of steroid treatment of dystrophic mice,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control 58, 2361–2369 (2011). 10.1109/TUFFC.2011.2093
- Seip R., Ebbini E., and O'Donnell M., “Non-invasive detection of thermal effects due to highly focused ultrasonic fields,” Proc.-IEEE Ultrason. Symp. 2, 1229–1232 (1993).
- Hughes M. S., McCarthy J. E., Wickerhauser M., Marsh J. N., Arbeit J. M., Fuhrhop R. W., Wallace K. D., Thomas T., Smith J., Agyem K., Lanza G. M., and Wickline S. A., “Real-time calculation of a limiting form of the Renyi entropy applied to detection of subtle changes in scattering architecture,” J. Acoust. Soc. Am. 126, 2350–2358 (2009). 10.1121/1.3224714
- Schwartz L., Mathematics for the Physical Sciences (Dover, New York, 2008), pp. 1–368.
- Hughes M., “A comparison of Shannon entropy versus signal energy for acoustic detection of artificially induced defects in Plexiglas,” J. Acoust. Soc. Am. 91, 2272–2275 (1992). 10.1121/1.403662
- Reinsch C. H., “Smoothing by spline functions,” Num. Math. 10, 177–183 (1967). 10.1007/BF02162161
- Sackett D. L., “Why randomized controlled trials fail but needn't. II. Failure to employ physiological statistics, or the only formula a clinician-trialist is ever likely to need (or understand!),” Can. Med. Assoc. J. 165, 1226–1237 (2001).
- Arbeit J. M., Riley R. R., Huey B., Porter C., Kelloff G., Lubet R., Ward J. M., and Pinkel D., “DFMO chemoprevention of epidermal carcinogenesis in k14-hpv16 transgenic mice,” Cancer Res. 59, 3610–3620 (1999).
- Arbeit J. M., Münger K., Howley P. M., and Hanahan D., “Progressive squamous epithelial neoplasia in k14-human papillomavirus type 16 transgenic mice,” J. Virol. 68, 4358–4368 (1994).
- Hughes M. S., McCarthy J. E., Marsh J. N., Arbeit J. M., Neumann R. G., Fuhrhop R. W., Wallace K. D., Thomas T., Smith J., Agyem K., Znidersic D. R., Maurizi B. N., Baldwin S. L., Lanza G. M., and Wickline S. A., “Application of Renyi entropy for ultrasonic molecular imaging,” J. Acoust. Soc. Am. 125, 3141–3145 (2009). 10.1121/1.3097489
- Katok A. and Hasselblatt B., Introduction to the Modern Theory of Dynamical Systems (Cambridge University Press, Cambridge, UK, 1995), pp. 1–804.