Abstract
Sound speed estimation can potentially correct the focusing errors in medical ultrasound. Maximizing the echo spatial coherence as a function of beamforming sound speed is a known technique to estimate the average sound speed. However, beamformation with changing sound speed causes a spatial shift of the echo signals resulting in noise and registration errors in the average sound speed estimates. We show that the spatial shift can be predicted and corrected, leading to superior sound speed estimates. Methods are presented for axial and two-dimensional location correction. Methods were evaluated using simulations and experimental phantom data. The location correction strategies improved the variance of sound speed estimates and reduced artifacts in the presence of strong backscatter variations. Limitations of the proposed methods and potential improvement strategies were evaluated.
Keywords: Aberration correction, large array imaging, sound speed, tomography
I. Introduction
Clinical ultrasound scanners typically assume a sound speed of 1540 m/s for transmit and receive beamforming. Variation of tissue sound speed causes degraded image quality. Ultrasound imaging can be affected by the sound speed variations in two ways. First, the incorrect beamforming sound speed introduces inappropriate curvature and bulk shifts in the delay profiles causing defocusing and mis-registration, respectively. Second, random heterogeneous variations in sound speed lead to waveform distortion and random perturbations in the phase that may cause elevated sidelobes, unintended beam steering, and overall image degradation. We refer to these problems as first- and higher-order phase aberration.
First-order sound speed correction methods have been studied by many groups. These techniques seek to obtain a better global sound speed estimate than 1540 m/s. Anderson and Trahey implemented a curve fitting method on arrival time profiles to directly estimate the sound speed [1]. Other methods optimize a measure of signal quality as a function of beamforming sound speed. Various quality measures were utilized including coherence factor (CF) [2], angular CF [3], area under the coherence curve [4], echo brightness [5], and phase-variance-based metric [6]. It should be noted that these estimates are generally a harmonic average of the local sound speed along the propagation path, often simply called the average sound speed [2]. First-order correction approaches generally ignore the heterogeneous sound speed distribution in tissue. More recent methods incorporate information from the full image to form a least squares problem when estimating the global sound speed [7].
Higher-order phase aberration correction is also a rigorously studied subject. Historically, the problem has often been posed as a simple phase and amplitude distortion of the received signal [8]–[10]. Consequently, correction methods have long sought to align the phase of the channel signals and have seen varying degrees of success [10]–[12]. In addition to the difficulty of accurate phase estimation from partially coherent signals [13], [14], maintaining consistency among different regions of the image due to spatially varying phase profiles [15] is challenging. Nevertheless, phase correction has recently been implemented on commercial systems with significant image quality improvements in cardiac imaging [16].
Many recent aberration correction methods attempt to estimate the local sound speed map. Correction is then implemented by incorporating these maps in the beamforming delay calculation arithmetically [17], by using the Eikonal equation [18], or by using wavefield correlation beamforming [18]. The local sound speed is often estimated using a two-step process. First, an integrated quantity that captures the sound speed variation along a propagation path is estimated locally from the echo signal. For example, several works have estimated the phase shift of beamformed signals from varying apertures or transmit geometries [19], [20]. Others have estimated the average sound speed on a local scale [2], [18], [21]. In the next step, the integrated information is posed as a linear combination of the local values, and a regularized inversion scheme is applied to estimate the local sound speed [2], [18]–[21]. Recently, deep-learning-based methods were studied for pulse-echo sound speed estimation [22]–[26].
In this work, we focus on improving the local average sound speed estimators. Average sound speed maps can serve as seed points for more spatially varying first-order correction, enable high-resolution imaging with larger arrays [27], [28], support accurate sizing and location estimation in instrument guidance, and serve as the basis for full local-sound-speed-based aberration correction [2], [18], [21]. However, typical methods to estimate average sound speed are prone to error because the beamforming sound speed alters the signal registration. This leads to spatial ambiguity of the estimates [29]. Anderson and Trahey’s method yields an estimate of the location as well as the sound speed [1]. However, this approach requires transmit beamformation with a potentially incorrect sound speed, which may bias the estimates. Hasegawa and Nagaoka [30] accounted for the location drift by beamforming on a time-sample-based grid, which is non-trivial to implement for axially asymmetric pixel locations. Recently, Brevett et al. proposed a beamforming-free average sound speed and location estimator that performed the arrival time curve fitting using the highly coherent common mid-point signals from a multi-static acquisition [29]. To suppress clutter in unfocused data, a spatio-temporal filter was used. However, beamforming offers certain intrinsic advantages such as built-in off-axis signal suppression and robustness against incoherent noise. Therefore, we propose an alternative approach to location correction compatible with beamformed data.
In this work, we show that spatial ambiguity can be corrected by tracking the trajectory of signals as a function of beamforming sound speed. We implement various correction methods and use Field II simulations and experimental phantom acquisitions to demonstrate the effectiveness of the proposed methods. A goal of this work is to explore the feasibility of high-resolution local mapping of average sound speed in the presence of B-mode intensity variation. Using simulations and experimental data, we also highlight the challenges of spatial ambiguity correction on a local scale.
II. Methods
A. Theory and technique
A common technique to estimate average sound speed is to optimize a direct (CF, angular CF, phase variance, etc.) or indirect (brightness) measure of signal coherence as a function of beamforming sound speed. A known problem of this approach is the spatial ambiguity of the scattering signal. When a signal is beamformed with various sound speeds, the beamsum signal registers at various spatial locations. For example, increasing the beamforming sound speed causes the signal to register at a deeper location. As a result, when coherence factors (or other quantities) of a given pixel are compared across a trial set of beamforming sound speeds, identical scattering signals are not compared. Additionally, the resulting estimate may not register at the correct depth.
In this work, we propose that spatial ambiguity can be resolved if the signals from specific physical targets are tracked and compared across the trial set. This can be achieved by finding the focusing locations in different trial sets that sample similar (defined next) channel-domain signals during delay-and-sum.
Ultrasound channel data is acquired in time-channel space. The beamforming process maps these signals onto a uniformly spaced spatial grid. For a multi-static acquisition, the signal at a particular pixel is obtained by delaying the channel data to account for distances from each transmit and receive element and then summing the signals. The delay profile at a pixel location $\mathbf{p}$, as a function of transmit channel location $\mathbf{r}_t$, receive channel location $\mathbf{r}_r$, and beamforming sound speed $c$, is given by:

$$\tau(\mathbf{p};\, \mathbf{r}_t, \mathbf{r}_r, c) = \frac{\lVert \mathbf{p} - \mathbf{r}_t \rVert + \lVert \mathbf{p} - \mathbf{r}_r \rVert}{c} \quad (1)$$

Now, we consider a uniform spatial grid at a base beamforming speed $c_0$ (which can be 1540 m/s or any other value) and a specific pixel $\mathbf{p}_0$ in that grid. If the beamforming speed is changed to $c$, the signal from $\mathbf{p}_0$ will shift to a new position. Our goal is to find the location $\mathbf{p}_c$ that has the most similar delay profile at $c$ as that of $\mathbf{p}_0$ at $c_0$. We define the similarity by the least $\ell_2$-norm between delay profiles. In other words, $\mathbf{p}_c$ is estimated as

$$\mathbf{p}_c = \underset{\mathbf{p}}{\operatorname{argmin}} \sum_{\mathbf{r}_t,\, \mathbf{r}_r} \left[ \tau(\mathbf{p};\, \mathbf{r}_t, \mathbf{r}_r, c) - \tau(\mathbf{p}_0;\, \mathbf{r}_t, \mathbf{r}_r, c_0) \right]^2 \quad (2)$$

for the known values of $\mathbf{p}_0$, $c_0$, and $c$.
This process estimates the location $\mathbf{p}_c$ at speed $c$ that samples the channel data most closely to that of $\mathbf{p}_0$ at $c_0$. This allows us to find the spatial registration of a signal across a range of trial beamforming sound speed values. It should be noted that, unlike $\mathbf{p}_0$, the location $\mathbf{p}_c$ may not correspond to a grid point.
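As a concrete illustration, the minimization in (2) can be sketched with a generic optimizer. This is a sketch under stated assumptions, not the authors' MATLAB implementation: the array geometry below (128 elements, 0.3 mm pitch, mirroring the simulated probe) and the Nelder-Mead settings are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical 1D aperture: 128 elements, 0.3 mm pitch, centered on x = 0.
elems = np.stack([(np.arange(128) - 63.5) * 0.3e-3, np.zeros(128)], axis=1)

def delay_profile(p, c):
    """Two-way delay tau(p; r_t, r_r, c) for every transmit/receive pair,
    per equation (1): (|p - r_t| + |p - r_r|) / c."""
    d = np.linalg.norm(elems - p, axis=1)        # one-way element distances
    return (d[:, None] + d[None, :]) / c         # 128 x 128 delay matrix

def shifted_position(p0, c0, c):
    """Minimize equation (2): find p_c whose delay profile at speed c best
    matches that of p0 at base speed c0, in the least-squares sense."""
    tau0 = delay_profile(p0, c0)
    cost = lambda p: np.sum((delay_profile(p, c) - tau0) ** 2)
    return minimize(cost, p0, method="Nelder-Mead",
                    options={"xatol": 1e-6}).x

p0 = np.array([0.0, 40e-3])                      # on-axis pixel at 40 mm depth
pc = shifted_position(p0, 1540.0, 1600.0)
# On axis, the shift is essentially axial, and the new depth lands close to
# z0 * c / c0, consistent with the 1D approximation in equation (3).
```

Increasing the trial speed above the base speed moves the registration deeper, matching the behavior described in the text.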
To implement the correction, we first use a fixed, uniform spatial grid with a base beamforming speed $c_0$. We typically choose this near the center of the trial range or use the nominal value of a phantom to allow for adequate freedom in either direction. At each trial sound speed, we calculate the apparent position $\mathbf{p}_c$ of each base pixel by minimizing (2).
Similar to existing methods, the signals are beamformed and the CF map is calculated for each trial speed. Now, instead of directly comparing CF across the trial speed values, we interpolate each CF image onto the estimated positions $\mathbf{p}_c$. Next, the CF of a given base pixel is compared across the trial speeds at the interpolated positions. This allows us to track scatterer signals across the trial speeds. The trial speed that maximizes the CF provides the estimated average sound speed as well as the location where this estimate should be registered. Repeating this process over all pixels provides the set of average sound speeds registered on a scattered grid. Interpolating this on a uniform grid yields the estimated average sound speed map. Since this method accounts for the two-dimensional motion of the signals, we will refer to this as the 2D correction. Fig. 1 illustrates the location correction method.
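The track-and-maximize step above can be sketched as follows. The CF maps here are synthetic stand-ins (a coherence peak at 1540 m/s), and the grid sizes, speed range, and `axial_shift` helper are illustrative assumptions rather than the paper's implementation:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Synthetic stand-in: one CF map per trial speed on a fixed (z, x) grid,
# peaking at a "true" speed of 1540 m/s (toy data, not Field II output).
z = np.linspace(30e-3, 50e-3, 201)
x = np.linspace(-10e-3, 10e-3, 101)
speeds = np.arange(1500.0, 1581.0, 5.0)
cf_maps = np.stack([np.full((z.size, x.size), np.exp(-((c - 1540.0) / 25.0) ** 2))
                    for c in speeds])

def track_and_maximize(p0, c0, shift_fn):
    """For one base pixel p0 = (x, z), interpolate each trial-speed CF map
    at the predicted position shift_fn(p0, c0, c), then pick the trial
    speed (and registration location) where the tracked CF peaks."""
    cf_track, positions = [], []
    for cf, c in zip(cf_maps, speeds):
        interp = RegularGridInterpolator((z, x), cf,
                                         bounds_error=False, fill_value=0.0)
        pc = shift_fn(p0, c0, c)
        cf_track.append(interp([(pc[1], pc[0])]).item())
        positions.append(pc)
    k = int(np.argmax(cf_track))
    return speeds[k], positions[k]

# 1D approximation of the drift (equation (3)): depth scales as c / c0.
axial_shift = lambda p0, c0, c: np.array([p0[0], p0[1] * c / c0])
c_hat, p_hat = track_and_maximize(np.array([0.0, 40e-3]), 1540.0, axial_shift)
```

Repeating this over all base pixels yields the scattered set of estimates that is then interpolated back onto the uniform grid.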
Fig. 1.

Illustration of the registration correction technique. (Top row) Existing coherence-based sound speed estimator. CF at a given beamforming pixel is maximized as a function of beamforming sound speed to estimate the local average sound speed. The spatial drift of the underlying signal is not taken into account. (Bottom row) The location correction process. The underlying signal is registered at various locations as the beamforming sound speed is changed. Using expressions (2) or (3), a signal from a given pixel (from an arbitrary base beamforming speed) can be tracked across the trial set of beamforming speeds. By interpolating the CF map at each trial speed, CF can be compared at the registered locations. Maximum CF provides both the local average sound speed and the registration location. Repeating this process over all pixels yields a scattered map of estimated sound speeds which is then interpolated onto a uniform grid. Coherence factor images are also registered in the same way to apply a local first-order sound speed and registration correction.
In general, the spatial shift given by (2) is two-dimensional in nature. However, for pixels along and near the center axis of the aperture, the motion is largely axial. Because the round-trip time of an on-axis echo is fixed, a signal registered at depth $z_0$ with speed $c_0$ registers near depth $z_c$ with speed $c$, and this axial shift can be approximated as

$$z_c \approx \frac{c}{c_0}\, z_0 \quad (3)$$

We implemented a correction method that uses equation (3) to calculate the new registration depth (keeping the lateral coordinate unchanged), and we will refer to this as the 1D correction.
Using these methods, we also obtain a scattered map of maximum CF values. Interpolating these values on the regular grid provides a map of location-corrected maximum CF.
B. Simulations
We simulated speckle targets with a homogeneous sound speed using Field II [31], [32]. We modeled a 128-element phased array with a 2.72 MHz center frequency, 74% bandwidth, and 0.3 mm pitch. The elevation focus was modeled at 5 cm depth. Random points with a density of 15 scatterers/resolution cell were generated within a volume of 40-mm-by-40-mm-by-2.5-mm (axial, lateral, elevation). Two targets were modeled: one with homogeneous echogenicity and one with multiple cylindrical lesions of 3 mm radius and varying contrasts. Imaging was simulated in multi-static mode by modeling transmission from each element and reception with all elements.
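For orientation, the scatterer field setup can be roughed out as follows. This is not Field II code; the resolution-cell estimate (roughly λ/2 axially and λz/D laterally at the 5 cm focus, times the 2.5 mm elevation slice) and the 30–70 mm axial placement are back-of-the-envelope assumptions:

```python
import numpy as np

c0, f0, pitch, n_elem = 1540.0, 2.72e6, 0.3e-3, 128
wavelength = c0 / f0
aperture = n_elem * pitch
# Rough resolution cell near 5 cm depth (assumption; Field II's actual
# point-spread function differs): lambda/2 axially, lambda*z/D laterally.
cell_vol = (wavelength / 2) * (wavelength * 50e-3 / aperture) * 2.5e-3
vol = 40e-3 * 40e-3 * 2.5e-3                   # 40 x 40 x 2.5 mm phantom
n_scat = int(15 * vol / cell_vol)              # 15 scatterers per cell

rng = np.random.default_rng(1)
# Columns: lateral x, axial z (assumed 30-70 mm window), elevation y.
pos = rng.uniform([-20e-3, 30e-3, -1.25e-3], [20e-3, 70e-3, 1.25e-3],
                  (n_scat, 3))
amp = rng.standard_normal(n_scat)              # Gaussian scattering amplitudes
```

The resulting count (on the order of 10^5 scatterers) is typical for fully developed speckle simulations at these dimensions.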
C. Phantom experiment
We imaged a tissue-mimicking phantom (Model 549, ATS Laboratories, Bridgeport, CT, USA) through a layer of water of roughly 13.5 mm thickness. We used a 384-element large linear phased array (Vermon Inc., Tours, France) transmitting 2 MHz pulses in Hadamard-encoded multi-static synthetic aperture mode [33]. The array elements had a 0.23 mm pitch, a total of 8.8 cm effective aperture width, a 15 mm elevation size, and an 8 cm elevation lens focus. A large aperture allowed us to image over a large depth with a wide field-of-view. We scanned the phantom using two synchronized Vantage 256 scanners (Verasonics Inc., Kirkland, WA, USA). The phantom had a nominal sound speed of approximately 1457 m/s determined by brightness maximization in prior studies [28]. However, the use of a water layer changed the average sound speed axially. We captured the channel data and decoded the multi-static data before further processing.
D. Signal processing
The position shift was computed for a fixed beamforming grid and a set of trial beamforming sound speeds. For the Field II simulation, we defined a 566-by-282 pixel grid (axial by lateral) with 0.07 mm axial increments and 0.14 mm lateral increments. For the phantom data, we defined a 2104-by-686 pixel grid (axial by lateral) with 0.06 mm axial increments and 0.13 mm lateral increments. Trial beamforming sound speeds ranged from 1451 to 1650 m/s for the simulation data and from 1401 to 1600 m/s for the experimental data, with 1 m/s increments in both cases. The reference/base beamforming speeds were 1540 m/s for the simulation and 1457 m/s for the experiment. Using the function fmincon in MATLAB R2020a (MathWorks Inc., Natick, MA, USA), expression (2) was minimized for each grid pixel at each trial sound speed. This provided the approximate position of the signal from each reference pixel in every other beamforming sound speed space.
The multi-static signal was beamformed for each trial sound speed on the fixed grid defined earlier. All transmit-receive pairs were used for each pixel. After beamforming, signals were summed along the transmit channels. The coherence factor was calculated as [34]
$$\mathrm{CF} = \frac{\left\lvert \sum_{i=1}^{N} s_i \right\rvert^2}{N \sum_{i=1}^{N} \lvert s_i \rvert^2} \quad (4)$$

where $N$ is the number of receive channels and $s_i$ is the transmit-focused, delayed channel data at receive channel $i$.
Each coherence factor image was smoothed using a 2 mm by 2 mm Gaussian kernel. The local average sound speed was estimated for each pixel as given by the maximum of the coherence factor (direct method [2]). To implement the proposed ambiguity correction, the CF maps for each trial sound speed were interpolated (cubic) either onto the pre-calculated shifted positions (2D method) or onto the axially shifted positions of equation (3) (1D method). The coherence factor was tracked along the shifted trajectory through all trial beamforming speeds. The maximum coherence factor along a given trajectory yielded an estimate of the local average sound speed and the registration location of that estimate. The final collection of estimates was registered on a non-uniform grid and then interpolated (cubic) onto the original pixel grid. The maximum coherence factor was interpolated in the same way, yielding a sound-speed-corrected coherence factor map.
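A minimal sketch of the coherence factor computation in (4), followed by Gaussian smoothing. The array shapes are illustrative, the smoothing sigma is in pixels rather than the paper's 2 mm kernel, and the small epsilon guard is an added assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def coherence_factor(channels):
    """Coherence factor per equation (4): coherent-sum power divided by N
    times the incoherent-sum power, across receive channels (axis 0).
    `channels`: transmit-summed, delayed data of shape (N_rx, n_z, n_x)."""
    n = channels.shape[0]
    coherent = np.abs(channels.sum(axis=0)) ** 2
    incoherent = (np.abs(channels) ** 2).sum(axis=0)
    return coherent / (n * incoherent + 1e-30)   # small guard against 0/0

# Fully aligned channels give CF = 1; uncorrelated noise gives CF ~ 1/N.
rng = np.random.default_rng(0)
cf_aligned = coherence_factor(np.ones((64, 8, 8)))
cf_noise = gaussian_filter(coherence_factor(rng.standard_normal((64, 8, 8))),
                           sigma=2)              # smoothing sigma in pixels
```

The two limiting cases (perfect focus versus pure noise) bracket the CF values observed when sweeping the trial beamforming speed.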
The final interpolation step requires a sufficient estimation density. To evaluate this, we reported the point cloud density map given by the number of estimates that registered within each original pixel’s spatial extent. The spatial extent of a pixel at $(x_p, z_p)$ is a rectangular area enclosing points $(x, z)$ that satisfy

$$\lvert x - x_p \rvert \le \frac{\Delta x}{2}, \qquad \lvert z - z_p \rvert \le \frac{\Delta z}{2} \quad (5)$$

where $\Delta x$ and $\Delta z$ are the lateral and axial grid spacings. A point cloud density of one implies that the local density was approximately identical to that of the original grid.
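The density map can be sketched with a 2D histogram whose bin edges straddle the grid nodes, so that each bin is exactly one pixel's spatial extent per (5); the grid below is hypothetical:

```python
import numpy as np

def point_density(points_z, points_x, z_grid, x_grid):
    """Count how many registered estimates land within each original
    pixel's spatial extent (equation (5)): a half-grid-spacing box
    around each pixel center in each dimension."""
    dz = z_grid[1] - z_grid[0]
    dx = x_grid[1] - x_grid[0]
    # Bin edges centered between grid nodes: one bin per pixel extent.
    z_edges = np.concatenate([z_grid - dz / 2, [z_grid[-1] + dz / 2]])
    x_edges = np.concatenate([x_grid - dx / 2, [x_grid[-1] + dx / 2]])
    density, _, _ = np.histogram2d(points_z, points_x,
                                   bins=(z_edges, x_edges))
    return density

# Degenerate check: if every estimate registers exactly at its own base
# pixel, the density is 1 everywhere.
zg = np.linspace(0.0, 1.0, 11)
xg = np.linspace(0.0, 1.0, 11)
zz, xx = np.meshgrid(zg, xg, indexing="ij")
d = point_density(zz.ravel(), xx.ravel(), zg, xg)
```

Voids (bins with zero counts) and pile-ups (counts well above one) then flag the regions discussed in the Results and Discussion sections.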
The location tracking often results in regions of extreme estimates (discussed in detail in section V), which cause artifacts. To mitigate these artifacts, we implemented a filter on the trial set for each pixel and applied it in a second iteration. To implement this, we computed the median of the initial estimates at each depth within a kernel spanning 5 mm axially and all lateral samples; we refer to this as the lateral median. The tracked CF profiles (CF vs. beamforming speed) from the first iteration were then truncated to within ±20 m/s of the lateral median. For each profile, the lateral median was selected based on the depth of the corresponding base location (rather than the final registration location from the first iteration, which could be an artifact). The truncation was implemented using a Tukey window with 40% tapering. The CF was then maximized and registered in the same manner as in the first iteration. We refer to this process as trial range limiting; its goal is to limit the search to a range of local estimates, avoiding extreme values. Results with and without range limiting are reported for both 1D and 2D location correction.
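Trial range limiting can be sketched as below. The toy CF profile, the `limit_trial_range` helper name, and the exact alignment of the Tukey taper to the ±20 m/s band are assumptions (the paper does not specify the window placement):

```python
import numpy as np
from scipy.signal import windows

def limit_trial_range(cf_profile, trial_speeds, local_median, half_width=20.0):
    """Second-iteration trial range limiting: taper the tracked CF profile
    to within +/- half_width m/s of the lateral-median estimate using a
    Tukey window with 40% tapering, then re-maximize."""
    weight = np.zeros_like(cf_profile)
    inside = np.abs(trial_speeds - local_median) <= half_width
    weight[inside] = windows.tukey(int(inside.sum()), alpha=0.4)
    return trial_speeds[int(np.argmax(cf_profile * weight))]

speeds = np.arange(1451.0, 1651.0)           # 1 m/s trial increments
# Toy profile: a spurious global peak at an extreme speed plus a genuine
# local peak near 1540 m/s, as can occur next to a bright lesion.
profile = (np.exp(-((speeds - 1540.0) / 15.0) ** 2)
           + 1.5 * np.exp(-((speeds - 1645.0) / 5.0) ** 2))
est = limit_trial_range(profile, speeds, local_median=1538.0)
# The spurious peak at 1645 m/s falls outside the window, so the
# re-maximized estimate stays near the lateral median.
```

Without the limit, the global maximum of this profile would sit at the extreme speed, reproducing the artifact mechanism described in the Discussion.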
E. Parameter study
We conducted a parameter study to understand the effects of various acquisition and processing parameters on the spatial texture of the direct estimates. We assessed the impacts of beamforming F#, CF averaging kernel size, and imaging frequency. We used the homogeneous phantom simulation for this study. For the imaging frequency study, the simulation was repeated at various frequencies.
III. Results
Fig. 2 shows the estimated position shift of select pixels with respect to a base beamforming sound speed of 1540 m/s. Position shifts are shown for trial beamforming speeds of 1451, 1510, 1570, and 1650 m/s. For pixels along the central line of the aperture, the displacement is largely axial (within some numerical tolerance). For off-axis pixels, the scattering signal also becomes steered, as indicated by a non-negligible lateral shift. The magnitude of displacement also increases with the difference between the base and trial sound speeds.
Fig. 2.

Vector plots showing the signal shift. Two-dimensional shift of scattering signals when beamformed with speeds of 1451, 1510, 1570, and 1650 m/s in (a)-(d), respectively. The spatial shift was calculated relative to a base beamforming speed of 1540 m/s using equation (2) on a sparse grid. Results are shown for multi-static beamforming using the whole aperture for each pixel. Vector magnitudes are drawn to scale.
Fig. 3 shows the results of the homogeneous phantom simulation. The estimated sound speed exhibits a noisy pattern that increases in amplitude with depth. Axial ambiguity correction largely mitigates the noise pattern except near the edges of the image. The 2D ambiguity correction method produces a sound speed map with visibly similar variance patterns throughout the image. The 1D and 2D corrections lead to scattered registrations of the estimates. The point density of the registration is shown in the second row. Approximately 84.4% and 83.7% of pixels had a point density of unity for the 1D and 2D corrections, respectively. CF profiles as a function of trial sound speed are shown for the base locations highlighted in the first image. It should be noted that the sound speed estimated from these profiles may not correspond to the values at the highlighted locations since the correction leads to scattered registration of the final estimates. The profiles in the direct method often exhibited local maxima. The 2D corrected profiles exhibited a smoother decay away from the peak coherence.
Fig. 3.

Homogeneous phantom simulation. (Top row) Estimated average sound speed map by (left) CF maximization with direct pixel-to-pixel comparison and with 1D (middle) and 2D (right) registration corrections. The red ‘X’ symbols on the left image show pixels that were analyzed in the bottom two rows. Second row shows the point density map of the 1D and 2D location correction process. The color map is discrete with integer values. Rows 3–4 show CF profiles as a function of beamforming speed. In the case of direct comparison, each point in the profile corresponds to the same beamforming location. For 1D and 2D registration corrections, each point represents a different spatial location accounting for the signal drift. All three profiles intersect at 1540 m/s, the base beamforming speed.
Table I shows the mean and standard deviation computed from the homogeneous speckle simulation around the six points shown in Fig. 3. The mean values are close to 1540 m/s for all three methods. However, location correction reduced the standard deviation of estimates. The 2D method produced the lowest standard deviation particularly in off-axis regions.
Table I.
MEAN AND STANDARD DEVIATION (S.D.) OF ESTIMATED SOUND SPEED IN THE HOMOGENEOUS SPECKLE SIMULATION. VALUES WERE COMPUTED AROUND THE SIX POINTS HIGHLIGHTED IN FIG. 3 USING 5-MM-BY-5-MM KERNELS. UNITS ARE IN M/S.
                x = −1 cm   x = 0 cm   x = 1 cm   x = −1 cm   x = 0 cm   x = 1 cm
                z = 4 cm    z = 4 cm   z = 4 cm   z = 5 cm    z = 5 cm   z = 5 cm
Direct   Mean    1542.2      1542.1     1540.6     1544.1      1540.2     1542.2
         S.d.       5.4         4.8        5.9       10.1         7.4       10.1
1D       Mean    1542.2      1542.1     1540.9     1542.1      1540.9     1542.1
         S.d.       3.0         1.9        3.9        5.4         2.6        5.4
2D       Mean    1541.9      1542.1     1541.2     1541.8      1541.0     1541.7
         S.d.       2.0         1.8        2.9        2.9        2.4        2.6
Fig. 4 shows the results of the multi-lesion simulation. Although the underlying sound speed is uniform (1540 m/s), the simulation contained significant backscatter heterogeneity. Circular artifacts around the bright lesions and inside the −20 dB lesion are visible in the direct map. The 1D and 2D location corrections mitigated many of these artifacts, with the 2D correction showing fewer artifacts, and limiting the trial range around the lateral median mitigated most of those that remained. The resulting maps appeared similar overall; however, the 2D correction, particularly with trial range limits, better delineated the circular shape of the lowest contrast lesion. Point density maps of the 1D and 2D corrections show regions of voids and high-density registrations around the highest and lowest contrast lesions; these regions correspond to the artifacts in the maps. Trial range limits homogenized the density of registration.
Fig. 4.

Multi-lesion phantom simulation in Field II. (Top row) Estimated average sound speed maps are shown for the direct method, the 1D and 2D location correction methods, and the latter two with a second iteration of trial speed range limiting (last two columns). (Middle row) Maximum coherence factor maps for the methods corresponding to the first row. (Bottom row) The first image shows the B-mode image at the 1540 m/s base beamforming speed; the last four images show the point density maps of the registered estimates. The top three and bottom three hyperechoic lesions have +10, +20, and +30 dB contrast, and the hypoechoic lesions in the middle row have −5, −10, and −20 dB contrast.
Fig. 5 demonstrates examples of the location tracking process using the brightest lesion at 5 cm depth (from Fig. 4). As the trial beamforming sound speed changed, the lesion changed its depth and shape. Location tracking accounted for this position change as highlighted by three example points. However, the changing shape of the lesion posed a challenge for pixels immediately outside the lesion. For example, points B and C were gradually swept inside the lesion at extreme values of beamforming sound speeds. As a result, CF optimized at extreme values of the trial set for these locations. Given such a large deviation from the true sound speed, these estimates are also registered at large distances from the true position (indicated on the point density map). This mechanism was responsible for the voids observed in the point density maps on two sides of the lesion.
Fig. 5.

Examples of point/signal tracking in the multi-lesion phantom simulation. B-mode (top row) and the coherence factor maps (second row) are shown for a hyperechoic lesion at various beamforming sound speeds. Three points (A, B, and C) indicated by ‘+’ are tracked across the trial speeds. The tracked coherence factors for the three points are plotted at the bottom row. The red circles indicate values corresponding to the sound speeds displayed at the top two rows. The point density map of the registered estimates is shown on the last row (first image). The green and magenta lines show the distance between the base pixel locations and the final registered locations for points B and C, respectively. The location shift was downward for point B and upward for point C. The movement of point A (red) is small.
Fig. 6 shows the results of the phantom experiment. The large aperture used in this work allowed imaging over a wide area and a large depth. The average sound speed map estimated by the direct method exhibits texture variations that increase with depth. The 1D location correction smoothed the variance within a narrow central strip; the 2D correction extended this effect to off-axis pixels as well. Both correction methods suffered from artifacts near the anechoic targets. Trial range limits largely removed the artifacts from the background region near the lesions, although artifacts remained inside the lesions. It is unknown to us whether the true sound speed inside the lesions matches the background; however, due to the lack of scattering, the estimates inside the lesions should be disregarded anyway. The maps also exhibited the limitations of the direct method: most large lesions at 8 cm depth or deeper are not clearly delineated, reflecting its registration errors, and most small lesions (on the left) at large depths are not visible. The 2D location correction method, particularly with trial range limits, could visualize the small lesions at approximately 12 cm depth. Density maps show voids corresponding to the lesions with 1D and 2D location correction; dense registration was also observed below each lesion. Trial range limits mitigated the dense registration in the background region. Voids were only partially mitigated but remained localized within the lesions.
Fig. 6.

Experiments on a tissue-mimicking phantom through a layer of water using a large aperture. (Top row) The estimated average sound speed maps for the direct method, 1D, and 2D location correction methods, with a second iteration of trial speed range limiting (last two columns). (Middle row) Maximum coherence factor maps corresponding to the methods shown in the top row. (Bottom row) The first image shows the B-mode at the nominal and base sound speed of 1457 m/s. The last four images show point density maps. The imaging cross-section contains multiple columns of cylindrical anechoic lesions of various radii and point target groups.
Fig. 7 shows an analysis similar to Fig. 6 at a different cross-section of the phantom. The imaging view contained point target groups and two vertical hyperechoic lesions. A shadowing effect was observed below the bright lesion, as indicated by extreme sound speed estimates and low maximum coherence. The 1D and 2D location correction methods minimized the high variance observed around the lesion. Point targets caused wing-like artifacts in the coherence maps due to sidelobes. Location correction, particularly the 2D method with trial range limits, limited their effects on the sound speed map.
Fig. 7.

Results similar to Fig. 6 for a different cross-section of the experimental phantom. The view contains columns and lateral groups of point targets and hyperechoic lesions.
Fig. 8 shows the results of a parametric study on the direct method using homogeneous Field II simulations. The transmit-receive F#, imaging frequency, and coherence averaging kernel size were varied to observe their effects on the spatial texture of the estimated sound speed maps. For a fixed F# of 1.0 or higher, the spatial texture appeared uniform through depth. For F# smaller than 1.0, the texture variance exhibited a slight depth dependence. For fixed-F# beamforming, particularly with larger F#s, texture variation observably increased on both sides of the images. When the imaging frequency was increased without an F# limit, the spatial texture became smoother, although increasing variance was observed through depth. Increasing the kernel size had a similar effect as increasing the frequency.
Fig. 8.

Impact of acquisition and processing parameters on the direct method. Images are average sound speed maps estimated from homogeneous speckle simulations using the direct method (no location correction). Results are shown as a function of transmit/receive F# (top row), imaging frequency (middle row) and CF averaging kernel size (bottom row). The top and bottom rows had a frequency of 2.72 MHz (same data as Fig. 3). Kernel size was 2 mm for the first two rows. No F# limit was applied to the last two rows.
IV. Discussion
Ambiguity correction minimizes the effects of neighboring high-coherence scatterers. When the beamforming sound speed is changed, a bright target will move along a trajectory in the image, and its coherence may remain high over a range of sound speed values. If the coherence remains higher than that of the actual targets in any part of that trajectory, the estimator will derive the maximum CF from that single bright target at all those locations. By tracking the trajectory of a specific signal source and registering the estimate only once at the maximum coherence location, the effects of a high-coherence target are localized.
The scatterer ambiguity problem is ultimately related to the spatial averaging and the speckle pattern of the CF maps. With a large enough kernel, the errors due to ambiguity can be averaged out. This can be useful in a homogeneous sound speed medium and in cases where a single global estimate is sufficient. Similarly, for a given kernel size, a higher imaging frequency can be used to increase the effective averaging (Fig. 8). However, for local sound speed estimation (average or absolute local), the kernel size must be small compared to the spatial variation of the underlying sound speed distribution. Similarly, the imaging frequency cannot be increased indefinitely for deep tissue imaging. Ambiguity correction may be useful in these applications.
The increasing variance with depth, as observed in Fig. 3, is also a consequence of the spatial averaging of CF. The CF images contain speckle-like patterns that produce the texture variation in the sound speed estimates. For a fixed aperture size, the effective F# and the speckle spot size increase with depth. As a result, the speckle patterns undergo smaller effective averaging for a given kernel size, and the texture variation in the sound speed estimates increases with depth. This effect is also illustrated in Fig. 8. When a fixed F# is applied to transmit and receive beamforming, the speckle spot size remains invariant through depth and the noise pattern appears spatially uniform. Additionally, the edges of the image use smaller apertures than the specified F# implies and therefore exhibited higher variance, as the speckle size is larger in those regions. When the fixed F# is reduced, the speckle size in the central region tightens; consequently, texture variations at F# = 1.0 appear smaller than those at F# = 1.4. Similarly, when the imaging frequency is increased, the speckle spot size tightens and the noise patterns reduce with frequency. These results highlight that the speckle spot size dictates the noise pattern in local maps.
The spatial registration occurs on a non-uniform grid. This leads to regions of high-density registration and voids where few or no estimates are registered, typically near regions of heterogeneous backscattering (lesions, point targets, shadowing). The problems caused by void regions are two-fold. First, void regions cause interpolation artifacts in the final map. This can be partially mitigated by more sophisticated region-filling/inpainting techniques. Second, a void region indicates that the scattering signal has registered somewhere far from that region. The final registration point of these pixels may cause locally dense registration. Since the magnitude of displacement is related to the difference between the trial and base sound speeds, the estimates are likely to be extreme at the registered locations, causing artifacts. This was the rationale behind the trial range limiting. Indeed, applying a trial range limit on a second iteration mitigated the issues with dense registration. For example, in Fig. 6, the streaks of high sound speed estimates below the anechoic lesions were removed by the trial range limits.
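One simple version of the region filling mentioned above is linear interpolation of the scattered estimates with a nearest-neighbour fallback for pixels inside voids or outside the convex hull; the geometry, void location, and sound speed values below are synthetic stand-ins, not data from this work:

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
# Scattered (x, z) registration points with toy sound speed estimates
pts = rng.uniform(0.0, 30e-3, size=(500, 2))
vals = 1540.0 + 10.0 * np.sin(pts[:, 0] / 5e-3)
# Carve out a void where no estimates registered (e.g. below a lesion)
void = (np.abs(pts[:, 0] - 15e-3) < 3e-3) & (pts[:, 1] > 15e-3)
pts, vals = pts[~void], vals[~void]

gx, gz = np.meshgrid(np.linspace(0, 30e-3, 64), np.linspace(0, 30e-3, 64))
linear = griddata(pts, vals, (gx, gz), method="linear")
nearest = griddata(pts, vals, (gx, gz), method="nearest")
filled = np.where(np.isnan(linear), nearest, linear)  # inpaint voids/edges
```

Nearest-neighbour fallback guarantees a gap-free map, but it simply propagates the surrounding estimates into the void; it cannot recover information that never registered there, which is why the second, dense-registration problem requires trial range limiting instead.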
The artifacts in location correction ultimately originate from the coherence-maximization scheme rather than from the location tracking itself. When tracking a bright lesion, we can accurately predict the location of the lesion as a function of trial sound speed. However, we do not account for higher-order effects such as shape distortion, mainlobe broadening, and elevated sidelobes. In Fig. 5, the blue point in the base sound speed map lies outside the lesion. However, as the sound speed increases, it is swept inside the lesion, and the CF in fact increases at extreme sound speeds. In our understanding, most metrics (brightness, any measure of coherence) suffer from this pitfall. Additionally, sidelobes can cause inaccurate estimates. Sidelobe signals from an improperly focused bright target may exhibit higher coherence (or brightness) than the mainlobe of a properly focused weakly scattering region. Directional filters [35] or model-based approaches [36] may reduce the sidelobe effects, but the mainlobe broadening effects (as shown in Fig. 5) are much harder to mitigate. One possible solution is to restrict the overall trial range so that the shape distortion is minimized. However, this may reduce the lateral resolution of the sound speed estimates. Further work is needed to develop an adaptive trial set method that can accommodate a wide range of sound speeds in various tissues (fat, muscle, etc.) while avoiding unnecessarily extreme values.
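The trial range restriction discussed above can be as simple as centring the trial set on the current base estimate and clipping it to tissue-plausible bounds; the half-width, step, and bounds below are illustrative assumptions, not the values used in this work:

```python
import numpy as np

def limited_trial_set(c_base, half_width=40.0, step=5.0,
                      bounds=(1400.0, 1650.0)):
    """Trial sound speeds (m/s) centred on the base estimate and clipped
    to a plausible soft-tissue range. All parameters are assumptions."""
    lo = max(c_base - half_width, bounds[0])
    hi = min(c_base + half_width, bounds[1])
    return np.arange(lo, hi + 0.5 * step, step)

trials = limited_trial_set(1540.0)  # 1500 ... 1580 m/s in 5 m/s steps
```

An adaptive version could widen or shift `half_width` and `bounds` per region based on registration density or prior knowledge of the tissue type, as the discussion suggests.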
When the envelope images are observed as a function of increasing beamforming sound speed (not shown here), one can clearly see a downward motion of the speckle pattern. However, beyond a certain range of speed values, the speckle pattern becomes uncorrelated with the initial envelope. This indicates that a range of coherent sound speeds exists around any given sound speed. When operating outside this range, the errors in the minimization scheme of expression (2) are not sufficiently small and, consequently, do not result in similar channel sampling. In other words, speckle signals cannot be meaningfully tracked and compared outside the coherent sound speed range. Future work should define this range and study its use for selecting the trial set.
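One way to probe this coherent range empirically is to correlate each trial-speed envelope image against the base envelope and keep only trial speeds above a correlation threshold; a hedged sketch in which the threshold value and the envelope images are assumptions:

```python
import numpy as np

def envelope_correlation(env_a, env_b):
    """Normalized zero-lag correlation between two envelope images."""
    a = env_a.ravel() - env_a.mean()
    b = env_b.ravel() - env_b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def coherent_trials(c_trials, envelopes, env_base, rho_min=0.5):
    """Keep trial speeds whose envelope stays correlated with the base."""
    return [c for c, env in zip(c_trials, envelopes)
            if envelope_correlation(env_base, env) >= rho_min]
```

In practice the envelopes would first be motion-compensated by the location correction itself; otherwise the downward speckle drift alone would depress the correlation and shrink the apparent coherent range.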
Hasegawa and Nagaoka first investigated location tracking for coherence-based sound speed estimation [30]. Their method relied on beamforming on a time-sample-based grid, where the transmit delays were purely axial and the receive apertures were axially symmetric around the focusing pixels. Consequently, two-dimensional correction was not necessary. Their method also did not account for the differences in delay curvature when beamforming with variable sound speed. We generalized the location correction approach by incorporating the full two-dimensional shift and minimizing the norm among delay profiles. We also conducted our analysis on a local scale and rigorously explored the effects of the non-uniform registration inherent in any location tracking approach.
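The two-dimensional correction can be sketched as a search for the point whose trial-speed delay profile is closest, in the least-squares sense, to the base-speed profile. This toy version uses only a one-way receive path, a brute-force grid search, and an assumed 40 mm aperture, all simplifications of the actual method:

```python
import numpy as np

def delay_profile(x, z, c, elem_x):
    """One-way delays from point (x, z) to each receive element."""
    return np.sqrt((x - elem_x) ** 2 + z ** 2) / c

def corrected_location(x0, z0, c_base, c_trial, elem_x,
                       search=2e-3, n=41):
    """Find where the echo at (x0, z0) re-registers when beamformed with
    c_trial, by minimizing the norm between delay profiles."""
    tau0 = delay_profile(x0, z0, c_base, elem_x)
    best, best_err = (x0, z0), np.inf
    for x in np.linspace(x0 - search, x0 + search, n):
        for z in np.linspace(z0 - search, z0 + search, n):
            err = np.sum((delay_profile(x, z, c_trial, elem_x) - tau0) ** 2)
            if err < best_err:
                best_err, best = err, (x, z)
    return best

elem_x = np.linspace(-20e-3, 20e-3, 64)  # assumed 40 mm aperture
x_c, z_c = corrected_location(2e-3, 30e-3, 1540.0, 1580.0, elem_x)
```

In this toy configuration, a trial speed above the base speed registers the echo deeper than the original 30 mm, consistent with the downward speckle motion described earlier; an off-axis point also shifts laterally, which is what the purely axial prior formulation cannot capture.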
A limitation of our study is that we did not estimate the absolute local sound speed from the average sound speed estimates. We also did not evaluate the proposed methods under strong sound speed heterogeneity. Because the idea behind location correction is to compare similar echo targets through equivalent channel sampling (expression (2)), the proposed methods should work even in the presence of sound speed heterogeneity. The CF averaging kernel was chosen arbitrarily and may need to adapt to resolution requirements. The trial range limit was an ad hoc implementation; more sophisticated range limiters can be designed based on local registration density and prior information about the tissue. The width of the trial set should also be adapted based on local information. Another limitation is that we used the whole aperture to compute CF for ease of analysis. This resulted in lower CF in shallower regions, particularly with the large aperture in the experimental data. The proposed methods can be further improved by computing the location shift and CF for dynamically grown transmit/receive apertures with a fixed F# that respect the element angular response. Simulations and experiments were conducted with different settings: simulations used a traditional-sized aperture and a smaller field of view, whereas experiments used a larger aperture to study the location shift over a broad region in both the axial and lateral directions. The different settings enabled us to cover multiple potential use cases.
In the future, location-corrected average sound speed estimation should be evaluated in the context of local sound speed estimation. Additionally, we will explore the use of the location-corrected maps to improve the current spatial coherence imaging methods.
V. Conclusions
We implemented a location correction strategy for coherence-maximization-based average sound speed estimators. The proposed methods account for the signal drift with incorrect beamforming sound speed and register the sound speed estimates at the correct location. Simulations and experimental results demonstrated that the proposed approach reduces estimation variance and enables high-quality estimation in the presence of backscatter intensity variation. The proposed method may enable spatially varying first-order sound speed correction and improve the local sound speed estimators for higher-order aberration correction.
Highlights.
Coherence-based sound speed estimators are affected by the spatial ambiguity of signals that results from changing the beamforming sound speed. In this work, we present location tracking techniques to correct this ambiguity.
Location tracking improved the variance and registration of estimated sound speed.
The proposed techniques can improve existing sound speed estimators and aberration correction methods and should be investigated in the future.
Acknowledgments
This work was supported by National Institutes of Health grant 5R01EB017711-11.
References
- [1] Anderson ME and Trahey GE, "The direct estimation of sound speed using pulse–echo ultrasound," The Journal of the Acoustical Society of America, vol. 104, pp. 3099–3106, November 1998.
- [2] Ali R, Telichko AV, Wang H, Sukumar UK, Vilches-Moure JG, Paulmurugan R, and Dahl JJ, "Local Sound Speed Estimation for Pulse-Echo Ultrasound in Layered Media," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 69, pp. 500–511, February 2022.
- [3] Ali R, Maredia S, Telichko A, Wang H, Paulmurugan R, Vilches-Moure J, and Dahl JJ, "Sound speed estimation in layered media using the angular coherence of plane waves," Proc. SPIE, vol. 11319, pp. 81–90, March 2020.
- [4] Imbault M, Faccinetto A, Osmanski BF, Tissier A, Deffieux T, Gennisson JL, Vilgrain V, and Tanter M, "Robust sound speed estimation for ultrasound-based hepatic steatosis assessment," Physics in Medicine & Biology, vol. 62, p. 3582, April 2017.
- [5] Anderson ME, McKeag MS, and Trahey GE, "The impact of sound speed errors on medical ultrasound imaging," The Journal of the Acoustical Society of America, vol. 107, p. 3540, May 2000.
- [6] Perrot V, Polichetti M, Varray F, and Garcia D, "So you think you can DAS? A viewpoint on delay-and-sum beamforming," Ultrasonics, vol. 111, p. 106309, March 2021.
- [7] Bezek CD and Goksel O, "Analytical estimation of beamforming speed-of-sound using transmission geometry," Ultrasonics, vol. 134, p. 107069, September 2023.
- [8] Hinkelman LM, Liu D, Metlay LA, and Waag RC, "Measurements of ultrasonic pulse arrival time and energy level variations produced by propagation through abdominal wall," The Journal of the Acoustical Society of America, vol. 95, pp. 530–541, 1994.
- [9] Flax SW and O'Donnell M, "Phase-Aberration Correction Using Signals From Point Reflectors and Diffuse Scatterers: Basic Principles," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 35, no. 6, pp. 758–767, 1988.
- [10] O'Donnell M and Flax SW, "Phase-Aberration Correction Using Signals From Point Reflectors and Diffuse Scatterers: Measurements," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 35, no. 6, pp. 768–774, 1988.
- [11] Liu DL and Waag RC, "Time-shift compensation of ultrasonic pulse focus degradation using least-mean-square error estimates of arrival time," The Journal of the Acoustical Society of America, vol. 95, pp. 542–555, January 1994.
- [12] Nock L, Trahey GE, and Smith SW, "Phase aberration correction in medical ultrasound using speckle brightness as a quality factor," The Journal of the Acoustical Society of America, vol. 85, pp. 1819–1833, May 1989.
- [13] Walker WF and Trahey GE, "A Fundamental Limit on the Performance of Correlation Based Phase Correction and Flow Estimation Techniques," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 41, no. 5, pp. 644–654, 1994.
- [14] Ng GC, Walker WF, and Trahey GE, "Improvement of signal correlation for adaptive imaging using the translating transmit aperture algorithm," Proceedings of the IEEE Ultrasonics Symposium, vol. 2, pp. 1395–1400, 1996.
- [15] Dahl JJ, Soo MS, and Trahey GE, "Spatial and temporal aberrator stability for real-time adaptive imaging," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 52, pp. 1504–1517, September 2005.
- [16] Måsøy S-E, Dénarié B, Sørnes A, Holte E, Grenne B, Espeland T, Berg EAR, Rindal OMH, Rigby W, and Bjåstad T, "Aberration correction in 2D echocardiography," Quantitative Imaging in Medicine and Surgery, vol. 13, no. 7, 2023.
- [17] Rau R, Schweizer D, Vishnevskiy V, and Goksel O, "Ultrasound Aberration Correction based on Local Speed-of-Sound Map Estimation," IEEE International Ultrasonics Symposium (IUS), pp. 2003–2006, October 2019.
- [18] Ali R, Brevett T, Hyun D, Brickson LL, and Dahl JJ, "Distributed Aberration Correction Techniques Based on Tomographic Sound Speed Estimates," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 69, pp. 1714–1726, May 2022.
- [19] Jaeger M, Held G, Peeters S, Preisser S, Grünig M, and Frenz M, "Computed Ultrasound Tomography in Echo Mode for Imaging Speed of Sound Using Pulse-Echo Sonography: Proof of Principle," Ultrasound in Medicine & Biology, vol. 41, pp. 235–250, January 2015.
- [20] Sanabria SJ, Ozkan E, Rominger M, and Goksel O, "Spatial domain reconstruction for imaging speed-of-sound with pulse-echo ultrasound: simulation and in vivo study," Physics in Medicine & Biology, vol. 63, p. 215015, October 2018.
- [21] Jakovljevic M, Hsieh S, Ali R, Kung GCL, Hyun D, and Dahl JJ, "Local speed of sound estimation in tissue using pulse-echo ultrasound: Model-based approach," The Journal of the Acoustical Society of America, vol. 144, p. 254, July 2018.
- [22] Heller M and Schmitz G, "Deep Learning-based Speed-of-Sound Reconstruction for Single-Sided Pulse-Echo Ultrasound using a Coherency Measure as Input Feature," IEEE International Ultrasonics Symposium (IUS), 2021.
- [23] Feigin M, Freedman D, and Anthony BW, "A Deep Learning Framework for Single-Sided Sound Speed Inversion in Medical Ultrasound," IEEE Transactions on Biomedical Engineering, vol. 67, pp. 1142–1151, April 2020.
- [24] Kim MG, Oh S, Kim Y, Kwon H, and Bae HM, "Robust Single-Probe Quantitative Ultrasonic Imaging System with a Target-Aware Deep Neural Network," IEEE Transactions on Biomedical Engineering, vol. 68, pp. 3737–3747, December 2021.
- [25] Simson WA, Paschali M, Sideri-Lampretsa V, Navab N, and Dahl JJ, "Investigating pulse-echo sound speed estimation in breast ultrasound with deep learning," Ultrasonics, vol. 137, p. 107179, February 2024.
- [26] Ali R, Brevett T, Zhuang L, Bendjador H, Podkowa AS, Hsieh SS, Simson W, Sanabria SJ, Herickhoff CD, and Dahl JJ, "Aberration correction in diagnostic ultrasound: A review of the prior field and current directions," Zeitschrift für Medizinische Physik, vol. 33, pp. 267–291, August 2023.
- [27] van Hal VH, Muller JW, van Sambeek MR, Lopata RG, and Schwab HM, "An aberration correction approach for single and dual aperture ultrasound imaging of the abdomen," Ultrasonics, vol. 131, p. 106936, May 2023.
- [28] Ahmed R, Foiret J, Ferrara K, and Trahey GE, "Large-Array Deep Abdominal Imaging in Fundamental and Harmonic Mode," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 70, pp. 406–421, May 2023.
- [29] Brevett T, Sanabria SJ, Ali R, and Dahl J, "Speed of Sound Estimation at Multiple Angles from Common Midpoint Gathers of Non-Beamformed Data," IEEE International Ultrasonics Symposium (IUS), 2022.
- [30] Hasegawa H and Nagaoka R, "Initial phantom study on estimation of speed of sound in medium using coherence among received echo signals," Journal of Medical Ultrasonics, vol. 46, pp. 297–307, July 2019.
- [31] Jensen JA, "FIELD: A Program for Simulating Ultrasound Systems," 10th Nordic-Baltic Conference on Biomedical Imaging, published in Medical & Biological Engineering & Computing, vol. 34, suppl. 1, pt. 1, pp. 351–353, 1996.
- [32] Jensen JA and Svendsen NB, "Calculation of Pressure Fields from Arbitrarily Shaped, Apodized, and Excited Ultrasound Transducers," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 39, no. 2, pp. 262–267, 1992.
- [33] Chiao RY, Thomas LJ, and Silverstein SD, "Sparse array imaging with spatially-encoded transmits," Proceedings of the IEEE Ultrasonics Symposium, vol. 2, pp. 1679–1682, 1997.
- [34] Mallart R and Fink M, "Adaptive focusing in scattering media through sound-speed inhomogeneities: The van Cittert Zernike approach and focusing criterion," The Journal of the Acoustical Society of America, vol. 96, pp. 3721–3732, 1994.
- [35] Dahl J and Trahey G, "Off-axis scatterer filters for improved aberration measurements," IEEE Symposium on Ultrasonics, vol. 2, pp. 1094–1098, 2003.
- [36] Byram B, Dei K, Tierney J, and Dumont D, "A model and regularization scheme for ultrasonic beamforming clutter reduction," IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 62, pp. 1913–1927, November 2015.
