Sensors (Basel, Switzerland). 2021 Feb 25;21(5):1591. doi: 10.3390/s21051591

A Robust Steered Response Power Localization Method for Wireless Acoustic Sensor Networks in an Outdoor Environment

Yiwei Huang 1,2, Jianfei Tong 1, Xiaoqing Hu 1, Ming Bao 1,*
Editor: Raquel Caballero-Aguila
PMCID: PMC7956274  PMID: 33668765

Abstract

The localization of outdoor acoustic sources has attracted attention in wireless sensor networks. In this paper, the steered response power (SRP) localization of band-pass signal associated with steering time delay uncertainty and coarser spatial grids is considered. We propose a modified SRP-based source localization method for enhancing the localization robustness in outdoor scenarios. In particular, we derive a sufficient condition dependent on the generalized cross-correlation (GCC) waveform function for robust on-grid source localization and show that the SRP function with GCCs satisfying this condition can suppress the disturbances induced by the grid distance and the uncertain steering time delays. Then a GCC refinement procedure for band-pass GCCs is designed, which uses complex wavelet functions in multiple sub-bands to filter the GCCs and averages the envelopes of the filtered GCCs as the equivalent GCC to match the sufficient condition. Simulation results and field experiments demonstrate the excellent performance of the proposed method against the existing SRP-based methods.

Keywords: source localization, wireless acoustic sensor networks, steered response power, generalized cross-correlation

1. Introduction

With the rapid development of communication technology and mobile computing devices, applications of wireless acoustic sensor networks (WASNs) are becoming popular in acoustic signal processing. Particularly, WASN-based sound source localization has captured researchers’ attention in the last two decades [1,2,3,4,5]. The existing methods available for passive source localization in WASNs include (1) the received energy-based approaches [6,7,8,9]; (2) the direction of arrival (DOA)-based approaches [10,11]; (3) the time of arrival (TOA)-based approaches [12]; (4) the time difference of arrival (TDOA)-based approaches [13,14,15] and (5) the steered response power (SRP)-based approaches [16,17,18,19,20,21,22].

Most methods require a pre-processing stage in which specific modalities are measured from sensor signals before the location-estimating stage. In contrast, the SRP-based approaches locate the source position or direction by maximizing the power of spatially steered filter and sum beamformer of a group of sensors and contain only one decision step in processing sensor signals to estimate location. Without information compression and disturbances resulting from partial mistakes in the front-end stage, the SRP-based solutions can usually yield more robust performance in noisy and reverberant acoustic environments. Practical implementations commonly use the generalized cross-correlation [23]-based form of the SRP function [16] to reduce computation. The methods similar to the GCC-expression of SRP function are also called a “global coherence field (GCF)” in several references [24,25].

In practice, the primary constraint of the SRP-based approaches is the time-consuming on-grid search for the global maximum. Hence, reducing the computational cost of the SRP-based approaches has been an active research topic. In [17], a stochastic region contraction (SRC) method is proposed to avoid global grid searching. However, this strategy also causes information loss. In [26], a geometrically sampled grid set based on the TDOA gradient is proposed to improve the SRP performance. An alternative strategy for the high-cost search problem is to adapt the SRP function to the grid resolution so that a coarse or hierarchical search can be applied. In [27], the authors use the low-frequency component of the GCC for coarse grid resolution and the high-frequency component for fine grids in SRP-based DOA estimation. In [28], the authors apply a Gaussian low-pass filter to the GCC for coarse grids. For full-band signals, a similar modification is proposed both for microphone arrays [29] and for WASNs [18,19]: the spatial spectrum of a given grid point is calculated from the sum of the phase transform (PHAT)-weighted GCCs (GCC-PHATs) within a time window containing the TDOA values of the volume surrounding the grid point, instead of from the original GCC-PHAT in the SRP function.

The SRP-based approaches can provide a robust solution in DOA estimation and source localization tasks in confined spaces. However, they could lose their robustness in an outdoor WASN scenario due to the combined effect of the following factors. (1) Grid size: the monitoring area in outdoor cases may be much larger than in indoor applications, so the practical search grids are much coarser (e.g., meter-level grids outdoors compared with centimeter-level grids indoors). (2) Steering time delay uncertainty: in the classical SRP-based localization framework, the steering time delay at a given position is generated from an ideal propagation model and is always assumed to be exact. However, the steering time delay to the source position differs from the actual propagation time. This difference is no longer negligible in the outdoor environment and causes a defocus effect, even when the WASN system is well synchronized. (3) Signal passband: when processing acoustic data collected outdoors, high-pass or band-pass filtering is indispensable because the environmental noise is intense in the low-frequency range, and real-world source signals often have band-pass characteristics. The combined effect of these three factors makes it difficult to achieve stable localization results. The Modified-SRP functional (MSRP) method introduced in [18,19] provides an elegant solution for scalable grids, but it is not suitable for band-pass signals. In [21], the authors elaborate on the SRP in band-pass situations and use the GCC-PHAT envelope or frequency-shifted GCC-PHAT to enhance the robustness in such situations. Nevertheless, these two methods hardly consider the other factors (the grid size and the steering time uncertainty). In [30], the authors propose a Frequency-Sliding GCC (FSGCC) method, which applies singular value decomposition (SVD) or weighted SVD (WSVD) to the FSGCC matrix and extracts the time delay information of the source signal from multiple sub-band GCCs. The authors apply the WSVD-FSGCC to the MSRP functional for source localization. This solution can provide excellent localization performance in the band-pass situation with scalable grids. However, in outdoor applications, the long GCC range leads to very large matrices, so the high computational cost of the SVD is unavoidable.

Previously, several acoustic source localization methods have been proposed for outdoor scenarios. They mostly focus on localizing the source from TDOA [31] and DOA [32,33] measurements. Some uncertainties are then introduced by the estimation error of the TDOA or DOA estimation algorithms. Moreover, some useful information is compressed in this measurement stage, which results in unstable performance. For this reason, this paper addresses robust SRP-based source localization outdoors.

In this paper, a modified SRP-based method is proposed, in which the systematic influence of the above inevitable factors in outdoor WASNs scenarios is considered. The localization performance is analyzed using the normalized contribution of the signal components in the SRP function. A sufficient condition dependent on the GCC waveform function for robust on-grid SRP-based source localization is derived by geometrical analysis. The SRP function using GCCs satisfying this condition can suppress the disturbances induced by the grid distance and the uncertain steering time delay. A GCC refinement procedure for band-pass GCCs is then designed, which uses the complex wavelet functions in multiple sub-bands to filter the GCC and averages the envelopes of the filtered GCCs as the equivalent GCC to match the sufficient condition. Simulation results and field experiments demonstrate the excellent performance of the proposed method against the existing SRP-based methods.

The rest of this paper is organized as follows. In Section 2, the outdoor SRP-based source localization problem is formulated. Section 3 gives the sufficient condition in brief and introduces the GCC refinement procedure. The results of the simulation and the field experiment are presented in Section 4. Conclusions are given in Section 5.

2. SRP-Based Localization in Outdoor Acoustic Sensor Network

2.1. System Models

We discuss the acoustic source localization problem in an $N$-dimensional Euclidean space with $M$ distributed microphones ($M > N$). Let $x \in \mathbb{R}^N$ be a spatial coordinate vector. Specifically, define $x_s$ as the source location and $z_m$ as the position of the $m$th sensor ($m = 1, 2, \ldots, M$). Let $s(t)$ be the source signal in the time domain; the received signal of the microphone at $z_m$ can then be modeled as

$$y_m[n] = \left[ h_m(t) * s(t) + w_m(t) \right] \delta(t - n/F_s), \tag{1}$$

where $h_m(t)$ is the impulse response function representing the propagation of sound from $x_s$ to $z_m$, the operator "$*$" represents the convolution operation, $w_m(t)$ stands for the additive noise signal, and $\delta(t - n/F_s)$ denotes the sampling process at rate $F_s$. When the multi-path delay and non-linear distortion are neglected, the propagation function in the frequency domain can be simplified as

$$H_m(\omega) = A_m e^{-j\omega t_m}, \tag{2}$$

where $A_m \in \mathbb{R}$ is the amplitude-attenuation factor and $t_m$ is the time delay factor. In the frequency domain, Equation (1) can be written as

$$Y_m(\Omega) = A_m S(\Omega) e^{-j\Omega F_s t_m} + W_m(\Omega), \tag{3}$$

where $\Omega = \omega/F_s \in [-\pi, \pi]$ is the normalized angular frequency, $Y_m(\Omega)$ is the discrete-time Fourier transform (DTFT) of $y_m[n]$, and $S(\Omega)$ and $W_m(\Omega)$ are the Fourier transforms of $s(t)$ and $w_m(t)$, respectively.

Let $\eta_m(x) \in \mathbb{R}$ be the steering time delay function describing the time delay associated with sound propagation from a given location $x$ to $z_m$. In practice, it is commonly modeled as the travel time of sound along the line-of-sight (LOS) path with a constant sound speed $v_s$; i.e.,

$$\eta_m(x) = \| x - z_m \| / v_s, \tag{4}$$

where $\|\cdot\|$ denotes the Euclidean norm. Note that $\eta_m(x)$ is not exactly the real propagation time of the sound. Then the SRP function, which is defined as the output power of the filter-and-sum beamformer, is given by

$$P(x) = \int_{-\pi}^{\pi} \left| \sum_{m=1}^{M} G_m(\Omega) Y_m(\Omega) e^{j\Omega F_s \eta_m(x)} \right|^2 d\Omega, \tag{5}$$

where $G_m(\Omega) e^{j\Omega F_s \eta_m(x)}$ is the filter associated with the $m$th sensor. It can be equivalently expressed in terms of GCCs [16]:

$$P(x) = 2\pi \sum_{l=1}^{M} \sum_{m=1}^{M} R_{l,m}\big(\eta_m(x) - \eta_l(x)\big), \tag{6}$$

where

$$R_{l,m}(\tau) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Psi_{l,m}(\Omega) Y_l(\Omega) Y_m^*(\Omega) e^{j\Omega F_s \tau} d\Omega \tag{7}$$

denotes the GCC of the sensor pair $\{l, m\}$, $\tau$ is the time lag, the superscript "$(\cdot)^*$" represents the conjugate operation, and $\Psi_{l,m}(\Omega) = G_l(\Omega) G_m^*(\Omega)$ denotes the weight function of the associated GCC. Ideally, each $R_{l,m}(\tau)$ achieves its peak at $\tau = t_m - t_l$, so that the SRP function is supposed to achieve its maximum value at the source position $x_s$, as shown in Figure 1a,b. The Phase Transform (PHAT) weight function

$$\Psi_{l,m}^{\mathrm{PHAT}}(\Omega) = 1 / \left| Y_l(\Omega) Y_m^*(\Omega) \right| \tag{8}$$

is widely used in TDOA- and SRP-based localization applications. The PHAT-weighted GCC is generally referred to as the GCC-PHAT, and the SRP using the GCC-PHAT is generally referred to as the SRP-PHAT.
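As an illustration of the GCC-PHAT described above, the following is a minimal NumPy sketch, not the authors' implementation. The function name `gcc_phat`, the zero-padding length, and the regularization constant `eps` are assumptions made for the example. The cross-spectrum ordering is chosen here so that the peak lag is positive when the second signal is a delayed copy of the first, i.e., the peak estimates $t_m - t_l$ in samples.

```python
import numpy as np

def gcc_phat(y_l, y_m, eps=1e-12):
    """Return (lags, gcc): PHAT-weighted cross-correlation of two equal-length signals.

    Convention of this sketch: the peak lag is positive when y_m is a
    delayed copy of y_l, so it estimates t_m - t_l in samples.
    """
    n = len(y_l)
    nfft = 2 * n                          # zero-pad so the circular correlation does not wrap
    Y_l = np.fft.rfft(y_l, nfft)
    Y_m = np.fft.rfft(y_m, nfft)
    cross = Y_m * np.conj(Y_l)            # cross-spectrum of the pair
    cross /= np.abs(cross) + eps          # PHAT weighting: keep phase, discard magnitude
    gcc = np.fft.irfft(cross, nfft)
    gcc = np.fft.fftshift(gcc)            # move lag 0 to the middle of the array
    lags = np.arange(-(nfft // 2), nfft // 2)
    return lags, gcc
```

For a white-noise test signal and an integer delay, `lags[np.argmax(gcc)]` recovers the delay in samples; dividing by $F_s$ converts it to seconds.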

Figure 1. Comparison of steered response power (SRP)-based source localization in an ideal case and with the adverse effects (the symbols "o" and "+" represent the source position and the estimated position, respectively): (a) ideal SRP map (3D view); (b) ideal SRP map (2D view); (c) defocus effect from steering time uncertainties; (d) undersampled effect from coarse grid; (e) rippling effect from band-pass generalized cross-correlations (GCCs); (f) combined effect.

Removing the irrelevant and repetitive terms in Equation (6), the effective component for source localization can be simplified as

$$P_E(x) = \sum_{l=1}^{M-1} \sum_{m=l+1}^{M} R_{l,m}\big(\eta_m(x) - \eta_l(x)\big) = \sum_{p=1}^{C_M^2} R_p\big(\tau_p(x)\big), \tag{9}$$

where $p$ is the sequence number of the valid sensor pair $c_p = \{l, m\}$ ($l < m$), given by $p = (2M - l)(l - 1)/2 + m - l$ and ranging from one to the combinatorial number $C_M^2$; $\tau_p(x) = \eta_m(x) - \eta_l(x)$ is referred to as the steering TDOA function.
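The pair numbering and the steering TDOA function can be sketched as follows. This is an illustrative example only: the helper names `pair_index` and `steering_tdoas` are hypothetical, and the default sound speed of 345 m/s is taken from the simulation setup in Section 4.1.

```python
import numpy as np

V_S = 345.0  # assumed constant sound speed in m/s

def pair_index(l, m, M):
    """1-based sequence number p of sensor pair {l, m} with l < m (1-based sensors)."""
    return (2 * M - l) * (l - 1) // 2 + m - l

def steering_tdoas(x, sensors, v_s=V_S):
    """Steering TDOAs tau_p(x) = eta_m(x) - eta_l(x) for all pairs l < m, ordered by p."""
    eta = np.linalg.norm(sensors - x, axis=1) / v_s      # Eq. (4) for every sensor
    l_idx, m_idx = np.triu_indices(len(sensors), k=1)    # 0-based pairs with l < m
    return eta[m_idx] - eta[l_idx]
```

With $M = 4$, the pairs (1,2), (1,3), (1,4), (2,3), (2,4), (3,4) map to $p = 1, \ldots, 6$, which is the same row-major order that `np.triu_indices` produces.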

2.2. Problem Formulation

The classical SRP-based localization method often lacks robustness in outdoor scenarios. The steering time delay function $\eta_m(x)$ in the SRP function differs from the real sound propagation time, denoted as $\eta_m^0(x)$, and $\Delta\eta_m(x) = \eta_m(x) - \eta_m^0(x)$ is called the steering time-uncertainty function. Similarly, the steering TDOA-uncertainty function of a pair of sensors can be expressed as

$$\Delta\tau_p(x) = \Delta\eta_m(x) - \Delta\eta_l(x) = \tau_p(x) - \tau_p^0(x), \tag{10}$$

where $\tau_p^0(x) = \eta_m^0(x) - \eta_l^0(x)$ represents the real steering TDOA function for a given sensor pair $c_p$. This term is usually negligible within a confined space, so it has rarely been discussed in classical SRP models. However, in outdoor applications, the sound propagation is much more unpredictable, and the uncertainty grows as distances increase. The steering time uncertainty is easily influenced by the terrain, temperature, wind, and self-localization error among sensors, and it yields a noticeable defocus effect on the SRP map, as shown in Figure 1c: the GCC projections intersect each other dispersedly around $x_s$ instead of at a single point.

Since the spatial spectrum generated by the SRP function contains many local extrema and ridged areas, the maximal value of $P(x)$ is usually found through a grid-searching process. Consider a uniform sampling grid (USG) in $\mathbb{R}^N$. Define $X_g$ as the set of grid points in the candidate searching region $V \subseteq \mathbb{R}^N$, and $d_g \in \mathbb{R}$ and $N_g \in \mathbb{R}$ as the grid distance and the total number of grid points in $X_g$, respectively. Then the estimated on-grid location is formulated as

$$\hat{x}_s = \arg\max_{x \in X_g} P(x) = \arg\max_{x \in X_g} P_E(x). \tag{11}$$

Note that the localization precision depends on the grid resolution. A more accurate estimation usually requires a smaller $d_g$. This leads to a larger $N_g$ and a significantly increased computational burden because the number of grid points is inversely proportional to the $N$th power of $d_g$ (i.e., $N_g \propto (d_g)^{-N}$). Hence, accuracy and feasibility can hardly be balanced in an outdoor WASN system facing a large search region, for which the finest grid resolution allowed by the available computing power is much coarser than that in indoor applications. However, most SRP approaches work well only at fine grid resolutions, and a coarser grid resolution has an undersampled effect, as shown in Figure 1d: the searching process is likely to miss the source peak.
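To make the scaling $N_g \propto (d_g)^{-N}$ concrete, the short sketch below counts the points of a uniform grid over a square region. The function name `grid_count` is hypothetical, and the 200 m side length and the two grid distances are borrowed from the simulation setup in Section 4.1 purely for illustration.

```python
def grid_count(side_m, d_g, n_dim=2):
    """Number of uniform grid points covering a cube with the given side length."""
    per_axis = round(side_m / d_g) + 1    # include both end points of each axis
    return per_axis ** n_dim

coarse = grid_count(200, 10)     # 21 points per axis
fine = grid_count(200, 0.1)      # 2001 points per axis
print(coarse, fine, fine / coarse)
```

Refining the grid distance from 10 m to 0.1 m in two dimensions multiplies the number of SRP evaluations by roughly $100^2$, which is why a centimeter- or decimeter-level grid is rarely affordable over an outdoor search region.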

It is known that the background noise always dominates at low frequencies in the field environment, and real sound sources often show band-pass characteristics. Thus a band-pass GCC is indeed required. However, the SRP-PHAT with a band-pass source would cause a rippling effect [21], as shown in Figure 1e. The rippling effect does not alter the location of the maximal value of the SRP function. However, it may lead to local extrema and even fake peaks such that the SRP spectrum is susceptible to the two other factors and shows a lack of robustness.

Under the influence of the synthetic effect of the above inevitable factors, the real-world SRP output is illustrated in Figure 1f. It shows that classical SRP implementations hardly deal with all these factors outdoors and yield a divergent localization result.

3. A Robust Outdoor SRP-Based Source Localization Method

3.1. On-Grid SRP-Based Localization Error Bound Condition

It is known that the SRP-based spatial spectrum mainly depends on the phase information of the source components. It is reasonable to assume that the additive noise signals of the sensors are independent of each other and of the source signal, so that the noise has no spatial preference (which means that it has zero mean in the phase domain). Its contribution to the SRP spectrum can be neglected and is not related to the grid resolution or the steering time uncertainty. Therefore, only the contribution of the source signal is considered in analyzing the SRP function. With the additive noise terms $w_m(t)$ neglected, the weight function $\Psi_p(\Omega)$ of the sensor pair $c_p$ can usually be expressed as

$$\Psi_p(\Omega) = B_p \Psi_0(\Omega), \tag{12}$$

where $B_p \in \mathbb{R}$ is an amplitude-scaling factor independent of frequency, and $\Psi_0(\Omega) = \Psi_0(-\Omega) \in \mathbb{R}$ is a real function independent of the sensors. Substituting Equation (12) into Equation (7), the GCC $R_p(\tau)$ can be rewritten as

$$R_p(\tau) = \frac{B_p A_l A_m}{2\pi} \int_{-\pi}^{\pi} \Psi_0(\Omega) S(\Omega) S^*(\Omega) e^{j\Omega F_s (\tau - \tau_p^0(x_s))} d\Omega = \frac{B_p A_l A_m C_0}{2\pi} R_0\big(\tau - \tau_p^0(x_s)\big), \tag{13}$$

where $C_0 = \max_\tau \int_{-\pi}^{\pi} \Psi_0(\Omega) S(\Omega) S^*(\Omega) e^{j\Omega F_s \tau} d\Omega$, and

$$R_0(\tau) = \frac{1}{C_0} \int_{-\pi}^{\pi} \Psi_0(\Omega) S(\Omega) S^*(\Omega) e^{j\Omega F_s \tau} d\Omega \tag{14}$$

is the amplitude-normalized version of the weighted autocorrelation function of the source signal $s(t)$. Hence, each GCC contains the same waveform function $R_0(\tau)$ with a different time shift $\tau_p^0(x_s)$ and a different amplitude factor $B_p A_l A_m C_0 / (2\pi)$. In practice, the range information carried by the amplitude is usually less stable and less accurate than that carried by the time delay. Thus, a normalized mapping function representing the contribution of the source component in the SRP function can be constructed as

$$F_E(x, x_s) = \frac{1}{C_M^2} \sum_{p=1}^{C_M^2} R_0\big(\tau_p(x) - \tau_p^0(x_s)\big). \tag{15}$$

In the above equation, the amplitude factors $B_p A_l A_m C_0 / (2\pi)$ of the different sensor pairs are removed. Thus, each pair contributes equally to the SRP function. Note that $F_E(x, x_s) \in [-1, 1]$ has a definite value range regardless of the number of sensors $M$.

For a given grid distance $d_g \in \mathbb{R}_{>0}$, an arbitrary uniform sampling grid set in $\mathbb{R}^N$ can be expressed as

$$X(d_g, x_{go}) = \left\{ x + x_{go} : x = [n_1 d_g, \ldots, n_N d_g]^T;\ n_1, \ldots, n_N \in \mathbb{Z} \right\}, \tag{16}$$

where $x_{go} \in \mathbb{R}^N$ is the position of the origin of the set. Then the on-grid location estimate is given by

$$\hat{x}_{sg} = \arg\max_{x \in X(d_g, x_{go})} F_E(x, x_s) = \arg\max_{x \in X(d_g, x_{go})} \frac{1}{C_M^2} \sum_{p=1}^{C_M^2} R_0\big(\tau_p(x) - \tau_p(x_s) + \Delta\tau_p(x_s)\big). \tag{17}$$

It is worth pointing out that the grid resolution, the steering time uncertainty, and the band-pass issue are all taken into account in the simplified SRP function above.

The grid issue should be unrelated to the origin position $x_{go}$. In the real world, the uncertainty functions $\Delta\tau_p(x)$ are hard to describe precisely because of the many interference factors, and it is reasonable to assume that they have an upper bound $\Delta\tau_{max}$ (i.e., $|\Delta\tau_p(x)| \le \Delta\tau_{max}$). $\Delta\tau_{max}$ indicates the steering time delay uncertainty level and can be estimated from the environmental and device conditions. Thus, the robustness of the on-grid localization problem can be described as follows: given a $d_g$ and a $\Delta\tau_{max}$, there exists an $\varepsilon \in (0, \infty)$ such that

$$\| \hat{x}_{sg} - x_s \| \le \varepsilon. \tag{18}$$

Define a level-passed area based on $F_E(x, x_s)$:

$$M(\alpha, x_s) \triangleq \{ x : F_E(x, x_s) \ge \alpha \} \subseteq \mathbb{R}^N, \tag{19}$$

where $\alpha \in \mathbb{R}$ is the level-pass threshold. Then a sufficient condition can be obtained in the following proposition:

Proposition 1.

If $M(\alpha, x_s) \cap X(d_g, x_{go}) \ne \emptyset$ and $M(\alpha, x_s)$ is a bounded set (i.e., there exists an $\varepsilon_M \in (0, \infty)$ such that $\| x_1 - x_2 \| \le \varepsilon_M$ for all $x_1, x_2 \in M(\alpha, x_s)$), then Inequality (18) is satisfied.

The proof is given in Appendix A.1. Thus, the robustness of the on-grid source localization problem can be analyzed in terms of $M(\alpha, x_s)$.

A practical example of $M(\alpha, x_s)$ is depicted in Figure 2; its area shrinks inwards as $\alpha$ increases. The first sub-condition ($M(\alpha, x_s) \cap X(d_g, x_{go}) \ne \emptyset$) can be satisfied when $M(\alpha, x_s)$ covers a large enough area. The shape of $M(\alpha, x_s)$ depends on $\alpha$, $R_0(\tau)$, $\Delta\tau_p(x)$, and the sensor distribution, and it is generally irregular. Consider a closed ball $B_N(x_0, r) \triangleq \{ x : \| x - x_0 \| \le r;\ x_0, x \in \mathbb{R}^N \}$ with center $x_0$ and radius $r$. If

$$r \ge d_g \sqrt{N} / 2, \tag{20}$$

then $B_N(x_0, r) \cap X(d_g, x_{go}) \ne \emptyset$ is satisfied. The proof can be found in Appendix A.2. Consequently, if $B_N(x_s, d_g \sqrt{N}/2) \subseteq M(\alpha, x_s)$, then the first sub-condition is satisfied.

Figure 2. Illustration of the level-pass area $M(\alpha, x_s)$. (Orange: $M(0.3, x_s)$; yellow green: $M(0.2, x_s)$; celeste: $M(0.1, x_s)$).

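Figure 3 illustrates a typical waveform of $R_0(\tau)$, the GCC-PHAT of the passband $[\Omega_C - \Omega_B, \Omega_C + \Omega_B] \subseteq (0, \pi]$, which can be expressed by

$$R_0^{\mathrm{PHAT\text{-}BP}}(\tau) = \mathrm{sinc}\!\left( \frac{\Omega_B F_s}{\pi} \tau \right) \cos(\Omega_C F_s \tau). \tag{21}$$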

Figure 3. An example of $R_0(\tau)$.

A valid $R_0(\tau)$ is an even and bounded function (i.e., $R_0(\tau) = R_0(-\tau)$ and $R_0(\tau) \in [-1, 1]$) and contains a main-lobe around $\tau = 0$, where its maximum $a_m$ lies. The maximum side-lobe height (or the maximum value outside the main-lobe area if $R_0(\tau)$ has no side-lobes) is denoted as $a_s$, where $a_s < a_m$.

Let us define a function based on $R_0(\tau)$ by

$$T_R(a_T) \triangleq \inf\{ |\tau| : R_0(\tau) < a_T \}, \tag{22}$$

where $a_T \in (a_s, a_m]$ is the level-pass threshold of the GCC and "$\inf\{\cdot\}$" represents the infimum. $T_R(a_T)$ represents the half-width of the level-passed section of $R_0(\tau)$ within its main-lobe. It follows that $R_0(\tau) \ge a_T$ if and only if $\tau \in [-T_R(a_T), T_R(a_T)]$.
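The half-width $T_R(a_T)$ can be evaluated numerically for any candidate waveform. The sketch below does this on a dense grid for the band-pass GCC-PHAT waveform of Equation (21). The function names, the dense-grid search, and the sampling rate used in the usage note are assumptions of this example. Note that `np.sinc` is the normalized sinc, $\sin(\pi x)/(\pi x)$, which matches the argument $\Omega_B F_s \tau / \pi$.

```python
import numpy as np

def r0_phat_bp(tau, omega_c, omega_b, fs):
    """Band-pass GCC-PHAT waveform, Eq. (21); np.sinc(x) = sin(pi x) / (pi x)."""
    return np.sinc(omega_b * fs * tau / np.pi) * np.cos(omega_c * fs * tau)

def half_width(r0, a_t, tau_max, n=200001):
    """Numerical T_R(a_T): smallest |tau| on a dense grid where r0(tau) < a_T."""
    tau = np.linspace(0.0, tau_max, n)      # r0 is even, so tau >= 0 suffices
    below = np.nonzero(r0(tau) < a_t)[0]
    return tau[below[0]] if below.size else np.inf
```

For example, with a hypothetical $F_s$ of 1000 Hz and the passband $[0.15\pi, 0.4\pi]$ (so $\Omega_C = 0.275\pi$ and $\Omega_B = 0.125\pi$), `half_width(lambda t: r0_phat_bp(t, 0.275 * np.pi, 0.125 * np.pi, 1000.0), 0.5, 0.01)` returns a value slightly above one sample period, which shows how narrow the main-lobe of a band-pass GCC-PHAT is.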

Based on the geometrical analysis in Appendix A.3, if $R_0(\tau)$ possesses the following property:

$$T_R(\alpha) \ge d_g \sqrt{N} / v_s + \Delta\tau_{max}, \tag{23}$$

then $M(\alpha, x_s) \supseteq B_N(x_s, d_g \sqrt{N}/2)$. Therefore, the first sub-condition can be satisfied.

For all $\alpha$ such that $\alpha > \max_{\|x\| \to +\infty} \{ F_E(x, x_s) \}$, the second sub-condition ($M(\alpha, x_s)$ is a bounded set) is satisfied. The area of $M(\alpha, x_s)$ is mainly the superposition of the projection areas of the main-lobe sections of the GCCs belonging to individual sensor pairs. Denote

$$\Lambda_p(\tau_c, T) = \{ x : | \tau_p(x) - \tau_c | \le T \}$$

to be the projection area of the TDOA section $[\tau_c - T, \tau_c + T]$ of sensor pair $c_p$, where $T \in [0, \infty)$ and $\tau_c \in [-\tau_p^{max}, \tau_p^{max}]$ are the half-width and the central TDOA of the section, respectively, and $\tau_p^{max} = \| z_l - z_m \| / v_s$ is the maximal TDOA value that this sensor pair can produce.

For each sensor pair $c_p$, the solution set of the half-hyperbola equation $\tau_p(x) = \tau_c$ can be denoted as $\Lambda_p(\tau_c, 0)$ and extends to infinity (i.e., there exists an $x$ such that $\| x \| = \infty$ and $x \in \Lambda_p(\tau_c, 0)$). For two different sensor pairs $c_i$ and $c_j$, if there exist a $\tau_{ic} \in [-\tau_i^{max}, \tau_i^{max}]$ and a $\tau_{jc} \in [-\tau_j^{max}, \tau_j^{max}]$ such that $\Lambda_i(\tau_{ic}, 0) \subseteq \Lambda_j(\tau_{jc}, 0)$ or $\Lambda_i(\tau_{ic}, 0) \supseteq \Lambda_j(\tau_{jc}, 0)$, then the half-hyperbola equations $\tau_i(x) = \tau_{ic}$ and $\tau_j(x) = \tau_{jc}$ are not independent. This situation might occur when the sensors of the two pairs are collinear or share an axis of symmetry and, at the same time, both $\tau_{ic}$ and $\tau_{jc}$ reach their extrema or become zero. In WASNs, this case rarely happens because the sensor distributions are often irregular. Excluding this case for all sensor pairs, the maximal value of $F_E(x, x_s)$ at infinity does not exceed a linear combination of $a_m$ and $a_s$, which is given as

$$\alpha_{inf} = \frac{C_N^2 a_m + (C_M^2 - C_N^2) a_s}{C_M^2}. \tag{24}$$

The detailed derivation can be found in Appendix A.4. If $\alpha > \alpha_{inf}$, then $M(\alpha, x_s)$ is bounded.

Combining Inequality (23) and Equation (24), a sufficient condition for robust on-grid source localization is given by

$$T_R(\alpha_{inf}) > d_g \sqrt{N} / v_s + \Delta\tau_{max}. \tag{25}$$

This means that, for a given grid distance $d_g$ and steering TDOA uncertainties within $\Delta\tau_{max}$, if the GCC waveform function $R_0(\tau)$ has a main-lobe wide enough to satisfy this condition, then a divergent on-grid location estimate can be avoided.
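The two sides of the sufficient condition can be evaluated with a few lines of code. The sketch below computes $\alpha_{inf}$ from Equation (24) and the required main-lobe half-width on the right-hand side of Inequality (25). The function names are hypothetical, and the numbers in the usage note (eight sensors in two dimensions, $d_g = 10$ m, $\Delta\tau_{max} = 100$ ms, $v_s = 345$ m/s) are taken from the simulation setup in Section 4.1 for illustration only.

```python
import math

def alpha_inf(a_m, a_s, M, N):
    """Eq. (24): bound on F_E at infinity, using C(n, 2) pair counts."""
    c_m, c_n = math.comb(M, 2), math.comb(N, 2)
    return (c_n * a_m + (c_m - c_n) * a_s) / c_m

def required_half_width(d_g, delta_tau_max, N, v_s=345.0):
    """Right-hand side of Inequality (25), in seconds."""
    return d_g * math.sqrt(N) / v_s + delta_tau_max

def is_robust(t_r_at_alpha_inf, d_g, delta_tau_max, N, v_s=345.0):
    """True if a waveform with half-width T_R(alpha_inf) satisfies Inequality (25)."""
    return t_r_at_alpha_inf > required_half_width(d_g, delta_tau_max, N, v_s)
```

For a side-lobe-free waveform ($a_m = 1$, $a_s = 0$) with $M = 8$ and $N = 2$, `alpha_inf(1, 0, 8, 2)` equals $1/28$, and `required_half_width(10, 0.1, 2)` is about 0.141 s. A band-pass GCC-PHAT main-lobe that is only a few samples wide is far from meeting this requirement.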

The SRP-PHAT generates a sharp GCC to increase the TDOA resolution in cases with reverberation or multiple sources. However, as shown in Figure 3, the band-pass effect gives the GCC waveform function a narrow main-lobe and strong side-lobes. Such a waveform can hardly satisfy the requirement in Inequality (25), which is also reflected in the poor performance of the SRP-PHAT in Figure 1f. Next, we introduce a GCC waveform refinement procedure for the band-pass SRP.

3.2. Robust SRP-Based Source Localization with Refined GCC Waveform

The condition in Inequality (25) is too strict for band-pass GCC situations with coarse grid resolution and perceptible steering TDOA uncertainties. Some classical GCC methods utilized low-pass filtering to meet a broader main-lobe requirement, but they are not applicable for band-pass signals. In this section, the GCC is refined to obtain a suitable waveform to modify the SRP function.

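Consider a complex wavelet function $\psi_e(\tau, \Omega_C) = u_e(\tau) e^{j\Omega_C F_s \tau}$, where $u_e(\tau) \in L^2(\mathbb{R})$ is an even-symmetric function. Applying $\psi_e(\tau, \Omega_C)$ as the filtering function on the GCC-PHAT, the filtered output of $c_p$ can be denoted as

$$R_p^{CF}(\tau, \Omega_C) = R_p^{PHAT}(\tau) * \psi_e(\tau, \Omega_C), \tag{26}$$

where $R_p^{PHAT}(\tau)$ is the GCC-PHAT of $c_p$.

When the real function $u_e(\tau)$ has an effective support $[-\Omega_B, \Omega_B] \subseteq [-\pi, \pi]$ in the frequency domain, i.e.,

$$\int_{-\infty}^{\infty} |U_e(\Omega)|^2 d\Omega - \int_{-\Omega_B}^{\Omega_B} |U_e(\Omega)|^2 d\Omega \ll \int_{-\Omega_B}^{\Omega_B} |U_e(\Omega)|^2 d\Omega, \tag{27}$$

where $U_e(\Omega)$ is the Fourier transform of $u_e(\tau)$, and if the source is dominant in the frequency band $[\Omega_C - \Omega_B, \Omega_C + \Omega_B] \subseteq (0, \pi]$, then the approximation

$$\begin{aligned} R_p^{CF}(\tau, \Omega_C) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{Y_{p_l}(\Omega) Y_{p_m}^*(\Omega)}{|Y_{p_l}(\Omega) Y_{p_m}^*(\Omega)|} U_e(\Omega - \Omega_C) e^{j\Omega F_s \tau} d\Omega \\ &\approx \frac{1}{2\pi} \int_{\Omega_C - \Omega_B}^{\Omega_C + \Omega_B} \frac{Y_{p_l}(\Omega) Y_{p_m}^*(\Omega)}{|Y_{p_l}(\Omega) Y_{p_m}^*(\Omega)|} U_e(\Omega - \Omega_C) e^{j\Omega F_s \tau} d\Omega \\ &\approx \frac{1}{2\pi} \int_{\Omega_C - \Omega_B}^{\Omega_C + \Omega_B} e^{-j\Omega F_s \tau_p^0(x_s)} U_e(\Omega - \Omega_C) e^{j\Omega F_s \tau} d\Omega \\ &\approx \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-j\Omega F_s \tau_p^0(x_s)} U_e(\Omega - \Omega_C) e^{j\Omega F_s \tau} d\Omega \\ &= u_e\big(\tau - \tau_p^0(x_s)\big) e^{j\Omega_C F_s (\tau - \tau_p^0(x_s))} \end{aligned} \tag{28}$$

holds. It can be observed that the approximate function carries the same envelope as $u_e(\tau)$ and extracts the TDOA information in $[\Omega_C - \Omega_B, \Omega_C + \Omega_B]$.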

Note that $R_p^{CF}(\tau, \Omega_C)$ is equivalent to a time-domain computation of the sub-band GCC defined in [30]. Since the main goal is to obtain an equivalent GCC that matches the sufficient condition in Inequality (25), a lightweight approach is to average the envelopes of the filtered GCCs of multiple sub-bands with high SNR. According to the power spectral density (PSD) of the source signal or other prior knowledge, $N_q$ valid sub-bands can be selected, each with its own central frequency $\Omega_q$. The final refined GCC is given by

$$R_p^{WR}(\tau) = \frac{1}{N_q} \sum_q \left| R_p^{CF}(\tau, \Omega_q) \right| \approx \left| u_e\big(\tau - \tau_p^0(x_s)\big) \right|,$$

which has the specific waveform function $R_0(\tau) \approx |u_e(\tau)|$. Furthermore, the improved spatial function is calculated as

$$P_{WR}(x) = \frac{1}{C_M^2} \sum_{p=1}^{C_M^2} R_p^{WR}\big(\tau_p(x)\big) = \frac{1}{C_M^2 N_q} \sum_{p=1}^{C_M^2} \sum_{q=1}^{N_q} \left| R_p^{CF}\big(\tau_p(x), \Omega_q\big) \right|. \tag{29}$$

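The selection of $u_e(\tau)$ has a significant influence on the refinement of the GCC. Its envelope $|u_e(\tau)|$ provides the waveform function of the refined GCCs. The envelope of a suitable $u_e(\tau)$ should have no side-lobes, i.e., $|u_e(\tau_1)| > |u_e(\tau_2)| \ge 0$ for all $|\tau_1| < |\tau_2|$. Meanwhile, each $U_e(\Omega - \Omega_q)$ in the frequency domain serves as a band-pass filter; thus, the spectral distribution of $U_e(\Omega)$ should be concentrated to satisfy Inequality (27). The Gaussian function given by

$$u_e(\tau) = e^{-(\Omega_d F_s \tau)^2} \tag{30}$$

possesses the required properties both in the time domain and in the frequency domain. The corresponding complex filtering function $\psi_e(\tau, \Omega_C)$ can then be regarded as a complex Morlet wavelet. According to (25), for a given grid distance $d_g$ and steering TDOA uncertainty level $\Delta\tau_{max}$, the parameter $\Omega_d$ can be given by

$$\Omega_d = \frac{v_s \sqrt{-\ln \alpha}}{F_s \left( d_g \sqrt{N} + v_s \Delta\tau_{max} \right)}, \tag{31}$$

where $N$ is the space dimension and $\alpha$ is the threshold value, which can usually be set as $\alpha = 0.5$. Substituting Equation (31) into Inequality (27) and dividing (27) by its right-hand side yields

$$\left( \int_{-\infty}^{\infty} e^{-\left( \frac{\Omega}{2\Omega_d} \right)^4} d\Omega - \int_{-\Omega_B}^{\Omega_B} e^{-\left( \frac{\Omega}{2\Omega_d} \right)^4} d\Omega \right) \Big/ \int_{-\Omega_B}^{\Omega_B} e^{-\left( \frac{\Omega}{2\Omega_d} \right)^4} d\Omega \ll 1.$$

Thus, the relation between $\Omega_d$ and $\Omega_B$ can be obtained from the following equivalent equation:

$$\left( \int_{0}^{\infty} e^{-\Omega^4} d\Omega - \int_{0}^{\Omega_B / (2\Omega_d)} e^{-\Omega^4} d\Omega \right) \Big/ \int_{0}^{\Omega_B / (2\Omega_d)} e^{-\Omega^4} d\Omega = c,$$

where $c$ is an extremely small number. Then, it can be obtained that

$$\Omega_B = 2 c_e \Omega_d, \tag{32}$$

where $c_e$ is the positive solution of the following equation:

$$x E_{3/4}(x^4) = \frac{4c}{1 + c} \Gamma\!\left( \frac{5}{4} \right),$$

where $E_n(x) = \int_{1}^{+\infty} \frac{e^{-xt}}{t^n} dt$ ($x > 0$) and $\Gamma(x) = \int_{0}^{+\infty} t^{x-1} e^{-t} dt$ ($x > 0$). When $c$ is set as 0.001 (−30 dB), $c_e$ in Equation (32) can be obtained as 2.89.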
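The choice of $\Omega_d$ in Equation (31) can be checked numerically: it is the value for which the Gaussian envelope of Equation (30) drops to $\alpha$ exactly at the lag $d_g \sqrt{N}/v_s + \Delta\tau_{max}$ required by Inequality (25). The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the sampling rate in the usage note is an arbitrary example value, not one given in the text.

```python
import math

def omega_d(d_g, delta_tau_max, fs, N=2, alpha=0.5, v_s=345.0):
    """Eq. (31): Gaussian width parameter in normalized angular frequency."""
    return v_s * math.sqrt(-math.log(alpha)) / (fs * (d_g * math.sqrt(N) + v_s * delta_tau_max))

def u_e(tau, om_d, fs):
    """Eq. (30): Gaussian target waveform evaluated at lag tau (seconds)."""
    return math.exp(-((om_d * fs * tau) ** 2))
```

For instance, with $d_g = 10$ m, $\Delta\tau_{max} = 0.1$ s, and an assumed $F_s$ of 8000 Hz, evaluating `u_e` at $\tau = 10\sqrt{2}/345 + 0.1$ s with `omega_d(10, 0.1, 8000)` gives 0.5, i.e., the envelope stays at or above $\alpha$ over the whole TDOA range that a grid cell and the steering uncertainty can produce.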

A simulation is performed to illustrate the effect of the GCC waveform refinement procedure on on-grid SRP-based source localization. In Figure 4, the dot-dashed box shows the range of TDOA values within the volume of the nearest grid point $x_g$; the dashed line with "Δ" shows the real TDOA, which should coincide with the peak of the GCC; and the dotted line with "∇" marks $R_p(\tau_p(x_g))$, the GCC value at the nearest grid point $x_g$. For the traditional GCC-PHAT, $R_p(\tau_p(x_g))$ is small, which leads to poor performance in grid searching. In contrast, the proposed refining method generates a smooth waveform with high values throughout the TDOA region indicated by the box in the figure.

Figure 4. An example of refined GCC from field data: (a) GCC-Phase Transform (PHAT); (b) refined GCC.

The modified algorithm with the GCC refinement procedure is shown in Algorithm 1, in which $u_e(\tau) = e^{-(\Omega_d F_s \tau)^2}$ is taken as the target waveform function.

Algorithm 1: SRP with the waveform refinement procedure
Parameter Setting
(1) Set the maximum steering TDOA error $\Delta\tau_{max} = \Delta\tau_{max}^C + \Delta\tau_{max}^S$, where the sub-items $\Delta\tau_{max}^C$ and $\Delta\tau_{max}^S$ are determined by the wind and the synchronization error of the sensors, respectively.
(2) Set the grid distance $d_g$ and the searching region $V$ that meet the system requirement. Then generate the searching grid set $X_g$.
(3) Set the waveform function $u_e(\tau) = e^{-(\Omega_d F_s \tau)^2}$ and $\alpha = 0.5$.
(4) Set $c = 0.001$ and compute the bandwidth $\Omega_B$ using Equation (32).
Band Selecting
(1) Set up the passband $[\Omega_L, \Omega_U]$.
(2) Pick the $N_q$ bands with the highest PSD of the source, or divide the passband uniformly.
Source Localization
(1) Calculate the refinement waveform (WR)-SRP function $P_{WR}(x)$ by Equation (29) at all $x \in X_g$.
(2) Estimate the source location $\hat{x}_s$ by Equation (11).
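The source localization stage of Algorithm 1 can be sketched as follows. This is a simplified illustration rather than the authors' implementation: it assumes that the GCC-PHATs of all pairs are already available on a common integer lag axis (in samples), filters them in the time domain with a truncated complex Morlet kernel, and looks up the refined GCC at the steering TDOAs by linear interpolation. All function names, the kernel truncation length `half_len`, and the peak normalization are choices made for this example.

```python
import numpy as np

def morlet_kernel(om_c, om_d, half_len):
    """Complex Morlet filter psi_e[n] = exp(-(om_d n)^2) exp(j om_c n), n in samples."""
    n = np.arange(-half_len, half_len + 1)
    return np.exp(-((om_d * n) ** 2)) * np.exp(1j * om_c * n)

def refine_gcc(gcc, centers, om_d, half_len):
    """Average the envelopes of the sub-band filtered GCCs (refined GCC of one pair)."""
    env = [np.abs(np.convolve(gcc, morlet_kernel(oc, om_d, half_len), mode="same"))
           for oc in centers]
    out = np.mean(env, axis=0)
    return out / out.max()        # peak normalization is a convenience of this sketch

def wr_srp(gccs, lags, pairs, sensors, grid, fs, centers, om_d, half_len, v_s=345.0):
    """Evaluate the WR-SRP map on every grid point; return (best grid point, map)."""
    # Steering delays eta_m(x) for every grid point and sensor, Eq. (4).
    eta = np.linalg.norm(grid[:, None, :] - sensors[None, :, :], axis=2) / v_s
    p_map = np.zeros(len(grid))
    for gcc, (l, m) in zip(gccs, pairs):               # pairs are 0-based (l, m), l < m
        refined = refine_gcc(gcc, centers, om_d, half_len)
        tau = eta[:, m] - eta[:, l]                    # steering TDOA tau_p(x)
        p_map += np.interp(tau * fs, lags, refined)    # refined GCC at tau_p(x)
    p_map /= len(pairs)
    return grid[np.argmax(p_map)], p_map
```

Filtering each GCC once and then interpolating keeps the per-grid-point cost at one table lookup per sensor pair, so the refinement adds a fixed overhead that does not grow with the number of grid points.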

4. Experiment Results and Discussion

4.1. Numerical Simulations

In this section, we use Monte Carlo simulations to analyze the performance of the proposed SRP-based localization method (the SRP functional with the refinement waveform, referred to as WR). It is compared with four existing methods: the traditional SRP functional with GCC-PHAT (PS); the SRP functional with the envelope of GCC-PHAT (PES), which is designed for acoustic band-pass signals [21]; the modified-SRP (M-SRP) functional with GCC-PHAT (PM) [18], in which the grid resolution is considered; and the M-SRP functional with the envelope of GCC-PHAT (PEM), in which both the band-pass effect and the grid resolution are considered.

In this setup, $M = 8$ sensors and one source are randomly deployed in a monitored area of 200 m by 200 m. The propagation model is the line-of-sight path with a constant sound speed of 345 m/s. The input GCCs are generated by the waveform function in Equation (21) with a passband of $[0.15\pi, 0.4\pi]$. The steering TDOA uncertainty $\Delta\tau_p(x)$ is uniformly distributed over $[-\Delta\tau_{max}, \Delta\tau_{max}]$, where $\Delta\tau_{max}$ is the maximal time uncertainty, which depends on the sound-propagation model error and the synchronization error.

We consider four different conditions in WASNs to test the algorithms: (a) a small steering TDOA uncertainty and small grid distance (STSG) condition with $\Delta\tau_{max} = 0.1$ ms and $d_g = 0.1$ m; (b) a large steering TDOA uncertainty and small grid distance (LTSG) condition with $\Delta\tau_{max} = 100$ ms and $d_g = 0.1$ m; (c) a small steering TDOA uncertainty and large grid distance (STLG) condition with $\Delta\tau_{max} = 0.1$ ms and $d_g = 10$ m; (d) a large steering TDOA uncertainty and large grid distance (LTLG) condition with $\Delta\tau_{max} = 100$ ms and $d_g = 10$ m.

The mean absolute error (MAE) of distance, $E\{ \| \hat{x}_s - x_s \| \}$, and the cumulative distribution function (CDF) of the relative distance error are calculated to evaluate the accuracy and robustness of these algorithms, where the relative distance in the CDF is normalized by the grid distance, i.e.,

$$F(e_u) = P\big( \| \hat{x}_s - x_s \| / d_g \le e_u \big), \tag{33}$$

where $e_u$ is the relative positioning error, which is determined by the system requirement. Specifically, the 95th percentile of the localization error in meters is computed as $F^{-1}(0.95) \cdot d_g$.
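The evaluation metrics above can be computed from a set of Monte Carlo trials as in the following sketch. The function name `localization_metrics` and the use of `np.quantile` with its default linear interpolation as the empirical inverse CDF are assumptions of this example; the text does not specify how the percentile is estimated.

```python
import numpy as np

def localization_metrics(est, true, d_g, e_u):
    """Return (MAE in m, empirical F(e_u), 95th-percentile error in m)."""
    err = np.linalg.norm(est - true, axis=1)    # ||x_hat_s - x_s|| for each trial
    rel = err / d_g                             # error in units of the grid distance
    mae = err.mean()
    cdf_at_eu = np.mean(rel <= e_u)             # Eq. (33), estimated from the trials
    p95 = np.quantile(rel, 0.95) * d_g          # F^{-1}(0.95) * d_g
    return mae, cdf_at_eu, p95
```

Here `est` and `true` are arrays of shape (trials, $N$) holding the estimated and true source positions of each trial.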

The MAE and 95th percentile results are listed in Table 1. All the localization algorithms obtain their best estimation accuracy in the STSG condition, in which the defocus effect and the undersampled effect are slight. When the steering TDOA uncertainty or the grid distance increases, the MAE increases. However, compared with the PS, PES, PM, and PEM methods, the WR achieves the smallest MAE in every condition except STLG, where it is close to the PEM (4.46 m versus 4.07 m), because all these factors have been considered. The 95th percentile shows a similar trend to the MAE, which indicates that the proposed WR method has a stable localization performance in outdoor conditions.

Table 1. Mean absolute error (MAE) and 95th percentile of the localization error under different conditions in the simulation.

MAE (m)
Condition   PS       PES      PM       PEM      WR
STSG        0.81     0.07     1.01     0.07     0.06
LTSG        44.53    29.27    52.04    36.37    13.16
STLG        51.90    15.39    42.97    4.07     4.46
LTLG        77.64    50.74    70.37    22.88    13.65

95th percentile (m)
Condition   PS       PES      PM       PEM      WR
STSG        2.83     0.17     2.99     0.18     0.17
LTSG        123.13   82.61    128.10   118.61   33.43
STLG        147.04   58.81    124.39   7.11     9.24
LTLG        172.37   139.73   163.95   74.07    34.68

Figure 5a–d depict the CDF of each algorithm in the range $e_u \in [0.5, 100\ \mathrm{m}/d_g]$ under the four conditions. In the STSG condition, the CDF curves rise rapidly with the location error, and the estimation errors are the smallest for all the algorithms. The CDF curves move down as the grid distance $d_g$ and the steering TDOA uncertainty $\Delta\tau_{max}$ increase, as in the LTSG, STLG, and LTLG conditions. Since the steering TDOA uncertainty is not considered in the PES and PEM, their CDF curves drop less in the STLG condition than in the LTSG condition. Among these localization algorithms, the CDF of the WR is the highest or very close to the highest (STLG), and the PEM method is better than the PS, PES, and PM. The proposed WR method remains robust even when the conditions become severe.

Figure 5. Simulation comparison of the cumulative distribution function (CDF) of the relative distance error: (a) small steering time difference of arrival (TDOA) uncertainty and small grid distance (STSG); (b) large steering TDOA uncertainty and small grid distance (LTSG); (c) small steering TDOA uncertainty and large grid distance (STLG); (d) large steering TDOA uncertainty and large grid distance (LTLG).

Furthermore, Figure 6 presents the MAE in four situations: (a) fixed small steering TDOA uncertainty (ST) with Δτmax = 0.1 ms and dg ranging from 0.1 m to 50 m; (b) fixed large steering TDOA uncertainty (LT) with Δτmax = 100 ms and dg ranging from 0.1 m to 50 m; (c) fixed small grid distance (SG) with dg = 0.1 m and Δτmax ranging from 0.1 ms to 100 ms; (d) fixed large grid distance (LG) with dg = 10 m and Δτmax ranging from 0.1 ms to 100 ms. The MAE increases significantly with dg or Δτmax, which indicates that the steering TDOA uncertainty and the grid distance have a severe influence on the performance of source localization. In each situation, the PS and PM produce a larger MAE than the other algorithms when dg and Δτmax are small because they are not designed for band-pass signals. Since scalable grid sampling and steering TDOA uncertainty are not considered in the PES, it shows reliable performance only when dg ≤ 1 m and Δτmax ≤ 1 ms. The PEM considers both the grid size and the band-pass effect; thus, it achieves the best performance in the small-Δτmax case. However, its MAE degrades when the steering TDOA uncertainty has more influence than the grid size. The WR obtains an MAE close to that of the PEM when Δτmax is small and the smallest MAE in all the other situations. These results demonstrate its robust performance.

Figure 6. The mean absolute errors (MAEs) under different conditions: (a) small steering TDOA uncertainty (ST) (Δτmax = 0.1 ms, dg ∈ [0.1 m, 50 m]); (b) large steering TDOA uncertainty (LT) (Δτmax = 100 ms, dg ∈ [0.1 m, 50 m]); (c) small grid distance (SG) (dg = 0.1 m, Δτmax ∈ [0.1 ms, 100 ms]); (d) large grid distance (LG) (dg = 10 m, Δτmax ∈ [0.1 ms, 100 ms]).

4.2. Field Experiment

In this experiment, seven nodes are distributed in a park, as shown in Figure 7a,b. Each node consists of a microphone sensor, a Wi-Fi module, and a GPS module for self-localization and time calibration. The monitoring area is again 200 m × 200 m and includes a hillock. A portable speaker plays three types of sound signals at 12 positions inside the area: a Gaussian signal (S-G), the whistle of vehicles (S-V) representing an urban source, and birdsong (S-B) representing a field source. The temperature was approximately 30 °C, and the wind speed was below 3 m/s. Therefore, Δτmax is set to 10 ms in the proposed method, which fully accounts for the self-localization error of the sensors and the effect of wind.

Figure 7. Setup of the field experiment: (a) device; (b) node distribution; (c) estimated power spectral density (PSD) of the sensor signal 30 m away from the source; (d) estimated signal-to-noise ratio (SNR).

The sampling frequency is 10,000 Hz. Figure 7c shows the PSDs of the background noise and of the received source signals, which are estimated with the Burg method (model order 50, FFT length 2048). The PSDs of the source signals are measured about 30 m away from the speaker. Because the environmental noise is mainly concentrated in the frequency bands below 1500 Hz, the passband is set to (1500 Hz, 3500 Hz) for all sources. The estimated SNRs are shown in Figure 7d, where the SNRs of the full band (0, 5000 Hz) and of the passband (1500 Hz, 3500 Hz) are plotted as solid lines and dashed lines, respectively. For all three source types, restricting the signal to the passband improves the SNR by 20 dB to 30 dB.

The recorded data are divided into 1242 two-second audio frames. SRP algorithms with full-band and band-pass cross-correlation (referred to as CSF and CSB) are added to assess the need for band-pass processing. The PS and PM are excluded since they proved unreliable in the simulation. The candidate SRP-based locators compared in this subsection are therefore: (1) SRP with full-band GCC (CSF), (2) SRP with band-pass GCC (CSB), (3) SRP with the envelope of band-pass GCC-PHAT (PES), (4) MSRP with the envelope of band-pass GCC-PHAT (PEM), and (5) WR-SRP with band-pass GCC (WR). A well-known TDOA-based localization method [13] (referred to as TC), in which the TDOAs are obtained from band-pass GCC-PHATs, is also included as a reference.

The MAE and the 95th percentile of the localization errors of the TC method and the SRP-based methods with different grid distances (dg ∈ {0.1, 1, 10} m) are listed in Table 2. Moreover, the MAEs with the grid distance dg ranging from 0.1 m to 50 m are presented in Figure 8a. Figure 8b–d give the CDF curves at the three grid distances (dg ∈ {0.1, 1, 10} m).

Table 2. Mean absolute error (MAE) and 95th percentile of the localization error under different conditions in the field experiment.

MAE (m)
Condition     TC       CSF      CSB      PES      PEM      WR
no grid       102.2    -        -        -        -        -
dg = 0.1 m    -        79.2     23.5     7.1      18.7     1.4
dg = 1 m      -        83.0     33.0     12.6     27.4     2.0
dg = 10 m     -        93.3     66.0     42.6     46.1     7.2

95th percentile (m)
Condition     TC       CSF      CSB      PES      PEM      WR
no grid       322.8    -        -        -        -        -
dg = 0.1 m    -        146.5    100.8    53.7     105.0    5.4
dg = 1 m      -        150.4    113.1    91.6     105.1    6.0
dg = 10 m     -        171.8    149.0    138.5    104.6    21.0

Figure 8. Experiment results: (a) MAE comparison; (b) CDF of relative error at dg = 0.1 m; (c) CDF of relative error at dg = 1 m; (d) CDF of relative error at dg = 10 m.

As in the simulation, the MAEs increase and the CDF curves move down as the grid distance increases. The MAE of the TC method is the highest because some sensor pairs can produce severely erroneous TDOA measurements in noisy acoustic environments; its CDF curve also shows that the solution is not stable. Comparing the results of the CSF and CSB shows that the band-pass GCC significantly enhances the SNR and the localization performance. The PES and PEM yield considerably larger localization errors than the WR and lack robustness, which indicates that the influence of the steering TDOA uncertainty is substantial. The proposed WR method achieves the smallest errors at all the grid distances, which verifies its effectiveness.

5. Conclusions

In this work, a novel and robust Steered Response Power (SRP)-based source localization approach is proposed to localize the band-pass source in outdoor WASNs with steering time delay uncertainty and coarser spatial grids. The robustness of on-grid source localization is analyzed by a sufficient condition, in which the relation between GCC signal waveform and on-grid localization error is demonstrated. A band-pass GCC refinement procedure is designed to meet the sufficient condition for enhancing the on-grid source localization performance. The Monte Carlo simulation and field experiment show that the proposed method has a robust performance in outdoor WASNs scenarios, compared with some state-of-the-art SRP-based methods.

Abbreviations

The following abbreviations are used in this manuscript:

SRP steered response power
TDOA time difference of arrival
DOA direction of arrival
GCC generalized cross-correlation
PHAT phase transform
CDF cumulative distribution function
GPS Global Positioning System
FFT Fast Fourier Transform

Appendix A

Appendix A.1

Proposition A1.

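If $\mathcal{M}(\alpha,\mathbf{x}_s)\cap\mathcal{X}(d_g,\mathbf{x}_g^o)\neq\emptyset$ and $\mathcal{M}(\alpha,\mathbf{x}_s)$ is a bounded set (i.e., there exists an $\varepsilon_M\in(0,\infty)$ such that $\|\mathbf{x}_1-\mathbf{x}_2\|\le\varepsilon_M$ for all $\mathbf{x}_1,\mathbf{x}_2\in\mathcal{M}(\alpha,\mathbf{x}_s)$), then Inequality (18) is satisfied.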

Proof of Proposition A1.

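For an arbitrary $\mathbf{x}_g^o$, if $\mathcal{M}(\alpha,\mathbf{x}_s)\cap\mathcal{X}(d_g,\mathbf{x}_g^o)\neq\emptyset$, there exists an $\mathbf{x}_a$ such that $\mathbf{x}_a\in\mathcal{M}(\alpha,\mathbf{x}_s)\cap\mathcal{X}(d_g,\mathbf{x}_g^o)$. Let $\hat{\mathbf{x}}_s^g$ be the estimate obtained from Equation (17). Then $F_E(\hat{\mathbf{x}}_s^g,\mathbf{x}_s)\ge F_E(\mathbf{x}_a,\mathbf{x}_s)\ge\alpha$. According to the definition of $\mathcal{M}(\alpha,\mathbf{x}_s)$, $\hat{\mathbf{x}}_s^g\in\mathcal{M}(\alpha,\mathbf{x}_s)$ holds. Since $\mathcal{M}(\alpha,\mathbf{x}_s)$ is a bounded set, $\|\mathbf{x}_a\|<\infty$, and hence $\|\mathbf{x}_s-\mathbf{x}_a\|$ is finite. Let $\varepsilon_M\in(0,\infty)$ be a bound of $\mathcal{M}(\alpha,\mathbf{x}_s)$ and let $\varepsilon=\varepsilon_M+\|\mathbf{x}_s-\mathbf{x}_a\|\in(0,\infty)$. Then $\|\hat{\mathbf{x}}_s^g-\mathbf{x}_s\|\le\|\hat{\mathbf{x}}_s^g-\mathbf{x}_a\|+\|\mathbf{x}_a-\mathbf{x}_s\|\le\varepsilon$. □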

Appendix A.2

Proposition A2.

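If a closed ball $B_N(\mathbf{x}^o,r)$ satisfies $r\ge d_g\sqrt{N}/2$, then for all $\mathbf{x}_g^o\in\mathbb{R}^N$, $B_N(\mathbf{x}^o,r)\cap\mathcal{X}(d_g,\mathbf{x}_g^o)\neq\emptyset$ holds.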

Proof of Proposition A2.

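Let $B_N(\mathbf{x}^o,r)$ be a closed ball with center $\mathbf{x}^o$ and radius $r$. For an arbitrary $\mathbf{x}_g^o\in\mathbb{R}^N$, the vector from $\mathbf{x}_g^o$ to $\mathbf{x}^o$ is denoted as $\Delta\mathbf{x}^o=\mathbf{x}^o-\mathbf{x}_g^o=[\Delta x_1^o,\ldots,\Delta x_N^o]^T$. Given $d_g\in\mathbb{R}^+$, define $n_k^o=\lfloor\Delta x_k^o/d_g\rceil$ $(k=1,\ldots,N)$, where $\lfloor\cdot\rceil$ denotes the nearest integer. Therefore, we can find the grid point $\mathbf{x}_g^n=\mathbf{x}_g^o+[n_1^o d_g,\ldots,n_N^o d_g]^T\in\mathcal{X}(d_g,\mathbf{x}_g^o)$, so that $\mathbf{x}^o-\mathbf{x}_g^n=[\Delta x_1^o-n_1^o d_g,\ldots,\Delta x_N^o-n_N^o d_g]^T$. Since each component has magnitude at most $d_g/2$, the distance satisfies

$$\|\mathbf{x}_g^n-\mathbf{x}^o\|\le\sqrt{\sum_{i=1}^{N}\left(\frac{d_g}{2}\right)^2}=\frac{\sqrt{N}\,d_g}{2}.$$

Thus, if $r\ge\sqrt{N}\,d_g/2$, then $\mathbf{x}_g^n\in B_N(\mathbf{x}^o,r)$. Hence, $\mathcal{X}(d_g,\mathbf{x}_g^o)\cap B_N(\mathbf{x}^o,r)\neq\emptyset$ holds. □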
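The rounding argument of Proposition A2 can be checked numerically. The sketch below is an illustration only: it draws random ball centers and grid origins (arbitrary ranges chosen here, not values from the paper), rounds each coordinate offset to the nearest grid index, and records the largest distance to the resulting grid point. The function names are hypothetical.

```python
import math
import random

def nearest_grid_point(x_o, x_go, d_g):
    """Grid point obtained by rounding each offset (x - origin) / d_g
    to the nearest integer, as in the proof of Proposition A2."""
    return [g + round((x - g) / d_g) * d_g for x, g in zip(x_o, x_go)]

def max_distance_to_grid(n_dim, d_g, trials=10000, seed=0):
    """Largest observed distance between a random point and its nearest grid point."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x_o = [rng.uniform(-100.0, 100.0) for _ in range(n_dim)]   # ball center
        x_go = [rng.uniform(-d_g, d_g) for _ in range(n_dim)]      # grid origin
        worst = max(worst, math.dist(x_o, nearest_grid_point(x_o, x_go, d_g)))
    return worst

for n_dim in (2, 3):
    bound = math.sqrt(n_dim) * 10.0 / 2.0   # sqrt(N) * d_g / 2 with d_g = 10 m
    print(n_dim, max_distance_to_grid(n_dim, d_g=10.0), bound)
```

In every trial the observed distance should stay at or below the bound, so a ball of radius at least that bound always contains a grid point.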

Appendix A.3

Proposition A3.

If the waveform function $R_0(\tau)$ satisfies $T_R(\alpha)\ge 2r/v_s+\Delta\tau_{\max}$, then $B_N(\mathbf{x}_s,r)\subseteq\mathcal{M}(\alpha,\mathbf{x}_s)$.

Proof of Proposition A3.

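Based on Equation (4), it follows that

$$\begin{aligned}|\tau_p(\mathbf{x})-\tau_p(\mathbf{x}_s)| &= |\eta_m(\mathbf{x})-\eta_l(\mathbf{x})-\eta_m(\mathbf{x}_s)+\eta_l(\mathbf{x}_s)|\\ &\le |\eta_m(\mathbf{x})-\eta_m(\mathbf{x}_s)|+|\eta_l(\mathbf{x}_s)-\eta_l(\mathbf{x})|\\ &= \frac{\big|\|\mathbf{x}-\mathbf{z}_m\|-\|\mathbf{x}_s-\mathbf{z}_m\|\big|+\big|\|\mathbf{x}-\mathbf{z}_l\|-\|\mathbf{x}_s-\mathbf{z}_l\|\big|}{v_s}\\ &\le 2\|\mathbf{x}-\mathbf{x}_s\|/v_s.\end{aligned}$$

Given the steering TDOA uncertainty level $\Delta\tau_{\max}$, for each $\mathbf{x}\in B_N(\mathbf{x}_s,r)$, the steering TDOA function $\tau_p(\mathbf{x})$ satisfies

$$\begin{aligned}|\tau_p(\mathbf{x})-\tau_p^0(\mathbf{x}_s)| &= |\tau_p(\mathbf{x})-\tau_p(\mathbf{x}_s)+\Delta\tau_p(\mathbf{x}_s)|\\ &\le |\tau_p(\mathbf{x})-\tau_p(\mathbf{x}_s)|+|\Delta\tau_p(\mathbf{x}_s)|\\ &\le 2\|\mathbf{x}-\mathbf{x}_s\|/v_s+\Delta\tau_{\max}\\ &\le 2r/v_s+\Delta\tau_{\max}.\end{aligned}$$

Since $T_R(\alpha)\ge 2r/v_s+\Delta\tau_{\max}$, according to Equation (22),

$$R_p(\tau_p(\mathbf{x}))=R_0\big(\tau_p(\mathbf{x})-\tau_p^0(\mathbf{x}_s)\big)\ge\alpha$$

holds for all $c_p$. According to Equation (15), for every $\mathbf{x}\in B_N(\mathbf{x}_s,r)$, the inequality

$$F_E(\mathbf{x},\mathbf{x}_s)\ge\alpha$$

holds. According to Equation (19), $B_N(\mathbf{x}_s,r)\subseteq\mathcal{M}(\alpha,\mathbf{x}_s)$ holds. □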
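The first bound in the proof of Proposition A3 states that the TDOA of a sensor pair changes by at most 2‖x − xs‖/vs between two points. The sketch below checks this on random two-dimensional configurations and then evaluates 2r/vs + Δτmax for one setting. It is an illustration under stated assumptions: the speed of sound (343 m/s), the 200 m square, the function names, and the choice of r as the radius √N·dg/2 from Proposition A2 are assumptions made here, not values prescribed by the paper.

```python
import math
import random

V_S = 343.0  # assumed speed of sound in m/s (not a value taken from the paper)

def tdoa(x, z_m, z_l, v_s=V_S):
    """TDOA of the pair (z_m, z_l) for a source at x: (||x - z_m|| - ||x - z_l||) / v_s."""
    return (math.dist(x, z_m) - math.dist(x, z_l)) / v_s

def worst_ratio(trials=5000, side=200.0, seed=1):
    """Largest observed |tdoa(x) - tdoa(x_s)| / (2 ||x - x_s|| / v_s); should not exceed 1."""
    rng = random.Random(seed)

    def point():
        return (rng.uniform(0.0, side), rng.uniform(0.0, side))

    worst = 0.0
    for _ in range(trials):
        z_m, z_l, x, x_s = point(), point(), point(), point()
        bound = 2.0 * math.dist(x, x_s) / V_S
        if bound > 0.0:
            diff = abs(tdoa(x, z_m, z_l) - tdoa(x_s, z_m, z_l))
            worst = max(worst, diff / bound)
    return worst

def required_half_width(d_g, n_dim, dtau_max, v_s=V_S):
    """2 r / v_s + dtau_max, with r = sqrt(N) d_g / 2 (the radius from Proposition A2)."""
    r = math.sqrt(n_dim) * d_g / 2.0
    return 2.0 * r / v_s + dtau_max

print(worst_ratio())
print(required_half_width(d_g=10.0, n_dim=2, dtau_max=0.010))  # in seconds
```

The second call combines a 10 m grid in two dimensions with a 10 ms uncertainty level, which gives a feel for how wide the GCC waveform must be for the condition of Proposition A3 to hold at that radius.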

Appendix A.4

Proposition A4.

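If all pairs of different sensor pairs $c_i=\{i_l,i_m\}$, $c_j=\{j_l,j_m\}$ in the WASN satisfy, for $\tau_i^c\in[-\|\mathbf{z}_{i_l}-\mathbf{z}_{i_m}\|,\|\mathbf{z}_{i_l}-\mathbf{z}_{i_m}\|]/v_s$ and $\tau_j^c\in[-\|\mathbf{z}_{j_l}-\mathbf{z}_{j_m}\|,\|\mathbf{z}_{j_l}-\mathbf{z}_{j_m}\|]/v_s$, $\Lambda_i(\tau_i^c,0)$ […] $\Lambda_j(\tau_j^c,0)$ and $\Lambda_i(\tau_i^c,0)$ […] $\Lambda_j(\tau_j^c,0)$, then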

$$\max_{\{\|\mathbf{x}\|=+\infty,\ \|\mathbf{x}_s\|<+\infty\}}F_E(\mathbf{x},\mathbf{x}_s)\le\frac{C_N^2 a_m+(C_M^2-C_N^2)a_s}{C_M^2}$$

holds.

Proof of Proposition A4.

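For a spatial point $\mathbf{x}$ such that $\|\mathbf{x}\|=\infty$, let $K\in\mathbb{N}$ be the total number of sensor pairs $c_p$ such that $\mathbf{x}\in\Lambda_p(\tau_p^0(\mathbf{x}_s),T_R(a_s))$. According to Equation (15) and Inequality (22), it follows that

$$F_E(\mathbf{x},\mathbf{x}_s)\le\frac{K a_m+(C_M^2-K)a_s}{C_M^2}. \quad \text{(A1)}$$

If $K\ge C_N^2+1$, there exists a collection of $N$ linearly independent sensor pairs among those $C_N^2+1$ sensor pairs. Without loss of generality, denote this collection as $\{c_1,\ldots,c_N\}$. Then for each $\mathbf{x}_d\in\bigcap_{p=1}^{N}\Lambda_p(\tau_p^0(\mathbf{x}_s),T_R(a_s))$, there exists a set of equations such that:

$$\tau_1(\mathbf{x}_d)=\tau_1^c,\quad \tau_2(\mathbf{x}_d)=\tau_2^c,\quad \ldots,\quad \tau_N(\mathbf{x}_d)=\tau_N^c,$$

where $\tau_p^c\in[\tau_p^0(\mathbf{x}_s)-T_R(a_s),\ \tau_p^0(\mathbf{x}_s)+T_R(a_s)]$ for $p=1,\ldots,N$. According to the condition of Proposition A4, and since the sensor pairs are all linearly independent, these $N$ equations are linearly independent. It then holds that $\|\mathbf{x}_d\|<\infty$, which contradicts $\|\mathbf{x}\|=\infty$. Thus $K\le C_N^2$. According to Inequality (A1), it follows that $F_E(\mathbf{x},\mathbf{x}_s)\le\big(C_N^2 a_m+(C_M^2-C_N^2)a_s\big)/C_M^2$. □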
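For a sense of scale, the bound of Proposition A4 can be evaluated directly. The sketch below assumes that C_M^2 and C_N^2 are binomial coefficients ("M choose 2" and "N choose 2"), that M is the number of sensors and N the spatial dimension, and that a_m and a_s are the GCC waveform levels from Inequality (22), which is not reproduced in this section. The numeric values of a_m and a_s are placeholders, and the function name is hypothetical.

```python
from math import comb

def srp_bound_at_infinity(num_sensors, n_dim, a_m, a_s):
    """Evaluate (C_N^2 * a_m + (C_M^2 - C_N^2) * a_s) / C_M^2."""
    total_pairs = comb(num_sensors, 2)   # C_M^2: number of sensor pairs
    main_pairs = comb(n_dim, 2)          # C_N^2: pairs that may contribute a_m
    return (main_pairs * a_m + (total_pairs - main_pairs) * a_s) / total_pairs

# Seven sensors in two dimensions, with placeholder levels a_m = 1.0 and a_s = 0.2.
print(srp_bound_at_infinity(num_sensors=7, n_dim=2, a_m=1.0, a_s=0.2))
```

With seven sensors there are 21 pairs, of which at most one can contribute the higher level in two dimensions, so under these placeholder levels the bound is dominated by a_s.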

Author Contributions

Conceptualization, methodology, programming, writing—original draft preparation, Y.H.; conceptualization, writing—review and editing, data curation, J.T.; writing—review and editing, X.H.; supervision, M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 11774379 and 61501448, and by the Youth Innovation Promotion Association.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: https://1drv.ms/u/s!AskSoQGpB3VUgfIqsxtYhosVrGyzOg?e=pnfutC.

Conflicts of Interest

The authors declare no conflict of interest.

Footnotes

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Ajdler T., Kozintsev I., Lienhart R., Vetterli M. Acoustic Source Localization in Distributed Sensor Networks; Proceedings of the Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004; Pacific Grove, CA, USA. 7–10 November 2004; pp. 1328–1332.
  2. Liu Y., Hu Y.H., Pan Q. Robust Maximum Likelihood Acoustic Source Localization in Wireless Sensor Networks; Proceedings of the GLOBECOM 2009-2009 IEEE Global Telecommunications Conference; Honolulu, HI, USA. 30 November–4 December 2009; pp. 1–6.
  3. Saric Z., Kukolj D., Teslic N. Acoustic Source Localization in Wireless Sensor Network. Circuits Syst. Signal Process. 2010;29:837–856. doi: 10.1007/s00034-010-9187-3.
  4. Kim Y., Ahn J., Cha H. Locating acoustic events based on large-scale sensor networks. Sensors. 2009;9:9925–9944. doi: 10.3390/s91209925.
  5. Cobos M., Antonacci F., Alexandridis A., Mouchtaris A., Lee B. A survey of sound source localization methods in wireless acoustic sensor networks. Wirel. Commun. Mob. Comput. 2017;2017. doi: 10.1155/2017/3956282.
  6. Sheng X., Hu Y.H. Sequential acoustic energy based source localization using particle filter in a distributed sensor network; Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing; Montreal, QC, Canada. 17–21 May 2004; pp. 972–975.
  7. Sheng X., Hu Y.H. Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks. Signal Process. IEEE Trans. 2005;53:44–53. doi: 10.1109/TSP.2004.838930.
  8. Meng W., Xiao W. Energy-based acoustic source localization methods: A survey. Sensors. 2017;17:376. doi: 10.3390/s17020376.
  9. Chang S., Li Y., He Y., Wu Y. RSS-based target localization in underwater acoustic sensor networks via convex relaxation. Sensors. 2019;19:2323. doi: 10.3390/s19102323.
  10. Berman Z. A reliable maximum likelihood algorithm for bearing-only target motion analysis; Proceedings of the 36th IEEE Conference on Decision and Control; San Diego, CA, USA. 12 December 1997; pp. 5012–5017.
  11. Doğançay K. Bearings-only target localization using total least squares. Signal Process. 2005;85:1695–1710. doi: 10.1016/j.sigpro.2005.03.007.
  12. Navidi W., Murphy W., Hereman W. Statistical Methods in Surveying by Trilateration. Comput. Stat. Data Anal. 1998;27:209–227. doi: 10.1016/S0167-9473(97)00053-4.
  13. Chan Y., Ho K. A Simple and Efficient Estimator for Hyperbolic Location. Signal Process. IEEE Trans. 1994;42:1905–1915. doi: 10.1109/78.301830.
  14. Gillette M., Silverman H. A Linear Closed-Form Algorithm for Source Localization From Time-Differences of Arrival. Signal Process. Lett. IEEE. 2008;15:1–4. doi: 10.1109/LSP.2007.910324.
  15. Bordoy J., Schott D.J., Xie J., Bannoura A., Klein P., Striet L., Hoeflinger F., Haering I., Reindl L., Schindelhauer C. Acoustic Indoor Localization Augmentation by Self-Calibration and Machine Learning. Sensors. 2020;20:1177. doi: 10.3390/s20041177.
  16. DiBiase J.H., Silverman H.F., Brandstein M.S. Microphone Arrays. Springer; Berlin/Heidelberg, Germany: 2001. Robust localization in reverberant rooms; pp. 157–180.
  17. Do H., Silverman H., Yu Y. A Real-Time SRP-PHAT Source Location Implementation using Stochastic Region Contraction (SRC) on a Large-Aperture Microphone Array; Proceedings of the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing—ICASSP ’07; Honolulu, HI, USA. 15–20 April 2007; pp. 121–124.
  18. Cobos M., Marti A., Lopez J.J. A Modified SRP-PHAT Functional for Robust Real-Time Sound Source Localization With Scalable Spatial Sampling. IEEE Signal Process. Lett. 2011;18:71–74. doi: 10.1109/LSP.2010.2091502.
  19. Marti A., Cobos M., Lopez J., Escolano J. A steered response power iterative method for high-accuracy acoustic source localization. J. Acoust. Soc. Am. 2013;134:2627–2630. doi: 10.1121/1.4820885.
  20. Traa J., Wingate D., Stein N., Smaragdis P. Robust Source Localization and Enhancement With a Probabilistic Steered Response Power Model. IEEE/ACM Trans. Audio Speech Lang. Process. 2015;24:1. doi: 10.1109/TASLP.2015.2512499.
  21. Cobos M., Garcia-Pineda M., Arevalillo-Herráez M. Steered Response Power Localization of Acoustic Pass-Band Signals. IEEE Signal Process. Lett. 2017;24:717–721. doi: 10.1109/LSP.2017.2690306.
  22. Ritu, Dhull S. Iterative Volumetric Reduction (IVR) Steered Response Power Method for Acoustic Source Localization. Int. J. Sens. Wirel. Commun. Control. 2020;10. doi: 10.2174/2210327910999200614001049.
  23. Knapp C., Carter G. The Generalized Correlation Method for Estimation of Time Delay. Acoust. Speech Signal Process. IEEE Trans. 1976;24:320–327. doi: 10.1109/TASSP.1976.1162830.
  24. Brutti A., Omologo M., Svaizer P. Speaker localization based on oriented global coherence field; Proceedings of the Ninth International Conference on Spoken Language Processing; Pittsburgh, PA, USA. 17–21 September 2006.
  25. Brutti A., Omologo M., Svaizer P. Localization of multiple speakers based on a two step acoustic map analysis; Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing; Las Vegas, NV, USA. 30 March–4 April 2008; pp. 4349–4352.
  26. Salvati D., Drioli C., Foresti G. Exploiting a geometrically sampled grid in the steered response power algorithm for localization improvement. J. Acoust. Soc. Am. 2017;141:586–601. doi: 10.1121/1.4974289.
  27. Zotkin D.N., Duraiswami R. Accelerated speech source localization via a hierarchical search of steered response power. IEEE Trans. Speech Audio Process. 2004;12:499–508. doi: 10.1109/TSA.2004.832990.
  28. Khanal S., Silverman H.F. Multi-stage rejection sampling (MSRS): A robust SRP-PHAT peak detection algorithm for localization of cocktail-party talkers; Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA); New Paltz, NY, USA. 18–21 October 2015; pp. 1–5.
  29. Nunes L.O., Martins W.A., Lima M.V., Biscainho L.W., Gonçalves F.M., Said A., Lee B. A steered-response power algorithm employing hierarchical search for acoustic source localization using microphone arrays. IEEE Trans. Signal Process. 2014;62:5171–5183. doi: 10.1109/TSP.2014.2336636.
  30. Cobos M., Antonacci F., Comanducci L., Sarti A. Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach. IEEE/ACM Trans. Audio Speech Lang. Process. 2020;28:1270–1281. doi: 10.1109/TASLP.2020.2983589.
  31. Tian Z., Liu W., Ru X. Multi-Target Localization and Tracking Using TDOA and AOA Measurements Based on Gibbs-GLMB Filtering. Sensors. 2019;19:5437. doi: 10.3390/s19245437.
  32. Kaplan L., Le Q., Molnár N. Maximum likelihood methods for bearings-only target localization; Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing; Salt Lake City, UT, USA. 7–11 May 2001; pp. 3001–3004.
  33. Griffin A., Alexandridis A., Pavlidi D., Mastorakis Y., Mouchtaris A. Localizing multiple audio sources in a wireless acoustic sensor network. Signal Process. 2015;107:54–67. doi: 10.1016/j.sigpro.2014.08.013.



Articles from Sensors (Basel, Switzerland) are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)
