Computational Intelligence and Neuroscience. 2016 Aug 24;2016:6172453. doi: 10.1155/2016/6172453

An Efficient Adaptive Window Size Selection Method for Improving Spectrogram Visualization

Shibli Nisar 1,*, Omar Usman Khan 1, Muhammad Tariq 1,2
PMCID: PMC5013242  PMID: 27642291

Abstract

The Short Time Fourier Transform (STFT) is an important technique for the time-frequency analysis of a time-varying signal. The basic approach behind it involves the application of a Fast Fourier Transform (FFT) to a signal multiplied with an appropriate window function with fixed resolution. The selection of an appropriate window size is difficult when no background information about the input signal is known. In this paper, a novel empirical model is proposed that adaptively adjusts the window size for a narrow-band signal using a spectrum sensing technique. For wide-band signals, where a fixed time-frequency resolution is undesirable, the approach adopts the constant Q transform (CQT). Unlike the STFT, the CQT provides a varying time-frequency resolution. This results in a high spectral resolution at low frequencies and a high temporal resolution at high frequencies. A simple but effective switching framework is provided between the STFT and the CQT. The proposed method also allows for the dynamic construction of a filter bank according to user-defined parameters. This helps in reducing redundant entries in the filter bank. Results obtained from the proposed method not only improve the spectrogram visualization but also reduce the computation cost and achieve 87.71% of the appropriate window length selection.

1. Introduction

Time-frequency analysis is typically required to characterize nonstationary phenomena such as speech [1, 2], biomedical [3, 4], vibration [5], and music [6] signals. The frequency contents for the analysis can be revealed if a Fourier transform is applied to these signals [7]. However, in doing so, all time-related information will be lost [8]. The deficiency was first addressed in [9], where the Fourier transform was applied to analyze small sections of a signal at a time. Over time, this technique became popularly known as the Short Time Fourier Transform (STFT) [10, 11]. A significant shortcoming of the STFT is that it uses a fixed time-frequency resolution for all types of signals [12, 13]. This approach is not desirable for wide-band or ultrawide-band signals, where low spectrogram resolutions can be observed. Moreover, the selection of an appropriate window size is vital for the STFT [14]. The window size should ideally ensure that the input signal falling within it remains stationary [15]. However, if the window is too small, then the frequency domain cannot be localized [16].

The low resolution can be improved by using the constant Q transform (CQT) which is frequently used in auditory applications [17]. Unlike the STFT, the CQT provides a frequency resolution that depends on the geometrically spaced center frequencies of an analysis window [18]. In this paper, an adaptive method is proposed that provides an effective framework of switching between STFT for narrow band and CQT for wide-band signals, after analyzing the input signal. No prior information about the input signal is required in the proposed method. The proposed method is also capable of constructing a nonuniform filter bank according to user-defined parameters. This helps in the removal of filter bank redundancies. The results obtained from the proposed approach not only show an improved spectrogram visualization but also reduce the computation cost and show 87.71% of the appropriate window length selection.

2. Short Time Fourier Transform and Constant Q Transform

The STFT is achieved by introducing a sliding window to the nonstationary signal. This window adds a new dimension of time to the frequency response. In the discrete time-case, this is represented as

$$X(n, \omega) = \sum_{m=-\infty}^{\infty} s(m)\, w(n - m)\, e^{-j\omega m}, \tag{1}$$

where n is the time index, ω is the frequency variable (sampled at the frequency index k in the discrete case), s is the input signal, w is the window function, and m runs over the window interval, which is centered around zero. The STFT can also be interpreted as a uniform filter bank [19]. The output signal X(n, k) is essentially the STFT (index n) obtained at the kth channel of the filter bank (Figure 1). The window function is assumed to be nonzero only in the window interval. As an example, (1) is applied to two signals. The first is a composite signal bearing frequencies of 40 Hz and 100 Hz. The second contains both components in isolation, each occupying one-half of the time window. As can be seen from the equivalent Fourier transform (Figure 2), the Fourier space cannot distinguish between the two types of signals. On the other hand, the distinction is clearly visible in the spectrogram of the STFT (Figure 3).

Figure 1. Uniform filter bank (STFT) with fixed time-frequency resolution.

Figure 2. (a) Time domain representation of 40 Hz and 100 Hz combined signal for 2 seconds; (b) Fourier transform of part (a); (c) time domain representation of 40 Hz signal for first two seconds and 100 Hz signal for next two seconds; (d) Fourier transform of part (c).

Figure 3. (a) Time domain representation of 40 Hz and 100 Hz combined signal for 2 seconds; (b) magnitude STFT representation of part (a); (c) time domain representation of 40 Hz signal for first two seconds and 100 Hz signal for next two seconds; (d) magnitude STFT representation of part (c).
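The behaviour illustrated in Figures 2 and 3 can be sketched in a few lines of Python. The block below is only an illustration under stated assumptions, not the authors' implementation: it uses a hypothetical sampling rate of 1000 Hz, a Hamming window of 200 samples with no overlap, a direct (slow) DFT in place of the FFT, and a signal in which the 40 Hz and 100 Hz components each occupy one second. The helper names `hamming`, `stft_magnitude`, and `dominant_frequencies` are introduced here for illustration only.

```python
import cmath
import math

def hamming(N):
    # Symmetric Hamming window of length N.
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def stft_magnitude(signal, win_len, hop):
    """Magnitude STFT: one row per frame, one column per bin 0..win_len//2.

    Each frame is the windowed segment of the signal, as in (1); the phase
    reference is the start of the frame, which does not affect magnitudes.
    """
    w = hamming(win_len)
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        seg = [signal[start + m] * w[m] for m in range(win_len)]
        row = []
        for k in range(win_len // 2 + 1):
            acc = sum(seg[m] * cmath.exp(-2j * math.pi * k * m / win_len)
                      for m in range(win_len))
            row.append(abs(acc))
        frames.append(row)
    return frames

def dominant_frequencies(signal, fs, win_len, hop):
    # Frequency (Hz) of the strongest bin in every frame.
    mags = stft_magnitude(signal, win_len, hop)
    return [row.index(max(row)) * fs / win_len for row in mags]

fs = 1000  # assumed sampling rate in Hz (not taken from the paper)
t = [n / fs for n in range(2 * fs)]
# 40 Hz during the first second, 100 Hz during the second one.
sequential = [math.sin(2 * math.pi * 40 * x) if x < 1.0
              else math.sin(2 * math.pi * 100 * x) for x in t]

peaks = dominant_frequencies(sequential, fs, win_len=200, hop=200)
print(peaks)
```

The per-frame peak moves from 40 Hz to 100 Hz halfway through the signal, which is the time localization that a single Fourier transform of the whole signal cannot provide.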

The time-frequency resolution of the spectrogram depends on the chosen window size. A larger window results in a higher spectral but lower temporal resolution, whereas a smaller window results in a lower spectral but higher temporal resolution. This relationship is described by the Uncertainty Principle [20]. In this case, a variable window size would be ideal, as it provides high spectral resolution at low frequencies and high temporal resolution at high frequencies. A good candidate for achieving this is the constant Q transform (CQT) [21], where Q is the quality factor, described shortly. Like the STFT, the CQT can also be interpreted as a filter bank. The only difference is that, in the case of the CQT, the filters have geometrically spaced center frequencies such that the bandwidth $Bw_k$ of the kth filter is a constant multiple of that of the (k − 1)th filter:

$$Bw_k = 2^{1/n}\, Bw_{k-1}, \tag{2}$$

where n is the number of filters per octave. As such, the bandwidth of the kth filter is given in terms of the bandwidth $Bw_{\min}$ of the lowest filter as

$$Bw_k = \left(2^{1/n}\right)^{k} Bw_{\min}. \tag{3}$$

The quality factor Q is represented as the ratio of the center frequency $f_k$ to the bandwidth:

$$Q = \frac{f_k}{Bw_k}. \tag{4}$$

Because the bandwidth varies from filter to filter, the window length for the kth filter is given as

$$N[k] = \frac{f_s}{Bw_k}. \tag{5}$$

Finally, the CQT is given as

$$X^{CQ}[k] = \frac{1}{N[k]} \sum_{n=0}^{N[k]-1} w[n, k]\, x[n]\, e^{-j 2\pi Q n / N[k]}, \tag{6}$$

where $X^{CQ}[k]$ is the kth component of the constant Q transform, x[n] is the input signal, and w[n, k] is the window function of length N[k]. The filter bank bearing geometrically spaced center frequencies of the CQT is shown in Figure 4.

Figure 4. CQT filter bank with geometrically spaced window bins.
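Equations (3) to (6) can be traced with a small, direct (unoptimized) Python sketch. Everything numeric here is an assumption made for illustration: a sampling rate of 8000 Hz, a lowest filter at 110 Hz, 12 filters per octave, 36 filters, a Hamming window, filter indices starting at k = 0, and window lengths rounded to whole samples. The function names `cqt_bank` and `cqt` are hypothetical and do not come from the paper.

```python
import cmath
import math

def hamming(N):
    # Symmetric Hamming window of length N.
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def cqt_bank(f_min, bw_min, n_per_octave, num_filters, fs):
    """Return (Q, [(f_k, Bw_k, N_k), ...]) following (3)-(5), with k starting at 0."""
    q = f_min / bw_min                   # (4), evaluated at the lowest filter
    step = 2.0 ** (1.0 / n_per_octave)   # bandwidth ratio between neighbours, (2)
    bank = []
    for k in range(num_filters):
        bw_k = (step ** k) * bw_min      # (3)
        f_k = q * bw_k                   # (4) rearranged for the center frequency
        n_k = int(round(fs / bw_k))      # (5), rounded to whole samples (assumption)
        bank.append((f_k, bw_k, n_k))
    return q, bank

def cqt(x, q, bank):
    """Magnitude of each CQT component per (6), using the first N_k samples of x."""
    out = []
    for _, _, n_k in bank:
        w = hamming(n_k)
        acc = sum(w[n] * x[n] * cmath.exp(-2j * math.pi * q * n / n_k)
                  for n in range(n_k))
        out.append(abs(acc) / n_k)
    return out

fs = 8000                                # assumed sampling rate in Hz
f_min = 110.0                            # assumed lowest center frequency in Hz
bw_min = f_min * (2 ** (1 / 12) - 1)     # assumed lowest bandwidth, gives Q of about 16.8
q, bank = cqt_bank(f_min, bw_min, 12, 36, fs)

# A 440 Hz test tone, two octaves above the lowest filter.
tone = [math.sin(2 * math.pi * 440.0 * n / fs) for n in range(2000)]
mags = cqt(tone, q, bank)
strongest = mags.index(max(mags))
print(q, strongest, bank[strongest])
```

Low filters use long windows and high filters use short ones, which is the varying time-frequency resolution described above. The strongest component is the filter centered at 440 Hz.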

3. Related Work

Time-frequency analysis methods are widely used in acoustics [22, 23], mechanics [5], electronics [24, 25], telecommunications [26, 27], biomedicine [28], and other fields involving the processing of nonstationary information. Time-frequency representation techniques are broadly categorized into parametric and nonparametric methods. Different parametric and nonparametric approaches have been studied in the literature [29–35]. This paper deals with the nonparametric approach. One of the most prevalent nonparametric tools is the STFT [1, 36], which has been discussed earlier in the introduction. The STFT is not desirable when dealing with wide-band and ultrawide-band signals, where the fixed window size causes spectrogram resolution issues [37, 38]. A number of techniques have addressed this issue. Spectrum analysis/synthesis can be added to the STFT as a feature [39]. Window size decisions can then be made manually on the basis of sinusoidal features of the signal such as peak amplitude, frequency, and phase trajectories. As such, two consecutive sinusoids with frequency difference Δf can be separated by setting the window size as

$$W = \frac{B_s F_s}{\Delta f}, \tag{7}$$

where W is the window size (number of samples), $B_s$ is the main-lobe size of the window used, and $F_s$ is the sampling frequency. If no prior information is available regarding an input signal, then most of the existing methods follow the adaptive STFT that selects a window length from a pool of window sets [40–43]. This approach involves a high computation cost, and the limited pool of window sets also reduces the chances of getting an accurate window length.
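As a small worked instance of (7), the sketch below computes the window length for the cases the paper discusses later: a Hamming window (main-lobe size 4), a sampling frequency of 44100 Hz (the value used in the arithmetic of Section 5), and frequency separations of 100 Hz, 200 Hz, and 30 Hz. The function name `window_size` and the dictionary `MAIN_LOBE` are hypothetical helpers introduced for illustration.

```python
def window_size(main_lobe_bins, fs, delta_f):
    """Window length in samples from (7): W = B_s * F_s / delta_f."""
    return main_lobe_bins * fs / delta_f

# Main-lobe sizes quoted in Section 4 for common windows.
MAIN_LOBE = {"rectangular": 2, "hamming": 4, "blackman": 6}

fs = 44100  # sampling frequency used in the paper's worked example
w_100 = window_size(MAIN_LOBE["hamming"], fs, 100.0)  # composite sinusoids
w_200 = window_size(MAIN_LOBE["hamming"], fs, 200.0)  # mridangam partials
w_30 = window_size(MAIN_LOBE["hamming"], fs, 30.0)    # heart sounds
print(w_100, w_200, w_30)
```

Halving the required frequency separation doubles the window, which is why closely spaced sinusoids push the STFT toward very long windows.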

Various adaptively varying STFT approaches are proposed in [44] that reduce filter bank artefacts without compromising on time-frequency resolution. One of the approaches accounts for the time in which signal properties such as power and spectral shape remain preserved over the period, that is, a stationary region. Likewise, the opposite would be the time in which signal properties change over a period, that is, a transient region. Identifying a region involves integration of signal energy inside a given bank. The window size is then selected on the basis of variation of energies across critical banks. The general principle is increasing the time and frequency resolution for transient and nontransient regions, respectively. Similarly, a variable window length is determined by estimating the local instantaneous frequencies in every window slice over time in [45, 46].

Non-STFT based tools for time-frequency analysis also appear in the literature. Amongst these, the CQT [17, 47, 48] and the wavelet transform (WT) [49–52] are the most common. At first glance, both methods seem to be the same. However, the difference lies in the usage of the basis function. If the basis function can be interpreted as a windowed sinusoid, then both methods are essentially the same [53]. The wavelet transform can be categorized as discrete wavelet transform (DWT), continuous wavelet transform (CWT), and wavelet packet transform (WPT) [54]. The usefulness of the wavelet transform depends on the selection of an appropriate wavelet basis, because an inappropriate basis directly degrades the results of the WT. Many publications describe different wavelet bases and advancements in the WT [55–60].

4. Proposed Method

Computationally, the CQT is expensive compared to the STFT. The asymptotic complexity of the STFT is O(n log n), following the pattern of the FFT, where n is the number of samples in the input signal. On the other hand, the asymptotic complexity of the CQT following (6) is O(n log n + nk + k), where k is the number of components. For performance reasons, therefore, it would be better to select the STFT over the CQT for visualization of the spectrum. However, the STFT is feasible only for narrow-band signals, where a filter bank with fixed window size is used. A simple but effective switching framework is proposed that can alternate between both tools after analyzing the input signal using spectrum sensing techniques. A block diagram of the proposed framework is shown in Figure 5.

Figure 5. Block diagram of the proposed method.

The first step involves spectrum sensing that determines the orientation of the signal on the spectrum using the normalized power spectral density (PSD) $\hat{f}$. The expectation μ and standard deviation σ are extracted from $\hat{f}$ as

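$$\mu = \sum_{i=1}^{N} \hat{f}_i \cdot A_i, \tag{8}$$

$$\sigma = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left(\hat{f}_i - \mu\right)^2}, \tag{9}$$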

where $A_i$ is the amplitude of the normalized PSD $\hat{f}_i$. The expectation μ returns the frequency where the PSD is concentrated. Together, μ and σ give information about the distribution of the PSD. A signal is considered narrow band when σ is smaller than a user-defined threshold β. An optimum threshold can be selected empirically such that the smearing effect is minimized. After the analysis of known narrow-band and wide-band signals, the value of β is set to 1500. Signals having σ less than 1500 are considered narrow-band signals, and the appropriate tool, that is, the STFT, is selected. As mentioned earlier, the STFT is computationally less expensive, and the smearing effect is not prominent in the case of narrow-band signals. Signals having σ greater than 1500 are considered wide-band signals. In such a scenario, the proposed method adopts the CQT. Unlike the STFT, the CQT minimizes the smearing effect for wide-band signals and improves the visualization of the spectrogram. The check results in the selection of either the STFT or the CQT as

$$\text{Tool} = \begin{cases} \text{STFT}, & \sigma \le \beta, \\ \text{CQT}, & \text{otherwise}. \end{cases} \tag{10}$$
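The spectrum sensing step and the switching rule can be sketched as follows. This is a hedged illustration rather than the authors' code: it reads (8) as a PSD-weighted mean frequency and, as an assumption, uses the matching PSD-weighted spread instead of the literal form of (9). The inputs are a precomputed list of bin frequencies and PSD values, and the names `spectral_moments` and `select_tool` are hypothetical. The treatment of the boundary case σ = β is also an assumption.

```python
import math

def spectral_moments(freqs, psd):
    """PSD-weighted mean frequency and spread (an assumed reading of (8) and (9))."""
    total = sum(psd)
    weights = [p / total for p in psd]   # normalized PSD amplitudes
    mu = sum(f * a for f, a in zip(freqs, weights))
    var = sum(a * (f - mu) ** 2 for f, a in zip(freqs, weights))
    return mu, math.sqrt(var)

def select_tool(sigma, beta=1500.0):
    # Switching rule in the spirit of (10): narrow band -> STFT, otherwise CQT.
    return "STFT" if sigma <= beta else "CQT"

freqs = [1000.0 * i for i in range(11)]   # 0 Hz .. 10 kHz, hypothetical bin grid

# Narrow-band case: all power in the 1 kHz bin.
narrow_psd = [0.0, 1.0] + [0.0] * 9
mu_n, sigma_n = spectral_moments(freqs, narrow_psd)

# Wide-band case: power spread evenly over the whole grid.
wide_psd = [1.0] * 11
mu_w, sigma_w = spectral_moments(freqs, wide_psd)

print(mu_n, sigma_n, select_tool(sigma_n))
print(mu_w, sigma_w, select_tool(sigma_w))
```

With the threshold of 1500 quoted above, the concentrated spectrum is routed to the STFT and the flat spectrum to the CQT.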

Upon selection of the STFT, the next step is to select an appropriate window size as in [39], where the two closest sinusoids can be distinguished using (7). However, nonstationary signals may involve a large number of sinusoids in close proximity. This results in a very small Δf and consequently a large window, which makes the STFT very similar to the Fourier transform and hampers temporal resolution. In order to select an appropriate window size, a novel empirical model is proposed that adaptively selects a window size by modifying (7) to

$$W = \frac{3 B_s F_s}{\mu}. \tag{11}$$

Equation (11) selects a window size that is intended to preserve temporal information after the transform. The main-lobe size $B_s$ of the window can be set to 2 for a rectangular, 4 for a Hamming/Hanning, and 6 for a Blackman window. In this work, a Hamming window is used and $B_s$ is set to 4.
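A minimal sketch of (11) is given below. It assumes a sampling frequency of 44100 Hz (inferred from the arithmetic in Section 5) and rounding to the nearest whole sample, neither of which is stated alongside (11). Under those assumptions, the μ values reported in Table 1 reproduce the window sizes listed there. The function name `proposed_window` is hypothetical.

```python
def proposed_window(mu, fs=44100, main_lobe_bins=4):
    """Window length in samples from (11): W = 3 * B_s * F_s / mu.

    fs = 44100 and rounding to the nearest integer are assumptions.
    """
    return round(3 * main_lobe_bins * fs / mu)

# Expectation values (mu) for the narrow-band signals in Table 1.
table1_mu = {
    "Heartbeat": 90.99,
    "Mridangam": 527.61,
    "Carriers": 386.13,
    "Radio": 2632.8,
    "High carrier": 10425,
}

windows = {name: proposed_window(mu) for name, mu in table1_mu.items()}
print(windows)
```

Because μ sits in the denominator, signals whose power is concentrated at low frequencies (the heartbeat) receive long windows, while high-frequency carriers receive short ones.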

The proposed method is tested over different inputs such as a heartbeat (Figure 6), mridangam (Figure 7), multiple sinusoids (Figure 8), radio (Figure 9), high-carrier (Figure 10), music (Figure 11), and a speech signal (Figure 12). According to the proposed method, five out of these seven signals are labeled as narrow band while the remaining two, music and speech, are labeled as wide-band signals. The proposed model adopts an appropriate window size for STFT using (11). All the figures show how the adaptive window selection improved the spectrogram visualization. The results from each signal type are given in Table 1.

Figure 6. (a) PSD of heart signal; (b) STFT with default window; (c) STFT with proposed method window selection.

Figure 7. (a) PSD of mridangam signal; (b) STFT with default window; (c) STFT with proposed method window selection.

Figure 8. (a) PSD of multiple sinusoidals; (b) STFT with default window; (c) STFT with proposed method window.

Figure 9. (a) PSD of radio signal; (b) STFT with default window; (c) STFT with proposed method window selection.

Figure 10. (a) PSD of high-carrier signals; (b) STFT with default window; (c) STFT with proposed method window selection.

Figure 11. (a) PSD of music signal (wide band); (b) STFT with default window; (c) STFT with proposed method window selection; (d) magnitude of CQT (better time-frequency resolution achieved with CQT).

Figure 12. (a) PSD of speech signal (wide band); (b) STFT with default window; (c) STFT with proposed method window selection; (d) magnitude of CQT (better time-frequency resolution achieved with CQT).

Table 1. Adaptive window selection from the proposed method, where μ is the expectation, σ is the standard deviation, β is the optimal threshold (1500), and W is the window size in samples.

| Signal | Type | μ | σ | Decision | W |
|---|---|---|---|---|---|
| Heartbeat (Figure 6) | Low | 90.99 | 135.49 | STFT | 5816 |
| Mridangam (Figure 7) | Intermediate | 527.61 | 706.89 | STFT | 1003 |
| Carriers (Figure 8) | Intermediate | 386.13 | 722.57 | STFT | 1371 |
| Radio (Figure 9) | Intermediate | 2632.8 | 542.37 | STFT | 201 |
| High carrier (Figure 10) | High | 10425 | 1117 | STFT | 51 |
| Music (Figure 11) | Mixed | 2170 | 2160 | CQT | Variable |
| Speech (Figure 12) | Mixed | 810.15 | 1302 | CQT | Variable |

A user-defined filter bank can be constructed using an approximation of the signal bandwidth (0.4–10 kHz) and its orientation, following [61], as

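$$Bw_k = \begin{cases} C, & k = 1, \\ \alpha\, Bw_{k-1}, & 2 \le k \le Q, \end{cases} \qquad f_k = f_1 + \sum_{j=1}^{k-1} Bw_j + \frac{Bw_k - Bw_1}{2}, \tag{12}$$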

where C is the arbitrary bandwidth of the first filter, $f_1$ is the center frequency of the first filter, α is the logarithmic growth factor, and Q is the total number of filters in the bank (not to be confused with the quality factor in (4)). This not only reduces the number of filters but also covers the band where a signal may lie. An example of a filter bank is shown in Figure 13 for a signal bandwidth of 7.2 kHz ([0.2, 7.4] kHz), C = 0.2 kHz, $f_1$ = 0.3 kHz, α = 1.4142, and Q = 8. The entire process of the proposed method is listed in Algorithm 1.
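The construction in (12) can be sketched directly in Python. The block below is an illustration under the parameter values quoted for Figure 13 (C = 0.2 kHz, first center frequency 0.3 kHz, α = 1.4142, 8 filters); the function name `build_filter_bank` and the tuple layout are hypothetical choices, not part of the paper.

```python
def build_filter_bank(c, f1, alpha, q):
    """Return [(center, bandwidth, lower_edge, upper_edge), ...] following (12).

    c: bandwidth of the first filter, f1: its center frequency,
    alpha: growth factor, q: number of filters. Units follow the inputs (kHz here).
    """
    # Bandwidths: Bw_1 = C, Bw_k = alpha * Bw_(k-1) for k = 2..Q.
    bandwidths = [c]
    for _ in range(2, q + 1):
        bandwidths.append(alpha * bandwidths[-1])

    # Centers: f_k = f_1 + sum_{j<k} Bw_j + (Bw_k - Bw_1) / 2.
    centers = []
    for k in range(1, q + 1):
        f_k = f1 + sum(bandwidths[:k - 1]) + (bandwidths[k - 1] - bandwidths[0]) / 2
        centers.append(f_k)

    return [(f, bw, f - bw / 2, f + bw / 2) for f, bw in zip(centers, bandwidths)]

bank = build_filter_bank(c=0.2, f1=0.3, alpha=1.4142, q=8)
for center, bw, lo, hi in bank:
    print(f"center={center:.4f} kHz  bandwidth={bw:.4f} kHz  band=[{lo:.4f}, {hi:.4f}] kHz")
```

With these values the filters tile the band contiguously from 0.2 kHz to roughly 7.44 kHz, consistent with the [0.2, 7.4] kHz range quoted above, and each filter is about 1.4142 times wider than its predecessor.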

Figure 13. User-defined filter bank. Parameters provided by a user.

Algorithm 1. Complete algorithm.

5. Results and Discussion

A quantitative analysis of the proposed method is discussed in this section. The method selects an appropriate window length W without prior information about the input signal. Consider a composite signal bearing frequencies of 100, 200, 400, and 500 Hz. The Hamming window length required to provide a frequency resolution of 100 Hz (Δf = 200 Hz − 100 Hz) would be $B_s F_s / \Delta f$ = 4 × 44100/100 = 1764 samples.

This shows that the minimum window size required to get 100 Hz frequency resolution is 1764 samples [39]. By increasing the window size the frequency resolution increases but this will hamper the temporal resolution. The window length is set manually to 1764 samples in order to achieve the frequency resolution of 100 Hz. Background knowledge about the input signal is required to set the appropriate window length. The proposed method automatically calculates an appropriate window length using (11) as:

$$\Delta f = \frac{\mu}{3} = \frac{386.13}{3}, \qquad W = \frac{B_s F_s}{\Delta f} = 1371. \tag{13}$$

Figure 8 shows how the proposed method adaptively selects the window size and improves the spectrogram. Signals that are almost invisible with the default window size are revealed by the proposed method. The percentage of appropriate window length selection is 1371/1764 × 100 = 77.72%. In nature, most signals are nonstationary, and it is not possible to have information about all types of signal. Hence, it is very difficult to set an appropriate window length manually. The proposed method is evaluated on a number of nonstationary signals. The mridangam is an instrument that produces a complex sound. It has some stable harmonics, and the minimum distance between two harmonics must be known in order to select an appropriate window length. Analysis of the mridangam signal shows that the first harmonic is around 200 Hz and the second harmonic is around 400 Hz, so the minimum distance between two consecutive partials is around 200 Hz. The appropriate window length is therefore 882 samples. The adaptive window selected by the proposed method is 1003 samples. Hence, the percentage of appropriate window selection is 87.93%. Figure 7 shows that the proposed method improves the spectrogram by prominently displaying the harmonics, which are not visible with the default window selection. The proposed method is fully automatic and requires no prior information about the input signal. After the statistical analysis of the input signal, the proposed method selects an appropriate window size using the empirical model proposed in this paper.

The heartbeat of a normal human heart consists of the S1 and S2 sounds. S1 results from mitral and tricuspid valve closure. It is a duller, lower-frequency sound than S2 and occurs at the beginning of ventricular systole. The approximate frequencies reported in the literature for S1 and S2 are 20–120 Hz and 60–250 Hz, respectively. Hence, the appropriate window length to provide a 30 Hz frequency resolution is 5880 samples. The window selected by the proposed method is 5816 samples. The percentage of appropriate window length is 98.91%. The adaptive window clearly shows the S1 and S2 signals, which are completely missed with the default window, as shown in Figure 6. The nonstationary signals evaluated with the proposed method are summarized in Table 2.

Table 2. Adaptive window selection from the proposed method, where $l_A$ is the appropriate length and $l_P$ is the proposed length (in samples).

| Signal | Type | $l_A$ | $l_P$ | % achieved |
|---|---|---|---|---|
| Heartbeat (Figure 6) | Low | 5880 | 5816 | 98.91 |
| Mridangam (Figure 7) | Intermediate | 882 | 1003 | 87.93 |
| Carriers (Figure 8) | Intermediate | 1764 | 1371 | 77.72 |
| Radio (Figure 9) | Intermediate | 176 | 201 | 87.56 |
| Carrier (Figure 10) | High | 44.1 | 51 | 86.47 |

The appropriate window length can be determined only when complete information about the input signal is known, which is usually not possible for all types of input signal. The proposed method, in contrast, selects a window size without any prior information about the input signal and achieves an overall 87.71% of the appropriate window length selection.
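The percentages in Table 2 can be re-derived from the two window lengths. The sketch below assumes that the score is the ratio of the shorter to the longer of the two lengths, which is inferred from the worked values (1371/1764 for the carriers, 882/1003 for the mridangam) rather than stated as a formula in the text. The function name `percent_achieved` is hypothetical.

```python
def percent_achieved(appropriate, proposed):
    """Ratio of the shorter to the longer window length, in percent.

    The min/max form is an assumption inferred from the worked values.
    """
    return 100.0 * min(appropriate, proposed) / max(appropriate, proposed)

# (appropriate length l_A, proposed length l_P) from Table 2.
table2 = {
    "Heartbeat": (5880, 5816),
    "Mridangam": (882, 1003),
    "Carriers": (1764, 1371),
    "Radio": (176, 201),
    "High carrier": (44.1, 51),
}

scores = {name: percent_achieved(a, p) for name, (a, p) in table2.items()}
overall = sum(scores.values()) / len(scores)

for name, value in scores.items():
    print(f"{name}: {value:.2f}%")
print(f"overall: {overall:.2f}%")
```

The individual scores agree with Table 2 to within rounding in the last digit, and their mean lands within about 0.01 of the reported 87.71%.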

Note that the appropriate fixed window length is selected for narrow band signal. For wide-band signal it is not possible to select an appropriate fixed window length because long window length improves the spectral resolution at the cost of temporal resolution and vice versa. The proposed method is able to detect the wide-band signal and automatically selects constant Q transform that provides high spectral resolution at low frequency and high temporal resolution at high frequency with geometrically spaced center frequencies.

The existing methods for wide-band signals select the window size for an adaptive STFT using two main approaches. (1) Select a window size from a pool of windows using different concentration measurements such as skewness, kurtosis, and integrated energies [40–44]. (2) Define a benchmark τ and adjust it according to local characteristics of the input signal using concentration measurements such as instantaneous frequency and integrated energies [45, 46]. The problem with the former approach is that (i) it cannot obtain the optimal window length quickly, or even fails to converge to the optimal window length, and (ii) it is computationally expensive.

In [44], the smearing of energy in the spectrogram is reduced by calculating the STFT with 4 different window sizes. This increases the computational time approximately 3 times as compared to the proposed method. For all types of input signals, whether narrow band or wide band, 4 different window sizes are used to reduce the smearing effect. The proposed method selects the STFT for narrow-band signals, because for such signals a fixed window length does not produce much smearing, and improves the efficiency 4 times. When the input signal is a wide-band signal, the smearing effect is prominent when using the STFT. In such a scenario, the proposed method selects the CQT, which is computationally expensive compared to the STFT but provides much better resolution and reduces the smearing effect. Figures 11(d) and 12(d) show the improved time-frequency resolution achieved by the CQT.

The problem with the latter approach, which decides the window length from local characteristics of the input signal, is that it is computationally expensive. In [46], a variable STFT is proposed that adapts the window length after analyzing the local characteristics of the input signal. The processing times for fixed STFTs of length 64 and 128 are 0.1716 s and 0.1560 s, respectively, whereas the processing time of the variable STFT is 0.5928 s for the same data. This demonstrates that the computing cost of the variable STFT, or of any adaptive STFT that decides the window length from local characteristics, is much greater than that of the STFT. The variable STFT and adaptive STFT provide better resolution than the STFT, but the proposed method solves the resolution problem by adopting the CQT for wide-band signals. Hence, the proposed method not only improves the time-frequency resolution but also reduces the computational cost. The computing costs are compared in Table 3.

Table 3. Adaptive short time Fourier transform.

| Scheme | CPU time (seconds) |
|---|---|
| STFT, fixed window = 128 | 0.1560 |
| STFT, fixed window = 64 | 0.1716 |
| CQT | 0.413 |
| VSTFT/ASTFT | 0.5928 |
| Proposed method | 0.2845 |

STFT: Short Time Fourier Transform; CQT: constant Q transform; VSTFT: Variable Short Time Fourier Transform; ASTFT: Adaptive Short Time Fourier Transform.

6. Conclusion

In this paper, a general framework for effective multiresolution signal analysis has been demonstrated. The framework avoids an undesirable side effect of the STFT, namely, a fixed time-frequency resolution for all types of input signals. After analyzing the input signal, the method adopts an appropriate tool, that is, the STFT for narrow-band signals and the CQT for wide-band signals. The proposed method is capable of selecting an appropriate window length for the STFT and achieved an overall 87.71% of appropriate window length selection. It also allows a user to dynamically construct the filter bank according to user-provided parameters, which helps reduce redundancy. The results obtained from the proposed method show improved spectrogram visualization and reduced computing cost. The proposed method is fully automatic and requires no prior information about the input signal. The results can contribute directly to feature extraction, for example, of harmonics, pitch, attack, delay, and energy. These features can be used in different applications such as speech and speaker recognition, biomedical signal analysis, and music instrument analysis. In future, the authors plan to automatically build a desirable nonuniform filter bank after analyzing the characteristics of the input signal. The filter bank will not be limited to linear or geometrical spacing only. The aim is to reduce the computing cost.

Supplementary Material

The proposed method is applied to various signals such as heartbeat, speech, music (mridangam), radio, and carrier signals. The improved spectrograms of these signals obtained from the proposed method are shown in Figures 6–12. The supplementary materials for these signals are provided in S1–S7.

6172453.f1.rar (6.3MB, rar)

Competing Interests

The authors declare that they have no competing interests.

References

  • 1. Portnoff M. R. Time-frequency representation of digital signals and systems based on short-time Fourier analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing. 1980;28(1):55–69. doi: 10.1109/tassp.1980.1163359.
  • 2. Ahmed M., Nisar S. Text-to-speech synthesis using phoneme concatenation. International Journal of Engineering, Science and Technology. 2014;3(2):193–197.
  • 3. Lee J. J., Lee S. M., Kim I. Y., Min H. K., Hong S. H. Comparison between short time Fourier and wavelet transform for feature extraction of heart sound. Proceedings of the IEEE Region 10 Conference (TENCON '99); December 1999; IEEE; pp. 1547–1550.
  • 4. Puthankattil Subha D., Joseph P. K., Acharya U R., Lim C. M. EEG signal analysis: a survey. Journal of Medical Systems. 2010;34(2):195–212. doi: 10.1007/s10916-008-9231-z.
  • 5. Al-Badour F., Sunar M., Cheded L. Vibration analysis of rotating machinery using time-frequency analysis and wavelet techniques. Mechanical Systems and Signal Processing. 2011;25(6):2083–2101. doi: 10.1016/j.ymssp.2011.01.017.
  • 6. Gold B., Morgan N., Ellis D. Speech and Audio Signal Processing: Processing and Perception of Speech and Music. New York, NY, USA: John Wiley & Sons; 2011.
  • 7. Bracewell R. The Fourier Transform & Its Applications. New York, NY, USA: McGraw-Hill; 1965.
  • 8. Adamczak K. S. S., Makiea W. Investigating advantages and disadvantages of the analysis of a geometrical surface structure with the use of Fourier and wavelet transform. Metrology and Measurement Systems. 2010;17(2):233–244. doi: 10.2478/v10178-010-0020-x.
  • 9. Gabor D. Theory of communication. Part 1: the analysis of information. Journal of the Institution of Electrical Engineers - Part III: Radio and Communication Engineering. 1946;93(26):429–441. doi: 10.1049/ji-3-2.1946.0074.
  • 10. Allen J. B. Short term spectral analysis, synthesis, and modification by discrete Fourier transform. IEEE Transactions on Acoustics, Speech, and Signal Processing. 1977;25(3):235–238. doi: 10.1109/tassp.1977.1162950.
  • 11. Chui C. K. Wavelets: A Tutorial in Theory and Applications. Vol. 2. Cambridge, Mass, USA: Academic Press; 2012.
  • 12. Lukin A., Todd J. Adaptive time-frequency resolution for analysis and processing of audio. Proceedings of the 120th Audio Engineering Society Convention; 2006; Paris, France.
  • 13. Rudoy D., Basu P., Quatieri T. F., Dunn B., Wolfe P. J. Adaptive short-time analysis-synthesis for speech enhancement. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '08); April 2008; Las Vegas, Nev, USA. IEEE; pp. 4905–4908.
  • 14. Müller M., Ellis D. P. W., Klapuri A., Richard G. Signal processing for music analysis. IEEE Journal on Selected Topics in Signal Processing. 2011;5(6):1088–1110. doi: 10.1109/jstsp.2011.2112333.
  • 15. Azami H., Sanei S., Mohammadi K. A novel signal segmentation method based on standard deviation and variable threshold. Journal of Computer Applications. 2011;34(2):27–34.
  • 16. Ozog D. Signal Analysis. Whitman College; 2007.
  • 17. Brown J. C., Puckette M. S. An efficient algorithm for the calculation of a constant Q transform. Journal of the Acoustical Society of America. 1992;92(5):2698–2701. doi: 10.1121/1.404385.
  • 18. Holighaus N., Dörfler M., Velasco G. A., Grill T. A framework for invertible, real-time constant-Q transforms. IEEE Transactions on Audio, Speech and Language Processing. 2013;21(4):775–785. doi: 10.1109/TASL.2012.2234114.
  • 19. Rabiner L. R., Juang B.-H. Fundamentals of Speech Recognition. Vol. 14. Englewood Cliffs, NJ, USA: PTR Prentice Hall; 1993.
  • 20. Busch P., Heinonen T., Lahti P. Heisenberg's uncertainty principle. Physics Reports. 2007;452(6):155–176. doi: 10.1016/j.physrep.2007.05.006.
  • 21. Brown J. C. Calculation of a constant Q spectral transform. The Journal of the Acoustical Society of America. 1991;89(1):425–434. doi: 10.1121/1.400476.
  • 22. Chen D., Durand L.-G., Bellemare F. Time and frequency domain analysis of acoustic signals from a human muscle. Muscle & Nerve. 1997;20(8):991–1001. doi: 10.1002/(sici)1097-4598(199708)20:8<991::aid-mus9>3.0.co;2-1.
  • 23. Lu C., Ding P., Chen Z. Time-frequency analysis of acoustic emission signals generated by tension damage in CFRP. Procedia Engineering. 2011;23:210–215.
  • 24. Sharma G. K., Kumar A., Babu Rao C., Jayakumar T., Raj B. Short time Fourier transform analysis for understanding frequency dependent attenuation in austenitic stainless steel. NDT and E International. 2013;53:1–7. doi: 10.1016/j.ndteint.2012.09.001.
  • 25. Yan R., Gao R. X., Chen X. Wavelets for fault diagnosis of rotary machines: a review with applications. Signal Processing. 2014;96:1–15. doi: 10.1016/j.sigpro.2013.04.015.
  • 26. Matz G., Bolcskei H., Hlawatsch F. Time-frequency foundations of communications: concepts and tools. IEEE Signal Processing Magazine. 2013;30(6):87–96. doi: 10.1109/msp.2013.2269702.
  • 27. Feichtinger H., Luef F. Encyclopedia of Applied and Computational Mathematics. Berlin, Germany: Springer; 2012. Gabor analysis and time-frequency methods.
  • 28. Bosschaart N., van Leeuwen T. G., Aalders M. C. G., Faber D. J. Quantitative comparison of analysis methods for spectroscopic optical coherence tomography. Biomedical Optics Express. 2013;4(11):2570–2584. doi: 10.1364/boe.4.002570.
  • 29. Poulimenos A. G., Fassois S. D. Parametric time-domain methods for non-stationary random vibration modelling and analysis: a critical survey and comparison. Mechanical Systems and Signal Processing. 2006;20(4):763–816. doi: 10.1016/j.ymssp.2005.10.003.
  • 30. Xu K., Minonzio J.-G., Ta D., Hu B., Wang W., Laugier P. Sparse inversion SVD method for dispersion extraction of ultrasonic guided waves in cortical bone. Proceedings of the IEEE 6th European Symposium on Ultrasonic Characterization of Bone (ESUCB '15); June 2015; Corfu Island, Greece. pp. 1–3.
  • 31. Hu M., Shao H. Autoregressive spectral analysis based on statistical autocorrelation. Physica A: Statistical Mechanics and Its Applications. 2007;376(1-2):139–146. doi: 10.1016/j.physa.2006.10.087.
  • 32. Jachan M., Matz G., Hlawatsch F. Time-frequency ARMA models and parameter estimators for underspread nonstationary random processes. IEEE Transactions on Signal Processing. 2007;55(9):4366–4381. doi: 10.1109/tsp.2007.896265.
  • 33. Avendaño-Valencia L. D., Godino-Llorente J. I., Blanco-Velasco M., Castellanos-Dominguez G. Feature extraction from parametric time-frequency representations for heart murmur detection. Annals of Biomedical Engineering. 2010;38(8):2716–2732. doi: 10.1007/s10439-010-0077-4.
  • 34. Elouaham S., Latif R., Dliou A., Laaboubi M., Maoulainie F. M. R. Parametric and non parametric time-frequency analysis of biomedical signals. International Journal of Advanced Computer Science and Applications. 2013;4(1):74–79. doi: 10.14569/IJACSA.2013.04011035;.dpuf.
  • 35. Wacker M., Witte H. Time-frequency techniques in biomedical signal analysis. Methods of Information in Medicine. 2013;52(4):279–296. doi: 10.3414/me12-01-0083.
  • 36. Robel A. Analysis/Resynthesis with the Short Time Fourier Transform. Institute of Communication Science; 2006.
  • 37.Simpson A. J. R. Time-frequency trade-offs for audio source separation with binary masks. http://arxiv.org/abs/1504.07372.
  • 38.Kraszewski M., Trojanowski M., Strąkowski M. R. Comment on quantitative comparison of analysis methods for spectroscopic optical coherence tomography. Biomedical Optics Express. 2014;5(9):3023–3033. doi: 10.1364/boe.5.003023. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 39.Smith J. O., Serra X. Parshl: An Analysis/Synthesis Program for Non-Harmonic Sounds Based on a Sinusoidal Representation. CCRMA, Department of Music, Stanford University; 1987. [Google Scholar]
  • 40.Yin Q., Shen L., Lu M., Wang X., Liu Z. Selection of optimal window length using STFT for quantitative SNR analysis of LFM signal. Journal of Systems Engineering and Electronics. 2013;24(1):26–35. doi: 10.1109/JSEE.2013.00004. [DOI] [Google Scholar]
  • 41.Zhong J., Huang Y. Time-frequency representation based on an adaptive short-time Fourier transform. IEEE Transactions on Signal Processing. 2010;58(10):5118–5128. doi: 10.1109/TSP.2010.2053028. [DOI] [Google Scholar]
  • 42.Kwok H. K., Jones D. L. Improved instantaneous frequency estimation using an adaptive short-time Fourier transform. IEEE Transactions on Signal Processing. 2000;48(10):2964–2972. doi: 10.1109/78.869059. [DOI] [Google Scholar]
  • 43.Pei S.-C., Huang S.-G. STFT with adaptive window width based on the chirp rate. IEEE Transactions on Signal Processing. 2012;60(8):4065–4080. doi: 10.1109/tsp.2012.2197204. [DOI] [Google Scholar]
  • 44.Lukin A., Todd J. Audio Engineering Society Convention 120. Audio Engineering Society; 2006. Adaptive time-frequency resolution for analysis and processing of audio. [Google Scholar]
  • 45.Craciun A., Spiertz M. Adaptive time frequency resolution for blind source separation. Proceedings of the International Student Conference on Electrical Engineering (POSTER '10); 2010. [Google Scholar]
  • 46.Lee J.-Y. Variable short-time Fourier transform for vibration signals with transients. Journal of Vibration and Control. 2015;21(7):1383–1397. doi: 10.1177/1077546313499389. [DOI] [Google Scholar]
  • 47.Selesnick I. W. Wavelet transform with tunable Q-factor. IEEE Transactions on Signal Processing. 2011;59(8):3560–3575. doi: 10.1109/TSP.2011.2143711. [DOI] [Google Scholar]
  • 48.Schörkhuber C., Klapuri A., Sontacchi A. Audio pitch shifting using the constant-Q transform. Journal of the Audio Engineering Society. 2013;61(7-8):562–572. [Google Scholar]
  • 49.Mallat S. G. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence. 1989;11(7):674–693. doi: 10.1109/34.192463. [DOI] [Google Scholar]
  • 50.Cohen L., Loughlin P. Recent Developments in Time-Frequency Analysis. Springer; 1998. [DOI] [Google Scholar]
  • 51.Boashash B. Time-Frequency Signal Analysis and Processing: A Comprehensive Reference. Cambridge, Mass, USA: Academic Press; 2015. [Google Scholar]
  • 52.Graps A. An introduction to wavelets. IEEE Computational Science and Engineering. 1995;2(2):50–61. doi: 10.1109/99.388960. [DOI] [Google Scholar]
  • 53.Daubechies I. Wavelets. Berlin, Germany: Springer; 1989. Orthonormal bases of wavelets with finite support—connection with discrete filters; pp. 38–66. [Google Scholar]
  • 54.Li B., Chen X. Wavelet-based numerical analysis: a review and classification. Finite Elements in Analysis and Design. 2014;81:14–31. doi: 10.1016/j.finel.2013.11.001. [DOI] [Google Scholar]
  • 55.Chen J., Li Z., Pan J., et al. Wavelet transform based on inner product in fault diagnosis of rotating machinery: a review. Mechanical Systems and Signal Processing. 2016;70-71:1–35. doi: 10.1016/j.ymssp.2015.08.023. [DOI] [Google Scholar]
  • 56.Baccar D., Söffker D. Wear detection by means of wavelet-based acoustic emission analysis. Mechanical Systems and Signal Processing. 2015;60:198–207. doi: 10.1016/j.ymssp.2015.02.012. [DOI] [Google Scholar]
  • 57.Sweldens W. The lifting scheme: a custom-design construction of biorthogonal wavelets. Applied and Computational Harmonic Analysis. 1996;3(2):186–200. doi: 10.1006/acha.1996.0015. [DOI] [Google Scholar]
  • 58.Li Z., He Z., Zi Y., Jiang H. Rotating machinery fault diagnosis using signal-adapted lifting scheme. Mechanical Systems and Signal Processing. 2008;22(3):542–556. doi: 10.1016/j.ymssp.2007.09.008. [DOI] [Google Scholar]
  • 59.Xiao W., Zi Y., Chen B., Li B., He Z. A novel approach to machining condition monitoring of deep hole boring. International Journal of Machine Tools and Manufacture. 2014;77:27–33. doi: 10.1016/j.ijmachtools.2013.10.009. [DOI] [Google Scholar]
  • 60.Wang Z., Bian S., Lei M., Zhao C., Liu Y., Zhao Z. Feature extraction and classification of load dynamic characteristics based on lifting wavelet packet transform in power system load modeling. International Journal of Electrical Power & Energy Systems. 2014;62:353–363. doi: 10.1016/j.ijepes.2014.04.051. [DOI] [Google Scholar]
  • 61.Lawrence R. Fundamentals of Speech Recognition. New Delhi, India: Pearson Education; 2008. [Google Scholar]

Associated Data


Supplementary Materials

The proposed method is applied to various signals, including heartbeat, speech, music (mridangam), radio, and carrier signals. The improved spectrograms of these signals obtained with the proposed method are shown in Figures 6–12. The supplementary materials for these signals are provided in S1–S7.

6172453.f1.rar (6.3MB, rar)

Articles from Computational Intelligence and Neuroscience are provided here courtesy of Wiley
