Author manuscript; available in PMC: 2025 Apr 22.
Published in final edited form as: Interspeech. 2024 Sep;2024:1455–1459. doi: 10.21437/interspeech.2024-1484

Comparing ambulatory voice measures during daily life with brief laboratory assessments in speakers with and without vocal hyperfunction

Daryush D Mehta 1, Jarrad H Van Stan 1, Hamzeh Ghasemzadeh 1, Robert E Hillman 1
PMCID: PMC12014202  NIHMSID: NIHMS2068883  PMID: 40264705

Abstract

The most common types of voice disorders are associated with hyperfunctional voice use in daily life. Although current clinical practice uses measures from brief laboratory recordings to assess vocal function, it is unclear how these relate to an individual’s habitual voice use. The purpose of this study was to quantify the correlation and offset between voice features computed from laboratory and ambulatory recordings in speakers with and without vocal hyperfunction. Features derived from a neck-surface accelerometer included estimates of sound pressure level, fundamental frequency, cepstral peak prominence, and spectral tilt. Whereas some measures from laboratory recordings correlated significantly with those captured during daily life, only approximately 6–52% of the actual variance was accounted for. Thus, brief voice assessments are quite limited in the extent to which they can accurately characterize the daily voice use of speakers with and without vocal hyperfunction.

Keywords: ambulatory voice monitoring, vocal hyperfunction, clinical voice assessment, real-world voice use

1. Introduction

Many voice disorders present as chronic conditions that often involve a clinical condition termed vocal hyperfunction, which is characterized by hyperactivity of the muscles in and around the larynx [1, 2]. In some cases, this hyperactivity leads to phonotraumatic vocal hyperfunction, which is associated with the formation of vocal fold lesions due to excessive tissue trauma during voicing [3]. In other cases, this hyperactivity leads to muscle tension dysphonia that does not cause phonotrauma, but still leads to symptoms such as muscle fatigue, discomfort, and changes in vocal quality [4, 5]. Vocal hyperfunction can be debilitating, is linked to elevated levels of vocal effort during speech production [6], and can have a large societal impact, especially for individuals whose voice use is critical to their occupation or livelihood [7, 8].

Although vocal hyperfunction is believed to be caused by or associated with voice use during daily life, standard clinical assessment continues to rely on short-term snapshots of voice and speech production. It is unclear how these snapshots relate to an individual’s voice use in daily life. Currently, clinicians depend on subjective patient self-reports and self-monitoring to evaluate the prevalence and persistence of hyperfunctional vocal behavior outside the clinic. Unfortunately, evidence indicates that speakers estimate their true voice use inaccurately and with large uncertainty [9]. The clinical assessment and treatment of hyperfunctional voice disorders could be significantly improved by observing and measuring harmful vocal habits during a person’s daily routines.

Wearable voice monitoring technologies can capture voice features over the course of an entire day during real-world activities [10–15]. These technologies are designed to quantify how daily vocal behaviors cause and/or exacerbate the development of voice disorders. The field continues to study and better understand voice features captured by ambulatory voice monitors that typically position a contact microphone or accelerometer sensor on the anterior neck surface below the larynx. Neck-surface vibration characteristics relate to the transmission of energy due to vocal fold tissue collision and aerodynamic energy radiating through the trachea [16]. Importantly, accelerometers are non-acoustic sensors that are relatively immune to environmental noise and to other speakers in the room, and they produce an unintelligible voicing signal. Analyzing the accelerometer signal during voiced speech has been shown to yield clinically interpretable measures, such as sound pressure level (SPL) [17], fundamental frequency (fo) [18], cepstral measures of periodicity [18, 19], and spectral tilt [20].

How much information is preserved in short-duration snapshots of vocal behavior relative to longer-term voice monitoring is clinically important. Thus, the purpose of this study was to quantify the correlation and offset between voice features computed from short (~30 s) laboratory recordings and the same voice features captured from longer-term (weeklong) ambulatory voice monitoring data. Voice features were obtained from individuals with diagnosed vocal hyperfunction and vocally healthy individuals.

2. Methods

2.1. Participant demographics

Data were collected from adult females diagnosed with voice disorders associated with vocal hyperfunction (N = 201), including vocal fold nodules and/or polyps (N = 149) and primary muscle tension dysphonia (N = 52). Voice recordings were also obtained from a vocally healthy control group of female speakers with no history of voice disorders (N = 178) who were clinically confirmed to have no signs of vocal pathology. Speakers in the control group were largely matched according to approximate age (±5 years) and similar occupation to the individuals with voice disorders. The mean ± standard deviation ages of the healthy and patient groups were 28.3 ± 11.7 years and 29.4 ± 13.0 years, respectively. Informed consent was obtained from each participant, and the study protocol was approved by the institutional review board at Mass General Brigham.

2.2. Data collection

Figure 1 illustrates the data collection setup for the laboratory and ambulatory monitoring environments. All study participants wore a smartphone-based ambulatory voice monitor that sensed voicing characteristics using a neck-surface accelerometer [11]. In the laboratory, study participants read the first paragraph of the Rainbow Passage, a standard English-language passage designed to be phonetically balanced [21]. For a subset of participants (84 healthy and 105 patients), an additional recording of spontaneous monologue speech was elicited for approximately 30 seconds in response to the prompt, “Tell me what you’re doing today after this appointment.” In the daily-life setting, all individuals took the voice monitor home to wear during waking hours for approximately seven days. The smartphone application recorded the accelerometer signal at a sampling rate of 11 025 Hz, 16-bit quantization, and 76 dB recording range [11].

Figure 1:

Voice features were measured from neck-surface accelerometer (ACC) signal in (a) laboratory conditions in an acoustically treated sound booth and (b) real-world, ambulatory conditions using a smartphone-based wearable voice monitor.

At the beginning of each day, the smartphone instructed participants to produce a sustained vowel while changing loudness levels. They did this with the accelerometer in place and a microphone with a 15 cm metal rod placed on the upper lip. This maneuver was performed to calibrate the accelerometer signal so that it could be interpreted in terms of acoustic SPL in dB SPL at 15 cm [17]. The microphone was not used for the rest of the day.

2.3. Data analysis

Data quality checks and pre-processing steps were applied to the ambulatory accelerometer signal. When a degradation in accelerometer signal quality was observed by trained study staff (e.g., due to sensor rubbing, cable breakages, or electrical malfunction), those segments of the day were flagged and not analyzed. Then, custom MATLAB code performed signal processing on the accelerometer signal in 50-ms, non-overlapping frames to yield five measures (code cannot be shared publicly due to intellectual property licensing):

  1. Estimates of SPL from the accelerometer signal used the beginning-of-day calibration gesture performed with the microphone. Per-day linear regression parameters (slope and intercept) were derived using frame-by-frame signal power of the time-aligned accelerometer and microphone signals in the dB-dB domain. The slope and intercept were then applied to the root-mean-square value of each accelerometer signal frame [11].

  2. Accelerometer signal magnitude was also saved in physical units of acceleration magnitude (cm/s²) as a general measure of laryngeal forces during voicing [22].

  3. fo was calculated from the autocorrelation function following center clipping of the frame (0.75 center clipping threshold). Two candidate locations were considered: the location of the peak of the normalized autocorrelation function and the location midway between that peak and zero (subharmonic). If the subharmonic peak was greater than 0.25 of the peak, fo was computed as the reciprocal of the subharmonic peak location; otherwise, fo was computed as the reciprocal of the overall peak location (in Hz) [23].

  4. The cepstral peak prominence (CPP) was the difference between the peak value and noise floor (in dB) of the power cepstrum for quefrencies 3.3–16.7 ms [24].

  5. A measure of spectral tilt was computed as the difference, in dB, between the magnitudes of the first and second spectral harmonics (H1−H2) [24].
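
Because the original MATLAB code is not available, the following Python sketch only illustrates how items 1 and 3 could be implemented. It is not the authors' code. The function names (`rms_db`, `fit_line`, `estimate_fo`), the reading of the 0.75 clipping threshold as a fraction of the frame's peak amplitude, and the treatment of the "subharmonic" candidate as half the lag of the overall peak are all assumptions made for illustration.

```python
import math

FS = 11025                    # sampling rate reported in the paper (Hz)
FRAME_LEN = int(0.050 * FS)   # one 50-ms frame (551 samples)

def rms_db(frame):
    """RMS level of one frame in dB (arbitrary reference)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(max(rms, 1e-12))

def fit_line(x, y):
    """Least-squares slope and intercept for y ~ slope * x + intercept.

    For the calibration gesture, x would hold accelerometer frame levels (dB)
    and y the time-aligned microphone levels (dB SPL at 15 cm).
    """
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def estimate_fo(frame, fs=FS, clip=0.75, sub_ratio=0.25, fmin=70.0, fmax=1000.0):
    """Center-clipped autocorrelation estimate of fo in Hz (None if silent)."""
    m = sum(frame) / len(frame)
    x = [s - m for s in frame]
    thr = clip * max(abs(s) for s in x)  # assumed: threshold relative to peak
    # Center clipping: zero everything inside +/- thr, shift the rest inward.
    y = [s - thr if s > thr else s + thr if s < -thr else 0.0 for s in x]

    def ac(lag):
        return sum(y[i] * y[i + lag] for i in range(len(y) - lag))

    r0 = ac(0)
    if r0 <= 0.0:
        return None
    lo, hi = int(fs / fmax), int(fs / fmin)
    r = {lag: ac(lag) / r0 for lag in range(lo, min(hi, len(y) - 1) + 1)}
    peak = max(r, key=r.get)
    half = peak // 2  # assumed: candidate midway between the peak and zero lag
    if half >= lo and r[half] > sub_ratio * r[peak]:
        peak = half
    return fs / peak
```

In use, the slope and intercept from `fit_line` would be fitted once per day on the calibration frames and then applied to `rms_db` of every later accelerometer frame to obtain an SPL estimate.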

Voice activity detection was applied to determine a voicing decision for each 50-ms, non-overlapping frame. A frame was considered voiced if the signal passed the following criteria: SPL of 45–130 dB SPL at 15 cm, fo of 70–1000 Hz, normalized autocorrelation peak of 0.6–1, normalized subharmonic peak amplitude of 0.25–1, and alpha ratio of 22–50 dB (ratio of spectral energy below and above 2000 Hz) [24]. The resulting frames were further processed using a singing detector that classified voicing segments as singing or non-singing (i.e., speech) segments [25]. For the purposes of the current work, only frames categorized as speech-related segments were included in the analysis. Figure 2 shows an illustration of scatterplots for four mean voice features in the ambulatory and laboratory conditions.
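
The voicing criteria above amount to a set of range checks on per-frame features. A minimal sketch follows, assuming inclusive bounds and a hypothetical `FrameFeatures` container; neither detail is specified in the paper.

```python
from dataclasses import dataclass

@dataclass
class FrameFeatures:
    """Per-frame features (hypothetical container; field names are illustrative)."""
    spl_db: float            # dB SPL at 15 cm
    fo_hz: float             # fundamental frequency in Hz
    autocorr_peak: float     # normalized autocorrelation peak
    subharmonic_peak: float  # normalized subharmonic peak amplitude
    alpha_ratio_db: float    # spectral energy below vs. above 2000 Hz, in dB

# Ranges quoted in the text; treating both ends as inclusive is an assumption.
VOICING_RANGES = {
    "spl_db": (45.0, 130.0),
    "fo_hz": (70.0, 1000.0),
    "autocorr_peak": (0.6, 1.0),
    "subharmonic_peak": (0.25, 1.0),
    "alpha_ratio_db": (22.0, 50.0),
}

def is_voiced(frame: FrameFeatures) -> bool:
    """A frame counts as voiced only if every feature lies inside its range."""
    return all(
        lo <= getattr(frame, name) <= hi
        for name, (lo, hi) in VOICING_RANGES.items()
    )
```

The singing detector of [25] would run after this step and is not sketched here.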

Figure 2:

Scatter plots correlating ambulatory and laboratory (spontaneous speech) features of mean (a) sound pressure level, (b) fundamental frequency, (c) cepstral peak prominence, and (d) H1−H2 in the vocally healthy group.

In addition to the frame-based voice features, two clinical voice features were computed that have been reported in the literature to characterize aspects of vocal hyperfunction. The first clinical feature was the daily phonotrauma index (DPI) whose model was trained on an ambulatory data set to optimally classify patients with phonotraumatic vocal hyperfunction and vocally healthy speakers [22, 26]. The DPI was computed for each laboratory passage (reading passage and spontaneous speech) and each day of ambulatory monitoring. Inputs to the DPI model were the standard deviation of H1−H2 and skewness of SPL across all voiced frames. The second clinical feature was the non-phonotraumatic vocal hyperfunction (NPVH) index whose model was trained on an ambulatory data set to optimally classify patients with primary muscle tension dysphonia from vocally healthy control speakers [27]. Inputs to the NPVH index were H1−H2 mode and CPP mean across voiced frames.

2.4. Statistical analysis

Two summary statistics (mean, standard deviation) were calculated for each frame-based voice feature across each laboratory passage and each day of ambulatory monitoring. The skewness statistic was also reported for SPL and accelerometer magnitude due to their use in models of phonotraumatic vocal hyperfunction [22, 28]. The DPI and NPVH index were single values computed per laboratory recording and per monitored day. Summary statistics and clinical indices were then averaged over the seven days monitored to yield a single set of features per speaker. Outlier removal was applied by removing features that were outside three standard deviations from the mean across speakers from either laboratory or ambulatory settings.
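
The aggregation just described can be sketched as follows. The helper names are hypothetical, and the skewness estimator (ratio of population moments) and the use of the sample standard deviation for the three-standard-deviation rule are assumptions, since the paper does not state which estimators were used.

```python
from statistics import mean, stdev

def skewness(values):
    """Moment-based skewness (third central moment over variance^1.5)."""
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / len(values)
    m3 = sum((v - m) ** 3 for v in values) / len(values)
    return m3 / m2 ** 1.5

def daily_summary(frame_values):
    """Summary statistics of one feature over the voiced frames of one day."""
    return {
        "mean": mean(frame_values),
        "sd": stdev(frame_values),
        "skewness": skewness(frame_values),
    }

def weekly_average(daily_summaries):
    """Average each daily statistic over the monitored days of one speaker."""
    keys = daily_summaries[0].keys()
    return {k: mean([d[k] for d in daily_summaries]) for k in keys}

def within_n_sd(values, n_sd=3.0):
    """True for speakers whose value lies within n_sd SDs of the group mean."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) <= n_sd * s for v in values]

def keep_mask(lab_values, amb_values, n_sd=3.0):
    """Keep a speaker only if neither the lab nor the ambulatory value is an outlier."""
    return [a and b for a, b in zip(within_n_sd(lab_values, n_sd),
                                    within_n_sd(amb_values, n_sd))]
```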

Pearson’s correlation coefficient (r, p < 0.001) quantified the relationship between voicing features in the laboratory and ambulatory environments. The statistical offset was computed as the mean difference for each voice feature (ambulatory minus laboratory feature). The offset was considered non-zero if a paired t-test was statistically significant (p < 0.001). To allow for a comparison of the size of the offset across features, the relative effect size was computed by dividing the absolute offset by the pooled standard deviation of the feature across laboratory and ambulatory settings.
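
A minimal sketch of these statistics, using only the Python standard library, is given below. The pooled standard deviation is assumed here to be the root mean square of the laboratory and ambulatory standard deviations, which the paper does not spell out, and the effect size keeps the sign of the offset, as Table 2 does. The sketch returns the paired t statistic only; obtaining a p-value needs a t-distribution, for example via `scipy.stats.ttest_rel`.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson's correlation coefficient between paired samples x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def offset_stats(lab, amb):
    """Offset (ambulatory minus laboratory), paired t statistic, effect size."""
    diffs = [a - l for l, a in zip(lab, amb)]
    offset = mean(diffs)
    t_stat = offset / (stdev(diffs) / math.sqrt(len(diffs)))
    # Assumed pooling: RMS of the two per-setting standard deviations.
    pooled_sd = math.sqrt((stdev(lab) ** 2 + stdev(amb) ** 2) / 2.0)
    return {"offset": offset, "t": t_stat, "effect_size": offset / pooled_sd}

# Toy per-speaker values (made up for illustration, not study data).
lab = [1.0, 2.0, 3.0, 4.0, 5.0]
amb = [3.0, 5.0, 4.0, 7.0, 6.0]
r = pearson_r(lab, amb)
stats = offset_stats(lab, amb)
```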

3. Results and Discussion

The durations of the laboratory and ambulatory monitoring data were computed to document sample duration information. The laboratory recording duration for the reading passage was similar for both the vocally healthy (32.0 ± 4.0 s) and patient (32.9 ± 4.0 s) groups. The laboratory recording duration for spontaneous speech was similar for both the vocally healthy (31.9 ± 4.8 s) and patient (31.2 ± 3.8 s) groups. The average daily monitoring durations for the vocally healthy and patient groups were similar at 11.5 ± 1.6 hours and 11.0 ± 1.6 hours, respectively. These results show a general uniformity in the temporal sampling of the data across speakers.

3.1. Correlation between laboratory and ambulatory features

Table 1 reports Pearson’s r values between voice features captured from the two laboratory recordings and the same features from the individual’s average day of monitoring. The strength of the correlation depended on the specific voice feature, with correlations in a similar range for the vocally healthy and patient groups.

Table 1:

Statistical correlation (Pearson’s r, p < 0.001, n.s. = not significant) of voice features between long-term, daily-life monitoring and short-term, laboratory recordings of neck-surface vibration. The highest correlation per feature is bolded.

Voice Feature | Reading Passage: Healthy | Reading Passage: Patients | Reading Passage: All | Spontaneous Speech: Healthy | Spontaneous Speech: Patients | Spontaneous Speech: All
Daily phonotrauma index | 0.36 | 0.38 | 0.35 | n.s. | 0.37 | 0.39
Nonphonotraumatic vocal hyperfunction index | 0.72 | 0.61 | 0.66 | 0.63 | 0.50 | 0.58
Mean:
 Sound pressure level (dB SPL at 15 cm) | n.s. | n.s. | n.s. | n.s. | n.s. | 0.28
 Accelerometer magnitude (dB re cm/s²) | 0.59 | 0.70 | 0.66 | 0.56 | 0.51 | 0.57
 Fundamental frequency (Hz) | 0.66 | 0.72 | 0.69 | 0.72 | 0.70 | 0.71
 Cepstral peak prominence (dB) | 0.69 | 0.58 | 0.64 | 0.53 | 0.58 | 0.60
 H1−H2 (dB) | 0.46 | 0.35 | 0.40 | 0.41 | 0.40 | 0.43
Standard deviation:
 Sound pressure level (dB) | 0.40 | n.s. | 0.29 | n.s. | n.s. | n.s.
 Accelerometer magnitude (dB) | 0.29 | n.s. | 0.31 | n.s. | 0.37 | 0.39
 Fundamental frequency (Hz) | n.s. | n.s. | n.s. | n.s. | n.s. | n.s.
 Cepstral peak prominence (dB) | 0.61 | 0.49 | 0.55 | 0.52 | 0.41 | 0.49
 H1−H2 (dB) | n.s. | 0.49 | 0.39 | n.s. | 0.34 | 0.31
Skewness:
 Sound pressure level | n.s. | 0.24 | 0.24 | n.s. | n.s. | 0.25
 Accelerometer magnitude | 0.29 | 0.26 | 0.26 | n.s. | n.s. | 0.31

The strongest correlations (r > 0.5) were obtained in the reading passage for the NPVH index, mean accelerometer magnitude, mean fo, mean CPP, and standard deviation of CPP. Although these correlations indicate that average voicing tendencies during daily life may be captured to a statistically significant degree from a 30 s speech snapshot, the explained variance (r²) was only approximately 6–52%. SPL mean did not exhibit a statistically significant correlation, which could be an indication of the known uncertainty in accelerometer-derived SPL estimation [17]; without this uncertainty, average accelerometer signal magnitude correlated to a larger degree (49% explained variance). Allowing speakers to speak freely in a monologue manner was hypothesized to obtain a more ecologically valid speech sample and, thus, higher correlations with ambulatory settings. However, voice features did not exhibit higher correlations when measured during spontaneous speech by the same speaker (if anything, correlations tended to be lower).
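
The explained-variance figures quoted above follow from squaring Pearson's r. As a small worked check using values from Table 1 (the helper name is illustrative):

```python
def explained_variance_percent(r):
    """Share of variance explained by a linear relationship with correlation r."""
    return 100.0 * r * r

# Correlation values taken from Table 1.
examples = {
    "smallest significant r (SPL skewness)": 0.24,
    "largest r (NPVH index, mean fo)": 0.72,
    "mean accelerometer magnitude, patients, reading": 0.70,
}
for label, r in examples.items():
    print(f"{label}: r = {r:.2f}, explained variance = "
          f"{explained_variance_percent(r):.0f}%")
```

Rounded to whole percentages, these give 6%, 52%, and 49%, matching the range and the accelerometer-magnitude figure cited in the text.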

As expected, the standard deviation of voice features exhibited much lower or non-significant laboratory-ambulatory correlations. The standard deviation of CPP exhibited statistically significant correlations in the 0.41–0.61 range. As CPP is viewed as a surrogate of overall vocal quality [29], results indicate that the temporal variability of voice quality during daily life can be captured to a certain extent (up to 36% explained variance) during a 30 s reading passage. However, longer-term monitoring is necessary to better characterize habitual variation in voice quality and to sample the “tail” of real-world feature distributions in noisy environments, in animated social settings, in lecture halls, etc.

3.2. Offset between laboratory and ambulatory features

Table 2 reports the offset (average difference) of each ambulatory voice feature with reference to the corresponding laboratory feature. Positive (negative) offset values indicated that ambulatory values were higher (lower), on average, than laboratory values. The NPVH index did not exhibit a significantly different offset. The DPI exhibited the highest offset values, which was expected given that its elements (SPL/accelerometer magnitude skewness and H1−H2 standard deviation) represent higher-order statistical moments capturing distributional variability that themselves exhibited significant offsets. Contrary to what was hypothesized, many of the largest offsets were observed to come from the spontaneous speech recordings and not the reading passage. Thus, results from the offset and correlational analyses indicate that voice features measured during reading can be as good as or better than those from spontaneous speech at representing daily vocal behavior.

Table 2:

Statistical offset of voice features from long-term, daily-life monitoring relative to short-term, laboratory recordings. Offset effect sizes are in parentheses (paired t-test, p < 0.001, n.s. = not significant). Offsets with highest effect sizes per feature are bolded.

Voice Feature | Reading Passage: Healthy | Reading Passage: Patients | Reading Passage: All | Spontaneous Speech: Healthy | Spontaneous Speech: Patients | Spontaneous Speech: All
Daily phonotrauma index | −4.04 (−3.28) | −3.60 (−3.04) | −3.81 (−3.11) | −3.82 (−2.83) | −3.59 (−2.61) | −3.69 (−2.73)
NPVH index | n.s. | n.s. | n.s. | n.s. | −0.24 (−0.42) | −0.17 (−0.32)
Mean:
 Sound pressure level (dB) | 5.06 (0.87) | 5.87 (0.95) | 5.52 (0.92) | 7.26 (1.37) | 9.53 (1.55) | 8.48 (1.46)
 Accelerometer magnitude (dB) | 4.56 (1.04) | 5.00 (1.13) | 4.79 (1.09) | 5.61 (1.10) | 6.64 (1.35) | 6.23 (1.23)
 Fundamental frequency (Hz) | 25.84 (1.43) | 20.03 (1.15) | 22.76 (1.27) | 30.57 (2.05) | 25.46 (1.28) | 28.20 (1.67)
 Cepstral peak prominence (dB) | 0.78 (0.52) | 1.10 (0.69) | 0.95 (0.61) | 1.46 (0.78) | 1.52 (0.99) | 1.49 (0.89)
 H1−H2 (dB) | −1.67 (−0.66) | −1.89 (−0.57) | −1.83 (−0.62) | −2.49 (−0.84) | −1.97 (−0.57) | −2.25 (−0.70)
Standard deviation:
 Sound pressure level (dB) | 3.75 (1.54) | 2.98 (1.35) | 3.34 (1.42) | 3.38 (1.45) | 3.08 (1.39) | 3.21 (1.42)
 Accelerometer magnitude (dB) | 0.41 (0.41) | n.s. | 0.27 (0.25) | 0.87 (0.82) | 0.43 (0.40) | 0.63 (0.57)
 Fundamental frequency (Hz) | 11.47 (0.45) | n.s. | 8.00 (0.31) | 17.49 (0.73) | n.s. | 7.45 (0.26)
 Cepstral peak prominence (dB) | n.s. | n.s. | n.s. | 0.41 (0.64) | n.s. | 0.26 (0.40)
 H1−H2 (dB) | 1.54 (1.88) | 1.20 (1.33) | 1.36 (1.55) | 1.44 (1.73) | 1.01 (0.93) | 1.20 (1.21)
Skewness:
 Sound pressure level | 0.95 (2.35) | 0.87 (2.33) | 0.91 (2.37) | 0.72 (1.65) | 0.79 (1.69) | 0.76 (1.67)
 Accelerometer magnitude | 1.77 (3.05) | 1.63 (3.17) | 1.70 (3.11) | 1.65 (2.53) | 1.68 (2.89) | 1.66 (2.72)

The absolute offset values of each voice feature allow clinicians to interpret how different speech characteristics can be in a controlled recording environment. For features with a high correlation, the offset could be treated as a correction factor. An fo offset of +30.6 Hz (2.5 semitones) indicates that healthy speakers elevated their pitch by a noticeable amount after leaving the laboratory. Statistically significant offsets for ambulatory SPL, accelerometer magnitude, CPP, and H1−H2 point to louder voicing being produced outside of the laboratory with higher vocal effort, more periodicity, and higher laryngeal forces. Even though standard deviation features exhibited low or non-significant correlations, positive offset values for these features provide a sense of how much more variability is exhibited in the ambulatory setting that is missed in the analysis of the laboratory recordings.
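
Converting an offset in Hz to semitones requires a reference frequency, which the text does not report. The sketch below shows the standard conversion and assumes a hypothetical laboratory mean fo of 197 Hz, chosen only because it is consistent with the stated 2.5 semitones; it is not a value from the study.

```python
import math

def semitone_shift(f_ref_hz, offset_hz):
    """Pitch change in semitones when fo moves from f_ref_hz to f_ref_hz + offset_hz."""
    return 12.0 * math.log2((f_ref_hz + offset_hz) / f_ref_hz)

ASSUMED_LAB_FO_HZ = 197.0   # hypothetical reference, not reported in the paper
FO_OFFSET_HZ = 30.57        # healthy group, spontaneous speech (Table 2)

shift = semitone_shift(ASSUMED_LAB_FO_HZ, FO_OFFSET_HZ)
print(f"{shift:.2f} semitones")
```

Because the conversion is logarithmic, the same offset in Hz corresponds to a larger semitone shift for a lower reference fo and a smaller one for a higher reference fo.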

3.3. Limitations

The DPI and NPVH index were trained on ambulatory data, and thus their model weights may not translate to the laboratory condition. This was evident in the mild correlation and large ambulatory offset for DPI. Retraining the DPI model on laboratory recordings (versus ambulatory data) may aid in distinguishing patients from vocally healthy speakers since the feature does not currently correlate highly across settings. The current analysis focused on voice features from female speakers due to the higher prevalence of phonotraumatic vocal hyperfunction in females. Future work is needed to determine whether the correlation and offset statistics observed are different for male speakers. In addition, future laboratory recordings could elicit speech by engaging speakers in active dialog with others in the room, as well as by modifying background noise, to simulate daily-life conditions.

4. Conclusion

Clinical voice assessment typically includes the acoustic analysis of short-duration sustained vowels, reading passages, and spontaneous speech. In this study, voice features from 30 s laboratory recordings were compared with features computed from weeklong ambulatory voice monitoring data in individuals with and without vocal hyperfunction. Results indicated that average voice features from the laboratory recordings explained, at best, only about half of the variance exhibited by ambulatory voice features during daily life. Real-world voice monitoring is needed to capture the vocal behavior required for the clinical assessment and treatment of hyperfunctional voice disorders. The standard deviation of voice features exhibited much lower laboratory-ambulatory correlations due to the undersampling of vocal variability during short-duration speech segments. Features related to periodicity (CPP) and pitch (fo) exhibited the highest laboratory-ambulatory correlations but accounted for only up to 52% of the speaker-to-speaker variance. Understanding the correlation and offset of voice features can provide valuable insights for applications involving biofeedback. This information could assist in setting appropriate thresholds within clinical and ambulatory contexts, thus aiding in the treatment or prevention of hyperfunctional voice disorders.

5. Acknowledgements

This research was funded by the National Institutes of Health - National Institute on Deafness and Other Communication Disorders (K99 DC021235, T32 DC013017, R33 DC011588, P50 DC015446, R01 DC019083). The article’s contents are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. Drs. Hillman and Mehta have a financial interest in InnoVoyce LLC, a company focused on developing and commercializing technologies for the prevention, diagnosis, and treatment of voice-related disorders. Their interests were reviewed and are managed by Massachusetts General Hospital and Mass General Brigham in accordance with their conflict-of-interest policies.

6. References

  • [1] Hillman RE, Stepp CE, Van Stan JH, Zañartu M, and Mehta DD, “An updated theoretical framework for vocal hyperfunction,” American Journal of Speech-Language Pathology, vol. 29, pp. 2254–2260, 2020.
  • [2] Oates J and Winkworth A, “Characterising hyperfunctional voice disorders: Etiology, assessment, treatment and prevention,” International Journal of Speech-Language Pathology, vol. 10, pp. 193–194, 2008.
  • [3] Karkos PD and McCormick M, “The etiology of vocal fold nodules in adults,” Current Opinion in Otolaryngology & Head & Neck Surgery, vol. 17, pp. 420–423, 2009.
  • [4] Van Houtte E, Van Lierde K, and Claeys S, “Pathophysiology and treatment of muscle tension dysphonia: A review of the current knowledge,” Journal of Voice, vol. 25, pp. 202–207, 2011.
  • [5] Desjardins M, Apfelbach C, Rubino M, and Abbott KV, “Integrative review and framework of suggested mechanisms in primary muscle tension dysphonia,” Journal of Speech, Language, and Hearing Research, in press, 2022.
  • [6] Hunter EJ, Cantor-Cutiva LC, van Leer E, van Mersbergen M, Nanjundeswaran CD, Bottalico P, et al., “Toward a consensus description of vocal effort, vocal load, vocal loading, and vocal fatigue,” Journal of Speech, Language, and Hearing Research, vol. 63, pp. 509–532, 2020.
  • [7] Roy N, Merrill RM, Gray SD, and Smith EM, “Voice disorders in the general population: Prevalence, risk factors, and occupational impact,” Laryngoscope, vol. 115, pp. 1988–1995, 2005.
  • [8] Bhattacharyya N, “The prevalence of voice problems among adults in the United States,” Laryngoscope, vol. 124, pp. 2359–2362, 2014.
  • [9] Mehta DD, Cheyne II HA, Wehner A, Heaton JT, and Hillman RE, “Accuracy of self-reported estimates of daily voice use in adults with normal and disordered voices,” American Journal of Speech-Language Pathology, vol. 25, pp. 634–641, 2016.
  • [10] Manfredi C and Dejonckere PH, “Voice dosimetry and monitoring, with emphasis on professional voice diseases: Critical review and framework for future research,” Logopedics Phoniatrics Vocology, pp. 1–17, 2014.
  • [11] Mehta DD, Zañartu M, Feng SW, Cheyne II HA, and Hillman RE, “Mobile voice health monitoring using a wearable accelerometer sensor and a smartphone platform,” IEEE Transactions on Biomedical Engineering, vol. 59, pp. 3090–3096, 2012.
  • [12] Cheyne HA, Hanson HM, Genereux RP, Stevens KN, and Hillman RE, “Development and testing of a portable vocal accumulator,” Journal of Speech, Language, and Hearing Research, vol. 46, pp. 1457–1467, 2003.
  • [13] Popolo PS, Švec JG, and Titze IR, “Adaptation of a Pocket PC for use as a wearable voice dosimeter,” Journal of Speech, Language, and Hearing Research, vol. 48, pp. 780–791, 2005.
  • [14] Searl J and Dietsch A, “Testing of the VocaLog vocal monitor,” Journal of Voice, vol. 28, pp. 523.e27–523.e37, 2014.
  • [15] Lindstrom F, Waye KP, Södersten M, McAllister A, and Ternström S, “Observations of the relationship between noise exposure and preschool teacher voice usage in day-care center environments,” Journal of Voice, vol. 25, pp. 166–172, 2011.
  • [16] Coleman RF, “Comparison of microphone and neck-mounted accelerometer monitoring of the performing voice,” Journal of Voice, vol. 2, pp. 200–205, 1988.
  • [17] Švec JG, Titze IR, and Popolo PS, “Estimation of sound pressure levels of voiced speech from skin vibration of the neck,” The Journal of the Acoustical Society of America, vol. 117, pp. 1386–1394, 2005.
  • [18] Mehta D, Van Stan J, and Hillman R, “Relationships between vocal function measures derived from an acoustic microphone and a subglottal neck-surface accelerometer,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 24, pp. 659–668, 2016.
  • [19] Castellana A, Carullo A, Corbellini S, and Astolfi A, “Discriminating pathological voice from healthy voice using cepstral peak prominence smoothed distribution in sustained vowel,” IEEE Transactions on Instrumentation and Measurement, vol. 67, pp. 646–654, 2018.
  • [20] Mehta DD, Espinoza VM, Van Stan JH, Zañartu M, and Hillman RE, “The difference between first and second harmonic amplitudes correlates between glottal airflow and neck-surface accelerometer signals during phonation,” The Journal of the Acoustical Society of America, vol. 145, pp. EL386–EL392, 2019.
  • [21] Fairbanks G, Voice and Articulation Drillbook, vol. 2. New York: Harper and Row, 1960.
  • [22] Van Stan JH, Mehta DD, Ortiz AJ, Burns JA, Marks KL, Toles LE, et al., “Changes in a Daily Phonotrauma Index after laryngeal surgery and voice therapy: Implications for the role of daily voice use in the etiology and pathophysiology of phonotraumatic vocal hyperfunction,” Journal of Speech, Language, and Hearing Research, vol. 63, pp. 3934–3944, 2020.
  • [23] Van Stan JH, Mehta DD, Petit RJ, Sternad D, Muise J, Burns JA, et al., “Integration of motor learning principles into real-time ambulatory voice biofeedback and example implementation via a clinical case study with vocal fold nodules,” American Journal of Speech-Language Pathology, vol. 26, pp. 1–10, 2017.
  • [24] Mehta DD, Van Stan JH, Zañartu M, Ghassemi M, Guttag JV, Espinoza VM, et al., “Using ambulatory voice monitoring to investigate common voice disorders: Research update,” Frontiers in Bioengineering and Biotechnology, vol. 3, pp. 1–14, 2015.
  • [25] Ortiz AJ, Toles LE, Marks KL, Capobianco S, Mehta DD, Hillman RE, et al., “Automatic speech and singing classification in ambulatory recordings for normal and disordered voices,” The Journal of the Acoustical Society of America, vol. 146, pp. EL22–EL27, 2019.
  • [26] Van Stan JH, Ortiz AJ, Marks KL, Toles LE, Mehta DD, Burns JA, et al., “Changes in the Daily Phonotrauma Index following the use of voice therapy as the sole treatment for phonotraumatic vocal hyperfunction in females,” Journal of Speech, Language, and Hearing Research, vol. 64, pp. 3446–3455, 2021.
  • [27] Van Stan JH, Ortiz AJ, Cortes JP, Marks KL, Toles LE, Mehta DD, et al., “Differences in daily voice use measures between female patients with nonphonotraumatic vocal hyperfunction and matched controls,” Journal of Speech, Language, and Hearing Research, vol. 64, pp. 1457–1470, 2021.
  • [28] Van Stan JH, Mehta DD, Ortiz AJ, Burns JA, Toles LE, Marks KL, et al., “Differences in weeklong ambulatory vocal behavior between female patients with phonotraumatic lesions and matched controls,” Journal of Speech, Language, and Hearing Research, vol. 63, pp. 372–384, 2020.
  • [29] Patel RR, Awan SN, Barkmeier-Kraemer J, Courey M, Deliyski D, Eadie T, et al., “Recommended protocols for instrumental assessment of voice: American Speech-Language-Hearing Association Expert Panel to develop a protocol for instrumental assessment of vocal function,” American Journal of Speech-Language Pathology, vol. 27, pp. 887–905, 2018.
