American Journal of Speech-Language Pathology. 2018 Nov 21;27(4):1539–1545. doi: 10.1044/2018_AJSLP-17-0212

Continuous Vocal Fry Simulated in Laboratory Subjects: A Preliminary Report on Voice Production and Listener Ratings

Anumitha Venkatraman, M Preeti Sivasankar
PMCID: PMC6436459  PMID: 30178028

Abstract

Purpose

Vocal fry is prevalent in everyday speech. However, whether the use of vocal fry is detrimental to voice production is unclear. This preliminary study assessed the effects of using continuous vocal fry on voice production measures and listener ratings.

Method

Ten healthy individuals (equal male and female, mean age = 22.4 years) completed 2 counterbalanced sessions. In each session, participants read in continuous vocal fry or habitual voice quality for 30 min at a comfortable intensity. Continuous vocal fry was simulated. Phonation threshold pressure (PTP10 and PTP20), cepstral peak prominence, and vocal effort ratings were obtained before and after the production of each voice quality. Next, 10 inexperienced listeners (equal male and female, mean age = 24.1 years) used visual analog scales to rate paired samples of continuous vocal fry and habitual voice quality for naturalness, employability, and amount of listener concentration.

Results

PTP10 and vocal effort ratings increased after 30 min of continuous vocal fry. Inexperienced listeners rated continuous vocal fry more negatively than the habitual voice quality.

Conclusions

Thirty minutes of simulated, continuous vocal fry worsened some voice measures when compared with a habitual voice quality. Samples of continuous vocal fry were rated as significantly less employable, less natural, and requiring greater listener concentration as compared with samples of habitual voice quality. Future studies should include habitual users of vocal fry to investigate speech stimulability and adaptation with cueing to further understand pathogenesis of vocal fry.


Vocal fry is a perceptually distinct vocal quality characterized by fundamental frequencies between 20 and 70 Hz (Blomgren, Chen, Ng, & Gilbert, 1998). This vocal quality is produced by speakers of both sexes and across age ranges (Abdelli-Beruh, Wolk, & Slavin, 2014; Oliveira, Davidson, Holczer, Kaplan, & Paretzky, 2015; Wolk, Abdelli-Beruh, & Slavin, 2012). Vocal fry is prevalent in daily conversation (Cantor-Cutiva, Bottalico, & Hunter, 2017; Parker & Borrie, 2017). Using nonhabitual vocal qualities, such as a continuous pressed voice, even for a short time, can worsen voice measures in healthy speakers (Fujiki, Chapleau, Sundarrajan, McKenna, & Sivasankar, 2017). To the best of our knowledge, no studies have investigated the effects of using short durations of continuous vocal fry on voice measures.

The physiology of vocal fry quality has been investigated in the literature. The vocal folds are thick and short during vocal fry production (Chen, Robb, & Gilbert, 2002; Childers & Lee, 1991). Additional vibratory characteristics include a longer closed phase of vibration (Chen et al., 2002). Vocal fry may involve increased thyroarytenoid muscle activity when compared with the modal register (McGlone & Shipp, 1971). In addition, vocal fry production may be associated with ventricular phonation (Laver, 1980). Whether these characteristics make speaking in a continuous vocal fry quality more detrimental to voice production than speaking in a habitual, nonvocal fry quality remains unknown. We addressed this question in Experiment 1. The aim of Experiment 1 was to compare the effects of using 30 min of continuous vocal fry and habitual voice quality on voice production. Continuous vocal fry was simulated in the lab as detailed in the Method section. Voice measures included phonation threshold pressure (PTP), cepstral peak prominence (CPP), and vocal effort ratings. It was hypothesized that 30 min of continuous vocal fry would increase PTP, increase vocal effort, and decrease CPP, when compared with the habitual voice quality.

Clinical recommendations are informed by auditory-perceptual judgments of vocal quality. Using an altered voice quality, such as vocal fry, may lead to negative judgments by listeners. In fact, listeners may rate vocal fry productions more poorly than nonvocal fry productions even from the same speaker (Anderson, Klofstad, Mayew, & Venkatachalam, 2014). However, the data in the literature are mixed and vary with biological sex and prosodic contexts (Anderson et al., 2014; Parker & Borrie, 2017; Yuasa, 2010). The aim of Experiment 2 was to determine the differences in auditory-perceptual ratings of naturalness, potential for employment (hereafter referred to as employability), and listener concentration for continuous vocal fry versus habitual voice quality. It was hypothesized that continuous vocal fry samples would be associated with ratings of lower employability, increased amount of listener concentration, and decreased naturalness, as compared with samples produced by the same speakers in their habitual voice quality.

Method

All procedures followed the approved guidelines of the Purdue Institutional Review Board.

Experiment 1

Participants

Ten native speakers of American English (five female and five male; M = 22.4 years; range: 18–30 years) completed Experiment 1. Speakers did not report a history of asthma, thyroid disease, head and neck surgery or cancer, or any other medical condition. Speakers were not taking any prescription medication at the time of the study, with the exception of birth control. All speakers presented with perceptually normal speech and voice and were nonsmokers (for the past 5 years). The first author perceptually confirmed the absence of vocal fry in conversation. Of the 30 participants that were screened, 20 participants did not meet eligibility criteria for the following reasons: used vocal fry in conversational sample, reported significant medical history that precluded participation, scheduling conflicts, and Voice Handicap Index-10 scores > 18 (Rosen, Murry, Zinn, Zullo, & Sonbolian, 2000). All female speakers were scheduled in the follicular phase of their menstrual cycle (Raj, Gupta, Chowdhury, & Chadha, 2010).

Protocol

Each speaker participated in two sessions on consecutive days at approximately the same time each day (±1 hr). Ambient humidity (range: 40%–50%) and temperature (70–75 °F) were maintained at constant levels. No speaker reported differences in voice use across sessions. During each session, the absence of vocal fry in conversational speech was confirmed perceptually (by the first author). Next, speakers completed vocal function exercises (Stemple, Lee, D'Amico, & Pickup, 1994) because even a single session of vocal function exercises can facilitate warm-up (Guzman, Angulo, Munoz, & Mayerhoff, 2013).

On the vocal fry day, speakers were trained to produce consistent vocal fry quality while reading. Speakers were presented with a verbal model of continuous vocal fry and encouraged to use this quality. Each speaker practiced reading the first three sentences of the Rainbow Passage using this voice quality. Verbal feedback was provided, and the speakers repeated the task as many times as needed if the vocal fry quality was inconsistent as perceptually judged by the first author. No subject practiced over 5 min. Subjects did not undergo training on the day that they used their habitual voice quality for 30 min.

Baseline voice measures were collected. Next, speakers read the Rainbow Passage (Fairbanks, 1960) for 30 min in continuous vocal fry or their habitual voice quality, at a comfortable, self-selected vocal intensity for each quality. Speakers were shown “vocal fry” signs as a visual reminder to produce continuous vocal fry. All participants were perceptually deemed (by the first author) to consistently maintain a vocal fry quality throughout the reading. A duration of 30 min was chosen in line with a previous study examining the effects of altered vocal qualities on voice production measures (Fujiki et al., 2017). All speakers maintained a consistent speaking intensity in each session (±5 dB; 33-2055 SLM, RadioShack). Voice measures were then reobtained.

Voice Measures

The voice measures included PTP at the 10th and 20th percent pitches, CPP on the all-voiced sentence of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V; Kempster, Gerratt, Abbott, Barkmeier-Kraemer, & Hillman, 2009), and vocal effort ratings. PTP at extreme low and high pitches is sensitive to changes in laryngeal physiology (Verdolini, Titze, & Druker, 1990). However, because some participants may have difficulty performing the PTP task consistently at very high pitches, only low pitches were used here (Erickson & Sivasankar, 2010). CPP may change after using nonhabitual voice quality for speaking (Fujiki et al., 2017). CPP was computed on the CAPE-V sentence “We were away a year ago,” as prior research has shown the greatest magnitude of change in CPP with this particular sentence of the CAPE-V (Awan, Roy, Jetté, Meltzner, & Hillman, 2010). CPP has also been examined as a voice measure in studies of vocal fry (Plexico & Sandage, 2017). Lastly, self-reported vocal effort ratings were obtained using the adapted BORG CR-10 scale (Baldner, Doll, & van Mersbergen, 2015), as an increase in self-perceived vocal effort may correlate with a worsening in voice measures (Laukkanen et al., 2004).

Prior to the collection of PTP, the 10th and 20th percent pitches were determined. For calculation of the 10th and 20th percent pitches, speakers were asked to glide to the highest pitch and then to the lowest pitch that they could sustain for 1 s on a soft “ee” sound. The frequencies were recorded using a contact microphone coupled to the rhinolaryngeal stroboscope (RLS 9100B, Pentax Medical). Speakers started from a comfortable pitch and completed the task two times. The lowest frequency (excluding vocal fry) and highest frequency were selected. The semitone range was computed, and the 10th and 20th percent pitches were obtained for each individual (Solomon, Ramanathan, & Makashay, 2007). The closest note corresponding to these pitches was used to cue the speakers using a keyboard (SA-76, Casio).
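The pitch-range calculation described above can be illustrated with a minimal sketch. It assumes that the "10th and 20th percent pitches" are located 10% and 20% of the way up the semitone range from the lowest sustained frequency, and that keyboard cues are equal-tempered notes tuned to A4 = 440 Hz. The function names and the example frequencies (100 Hz and 400 Hz) are hypothetical and are not taken from the study.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_range(f_low_hz, f_high_hz):
    """Width of the pitch range in semitones (12 semitones per octave)."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def percent_pitch(f_low_hz, f_high_hz, fraction):
    """Frequency lying `fraction` of the semitone range above the lowest pitch.

    Assumption: percent pitches are counted upward from the lowest frequency.
    """
    semitones_up = fraction * semitone_range(f_low_hz, f_high_hz)
    return f_low_hz * 2.0 ** (semitones_up / 12.0)


def nearest_note(freq_hz, a4_hz=440.0):
    """Closest equal-tempered note name (e.g. 'C3'), assuming A4 = 440 Hz."""
    midi = round(69 + 12.0 * math.log2(freq_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Hypothetical speaker: lowest sustained pitch 100 Hz, highest 400 Hz.
low, high = 100.0, 400.0
for fraction in (0.10, 0.20):
    target = percent_pitch(low, high, fraction)
    print(f"{int(fraction * 100)}th percent pitch: {target:.1f} Hz, cue {nearest_note(target)}")
```

With these hypothetical inputs the range spans 24 semitones, so the two targets sit 2.4 and 4.8 semitones above the lowest pitch.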

For collection of PTP, speakers produced five syllable strings of /pi/ at the target pitches (cued with a keyboard) in accordance with validated procedures (Fisher & Swank, 1997). Speakers wore nose plugs during the task. A circumvented pneumotachograph face mask was placed on each speaker's face. Low and high bandwidth pressure transducers (PTW-1 and PTL-1 transducers, Glottal Enterprises) were used to measure airflow and oral pressure, respectively. The pressure transducers were connected to the digital multichannel hardware system (PowerLab 16/30 A/D converter, ADInstruments). A 1-in. plastic tube (attached to the PTL-1 transducer) was placed in the speaker's mouth, above the tongue. Labchart v8.1.3 software (ADInstruments) was used to analyze the data. For analysis, three consecutive peaks of similar height were selected from each syllable string. Oral pressure was obtained at the middle of each peak of the /p/ production (corresponding oral flows: −7 to 10 ml/s). The pressures were averaged to obtain the PTP for that target pitch. One speaker had high flows (> 10 ml/s) at PTP20, and these data were excluded because of the possibility of pneumotachograph mask movement during data collection or an incomplete velopharyngeal closure during /p/ production (Fisher & Swank, 1997).
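As a rough illustration of the averaging and exclusion steps described above, the sketch below takes the oral pressure at each selected /p/ peak together with the corresponding oral flow, rejects the set if any flow falls outside the −7 to 10 ml/s window, and otherwise returns the mean pressure. The function name, data layout, and numeric values are hypothetical; peak selection itself was done in Labchart and is not modeled here.

```python
from statistics import mean

# Acceptable oral flow window during /p/ closure, as stated in the text (ml/s).
FLOW_RANGE_ML_S = (-7.0, 10.0)


def estimate_ptp(peaks, flow_range=FLOW_RANGE_ML_S):
    """Average oral pressure (cm H2O) over the selected peaks.

    `peaks` is a list of (oral_pressure_cmH2O, oral_flow_ml_s) tuples.
    Returns None when any flow lies outside `flow_range`, mirroring the
    exclusion of data with high flows (assumed here to apply to the whole set).
    """
    lowest, highest = flow_range
    if any(not (lowest <= flow <= highest) for _, flow in peaks):
        return None
    return mean([pressure for pressure, _ in peaks])


# Hypothetical peaks: (pressure in cm H2O, flow in ml/s).
valid_peaks = [(4.8, 2.0), (5.0, -1.0), (5.2, 6.0)]
leaky_peaks = [(4.8, 2.0), (5.0, 12.0), (5.2, 6.0)]

print(estimate_ptp(valid_peaks))  # mean of the three pressures
print(estimate_ptp(leaky_peaks))  # None: one flow exceeds 10 ml/s
```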

For CPP measurement, each speaker wore a head-mounted microphone (AKG C555 L, AKG Acoustics) placed 1 in. from their mouth. The microphone signal was amplified (Xenyx 1202/1002/802/502 Behringer) and sampled at 44 kHz through a digital multichannel hardware system (PowerLab 16/30 ADInstruments). The Analysis of Dysphonia in Speech and Voice program (Model 5109, KayPENTAX) was used for determining CPP. For vocal effort ratings, subjects were first provided an orientation to the BORG CR-10 scale. Subjects were asked to report their effort using the scale at the start and end of the 30 min.

Experiment 2

Participants

Ten listeners (five female and five male; M = 24.1 years; range: 18–29 years) completed Experiment 2. The listeners passed a hearing screening of 20 dB HL at 500 Hz, 1 kHz, 2 kHz, and 4 kHz (GSI-17 Portable Audiometer, Grason Stadler Inc.). No listener had received previous singing or vocal training or reported previous knowledge relating to the field of voice or experience rating vocal qualities.

Rating Parameters

The three parameters that were rated include (a) employability of the speaker, (b) amount of concentration required to understand the sample, and (c) naturalness of the sample. Potential for employment (i.e., employability) was chosen as a parameter to replicate the findings of Anderson et al. (2014). While not explicitly applied to studies of vocal fry quality, perceptual studies often include judgments of listener concentration in comparing voice qualities (Nagle & Eadie, 2012); hence, this parameter was included. Given the high prevalence of vocal fry among the young adult population and conversational entrainment of vocal fry (Borrie & Delfino, 2017; Wolk et al., 2012), naturalness was included as an additional rating parameter for listeners.

Listener Ratings

Ratings were completed in a quiet room. Listeners heard samples at a self-selected, comfortable intensity through headphones (Sennheiser HD 280 Pro over-ear headphones, Sennheiser Electronic GmbH & Co. KG). Listeners were encouraged to ask the investigator to adjust the output gain at any time during the experiment if needed.

Listener Familiarization With the Rating Task

Before ratings began, each listener was familiarized with the rating scale. The listeners were oriented to the anchors of the scale before the task. They were encouraged to clarify the meaning of each parameter. The following definitions were provided for each parameter: (naturalness) “How likely are you to hear a speaker like this, during a typical day?” (employability) “How likely are you to hire this speaker for a job?” and (amount of concentration) “Rate how much concentration it requires for you to understand the sample.” The anchors varied on each scale, with the left anchor representing employable, very little concentration, and natural and the right anchor representing not employable, maximal concentration, and unnatural. Thus, higher ratings (as measured from left to right) indicated decreased employability, decreased naturalness, and greater listener concentration. Listeners then heard a pair of samples (continuous vocal fry and habitual qualities from the same speaker) and were encouraged to rate these samples on each of the parameters (above) on the visual analog scale. Once listeners reported comfort and familiarization with the rating parameters and procedures, the rating task was initiated.

Rating Task

The sample that was selected for listener ratings was the first paragraph of the Rainbow Passage that started no sooner than the fifth minute but no later than the 10th minute. A total of 20 samples were rated by each listener (two samples per speaker × 10 speakers). These samples were presented via a PowerPoint presentation for ratings. The order of the vocal fry and habitual samples was randomized across the speakers with the same PowerPoint presentation shown to every listener. Listeners could replay each sample as many times as required. After hearing the sample, listeners drew a vertical line corresponding to their rating on a 10-cm visual analog scale for each parameter.

Statistical Analysis

All statistical analyses were carried out using SPSS (Version 22, IBM SPSS Statistics). First, reliability was computed. Ten percent of the PTP10, PTP20, CPP, and listener rating data were reanalyzed for interrater and intrarater reliability. Intraclass correlation coefficients (ICCs) were calculated. The intrarater ICCs ranged from 0.93 to 1.0 (PTP10 = 0.93, p < .01; PTP20 = 0.99, p < .01; CPP = 0.96, p < .01; amount of concentration = 1, p < .01; naturalness = 1, p < .01; employability = 1, p < .01). The interrater ICCs ranged from 0.79 to 1.0 (PTP10 = 0.92, p < .01; PTP20 = 0.79, p = .057; CPP = 0.97, p < .01; amount of concentration = 1, p < .01; naturalness = 1, p < .01; employability = 1, p < .01).
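To make the reliability statistic concrete, the following sketch computes one common ICC form from a small table of repeated measurements. The text does not state which ICC model was used in SPSS, so the choice here of a two-way, consistency, single-measure ICC (often written ICC(3,1)) is an assumption, and the data are hypothetical.

```python
def icc_consistency(ratings):
    """Two-way, consistency, single-measure ICC (ICC(3,1)).

    `ratings` is a list of rows, one row per measured item, with one
    column per rater (or per repeated analysis by the same rater).
    The ICC model is an assumption; the source does not specify it.
    """
    n = len(ratings)      # number of items
    k = len(ratings[0])   # number of raters or repeated analyses
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical re-analysis: first column original, second column repeat.
print(icc_consistency([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]))
```

An absolute-agreement form would additionally penalize a constant offset between raters; which form is appropriate depends on how the re-analysis was designed.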

Experiment 1

The dependent measures included PTP10, PTP20, CPP, and vocal effort ratings. Dependent measures were first assessed for normal distribution on the Shapiro–Wilk test of normality. PTP10, PTP20, and vocal effort ratings were transformed to meet assumptions of normality for parametric tests. CPP data did not need transformations. A two-factor, repeated-measures analysis of variance was employed for each dependent measure with time (baseline; post) and quality (habitual voice; continuous vocal fry) as factors. Significance was defined as p ≤ .05.

Experiment 2

Dependent measures were first assessed for normal distribution on the Shapiro–Wilk test of normality. All data were found to be normally distributed. A two-factor, repeated-measures analysis of variance was completed with parameter (listener concentration, employability, and naturalness) and quality (habitual voice; continuous vocal fry) as factors. Paired t tests were used for post hoc analysis. Significance was defined as p ≤ .05.
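The paired t tests used for the post hoc comparisons follow a standard computation, sketched below with the Python standard library. The function name and the ratings are hypothetical and are not study data. The sketch returns only the t statistic and degrees of freedom; obtaining a p value would additionally require a t distribution, for example from a statistics package.

```python
import math
from statistics import mean, stdev


def paired_t(x, y):
    """Paired-samples t statistic and degrees of freedom for x versus y.

    Raises ValueError for mismatched or too-short inputs. If every paired
    difference is identical, the standard error is zero and the statistic
    is undefined (ZeroDivisionError).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical per-listener mean ratings (cm on a 10-cm scale).
fry_ratings = [4.0, 6.0, 8.0]
habitual_ratings = [3.0, 4.0, 5.0]
t_value, df = paired_t(fry_ratings, habitual_ratings)
print(f"t({df}) = {t_value:.2f}")
```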

Results

Only statistically significant results are reported below.

Experiment 1

The means and standard errors for PTP10, PTP20, CPP, and vocal effort ratings are reported in Table 1. There was a significant Time × Voice Quality interaction effect for PTP10, F(1, 9) = 6.55, p = .03, partial η2 = .06. That is, PTP10 increased after continuous vocal fry but decreased after the habitual voice quality. There was a significant main effect of time for CPP and vocal effort. CPP increased after speaking in continuous vocal fry and habitual voice quality, F(1, 9) = 6.88, p = .03, partial η2 = .43. Likewise, vocal effort increased after continuous vocal fry and habitual voice quality, F(1, 9) = 21.96, p < .01; partial η2 = .71. There was also a significant main effect of voice quality for vocal effort ratings, F(1, 9) = 8.18, p = .02, partial η2 = .48. Vocal effort ratings were higher for continuous vocal fry compared with the habitual voice quality.
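For readers who wish to relate the F statistics to the effect sizes, partial η2 can be recovered from F and its degrees of freedom through the identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies this identity to the values reported above; the function name is hypothetical. It reproduces the reported .43, .71, and .48. Applied to the PTP10 interaction, F(1, 9) = 6.55, the same identity gives about .42 rather than the reported .06; the sketch only shows the arithmetic and does not establish which figure is intended.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df_effect, df_error) as reported for Experiment 1.
reported_tests = [
    ("PTP10, Time x Voice Quality", 6.55, 1, 9),
    ("CPP, time", 6.88, 1, 9),
    ("Vocal effort, time", 21.96, 1, 9),
    ("Vocal effort, voice quality", 8.18, 1, 9),
]

for label, f_value, df_effect, df_error in reported_tests:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: partial eta squared = {eta:.2f}")
```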

Table 1.

Means and standard error of the mean for voice measures at baseline and at the completion of continuous vocal fry (CVF) and habitual voice quality (HVQ).

| Variable | Baseline CVF | Post CVF | Baseline HVQ | Post HVQ |
|---|---|---|---|---|
| PTP10 | 4.84 ± 0.34 | 5.23 ± 0.34 | 4.91 ± 0.41 | 4.77 ± 0.25 |
| PTP20 | 5.19 ± 0.37 | 5.11 ± 0.31 | 4.92 ± 0.41 | 4.73 ± 0.31 |
| CPP | 8.78 ± 0.63 | 9.50 ± 0.44 | 8.22 ± 0.41 | 8.78 ± 0.44 |
| Vocal effort | 1.65 ± 0.47 | 4.90 ± 0.72 | 0.95 ± 0.44 | 2.70 ± 0.41 |

Note. Units of measurement are centimeters of water (PTP) and decibels (CPP). PTP = phonation threshold pressure; CPP = cepstral peak prominence.
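The pre-to-post changes implied by the Table 1 means can be worked out directly, as in the short sketch below. The means are copied from Table 1; the variable and function names are hypothetical, and the calculation uses group means only, so it says nothing about individual speakers or variability.

```python
# Group means from Table 1: (baseline, post) for each condition.
TABLE_1_MEANS = {
    "PTP10 (cm H2O)": {"CVF": (4.84, 5.23), "HVQ": (4.91, 4.77)},
    "PTP20 (cm H2O)": {"CVF": (5.19, 5.11), "HVQ": (4.92, 4.73)},
    "CPP (dB)": {"CVF": (8.78, 9.50), "HVQ": (8.22, 8.78)},
    "Vocal effort": {"CVF": (1.65, 4.90), "HVQ": (0.95, 2.70)},
}


def mean_change(measure, condition):
    """Post minus baseline for one measure and condition, rounded to 2 decimals."""
    baseline, post = TABLE_1_MEANS[measure][condition]
    return round(post - baseline, 2)


for measure in TABLE_1_MEANS:
    cvf = mean_change(measure, "CVF")
    hvq = mean_change(measure, "HVQ")
    # The difference of the two changes is the contrast tested by the
    # Time x Voice Quality interaction.
    print(f"{measure}: CVF {cvf:+.2f}, HVQ {hvq:+.2f}, difference {cvf - hvq:+.2f}")
```

For PTP10, the mean rose by 0.39 cm H2O after continuous vocal fry and fell by 0.14 cm H2O after habitual voice, which is the pattern behind the reported interaction.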

Experiment 2

The means and standard errors for listener ratings of continuous vocal fry and habitual voice quality samples are reported in Table 2. There was a significant Voice Quality × Parameter interaction effect, F(2, 36) = 4.61, p = .02, partial η2 = .20, suggesting that listeners rated continuous vocal fry samples differently than habitual voice quality samples. Post hoc tests revealed higher or more negative ratings for continuous vocal fry on amount of concentration, t(9) = 4.75, p < .01, employability, t(9) = 7.93, p < .01, and naturalness, t(9) = 5.93, p < .01. Specifically, samples of continuous vocal fry were rated as less employable, less natural, and requiring greater listener concentration when compared with habitual voice quality samples.

Table 2.

Means and standard error of mean for listener ratings on parameters for continuous vocal fry (CVF) and habitual voice quality (HVQ).

| Variable | CVF | HVQ |
|---|---|---|
| Amount of Concentration | 4.53 ± 0.17 | 1.78 ± 0.21 |
| Naturalness | 6.47 ± 0.28 | 2.31 ± 0.29 |
| Employability | 7.27 ± 1.13 | 0.24 ± 0.36 |

Note. Higher ratings indicate greater severity or more negative ratings for all dependent measures. All listener ratings are reported in centimeters.

Confirmation of Vocal Quality With CPP

To ensure that participants maintained continuous vocal fry during the vocal fry challenge and a nonvocal fry quality during their habitual voice, the mean CPP was compared between samples of continuous vocal fry and habitual voice quality. The mean CPP was calculated on the second and third sentences of the Rainbow Passage. Habitual voice quality CPP (M ± SE = 6.49 ± 0.07 dB) was greater than continuous vocal fry CPP (3.63 ± 0.14 dB). A paired t test revealed significant differences between the two voice qualities, t(9) = 6.94, p < .01, confirming that speakers were producing different voice qualities on the 2 days.

Discussion

Here, we report data from two experiments that compared the effects of using continuous vocal fry and habitual voice quality on voice production measures and inexperienced listener ratings. Using 30 min of continuous vocal fry increased PTP at the 10th percent pitch and vocal effort ratings. These increases reached statistical significance but were of small magnitude and should be interpreted with caution. Inexperienced listeners rated samples of continuous vocal fry more negatively than habitual voice quality samples. Below, we detail why only some voice production measures may have changed, the implications of these findings, and directions for future research.

Using simulated continuous vocal fry induced negative changes in PTP10. Conversely, PTP20 did not change. Changes in laryngeal physiology after continuous vocal fry, which is characterized by increase in thyroarytenoid muscle activity (McGlone & Shipp, 1971) and increased vocal fold thickness (Chen et al., 2002), may have more easily carried over to lower register PTP productions (10th percent vs. 20th percent). PTP at pitch range extremes (e.g., 10th) is thought to be more sensitive to laryngeal physiology than more conversational (e.g., 20th) pitches (Verdolini et al., 1990). This may explain why only PTP10 changed after speaking in continuous vocal fry. PTP10 data were collected first, so it is possible that any carryover effects from speaking in continuous vocal fry may have dissipated before PTP20 data were collected.

The magnitude of change in PTP10 was 0.4 cmH2O, and the clinical significance of such a small change is unclear. However, similar magnitudes of change in PTP80 have been reported in a study examining the effects of nebulized treatments on voice production (Roy, Tanner, Gray, Blomgren, & Fisher, 2003). Regarding other voice measures, CPP increased after both continuous vocal fry and habitual voice quality. This magnitude of change was small (< 0.72 dB). Caution must be exercised when interpreting a CPP change of less than 1 dB, as previous literature comparing dysphonic and healthy speakers reported an average of 2.2-dB difference between the groups (Heman-Ackah et al., 2014). Lastly, vocal effort ratings increased significantly after producing both voice qualities. However, speakers rated their vocal effort much higher after continuous vocal fry than habitual voice quality, even though both sessions lasted for the same duration. An increase in vocal effort has been reported after extended voice use (Laukkanen et al., 2004; Remacle, Finck, Roche, & Morsomme, 2012), possibly explaining why vocal effort after speaking for 30 min was rated higher than baseline vocal effort. The increased vocal effort after continuous vocal fry could also be attributed to our participants using a novel speaking style (simulated and continuous), which could have influenced their ratings. Future studies should incorporate laryngeal imaging to assess visual changes to the larynx, if any, after continuous vocal fry. Investigating other potential physiological changes from using vocal fry, such as increased thyroarytenoid muscle activity, reduced airflow, and glottic compression, is an avenue for further study.

Inexperienced listeners rated continuous vocal fry samples as unnatural, requiring greater concentration, and associated with lower employability compared with habitual voice quality samples. This finding is noteworthy because speech that requires an increased amount of listener concentration may be associated with decreased speech acceptability (Nagle & Eadie, 2012). Our findings also support previous literature, where speakers using vocal fry were perceived as less employable, less trustworthy, less attractive, and less educated (Anderson et al., 2014). It is possible that lower ratings by inexperienced listeners were related to the simulated continuous vocal fry, which differs from a more naturalistic vocal fry observed in everyday use. These data should also be interpreted in the context of listener demographics. Our listeners were young adults and may not have the experience or context to rate an individual's potential for employment.

Therefore, future studies should investigate listener ratings on more naturalistic productions along the continuum of minimal to significant presence of fry to determine a threshold (if any) for unfavorable ratings. Because language background and dialect can influence listener judgments, these data should be recorded in future studies incorporating listener ratings. There were no statistically significant differences between rating judgments for female and male speakers while using continuous vocal fry (data not shown). These data are different from those reported previously (Anderson et al., 2014). Reasons for this difference may be reflective of our smaller sample size and narrow age range of listeners. Raters in our study were college students between 18 and 29 years. There is a reported high prevalence of vocal fry in this age range; hence, any negative perception of vocal fry might differ from those in a different age group. Therefore, future studies will need to examine a larger group of listeners across the age span, who are more representative of the population (Anderson et al., 2014). As clinical judgments are based on auditory-perceptual evaluations, experienced voice pathologists could also be recruited to rate samples. It will also be important to incorporate a foil or instructional manipulation check in future research (Oppenheimer, Meyvis, & Davidenko, 2009). Future work in this area should include speakers with chronic vocal fry under habitual and clinically induced reduction of this voice quality to increase the clinical relevance of this work. If vocal fry is entrained in conversation, the persistence of vocal fry should also be addressed outside of a research setting to fully understand the salient features of this voice quality.

Summary and Limitations

There are several limitations of this preliminary study that must be acknowledged. Continuous vocal fry is a simulated production and not typical of everyday speech. It is therefore possible that changes in voice production measures reflect physiological adaptation to nonhabitual speech patterns rather than the use of vocal fry. Speakers use different amounts of vocal fry in daily speech (Plexico & Sandage, 2017). It is unknown whether frequent use of vocal fry correlates with greater muscular ease and lesser cognitive effort. Given the spread in amount of vocal fry used in everyday speech, continuous vocal fry was compared with habitual voice quality as the control condition in this pilot study. The acoustic prosodic context, which influences listener judgments (Parker & Borrie, 2017), was not considered in the listener ratings. Listeners were rating productions of the same speaker, so the acoustic prosodic context may not have influenced the ratings; however, future work should control for this important factor. In conclusion, data from this study provide preliminary evidence that continuous vocal fry can worsen some voice measures, even when used for a relatively short duration. The data also confirm published literature that listeners rate a continuous vocal fry quality more negatively than a habitual voice quality.

Acknowledgments

Funding was provided by a National Institutes of Health T32 Training Grant 2T32DC000030-26 to the Department of Speech, Language, & Hearing Sciences at Purdue University. This work was based on a thesis by the first author submitted in partial fulfillment of the MS-SLP degree from the Department of Speech, Language, & Hearing Sciences at Purdue University. The authors thank the members of the MS-thesis committee Barbara Solomon, Georgia Malandraki, and Christine Weber for their insightful comments. Bruce Craig and Ryan Murphy at the Purdue University Statistical Consulting Services assisted with the statistical analysis. The authors also thank Robert Fujiki, Abigail Chapleau, and Sara Loerch for their contributions to the data analysis.


References

  1. Abdelli-Beruh N. B., Wolk L., & Slavin D. (2014). Prevalence of vocal fry in young adult male American English speakers. Journal of Voice, 28(2), 185–190.
  2. Anderson R. C., Klofstad C. A., Mayew W. J., & Venkatachalam M. (2014). Vocal fry may undermine the success of young women in the labor market. PLoS One, 9(5), e97506.
  3. Awan S., Roy N., Jetté M., Meltzner G., & Hillman R. (2010). Quantifying dysphonia severity using a spectral/cepstral-based acoustic index: Comparisons with auditory-perceptual judgements from the CAPE-V. Clinical Linguistics & Phonetics, 24(9), 742–758.
  4. Baldner E. F., Doll E., & van Mersbergen M. R. (2015). A review of measures of vocal effort with a preliminary study on the establishment of a vocal effort measure. Journal of Voice, 29(5), 530–541.
  5. Blomgren M., Chen Y., Ng M. L., & Gilbert H. R. (1998). Acoustic, aerodynamic, physiologic, and perceptual properties of modal and vocal fry registers. The Journal of the Acoustical Society of America, 103(5 Pt. 1), 2649–2658.
  6. Borrie S. A., & Delfino C. R. (2017). Conversational entrainment of vocal fry in young adult female American English speakers. Journal of Voice, 31(4), 513.e25–513.e32.
  7. Cantor-Cutiva L. C., Bottalico P., & Hunter E. (2017). Factors associated with vocal fry among college students. Logopedics, Phoniatrics, Vocology, 43(2), 1–7.
  8. Chen Y., Robb M. P., & Gilbert H. R. (2002). Electroglottographic evaluation of gender and vowel effects during modal and vocal fry phonation. Journal of Speech, Language, and Hearing Research, 45(5), 821–829.
  9. Childers D. G., & Lee C. K. (1991). Vocal quality factors: Analysis, synthesis, and perception. The Journal of the Acoustical Society of America, 90(5), 2394–2410.
  10. Erickson E., & Sivasankar M. (2010). Evidence for adverse phonatory change following an inhaled combination treatment. Journal of Speech, Language, and Hearing Research, 53, 75–83.
  11. Fairbanks G. (1960). Voice and articulation drill book (2nd ed.). New York, NY: Harper and Row.
  12. Fisher K., & Swank P. (1997). Estimating phonation threshold pressure. Journal of Speech, Language, and Hearing Research, 40(5), 1122–1129.
  13. Fujiki R., Chapleau A., Sundarrajan A., McKenna V., & Sivasankar P. (2017). The interaction of surface hydration and vocal loading on voice measures. Journal of Voice, 31(2), 211–217.
  14. Guzman M., Angulo M., Munoz D., & Mayerhoff R. (2013). Effect on long-term average spectrum of pop singers' vocal warm-up with vocal function exercises. International Journal of Speech-Language Pathology, 15(2), 127–135.
  15. Heman-Ackah Y. D., Sataloff R. T., Laureyns G., Lurie D., Michael D. D., Heuer R., … Hillenbrand J. (2014). Quantifying the cepstral peak prominence, a measure of dysphonia. Journal of Voice, 28(6), 783–788.
  16. Kempster G. B., Gerratt B. R., Abbott K. V., Barkmeier-Kraemer J., & Hillman R. E. (2009). Consensus auditory-perceptual evaluation of voice: Development of a standardized clinical protocol. American Journal of Speech-Language Pathology, 18(2), 124–132.
  17. Laukkanen A.-M., Järvinen K., Artkoski M., Waaramaa-Mäki-Kulmala T., Kankare E., Sippola S., … Salo A. (2004). Changes in voice and subjective sensations during a 45-min vocal loading test in female subjects with vocal training. Folia Phoniatrica et Logopaedica, 56(6), 335–346.
  18. Laver J. (1980). The phonetic description of voice quality. New York, NY: Cambridge University Press.
  19. McGlone R. E., & Shipp T. (1971). Some physiologic correlates of vocal-fry phonation. Journal of Speech and Hearing Research, 14(4), 769–775.
  20. Nagle K. F., & Eadie T. L. (2012). Listener effort for highly intelligible tracheoesophageal speech. Journal of Communication Disorders, 45(3), 235–245.
  21. Oliveira G., Davidson A., Holczer R., Kaplan S., & Paretzky A. (2015). A comparison of the use of glottal fry in the spontaneous speech of young and middle-aged American women. Journal of Voice, 30(6), 684–687.
  22. Oppenheimer D. M., Meyvis T., & Davidenko N. (2009). Instructional manipulation checks: Detecting satisficing to increase statistical power. Journal of Experimental Social Psychology, 45(4), 867–872.
  23. Parker M. A., & Borrie S. A. (2017). Judgments of intelligence and likability of young adult female speakers of American English: The influence of vocal fry and the surrounding acoustic-prosodic context. Journal of Voice. https://doi.org/10.1016/j.jvoice.2017.08.002
  24. Plexico L. W., & Sandage M. J. (2017). Influence of glottal fry on acoustic voice assessment: A preliminary study. Journal of Voice, 31(3), 378.e13–378.e17.
  25. Raj A., Gupta B., Chowdhury A., & Chadha S. (2010). A study of voice changes in various phases of menstrual cycle and in postmenopausal women. Journal of Voice, 24(3), 363–368.
  26. Remacle A., Finck C., Roche A., & Morsomme D. (2012). Vocal impact of a prolonged reading task at two intensity levels: Objective measurements and subjective self-ratings. Journal of Voice, 26(4), e177–e186.
  27. Rosen C., Murry T., Zinn A., Zullo T., & Sonbolian M. (2000). Voice handicap index change following treatment of voice disorders. Journal of Voice, 14(4), 619–623.
  28. Roy N., Tanner K., Gray S., Blomgren M., & Fisher K. (2003). An evaluation of the effects of three laryngeal lubricants on phonation threshold pressure. Journal of Voice, 17(3), 331–342.
  29. Solomon N., Ramanathan P., & Makashay M. (2007). Phonation threshold pressure across the pitch range: Preliminary test of a model. Journal of Voice, 21(5), 541–550.
  30. Stemple J. C., Lee L., D'Amico B., & Pickup B. (1994). Efficacy of vocal function exercises as a method of improving voice production. Journal of Voice, 8(3), 271–278.
  31. Verdolini K., Titze I., & Druker D. (1990). Changes in phonation threshold pressure with induced conditions of hydration. Journal of Voice, 4(2), 142–151.
  32. Wolk L., Abdelli-Beruh N. B., & Slavin D. (2012). Habitual use of vocal fry in young adult female speakers. Journal of Voice, 26(3), e111–e116.
  33. Yuasa I. P. (2010). Creaky voice: A new feminine voice quality for young urban-oriented upwardly mobile American women? American Speech, 85(3), 315–337.

Articles from American Journal of Speech-Language Pathology are provided here courtesy of American Speech-Language-Hearing Association
