Abstract
Objectives
Fundamental frequency (F0) and intensity (SPL) of voice are controlled by intrinsic laryngeal muscle (ILM) activation and subglottal pressure (Psub). Their interactions were investigated.
Methods
In an in vivo canine model, the thyroarytenoid (TA), lateral cricoarytenoid/interarytenoid (LCA/IA), and the cricothyroid (CT) muscles were independently activated from threshold to maximal contraction by neuromuscular stimulation in various combinations while airflow was increased to phonation onset pressure and beyond. The resultant acoustic output was analyzed for effects of Psub on vibratory stability, F0 and SPL. Muscle activation plots and vocal range profiles by individual ILM activation states were analyzed.
Results
CT activation increased phonation onset F0, but vibration was less stable in high CT conditions and displayed vibratory mode change. In addition, in high CT conditions, a decrease in F0 with increased Psub was observed. SPL increased with Psub in all conditions, but the slope was greater at high CT, low TA/LCA/IA activations. LCA/IA activation improved vocal efficiency. To maintain the same F0 with increasing SPL (messa di voce), TA activation was decreased and LCA/IA activation was increased. The same F0 and SPL could be achieved with a variety of ILM activation combinations.
Conclusions
CT is primarily required for increasing F0, while TA can increase or decrease F0 and SPL. LCA/IA likely maintains vocal fold adduction during increased Psub and improves vocal efficiency. This study also demonstrates laryngeal motor equivalence, the ability of the larynx to achieve the same target F0 and SPL with multiple combinations of ILM activation.
Keywords: voice production, fundamental frequency, subglottal pressure, vocal intensity, neuromuscular activation, larynx, canine
INTRODUCTION
Fundamental frequency (F0) and intensity (SPL) are fundamental parameters of voice in speech and singing. Changes in pitch and loudness are also the most common complaints of patients with dysphonia. Investigations to date have identified two major variables that affect F0 and SPL: (1) neuromuscular activation of the intrinsic laryngeal muscles (ILMs) that control the glottal phonatory shape, stiffness, and tension, and (2) aerodynamic forces that act upon the vocal folds to initiate and maintain sustained vocal fold oscillations.
Previous investigations have focused primarily on the effects of aerodynamic forces (subglottal pressure and airflow) on F0 and SPL. Rubin and Vennard recorded concurrent airflow, Psub, and SPL in three subjects and found that Psub increased with increasing loudness at all F0 levels.1 Psub also increased with increased F0 at constant SPL. In general, airflow also increased with increased F0 and SPL but occasionally decreased. Titze predicted F0 change in the range of 0.5–6 Hz/cm H2O based on theoretical calculations and noted an F0-SPL dependency.2 Titze and Sundberg (1992) reported that SPL increases with F0 by approximately 8–9 dB/octave and also by 8–9 dB/doubling of phonation onset threshold pressure (Pth).3
One of the major limitations of prior studies was the inability to evaluate the roles of individual ILMs in F0 and SPL control. We previously demonstrated a technique of graded activation of multiple ILMs to investigate their role in F0 control at phonation onset.4–5 However, speech communication occurs beyond onset and requires a further increase in Psub and/or airflow (Q). Further increasing Psub would affect F0, SPL, and vibratory mode, but such an assessment has not previously been performed with ILM activation state as an independent variable.6 Thus, in the present study we evaluate the effects of laryngeal neuromuscular activation state and rising Psub beyond phonation onset on the following: (1) phonatory mode stability, (2) F0 change, (3) SPL change, (4) vocal efficiency, and (5) the relationship between SPL and F0 by individual ILM activation state.
METHODS
A. In vivo canine larynx preparation
An in vivo canine model was used as previously described.4–5 The study protocol was approved by the Institutional Animal Research Committee. To improve access to the larynx for high-speed video recording, the larynx was exteriorized in the neck via suprahyoid pharyngotomy and a supraglottic laryngectomy. The recurrent laryngeal nerves (RLNs) were identified, the nerve branches to Galen’s anastomosis and the posterior cricoarytenoid (PCA) muscles were divided, and the RLNs were followed distally until the TA and LCA/IA branches were identified.
The TA branches were tied off with silk sutures, and tripolar cuff electrodes (Ardiem Medical, Indiana, PA, USA) were placed distally to activate the TA muscles. Electrodes were placed on the main RLN trunks to activate the LCA/IA muscles, and on the external branches of the superior laryngeal nerves (SLNs) to activate the CT muscles. The internal (sensory) branches of the SLN were divided bilaterally. For each muscle, the nerve stimulation range from threshold to maximal activation was determined. Stimulation for each ILM activation state was performed for 1500 milliseconds with 100-microsecond rectangular unipolar cathodic pulses at a pulse repetition rate of 100 Hz.
Stimulations and recordings proceeded as follows: each bilateral paired muscle was symmetrically activated at the same grade level. First, the CT and LCA/IA were stimulated in all combinations of 7 levels of graded stimulation, from threshold to maximal activation. Then the same stimulation set was repeated at 4 levels of TA activation from threshold to maximum (levels 1–4). Muscle activation plots (MAPs) were generated for CT versus LCA/IA activation levels at each TA activation level.
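The stimulation grid described above can be sketched as a simple enumeration. This is only an illustration: the function name and the level ranges passed to it are assumptions, and the first call assumes 7 graded levels for CT and LCA/IA with TA levels 0 (no TA stimulation) through 4. The second call additionally assumes a zero level for CT and LCA/IA; that reading is not stated in the text, but it yields the total of 320 activation conditions mentioned later in the Methods.

```python
from itertools import product


def activation_conditions(ct_levels, lca_ia_levels, ta_levels):
    """Return every (TA, CT, LCA/IA) level combination, with TA varying slowest."""
    return list(product(ta_levels, ct_levels, lca_ia_levels))


# Assumed reading 1: graded levels 1-7 for CT and LCA/IA, TA levels 0-4.
graded = range(1, 8)
conditions = activation_conditions(graded, graded, range(0, 5))
print(len(conditions))  # 245

# Assumed reading 2: a zero (unstimulated) level is also included for CT and LCA/IA.
with_zero = activation_conditions(range(0, 8), range(0, 8), range(0, 5))
print(len(with_zero))  # 320
```

Each tuple corresponds to one cell of a CT versus LCA/IA muscle activation plot at a given TA level.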
B. In vivo canine model of phonation
A rigid subglottal tube to provide rostral airflow was attached and connected to an airflow controller (MCS Series Mass Flow Controller, Alicat Scientific, Tucson, Arizona, USA), which increased the airflow rate linearly from 300 to 1600 ml/s during each 1500 ms stimulation. The airflow was increased linearly to reach phonation onset pressure (Pth) and further so Psub continuously increased beyond onset until maximal Q was reached. The airflow controller was connected downstream to a heater humidifier so that the airflow at the glottis was 37 degrees Celsius and 100% relative humidity.
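As a rough illustration of the airflow protocol, the commanded flow at any moment of a stimulation can be written as a linear ramp. The function name and argument names below are hypothetical; only the end points (300 and 1600 ml/s) and the 1500 ms duration come from the description above, and the actual controller output may have deviated from an ideal ramp.

```python
def airflow_ml_per_s(t_ms, q_start=300.0, q_end=1600.0, duration_ms=1500.0):
    """Idealized linear airflow ramp: commanded flow (ml/s) at time t_ms after stimulation onset."""
    if not 0.0 <= t_ms <= duration_ms:
        raise ValueError("t_ms must lie within the stimulation window")
    return q_start + (q_end - q_start) * t_ms / duration_ms


print(airflow_ml_per_s(0.0))     # 300.0
print(airflow_ml_per_s(750.0))   # 950.0
print(airflow_ml_per_s(1500.0))  # 1600.0
```

Under this idealization the flow rises by 1300 ml/s over 1500 ms, so conditions with a higher phonation onset pressure necessarily reach onset later in the ramp.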
C. Measurement of experimental parameters
A high-speed digital video camera (Phantom v210, Vision Research Inc., Wayne, New Jersey, USA) imaged laryngeal posture changes and vibration at 3000 frames per second (fps). Acoustic and aerodynamic data were recorded using a probe microphone (Model 4128, Bruel & Kjaer North America, Norcross, Georgia, USA) and a pressure transducer (MKS Baratron 220D, MKS Instruments, Andover, Massachusetts, USA) mounted flush with the inner wall of the subglottic inflow tube about 5 cm below the inferior border of the glottis.
D. Acoustic Signal Analysis
The acoustic signal from each activation condition was first visualized using SoundForge professional digital audio editor software. The phonation onset time and the end of the stable mode from phonation onset (bifurcation points) were marked (Figure 1). In some activation conditions, the acoustic signal transitioned from the onset vibratory mode to another mode as subglottal pressure continued to increase (Figure 1a–b). The SPL curve rose rapidly after phonation onset and stabilized shortly thereafter (blue line in Figure 1c). The point at which the SPL curve stabilized or plateaued (see the derivative of the SPL curve, green line in Figure 1c) was marked as the SPL “stability” time.
Figure 1.
Illustrative neuromuscular activation condition showing (a) spectrogram of the acoustic signal with overlaid Psub level, (b) acoustic signal showing onset vibratory mode change (labeled “end”), and (c) Sound pressure level (SPL, blue line) and its derivative (green line) used to mark the stability point. Phonation “onset” time, SPL “stability” time, and “end” of the phonation onset vibratory mode are marked. Stability and end time points were used to calculate the slope of F0 and SPL change per incremental change in Psub.
The SPL stability time point reflects both the time required to reach phonation onset pressure and the rise time for SPL after onset. As one of the goals of this study was to evaluate the effect of Psub on F0 and SPL, it seemed more appropriate to use the SPL stability points as the first data points, rather than the SPL exactly at phonation onset, where acoustic energy is not fully reflected (see Figure 1).
Automated F0 calculation at the stability, middle, and end points of the stable onset mode was performed by implementing a valley-picking algorithm based on the recursive algorithm by Mermelstein.7 Three glottal cycles were selected to measure the SPL and F0. The absolute value of SPL relative to the reference pressure was not calculated, since all comparisons were made within the same recording parameters.
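One way to locate such a stability point automatically is to follow the derivative of the SPL curve and take the first sample after onset at which the slope flattens once a rise has been seen. The sketch below is an assumption about how this could be done, not the marking procedure used in the study; the function name, the slope threshold, and the sample values are all hypothetical.

```python
def spl_stability_time(times_ms, spl_db, onset_ms, slope_threshold=0.05):
    """Return the first time after onset at which the SPL slope (dB/ms) flattens.

    A plateau is only accepted after the curve has been seen rising, so the
    flat pre-onset baseline is not mistaken for the stability point.
    Returns None if no plateau is found.
    """
    seen_rise = False
    for i in range(1, len(times_ms)):
        if times_ms[i] <= onset_ms:
            continue
        slope = (spl_db[i] - spl_db[i - 1]) / (times_ms[i] - times_ms[i - 1])
        if slope >= slope_threshold:
            seen_rise = True
        elif seen_rise and abs(slope) < slope_threshold:
            return times_ms[i]
    return None


# Made-up SPL trace sampled every 10 ms, with phonation onset at 20 ms.
times = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
spl = [40.0, 40.0, 40.0, 50.0, 58.0, 63.0, 65.0, 65.2, 65.3, 65.3, 65.4]
print(spl_stability_time(times, spl, onset_ms=20))  # 70
```

The threshold trades sensitivity against noise: too small a value never triggers on a noisy trace, while too large a value marks the plateau while SPL is still rising.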
E. Relationship between Psub and Acoustic Outputs
The incremental increase in SPL per doubling of Psub after phonation onset was obtained by linear regression after log-scaling both SPL and Psub, which better defines their relationship. While SPL has a logarithmic relation to Psub, F0 has a linear relation. The slope for F0 change was also calculated using linear regression between the stability and end points of the stable phonatory mode. When the stable mode interval was shorter than 50 ms, the data were excluded because the calculated slope is not reliable (this occurred in 2 of 320 total activation conditions).
Vocal efficiency was calculated as the ratio of the acoustic power to the aerodynamic power at the middle of the stable vibratory mode, between the stability point and the end point, for each activation condition. The acoustic power (numerator) was derived in watts from the SPL. The aerodynamic power (denominator) was calculated as the product Psub × Q.
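The two slopes can be illustrated with an ordinary least-squares fit. This is a minimal sketch under stated assumptions: SPL in dB is already a logarithmic quantity, so it is regressed against log2 of Psub to give dB per doubling, while F0 is regressed against Psub directly and scaled to Hz per 1000 Pa. The function names and the data points are hypothetical and are not taken from the study.

```python
import math


def linear_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def spl_db_per_psub_doubling(psub_pa, spl_db):
    """Slope of SPL (dB) against log2(Psub): dB gained per doubling of Psub."""
    return linear_slope([math.log2(p) for p in psub_pa], spl_db)


def f0_hz_per_1000_pa(psub_pa, f0_hz):
    """Slope of F0 (Hz) against Psub (Pa), expressed per 1000 Pa."""
    return 1000.0 * linear_slope(psub_pa, f0_hz)


# Made-up samples between the stability and end points of one condition.
spl_slope = spl_db_per_psub_doubling([500.0, 1000.0, 2000.0], [60.0, 69.0, 78.0])
f0_slope = f0_hz_per_1000_pa([1000.0, 1500.0, 2000.0], [600.0, 590.0, 580.0])
print(round(spl_slope, 3), round(f0_slope, 3))
```

With these made-up values the fits give about 9 dB per doubling of Psub and about −20 Hz per 1000 Pa.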
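The efficiency ratio can be sketched as follows. The conversion from SPL to acoustic power is an assumption made for illustration only: it treats SPL as a calibrated level, equates it to the intensity level re 1e-12 W/m², and assumes spherical radiation at a known microphone distance. None of these details are given in the text, and the function names, the distance, and the input values are hypothetical.

```python
import math

REFERENCE_INTENSITY_W_PER_M2 = 1e-12  # conventional 0 dB reference for sound intensity in air


def acoustic_power_w(spl_db, mic_distance_m):
    """Acoustic power (W), assuming spherical radiation and a calibrated SPL (assumption)."""
    intensity = REFERENCE_INTENSITY_W_PER_M2 * 10.0 ** (spl_db / 10.0)
    return intensity * 4.0 * math.pi * mic_distance_m ** 2


def aerodynamic_power_w(psub_pa, flow_ml_per_s):
    """Aerodynamic power (W) as Psub (Pa) times Q, with Q converted from ml/s to m^3/s."""
    return psub_pa * flow_ml_per_s * 1e-6


def vocal_efficiency(spl_db, mic_distance_m, psub_pa, flow_ml_per_s):
    """Ratio of acoustic power to aerodynamic power (dimensionless)."""
    return acoustic_power_w(spl_db, mic_distance_m) / aerodynamic_power_w(psub_pa, flow_ml_per_s)


# Made-up inputs: 80 dB at 0.3 m, Psub = 2000 Pa, Q = 500 ml/s.
efficiency = vocal_efficiency(80.0, 0.3, 2000.0, 500.0)
print(f"{efficiency:.2e}")
```

Because the comparisons in this study are relative, any fixed calibration offset or distance factor scales all conditions equally and does not change which regions of a muscle activation plot appear more efficient.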
RESULTS
A. SPL stability time after onset of stimulation
In this investigation, Q was linearly increased from 300 ml/s to 1600 ml/s during the 1500 ms neuromuscular stimulation duration for each unique combination of neuromuscular activation. Phonation onset occurred when Psub matched that required for that particular laryngeal posture, tension, and stiffness. After phonation onset, Q and Psub continued to rise until the maximal airflow rate was reached at the end of stimulation (Figure 1). As shown in Figure 1c, the SPL curve rose abruptly after phonation onset and then stabilized a short time thereafter (range 24–80 ms). However, there was no obvious relationship between this SPL rise time after onset and laryngeal activation condition. Muscle Activation Plots (MAPs) are therefore shown for the SPL stability time measured from the onset of stimulation (Figure 2). SPL stability time increased with CT activation and decreased with LCA and TA activation.
Figure 2.

CT versus LCA/IA Muscle Activation Plots (MAPs) showing SPL Stability Time for TA levels 2–4. SPL Stability Time is phonation onset time plus the SPL rise time after onset. CT activation increased and TA and LCA/IA activation decreased phonation onset and SPL stability time (see text).
B. Phonatory mode stability after phonation onset
After phonation onset was reached, Q and Psub continued to increase and the phonation onset vibratory mode remained stable for the entire stimulation duration for TA levels 0 and 1 (MAPs not shown). With further increase in TA activation, the onset mode became “unstable” and a mode change occurred (see “end” mark in Figure 1b) in certain conditions. At TA level 2, most activation conditions at mid CT activation (levels 3–4) were unstable (middle blue island, Figure 3A) and displayed reduced phonatory mode stability. At TA level 3, the higher CT activation conditions (levels 4–7) were unstable (Figure 3B). However, at TA level 4 (maximal TA activation) the higher CT activation conditions (levels 4–7) were stable throughout while the lower CT activation levels were unstable (Figure 3C).
Figure 3.

CT versus LCA/IA Muscle Activation Plots (MAPs) showing phonatory mode duration after phonation onset for TA levels 2–4 (see text).
C. Effect of Psub on F0
MAPs for the incremental change in F0 with every 1000 Pa (about 10 cm H2O) increase in Psub after onset (dF0/dPsub) are shown in Figure 4. Increased CT activation generally led to a decrease in dF0/dPsub for mid-level TA activation conditions (TA2, TA3). At the highest TA condition (TA4), positive dF0/dPsub was observed in all conditions, but CT activation reduced the magnitude of change.
Figure 4.

CT versus LCA/IA Muscle Activation Plots (MAPs) showing change in F0 per 1000 Pascal change in Psub, at TA levels 2–4.
D. Effect of Psub on SPL
The calculated change in SPL per doubling of Psub after phonation onset is shown in Figure 5. Values ranged from 5 to 20 dB. The greatest magnitude of SPL change was present with higher CT activation at lower LCA/IA levels. Both TA and LCA activation decreased the magnitude of change.
Figure 5.

CT versus LCA/IA Muscle Activation Plots (MAPs) showing change in SPL per doubling of phonation onset pressure, at TA levels 2–4.
E. Vocal efficiency
Vocal efficiency calculated from the middle of the stable phonatory onset mode is shown in Figure 6. There is a general trend for increased efficiency in each condition with increasing LCA/IA activation. In addition, as TA level increased, the region of increased efficiency migrated from the low CT activation to higher CT activation conditions.
Figure 6.

CT versus LCA/IA Muscle Activation Plots (MAPs) showing calculated vocal efficiency at the mid-point of the stable onset vibratory mode, at TA levels 2–4.
F. Relationship between SPL versus F0 by ILM type and Psub
Vocal range profiles (SPL versus F0) obtained from the middle of the stable phonatory onset mode are shown in Figures 7–8 for TA, LCA/IA, and CT activation levels, and for Psub. F0 was divided into lower and higher registers as reported previously (Chhetri 2014).5 In the lower register (Figure 7), there is no obvious pattern for the TA or the LCA muscles. In contrast, CT activation was clearly associated with F0 increase, and Psub correlated with both F0 and SPL increase. In the upper register (Figure 8), increased TA activation is generally associated with increased F0 but with decreased SPL, while increased LCA activation is associated with increased SPL alone. Similar to the lower register, increased CT activation is associated with increased F0, and Psub is associated with both SPL and F0 increase.
Figure 7.
F0 versus SPL interactions for TA, LCA/IA, CT, and Psub levels for the lower register (up to 380 Hz).
Figure 8.
F0 versus SPL interactions for TA, LCA/IA, CT, and Psub levels for the upper register (above 380 Hz).
DISCUSSION
In speech and singing, the intrinsic laryngeal muscles (ILMs) contract to set up the phonatory posture, and the aerodynamic energy interacts with the vocal fold tissue to produce sound. F0 and SPL are important characteristics of voice in both speech and singing: F0 variations contain linguistic information, and singing requires complex control of pitch and intensity. For example, in the messa di voce exercise, F0 is kept constant while SPL is varied.8
Each ILM has a differential effect on vocal posture and acoustics. As seen from the superior view, the TA muscle first adducts then shortens the membranous vocal fold, CT lengthens the vocal fold, and the LCA/IA adducts the cartilaginous vocal fold.4–5 This investigation demonstrated that CT activation increases phonation onset and SPL stability time whereas TA and LCA/IA have opposite effects. This appears related to the effects on Pth: increased glottal tension from CT activation increases Pth, thus requiring more time to reach the target pressure, while increased TA and LCA/IA counter these effects.4–5 Thus, while CT activation is requisite for production of higher register, the phonatory effort is increased.
While phonation occurs across a wide Psub range to achieve the desired F0 or SPL, certain laryngeal activation conditions appear less stable to increases in Psub. At the lower TA level (TA2) the midrange CT activation underwent mode change prior to the end of stimulation (Figure 3). At TA level 3, where maximal adduction without shortening of the vocal fold was achieved, high CT activation conditions (level 4 and above) displayed reduced vibratory mode stability. In other words, this region produced the highest F0 but was stable across a narrow Psub range (as a ratio of phonation onset pressure). This implies that it would be more challenging to sing in the upper register with increasing intensity compared to the lower register. On the other hand, maximal TA contraction at low CT activation is also inherently more unstable, likely due to excessive glottal resistance and tissue interactions, and higher CT activation counters this hyperadduction to impart stability to voice. These findings are consistent with prior reports of CT and TA counterbalancing each other.4
While modeling and some human experiments have found that increasing Psub increases F0, an unexpected finding in this study was decreased F0 in some conditions. In particular, when TA activation was lower and CT activation was high, F0 often decreased with increasing Psub. Typical values were −20 Hz per 1000 pascals (or approximately −2 Hz/cm H2O). The decrease in F0 with increasing Psub primarily occurred in conditions where F0 was already in the upper F0 range (600–700 Hz), illustrating the challenges of stably maintaining high F0 while increasing SPL. In line with prior reports, F0 increased with increasing Psub in nearly all cases in the lower register. The most dramatic F0 increase occurred at high TA and LCA/IA activation at low CT activation conditions.
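The unit conversion behind the approximate −2 Hz/cm H2O figure can be checked with a short calculation. The sketch assumes the conventional value of 98.0665 Pa per cm H2O; the constant and function names are hypothetical.

```python
PA_PER_CM_H2O = 98.0665  # conventional centimetre of water, in pascals


def hz_per_1000_pa_to_hz_per_cm_h2o(slope_hz_per_1000_pa):
    """Convert an F0 slope from Hz per 1000 Pa to Hz per cm H2O."""
    return slope_hz_per_1000_pa / 1000.0 * PA_PER_CM_H2O


converted = hz_per_1000_pa_to_hz_per_cm_h2o(-20.0)
print(round(converted, 2))  # -1.96
```

The result, about −1.96 Hz/cm H2O, has a magnitude that falls within the 0.5–6 Hz/cm H2O range cited in the Introduction for predicted F0 change, although here the sign is negative.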
An interesting relationship was observed between SPL change and vocal efficiency (Figure 5 versus Figure 6). Figure 5 shows that the largest SPL increase per doubling of Psub occurred in the low LCA/IA, high CT conditions. As TA and LCA/IA levels increased, the dSPL/dPsub ratio decreased as well. The regions of higher SPL change are also the areas with higher Pth, so a doubling of Psub there represents a larger increase in aerodynamic power. It might be clinically relevant to look at these areas in terms of vocal efficiency. Vocal efficiency is an important measure of how effective the larynx is at converting aerodynamic power to acoustic power. LCA/IA improved vocal efficiency most consistently, while TA activation moved efficiency towards higher CT activation levels. LCA/IA activation closes the posterior glottis and reduces the airflow requirement, so the higher efficiency observed with greater LCA/IA activation is consistent with this mechanism. TA appears to improve efficiency for increased CT activation states, but CT is also able to counteract TA to improve efficiency when TA activation is maximal and the glottis is hyperadducted.
The interactions of ILMs and Psub on F0 and SPL are further clarified in this study. In the lower register, F0 is primarily controlled by CT and Psub, while SPL is controlled by Psub. In the higher register, F0 is controlled by both CT and TA, while SPL is negatively correlated with TA activation and positively with Psub. These findings are consistent with prior reports of TA and CT interactions and CT dominance in register control.9 A strong coupling of SPL and F0 was seen in this study, as has been reported.2,3,10 F0 increase requires CT activation, but the resultant increased tension requires increased Psub. At the same time, increased Psub increases F0. However, the magnitude of F0 increase by aerodynamic forces alone is significantly less than that achieved by ILM activation, consistent with theory.11
CONCLUSIONS
This in vivo study evaluated the interactions of subglottal pressure and various combinations of intrinsic laryngeal muscle activation states. CT muscle activation increased phonation onset time and vocal intensity stability time, while TA and LCA/IA decreased them. CT activation was essential for F0 increase, but high CT activation states were more unstable and vibratory mode changes occurred as Psub increased. Moreover, in those high CT regions, where F0 was in the upper register, F0 decreased with increasing Psub. With regard to intensity control, the greatest increase in SPL per doubling of Psub after phonation onset occurred in high CT, low LCA/IA conditions. However, higher LCA/IA activation conditions demonstrated improved vocal efficiency, likely due to maintenance of vocal fold adduction and a decreased airflow requirement. Furthermore, as TA activation increased, some counterbalance with CT activation was required to improve vocal efficiency (and vice versa). Increased TA activation increased F0 but decreased SPL, while increased LCA/IA activation was needed to increase SPL in higher registers, likely by maintaining vocal fold adduction during increased Psub. Consistent with prior studies, Psub was positively correlated with both F0 and SPL increase.
Footnotes
The authors have no other financial disclosures to make.
Conflict of Interest: None
Level of evidence: N/A
This article was presented as an oral presentation at 136th Annual Meeting of the American Laryngological Association (ALA), April 22-23, 2015, Boston, Massachusetts, USA.
Financial Disclosure: This study was supported by Grant No. R01 DC011300 from the National Institutes of Health.
References
1. Rubin HJ, LeCover M, Vennard W. Vocal intensity, subglottic pressure and air flow relationships in singers. Folia Phoniatr (Basel) 1967;19(6):393–413. doi: 10.1159/000263170.
2. Titze IR. On the relation between subglottal pressure and fundamental frequency in phonation. J Acoust Soc Am. 1989 Feb;85(2):901–6. doi: 10.1121/1.397562.
3. Titze IR, Sundberg J. Vocal intensity in speakers and singers. J Acoust Soc Am. 1992 May;91(5):2936–46. doi: 10.1121/1.402929.
4. Chhetri DK, Neubauer J, Berry DA. Neuromuscular control of fundamental frequency and glottal posture at phonation onset. J Acoust Soc Am. 2012 Feb;131(2):1401–12. doi: 10.1121/1.3672686.
5. Chhetri DK, Neubauer J, Sofer E, Berry DA. Influence and interactions of laryngeal adductors and cricothyroid muscles on fundamental frequency and glottal posture control. J Acoust Soc Am. 2014 Apr;135(4):2052–64. doi: 10.1121/1.4865918.
6. Berry DA, Herzel H, Titze IR, Story BH. Bifurcations in excised larynx experiments. J Voice. 1996 Jun;10(2):129–38. doi: 10.1016/s0892-1997(96)80039-7.
7. Mermelstein P. Automatic segmentation of speech into syllabic units. J Acoust Soc Am. 1975 Oct;58(4):880–3. doi: 10.1121/1.380738.
8. Titze IR, Long R, Shirley GI, Stathopoulos E, Ramig LO, Carroll LM, Riley WD. Messa di voce: an investigation of the symmetry of crescendo and decrescendo in a singing exercise. J Acoust Soc Am. 1999 May;105(5):2933–40. doi: 10.1121/1.426906.
9. Kochis-Jennings KA, Finnegan EM, Hoffman HT, Jaiswal S, Hull D. Cricothyroid muscle and thyroarytenoid muscle dominance in vocal register control: preliminary results. J Voice. 2014 Sep;28(5):652.e21–652.e29. doi: 10.1016/j.jvoice.2014.01.017.
10. Hsiao TY, Liu CM, Luschei ES, Titze IR. The effect of cricothyroid muscle action on the relation between subglottal pressure and fundamental frequency in an in vivo canine model. J Voice. 2001 Jun;15(2):187–93. doi: 10.1016/S0892-1997(01)00020-0.
11. Titze IR, Talkin DT. A theoretical study of the effects of various laryngeal configurations on the acoustics of phonation. J Acoust Soc Am. 1979 Jul;66(1):60–74. doi: 10.1121/1.382973.



