The Journal of the Acoustical Society of America. 2013 Apr;133(4):2340–2349. doi: 10.1121/1.4794385

Voicing produced by a constant velocity lung source

M S Howe 1,a), R S McGowan 2
PMCID: PMC3631246  PMID: 23556600

Abstract

An investigation is made of the influence of subglottal boundary conditions on the prediction of voiced sounds. It is generally assumed in mathematical models of voicing that vibrations of the vocal folds are maintained by a constant subglottal mean pressure $p_I$, whereas voicing is actually initiated by contraction of the chest cavity until the subglottal pressure becomes large enough to separate the vocal folds. The problem is reformulated to determine voicing characteristics in terms of a prescribed volumetric flow rate $Q_o$ of air from the lungs; the evolution of the resulting time-dependent subglottal mean pressure $\bar{p}_-(t)$ is then governed by glottal mechanics, the aeroacoustics of the vocal tract, and the influence of continued contraction of the lungs. The new problem is analyzed in detail for an idealized mechanical vocal system that permits precise specification of all boundary conditions. Predictions of the glottal volume velocity pulse shape are found to be in good general agreement with the traditional constant-$p_I$ theory when $p_I$ is set equal to the time-averaged value of $\bar{p}_-(t)$. However, in all cases examined the constant-$p_I$ approximation yields values of the mean flow rate $Q_o$ and sound pressure levels that are smaller by as much as 10%.

INTRODUCTION

Numerical simulations of voiced speech usually determine the glottal flow in terms of a prescribed constant or slowly varying subglottal mean pressure (for recent examples see: Zhang et al., 2002; Zhao et al., 2002; Rosa et al., 2003; de Vries et al., 2003; Tao et al., 2007; Luo et al., 2008; Link et al., 2009; Zheng et al., 2011). The magnitude of the corresponding mean volume flow rate Qo of air from the lungs is then deduced from the results of the simulation. In reality, however, the lung cavity contracts at a more or less fixed rate Qo, and the subglottal driving pressure is time dependent and determined by flow continuity and the excess Q(t) − Qo of the glottal volume velocity Q(t) over Qo (t denotes time).

The constant pressure hypothesis was used in previous work by McGowan and Howe (2012) on Level I source-tract interactions [for which the motion of the vocal folds is prescribed; Titze (2008)], the assumption being that the pressure of air flowing from the lungs was approximately constant at the glottis, at least prior to any back-reaction from the subglottal system. In this paper we show that this assumption is unnecessary, and that an arguably more natural and rigorous boundary condition is the specification of the flow rate Qo at the lungs without regard to the nature of the resulting pressure behind the glottis. This approach is more suited to modeling the practical situation where voicing is initiated by contraction of the chest cavity until the subglottal pressure becomes large enough to force apart the vocal folds—the subsequent evolution of the subglottal pressure and its mean value are then determined by the equations of motion, subject to the influence of continued contraction of the lung cavity.

We describe in this paper how this procedure can be incorporated into the reduced complexity equation for the glottal volume velocity Q. This is the “Fant equation” introduced in Fant's (1960) pioneering analysis of the problem. It determines Q in terms of forcing by the subglottal and supraglottal pressures and by the hydrodynamic pressures produced by jet formation at the glottis and jet interactions with vocal tract structures. The latter includes the false folds, whose contribution has not hitherto appeared explicitly in the Fant equation. McGowan and Howe (2010) argued from a simplified model that the false folds have a negligible impact on the glottal pulse amplitude, which is in agreement with the numerical predictions of Zhang et al. (2002) and Zheng et al. (2011).

On the other hand, several studies have revealed that structures within the supraglottal tract can substantially modify the voice source, either by changing the inertia of the glottal flow, or the spectrum of the supraglottal pressure. Titze (1994) and Titze and Story (1997) have pointed out that proper geometrical adjustment of the lower supraglottal tract (the epilarynx tube, the piriform sinuses, the pharynx) plays a crucial role in voice quality control. A relatively narrow epilarynx tube, for example, appears to promote interactions between higher formants and the glottal flow and produces rippling of the volume velocity pulse profile (Titze and Story, 1997). At the very least one might expect the false folds to produce an acoustic mass that could skew the glottal pulse beyond that calculated in their absence. Indeed, analytical modeling of source-tract interaction (McGowan and Howe, 2012) has confirmed the existence, established previously by Rothenberg (1981), Ananthapadmanabha and Fant (1982), and Fant (1986), of a rightward skewing of the glottal pulse for glottis frequencies smaller than either the first subglottal formant or supraglottal formant. The skewing is attributed to an effective mass-loading of the tracts at those frequencies.

A “level I” analysis is discussed in this paper of the modified Fant equation for the idealized mechanical model of the vocal system illustrated in Fig. 1. It would actually have been more satisfactory to extend the treatment given by McGowan and Howe (2012) which uses experimental data from measurements of the acoustic impedance on either side of the glottis. But, because the desired modification that properly accounts for steady lung contraction is contentious, it was decided to proceed first with a model that is simple enough to permit an exact comparison to be made of predictions of Q(t) and the subglottal pressure for a prescribed lung contraction rate Qo with corresponding predictions derived from the constant-pressure-driven Fant equation. In view of the level I limitation, however, a proper account is taken of the influence on Q(t) of excitation by lung contraction at a prescribed rate Qo and of jet interactions with the glottis and the false folds, but possibly related effects on the mechanics of vocal fold vibration are ignored.

Figure 1. Idealized configuration of the vocal tract used to illustrate voicing produced by steady contraction of the lung cavity at volume velocity $Q_o$. The upper tract span $\ell_s$ is in the $x_3$ direction, out of the plane of the paper.

The modified Fant equation is derived in Sec. 2 for the vocal tract model of Fig. 1. The level I numerical procedure for solving the equation is discussed in Sec. 3. Illustrative numerical results are analyzed (Sec. 4), and a comparison is made with predictions of the constant-pressure-driven Fant equation. The influence of the false folds on the skewing of the glottal pulse is also briefly discussed.

FORMULATION

Model configuration

The voice source is usually of sufficiently low frequency to permit the supraglottal tract to be modeled as a plane-wave guide. We shall therefore consider the idealized mechanical vocal system of Fig. 1, consisting of a nominally hard-walled upper tract of length $L$ and cross-sectional area $A$ terminated at its upper extreme by a “mouth,” and at its lower end by the glottis and false vocal folds. The subglottal tract will also be treated as a rigid, uniform duct of cross-section $A$ that enters, at distance $L_s$ from the glottis, the “lung complex” modeled by a plenum of cross-section $A_L \gg A$ and length $H$. The hard wall boundary condition will be relaxed in Sec. 2D in order to take into account damping induced by small amplitude vibrations of the walls. The glottis is taken to have the simplified form of a narrow duct of rectangular cross-section $A_g(t)$ and streamwise length $\ell_g \ll \sqrt{A}$, which opens and closes at a nominal frequency $f_o$. Take the origin of coordinates $\mathbf{x} = (x_1, x_2, x_3)$ at the geometric center of the opening of the glottis into the upper tract, with $x_1$ directed along its axis. The upper tract has a rectangular cross-section of span $\ell_s \gg \ell_g$ in the $x_3$ direction (out of the plane of the paper in Fig. 1) and width $w$ (parallel to $x_2$). It will be assumed that $\ell_s = 1.5$ cm and $w = 2$ cm, as in the recent numerical simulations of Zheng et al. (2011). The idealized symmetric approximation of Fig. 1 incorporates simplified false vocal folds, also based on the model of Zheng et al., which was derived from a high resolution laryngeal CT scan of a normal, 30 yr old male. The false folds have a span of $\ell_s$ and an overall length of 1.5 cm. The area $A_f$ of the rectangular channel between the false folds is $\sim 0.34A$.

Voicing is initiated by contraction of the lung cavity causing a rise in subglottal pressure. In the model of Fig. 1, steady contraction of the cavity is achieved by a piston-like motion of the lower end [at x1 = −(Ls + H) in Fig. 1] with constant volume velocity Qo. This results in a very low Mach number flow into the upper tract that is interrupted by the periodic opening and closing of the glottis. The air emerges into the upper tract as a succession of “puffs” of volume velocity Q(t), the latter being equivalent to the effective monopole source strength of the sound radiated into the supraglottal tract.

Equation for the glottal flow

Fant's (1960) treatment of voicing was based on an equation for the glottal flow, which may be regarded as locally incompressible, and where the inertia of the fluid column in motion through the glottis is balanced against an aggregate of forces consisting of a constant subglottal overpressure pI, back-pressure associated with turbulence losses and interactions with the upper tract, and viscous forces at the walls. Subsequent analyses involving increasingly sophisticated applications of this basic equation have been discussed extensively in the literature and reviewed by McGowan and Howe (2012). Howe and McGowan (2011) derived the general form of Fant's equation from the equations of aerodynamic sound. For the model of Fig. 1 it takes the form

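$$\rho_o\bar{\ell}\,\frac{dQ}{dt} + \rho_o\int_V \left(\nabla Y\cdot\boldsymbol{\omega}\wedge\mathbf{v}\right)(\mathbf{y},t)\,d^3\mathbf{y} = A\,(p_- - p_+), \tag{1}$$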

where $\rho_o$ is the mean air density, $\mathbf{v} \equiv \mathbf{v}(\mathbf{y},t)$ is the velocity at $\mathbf{y}$ and time $t$ in the fluid volume $V$, $\boldsymbol{\omega} = \operatorname{curl}\mathbf{v}$ is the vorticity, and $p_\pm \equiv p_\pm(t)$ are, respectively, the pressures just downstream of the false folds and upstream of the glottis.

The auxiliary function $Y(\mathbf{y},t)$ is a solution of Laplace's equation $\nabla^2 Y = 0$ that defines the unique velocity potential of a hypothetical incompressible flow from $y_1 < 0$ to $y_1 > 0$ through the glottis and false folds. It is normalized to have unit speed in the positive $y_1$ direction within the uniform upper and lower tracts (where $\partial Y/\partial y_1 = 1$), with normal velocity $\partial Y/\partial y_n = 0$ on the instantaneous configuration of the glottis and vocal tract walls (see Howe, 2002, for a detailed discussion). The length $\bar{\ell} \equiv \bar{\ell}(t)$ is the effective column length of fluid involved in unsteady motion through the glottis, given by

$$\bar{\ell}(t) = \int_{-\infty}^{\infty}\left(\frac{\partial Y}{\partial y_1}(\mathbf{y},t) - 1\right)dy_1, \tag{2}$$

where the integration path passes in the positive direction through the glottis between sufficiently distant points within the lower and upper tracts at which $\partial Y/\partial y_1 = 1$.

Contributions to Eq. 1 from additional surface integrals involving Y(y,t) have been neglected. These represent surface viscous forces within the glottal region and the influence of normal motions of the glottis wall and neighboring tissue, which were shown by Howe and McGowan (2012) to be responsible for small modifications of the glottal pulse profile; they have been ignored for the purpose of the present discussion.

The vortex force

The integral

$$F(t) = \rho_o\int_V \left(\nabla Y\cdot\boldsymbol{\omega}\wedge\mathbf{v}\right)(\mathbf{y},t)\,d^3\mathbf{y} \tag{3}$$

represents the drag force $F(t)$ exerted on the glottis and false folds by vortex structures (“turbulence”) in the flow (Howe, 2002, Sec. 4.4.2). This force acts to modulate the airstream through the glottis. It is equal and opposite to the vortex-surface interaction force on the fluid in the glottal region, and its evaluation requires a detailed knowledge of the complex flow near the glottis. This is not normally available, but the dominant contribution will be supplied by the mean characteristics of the glottis jet, and for the configuration of Fig. 1 we can put $F = F_g + F_f$, where $F_g$ and $F_f$ denote the force components produced, respectively, by jet interactions with the glottis and with the false folds.

The glottis component $F_g$ can be evaluated approximately by use of a simple quasi-static, “free-streamline” model of the jet (Howe and McGowan, 2011; McGowan and Howe, 2012). The jet in the vicinity of the idealized glottis is modeled as in Fig. 2, where the vorticity is confined to thin vortex sheet shear layers at the outer edge $S_J$ of the jet. Then $\boldsymbol{\omega}\wedge\mathbf{v} = \tfrac{1}{2}U_\sigma^2\,\delta(s)\,\mathbf{n}$, where $U_\sigma(t)$ is the jet velocity just inside the shear layer (constant along the free streamline), $s$ is the distance measured in the direction of the outward unit normal $\mathbf{n}$ from the jet, and the vorticity is convected at half the free streamline velocity. Figure 2 illustrates schematically the family of “streamlines” near the glottis of the velocity potential $Y(\mathbf{y},t)$. The main contribution to the integral is from the section of the jet close to the glottis, where these streamlines cut across the edge of the jet and spatial variations of $U_\sigma$ can be neglected, so that

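$$\frac{F_g}{\rho_o} \approx \int_{V(t)} \nabla Y\cdot\boldsymbol{\omega}\wedge\mathbf{v}\,d^3\mathbf{y} \approx \frac{U_\sigma^2(t)}{2}\oint_{S_J}\frac{\partial Y}{\partial y_n}\,dS = (A - \sigma A_g)\,\frac{U_\sigma^2}{2}, \tag{4}$$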

where $\sigma = \sigma(t)$ is the jet contraction ratio, and the surface integral is just equal to the net flux $(A - \sigma A_g)$ through the jet boundary of the hypothetical flow defined by the velocity potential $Y(\mathbf{y},t)$.

Figure 2. Local streamline pattern of the hypothetical flow through the glottis defined by the velocity potential $Y(\mathbf{y},t)$ intersecting the vortex sheet boundary of the idealized jet.

This result for Fg does not depend on the precise functional form of Y(y,t). However, some knowledge of the behavior of Y(y,t) in the vicinity of the false folds is necessary to calculate the component Ff in terms of the vorticity convecting between the folds. In principle Y(y,t) can be found by numerical integration of Laplace's equation. But more insight is obtained by use of the following approximation (Rayleigh, 1945, Sec. 308; Howe, 2002, p. 80)

$$Y(\mathbf{y},t) \approx \int_0^{y_1}\frac{A}{S(\xi,t)}\,d\xi, \tag{5}$$

where $S(\xi,t)$ is the cross-sectional area at any point $y_1 = \xi$ within the entire vocal tract at time $t$. In our case $S(y_1,t)$ is time dependent only within the glottis ($-\ell_g < y_1 < 0$), where $S(y_1,t) = A_g(t)$.

Equation 5 is strictly valid when the cross-sectional area varies “slowly” with position in the duct. But it nonetheless provides a good and physically correct approximation when used to evaluate the force integral [Eq. 3]. Formula (5) determines only the component $\partial Y/\partial y_1$ of $\nabla Y$. The equation of continuity can be used to obtain an improved approximation to $\nabla Y$, and for the present two-dimensional duct geometry one finds (Rayleigh, 1945, Sec. 308)

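$$\frac{\partial Y}{\partial y_1} = \frac{A}{S(y_1,t)},\qquad \frac{\partial Y}{\partial y_2} = -y_2\,\frac{\partial}{\partial y_1}\left(\frac{A}{S(y_1,t)}\right). \tag{6}$$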

The jet emerging from the glottis expands laterally and can be expected to “wet” the surfaces of the false folds. The lateral velocities are small, however, compared with the axial mean speed, and to evaluate Ff it will be assumed that the vorticity convects parallel to the duct axis through the section of cross-sectional area Af between the false folds at the mean speed Qo/Af. Thus

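$$\frac{F_f(t)}{\rho_o} \approx -\int y_2\,(\boldsymbol{\omega}\wedge\mathbf{v})_2(\mathbf{y},t)\left(\frac{A}{A_f}-1\right)\Big(\delta(y_1-\ell_f) - \delta(y_1-\ell_t)\Big)\,d^3\mathbf{y}, \tag{7}$$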

where the main contributions to $F_f$ arise from vortex-surface interactions at the leading and trailing edges, respectively $x_1 = \ell_f$ and $x_1 = \ell_t$, of the idealized false folds, where $\nabla Y$ is singular.

However, vorticity interacting with a “trailing edge” $x_1 = \ell_t$ (labeled B in Fig. 1) would in practice induce the shedding of new vorticity into a wake. When this shedding is ignored the force calculated from Eq. 7 has two components, from vortex elements passing the leading and trailing edges. The overall force, however, also involves a contribution from the shed vorticity. A similar problem arises in the calculation of the force produced by a turbulent “gust” in mean flow past an airfoil. In that case it is known that in a first approximation the force component attributed to the wake is equal and opposite to that generated by the gust at the trailing edge (Howe, 1976, 2002, Sec. 6.3). This happens because shedding acts to smooth the trailing edge flow, removing potential flow singularities that would otherwise occur. Therefore the leading order effect of the wakes of the false folds can be determined, without knowledge of details of the shed vorticity, merely by deleting the singularity $\delta(y_1 - \ell_t)$ from the integrand of Eq. 7 and ignoring the contribution to the remaining integral from the wake vorticity. Of course, the trailing edge of a false fold is actually tapered and much smoother than the sharp corner B of Fig. 1 (see Fig. 1 of Zheng et al., 2011). But separation of the mean flow must still occur in this region, which will again reduce its contribution to $F_f$. The wake was neglected in the analogous false folds problem discussed by McGowan and Howe (2010), where surface wetting by the mean jet was ignored.

If lateral expansion of the jet is ignored over the relatively short distance $\sim\ell_f$ between the glottis and the false folds, the integration in Eq. 7 can be performed by means of the vortex sheet approximation used above for Eq. 4, to obtain

$$\frac{F_f(t)}{\rho_o} \approx -\sigma\left(\frac{A}{A_f}-1\right)\left[\frac{A_g(t)\,U_\sigma^2(t)}{2}\right], \tag{8}$$

where the square brackets $[\;]$ indicate that the enclosed quantity is to be evaluated at the “retarded” time at which the vorticity passing the leading edge $x_1 = \ell_f$ at time $t$ emerged from the glottis.

Combining this result with Fg of Eq. 4, we find

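$$\frac{F(t)}{\rho_o} \equiv \int_V\left(\nabla Y\cdot\boldsymbol{\omega}\wedge\mathbf{v}\right)(\mathbf{y},t)\,d^3\mathbf{y} \approx \big(A-\sigma A_g(t)\big)\frac{U_\sigma^2(t)}{2} - \sigma\left(\frac{A}{A_f}-1\right)\left[\frac{A_g(t)\,U_\sigma^2(t)}{2}\right]. \tag{9}$$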

The contribution from the false folds is smaller than that from the glottis by a factor of order Ag/Af. It is also of opposite sign, and represents the effect of a “suction” force at the leading edge of the false folds acting on the mean flow in the +x1 direction, whereas the glottis component Fg opposes the motion through the glottis. This reduces the overall flow resistance through the glottis produced by vortex-surface interactions, in agreement with numerical simulations reported by Zhao et al. (2002), and should therefore reduce the subglottal pressure required to maintain a given mean volume velocity Qo.

The pressures $p_+$ and $p_-$

The pressure fluctuations $p_+$ in Eq. 1 are determined by the volume source of strength $Q(t)$ at the glottis of the flow into the upper tract. Standard acoustic analysis for the supraglottal duct of Fig. 1 (cf. Murray and Howe, 2012) provides the Fourier representation

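$$p_+ = -\frac{i\rho_o c_o}{2\pi A}\iint_{-\infty}^{\infty} Q(\tau)\,\frac{\sin(k_o\bar{L})\,e^{-i\omega(t-\tau)}}{\cos(k_o\bar{L})}\,d\omega\,d\tau, \tag{10}$$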

where $k_o = \omega/c_o$, $c_o$ is the speed of sound, and the length $\bar{L}$ is equal to the interior duct length $L$ suitably augmented to account for the end-correction of the open mouth.

The pressure $p_-$ near the glottis in the lower tract is governed by the net volumetric rate of inflow, comprising the steady inflow $Q_o$ due to lung contraction and the outflow $Q(t)$ through the glottis. By making the usual assumptions of continuity of pressure and volume velocity at the junction $x_1 = -L_s$, we find

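$$p_- = \frac{i\rho_o c_o}{2\pi A}\iint_{-\infty}^{\infty}\big(Q_o - Q(\tau)\big)\,\frac{\big[A\cos(k_oL_s)\cos(k_oH) - A_L\sin(k_oL_s)\sin(k_oH)\big]\,e^{-i\omega(t-\tau)}}{\big[A\sin(k_oL_s)\cos(k_oH) + A_L\cos(k_oL_s)\sin(k_oH)\big]}\,d\omega\,d\tau. \tag{11}$$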

In both of these formulas causality requires the path of integration with respect to ω to pass above all singularities (simple poles) of the integrands.

These expressions are strictly applicable in the absence of damping. Thus, the undamped poles for the upper tract pressure $p_+$ [Eq. 10] occur at the resonance frequencies $\omega = \omega_n = (n - 1/2)\pi c_o/\bar{L}$ $(-\infty < n < \infty)$. Damping in the upper tract is produced by thermo-viscous wall-losses, open-end radiation (consistent with the idealized model of Fig. 1), and flexural motions of the tract walls. These effects cause the poles to be shifted into the lower complex plane, perturbing in general both their real and imaginary parts (Stevens, 1998). In a first approximation we can write

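$$\omega = \pm\omega_n - i\epsilon_n,\quad n \ge 1,\qquad \omega_n = \left(n-\tfrac{1}{2}\right)\frac{\pi c_o}{\bar{L}},\quad \epsilon_n > 0. \tag{12}$$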

Evaluation by residues of the integral [Eq. 10] then gives

$$p_+ = \rho_o c_o^2\sum_{n=1}^{\infty} Z_n(t), \tag{13}$$

where

$$Z_n = \frac{2}{A\bar{L}}\int_{-\infty}^{t} Q(\tau)\cos\{\omega_n(t-\tau)\}\,e^{-\epsilon_n(t-\tau)}\,d\tau,\quad n \ge 1. \tag{14}$$

The integrand of Eq. 11 for the subglottal pressure $p_-$ contains a simple pole at $\omega = 0$, which yields the contribution $\bar{p}_-$, say, given by

$$\bar{p}_- = \rho_o c_o^2\,W_0(t),\qquad W_0(t) = \frac{1}{V}\int_{-\infty}^{t}\big(Q_o - Q(\tau)\big)\,d\tau, \tag{15}$$

where $V = AL_s + A_LH$ is the mean volume of the lung cavity and the subglottal tract.

When $A_L \gg A$ and $\omega \neq 0$

$$\frac{A\cos(k_oL_s)\cos(k_oH) - A_L\sin(k_oL_s)\sin(k_oH)}{A\sin(k_oL_s)\cos(k_oH) + A_L\cos(k_oL_s)\sin(k_oH)} \approx -\frac{\sin(k_oL_s)}{\cos(k_oL_s)}.$$

The undamped subglottal resonance frequencies therefore correspond approximately to the quarter-wave modes of a duct of length Ls open at one end. When the influence of damping is included the perturbed resonance frequencies are taken in the form

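$$\omega = \pm\omega_{ns} - i\epsilon_{ns},\quad n \ge 1,\qquad \omega_{ns} = \left(n-\tfrac{1}{2}\right)\frac{\pi c_o}{L_s},\quad \epsilon_{ns} > 0, \tag{16}$$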

so that the net subglottal pressure $p_-$, including the zero frequency pole contribution $\bar{p}_-$, becomes

$$p_- = \rho_o c_o^2\left(W_0(t) + \sum_{n=1}^{\infty} W_n(t)\right), \tag{17}$$

where

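$$W_n(t) = \frac{2}{AL_s}\int_{-\infty}^{t}\big(Q_o - Q(\tau)\big)\cos\{\omega_{ns}(t-\tau)\}\,e^{-\epsilon_{ns}(t-\tau)}\,d\tau,\quad n \ge 1. \tag{18}$$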

The Fant equation

The results of Eqs. 9, 13, 18 and the relation Q = σAgUσ now permit the Fant Eq. 1 for the idealized vocal system of Fig. 1 to be cast in the form

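$$\frac{\bar{\ell}}{A}\frac{dQ}{dt} + \left(1-\frac{\sigma A_g}{A}\right)\frac{Q^2}{2\sigma^2A_g^2(t)} - \left(\frac{A}{A_f}-1\right)\left[\frac{Q^2(t)}{2\sigma A A_g(t)}\right] = c_o^2\left(W_0(t) + \sum_{n=1}^{\infty}\{W_n(t) - Z_n(t)\}\right), \tag{19}$$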

where the glottal column length $\bar{\ell}$ is given by

$$\bar{\ell}(t) = \ell_g\left(\frac{A}{A_g(t)}-1\right) + (\ell_t-\ell_f)\left(\frac{A}{A_f}-1\right). \tag{20}$$
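As a rough numerical illustration of Eq. 20, the sketch below evaluates the effective column length for the Table I geometry. The function name `column_length` and its argument names are illustrative choices, not notation from the paper; the default values are the Table I parameters converted to SI units.

```python
def column_length(ag_over_a, ell_g=3e-3, ell_f=5e-3, ell_t=20e-3, af_over_a=0.34):
    """Effective glottal column length (m) from Eq. 20.

    ag_over_a : instantaneous glottal area ratio A_g(t)/A (must be > 0)
    af_over_a : false-fold channel area ratio A_f/A
    Lengths are in metres; defaults follow Table I.
    """
    glottis_part = ell_g * (1.0 / ag_over_a - 1.0)
    false_fold_part = (ell_t - ell_f) * (1.0 / af_over_a - 1.0)
    return glottis_part + false_fold_part


if __name__ == "__main__":
    # The glottal term grows without bound as the glottis closes.
    for ratio in (0.1, 0.01, 0.001):
        print(f"A_g/A = {ratio:g}: column length = {column_length(ratio) * 1e3:.1f} mm")
```

The first term dominates as the glottis closes, which is one reason a small positive minimum area is needed when Eq. 19 is integrated numerically.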

In general the time dependent variation of the glottis cross-section Ag(t) is known, or determined by an equation that is to be solved simultaneously with Eq. 19. The solution Q(t) of Eq. 19 vanishes identically in the absence of lung contraction, i.e., unless Qo ≠ 0 in the definition [Eq. 15] of W0.

Interpretation of $\bar{p}_- = \rho_o c_o^2 W_0(t)$

Integration of the continuity equation $\partial\rho/\partial t + \rho_o\,\mathrm{div}\,\mathbf{v} = 0$ over the entire volume $V$ of the subglottal region, and use of the linear, adiabatic formula $\delta\rho = \delta p/c_o^2$, relating changes in pressure $\delta p$ and the density $\delta\rho$, reveals that

$$\frac{V}{\rho_o c_o^2}\frac{d\bar{p}}{dt} = Q_o - Q(t), \tag{21}$$

where $\bar{p}(t)$ is the space-averaged subglottal pressure. Therefore

$$\bar{p} \equiv \bar{p}_- = \rho_o c_o^2\,W_0(t). \tag{22}$$
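To give a feel for the magnitudes implied by Eq. 21, the following sketch evaluates the rate of rise of the space-averaged subglottal pressure while the glottis is treated as fully closed ($Q = 0$). The function name is hypothetical; the numerical values ($Q_o = 200$ cm³/s, $V = 4000$ ml, $f_o = 120$ Hz, $f_ot_c = 0.3$) are those quoted elsewhere in the paper, and neglecting the small leakage flow through the minimum glottal area is an assumption of this illustration.

```python
def subglottal_pressure_rate(q_lung, q_glottis, volume, rho=1.23, c=350.0):
    """d(p_bar)/dt in Pa/s from Eq. 21: (V / (rho c^2)) dp/dt = Q_o - Q."""
    return rho * c**2 * (q_lung - q_glottis) / volume


if __name__ == "__main__":
    q_lung = 200e-6        # lung contraction rate, m^3/s (200 cm^3/s)
    volume = 4000e-6       # subglottal volume, m^3 (4000 ml, Table I)
    rate = subglottal_pressure_rate(q_lung, 0.0, volume)  # glottis closed: Q = 0
    t_closed = 0.3 / 120.0  # closed time t_c for f_o t_c = 0.3 at f_o = 120 Hz
    print(f"rate of rise = {rate:.1f} Pa/s")
    print(f"rise during the closed phase = {rate * t_closed:.1f} Pa")
```

Under these assumptions the pressure climbs by roughly 19 Pa during each closed phase, which is small compared with a mean subglottal pressure of several hundred pascals.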

It follows from Eq. 21 that during periodic voicing characterized by a limit cycle solution of the Fant Eq. 19, the time average of $d\bar{p}/dt$ over a period vanishes, so that the mean volume velocity $\langle Q(t)\rangle \equiv Q_o$, where

$$\langle Q(t)\rangle = f_o\int Q(t)\,dt,$$

the integration being over a complete period $1/f_o$ of the glottal motion.

Equations 21, 22 also permit the Fant Eq. 19 to be re-cast in the form

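$$\frac{d^2\bar{p}_-}{dt^2} + \omega_H^2\,\bar{p}_- = \omega_H^2\rho_o\left\{\left(1-\frac{\sigma A_g(t)}{A}\right)\frac{Q^2(t)}{2\sigma^2A_g^2(t)} - \left(\frac{A}{A_f}-1\right)\left[\frac{Q^2(t)}{2\sigma A A_g(t)}\right] - c_o^2\sum_{n=1}^{\infty}\{W_n(t) - Z_n(t)\}\right\}, \tag{23}$$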

where $\omega_H = \sqrt{c_o^2A/(V\bar{\ell})}$ is the instantaneous value of the resonance frequency of the “Helmholtz resonator” formed by the entire subglottal region of volume $V$ with a small opening at the glottis (Rayleigh, 1945). Equation 23 determines the space-averaged subglottal pressure $\bar{p}_-(t)$ produced by lung contraction, nonlinear jet forces, and acoustic modes within and outside the cavity.
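The size of the Helmholtz frequency can be estimated with a short sketch. The function name and defaults are illustrative assumptions: the tract area $A = 3$ cm² is taken as the product of the span (1.5 cm) and width (2 cm) quoted for the model, and $V$ and $c_o$ are the Table I values.

```python
import math


def helmholtz_frequency_hz(column_length, area=3.0e-4, volume=4.0e-3, c=350.0):
    """Instantaneous Helmholtz frequency omega_H / (2 pi) in Hz.

    Uses omega_H^2 = c^2 A / (V * ell_bar), with ell_bar = column_length in metres.
    """
    omega_h = math.sqrt(c**2 * area / (volume * column_length))
    return omega_h / (2.0 * math.pi)


if __name__ == "__main__":
    # Column lengths from Eq. 20 for A_g/A = 0.1 and 0.01 with Table I geometry.
    for ell_bar in (0.0561, 0.3261):
        print(f"ell_bar = {ell_bar * 1e3:.1f} mm -> f_H = {helmholtz_frequency_hz(ell_bar):.1f} Hz")
```

Because the column length increases sharply as the glottis closes, the resonance frequency defined this way falls during the closing phase and is largest near maximum opening.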

The constant-pressure-driven Fant equation

Most discussions of the Fant equation assume that the motion is driven by a constant subglottal pressure $p_I$, rather than the contraction rate $Q_o$ of the lungs. To derive this approximation the space-averaged subglottal pressure term $c_o^2W_0(t)$ in Eq. 19 is replaced by its time-averaged value $c_o^2\langle W_0(t)\rangle = \langle\bar{p}_-(t)\rangle/\rho_o \equiv p_I/\rho_o$. Similarly, we must set $Q_o = 0$ in the definition [Eq. 18] of $W_n$. This yields the constant-pressure-driven Fant equation

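$$\frac{\bar{\ell}}{A}\frac{dQ}{dt} + \left(1-\frac{\sigma A_g(t)}{A}\right)\frac{Q^2}{2\sigma^2A_g^2(t)} - \left(\frac{A}{A_f}-1\right)\left[\frac{Q^2(t)}{2\sigma A A_g(t)}\right] = \frac{p_I}{\rho_o} + c_o^2\sum_{n=1}^{\infty}\{W_n(t) - Z_n(t)\}. \tag{24}$$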

The value of the volumetric inflow velocity $Q_o$ no longer provides a boundary condition for this equation and the ultimate voice source; its value must be determined from the solution of Eq. 24 and the formula $Q_o = f_o\int Q(t)\,dt$, where the integration is over one period of the glottal motion.

Other special cases

The Fant Eq. 19 can be modified further to treat idealized representations of two alternative geometrical configurations that are frequently discussed in the literature.

  • (1) Exposed glottis: The upper tract downstream of the false folds is removed. Then $p_+ = 0$ to a very good approximation, and the required Fant equation is obtained from Eq. 19 by discarding $\sum_{n=1}^{\infty}Z_n$.

  • (2) Non-reflective upper tract: $c_o^2\sum_{n=1}^{\infty}Z_n$ in Eq. 19 is replaced by $c_oQ(t)/A$. This is equivalent to modeling the upper tract as a semi-infinite duct.

THE LEVEL I APPROXIMATION

Prescribed glottal motion

The numerical analysis of the Fant Eqs. 19, 24 will be based on Titze's (2008) level I approach, whereby the character of the solution generated by contraction of the lungs is examined for prescribed cyclic variations of the glottis cross-section $A_g(t)$. This is done using the formula $A_g(t) = A\{a_0 + \tfrac{1}{2}a_1(1 - \cos(2\pi f_ot))\}$, where $f_o$ Hz is the fundamental voicing frequency and $a_0$, $a_1$ are suitable coefficients. The coefficient $a_0$ must be assigned a small positive value ($\ll a_1$) to avoid instability in the numerical solution. The maximum open area is then $\sim a_1A$.

In practice, however, the glottis closes for a finite time tc, say, during each cycle. To model this, the following more general formula will be used

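$$\frac{A_g(t)}{A} = \begin{cases} a_0, & \text{for } 0 \le f_ot - [f_ot] < f_ot_c,\\[6pt] a_0 + \dfrac{a_1}{2}\left(1 - \cos\left(\dfrac{2\pi\{f_o(t-t_c) - [f_ot]\}}{1 - f_ot_c}\right)\right), & \text{for } f_ot_c < f_ot - [f_ot] < 1,\end{cases} \tag{25}$$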

where [fot] denotes the largest integer not exceeding fot. The open duty factor of the glottis is then equal to 1 − fotc.
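The prescribed area waveform of Eq. 25 can be sketched as a small function. The name `glottal_area_ratio` and the argument `closed_fraction` (standing for $f_ot_c$) are illustrative; the defaults are the Table I coefficients together with the closed fraction 0.3 used in the numerical results.

```python
import math


def glottal_area_ratio(t, f0=120.0, closed_fraction=0.3, a0=5e-5, a1=0.1):
    """A_g(t)/A from Eq. 25; closed_fraction is the product f_o * t_c."""
    phase = f0 * t - math.floor(f0 * t)        # f_o t - [f_o t], lies in [0, 1)
    if phase < closed_fraction:
        return a0                                # closed phase: minimum area only
    # Fraction of the open phase elapsed, running from 0 to 1.
    open_phase = (phase - closed_fraction) / (1.0 - closed_fraction)
    return a0 + 0.5 * a1 * (1.0 - math.cos(2.0 * math.pi * open_phase))


if __name__ == "__main__":
    period = 1.0 / 120.0
    for k in range(11):
        t = k * period / 10.0
        print(f"f_o t = {k / 10:.1f}: A_g/A = {glottal_area_ratio(t):.5f}")
```

Setting `closed_fraction=0.0` recovers the simpler raised-cosine formula given before Eq. 25.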

The jet contraction ratio

Calculation for an orifice of uniform rectangular cross section indicates that the glottis-jet contraction ratio σ varies abruptly between 0.61 and 1.1 during opening and closing of the glottis (Howe and McGowan, 2010). This is consistent with the experimental findings of Park and Mongeau (2007), although the variation must actually be a smooth function of the time. Discontinuities associated with rapid jumps in σ will therefore be avoided by using in the Fant equation the following smoothed representation of the variation:

$$\sigma = 0.61 + \frac{0.49}{(a_0+a_1)}\,\frac{A_g(t)}{A}. \tag{26}$$

A similar smoothing formula was used by Zanartu et al. (2007) to model vocal fold mechanics.

Numerical procedure

The Fant equation is solved by fourth-order Runge-Kutta integration applied to Eq. 19 and to the corresponding system of equations satisfied by $W_0(t)$, $W_n(t)$, and $Z_n(t)$, $n \ge 1$, viz.,

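$$\frac{dW_0}{dt} = \frac{Q_o - Q(t)}{V}, \tag{27}$$

$$\left.\begin{aligned}\frac{dW_n}{dt} &= \frac{2\,(Q_o - Q(t))}{AL_s} - \epsilon_{ns}W_n - \omega_{ns}W_n'\\[4pt] \frac{dW_n'}{dt} &= \omega_{ns}W_n - \epsilon_{ns}W_n'\end{aligned}\right\}\quad n = 1, 2, \ldots, \tag{28}$$

$$\left.\begin{aligned}\frac{dZ_n}{dt} &= \frac{2\,Q(t)}{A\bar{L}} - \epsilon_nZ_n - \omega_nZ_n'\\[4pt] \frac{dZ_n'}{dt} &= \omega_nZ_n - \epsilon_nZ_n'\end{aligned}\right\}\quad n = 1, 2, \ldots, \tag{29}$$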

where

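$$\left.\begin{aligned} W_n' &= \frac{2}{AL_s}\int_{-\infty}^{t}\{Q_o - Q(\tau)\}\sin\{\omega_{ns}(t-\tau)\}\,e^{-\epsilon_{ns}(t-\tau)}\,d\tau\\[4pt] Z_n' &= \frac{2}{A\bar{L}}\int_{-\infty}^{t}Q(\tau)\sin\{\omega_n(t-\tau)\}\,e^{-\epsilon_n(t-\tau)}\,d\tau\end{aligned}\right\}\quad n = 1, 2, \ldots \tag{30}$$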

(cf. Howe and McGowan, 2011).
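The structure of the modal equations can be illustrated with a minimal sketch that integrates the pair of equations in Eq. 29 for a single upper-tract mode by classical fourth-order Runge-Kutta. It is not the authors' solver: the helper names are hypothetical, the flow $Q$ is held constant purely as a test input, the damping rate is assumed to follow $\epsilon_n = \pi\,\Delta f$ as for the subglottal modes, and the values $A = 3$ cm² and $\bar{L} = c_o/(4F_1) = 17.5$ cm are inferred from the model geometry and $F_1 = 500$ Hz. For constant $Q$ the steady state of Eq. 29 is $Z_n = 2Q\epsilon_n/[A\bar{L}(\epsilon_n^2 + \omega_n^2)]$, which gives a simple check.

```python
import math


def rk4_step(f, y, t, dt):
    """One classical fourth-order Runge-Kutta step for dy/dt = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def upper_tract_mode_rhs(q_of_t, omega_n, eps_n, area, length):
    """Right-hand side of Eq. 29 for one mode; the state is y = [Z_n, Z_n']."""
    def rhs(t, y):
        z, z_prime = y
        dz = 2.0 * q_of_t(t) / (area * length) - eps_n * z - omega_n * z_prime
        dz_prime = omega_n * z - eps_n * z_prime
        return [dz, dz_prime]
    return rhs


def integrate_mode(q_of_t, omega_n, eps_n, area, length, t_end, dt):
    """Integrate one mode from rest (Z_n = Z_n' = 0 at t = 0) up to t_end."""
    rhs = upper_tract_mode_rhs(q_of_t, omega_n, eps_n, area, length)
    y, t = [0.0, 0.0], 0.0
    for _ in range(round(t_end / dt)):
        y = rk4_step(rhs, y, t, dt)
        t += dt
    return y


if __name__ == "__main__":
    omega1, eps1 = 2 * math.pi * 500.0, math.pi * 50.0   # F1 = 500 Hz, bandwidth 50 Hz
    area, length = 3.0e-4, 0.175                          # assumed A and end-corrected length
    q0 = 200e-6                                           # constant test flow, m^3/s
    z, _ = integrate_mode(lambda t: q0, omega1, eps1, area, length, 0.1, 1e-5)
    z_steady = 2 * q0 * eps1 / (area * length * (eps1**2 + omega1**2))
    print(z, z_steady)
```

In a full calculation the same step function would advance $Q$, $W_0$, and all retained modes together, with $Q(t)$ supplied by Eq. 19 rather than prescribed.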

The modal coefficients $W_n(t)$, $Z_n(t)$ govern the subglottal and supraglottal cavity mode contributions to the pressures $p_-$ and $p_+$, and are small when $2\pi f_o \ll \omega_{ns}, \omega_n$. This means that it is usually permissible to truncate the infinite systems [Eqs. 28, 29] at $n \sim 3$, i.e., by taking account of the first three formants of the lower and upper tracts. The integration is started at $t = 0$ subject to the initial conditions $Q = 0$, $W_0 = W_n = W_n' = Z_n = Z_n' = 0$ $(n \ge 1)$.

Periodic solutions of Eq. 19 must satisfy the condition

$$f_o\int Q(t)\,dt = Q_o,$$

which provides a convenient check on convergence to a limit cycle solution.
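A convergence check of this kind might be sketched as follows. The function names, the sampling density, and the tolerance are illustrative assumptions; `q_of_t` stands for any callable returning the computed volume velocity.

```python
import math


def cycle_mean(q_of_t, f0, t_start, samples=2000):
    """Approximate f_o * (integral of Q over one period beginning at t_start)."""
    period = 1.0 / f0
    dt = period / samples
    # Left-endpoint rule; it coincides with the trapezoidal rule when Q is exactly periodic.
    total = sum(q_of_t(t_start + i * dt) for i in range(samples))
    return f0 * total * dt


def has_converged(q_of_t, f0, q_lung, t_start, rel_tol=1e-3):
    """True when the cycle-averaged flow matches the lung flow Q_o within rel_tol."""
    return abs(cycle_mean(q_of_t, f0, t_start) - q_lung) <= rel_tol * abs(q_lung)


if __name__ == "__main__":
    # Synthetic periodic pulse whose mean is exactly 200 cm^3/s, used only as a test signal.
    pulse = lambda t: 200e-6 * (1.0 - math.cos(2.0 * math.pi * 120.0 * t))
    print(has_converged(pulse, 120.0, 200e-6, t_start=0.0))
```

In practice the test would be applied to successive periods of the computed $Q(t)$ until the cycle mean settles at $Q_o$.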

NUMERICAL RESULTS

Model vocal tract parameters

Sample solutions of the Fant Eq. 19 and of the constant-pressure-driven approximation [Eq. 24] are now discussed. Typical parameter values for the simplified model of Fig. 1 are given in Table I, based on those used in the recent investigation of Zheng et al. (2011).

Table I. Parameter values for the ideal vocal system of Fig. 1.

| Parameter | Value |
|---|---|
| Glottal length $\ell_g$ | 3 mm |
| Glottis and duct span $\ell_s$ | 15 mm |
| Glottis coefficient $a_0$ | 0.00005 |
| Glottis coefficient $a_1$ | 0.1 |
| Duct width $w$ | 20 mm |
| False fold $\ell_f$ | 5 mm |
| False fold $\ell_t$ | 20 mm |
| False fold $A_f/A$ | 0.34 |
| Volume of subglottal cavity $V$ | 4000 ml |
| Glottis frequency (nominal) $f_o$ | 120 Hz |
| Density of air $\rho_o$ | 1.23 kg/m³ |
| Speed of sound $c_o$ | 350 m/s |

The first subglottal formant $\omega_{1s}/2\pi \equiv F_{s1} \approx 620$ Hz (Ishizaka et al., 1976). Therefore for the purpose of illustration the three subglottal formants $F_{s1}$, $F_{s2}$, $F_{s3}$ and their corresponding half-power bandwidths $\Delta f$ are taken to be defined as in Table II (Ishizaka et al., 1976; Zanartu et al., 2007; Howe and McGowan, 2012).

Table II. Subglottal frequencies.

| | $F_{s1}$ | $F_{s2}$ | $F_{s3}$ |
|---|---|---|---|
| Frequency, Hz | 620 | 1860 | 3100 |
| Bandwidth $\Delta f$, Hz | 200 | 150 | 100 |

These values are consistent with the simple rectilinear model of the subglottal tract of Fig. 1 and the frequency formula [Eq. 16]. The half-power bandwidths supply the dissipation rates ϵns by means of the formula

$$\epsilon_{ns} = \pi\,(\Delta f)_n. \tag{31}$$
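The consistency of Table II with Eq. 16 can be checked with a short sketch. The function names are illustrative, and the subglottal length $L_s = c_o/(4F_{s1}) \approx 14.1$ cm is inferred here from the first formant rather than quoted from the paper.

```python
import math


def quarter_wave_frequencies(length, n_modes=3, c=350.0):
    """Frequencies in Hz from Eq. 16: omega_ns / (2 pi) = (n - 1/2) c_o / (2 L_s)."""
    return [(n - 0.5) * c / (2.0 * length) for n in range(1, n_modes + 1)]


def damping_rates(bandwidths_hz):
    """Dissipation rates (1/s) from half-power bandwidths in Hz, Eq. 31."""
    return [math.pi * bw for bw in bandwidths_hz]


if __name__ == "__main__":
    l_s = 350.0 / (4.0 * 620.0)             # inferred subglottal tract length, m
    print(quarter_wave_frequencies(l_s))     # compare with the Table II frequencies
    print(damping_rates([200.0, 150.0, 100.0]))
```

The same frequency function applied to an end-corrected length of 17.5 cm gives the 500, 1500, 2500 Hz pattern of a uniform open tube.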

Numerical results will be discussed for a notional neutral vowel phoneme corresponding to supraglottal formants $F_1$, $F_2$, $F_3$ consistent with the rectilinear duct model of Fig. 1. They are defined along with their bandwidths (estimated from Olive et al., 1993) in Table III.

Table III. Supraglottal formants.

| | $F_1$ | $F_2$ | $F_3$ |
|---|---|---|---|
| Frequency, Hz | 500 | 1500 | 2500 |
| Bandwidth $\Delta f$, Hz | 50 | 75 | 100 |

Influence of glottal frequency

The glottis volume velocity $Q(t)$ is determined by the Fant Eq. 19 entirely in terms of the prescribed value of the lung cavity contraction rate $Q_o$. The details of the glottal pulse depend on the acoustic properties of both the subglottal region (Table II) and the articulatory state of the upper vocal tract. Figure 3 depicts a survey of possible neutral vowel waveforms, each for the same fixed value of $Q_o$ but for a range of increasing values of the glottis frequency $f_o$. In all cases $f_ot_c = 0.3$, so that the glottis duty factor is 0.7.

Figure 3. Illustrating the variation with $f_o$ of limit cycle glottal pulse profiles $Q(t)/Q_o$ predicted by the Fant Eq. 19 for the conditions of Table I, when $Q_o = 200$ cm³/s, $f_ot_c = 0.3$. The subglottal resonances are defined as in Table II, and the calculations are performed for the neutral vowel defined by the upper tract formants of Table III.

The contraction rate is fixed at $Q_o = 200$ cm³/s and $f_o$ is increased in stages from 120 to 700 Hz. This interval encompasses the subglottal resonance at $F_{s1} = 620$ Hz and the first supraglottal formant $F_1 = 500$ Hz. Each of the frames [Figs. 3(a)–3(j)] depicts three limit cycles of the glottal pulse $Q(t)/Q_o$ plotted against $f_ot$. In case (a) $f_o = 120$ Hz is well below the two resonance frequencies, and the smooth volume velocity pulse is slightly skewed to the latter half of the glottis open phase. This skewing is due almost entirely to the back-reaction on the glottis flow of the acoustic modes in the upper and lower tracts. The back-reactions are weak because $f_o \ll F_{s1}, F_1$, but the skewing is absent only when the contributions to Eq. 19 from modes in both tracts are ignored (cf. the discussion below of Fig. 4). The contribution to skewing from the augmentation of the slug length $\bar{\ell}$ produced by the narrowed passage between the false folds is much smaller (Rothenberg, 1981; Ananthapadmanabha and Fant, 1982; Fant, 1986; Titze, 1994; Titze and Story, 1997), and does not become noticeable until $A_f/A$ is smaller than about 0.1 (Fig. 6).

Figure 4. Illustrating the limit cycle variations (solid line) of $A_g(t)/A$, $Q(t)/Q_o$, and $\bar{p}_-(t)$ (Pa) for $f_o = 120$ Hz, $f_ot_c = 0.3$, $Q_o = 200$ cm³/s. The mean subglottal pressure $p_I = 478.41$ Pa. The broken-line curve (- - - -) is the non-skewed profile of $Q(t)/Q_o$ predicted when back reactions from acoustic modes in the upper and lower tracts are ignored. The dotted curve (• • •) is the volume velocity predicted by the constant-pressure-driven Fant Eq. 24 for $p_I = 478.41$ Pa, for which $Q_o = 180.46$ cm³/s.

Figure 6. Volume velocity profile skewing produced by narrowing of the gap between the false folds. The figure compares limit cycle volume velocity profiles for $Q_o = 320$ cm³/s, $f_o = 120$ Hz, $a_0 = 0.06$, $t_c = 0$ in the two cases $A_f/A = 0.1$ (solid line) and $A_f/A = 1$ (- - - -).

The appearance of a “knuckle” at the front of the glottal pulse when fo = 170 Hz [labeled A in Fig. 3b] indicates the increasing influence of the first formant F1. The knuckle advances toward the peak of the volume velocity profile with increasing frequency [Figs. 3c, 3d]; the distinct double peak in Fig. 3c occurs at the subharmonic fo = 250 Hz of F1. In Fig. 3d the secondary peak is maintained by subglottal interactions at the subharmonic 310 Hz of Fs1. At higher frequencies the double peak disappears, and Fig. 3e reveals a new disturbance B associated with the first formant F1 advancing toward the velocity peak. At fo = 480 Hz the formation of a secondary ripple C is evident. The combination leads to the triple peak profile of Fig. 3g at the resonance condition fo = F1. The central peak in this profile is produced by interaction with F2; when the contribution from the second formant is omitted from Eq. 19 the resonant profile assumes a characteristic double peak form, with a deep central minimum that is typical of resonance forcing (Lighthill, 1978; Howe and McGowan, 2011). At higher frequencies the volume velocity profile becomes skewed to the first half of the glottal cycle. Resonance forcing at the subglottal frequency fo = Fs1 [Fig. 3i] does not exhibit a double peak behavior, because of the heavy damping of the subglottal modes.

Figures 4 and 5 display the combined limit cycle variations of Ag(t)/A, Q(t)/Qo, and the subglottal space-averaged pressure p¯_(t) (Pa) for special cases of the neutral vowel. Figure 4 is for case (a) of Fig. 3, i.e., for fo = 120 Hz, Qo = 200 cm3/s and fotc = 0.3, typical of quiet speech. The back-reaction of the upper and lower tract resonant modes produces skewing of the volume velocity profile to the latter half of the open phase of the glottis. The back-reaction is relatively mild because of the large disparity between the glottal frequency fo and both Fs1 and the first formant F1. In the absence of these modes there is no skewing of the corresponding Q(t)/Qo-profile (- - - -), obtained by discarding Wn, Zn, n ≥ 1 in Fant Eq. 19 and the systems [Eqs. 28, 29]. The time-averaged subglottal pressure is pI = 478.41 Pa, as indicated by the broken line in the upper part of Fig. 4, and p¯_(t) varies over a narrow range of about ±15 Pa about this mean.

Figure 5.


Limit cycle variations (—) of Ag(t)/A, Q(t)/Qo, and p¯_(t) (Pa) for fo = 250 Hz, fotc = 0.3, Qo = 273.78 cm3/s. The mean subglottal pressure is pI = 1000 Pa. The dotted curve (• • •) is the volume velocity predicted by the constant-pressure-driven Fant Eq. 24 for pI = 1000 Pa, for which the predicted mean flow rate is 258.09 cm3/s.

The profiles in Fig. 5 are for fo = 250 Hz, fotc = 0.3 and Qo = 273.78 cm3/s, the latter having been adjusted to yield a mean subglottal pressure pI = 1000 Pa—a value frequently used in voicing studies. The glottis frequency fo is a subharmonic of the formant F1 = 500 Hz, and the variation of the volume velocity Q(t)/Qo is strongly influenced by interaction with this resonant mode. The subglottal mean pressure p¯_(t) exhibits a near saw-tooth cyclic waveform over ±10 Pa about the mean.

Solution of the constant-pressure-driven Fant equation

Inspection of Figs. 4 and 5 reveals that the departures of the space-averaged subglottal pressure p¯_(t) from its mean value pI do not exceed about ±3%. It might therefore be surmised that corresponding predictions of the constant-pressure-driven Fant Eq. 24 should, in such cases, be similar or very close to those of the “exact” Eq. 19.
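The ±3% bound follows directly from the fluctuation amplitudes quoted for Figs. 4 and 5. A minimal sketch of the arithmetic, using only the values stated in the text (the function name relative_fluctuation is hypothetical):

```python
# (mean subglottal pressure pI in Pa, quoted fluctuation amplitude in Pa)
cases = {
    "Fig. 4": (478.41, 15.0),
    "Fig. 5": (1000.0, 10.0),
}


def relative_fluctuation(mean_pa, amplitude_pa):
    """Fluctuation amplitude expressed as a fraction of the mean pressure."""
    return amplitude_pa / mean_pa


for label, (mean_pa, amp_pa) in cases.items():
    frac = relative_fluctuation(mean_pa, amp_pa)
    print(f"{label}: +/-{100 * frac:.1f}% of pI")
```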

To examine this hypothesis Q(t) has been calculated from Eq. 24 for the conditions of Figs. 4 and 5 by setting the pressure pI in the equation respectively equal to the time-averaged values of p¯_(t), namely 478.41 and 1000 Pa, calculated from the Fant Eq. 19. The numerical solution of Eq. 24 is then used to evaluate the corresponding mean volume velocity, fo ∫ Q(t) dt with the integral taken over one glottal cycle (the Eq. 24 value of Qo, say), and thence the ratio of Q(t) to this mean, which is plotted as the dotted curves (• • •) in Figs. 4 and 5. This procedure is seen to yield an excellent approximation to the fractional volume velocity waveform Q(t)/Qo determined by Eq. 19.
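The averaging step can be illustrated numerically. The sketch below is not a solution of Eq. 24: it assumes a hypothetical half-sine glottal pulse (peak value Q_PEAK, closed for the fraction fotc = 0.3 of each cycle) purely to show how the cycle mean fo ∫ Q(t) dt and the normalized waveform are formed. All names are illustrative.

```python
import math


def mean_flow(q, fo_hz, n=2000):
    """Trapezoidal estimate of fo * integral of q(t) over one period 1/fo."""
    period = 1.0 / fo_hz
    dt = period / n
    total = 0.5 * (q(0.0) + q(period))
    for k in range(1, n):
        total += q(k * dt)
    return fo_hz * total * dt


FO_HZ = 120.0          # glottal frequency (value used for Fig. 4)
CLOSED_FRACTION = 0.3  # fo * tc
Q_PEAK = 400.0         # cm^3/s, hypothetical peak volume velocity


def pulse(t):
    """Hypothetical half-sine pulse: glottis open for the first 70% of a cycle."""
    phase = (t * FO_HZ) % 1.0
    open_fraction = 1.0 - CLOSED_FRACTION
    if phase >= open_fraction:
        return 0.0
    return Q_PEAK * math.sin(math.pi * phase / open_fraction)


q_mean = mean_flow(pulse, FO_HZ)

# Normalized waveform sampled at 8 points of the cycle
samples = [pulse(k / (8 * FO_HZ)) / q_mean for k in range(8)]
print(f"cycle-mean flow = {q_mean:.2f} cm^3/s")
print("Q(t)/mean:", [round(s, 3) for s in samples])
```

For this assumed pulse the mean can be checked analytically, since fo ∫ Q dt = 2 (1 − fotc) Q_PEAK/π; the normalized waveform is independent of Q_PEAK, which is why the shape comparison in Figs. 4 and 5 says nothing about the absolute flow level.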

However, in all cases the mean volume velocity predicted by Eq. 24 is smaller than the prescribed value Qo of Eq. 19, as indicated in Table IV. To obtain equality of the mean volumetric flow rates (and therefore of the predicted speech sound pressure levels) it is necessary to increase the magnitude of the constant driving pressure pI in Eq. 24 to the respective values listed in the last column of the table.

Table IV.

Compared predictions of Eqs. 19, 24.

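fo (Hz) | Qo prescribed in Eq. 19 (cm3/s) | Mean flow rate from Eq. 24 (cm3/s) | pI = time average of p¯_(t) (Pa) | pI in Eq. 24 adjusted to recover Qo (Pa)
120 | 200.00 | 180.46 | 478.41 | 549.10
250 | 273.78 | 258.09 | 1000.00 | 1089.47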
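The size of the discrepancies in Table IV is easier to judge as percentages. The following sketch recomputes them from the tabulated values alone; the variable and function names are illustrative and not part of the original analysis.

```python
# Rows of Table IV:
# (fo in Hz, Qo prescribed in Eq. 19, mean flow from Eq. 24,
#  time-averaged pressure pI, pI adjusted to recover Qo)
# Flow rates in cm^3/s, pressures in Pa.
TABLE_IV = [
    (120, 200.00, 180.46, 478.41, 549.10),
    (250, 273.78, 258.09, 1000.00, 1089.47),
]


def shortfall_and_pressure_rise(row):
    """Return (fractional flow shortfall of Eq. 24, fractional pressure
    increase needed in Eq. 24 to match the prescribed Qo)."""
    _, q_exact, q_approx, p_mean, p_adjusted = row
    flow_shortfall = 1.0 - q_approx / q_exact
    pressure_rise = p_adjusted / p_mean - 1.0
    return flow_shortfall, pressure_rise


for row in TABLE_IV:
    shortfall, rise = shortfall_and_pressure_rise(row)
    print(f"fo = {row[0]} Hz: flow shortfall {100 * shortfall:.1f}%, "
          f"pressure increase {100 * rise:.1f}%")
```

The flow shortfall comes to roughly 10% at fo = 120 Hz and 6% at fo = 250 Hz, and the corresponding increases in driving pressure to roughly 15% and 9%.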

Influence of the false folds on profile skewing

According to Rothenberg (1981), Ananthapadmanabha and Fant (1982), Fant (1986), Titze (1994), and Titze and Story (1997), constriction of the glottis flow by a narrowing of the lower end of the supraglottal tract is one factor that causes skewing of the volume velocity wave pulse. Narrowing can occur in the model of Fig. 1 by reducing the area Af between the false folds, which has the effect of increasing the glottal column length ℓ¯ [see Eq. 20], and therefore of slowing the initial rate of rise of the velocity pulse. This is illustrated in Fig. 6, which displays the waveforms for the conditions: fo = 120 Hz, Qo = 320 cm3/s, and tc = 0, when a0 = 0.06 [corresponding to a maximum glottis area Ag(t) = 18 mm2] in the two cases Af/A = 0.1 (—) and Af/A = 1 (- - - -).

CONCLUSION

Voiced speech arises from vibrations of the vocal folds produced by air forced to flow through the glottis by contraction of the lung cavity. However, mathematical representations of this mechanism have largely been formulated in terms of a prescribed subglottal pressure pI applied to the folds, the amplitude of the pressure being fixed to accord with experiment. But experiment has also determined the characteristic rate Qo of volumetric airflow produced by lung contraction which flows through the glottis during voicing. The actual value of the subglottal pressure p¯_(t) determined by Qo is not constant, even when the lungs contract at a constant rate. The relation between these parameters has been investigated in terms of the Fant equation for an idealized mechanical vocal system that is simple enough to permit precise specification of all boundary conditions.

The approximate, constant-pressure-driven Fant equation in which pI is set equal to the time average of p¯_(t) yields predictions of Q(t)/Qo that are generally in excellent agreement with those obtained from the exact equation. However, in all cases examined it is found that the absolute level of the mean flow rate Qo calculated from the approximate equation can be up to 10% smaller than for the exact equation. This implies also that there would be corresponding discrepancies in the predicted sound pressure levels. The differences are admittedly small, and can be removed by suitably increasing by a few percent the driving pressure pI in the approximate equation, but the conclusion suggests that it would be worthwhile to extend the present investigation to a more realistic model of the vocal system. Such calculations should be done at Titze's (2008) “level II,” by including a separate equation of motion for the vocal fold vibrations, and should also incorporate a geometrically precise representation of vocal tract area variations, perhaps by use of concatenated cylindrical elements (Lighthill, 1978).

ACKNOWLEDGMENT

This work was supported by a subaward of Grant No. 1R01 DC009229 from the National Institute on Deafness and other Communication Disorders to the University of California, Los Angeles.

References

  1. Ananthapadmanabha, T. V., and Fant, G. (1982). “Calculation of the true glottal flow and its components,” Speech Comm. 1, 167–184. doi: 10.1016/0167-6393(82)90015-2
  2. de Vries, M. P., Hamburg, M. C., Schutte, H. K., Verkerke, G. J., and Veldman, A. E. P. (2003). “Numerical simulation of self-sustained oscillation of a voice-producing element based on Navier-Stokes equations and the finite element method,” J. Acoust. Soc. Am. 113, 2077–2083. doi: 10.1121/1.1560163
  3. Fant, G. (1960). Acoustic Theory of Speech Production (Mouton, The Hague), Sec. A2.
  4. Fant, G. (1986). “Glottal flow: Models and interaction,” J. Phonetics 14, 393–399.
  5. Howe, M. S. (1976). “The influence of vortex shedding on the generation of sound by convected turbulence,” J. Fluid Mech. 76, 711–740. doi: 10.1017/S0022112076000864
  6. Howe, M. S. (2002). Theory of Vortex Sound (Cambridge University Press, Cambridge), Secs. 4.4.2 and 6.3, p. 80 and Chap. 3.
  7. Howe, M. S., and McGowan, R. S. (2010). “On the single-mass model of the vocal folds,” Fluid Dyn. Res. 42, 015001. doi: 10.1088/0169-5983/42/1/015001
  8. Howe, M. S., and McGowan, R. S. (2011). “Production of sound by unsteady throttling of flow into a resonant cavity, with application to voiced speech,” J. Fluid Mech. 672, 428–450. doi: 10.1017/S0022112010006117
  9. Howe, M. S., and McGowan, R. S. (2012). “On the role of glottis-interior sources in the production of voiced sound,” J. Acoust. Soc. Am. 131, 1391–1400. doi: 10.1121/1.3672655
  10. Ishizaka, K., Matsudaira, M., and Kaneko, T. (1976). “Input acoustic-impedance measurement of the subglottal system,” J. Acoust. Soc. Am. 60, 190–197. doi: 10.1121/1.381064
  11. Lighthill, J. (1978). Waves in Fluids (Cambridge University Press, Cambridge), p. 119.
  12. Link, G., Kaltenbacher, M., Breuer, M., and Doellinger, M. (2009). “A 2D finite-element scheme for fluid-solid-acoustic interactions and its application to human phonation,” Comput. Methods Appl. Mech. Eng. 198, 3321–3334. doi: 10.1016/j.cma.2009.06.009
  13. Luo, H., Mittal, R., Bielamowicz, S., Walsh, R., and Hahn, J. (2008). “An immersed-boundary method for flow-structure interaction in biological systems with applications to phonation,” J. Comput. Phys. 227, 9303–9332. doi: 10.1016/j.jcp.2008.05.001
  14. McGowan, R. S., and Howe, M. S. (2010). “Influence of the ventricular folds on a voice source with specified vocal fold motion,” J. Acoust. Soc. Am. 127, 1519–1527. doi: 10.1121/1.3299200
  15. McGowan, R. S., and Howe, M. S. (2012). “Source-tract interaction with prescribed vocal fold motion,” J. Acoust. Soc. Am. 131, 2999–3016. doi: 10.1121/1.3685824
  16. Murray, P. R., and Howe, M. S. (2012). “On the thermo-acoustic Fant equation,” J. Sound Vib. 331, 3345–3357. doi: 10.1016/j.jsv.2012.03.014
  17. Olive, J. P., Greenwood, A., and Coleman, J. (1993). Acoustics of American English Speech: A Dynamic Approach (Springer-Verlag, New York), pp. 104 and 208.
  18. Park, J. B., and Mongeau, L. (2007). “Instantaneous orifice discharge coefficient of a physical, driven model of the human larynx,” J. Acoust. Soc. Am. 121, 442–455. doi: 10.1121/1.2401652
  19. Rayleigh, Lord (1945). Theory of Sound (Dover, New York), Vol. 2, Sec. 308.
  20. Rosa, M. D. O., Pereira, J. C., Grellet, M., and Alwan, A. (2003). “A contribution to simulating a three-dimensional larynx model using the finite element method,” J. Acoust. Soc. Am. 114, 2893–2905. doi: 10.1121/1.1619981
  21. Rothenberg, M. (1981). “Acoustic interaction between the glottal source and the vocal tract,” in Vocal Fold Physiology, edited by Stevens K. N. and Hirano M. (University of Tokyo Press, Tokyo), pp. 305–328.
  22. Stevens, K. N. (1998). Acoustic Phonetics (MIT Press, Cambridge, MA), pp. 55–152.
  23. Tao, C., Zhang, Y., Hottinger, D. G., and Jiang, J. J. (2007). “Asymmetric airflow and vibration induced by the Coanda effect in a symmetric model of the vocal fold,” J. Acoust. Soc. Am. 122, 2270–2278. doi: 10.1121/1.2773960
  24. Titze, I. R. (1994). Principles of Voice Production (Prentice Hall, Upper Saddle River, NJ), p. 72.
  25. Titze, I. R. (2008). “Nonlinear source-filter coupling in phonation: Theory,” J. Acoust. Soc. Am. 123, 2733–2749. doi: 10.1121/1.2832337
  26. Titze, I. R., and Story, B. H. (1997). “Acoustic interactions of the voice source with the lower vocal tract,” J. Acoust. Soc. Am. 101, 2234–2243. doi: 10.1121/1.418246
  27. Zanartu, M., Mongeau, L., and Wodicka, G. R. (2007). “Influence of acoustic loading on an effective single mass model of the vocal folds,” J. Acoust. Soc. Am. 121, 1119–1129. doi: 10.1121/1.2409491
  28. Zhang, C., Zhao, W., Frankel, S. H., and Mongeau, L. (2002). “Computational aeroacoustics of phonation. Part II: Effects of flow parameters and ventricular folds,” J. Acoust. Soc. Am. 112, 2147–2154. doi: 10.1121/1.1506694
  29. Zhao, W., Zhang, C., Frankel, S. H., and Mongeau, L. (2002). “Computational aeroacoustics of phonation. Part I: Computational methods and sound generation mechanisms,” J. Acoust. Soc. Am. 112, 2134–2146. doi: 10.1121/1.1506693
  30. Zheng, X., Mittal, R., Xue, Q., and Bielamowicz, S. (2011). “Direct-numerical simulation of the glottal jet and vocal-fold dynamics in a three-dimensional laryngeal model,” J. Acoust. Soc. Am. 130, 404–415. doi: 10.1121/1.3592216

Articles from The Journal of the Acoustical Society of America are provided here courtesy of Acoustical Society of America
