Abstract
This paper presents measurements conducted in a physical model of the adult human airway. The goals of this work are to (1) benchmark the physical model to excised larynx models in the literature and (2) empirically demonstrate the relationship between vocal fold drag and sound production. Results from the airway model are first benchmarked to published time-averaged behavior of excised larynx models. The airway model in this work exhibited higher glottal volume flow, lower glottal resistance, and less fundamental frequency variation than excised larynx models. Next, concurrent measurements of source behavior and radiated sound were compared. Unsteady transglottal pressure (a surrogate measure for vocal fold drag) and radiated sound, measured at the mouth, showed good correlation. In particular, the standard deviation and the ratio of the power of the first and second harmonics of the transglottal and mouth pressures were strongly correlated. This empirical result supports the assertion that vocal fold drag is the principal source of sound in phonation.
I. INTRODUCTION
This paper describes characterization of the aeroacoustic source in a physical model of the human airway. The model consists of synthetic rubber vocal folds housed in an acrylic duct, designed to replicate the aeroelastic-aeroacoustic behavior of adult human phonation at life-scale. Direct measurements of transglottal pressure characterize the source strength, due to the equivalence between vocal fold drag and transglottal pressure force.
Phonatory sound is generated by unsteady airflow through the glottis. The airflow is driven by the lungs, initiating flow-induced vibration of the vocal folds, valving glottal airflow. The traditional description of phonatory sound production (Fant, 1960; Titze, 2000) considers the acoustic field in the vocal tract and outside the mouth to be driven by glottal volume flow, which appears as a volume (monopole) source. In contrast, more recent theoretical analyses using aeroacoustic theory (Howe and McGowan, 2010, 2011, 2012; McGowan and Howe, 2012; Suh and Frankel, 2007; Zhang et al., 2002a; Zhao et al., 2002) have identified the vocal fold drag force, a dipole, as the principal source of phonatory sound. Hirschberg (1992) has argued the essential equivalence of these approaches, in terms of predicting the radiated sound. However, aeroacoustic theoretical approaches provide a more complete picture because the analytical domain includes the subglottal airway as well as the larynx and the supraglottal airway. While many measurements have established a correlation between glottal volume flow and radiated sound [see, e.g., Holmberg et al. (1995) or Oren et al. (2015)], no similar experimental efforts have yet demonstrated a correlation between vocal fold drag and radiated sound.
Aeroacoustic theory also provides a means for distinguishing between periodic and broadband source contributions. Vortex sound theory shows how drag on the vocal folds can be expressed in terms of the path of the glottal jet, relative to the instantaneous shape of the vocal folds (Krane, 2005). It also indicates that the vocal fold drag source is composed of both periodic and broadband components. The large-scale behavior of the glottal jet is well-correlated with the vibration cycle, contributing a periodic component to the vocal fold drag. Glottal jet vortices are less well-correlated with the vibration cycle, and thus provide an aperiodic contribution to the drag source.
The vortices in the supraglottal space also directly generate sound as they interact with one another (quadrupoles) and with vocal tract walls (dipoles). Krane (2005) and Howe and McGowan (2005) presented theoretical descriptions of unvoiced sounds in terms of jet interactions with vocal tract walls. Nomura and Funada (2007) and Alipour et al. (2007) compared the acoustic output from computational and physical models, respectively, with and without false vocal folds. They found the interaction of the glottal jet and false vocal folds to contribute substantially to broadband sound generation. However, McGowan and Howe (2010) demonstrated little difference in sound generation in a computational model with and without false folds. Zhang et al. (2004) and Zhang and Mongeau (2006) characterized broadband sound in experiments of a rigid glottis and with driven vocal folds, respectively. Thus, glottal jet turbulence contributes both to the vocal fold drag dipole, as well as direct quadrupole sound radiation. The latter mechanism is generally much weaker than the former, and is thus not addressed in this paper.
Physical models provide a well-controlled, repeatable means to study aeroacoustic sources of phonation. These models are often used to study the relationship between pressure and flow for voice. Few studies have directly characterized the aeroacoustic sources of voice in physical models. Zhang et al. (2002b) measured the pressure drop across the glottis and the glottal volume flow but did not treat it as a direct measurement of the source strength. Lodermeyer et al. (2018) applied the Lighthill (1952) analogy to characterize the sound sources using experimentally measured velocity fields in a physical model. Direct measurements of the principal source, the time-varying vocal fold drag, have not been conducted, likely due to the technical challenges of measuring drag on a vibrating structure. As such, to date, no experimental characterization of vocal fold drag and its relationship to radiated sound has been conducted.
A challenge with physical models composed of synthetic rubber vocal folds is obtaining physiologically relevant vibration patterns. Early versions of these models exhibited large inferior-superior motion, lacked a mucosal wave, or did not exhibit the same subglottal pressure, volume flow, and fundamental frequency behavior as excised larynges (Mittal et al., 2013; Murray and Thomson, 2012). More recent physical models [see, e.g., Murray and Thomson (2012)] exhibit improvements in achieving life-like vocal fold vibration. While the models used in this work do not match all parameters of the physiological system, the same sound production mechanisms are at play: the self-oscillating folds generate sound by valving airflow through the model larynx.
The following describes the aeroacoustic source characterization of the Penn State airway model (PSAM). The PSAM is first benchmarked against other physical model and excised larynx results available in the literature. An approximate equivalence between the transglottal pressure force and the vocal fold drag force is demonstrated using a control volume approach and the usual assumptions about phonatory aerodynamics. This result motivates the use of transglottal pressure as a surrogate measure of the vocal fold drag source. Then, the correlation between concurrent measurements of source region fluctuations (transglottal pressure) and sound pressure outside the source region (at the mouth) is established.
II. MODEL AND METHODS
This section describes the PSAM and the measurement techniques and analysis used in this study. The collection and digitization of data from the literature is also described.
A. Penn State airway model
The PSAM, shown in Fig. 1(a), is an idealized-geometry model of the adult human airway at life-scale. A complete description of the model, including detailed mechanical drawings, is provided in Campo (2012). The model was designed so that the aeroelastic-aeroacoustic behavior occurs in the same regimes as in an adult human. The dimensions of the model follow the average values reported by Titze (2000). The model has a uniform, 7.78 cm² square cross-section with three regions that nominally correspond to the trachea (12.9 cm height), larynx (1.4 cm height), and vocal tract (17.2 cm height). The larynx section houses two stereolithography-fabricated brackets onto which silicone rubber vocal folds are molded. These folds are molded in a two-layer body-cover model, after Pickup and Thomson (2011). Vocal fold models were scaled to the dimensions of an adult male, using the “M5” shape by Scherer et al. (2001). Vocal folds were molded from silicone rubber (EcoFlex 0300, Smooth-On, Macungie, PA) using the technique of Riede et al. (2008) and Drechsel (2008). Material properties were controlled by the ratio of silicone oil thinner to the two-part EcoFlex mixture. The ratio of EcoFlex Part A to EcoFlex Part B to silicone oil was 1:1:2 for the body layer and 1:1:5 for the cover layer.
FIG. 1.
(Color online) (a) The assembled Penn State airway model (PSAM). (b) Schematics of the molding process for the two layer vocal fold models (Campo, 2012). (c) Schematic of the PSAM and measurement and control system.
Molding the model folds was done in two steps, depicted in Fig. 1(b). Ease Release 200 Spray (Smooth-On, Macungie, PA) was applied to the mold and exterior sides of the brackets. The body layer was molded first within the inside bracket, depicted in the top row of Fig. 1(b). Once cured, the inside bracket was inserted into the outside bracket as shown in the center row of Fig. 1(b). The silicone body and bracket assembly was then placed into the second mold to mold the cover layer [bottom row of Fig. 1(b)]. See Campo (2012) for further details of the vocal fold geometry and molding process.
Airway model walls are made of transparent acrylic for optical access. Aluminum corner pieces and PVC caps on the top and bottom of the model hold the acrylic walls in place. Vocal fold models are positioned using Del-Tron 101-SD-X micrometer positioning stages (Del-Tron Precision, Inc., Bethel, CT). The model is driven with pressurized shop air controlled with a pressure regulator (Control Air, Inc., Amherst, NH). A Tygon tube with an inner diameter of 5.08 cm and a length of 8.5 m connects the pressure regulator to the model. The outlet, or “mouth,” exits to the atmosphere. A wooden shroud was fit over the model to prevent outside disturbances from contaminating measurements and to provide a 61 cm × 61 cm square baffled outlet.
B. Data acquisition
A schematic of the measurement suite is shown in Fig. 1(c). Time-averaged flow rate is measured using an RMC106 flowmeter (Dwyer Instruments, Inc., Michigan City, IN) placed just downstream of the pressure regulator. Time-averaged subglottal pressure is measured with a micromanometer connected to a 3.175 mm diameter port at the inlet of the trachea section of the PSAM. Acoustic pressure at the mouth is measured by a 12.7 mm diameter model 2451 microphone (Larson-Davis, Depew, NY), the center of which is located 9.5 mm from the duct exit. Two XCS-093-5D pressure transducers (Kulite Semiconductor Products, Inc., Leonia, NJ) are placed on either side of the model vocal folds (0.95 cm from the vocal fold mounting bracket edges) to measure subglottal and supraglottal pressures. Transducer and microphone measurements were recorded with Wavebook 512 and 516 DAQ systems. High-speed video was taken with a MEMRECAM (Nac Image Technology, Simi Valley, CA) looking through the “mouth” of the model to record projected glottal open area. Microphone, pressure transducer, and camera samples were acquired at 22 kHz. Microphone and pressure transducer data were stored as text files and images were stored as 8-bit TIFFs. Manometer and flowmeter readings were made by eye and recorded during each run. Each run resulted in 200 000 pressure samples and 16 500 images.
Experiments were conducted in the anechoic chamber located in the Garfield Thomas Water Tunnel Building at the Applied Research Laboratory at Penn State. The chamber's dimensions are 9.3 m high by 5.5 m wide by 6.8 m deep and its interior surfaces are covered with 91 cm deep fiberglass wedges. The chamber meets the requirements of ISO 3745 (ANSI S12.55) and IEC 268 from 80 Hz to 12.5 kHz. The PSAM was placed in the center of the chamber on a wooden test stand.
The experimental procedure was as follows: (1) driving pressure was increased with the pressure regulator until regular vibration of the vocal folds was audibly observed, (2) a series of recordings was acquired at a number of driving pressures, increasing the driving pressure for each acquisition, until irregular vibration was observed audibly, then (3) another series of recordings was acquired at decreasing pressure increments, until vibration ceased. The same molded vocal folds were used for four consecutive days of testing.
C. Data reduction and uncertainty
Microphone, pressure transducer, and image recordings were post-processed with scripts written in MATLAB (The Mathworks Inc., Natick, MA). Images from the high-speed camera were post-processed to determine the projected glottal open area. The model vocal folds were illuminated from above with a lamp so that the superior surface of the folds appeared bright and the glottal gap appeared dark. All pixels below a threshold intensity level were counted as contributing to the projected glottal open area. The known geometry of the model fold brackets was used to estimate a calibration scale factor (pixels to millimeters). The scale factor was taken as the average of calibrations from different landmarks on the brackets. The range of these scale factors was used to estimate the bias uncertainty.
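The thresholding and calibration steps described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB script: the function names, the representation of an image as nested lists of 8-bit intensities, the expression of the scale factor in millimeters per pixel, and all numerical values are assumptions made for illustration.

```python
def projected_glottal_area(image, threshold, mm_per_pixel):
    """Count pixels darker than `threshold` and convert the count to mm^2.

    `image` is a list of rows of 8-bit intensities (hypothetical layout).
    """
    dark_pixels = sum(1 for row in image for px in row if px < threshold)
    return dark_pixels * mm_per_pixel ** 2


def scale_factor(calibrations):
    """Average the landmark calibrations; report their range as a bias estimate."""
    mean = sum(calibrations) / len(calibrations)
    spread = max(calibrations) - min(calibrations)
    return mean, spread


# Hypothetical 3 x 4 frame: bright fold surface (200) around a dark gap (10).
frame = [
    [200, 10, 10, 200],
    [200, 10, 10, 200],
    [200, 200, 200, 200],
]
# Hypothetical calibrations (mm per pixel) from three bracket landmarks.
factor, bias = scale_factor([0.09, 0.10, 0.11])
area_mm2 = projected_glottal_area(frame, threshold=50, mm_per_pixel=factor)
```

Repeating the area calculation over every frame of a run gives the time series of projected glottal area from which the mean and standard deviation are taken.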
Auto-power spectral densities were computed by averaging the auto-spectra of segments 4096 samples in length. A Hann function was applied to each segment with 50% overlap. Source peak frequency and magnitude were extracted by fitting a parabola to the three points defining the source peak. The maximum of the parabola provided the magnitude and the corresponding frequency was recorded. From these data the fundamental frequency f0 was extracted.
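The three-point parabolic fit can be sketched as follows. This is an illustrative sketch rather than the authors' code; it assumes uniformly spaced frequency bins and fits the spectral values as given, since the source does not state whether the fit was made on a linear or a decibel scale. The function name and test values are hypothetical. With the stated 22 kHz sampling rate and 4096-sample segments, the bin spacing is 22 000/4096 ≈ 5.4 Hz, which is why interpolation between bins is useful.

```python
def parabolic_peak(freqs, psd, k):
    """Fit a parabola through bins k-1, k, k+1 and return (peak_freq, peak_value).

    Assumes uniform bin spacing and that bin k is the local maximum.
    """
    a, b, c = psd[k - 1], psd[k], psd[k + 1]
    df = freqs[k + 1] - freqs[k]
    # Vertex offset from bin k, in units of the bin spacing.
    offset = 0.5 * (a - c) / (a - 2.0 * b + c)
    peak_freq = freqs[k] + offset * df
    peak_value = b - 0.25 * (a - c) * offset
    return peak_freq, peak_value


# Bin spacing implied by the acquisition settings in the text.
bin_spacing_hz = 22000.0 / 4096

# Hypothetical check: samples of y = 10 - (x - 0.3)^2 at x = -1, 0, 1.
peak_f, peak_v = parabolic_peak([-1.0, 0.0, 1.0], [8.31, 9.91, 9.51], 1)
```

For samples drawn from an exact parabola, as in the check above, the fit recovers the true vertex; for a measured spectrum it is an approximation whose quality depends on the window and bin spacing.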
The dimensionless measure H1-H2 was calculated as the ratio of the power of the first and second harmonic peaks of the auto-power spectral densities. The power was estimated by integrating twice the frequency bandwidth found at half the peak magnitude for each harmonic. This measure is associated with a breathy voice quality and has a strong relationship with open quotient, the proportion of the glottal cycle that the glottis is open (Hanson, 1995; Holmberg et al., 1995).
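One possible reading of this band-power estimate is sketched below in Python. It is an interpretation, not the authors' implementation: it assumes the integration band is centered on each harmonic peak and spans twice the full width at half maximum, resolves the half-maximum points only to the nearest bin, and uses trapezoidal integration. All names and the synthetic spectrum are hypothetical.

```python
def half_max_band_power(freqs, psd, peak_idx):
    """Integrate the PSD over a band twice as wide as the half-maximum bandwidth."""
    half = 0.5 * psd[peak_idx]
    lo = peak_idx
    while lo > 0 and psd[lo] > half:
        lo -= 1
    hi = peak_idx
    while hi < len(psd) - 1 and psd[hi] > half:
        hi += 1
    bandwidth = freqs[hi] - freqs[lo]  # bin-resolved width at half maximum
    f_lo = freqs[peak_idx] - bandwidth
    f_hi = freqs[peak_idx] + bandwidth
    power = 0.0
    for i in range(len(freqs) - 1):
        # Trapezoidal rule over bins lying entirely inside the band.
        if freqs[i] >= f_lo and freqs[i + 1] <= f_hi:
            power += 0.5 * (psd[i] + psd[i + 1]) * (freqs[i + 1] - freqs[i])
    return power


def h1_h2_ratio(freqs, psd, idx_h1, idx_h2):
    """Ratio of first-harmonic to second-harmonic band power."""
    return (half_max_band_power(freqs, psd, idx_h1)
            / half_max_band_power(freqs, psd, idx_h2))


# Hypothetical spectrum: two triangular peaks of the same shape, 4:1 in height.
freqs = [float(f) for f in range(31)]
psd = [0.0] * 31
psd[9], psd[10], psd[11] = 2.0, 4.0, 2.0
psd[19], psd[20], psd[21] = 0.5, 1.0, 0.5
ratio = h1_h2_ratio(freqs, psd, 10, 20)
```

Because the two synthetic peaks share a shape, the power ratio equals their height ratio; measured harmonics generally differ in width, which is the reason for integrating over a band rather than comparing peak values.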
The precision of the transducer and microphone measurements was used to estimate the bias uncertainty. The precision of the measurements made by eye was used as the total uncertainty. Random uncertainties were calculated with the expressions provided in Benedict and Gould (1996) for statistical quantities.
D. Data digitization from literature
Papers were collected from the literature that used self-oscillating physical models of near human anatomical size. The collected dataset (see Table I for references) is composed of excised human larynges, excised canine larynges, and rubber vocal fold models. The measured variables used for comparison were fundamental frequency, average volume flow, and average subglottal pressure. Data were digitized from these works with Engauge, an open-source plot digitizing software package. Points were manually picked in the software. Overlapping data points were often difficult to distinguish, and so nearest neighbor interpolation was used to combine data sets that originated in two different plots. For example, many papers presented two of these variables in one plot and f0 versus one of them in a separate plot. Nearest neighbor interpolation was then used to assemble a complete dataset of all three variables. Most papers presented tests of a number of different models or different model configurations. For example, Drechsel (2008) presented results from different rubber vocal fold designs and Alipour et al. (2007) varied the supraglottal structures of an excised larynx. For visual clarity, all results from a given reference are displayed using a common symbol. Some datasets were down-sampled to improve the clarity of the plots: Drechsel (2008) was down-sampled by a factor of 2, Alipour et al. (2007) by a factor of 5, and Alipour et al. (2009) by a factor of 10.
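The nearest neighbor merging step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two digitized plots are taken to share one variable on the horizontal axis, the function name is hypothetical, and the sample points are invented rather than taken from any cited paper.

```python
def merge_nearest(primary, secondary):
    """Attach to each (x, y) point the z value of the secondary (x, z) point
    whose x is closest, giving (x, y, z) triples."""
    merged = []
    for x, y in primary:
        _, z = min(secondary, key=lambda point: abs(point[0] - x))
        merged.append((x, y, z))
    return merged


# Hypothetical digitized points from two plots that share the x variable.
plot_one = [(1.0, 100.0), (2.0, 110.0)]
plot_two = [(0.98, 0.5), (2.05, 0.9), (3.0, 1.2)]
combined = merge_nearest(plot_one, plot_two)
```

Digitizing error means the shared coordinate rarely matches exactly between plots, so matching on the closest value is a pragmatic way to pair the points.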
TABLE I.
Papers that were digitized for comparison in Fig. 2.
| Paper | Model description |
|---|---|
| PSAM | Molded rubber folds |
| Van den Berg and Tan (1959) | Excised human larynx (1) |
| Baer (1975) | Excised human larynx (1) |
| Jiang and Titze (1993) | Excised canine larynx (1) |
| Alipour et al. (2007) | Excised canine larynx (1) |
| Drechsel (2008) | Molded rubber folds<sup>a</sup> |
| Alipour et al. (2009) | Excised canine larynx (1) |
| Murray and Thomson (2012) | Molded rubber folds |
| Alipour et al. (2013) | Excised human larynx (1) |
| Oren et al. (2014a,b, 2015, 2016)<sup>b</sup> | Excised canine larynges (16) |

<sup>a</sup> Model 34 was not included due to the large subglottal pressure and flow rate.
<sup>b</sup> Parts of this dataset appear in these publications.
III. RESULTS AND DISCUSSION
Experiments were conducted in the PSAM, over four consecutive days, with the same vocal fold models. The PSAM exhibited behavior consistent with a breathy voice, as the model vocal folds were blown open and never completely closed. Pressure measurements are shown for the four testing days and imaging results are presented for three of the days. Uncertainty bars are plotted for just one dataset on each plot for clarity. Little difference was found in uncertainty across different testing days. Uncertainty bars are omitted when the uncertainty in the variable is smaller than the marker size.
A. Benchmarking to other benchtop models
Results from the PSAM are first compared to others in the literature. For our purposes, excised larynx models serve as a gold standard to assess the performance of molded rubber vocal folds. Figure 2(a) shows average subglottal pressure versus time-averaged glottal volume flow $\bar{Q}$. A line above a symbol denotes a time-averaged quantity. Generally, the individual datasets appear linear in $\bar{Q}$, with a large variation in their slopes. Lines are included for reference with slopes, or resistances, of R = 1, 2.5, and 10 (kPa)/(L/s). The rubber vocal fold models used in this paper and in Drechsel (2008) and Murray and Thomson (2012) show lower resistances, lying roughly between R = 1 and 2.5 (kPa)/(L/s), whereas the majority of the excised larynx models show resistances between R = 2.5 and 10 (kPa)/(L/s).
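As a small worked illustration of the resistance comparison, the Python sketch below computes a glottal resistance as the ratio of average subglottal pressure to average volume flow and locates it relative to the reference values quoted in the text. The function names and the operating points are hypothetical and are not measured values from any of the cited datasets.

```python
# Reference resistances quoted in the text, in kPa/(L/s).
REFERENCE_RESISTANCES = (1.0, 2.5, 10.0)


def glottal_resistance(p_sub_kpa, flow_l_per_s):
    """Resistance as average subglottal pressure over average volume flow."""
    if flow_l_per_s <= 0:
        raise ValueError("volume flow must be positive")
    return p_sub_kpa / flow_l_per_s


def resistance_band(resistance):
    """Return the pair of reference values bracketing `resistance`
    (None marks an open end)."""
    lower = None
    for ref in REFERENCE_RESISTANCES:
        if resistance < ref:
            return (lower, ref)
        lower = ref
    return (lower, None)


# Hypothetical operating point: 2.0 kPa at 1.0 L/s.
r = glottal_resistance(2.0, 1.0)
band = resistance_band(r)
```

A point with R = 2.0 (kPa)/(L/s) falls in the band where the text places the rubber models, while a point with R = 5 (kPa)/(L/s) would fall in the band occupied by most excised larynges.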
FIG. 2.
(Color online) Comparison of data collected from the literature (Table I) and the PSAM. (a) Average subglottal pressure versus average volume flow. (b) Fundamental frequency versus average volume flow.
Similar results are found in the plots of fundamental frequency f0 versus volume flow rate in Fig. 2(b). The data from excised larynges and rubber model vocal folds clustered amongst themselves. The excised larynges generally show a strong frequency dependence with glottal volume flow. The rubber vocal fold models of Drechsel (2008) and the current study, both based on the M5 shape of Scherer et al. (2001), show a weak frequency dependence. The magnetic resonance imaging (MRI) derived models of Murray and Thomson (2012) show frequency dependence comparable to the excised larynges.
B. Statistics and spectral measures
Images of the superior surface of the PSAM model vocal folds over one cycle of vibration are shown in Fig. 3(a). The black region shows the projected glottal area. The vocal folds were blown superiorly and vibrated while always remaining open, resulting in a breathy sounding voice.
FIG. 3.
(a) Images of the superior surface of the two-layer M5 model vocal folds in the PSAM. Images are shown for one cycle of vibration with time t noted relative to the period of vibration T. (b),(c) Statistics of the projected glottal area versus time-averaged subglottal pressure. (b) Time-averaged area and (c) standard deviation of the area. Symbols denote testing days 1, 2, and 4 (day 4: ×).
The average and standard deviation of the projected glottal area Sg both increase linearly with average subglottal pressure, as shown in Figs. 3(b) and 3(c). The standard deviation of Sg is roughly 10% of the time-averaged Sg. The day-1 measurement at the highest subglottal pressure exhibited attenuated vibration amplitude, as shown in Fig. 3(b). For this subglottal pressure, the vibration was aperiodic, whereas all other measurements showed periodic vibration. It is worth noting that the maximum glottal opening was always less than 10% of the airway cross-sectional area (7.78 cm²).
Time-averaged pressure transducer recordings are shown in Fig. 4 as a function of the average subglottal pressure recorded with the micro-manometer. The subglottal pressure recorded by the micro-manometer tracks well with the average pressure recorded by the subglottal pressure transducer. The average supraglottal pressure is negative and decreases with increasing subglottal pressure. This result indicates the glottal jet has not diffused completely at the axial location of the supraglottal pressure transducer. As a result, the transglottal pressure, calculated from the difference of the subglottal and supraglottal pressures, is higher than the subglottal pressure. Excellent agreement is found in the time-averaged pressures across the four days of recording.
FIG. 4.
Time-averaged pressure transducer measurements of subglottal, supraglottal, and transglottal pressure vs time-averaged subglottal pressure measured with a micro-manometer. Symbols denote testing days 1 through 4 (day 3: □; day 4: ×).
Example auto-power spectral densities of the subglottal, supraglottal, and transglottal pressure recordings are shown in Fig. 5(a). These data were recorded on day 4 at a subglottal pressure of 2.7 kPa. The supraglottal sensor shows higher magnitude than the subglottal sensor across all non-zero frequencies, although the two sensors show comparable source peak magnitudes. As a result, the transglottal pressure spectrum closely follows the supraglottal spectrum. The higher broadband content measured by the supraglottal transducer is primarily due to near-field pressure fluctuations of the turbulent glottal jet.
FIG. 5.
(Color online) Power-spectral densities (PSD) from day-4 at a subglottal pressure of 2.7 kPa. (a) PSD of the subglottal, supraglottal, and transglottal pressures. (b) PSD of the mouth pressure and glottal area.
Figure 5(b) shows the auto-power spectral densities of the mouth pressure pm and glottal open area Sg. The fundamental frequency and harmonics of the glottal open area correspond well with those in the mouth pressure. The broad formant peaks from the vocal tract are observed in the mouth pressure spectrum at ∼500 and ∼1500 Hz.
IV. SOURCE CHARACTERIZATION
A. Equivalence of transglottal pressure force and vocal fold drag as measures of source strength
Previous works have identified vocal fold aerodynamic drag as the principal sound source in phonation (Howe and McGowan, 2010, 2011, 2012; McGowan and Howe, 2012; Suh and Frankel, 2007; Zhang et al., 2002a; Zhao et al., 2002). This section demonstrates the approximate equivalence of the axial transglottal pressure force and the vocal fold drag force using a control volume analysis. This approximate equivalence has been demonstrated indirectly through solutions to aeroacoustic analogies (Hofmans, 1998; Zhang et al., 2002b).
The equations of motion for the air in the larynx are written and reduced with the usual assumptions about glottal aerodynamics. The problem under consideration is sketched in Fig. 6(a) with details of the airflow topology within the larynx sketched in Fig. 6(b). The flow passes through the larynx from left to right, separates from the vocal folds at x1 = xS and forms the glottal jet. The flow is assumed uniform and axial at the entrance x1 = xA and exit x1 = xD of the control volume.
FIG. 6.
(Color online) (a) Schematic of the airway: trachea (xL < x1 < xA), larynx (xA < x1 < xD), vocal tract (xD < x1 < xm), and exterior (x1 > xm). (b) Detail of the air motion in the larynx. Flow is from left to right. Vocal folds located xA < x1 < xC. Glottal jet separates from vocal folds at xS. Boundary of control volume used for analysis shown by dashed line.
The transglottal pressure force and vocal fold drag force are related using the integral form of the momentum equation along the axial (x1) direction for the air in the larynx:
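$$
\frac{d}{dt}\int_V \rho u_1\,dV + \int_S \rho u_1 (u_1 - w_1)\,n_1\,dS + \int_S \rho u_1 (u_2 - w_2)\,n_2\,dS + \int_S \rho u_1 (u_3 - w_3)\,n_3\,dS = -\int_S p\,n_1\,dS + F_{fD}. \tag{1}
$$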
As shown above, the model folds never close, and so Eq. (1) holds throughout a full cycle of vibration. The terms on the left hand side of this equation sum to the rate of change of momentum of the air in V. Here, ui is the air velocity, wi the velocity of the control volume boundary (such as the vocal fold wall) and ni the outward-directed unit normal vector indicating local control surface orientation. Subscripts i refer to the coordinate directions shown in Fig. 6. From left to right, the terms are: the rate of accumulation of axial momentum in the control volume (unsteady acceleration), the net flux of axial momentum out of the control volume (convective acceleration) by axial flow, and two terms representing the net flux of axial momentum out of the control volume by transverse and out of plane flow, respectively. The right-hand side of the equation is the net axial force acting on the air in the laryngeal control volume, composed of, from left to right, a net pressure force where p is the pressure and a net friction force FfD.
The convective acceleration term has three parts, corresponding to the momentum fluxes through the inlet (SA), the outlet (SD), and the walls (Swalls). On the walls u1 − w1 = u2 − w2 = u3 − w3 = 0, so the contribution of these momentum flux integrals is zero. The entrance and exit control surfaces are stationary, giving w1 = w2 = w3 = 0 on these surfaces. The transverse and out of plane components of velocity, u2 and u3, respectively, are zero at the entrance and exit control surfaces because uniform axial motion is assumed at these locations. The net axial convective acceleration is then the difference of the momentum fluxes at the outlet and inlet, $\int_{S_D} \rho u_1^2\,dS - \int_{S_A} \rho u_1^2\,dS$. Assuming uniform axial motion and matching areas S = SA = SD at xA and xD, the net axial convective acceleration is zero.
The net pressure force is a combination of several effects. The wall pressure distribution decreases monotonically along the walls in the flow direction until the separation point, reflecting the spatial acceleration of incompressible airflow through the compliant constriction. The pressure force term can be decomposed into the contributions from pressure on the vocal fold walls, and from the inlet and outlet
$$
-\int_S p\,n_1\,dS = F_P + F_{pD}, \tag{2}
$$
where FP = (pA − pD)S is the transglottal pressure force imparted on the air in the larynx by the air motion in the trachea and the vocal tract, pA = p(xA), pD = p(xD), and FpD is the pressure drag on the vocal folds. The friction force FfD is much smaller than the pressure drag, except when the glottis is very narrow (Deverge et al., 2003; Krane and Wei, 2006; Vilain et al., 2004).
Equation (1) is rewritten in Eq. (3) by neglecting the convective acceleration and combining the friction drag FfD and pressure drag FpD into a total aerodynamic drag FD = FpD + FfD:
$$
\frac{d}{dt}\int_V \rho u_1\,dV = F_P + F_D. \tag{3}
$$
Estimating the order of magnitude (OOM) of the terms in Eq. (3) facilitates our analysis of the forces acting on the air in the larynx. More precisely, the OOM of the fluctuating components is estimated to determine their relevance to sound generation. First, a Reynolds decomposition is applied to Eq. (3), which is then time-averaged, yielding the mean momentum equation for laryngeal air flow:
$$
\overline{\frac{d}{dt}\int_V \rho u_1\,dV} = \overline{F}_P + \overline{F}_D, \tag{4}
$$
where a line above a term denotes a time-average. Subtracting Eq. (4) from Eq. (3) then yields an equation describing departures of the forces from the mean:
$$
\left(\frac{d}{dt}\int_V \rho u_1\,dV\right)' = F_P' + F_D', \tag{5}
$$
where fluctuating quantities are denoted with a prime.
The rate of change of momentum in the control volume has OOM:
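$$
\frac{d}{dt}\int_V \rho u_1\,dV \sim O\!\left(\frac{\rho L Q}{T_o}\right) = O\!\left(\frac{\rho L u_j S_j}{T_o}\right), \tag{6}
$$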
where L = xD − xA is the length scale of the glottis in Fig. 6, Q is the volume flow, uj is the glottal jet velocity scale at the minimum glottal area Sj, and To is the time that the glottis is open during a cycle of vibration. The OOM of the rate of change of momentum is estimated relative to the OOM of the transglottal pressure force FP, as in Krane and Wei (2006). Equation (6) and the OOM of FP are first expanded into steady and fluctuating components:
| (7) |
| (8) |
The ratio of the fluctuating components of Eq. (7) to Eq. (8) is given by
| (9) |
The first two terms on the right hand side of Eq. (9) rely only on steady scales, and combine to $O(10^{-3})$. The third term is $O(1)$ because the numerator and denominator have similar order. These estimates imply that the integral momentum equation for laryngeal airflow reduces to the sum of the total aerodynamic drag force FD and the external (transglottal) pressure force FP in both steady and fluctuating components. Thus Eq. (3) reduces to
$$
F_P + F_D \approx 0. \tag{10}
$$
Equation (10) states that the force of the air outside the laryngeal control volume (transglottal pressure force) is balanced by aerodynamic drag on the vocal folds, and that the inertia of the air in the control volume has a negligible role in this relationship. This equivalence of FP and FD holds in both steady and fluctuating forces. The results in Sec. IV B report FP/S = pA − pD and use this as a surrogate for FD, the principal sound source.
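The surrogate measure described above can be illustrated with a short Python sketch that forms the transglottal pressure from subglottal and supraglottal samples and splits it into mean and fluctuating parts. The function names and the four-sample pressure records are hypothetical and serve only to show the arithmetic; they are not measured data.

```python
import math


def transglottal_pressure(p_sub, p_supra):
    """Sample-by-sample difference p_A - p_D, i.e., F_P / S."""
    return [a - d for a, d in zip(p_sub, p_supra)]


def reynolds_decompose(samples):
    """Split a record into its time average and the fluctuation about it."""
    mean = sum(samples) / len(samples)
    return mean, [s - mean for s in samples]


def std_dev(samples):
    """Population standard deviation of a record."""
    _, fluct = reynolds_decompose(samples)
    return math.sqrt(sum(f * f for f in fluct) / len(fluct))


# Hypothetical pressure records in kPa (supraglottal mean is negative).
p_sub = [2.0, 2.2, 2.0, 1.8]
p_supra = [-0.2, -0.4, -0.2, 0.0]
p_tg = transglottal_pressure(p_sub, p_supra)
p_tg_mean, p_tg_fluct = reynolds_decompose(p_tg)
p_tg_std = std_dev(p_tg)
```

In this example the negative mean supraglottal pressure makes the mean transglottal pressure (2.2 kPa) exceed the mean subglottal pressure (2.0 kPa), mirroring the behavior reported for Fig. 4.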
B. Source results
This section characterizes the relationship between the aeroacoustic source, measured with the transglottal pressure, and the acoustic pressure outside the source region. The standard deviations of the transglottal and mouth pressures are shown in Figs. 7(a) and 7(b), respectively. Both show a linear trend with average transglottal pressure and a slight decrease in standard deviation over the days of testing. This decrease is likely due to changes in model mechanical properties and/or wear of the molded vocal fold models.
FIG. 7.
Characterization of the aeroacoustic source. Standard deviation of the (a) transglottal pressure and the (b) mouth pressure versus time-averaged transglottal pressure. (c) Standard deviation of the mouth pressure versus standard deviation of the transglottal pressure. H1-H2 of the (d) transglottal pressure and the (e) mouth pressure versus time-averaged transglottal pressure. (f) H1-H2 of the mouth pressure versus the transglottal pressure. Symbols denote testing days 1 through 4 (day 3: □; day 4: ×).
A strong correlation is found between the standard deviations of mouth pressure and transglottal pressure, shown in Fig. 7(c). Furthermore, there is a marked reduction in the scatter observed in Fig. 7(c) compared to that in Figs. 7(a) and 7(b). This relationship between the transglottal pressure and the acoustic pressure outside the source region supports the claim that fluctuating vocal fold drag, as estimated by transglottal pressure, is the primary source of sound.
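One way to quantify such a correlation across runs is the Pearson correlation coefficient, sketched below in Python. The source does not state which statistic, if any, was computed, so this choice is an assumption, and the per-run values are invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-run standard deviations (arbitrary units).
sigma_transglottal = [0.10, 0.15, 0.22, 0.30, 0.41]
sigma_mouth = [0.011, 0.016, 0.021, 0.031, 0.040]
r_value = pearson_r(sigma_transglottal, sigma_mouth)
```

A value of r close to 1 indicates that runs with larger transglottal pressure fluctuations also radiate proportionally larger mouth pressure fluctuations, which is the kind of relationship shown in Fig. 7(c).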
Figures 7(d) and 7(e) show the H1-H2 for the transglottal and mouth pressure, respectively. An increase in H1-H2 is associated with a breathier voice quality (Hillenbrand et al., 1994). The measure is used here to demonstrate the relationship between spectral characteristics of the transglottal and mouth pressure signals. Both of these H1-H2 results show an increase with measurement days, another indication that the model folds changed as the test progressed. In spite of these changes, there is a strong correlation between the H1-H2 measures of the mouth pressure and transglottal pressure, as shown in Fig. 7(f), with a reduction in scatter relative to Figs. 7(d) and 7(e), similar to the reduction seen in Figs. 7(a)–7(c). This further supports the claim that the fluctuating transglottal pressure is a measure of the principal aeroacoustic source strength.
V. SUMMARY
This paper reports the aeroacoustic characterization of two-layer rubber model vocal folds in the PSAM. A comparison of the time-averaged behavior (fundamental frequency, glottal volume flow, and subglottal pressure) was made between the PSAM and other self-oscillating, human scale physical models in the literature. The rubber vocal folds used in this study exhibited higher glottal volume flow, lower average glottal resistance, and less fundamental frequency variation than the excised larynx models.
The strength of the principal aeroacoustic source was characterized in the PSAM. The primary finding is the correlation between concurrent measurements of the aeroacoustic source strength, measured with transglottal pressure, and acoustic pressure measurements outside the source region (mouth pressure). This finding supports the assertion that vocal fold drag is the principal phonatory sound source.
ACKNOWLEDGMENTS
The authors acknowledge financial support from the National Institutes of Health (NIH), Grant No. R01 DC005642-13. M.J.M. acknowledges support from the Department of Biomedical Engineering at Penn State. The authors thank Dr. Jesse Belden at the Naval Undersea Warfare Center for lending the high-speed camera.
References
- 1. Alipour, F., Finnegan, E. M., and Jaiswal, S. (2013). “Phonatory characteristics of the excised human larynx in comparison to other species,” J. Voice 27(4), 441–447. doi:10.1016/j.jvoice.2013.03.013
- 2. Alipour, F., Finnegan, E. M., and Scherer, R. C. (2009). “Aerodynamic and acoustic effects of abrupt frequency changes in excised larynges,” J. Speech, Lang., Hear. Res. 52(2), 465–481. doi:10.1044/1092-4388(2008/07-0212)
- 3. Alipour, F., Jaiswal, S., and Finnegan, E. (2007). “Aerodynamic and acoustic effects of false vocal folds and epiglottis in excised larynx models,” Ann. Otol., Rhinol. Laryngol. 116(2), 135–144. doi:10.1177/000348940711600210
- 4. Baer, T. (1975). “Investigation of phonation using excised larynxes,” Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA.
- 5. Benedict, L., and Gould, R. (1996). “Towards better uncertainty estimates for turbulence statistics,” Exp. Fluids 22(2), 129–136. doi:10.1007/s003480050030
- 6. Campo, E. (2012). “The effect of vocal fold geometry on the fluid structure acoustic interactions in an experimental model of the human airway,” Master's thesis, The Pennsylvania State University, State College, PA.
- 7. Deverge, M., Pelorson, X., Vilain, C., Lagrée, P.-Y., Chentouf, F., Willems, J., and Hirschberg, A. (2003). “Influence of collision on the flow through in-vitro rigid models of the vocal folds,” J. Acoust. Soc. Am. 114(6), 3354–3362. doi:10.1121/1.1625933
- 8. Drechsel, J. S. (2008). “Characterization of synthetic, self-oscillating vocal fold models,” Master's thesis, Brigham Young University-Provo, Provo, UT.
- 9. Fant, G. (1960). Acoustic Theory of Voice Production (Mouton, The Hague).
- 10. Hanson, H. M. (1995). “Glottal characteristics of female speakers,” Ph.D. thesis, Harvard University, Cambridge, MA.
- 11. Hillenbrand, J., Cleveland, R. A., and Erickson, R. L. (1994). “Acoustic correlates of breathy vocal quality,” J. Speech, Lang., Hear. Res. 37(4), 769–778. doi:10.1044/jshr.3704.769
- 12. Hirschberg, A. (1992). “Some fluid dynamic aspects of speech,” Bull. Commun. Parlée no. 2, 7–30.
- 13. Hofmans, G. C. J. (1998). “Vortex sound in confined flows,” Technische Universiteit Eindhoven, Eindhoven. doi:10.6100/IR514917
- 14. Holmberg, E. B., Hillman, R. E., Perkell, J. S., Guiod, P. C., and Goldman, S. L. (1995). “Comparisons among aerodynamic, electroglottographic, and acoustic spectral measures of female voice,” J. Speech, Lang., Hear. Res. 38(6), 1212–1223. doi:10.1044/jshr.3806.1212
- 15. Howe, M., and McGowan, R. (2010). “On the single-mass model of the vocal folds,” Fluid Dyn. Res. 42(1), 015001. doi:10.1088/0169-5983/42/1/015001
- 16. Howe, M., and McGowan, R. (2011). “On the generalised Fant equation,” J. Sound Vib. 330(13), 3123–3140. doi:10.1016/j.jsv.2011.01.017
- 17. Howe, M., and McGowan, R. (2012). “On the role of glottis-interior sources in the production of voiced sound,” J. Acoust. Soc. Am. 131(2), 1391–1400. doi:10.1121/1.3672655
- 18. Howe, M., and McGowan, R. S. (2005). “Aeroacoustics of [s],” Proc. R. Soc. London, Ser. A 461, 1005–1028. doi:10.1098/rspa.2004.1405
- 19. Jiang, J. J., and Titze, I. R. (1993). “A methodological study of hemilaryngeal phonation,” Laryngoscope 103(8), 872–882. doi:10.1288/00005537-199308000-00008
- 20. Krane, M. H. (2005). “Aeroacoustic production of low-frequency unvoiced speech sounds,” J. Acoust. Soc. Am. 118(1), 410–427. doi:10.1121/1.1862251
- 21. Krane, M. H., and Wei, T. (2006). “Theoretical assessment of unsteady aerodynamic effects in phonation,” J. Acoust. Soc. Am. 120(3), 1578–1588. doi:10.1121/1.2215408
- 22. Lighthill, M. J. (1952). “On sound generated aerodynamically I. General theory,” Proc. R. Soc. London, Ser. A 211(1107), 564–587. doi:10.1098/rspa.1952.0060
- 23. Lodermeyer, A., Tautz, M., Becker, S., Döllinger, M., Birk, V., and Kniesburges, S. (2018). “Aeroacoustic analysis of the human phonation process based on a hybrid acoustic PIV approach,” Exp. Fluids 59(1), 13. doi:10.1007/s00348-017-2469-9
- 24. McGowan, R. S., and Howe, M. S. (2010). “Influence of the ventricular folds on a voice source with specified vocal fold motion,” J. Acoust. Soc. Am. 127(3), 1519–1527. doi:10.1121/1.3299200
- 25. McGowan, R. S., and Howe, M. S. (2012). “Source-tract interaction with prescribed vocal fold motion,” J. Acoust. Soc. Am. 131(4), 2999–3016. doi:10.1121/1.3685824
- 26. Mittal, R., Erath, B. D., and Plesniak, M. W. (2013). “Fluid dynamics of human phonation and speech,” Annu. Rev. Fluid Mech. 45, 437–467. doi:10.1146/annurev-fluid-011212-140636
- 27. Murray, P. R., and Thomson, S. L. (2012). “Vibratory responses of synthetic, self-oscillating vocal fold models,” J. Acoust. Soc. Am. 132(5), 3428–3438. doi:10.1121/1.4754551
- 28. Nomura, H., and Funada, T. (2007). “Effects of the false vocal folds on sound generation by an unsteady glottal jet through rigid wall model of the larynx,” Acoust. Sci. Technol. 28(6), 403–412. doi:10.1250/ast.28.403
- 29. Oren, L., Khosla, S., Dembinski, D., Ying, J., and Gutmark, E. (2015). “Direct measurement of planar flow rate in an excised canine larynx model,” Laryngoscope 125(2), 383–388. doi:10.1002/lary.24866
- 30. Oren, L., Khosla, S., and Gutmark, E. (2014a). “Intraglottal geometry and velocity measurements in canine larynges,” J. Acoust. Soc. Am. 135(1), 380–388. doi:10.1121/1.4837222
- 31. Oren, L., Khosla, S., and Gutmark, E. (2014b). “Intraglottal pressure distribution computed from empirical velocity data in canine larynx,” J. Biomech. 47(6), 1287–1293. doi:10.1016/j.jbiomech.2014.02.023
- 32. Oren, L., Khosla, S., and Gutmark, E. (2016). “Effect of vocal fold asymmetries on glottal flow,” Laryngoscope 126(11), 2534–2538. doi:10.1002/lary.25948
- 33. Pickup, B. A., and Thomson, S. L. (2011). “Identification of geometric parameters influencing the flow-induced vibration of a two-layer self-oscillating computational vocal fold model,” J. Acoust. Soc. Am. 129(4), 2121–2132. doi:10.1121/1.3557046
- 34. Riede, T., Tokuda, I. T., Munger, J. B., and Thomson, S. L. (2008). “Mammalian laryngeal air sacs add variability to the vocal tract impedance: Physical and computational modeling,” J. Acoust. Soc. Am. 124(1), 634–647. doi:10.1121/1.2924125
- 35. Scherer, R. C., Shinwari, D., De Witt, K. J., Zhang, C., Kucinschi, B. R., and Afjeh, A. A. (2001). “Intraglottal pressure profiles for a symmetric and oblique glottis with a divergence angle of 10 degrees,” J. Acoust. Soc. Am. 109(4), 1616–1630. doi:10.1121/1.1333420
- 36. Suh, J., and Frankel, S. H. (2007). “Numerical simulation of turbulence transition and sound radiation for flow through a rigid glottal model,” J. Acoust. Soc. Am. 121(6), 3728–3739. doi:10.1121/1.2723646
- 37. Titze, I. R. (2000). Principles of Voice Production (National Center for Voice and Speech, Iowa City, IA).
- 38. Van den Berg, J., and Tan, T. (1959). “Results of experiments with human larynxes,” ORL 21(6), 425–450. doi:10.1159/000274240
- 39. Vilain, C., Pelorson, X., Fraysse, C., Deverge, M., Hirschberg, A., and Willems, J. (2004). “Experimental validation of a quasi-steady theory for the flow through the glottis,” J. Sound Vib. 276(3-5), 475–490. doi:10.1016/j.jsv.2003.07.035
- 40. Zhang, C., Zhao, W., Frankel, S. H., and Mongeau, L. (2002a). “Computational aeroacoustics of phonation, Part II: Effects of flow parameters and ventricular folds,” J. Acoust. Soc. Am. 112(5), 2147–2154. doi:10.1121/1.1506694
- 41. Zhang, Z., Mongeau, L., and Frankel, S. H. (2002b). “Experimental verification of the quasi-steady approximation for aerodynamic sound generation by pulsating jets in tubes,” J. Acoust. Soc. Am. 112(4), 1652–1663. doi:10.1121/1.1506159
- 42. Zhang, Z., Mongeau, L., Frankel, S. H., Thomson, S., and Park, J. B. (2004). “Sound generation by steady flow through glottis-shaped orifices,” J. Acoust. Soc. Am. 116(3), 1720–1728. doi:10.1121/1.1779331
- 43. Zhang, Z., and Mongeau, L. G. (2006). “Broadband sound generation by confined pulsating jets in a mechanical model of the human larynx,” J. Acoust. Soc. Am. 119(6), 3995–4005. doi:10.1121/1.2195268
- 44. Zhao, W., Zhang, C., Frankel, S. H., and Mongeau, L. (2002). “Computational aeroacoustics of phonation, Part I: Computational methods and sound generation mechanisms,” J. Acoust. Soc. Am. 112(5), 2134–2146. doi:10.1121/1.1506693







