Optimization of Synthetic Vocal Fold Models for Glottal Closure

Cassandra J Taylor; Scott L Thomson

doi:10.1115/1.4054194

2022 Apr 27;5(3):031106.


PMCID: PMC9132011 PMID: 35832120

Abstract

Synthetic, self-oscillating models of the human vocal folds are used to study the complex and inter-related flow, structure, and acoustical aspects of voice production. The vocal folds typically collide during each cycle, thereby creating a brief period of glottal closure that has important implications for flow, acoustic, and motion-related outcomes. Many previous synthetic models, however, have been limited by incomplete glottal closure during vibration. In this study, a low-fidelity, two-dimensional, multilayer finite element model of vocal fold flow-induced vibration was coupled with a custom genetic algorithm optimization code to determine geometric and material characteristics that would be expected to yield physiologically-realistic frequency and closed quotient values. The optimization process yielded computational models that vibrated with favorable frequency and closed quotient characteristics. A tradeoff was observed between frequency and closed quotient. A synthetic, self-oscillating vocal fold model with geometric and material properties informed by the simulation outcomes was fabricated and tested for onset pressure, oscillation frequency, and closed quotient. The synthetic model successfully vibrated at a realistic frequency and exhibited a nonzero closed quotient. The methodology described in this study provides potential direction for fabricating synthetic models using isotropic silicone materials that can be designed to vibrate with physiologically-realistic frequencies and closed quotient values. The results also show the potential for a low-fidelity model optimization approach to be used to tune synthetic vocal fold model characteristics for specific vibratory outcomes.

1 Introduction

An important characteristic of typical human phonation is the contact of the medial surfaces of opposing vocal folds (VFs) that occurs during a portion of each period. This portion of the period is referred to as the closed phase. The ratio of the closed phase duration to the overall period is known as the closed quotient (CQ). The CQ and its complement, the open quotient (OQ, where OQ = 1 − CQ), are commonly used parameters in classifying VF vibration properties (e.g., Refs. [1,2]). The CQ is important in normal phonation because of its role in influencing the spectral content of radiated sound (e.g., Ref. [3]). Normal values for the CQ vary, with average values found in the literature ranging from 0.22 to 0.53 [2,4,5]. While CQ values vary, the commonality remains that the glottis is typically closed for a significant amount of time during each cycle, and this closure has important implications regarding acoustic output [4,6].

Synthetic VF models are often used to study aspects of voice biomechanics because they can be parameterized, adjusted, and readily observed. Although synthetic VF models are capable of self-sustained vibration with favorable lifelike characteristics, consistently creating models with adequate CQ values has been relatively elusive. For example, Murray and Thomson [7,8] created and tested a synthetic, so-called “EPI” VF model that vibrated favorably compared to the human VFs in terms of frequency, onset pressure, and mucosal wave properties. However, the OQ for the EPI model was approximately one [9], meaning the model glottis never closed. After augmentation injections, the OQ was smaller; however, it never reached an adequate value.

Using a computational VF model with anisotropic material properties, Zhang [3] showed that medial surface length and VF stiffness are important parameters in achieving an adequate CQ. Xuan and Zhang [10] showed that introducing fibers and/or adding a stiffer outer epithelial layer could facilitate glottal closure in synthetic VF models. However, consistently obtaining glottal closure using isotropic synthetic VF models has thus far proven to be somewhat elusive, with closure being obtained in some cases with high subglottal pressures (e.g., Ref. [11]) and/or medial surface compression in the prephonatory, no-flow resting state (e.g., Refs. [12,13]). It thus remains to be seen whether realistic CQ values can be consistently achieved using synthetic VF models with isotropic silicone materials, and if so, which physical characteristics would enable such a response. In the affirmative, such models would constitute a useful step toward developing models that could be used to study an increasing number of conditions.

The purpose of this study was to use a genetic algorithm-based optimization approach coupled with computer modeling to find optimal geometric and material characteristics of a multilayer VF model to achieve desired frequency and CQ characteristics. The results were tested by fabricating a synthetic VF model with parameters informed by optimization outputs. In the following sections, the computational model that formed the basis of the optimization algorithm is described. The algorithm setup and parameters are then presented. The methods for testing the synthetic model are outlined. Results of the optimization and synthetic VF model tests are presented and discussed, and it is shown that the synthetic model successfully exhibited nonzero CQ. Limitations of the study and suggestions for future work are then discussed.

2 Methods

In the following sections, the computational vocal fold model is described, the genetic algorithm is summarized, and an overview of the synthetic model fabrication and testing procedures is given.

2.1 Computational Vocal Fold Model.

The computational model consisted of two-dimensional, fully-coupled VF and airway domains (see Fig. 1) developed using the commercial finite element code ADINA (ADINA R&D, Inc., Watertown, MA). The model was intentionally low-fidelity to minimize computation time and enable the large number of simulations required for optimization.

(Top) Meshed fluid and solid regions of the finite element model; (bottom) fluid domain outline with boundary conditions labeled

The fluid domain was modeled using air with a density of 1.2 kg/m³, a viscosity of 1.8 × 10⁻⁵ N·s/m², and a bulk modulus of 1.41 × 10⁵ Pa. A slightly compressible flow model was used to account for potential acoustic effects. The top of the duct was a symmetry line (slip condition). As described in Ref. [14], a contact line located 25 μm below the symmetry line was used to prevent fluid mesh collapse. The inlet pressure was a constant 0.9 kPa. The fluid domain was meshed with four-node quadrilateral elements with approximately 1100 elements and 1200 nodes (the precise number depending on geometric design variables).

The VF model (i.e., solid domain; see Fig. 2) consisted of four different layers, similar to the EPI model created by Murray and Thomson [7,8] and Murray et al. [9]. The superficial-most layer was the 0.05 mm thick epithelium. The elastic modulus of human VF epithelium is not known, but following the method described in a previous computational study [15], the model's epithelium was prescribed to have an elastic modulus of 50 kPa. The body elastic modulus was also 50 kPa. The cover and ligament stiffness and thickness values, along with medial surface length, were design variables in the genetic algorithm as described in Sec. 2.2. The lateral edge of the VF model had a fixed length (10.75 mm) and the VF height was fixed (8.4 mm); consequently, the body thickness and the inferior angle were allowed to change as other geometric parameters were varied.

Solid domain with key dimensions and parameters labeled. Note that the cover thickness includes the epithelium thickness.

All materials were defined as Ogden solids based on linear stress–strain data (see Ref. [16]). The Ogden solid model allowed for large displacement and large strain. The density and Poisson's ratio of the layers were 1070 kg/m³ and 0.49, respectively. The bulk modulus, κ, of each material was based on the corresponding elastic modulus, E, and Poisson's ratio, ν, according to the following equation:

$$\kappa = \frac{E}{3(1 - 2\nu)} \tag{1}$$

The bulk modulus for the body and epithelium layers was 833.3 kPa. The bulk modulus of the other layers varied depending on the corresponding elastic modulus values assigned by the optimization algorithm. Following the approach utilized in a previous study employing a computational VF model [17], Rayleigh damping was used to simulate damping with α = 19.89 and β = 1.253 × 10⁻⁴, a combination that yields theoretical damping ratios in the range of 0.052–0.070 over the frequency range of 85–150 Hz.
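As a quick numerical check of Eq. (1) and of the quoted damping range, the sketch below evaluates the bulk modulus of the 50 kPa layers and the damping ratio implied by the Rayleigh coefficients. It assumes the standard Rayleigh relation ζ = α/(2ω) + βω/2 with ω = 2πf, which is not written out in the text; the function names are illustrative only.

```python
import math

def bulk_modulus(E, nu):
    """Bulk modulus from Eq. (1): kappa = E / (3 (1 - 2 nu))."""
    return E / (3.0 * (1.0 - 2.0 * nu))

def rayleigh_damping_ratio(alpha, beta, freq_hz):
    """Damping ratio at freq_hz, assuming zeta = alpha/(2 w) + beta*w/2."""
    omega = 2.0 * math.pi * freq_hz  # angular frequency, rad/s
    return alpha / (2.0 * omega) + beta * omega / 2.0

# Body and epithelium layers: E = 50 kPa, nu = 0.49
kappa_body = bulk_modulus(50.0, 0.49)  # kPa

# Rayleigh coefficients quoted in the text
zeta_85 = rayleigh_damping_ratio(19.89, 1.253e-4, 85.0)
zeta_150 = rayleigh_damping_ratio(19.89, 1.253e-4, 150.0)

print(f"kappa = {kappa_body:.1f} kPa")
print(f"zeta(85 Hz) = {zeta_85:.3f}, zeta(150 Hz) = {zeta_150:.3f}")
```

Under these assumptions the sketch reproduces the 833.3 kPa bulk modulus and damping ratios of about 0.052 at 85 Hz and 0.070 at 150 Hz.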

The solid domain was meshed with first-order, three-node triangular elements. There were approximately 800 elements and 500 nodes (as with the fluid model, the precise number depended on geometric design variables). The lateral edge of the VF model was fixed and the remaining exterior edges were treated as fluid-structure interaction boundaries that enforced consistent displacement and stress at the boundaries between the fluid and solid domains. The fluid-solid interaction was simulated using an Arbitrary-Lagrangian–Eulerian (ALE) approach. The body-fitted fluid domain mesh deformation was guided via leader–follower pairs of points throughout the fluid domain (see Refs. [18,19] for commercial solver theory and modeling details).

2.2 Optimization Algorithm.

The simulation was allowed to proceed until time 0.1 s with a time-step size of 50 μs. The model was predicted to have reached a steady-state after 0.07 s, and data from the model vibration from 0.07 s to 0.1 s were used to provide input to the genetic algorithm.

Optimization was pursued via a real-valued genetic algorithm developed in MATLAB. The design variables (all continuous) were the elastic moduli of the cover and ligament, the medial surface length, and the thicknesses of the cover and ligament. The ranges of values for these variables are listed in Table 1.

Table 1.

Genetic algorithm design variables and value ranges

Design variable	Range
Cover elastic modulus	0.4–1.5 kPa
Ligament elastic modulus	0.4–2.0 kPa
Cover thickness	0.5–2.0 mm
Ligament thickness	0.5–4.0 mm
Medial surface length	1.01–6.01 mm


The cover thickness includes the epithelium thickness.

The genetic algorithm was coupled with the computational model with the entire process being automated through MATLAB. The algorithm created an input file based on values generated via a random selection or crossover depending on the generation (discussed later). The algorithm then called ADINA to run the simulation, after which CQ, vibration amplitude, and frequency values were found. As described below, the model fitness was then calculated based on these output values. This proceeded for each simulation in a generation, at which point a new generation was formed and simulated. This was allowed to proceed through 11 generations.
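The random generation of a first population can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB code: it assumes uniform sampling within the Table 1 bounds and uses the population size of 50 reported for the algorithm; the dictionary keys and function names are hypothetical.

```python
import random

# Design-variable bounds from Table 1 (moduli in kPa, lengths in mm).
BOUNDS = {
    "cover_E": (0.4, 1.5),
    "ligament_E": (0.4, 2.0),
    "cover_thickness": (0.5, 2.0),
    "ligament_thickness": (0.5, 4.0),
    "medial_surface_length": (1.01, 6.01),
}

def random_design(rng):
    """Draw one candidate design uniformly within the bounds (assumed distribution)."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in BOUNDS.items()}

def initial_population(size=50, seed=0):
    """Build the first generation; each entry would become one simulation input file."""
    rng = random.Random(seed)
    return [random_design(rng) for _ in range(size)]

population = initial_population()
print(len(population), "designs; first:", population[0])
```

With 50 designs per generation, 11 generations correspond to 550 simulations in total.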

Simulation output variables of CQ, frequency, and vibration amplitude were used to determine the fitness of each model. CQ was calculated by dividing the amount of time any medial surface nodes were touching the contact line by the total evaluation time over an integer number of periods. Frequency was estimated using the Fourier transform of the glottal width waveform, and amplitude of vibration was the difference between the smallest and largest values of the glottal width waveform.
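The three output measures can be illustrated on a sampled glottal width waveform. The sketch below is an assumption-laden stand-in for the actual post-processing: closure is flagged where the width reaches zero (the study instead checked medial surface nodes against the contact line), the frequency is taken from the largest non-DC bin of a plain discrete Fourier transform, and the waveform itself is a hypothetical clipped sinusoid sampled at the 50 μs time step.

```python
import cmath
import math

def closed_quotient(is_closed):
    """Fraction of samples flagged as closed (uniform time step assumed)."""
    return sum(is_closed) / len(is_closed)

def vibration_amplitude(width):
    """Difference between the largest and smallest glottal width."""
    return max(width) - min(width)

def dominant_frequency(width, dt):
    """Frequency (Hz) of the largest non-DC bin of a direct DFT."""
    n = len(width)
    mean = sum(width) / n
    centered = [w - mean for w in width]
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k / (n * dt)

# Hypothetical waveform: clipped 125 Hz sinusoid, four full periods, widths in m.
dt = 50e-6
n_samples = 640
width = [max(0.0, 0.4 + math.sin(2 * math.pi * 125.0 * i * dt)) * 1e-3
         for i in range(n_samples)]
is_closed = [w <= 0.0 for w in width]

cq = closed_quotient(is_closed)
amp = vibration_amplitude(width)
f0 = dominant_frequency(width, dt)
print(f"CQ = {cq:.3f}, amplitude = {amp * 1e3:.2f} mm, F0 = {f0:.1f} Hz")
```

Evaluating over an integer number of periods, as the text specifies, keeps both the CQ ratio and the spectral peak free of truncation effects; for this toy waveform the functions return a CQ of about 0.37, an amplitude of 1.4 mm, and a frequency of 125 Hz.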

The optimization objective was to minimize the fitness function, $f$ , defined as

$$f = \max\left(\frac{f_{\max}}{1 + e^{-k(CQ - CQ_{2})}},\ \frac{f_{\max}}{1 + e^{k(CQ - CQ_{1})}}\right) + p \tag{2}$$

where CQ is the closed quotient; CQ₁ = 0.15, CQ₂ = 0.70, and f_max = 0.5 are constants that defined acceptable CQ values; k = 30 is a parameter that governed the shape of the overall function; and $p$ is a penalty term as described in the following paragraph. Equation (2) is plotted versus CQ in Fig. 3, where it can be seen that these parameters resulted in a function that assigned a value of 0.001 or lower (i.e., better fitness) to a range of values from approximately CQ = 0.36 to 0.49.

Fitness function versus closed quotient from Eq. (2), shown here with penalty term p = 0
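Equation (2) can be evaluated directly to see where the fitness drops below 0.001. The following is a minimal sketch using the constants given in the text; the function and variable names are illustrative, and the penalty term is left at zero unless supplied.

```python
import math

CQ1, CQ2 = 0.15, 0.70   # constants bounding acceptable CQ values
F_MAX = 0.5
K = 30.0                # shape parameter

def fitness(cq, penalty=0.0):
    """Fitness from Eq. (2); lower values are better."""
    upper = F_MAX / (1.0 + math.exp(-K * (cq - CQ2)))  # grows as CQ approaches CQ2
    lower = F_MAX / (1.0 + math.exp(K * (cq - CQ1)))   # grows as CQ approaches CQ1
    return max(upper, lower) + penalty

for cq in (0.27, 0.36, 0.425, 0.49, 0.69):
    print(f"CQ = {cq:.3f} -> f = {fitness(cq):.6f}")
```

The two sigmoid terms cross at the midpoint CQ = 0.425, where the function reaches its minimum of about 0.000131; this matches the lowest fitness value reported in Sec. 3.1.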

The model included four constraints, each with an associated penalty. The first constraint required that the simulation be successfully completed. If the simulation failed to finish, a penalty of $p_{fail} = 2.5$, designating very poor fitness, was assigned. Excessive deformation to the point of overlapping elements was a common reason for failure. The second constraint was that the model had to close (i.e., CQ > 0). A penalty, $p_{CQ}$, was assigned to models that never closed such that models that vibrated closer to the contact line would result in better fitness values than models that vibrated farther from the contact line. If the model never touched the contact line, $p_{CQ}$ was calculated and applied as follows:

$$p_{CQ} = \frac{GW_{\max} + d_{closed}}{0.0010625} \tag{3}$$

where $GW_{\max}$ was the maximum glottal width and $d_{closed}$ was set equal to −0.026 mm (just offset from the position of the contact line). The value 0.0010625 m (1.0625 mm) was chosen by trial and error to tune this penalty's weight for favorable algorithm progression. The third constraint required the frequency to fall within the male physiological range, here approximated as being between 85 and 150 Hz. A penalty, $p_{freq}$, was assigned to any frequency outside of this range as follows:

$$p_{freq} = \begin{cases} \dfrac{F_{0} - 150}{150} \times 1.5 & \text{if } F_{0} > 150 \\ \dfrac{85 - F_{0}}{85} \times 1.5 & \text{if } F_{0} < 85 \end{cases} \tag{4}$$

where $F_{0}$ was the frequency of the glottal width waveform determined using an FFT calculation. The factor of 1.5 in Eq. (4) was included to give the penalty more weight in the algorithm's progression. The fourth constraint required that the model self-oscillate. A penalty, $p_{amp}$ , was assigned to low vibration amplitudes to encourage parameters that yielded model vibration. If the model vibration amplitude, Amp, was less than 6 × 10⁻⁵ m, $p_{amp}$ was calculated and applied as follows:

$$p_{amp} = 1 - \left(\frac{Amp}{6 \times 10^{-5}}\right) \tag{5}$$

where Amp was the difference between the maximum glottal width and the minimum glottal width. The value of $p$ in Eq. (2) was the sum of $p_{fail}$, $p_{CQ}$, $p_{freq}$, and $p_{amp}$.
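The four penalties of Eqs. (3)–(5) and the failure constant can be combined as sketched below. Several details are assumptions made for illustration: lengths are taken in meters, a failed simulation is given only $p_{fail}$ because no outputs exist to evaluate, and each remaining penalty is zero when its constraint is satisfied. The function names are hypothetical.

```python
P_FAIL = 2.5
D_CLOSED = -0.026e-3   # m, offset from the contact line
GW_SCALE = 0.0010625   # m, tuning constant in Eq. (3)
AMP_MIN = 6e-5         # m, amplitude threshold in Eq. (5)

def penalty_cq(cq, gw_max):
    """Eq. (3): applied only when the model never closes (CQ = 0)."""
    if cq > 0.0:
        return 0.0
    return (gw_max + D_CLOSED) / GW_SCALE

def penalty_freq(f0):
    """Eq. (4): applied outside the 85-150 Hz range."""
    if f0 > 150.0:
        return (f0 - 150.0) / 150.0 * 1.5
    if f0 < 85.0:
        return (85.0 - f0) / 85.0 * 1.5
    return 0.0

def penalty_amp(amp):
    """Eq. (5): applied when the vibration amplitude is below the threshold."""
    if amp < AMP_MIN:
        return 1.0 - amp / AMP_MIN
    return 0.0

def total_penalty(completed, cq=0.0, gw_max=0.0, f0=0.0, amp=0.0):
    """Sum of penalties; a failed run is assumed to receive p_fail alone."""
    if not completed:
        return P_FAIL
    return penalty_cq(cq, gw_max) + penalty_freq(f0) + penalty_amp(amp)

print(total_penalty(True, cq=0.69, gw_max=0.3e-3, f0=210.0, amp=0.3e-3))
```

For a model vibrating at 210 Hz with a nonzero CQ and adequate amplitude, only the frequency penalty is active, giving a penalty of about 0.6.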

A large population size (50) relative to the number of design variables (five) was chosen to promote diversity. This ensured that the first population contained simulations that ran successfully and exhibited self-sustained vibration. Selection used a tournament process with a dynamic tournament size: the first two generations had tournament sizes of four, and subsequent generations had tournament sizes of two. The smaller selection pressure after the first two generations encouraged diversity. Optimization strategies of varying selection pressures, crossover, mutation, and partial elitism were used as described in Ref. [14].
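The dynamic tournament size can be sketched as follows. This is a generic tournament selection written for illustration, not the implementation of Ref. [14]: it assumes generations are numbered from one, entrants are drawn without replacement, and the entrant with the lowest fitness value wins because the objective is minimized. The fitness values are random placeholders.

```python
import random

def tournament_size(generation):
    """Four entrants in the first two generations (1-indexed), two afterward."""
    return 4 if generation <= 2 else 2

def tournament_select(fitness_values, size, rng):
    """Return the index of the fittest (lowest value) of `size` random entrants."""
    entrants = rng.sample(range(len(fitness_values)), size)
    return min(entrants, key=lambda i: fitness_values[i])

rng = random.Random(1)
fitness_values = [rng.random() for _ in range(50)]  # placeholder fitness values

# Select 50 parents under the early (size 4) and later (size 2) settings.
parents_gen1 = [tournament_select(fitness_values, tournament_size(1), rng)
                for _ in range(50)]
parents_gen3 = [tournament_select(fitness_values, tournament_size(3), rng)
                for _ in range(50)]
print(len(set(parents_gen1)), "distinct parents with size 4;",
      len(set(parents_gen3)), "with size 2")
```

A larger tournament makes it more likely that a strong individual is among the entrants, so selection pressure is higher; reducing the size to two lets weaker individuals win more often, which is the diversity effect described above.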

2.3 Synthetic Model.

The findings identified by inspecting the genetic algorithm output were tested using a synthetic VF model that was fabricated as described by Taylor [14] and summarized below. In a computer-aided design software package (SolidWorks, Waltham, MA), two-dimensional geometries of the body, cover, and ligament layers were extruded 17 mm to create a three-dimensional model of each layer. The layer geometries were as shown in Fig. 2, with values of cover thickness, ligament thickness, and medial surface length as reported in Sec. 3.2. A stiff backing layer lateral to the body layer was also included as described by Taylor [14]. Following the general process established by Murray and Thomson [7,8], the body, ligament, and cover layers were sequentially cast using different liquid silicone mixtures (described below) in their respective molds, followed by a pouring of a silicone epithelium layer.

The body and epithelium layers were fabricated using the silicone compound Dragon Skin (Smooth-On, Inc., Macungie, PA) and Silicone Thinner (Smooth-On, Inc.) at a mixing ratio of 1:1:1 by weight (Part A Dragon Skin:Part B Dragon Skin:Thinner). The cover and ligament layers were fabricated using the silicone compound Ecoflex 00-30 (Smooth-On, Inc.) and Silicone Thinner at ratios of 1:1:6.5 and 1:1:5.5, respectively. Cylindrical samples (40 mm diameter, 2 mm thick) were fabricated and tested using an AR2000ex rheometer (TA Instruments, New Castle, DE) via a frequency sweep test from 1 to 10 Hz at 6% strain. One sample of the epithelium material and two samples each of the cover, ligament, and body materials were made and tested. The first sample was made immediately following the mixing of the material. After the layer and first sample were allowed to cure, the second rheometry sample was poured and cured.

The models were mounted and tested in a manner similar to that outlined in Murray and Thomson [7,8] and Murray et al. [9]. The models were tested in a hemilarynx configuration (see Fig. 4) with a compressed air supply connected to a plenum as the flow source. The model and acrylic plate were fastened to a platform at the top of 2.54 cm diameter tubing from the plenum. Subglottal pressure was measured using a pressure sensor (Omega PX26-005DV) located 1.7 cm upstream of the model. Frequency and CQ were measured using a high-speed camera (Phantom v1610, 8000 frames per second, 111.62 μs shutter speed, 512 × 512 pixel resolution, 15 pixels/mm resolution).

Experimental setup for synthetic VF model testing (not to scale)

3 Results

3.1 Optimization Results.

The optimization algorithm was allowed to proceed for 11 generations (550 simulations). The fitness values for all simulations, along with the average fitness value of each generation, are shown in Fig. 5. As can be seen, the algorithm showed favorable progression until reaching an average fitness of 0.03 at generation six, after which the average fitness for subsequent generations fluctuated between 0.04 and 0.12. As generations progressed, the number of models with fitness values below 0.001 increased.

Average fitness of each genetic algorithm generation (large dark markers) and all individual models (small light markers)

Fitness and frequency versus CQ are plotted for all 550 simulations in Fig. 6. Of the 550 simulations, 259 resulted in fitness values of less than 0.001. The algorithm successfully favored models with CQs and frequencies within the targeted bounds for these output variables. The lowest fitness value of any one model was 0.000131, for which the frequency was 134 Hz and the CQ was 0.425. As seen in Fig. 6, a possible Pareto front seemed to emerge in the frequency versus CQ data indicating an apparent tradeoff between frequency and glottal closure, with higher frequencies being associated with larger CQs. A multi-objective algorithm with equal weighting of frequency and CQ may more definitively establish this Pareto front. The lowest fitness values denoted by the green symbols in Fig. 6 did not all lie along this Pareto front but were instead grouped between CQs of approximately 0.35 and 0.50 and frequencies between approximately 120 and 150 Hz.

Fig. 6.


Fitness (top) and frequency (middle) versus closed quotient for all genetic algorithm models. Green markers (gray in print version) denote models with fitness < 0.001. The solid line in the frequency versus closed quotient plot denotes possible emergence of a Pareto front where there is a tradeoff between frequency and closed quotient. A, B, and C are models that lie along this front, and D is a model with material and geometric parameters that were close to those of the fabricated synthetic model. Bottom: M5 outer profile geometry (far left), EPI geometry (second from left), and model A–D geometries.

A closer inspection of three specific models (A, B, and C) along the Pareto front yields insight into how the design variables affected CQ and frequency. Toward the middle-lower-left region of the front is model A with a frequency of 113 Hz, a CQ of 0.27, and a fitness value of 0.0128. Toward the upper-right region of the front is model C with a frequency of 210 Hz, a CQ of 0.69, and a fitness value of 0.806. In between these two models is model B, the model with the best fitness value (0.000313) of these three selected models. Model B vibrated at 125 Hz with a CQ of 0.45. Model D, a model with design variables that relatively closely matched the synthetic VF model (described further in Sec. 3.2), vibrated at 140 Hz with a CQ of 0.42 and yielded a fitness value of 0.000143, which was close to the overall minimum fitness value.

The design variable values of models A through D are listed in Table 2, and these models, along with the so-called M5 [20] and EPI [7,8] models, are illustrated in Fig. 6. One of the most notable differences between models A through D and the EPI and M5 models is the much longer medial surface length in models A through D. The extended medial surface lengths of models A through D, all of which were between 5.588 and 5.936 mm, suggest that this parameter was key in enabling the desired CQ values. This finding is consistent with that predicted by Zhang [3]. Additionally, the ligament layers of models A, B, and D were significantly thicker than that of model C. Of the four present models, A had the thickest ligament layer (3.236 mm), whereas C had the thinnest (1.263 mm). Variations in cover layer thickness are also evident, with model A having the thickest cover layer (1.530 mm) and model C the thinnest (0.919 mm). Considering these differences in cover and ligament layer thickness between models A and C, while also observing that model A exhibited the lowest frequency (113 Hz) and model C exhibited the highest (210 Hz), there appears to be an inverse correlation between combined cover and ligament layer thickness and model frequency. This correlation is intuitive and consistent with physical principles (i.e., an increased quantity of lower-stiffness material led to a lower frequency). The material properties of the two layers are also important to consider, however. From Table 2, models B and C had identical ligament modulus values (1.240 kPa) and comparable cover stiffness values (0.957 kPa for model B and 1.096 kPa for model C). Models B, C, and D all had softer cover layers than ligament layers. By contrast, the model A cover modulus (1.415 kPa) was greater than the ligament modulus (0.875 kPa).

Table 2.

Design variable, frequency, and CQ values of models A–D and of the fabricated synthetic model

Model	Cover E (kPa)	Lig. E (kPa)	Cover thickness (mm)	Lig. thickness (mm)	Medial surface length (mm)	Frequency (Hz)	CQ
A	1.415	0.875	1.530	3.236	5.851	113	0.27
B	0.957	1.240	0.921	2.790	5.588	125	0.45
C	1.096	1.240	0.919	1.263	5.588	210	0.69
D	0.644	1.017	1.022	2.633	5.936	140	0.42
Synthetic	0.63^a	0.96^a	0.955	2.704	5.953	133	0.32^b
EPI	0.224	1.132	1.5	1	0.1	102	0
M5-UNI	1.132	NA	1.5	NA	2	102	NA


Values for the EPI and M5-UNI models of Murray and Thomson [8] are included for reference. The reported frequency for the EPI model is for a tensioned EPI model.

^a From rheometer data.

^b Measured at a subglottal pressure 44% above onset pressure (see Table 3). The CQ value is the area-based estimate and includes some durations of incomplete glottal closure (see text); the VKG-based CQ estimate was 0.35.

Figure 7 shows the distributions of the five design variables for the 259 cases for which the fitness value was less than 0.001. The elastic modulus of the cover was somewhat uniformly distributed between 0.51 and 1.28 kPa, with eight cases falling outside of this range. The algorithm favored the values toward the middle of the ligament elastic modulus range (mostly between 0.88 and 1.52 kPa, with more than half being between 1.20 and 1.36 kPa). It is possible that cover and ligament elastic moduli below these ranges would have been more likely to have resulted in models that vibrated with larger deformation and resulted in model failure, no closure, and/or frequencies below 85 Hz, although this would need further investigation. Cover and ligament thickness values were predominantly grouped in the ranges of 0.50–1.10 mm and 2.25–2.95 mm, respectively, although some favorable models existed outside of these ranges. Finally, the algorithm favored the longest medial surface length allowed, with the minimum length being 4.53 mm and with 203 out of the 259 cases having lengths between 5.53 and 6.03 mm. Again, this supports the findings of Zhang [3] who reported that a longer medial surface would lead to larger CQ.

Design variable histograms for models where the fitness was 0.001 or below. Vertical axes denote number of simulations.

Profiles of the four models vibrating over one cycle of vibration are displayed in Fig. 8, and glottal width waveforms are provided in Fig. 9. (The glottal width was here calculated as twice the distance between the symmetry line and the nearest node that had originally been located along the initially-flat VF medial surface segment indicated in Fig. 2.) Mucosal wavelike motion and alternating converging-diverging intraglottal profiles, characteristics of human vocal fold vibration, are evident in Fig. 8. In Figs. 8 and 9 it can be seen that model A exhibited the largest maximum glottal width during vibration (approximately 1.2 mm) whereas model C exhibited the smallest (approximately 0.3 mm).

Profiles of models A, B, C, and D during one vibration cycle. For each case, only one vocal fold was actually simulated (assuming medial-lateral symmetry), and the opposing folds shown here are mirrors of the simulated folds and are included for visualization purposes only; medial-lateral placement of the opposing fold is approximate. The thin lines between opposing folds are the solid domain contact lines. The first frame of each case corresponds to the approximate time of maximum glottal opening. Variations in closure durations consistent with the CQ values in Fig. 6 and Table 2 are evident, as well as variations in amplitude that are consistent with the glottal width waveforms in Fig. 9.

Glottal width waveforms of models A, B, C, and D, manually shifted such that the start of each waveform is at time 0 s

The influence of fidelity on model output was preliminarily explored by simulating models A through D using meshes that contained, on average, four times as many elements in the fluid domains and 2.9 times as many elements in the solid domains as their respective original, coarser-meshed models. Frequency and CQ results for the original and refined meshes are shown in Fig. 10. Quantitative differences between original and refined model results can be seen, as would be expected since neither set of models was grid-independent. Encouragingly, however, overall trends in model frequency and CQ results between the four cases were generally consistent between the original and refined models, the one exception being that the frequency of model D was 15 Hz higher than that of model B with the original mesh but 8 Hz lower with the refined mesh.

Frequency (top) and CQ (bottom) for simulations with original and refined meshes

3.2 Synthetic Model.

A synthetic model was fabricated to initially explore whether the design variable trends identified via the optimization outcome might indeed be expected to result in a real model that would exhibit favorable CQ and frequency characteristics as predicted. The synthetic model geometric parameters are given in Table 2. Shear modulus values (i.e., the real part of the complex shear modulus) for the material samples described in Sec. 2.3 are plotted in Fig. 11 along with the average shear moduli at each frequency for the cover, ligament, and body layers. For each layer the average shear moduli (or the single set of shear modulus data in the case of the epithelium material) were then averaged over the frequency range to obtain a frequency-averaged shear modulus, $G$. Using an assumed Poisson's ratio of $\nu = 0.49$, the corresponding frequency-averaged elastic modulus of each layer, $E$, was estimated from $G$ using the following relation for linear elastic materials:

$$E = 2G(1 + \nu) \tag{6}$$

Shear modulus versus frequency for individual test specimens (symbols) and corresponding averages (lines)

These calculations resulted in values of E = 26.9 kPa for the epithelium, 0.63 kPa for the cover, 0.96 kPa for the ligament, and 50.2 kPa for the body.
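The averaging and conversion steps behind these values can be sketched as below. The shear modulus readings are hypothetical placeholders chosen only to illustrate the procedure (the measured data are those plotted in Fig. 11); the function names are illustrative.

```python
def frequency_averaged(values):
    """Mean of the per-frequency shear moduli over the sweep."""
    return sum(values) / len(values)

def elastic_from_shear(G, nu=0.49):
    """Eq. (6): E = 2 G (1 + nu) for a linear elastic material."""
    return 2.0 * G * (1.0 + nu)

# Hypothetical shear modulus readings (kPa) for two samples of one material,
# taken at the same sweep frequencies.
sample_1 = [0.20, 0.21, 0.22]
sample_2 = [0.20, 0.21, 0.23]

# Average the two samples at each frequency, then average over frequency.
per_frequency_avg = [(a + b) / 2.0 for a, b in zip(sample_1, sample_2)]
G = frequency_averaged(per_frequency_avg)
E = elastic_from_shear(G)
print(f"G = {G:.4f} kPa, E = {E:.3f} kPa")
```

With ν = 0.49 the conversion factor is 2.98, so an elastic modulus of 0.63 kPa corresponds to a frequency-averaged shear modulus of roughly 0.21 kPa.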

Computational model D, which is referenced in Figs. 6, 8, 9, and Table 2, was one case from the genetic algorithm simulations with design variables close to those of the synthetic model. Specifically, the cover modulus of 0.644 kPa (see Table 2) was 2.2% greater than the measured synthetic model cover average modulus of 0.63 kPa. The other computational model D design variables were 5.9% greater (ligament modulus), 8.0% greater (cover thickness), 2.6% less (ligament thickness), and 0.4% less (medial surface length) than the corresponding synthetic model variables.

The synthetic VF model onset pressure, $p_{on}$ , was 0.71 kPa. High-speed imaging data were acquired at three different pressures: 1.02 kPa (1.44 $p_{on}$ ), 1.28 kPa (1.8 $p_{on}$ ), and 1.45 kPa (2.04 $p_{on}$ ). Images of the synthetic VF model over one period at a subglottal pressure of 1.45 kPa are shown in Fig. 12. Using a custom MATLAB script, glottal area versus time waveforms were obtained, as shown in Fig. 13, and video kymographs (VKGs) were generated as shown in Fig. 14.

Superior view from the high-speed camera of the synthetic VF model over one period with a subglottal pressure of 1.45 kPa. Every 8th frame is shown and each frame measures approximately 13 mm wide × 21 mm high. The dashed line in the first frame denotes the approximate anterior-posterior location of the interrogation line for the video kymographs shown in Fig. 14. Mucosal wave-like motion with alternating convergent-divergent shape is evident. Complete closure appears to be evident in the 4th frame, and incomplete closure in the 3rd, 5th, and 6th frames.

Glottal area versus time for synthetic VF models operating at subglottal pressures of 1.02 kPa (top), 1.28 kPa (middle), and 1.45 kPa (bottom). The closed phase in each of these graphs is denoted by the window between times t₁ and t₂, and the total period by the window between times t₁ and t₃.

Synthetic model VKG at pressures of 1.02 kPa (top), 1.28 kPa (middle), and 1.45 kPa (bottom). White lines denote one complete cycle (long lines) and closed phase (short lines).

Frequencies were calculated using the glottal area waveforms, and the VKGs and the glottal area graphs were used to estimate CQ as reported in Tables 2 and 3. When calculating the CQ using the glottal area waveforms, the closed phase for each subglottal pressure was deemed to extend from the time when the glottal area waveform in Fig. 13 first fell below 1% of the respective maximum area to the time when the glottal area then steadily increased beyond 1% of the maximum area. These times are illustrated in Fig. 13. This approach thus included some time periods of complete as well as incomplete glottal closure (i.e., the latter being when a portion of the glottis was closed and a portion remained slightly open, leading to small nonzero fluctuations between times t₁ and t₂ in Fig. 13), but did not include the entire duration of incomplete glottal closure (see Ref. [3] for additional discussion of the use of complete and incomplete glottal closure for calculating CQ). The CQ estimate using the VKGs was accomplished through visual inspection as illustrated in Fig. 14. A line was drawn from one cycle peak to the next and another line was drawn from the instant when the model closed to the instant when the model opened again. The CQ was taken as the ratio of the length of the shorter line to that of the longer line. The CQ estimates from the two methods differed because the VKG was created using the center of the glottis, meaning the model appeared to be closed when the center closed, when in fact some other regions of the glottis may have been open (i.e., incomplete glottal closure). The glottal area graph took the entire glottis into account. Thus the VKG-based CQ was larger than the area-based CQ, as can be seen in Table 3. It is important to note that these CQ estimates are somewhat subjective due to limitations associated with image resolution, viewing angle, brightness and contrast, image noise, and reflections from the acrylic plate used for the hemilarynx configuration (Fig. 4).
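The area-based estimate can be sketched with a simple threshold search. This is a simplified stand-in for the custom script used in the study: it takes the first upward crossing of the 1% threshold as the end of the closed phase (the study required a steady increase, which matters when small fluctuations occur during closure), and the waveform is a hypothetical one sampled at the camera rate of 8000 frames per second. The function name is illustrative.

```python
import math

def area_based_cq(area, dt, fraction=0.01):
    """Estimate CQ and frequency from a glottal area waveform.

    t1: first sample where the area drops below the threshold,
    t2: next sample where it rises back above the threshold,
    t3: next sample where it drops below the threshold again.
    Raises StopIteration if the waveform lacks a full closed-open-closed cycle.
    """
    threshold = fraction * max(area)
    below = [a < threshold for a in area]
    n = len(area)
    t1 = next(i for i in range(1, n) if below[i] and not below[i - 1])
    t2 = next(i for i in range(t1 + 1, n) if not below[i])
    t3 = next(i for i in range(t2 + 1, n) if below[i])
    cq = (t2 - t1) / (t3 - t1)
    frequency = 1.0 / ((t3 - t1) * dt)
    return cq, frequency

# Hypothetical waveform: 40 open frames (half-sine pulse) then 20 closed frames.
open_phase = [math.sin(math.pi * (j + 1) / 41) for j in range(40)]
one_cycle = open_phase + [0.0] * 20
area = one_cycle * 3

cq, frequency = area_based_cq(area, dt=1.0 / 8000.0)
print(f"CQ = {cq:.3f}, frequency = {frequency:.1f} Hz")
```

For this toy waveform, 20 of every 60 frames fall below the threshold, so the sketch returns a CQ of one third at a frequency of about 133 Hz.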

Table 3.

Frequency and CQ estimates of the synthetic VF model at three different subglottal pressures

Pressure (multiple of $p_{on}$)	Pressure (kPa)	Frequency (Hz)	Area-based CQ	VKG-based CQ
1.44 $p_{on}$	1.02	133	0.32	0.35
1.80 $p_{on}$	1.28	129	0.35	0.39
2.04 $p_{on}$	1.45	127	0.22	0.41


Area-based CQ values include some durations of incomplete glottal closure as described in the text and as illustrated in Fig. 13.

As listed in Table 2, the synthetic model frequency at 1.44$p_{on}$ was 133 Hz, which was 5% less than that predicted by computational model D. The area-based synthetic model CQ at 1.44$p_{on}$ was 0.32, which was 31% less than that of computational model D. The VKG-based synthetic model CQ at 1.44$p_{on}$ (0.35) was 20% less than the CQ of computational model D. These differences are not unexpected given factors such as the two-dimensional nature of the computational model, uncertainties in material properties, differences between measured and computational geometric and material property design variables, imperfections inherent in synthetic model fabrication and mounting processes, the hemilarynx configuration (experiments) versus the symmetric larynx configuration (simulations), and CQ measurement uncertainties. Further experiments would be needed to study these effects in more detail. Nevertheless, the predicted and measured frequencies were in close agreement and the synthetic model exhibited nonzero CQ as desired. These synthetic model data support the notion that isotropic synthetic VF models can be fabricated that exhibit frequencies and CQs in the desired ranges. However, as noted above, additional synthetic model fabrication, testing, and detailed analysis will be required to further confirm these results.

4 Conclusion

The geometric and material properties of a two-dimensional, low-fidelity finite element model of VF vibration were optimized using a genetic optimization algorithm. The optimization objective was to find parameters that would yield a CQ in the range of 0.36–0.49 and a frequency in the range of 85–150 Hz. The approach led to configurations that successfully yielded the desired response. The beginnings of a Pareto front were evident in the results, showing a tradeoff between frequency and CQ. A synthetic model was fabricated, tested, and compared to the predicted output of one of the cases returned by the optimization algorithm, with results showing promise that synthetic models can be fabricated to yield these desired responses.
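
The target ranges stated above can be expressed as a simple range check. The sketch below is illustrative only: the helper names `range_violation` and `meets_targets` are hypothetical, and the check is not the fitness function used in the study. It is applied here to the synthetic model values from Table 3.

```python
# Target ranges stated in the text.
CQ_RANGE = (0.36, 0.49)
F0_RANGE_HZ = (85.0, 150.0)


def range_violation(value, low, high):
    """Return 0.0 inside [low, high], else the distance to the nearest bound."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


def meets_targets(frequency_hz, cq):
    """True when both frequency and CQ fall inside their target ranges."""
    return (range_violation(frequency_hz, *F0_RANGE_HZ) == 0.0
            and range_violation(cq, *CQ_RANGE) == 0.0)


# Table 3 rows: (pressure in kPa, frequency in Hz, area-based CQ, VKG-based CQ)
TABLE_3 = [
    (1.02, 133, 0.32, 0.35),
    (1.28, 129, 0.35, 0.39),
    (1.45, 127, 0.22, 0.41),
]

for pressure, freq, cq_area, cq_vkg in TABLE_3:
    print(f"{pressure:.2f} kPa: "
          f"area-based in range = {meets_targets(freq, cq_area)}, "
          f"VKG-based in range = {meets_targets(freq, cq_vkg)}")
```

Returning the distance to the nearest bound, rather than only a yes/no flag, is a design choice that would let the same helper feed a penalty term in an optimizer.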

Several limitations of the present study and areas for improvement are worth noting. First, because of “noise” in the CFD model results (e.g., small fluctuations in glottal width around the closed phase) and variations in waveforms (e.g., contrasting the waveforms of models A through D in Fig. 9), the automated measures used by the optimization algorithm in some cases resulted in errors in calculating frequency and CQ. The automated measures could therefore be fine-tuned to minimize such errors. Other limitations of the optimization approach included the low-fidelity, two-dimensional nature of the model itself; the assumption that the model reached steady state by 0.07 s, which may not have been the case in some models; and the inability of some models to successfully solve due to exceedingly large deformations. Optimization on an increased number of parameters, such as anterior-posterior geometric variations (such as occur in the human larynx), could be achieved if the model were to be extended to three dimensions, as has been preliminarily explored [21]. Other parameters of interest for similar future optimization studies include glottal entrance and exit radii, material anisotropy and nonlinearity, and vertical stiffness gradients. The genetic algorithm parameters could also be further explored and refined, for example, by testing the dependence of the algorithm outcome on crossover rate, mutation rate, tournament selection, and penalty weight. A derivative-based algorithm, starting with a model with favorable fitness, could lead to a more optimized result than reported here. The possible Pareto front formed by frequency and CQ was an interesting development. Future work to explore the outcomes predicted by this study using higher-fidelity models is recommended. Lastly, and importantly, the results of only one synthetic model are presented here, and further tests using additional synthetic models will be required to validate these results, including the pattern predicted by the Pareto front.
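
To make the genetic algorithm parameters named above concrete, the following toy sketch exposes crossover rate, mutation rate, tournament size, and penalty weight as arguments. It is not the custom code used in the study: `toy_simulation` is a hypothetical, non-physical stand-in for the finite element model, and the fitness definition, blend crossover, Gaussian mutation, and elitism are assumptions chosen for brevity.

```python
import random

CQ_RANGE = (0.36, 0.49)       # target ranges stated in the text
F0_RANGE_HZ = (85.0, 150.0)


def toy_simulation(design):
    """Hypothetical stand-in for the FE model (not physically meaningful).

    Maps two design variables in [0, 1] to (frequency in Hz, CQ).
    """
    x0, x1 = design
    return 60.0 + 120.0 * x0, 0.6 * x1 * (1.0 - 0.5 * x0)


def violation(value, low, high):
    """Distance outside [low, high]; zero when inside."""
    return max(low - value, 0.0, value - high)


def fitness(design, penalty_weight):
    """Assumed fitness: 0 is best; range violations are penalized."""
    frequency, cq = toy_simulation(design)
    penalty = (violation(frequency, *F0_RANGE_HZ) / (F0_RANGE_HZ[1] - F0_RANGE_HZ[0])
               + violation(cq, *CQ_RANGE) / (CQ_RANGE[1] - CQ_RANGE[0]))
    return -penalty_weight * penalty


def tournament(population, scores, size, rng):
    """Return the fittest of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: scores[i])]


def run_ga(pop_size=30, generations=40, crossover_rate=0.8, mutation_rate=0.1,
           tournament_size=3, penalty_weight=1.0, seed=0):
    rng = random.Random(seed)
    population = [[rng.random(), rng.random()] for _ in range(pop_size)]
    history = []  # best fitness recorded at each generation
    for _ in range(generations):
        scores = [fitness(ind, penalty_weight) for ind in population]
        best_index = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best_index])
        children = [list(population[best_index])]  # elitism: keep the best
        while len(children) < pop_size:
            a = tournament(population, scores, tournament_size, rng)
            b = tournament(population, scores, tournament_size, rng)
            if rng.random() < crossover_rate:
                w = rng.random()  # blend crossover between the two parents
                child = [w * x + (1.0 - w) * y for x, y in zip(a, b)]
            else:
                child = list(a)
            mutated = []
            for gene in child:
                if rng.random() < mutation_rate:
                    gene += rng.gauss(0.0, 0.1)  # Gaussian mutation
                mutated.append(min(1.0, max(0.0, gene)))  # clamp to [0, 1]
            children.append(mutated)
        population = children
    scores = [fitness(ind, penalty_weight) for ind in population]
    best_index = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best_index])
    return population[best_index], scores[best_index], history


# Example sensitivity sweep over one parameter (mutation rate).
for rate in (0.02, 0.1, 0.3):
    design, score, _ = run_ga(mutation_rate=rate, seed=1)
    print(f"mutation_rate={rate}: best fitness={score:.4f}, design={design}")
```

The same loop could be repeated over crossover rate, tournament size, or penalty weight to probe how strongly the outcome depends on each setting.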

Notwithstanding these limitations, the outcomes of the study include: (1) demonstration of a method for using low-fidelity computational models coupled with an optimization algorithm to identify VF model configurations with desirable vibratory characteristics and (2) a materially-isotropic synthetic model that exhibits contact. It is anticipated that this optimization approach could be utilized in additional applications, such as optimizing for different outcomes (e.g., acoustic) and applying the optimization algorithm to female voice characteristics, the latter of which have heretofore been underrepresented in computational and synthetic vocal fold modeling efforts.

Acknowledgment

Portions of this work were included in oral presentations at the 2018 International Conference on Voice Physiology and Biomechanics (ICVPB) and at the 2019 Advances in Quantitative Laryngology (AQL) Conference. The contributions of Michael Farnsworth are gratefully acknowledged, specifically in writing the basis of the genetic algorithm and writing about the details of the algorithm in an academic report. The content of this paper is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Funding Data

National Institutes of Health (NIH) (Grant No. R01DC005788; Funder ID: 10.13039/100000055).

Nomenclature

$CQ_{1}, CQ_{2}$ = fitness function parameters
$CQ$ = closed quotient
$E$ = Young's modulus
$F_{0}$ = frequency of glottal width waveform
$f$ = fitness
$G$ = shear modulus
$GW_{\max}$ = maximum glottal width
$d_{closed}$ = fitness penalty parameter
$f_{\min}, f_{\max}$ = fitness function parameters
$k$ = fitness function parameter
$OQ$ = open quotient
$p$ = fitness penalty
$p_{amp}$ = amplitude-based fitness penalty
$p_{CQ}$ = CQ-based fitness penalty
$p_{fail}$ = simulation success-based fitness penalty
$p_{freq}$ = frequency-based fitness penalty
$α, β$ = Rayleigh damping coefficients
$κ$ = bulk modulus
$ν$ = Poisson's ratio

References

[1] Henrich, N., d'Alessandro, C., Doval, B., and Castellengo, M., 2005, “Glottal Open Quotient in Singing: Measurements and Correlation With Laryngeal Mechanisms, Vocal Intensity, and Fundamental Frequency,” J. Acoust. Soc. Am., 117(3), pp. 1417–1430. doi: 10.1121/1.1850031
[2] Lohscheller, J., Švec, J. G., and Döllinger, M., 2013, “Vocal Fold Vibration Amplitude, Open Quotient, Speed Quotient and Their Variability Along Glottal Length: Kymographic Data From Normal Subjects,” Logop. Phoniatr. Voco., 38(4), pp. 182–192. doi: 10.3109/14015439.2012.731083
[3] Zhang, Z., 2016, “Cause-Effect Relationship Between Vocal Fold Physiology and Voice Production in a Three-Dimensional Phonation Model,” J. Acoust. Soc. Am., 139(4), pp. 1493–1507. doi: 10.1121/1.4944754
[4] Holmberg, E. B., Hillman, R. E., and Perkell, J. S., 1989, “Glottal Airflow and Transglottal Air Pressure Measurements for Male and Female Speakers in Low, Normal, and High Pitch,” J. Voice, 3(4), pp. 294–305. doi: 10.1016/S0892-1997(89)80051-7
[5] Baken, R. J., and Orlikoff, R. F., 2000, Clinical Measurements of Speech and Voice, Singular Publishing Group, Delmar, pp. 409–411.
[6] Hodge, F. S., Colton, R. H., and Kelley, R. T., 2001, “Vocal Intensity Characteristics in Normal and Elderly Speakers,” J. Voice, 15(4), pp. 503–511. doi: 10.1016/S0892-1997(01)00050-9
[7] Murray, P. R., and Thomson, S. L., 2011, “Synthetic, Multi-Layer, Self-Oscillating Vocal Fold Model Fabrication,” J. Vis. Exp., 58, p. 3498. doi: 10.3791/3498
[8] Murray, P. R., and Thomson, S. L., 2012, “Vibratory Response of Synthetic, Self-Oscillating Vocal Fold Models,” J. Acoust. Soc. Am., 132(5), pp. 3428–3438. doi: 10.1121/1.4754551
[9] Murray, P. R., Thomson, S. L., and Smith, M. E., 2014, “A Synthetic Self-Oscillating Vocal Fold Model Platform for Studying Augmentation Injection,” J. Voice, 28(2), pp. 133–143. doi: 10.1016/j.jvoice.2013.10.014
[10] Xuan, Y., and Zhang, Z., 2014, “Influence of Embedded Fibers and an Epithelium Layer on the Glottal Closure Pattern in a Physical Vocal Fold Model,” J. Speech Lang. Hear. Res., 57(2), pp. 416–425. doi: 10.1044/2013_JSLHR-S-13-0068
[11] Lodermeyer, A., Bagheri, E., Kniesburges, S., Näger, C., Probst, J., Döllinger, M., and Becker, S., 2021, “The Mechanisms of Harmonic Sound Generation During Phonation: A Multi-Modal Measurement-Based Approach,” J. Acoust. Soc. Am., 150(5), pp. 3485–3499. doi: 10.1121/10.0006974
[12] Motie-Shirazi, M., Zañartu, M., Peterson, S. D., Mehta, D. D., Kobler, J. B., Hillman, R. E., and Erath, B. D., 2019, “Toward Development of a Vocal Fold Contact Pressure Probe: Sensor Characterization and Validation Using Synthetic Vocal Fold Models,” Appl. Sci., 9(15), p. 3002. doi: 10.3390/app9153002
[13] Motie-Shirazi, M., Zañartu, M., Peterson, S. D., and Erath, B. D., 2021, “Vocal Fold Dynamics in a Synthetic Self-Oscillating Model: Contact Pressure and Dissipated-Energy Dose,” J. Acoust. Soc. Am., 150(1), pp. 478–489. doi: 10.1121/10.0005596
[14] Taylor, C. J., 2018, “Internal Deformation Measurements and Optimization of Synthetic Vocal Fold Models,” M.S. thesis, Brigham Young University, Provo, UT.
[15] Smith, S. L., and Thomson, S. L., 2013, “Influence of Subglottic Stenosis on the Flow-Induced Vibration of a Computational Vocal Fold Model,” J. Fluids Struct., 38, pp. 77–91. doi: 10.1016/j.jfluidstructs.2012.11.010
[16] Shurtz, T. E., and Thomson, S. L., 2013, “Influence of Numerical Model Decisions on the Flow-Induced Vibration of a Computational Vocal Fold Model,” Comput. Struct., 122, pp. 44–54. doi: 10.1016/j.compstruc.2012.10.015
[17] Latifi, N., Heris, H. K., Thomson, S. L., Taher, R., Kazemirad, S., Sheibani, S., Li-Jessen, N. Y. K., Vali, H., and Mongeau, L., 2016, “A Flow Perfusion Bioreactor System for Vocal Fold Tissue Engineering Applications,” Tissue Eng. Part C Methods, 22(9), pp. 823–838. doi: 10.1089/ten.tec.2016.0053
[18] ADINA R&D, Inc., 2021, “ADINA Theory and Modeling Guide Volume I: ADINA,” ADINA R&D, Inc., Watertown, MA.
[19] ADINA R&D, Inc., 2021, “ADINA Theory and Modeling Guide Volume III: CFD & FSI,” ADINA R&D, Inc., Watertown, MA.
[20] Scherer, R. C., Shinwari, D., De Witt, K. J., Zhang, C., Kucinschi, B. R., and Afjeh, A. A., 2001, “Intraglottal Pressure Profiles for a Symmetric and Oblique Glottis With a Divergence Angle of 10 Degrees,” J. Acoust. Soc. Am., 109(4), pp. 1616–1630. doi: 10.1121/1.1333420
[21] Vaterlaus, A. C., 2020, “Development of a 3D Computational Vocal Fold Model Optimization Tool,” M.S. thesis, Brigham Young University, Provo, UT.

