Computation of physiological human vocal fold parameters by mathematical optimization of a biomechanical model

Anxiong Yang; Michael Stingl; David A Berry; Jörg Lohscheller; Daniel Voigt; Ulrich Eysholdt; Michael Döllinger

doi:10.1121/1.3605551

2011 Aug;130(2):948–964. doi: 10.1121/1.3605551

PMCID: PMC3195891 PMID: 21877808

Abstract

With the use of an endoscopic, high-speed camera, vocal fold dynamics may be observed clinically during phonation. However, observation and subjective judgment alone may be insufficient for clinical diagnosis and documentation of improved vocal function, especially when the laryngeal disease lacks any clear morphological presentation. In this study, biomechanical parameters of the vocal folds are computed by adjusting the corresponding parameters of a three-dimensional model until the dynamics of both systems are similar. First, a mathematical optimization method is presented. Next, model parameters (such as pressure, tension and masses) are adjusted to reproduce vocal fold dynamics, and the deduced parameters are physiologically interpreted. Various combinations of global and local optimization techniques are attempted. Evaluation of the optimization procedure is performed using 50 synthetically generated data sets. The results show sufficient reliability, including 0.07 normalized error, 96% correlation, and 91% accuracy. The technique is also demonstrated on data from human hemilarynx experiments, in which a low normalized error (0.16) and high correlation (84%) values were achieved. In the future, this technique may be applied to clinical high-speed images, yielding objective measures with which to document improved vocal function of patients with voice disorders.

INTRODUCTION

In recent years, voice scientists have utilized high-speed and multiple-camera imaging techniques to intensify their study of vocal fold dynamics. One important part of normal vocal fold vibration is the traveling wave which propagates along the vocal fold mucosa, commonly referred to as “mucosal wave propagation” presented by Hirano1 and Baer,2 and further developed by recent studies (e.g., Refs. 3, 4). The wave propagates not only in the transverse plane (lateral-longitudinal) but also in the sagittal plane (longitudinal-vertical) and coronal plane (lateral-vertical). The medial edge of the vocal fold exhibits wider and faster movements than more ventral and dorsal regions. Also, larger amplitudes of vibration are observed superiorly along the medial surface of the folds, as compared to more inferior regions.3, 5

Healthy vocal folds vibrate more symmetrically and periodically than vocal folds with pathologies.6 In clinical diagnosis, morphological pathologies of the larynx such as unilateral vocal fold polyps can be visually observed using a standard endoscope. However, some pathologies may not exhibit static, morphologic alterations, but only manifest themselves dynamically during phonation, e.g., functional dysphonias.7 Clinically, such voice disorders may exhibit left-right asymmetries or non-periodic oscillations, and usually result in a hoarse voice.8 These asymmetric and irregular vocal fold vibrations may be induced by pathological changes in biomechanical properties of vocal fold tissues.6, 7, 8, 9, 10, 11, 12

To determine whether an acoustic voice signal is abnormal, it is evaluated with respect to its perceived vocal quality, variability, pitch, and loudness.13 Such parameters may be dependent on subglottal pressure, vocal fold posture, tension,11, 14 and morphology.15 Common and widespread methods for diagnosing laryngeal diseases are generally classified into subjective and objective categories. Trained listeners use running speech for auditory assessment of voice quality, which is a subjective method capturing only the symptomatic information of a laryngeal disease.16 Indeed, discrepancies between auditory and laryngoscopic findings may occur.17 Moreover, the diagnosis and differentiation of voice disorders are often marked by ambiguity.9 Conventional visual methods like stroboscopy (high spatial resolution) and videokymography [high temporal resolution: approx. 8000 frames per second (fps)] may assist with clinical diagnosis. However, stroboscopy is limited by low temporal resolution,18 and videokymography is limited by low spatial resolution (i.e., a single line of the sensor data).19 Not surprisingly, the increased temporal and spatial resolution offered by high-speed (HS) imaging of the vocal folds during phonation may further enhance clinical diagnosis.7, 20 Clinically, this is performed by using an endoscopic HS digital imaging technique (2000–4000 fps).21, 22, 23

Currently, analysis of HS recordings serves as a basis for objective and quantitative measures of normal and pathological vocal fold vibrations.21, 23 However, so far endoscopic in vivo HS recordings only permit analysis of horizontal vocal fold dynamics along the vocal fold edge.21 While a recent attempt has been made to analyze vertical dynamics,21 most of the biomechanical characteristics of three-dimensional (3D) mucosal wave propagation during phonation (i.e., medio-lateral, inferior-superior, and anterior-posterior)4, 5 are not captured by endoscopic imaging.3 Such 3D characteristics were first detected and investigated in in vitro laryngeal experiments.4, 24, 25 For deeper analysis of such vibration patterns, a biomechanical numerical 3D-multi-mass-model (3DM)26 was proposed. It was designed to simulate not only lateral and longitudinal vocal fold vibrations, but also vibrations in the vertical direction. The 3DM was a vertical extension of the previous two-dimensional multi-mass-model (2DM),20 and was introduced in detail in Ref. 26. Numerical modeling of vocal fold dynamics and physiological interpretation of the biomechanical parameter values may help to increase our understanding of both normal and pathological voice production.27 For example, all the measured aspects of vocal fold dynamics (including amplitudes of vibration, phase delays, velocities, and accelerations across a large region of tissue locations) were evaluated to reveal viscoelastic properties of vocal fold tissues.26 Significantly, even biomechanical parameters of pathological vocal folds may be deduced using the 3DM. Thus, the derived parameters may serve as an objective measure to quantify the physiological states of vocal fold vibration. For example, observed left-right asymmetries can be detected and further analyzed by comparing the resultant bilateral biomechanical parameters of the model. Therefore, the 3DM provides a convenient way to describe the biomechanical properties of measured 3D vocal fold dynamics.

In recent attempts using the two-mass-models (2MM)7, 28, 29 and the 2DM,20, 30 only 1D trajectories at one or more specific locations on the vocal fold edges along the longitudinal direction were fitted. Classification schemes between healthy and pathological vocal fold vibrations were presented. However, such models considered the vocal fold oscillations only in the lateral dimension. The corresponding dynamic behaviors in the remaining dimensions were neglected because of the restrictions of the models. In this work, a 3D modeling approach is proposed that can reproduce the 3D vibrations of the entire medial surface of the vocal fold. This can be regarded as an important step toward linking the classification of the various 3D vocal fold vibratory modes with the production of the acoustic voice signal in future studies.

In order to refine the quantification of the biomechanical properties of human vocal folds and to support clinical diagnosis, the goal is to adapt the 3DM for both in vitro4, 31, 32, 33 and in vivo 3D HS recordings. To this end, an optimization procedure with a corresponding objective function for the 3DM adaptation is presented. Modifications of the 3DM are imposed by expressing the model parameters in terms of their initial values through optimization factors, which scale the spring constants, masses, subglottal pressure, and rest positions, respectively. The objective function is the normalized error between the model-generated 3D trajectories and the trajectories to be fitted, which is to be minimized. The non-convex objective function contains a large number of local minima, which complicate the optimization procedure.7 Hence, a combination of global and local optimization techniques is applied, e.g., Particle Swarm Optimization,34 simulated annealing,35 and Powell’s direction set method.36 In order to yield appropriate results, progressive optimization sub-procedures are implemented along each side, each coronal cross section, and each transverse plane within the 3DM. The optimization strategy focuses on refinement, wherein gradually increasing numbers of individual mass elements are taken into account. Owing to the characteristics of 3D vocal fold motion, special weighting coefficients for different dimensions, cross sections, and planes are considered. Overall, the optimization methodology becomes more important as the complexity of the 3D modeling increases. The optimization procedure is verified on 50 synthetically generated vocal fold vibrations encompassing five well-known glottal closure types.37, 38, 39 Finally, as an essential basis for future application of the 3D modeling to different laryngeal data (excised or in vivo), the optimization procedure is demonstrated on data from an in vitro experiment. Once the biomechanical parameters have been derived using this method, an interpretable physiological representation of the vocal fold dynamics is presented and discussed.

METHODS

3D biomechanical model

For modeling realistic 3D vocal fold dynamics a 3D-multi-mass-model was previously proposed.26 It consists of five transverse planes (from inferior to superior) with five coronal cross-sections (from dorsal to ventral). The subglottal pressure P^sub acts as a driving force for air flowing in the inferior to superior direction, causing vocal fold oscillations. Figure 1 shows a 3D view of the 3DM.

Schematic representation of the 3DM for biomechanical modeling of human vocal fold dynamics. Every mass element is elastically connected to a rigid body with an anchor spring. Moreover, at each side every mass is connected to its adjacent masses with springs in vertical and longitudinal directions. The indices denote the different planes s and columns i for the mass elements.

The model parameters serve as an approximation of tissue properties of the vocal folds, i.e., distributions of masses $m_{i, s}$, dampings $r_{i, s}$, and different spring stiffness coefficients $k_{i, s}^{a}$ (anchor spring), $k_{i, s}^{υ}$ (vertical spring), and $k_{i, s}^{l}$ (longitudinal spring). Index s = 1,…, 5 numbers the transverse planes from inferior to superior. Index i = 1,…, 5 labels the mass elements on the right side from dorsal to ventral, and i = 6,…, 10 those on the left side (see Fig. 1).

The biomechanical model can be described by a system of 50 ordinary differential equations (ODEs). For each mass element $m_{i, s}$ the following differential equation holds:

m_{i, s} {\ddot{x}}_{i, s} = {\vec{F}}_{i, s}^{A} + {\vec{F}}_{i, s}^{V} + {\vec{F}}_{i, s}^{L} + {\vec{F}}_{i, s}^{C} + {\vec{F}}_{i, s}^{D} .

(1)

The 3D position of each mass element is denoted as $x_{i, s} = {[x_{i, s}, y_{i, s}, z_{i, s}]}^{T}$. Its first and second derivatives with respect to time are the velocity ${\dot{x}}_{i, s}$ and acceleration ${\ddot{x}}_{i, s}$, respectively. The applied forces acting on each mass element are defined as follows.

(1) The anchor force ${\vec{F}}_{i, s}^{A}$ is due to the anchor spring $k_{i, s}^{a}$ and the damping $r_{i, s}^{a}$.
(2) The vertical coupling force ${\vec{F}}_{i, s}^{V}$ arises from the vertical spring stiffness $k_{i, s}^{υ}$ and the damping coefficient $r_{i, s}^{υ}$.
(3) The longitudinal coupling force ${\vec{F}}_{i, s}^{L}$ is caused by the longitudinal spring stiffness $k_{i, s}^{l}$ and the corresponding damping coefficient $r_{i, s}^{l}$.
(4) The collision impact force ${\vec{F}}_{i, s}^{C}$ is derived from the collision restoring spring $k_{i, s}^{c}$.
(5) The driving force ${\vec{F}}_{i, s}^{D}$ is generated by the glottal flow. It is derived from the subglottal pressure P^sub and the corresponding effective area using Bernoulli’s Law.

The different damping coefficients $r_{i, s}^{a}, r_{i, s}^{l}, r_{i, s}^{υ}$ depend on $m_{i, s}$ and $k_{i, s}^{a}$ of the adjacent mass elements. The collision impact force ${\vec{F}}_{i, s}^{C}$ plays a vital role in phonation, especially for higher resonant frequency components.40 With the aid of the 3DM, both small-amplitude and large-amplitude oscillations are modeled. Nonlinearities of the deflection of muscles and ligaments cannot be ignored at large amplitudes of vibration and at high subglottal pressure.41, 42 As underlined by Titze,43 human tissue does not exhibit linear stiffness characteristics. Also, Titze and Durham44 demonstrated that the fundamental frequency is significantly affected by nonlinearity in the stiffness. Therefore, to incorporate the variation of fundamental frequency of large/small oscillations with subglottal pressure into the 3DM, the nonlinear coefficient η for spring stiffness is set to 100 cm⁻².41, 45

The detailed mathematical definitions of the model and the forces are given in Ref. 26. The ODE system is solved using the Runge-Kutta algorithm with a step size of 0.25 ms.7
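As a minimal sketch of this kind of time integration, the following Python code advances a single damped mass-spring oscillator with a fixed step of 0.25 ms. It assumes the classical fourth-order Runge-Kutta scheme (the text does not state which variant is used), and the mass, stiffness, and amplitude values are hypothetical, chosen only so that the oscillator has a 100 Hz natural frequency. It is not the 3DM itself, which couples 50 such elements and adds collision and driving forces.

```python
import math

def rk4_step(f, t, y, dt):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def oscillator(m, k, r):
    """Right-hand side of m*x'' = -k*x - r*x', written as a first-order system [x, v]."""
    def f(t, y):
        x, v = y
        return [v, (-k * x - r * v) / m]
    return f

def simulate(m, k, r, x0, v0, dt, n_steps):
    """Integrate the oscillator and return the list of [x, v] states, including the start."""
    f = oscillator(m, k, r)
    t, y = 0.0, [x0, v0]
    states = [y]
    for _ in range(n_steps):
        y = rk4_step(f, t, y, dt)
        t += dt
        states.append(y)
    return states

DT = 0.25e-3                        # step size of 0.25 ms, as in the text
M = 1.0e-4                          # hypothetical mass (kg)
F0 = 100.0                          # hypothetical natural frequency (Hz)
K = M * (2 * math.pi * F0) ** 2     # stiffness giving that frequency
traj = simulate(M, K, 0.0, 1.0e-3, 0.0, DT, 40)  # 40 steps = one 10 ms period
```

With these toy values one oscillation period spans 40 steps, so the undamped mass returns very close to its starting displacement at the last step.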

Automatic parameter optimization

In order to match the model dynamics with the dynamics of human vocal folds, the model parameters must be optimized. Compared to previous studies applying 2MM41 and 2DM,20 the 3DM has a high dimensionality captured by the spring stiffnesses (50 $k_{i, s}^{a}$, 60 $k_{i, s}^{l}$, 60 $k_{i, s}^{υ}$), masses (50 $m_{i, s}$), damping coefficients (50 $r_{i, s}^{a}$, 60 $r_{i, s}^{l}$, 60 $r_{i, s}^{υ}$), rest positions (50 $x_{i, s}^{r}$ in 3D), subglottal pressure P^sub, and boundary conditions (including the vertical anchors and fixed positions at the ventral/dorsal ends of the vocal folds). Overall, the 3DM yields 531 degrees of freedom.

Initialization of model and optimization parameters

First, a default model configuration is set. That is, standard model parameters (e.g., stiffness, mass, thickness, rest position, etc.) are chosen to serve as an initial approximation of biomechanical properties of the vocal fold.26, 46 The initial values for the model parameters are chosen not only on the basis of previous 2MM41 and 2DM20, 30 configurations, but also on the basis of propagation properties of the mucosal wave, as observed in human laryngeal experiments.4, 25 A detailed description of the applied standard model parameters was presented previously.26 Thus, the configuration of the 3DM is imposed by expressing the model parameters $P^{sub}, m_{i, s}, k_{i, s}, x_{i, s}^{r}$ in terms of the previously introduced standard model parameters ${\tilde{P}}^{sub}, {\tilde{m}}_{i, s}, {\tilde{k}}_{i, s}, {\tilde{x}}_{i, s}^{r}$. In order to reduce the dimensionality of the parameter space and to decrease computational costs, certain model parameters were coupled together by introducing an optimization parameter set30 $Q : = (Q_{p}, Q_{i, s}, Q_{r})$, with index i, s for each mass element $m_{i, s}$ within the 3DM. The parameters $Q_{i, s} : = (Q_{i, s}^{a}, Q_{i, s}^{l}, Q_{i, s}^{υ})$ influence the stiffness of the anchor spring $k_{i, s}^{a}$, the longitudinal spring $k_{i, s}^{l}$, and the vertical spring $k_{i, s}^{υ}$, respectively. Using the optimization parameters $Q_{i, s}^{a}$, modifications of mass and anchor spring parameters (i.e., lateral stiffness) can be performed simultaneously, because the elements are mass-spring systems and the highest vibratory displacements occur in the lateral direction.4, 26 In addition, $Q_{p}$ affects the subglottal pressure P^sub, and $Q_{r} : = (Q_{r}^{i, s, x}, Q_{r}^{i, s, y}, Q_{r}^{i, s, z})$ modifies the rest position $x_{i, s}^{r}$ of each mass element. Overall, the various model parameters can be derived completely from the initial standard model parameters:7, 20

P^{sub} = {\tilde{P}}^{sub} \cdot Q_{p},

(2)

m_{i, s} = {\tilde{m}}_{i, s} / Q_{i, s}^{a},

(3)

k_{i, s}^{ξ} = {\tilde{k}}_{i, s}^{ξ} \cdot Q_{i, s}^{ξ}, ξ \in \{a, l, υ\},

(4)

x_{i, s}^{r} = {[{\tilde{x}}_{i, s}^{r} \cdot Q_{r}^{i, s, x}, {\tilde{y}}_{i, s}^{r} \cdot Q_{r}^{i, s, y}, {\tilde{z}}_{i, s}^{r} \cdot Q_{r}^{i, s, z}]}^{T} .

(5)
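The scaling in Eqs. 2–5 can be illustrated with a short Python sketch. The `MassElement` container, the function names, and all numeric values are hypothetical and serve only to show how one set of factors Q maps standard parameters to adapted ones; note that the anchor factor divides the mass and multiplies the anchor stiffness at the same time.

```python
from dataclasses import dataclass

@dataclass
class MassElement:
    """Hypothetical container for the parameters of one mass element."""
    mass: float
    k_anchor: float
    k_long: float
    k_vert: float
    rest: tuple  # rest position (x, y, z)

def scale_element(std, q_a, q_l, q_v, q_r):
    """Apply Eqs. 3-5 to one element; q_r is a triple of rest-position factors."""
    return MassElement(
        mass=std.mass / q_a,               # Eq. 3: mass is divided by Q^a
        k_anchor=std.k_anchor * q_a,       # Eq. 4 with xi = a
        k_long=std.k_long * q_l,           # Eq. 4 with xi = l
        k_vert=std.k_vert * q_v,           # Eq. 4 with xi = v (vertical)
        rest=tuple(c * q for c, q in zip(std.rest, q_r)),  # Eq. 5
    )

def scale_pressure(p_std, q_p):
    """Apply Eq. 2 to the standard subglottal pressure."""
    return p_std * q_p

# Hypothetical standard values, for illustration only.
standard = MassElement(mass=0.01, k_anchor=8.0, k_long=2.0, k_vert=1.0,
                       rest=(0.1, 0.5, 0.2))
adapted = scale_element(standard, q_a=2.0, q_l=1.5, q_v=0.5, q_r=(1.0, 1.0, 2.0))
```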

Moreover, the damping coefficients $r_{i, s}$ can be modified as ${\tilde{r}}_{i, s} \cdot Q_{d}$. The corresponding scaling factor ${\tilde{Q}}_{d}$ is initialized to 1, since the damping coefficients were initialized from the relationship among the standard values of stiffness and mass and the damping ratio.26 However, the damping factors are neglected in the validation of the optimization, because their effects on the model-generated dynamics are small compared to those of stiffness and mass.26 After this reduction in dimensionality, 491 scaling factors Q remain to be optimized.

Since the model contains 50 simple mass-spring-oscillators with mass $m_{i, s}$ and reciprocal coupling spring constant $k_{i, s}$, the fundamental frequency $f_{i, s}$ for each mass can be approximated as follows:7

f_{i, s} = \frac{1}{2 π} \sqrt{\frac{k_{i, s}}{m_{i, s}}} .

(6)

To obtain the initial parameters ${\tilde{Q}}_{i, s}$, Eq. 6 can be rewritten:7

{\tilde{Q}}_{i, s}^{ξ} = 2 π f_{i, s} \sqrt{\frac{{\tilde{m}}_{i, s}}{{\tilde{k}}_{i, s}^{ξ}}},

(7)

which shows that the optimization parameter $Q_{i, s}$ primarily affects the fundamental frequency of oscillation.28
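Equation 7 follows from inserting Eqs. 3 and 4 into Eq. 6: the ratio k/m becomes (k̃/m̃)·Q², so the frequency scales linearly with Q. The following Python sketch checks this round trip numerically for the anchor spring. The function names and the standard values are hypothetical.

```python
import math

def natural_frequency(k, m):
    """Eq. 6: fundamental frequency of a single mass-spring oscillator."""
    return math.sqrt(k / m) / (2 * math.pi)

def initial_q(f_target, m_std, k_std):
    """Eq. 7: scaling factor that moves the standard oscillator to f_target."""
    return 2 * math.pi * f_target * math.sqrt(m_std / k_std)

# Hypothetical standard mass and anchor stiffness, for illustration only.
m_std, k_std = 1.0e-4, 20.0
q = initial_q(120.0, m_std, k_std)

# Scaling stiffness by q (Eq. 4) and mass by 1/q (Eq. 3) should give 120 Hz.
f_check = natural_frequency(k_std * q, m_std / q)
```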

The optimization parameter $Q_{p}$ mainly influences the oscillation amplitude.30 An appropriate value of the initial optimization parameter ${\tilde{Q}}_{p}$ yields smaller amplitude differences between the curves $c_{i, s} [n]$ to be fitted and the model-generated curves $c_{i, s}^{M} [n]$, where n = 1,…, N denotes the frame number. Namely, by comparing the amplitude differences under different values (2–35 cm H₂O) of subglottal pressure p, the initial value of ${\tilde{Q}}_{p}$ is determined:

{\tilde{Q}}_{p} : = \frac{1}{{\tilde{P}}^{sub}} \cdot \arg \min_{p} (\frac{1}{5 \cdot 10} \sum_{s = 1}^{5} \sum_{i = 1}^{10} {∥ A_{i, s} - A_{i, s}^{M} (p) ∥}_{2}),

(8)

A_{i, s} : = \frac{1}{2} \cdot {∥ \max_{n} (c_{i, s} [n]) - \min_{n} (c_{i, s} [n]) ∥}_{2},

(9)

A_{i, s}^{M} (p) : = \frac{1}{2} \cdot {∥ \max_{n} (c_{i, s}^{M} [n]) - \min_{n} (c_{i, s}^{M} [n]) ∥}_{2},

(10)

where $A_{i, s}$ denotes the amplitude of the experimental 3D data and $A_{i, s}^{M} (p)$ is the amplitude of the 3D curves generated by the 3DM with a specified subglottal pressure value p.
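A minimal Python sketch of the search in Eqs. 8–10 is given below. The real procedure would run the 3DM for each candidate pressure; here a hypothetical stand-in `toy_model` (a single sinusoidal trajectory whose lateral amplitude grows linearly with pressure) takes its place, and the standard pressure of 10 cm H₂O is likewise an assumption made only for the example.

```python
import math

def amplitude(curve):
    """Eqs. 9/10: half the Euclidean norm of the per-dimension (max - min) vector.

    curve is a list of [x, y, z] frames.
    """
    spans = [max(comp) - min(comp) for comp in zip(*curve)]
    return 0.5 * math.sqrt(sum(s * s for s in spans))

def initial_q_p(target_curves, model_curves_for, pressures, p_std):
    """Eq. 8: pick the pressure with the smallest mean amplitude difference."""
    target_amps = [amplitude(c) for c in target_curves]

    def mean_difference(p):
        model_amps = [amplitude(c) for c in model_curves_for(p)]
        diffs = [abs(a - am) for a, am in zip(target_amps, model_amps)]
        return sum(diffs) / len(diffs)

    best_p = min(pressures, key=mean_difference)
    return best_p / p_std

def toy_model(p, n_frames=40):
    """Hypothetical stand-in for the 3DM: one trajectory, amplitude proportional to p."""
    return [[[0.01 * p * math.sin(2 * math.pi * n / n_frames), 0.0, 0.0]
             for n in range(n_frames)]]

target = toy_model(8)                      # pretend the data came from p = 8 cm H2O
q_p = initial_q_p(target, toy_model, range(2, 36), p_std=10.0)
```

The grid `range(2, 36)` mirrors the 2–35 cm H₂O interval mentioned in the text with an assumed spacing of 1 cm H₂O.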

Additionally, the optimization parameter ${\tilde{Q}}_{r}$ is initialized to 1, since the rest positions were initialized with the standard values.

Objective function

In general, the optimization of the 3DM compares the dynamic properties between the adapted model-generated 3D trajectories and the experimental 3D trajectories, such as amplitudes and corresponding correlations.

In this work, unlike the definition of the objective function in the 2DM,20 which also used a glottis area consistency measure, a trajectory consistency measure in 3D is applied. Additionally, previous laboratory larynx experiments4, 24, 47 revealed that the displacements along three orthogonal directions are highest in the superior region of the vocal folds. Moreover, the vocal fold displacements in the lateral direction are significantly higher than those in the other directions, while the longitudinal displacements are the smallest. In consideration of such characteristics, different weighting coefficients for each mass element in each dimension, each coronal cross section, and each transverse layer are taken into account within the computation of the objective function, so that the main components of vocal fold dynamics can be determined. For the purpose of minimizing the error between the experimental 3D trajectories $c_{i, s} [n]$ and the theoretical trajectories $c_{i, s}^{M} [n]$ of the 3DM, the following objective function Γ, which measures the error, is suggested:

Γ (Q) : = \sum_{s = 1}^{5} \sum_{i = 1}^{10} g_{i}^{m} \cdot g_{s}^{p} \cdot [\frac{\sum_{n = 1}^{N} {∥ g_{μ}^{d} \cdot (c_{i, s} [n] - c_{i, s}^{M} [n]) ∥}_{2}}{\sum_{n = 1}^{N} {∥ g_{μ}^{d} \cdot c_{i, s} [n] ∥}_{2}}],

(11)

with

\begin{matrix} c_{i, s} = {[c_{i, s, x}, c_{i, s, y}, c_{i, s, z}]}^{T}, \\ c_{i, s}^{M} = {[c_{i, s, x}^{M}, c_{i, s, y}^{M}, c_{i, s, z}^{M}]}^{T}, \end{matrix}

(12)

where μ denotes the x- (lateral), y- (longitudinal), and z-dimension (vertical), respectively. i = 1,…, 10 denotes the number of the mass elements at each plane. s = 1,…, 5 denotes the number of the planes. Equation 11 corresponds to the energy of the error normalized to the energy of the experimental trajectories.30 Hence, the objective function Γ can be regarded as a measure for the normalized relative error.

The weighting coefficients for 3D trajectories are introduced as $g_{μ}^{d}$, $g_{i}^{m}$, and $g_{s}^{p}$. They refer to different dimensions (superscript d), different mass elements (superscript m), and different planes (superscript p), respectively. All of the weighting coefficients lie in the range between 0 and 1. Observations of the mucosal wave dynamics of the vocal folds in hemilarynx experiments3, 4, 24 and excised canine larynx experiments47, 48 show that the lateral displacement (x-component) is in general highest. The corresponding local vertical displacement maxima (z-component) predominantly range between longitudinal (y-component) and lateral values.4, 25 For these reasons, the weighting $g_{μ}^{d}$ is derived as

g_{μ}^{d} : = \frac{\sum_{s = 1}^{5} \sum_{i = 1}^{10} [\max_{n} (c_{i, s, μ} [n]) - \min_{n} (c_{i, s, μ} [n])]}{\sum_{s = 1}^{5} \sum_{i = 1}^{10} {∥ \max_{n} (c_{i, s} [n]) - \min_{n} (c_{i, s} [n]) ∥}_{1}},

(13)

where n = 1,…, N.

In addition, the 3D displacements differ along the length of the vocal folds. Usually the highest amplitude of displacement is found midway between the anterior and posterior ends of the fold.4, 49 The 3D displacements at the anterior and posterior ends of the vocal folds are lower, being almost fixed during oscillation.24, 50 Larger and faster movements occur in the central section of the vocal fold than near the glottal extremities.5 According to these properties, we assume the weighting $g_{i}^{m}$ corresponding to each mass element as follows:

g_{i}^{m} : = \frac{\sum_{s = 1}^{5} (\max_{n, n'} ∥ c_{i, s} [n] - c_{i, s} [n'] ∥_{2})}{\sum_{i = 1}^{10} \sum_{s = 1}^{5} (\max_{n, n'} ∥ c_{i, s} [n] - c_{i, s} [n'] ∥_{2})},

(14)

where n′ = n + 1 denotes the frame number. Here, n = 1,…, N – 1.

From earlier studies of hemilarynx experiments, it was concluded that the superior part of the vocal fold exhibits greater amplitudes of vibration.47 Therefore, for optimization, 3D movements in the upper planes receive relatively more attention than movements in the lower planes. The weighting $g_{s}^{p}$ for each plane from inferior to superior is defined as follows:

g_{s}^{p} : = \frac{\sum_{i = 1}^{10} (\max_{n, n'} ∥ c_{i, s} [n] - c_{i, s} [n'] ∥_{2})}{\sum_{s = 1}^{5} \sum_{i = 1}^{10} (\max_{n, n'} ∥ c_{i, s} [n] - c_{i, s} [n'] ∥_{2})},

(15)

where n = 1,…, N – 1.
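The three weightings in Eqs. 13–15 can be sketched in Python as follows. The data layout (`c[s][i]` is a list of `[x, y, z]` frames for mass i in plane s) and the synthetic sinusoidal trajectories are assumptions made for illustration; they are not the hemilarynx data. By construction each of the three weight vectors sums to 1.

```python
import math

def spans(curve):
    """Per-dimension peak-to-peak range of one trajectory (list of [x, y, z] frames)."""
    return [max(comp) - min(comp) for comp in zip(*curve)]

def max_step(curve):
    """Largest Euclidean displacement between consecutive frames (n' = n + 1)."""
    return max(math.dist(a, b) for a, b in zip(curve, curve[1:]))

def weights(c):
    """Return (g_d, g_m, g_p) following Eqs. 13, 14, and 15."""
    n_planes, n_masses = len(c), len(c[0])
    span = [[spans(c[s][i]) for i in range(n_masses)] for s in range(n_planes)]
    total_span = sum(sum(sp) for row in span for sp in row)   # sum of 1-norms
    g_d = [sum(span[s][i][mu] for s in range(n_planes) for i in range(n_masses))
           / total_span for mu in range(3)]
    step = [[max_step(c[s][i]) for i in range(n_masses)] for s in range(n_planes)]
    total_step = sum(sum(row) for row in step)
    g_m = [sum(step[s][i] for s in range(n_planes)) / total_step
           for i in range(n_masses)]
    g_p = [sum(step[s]) / total_step for s in range(n_planes)]
    return g_d, g_m, g_p

def toy_curve(scale, n_frames=40):
    """Synthetic trajectory with a fixed 0.5 : 0.1 : 0.4 ratio between x, y, and z."""
    return [[0.5 * scale * math.sin(2 * math.pi * n / n_frames),
             0.1 * scale * math.sin(2 * math.pi * n / n_frames),
             0.4 * scale * math.sin(2 * math.pi * n / n_frames)]
            for n in range(n_frames)]

# Amplitude grows with the plane index s, so upper planes receive more weight.
c = [[toy_curve((s + 1) * (1 + i % 5)) for i in range(10)] for s in range(5)]
g_d, g_m, g_p = weights(c)
```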

The weighting coefficients are computed based on 3D trajectories from in vitro hemilarynx experiments:4

g_{μ}^{d} = (0.5, 0.1, 0.4), g_{s}^{p} = (0.1, 0.1, 0.3, 0.3, 0.2),

(16)

g_{i = 1, . . ., 5}^{m} = g_{i = 6, . . ., 10}^{m} = (0.095, 0.1, 0.11, 0.1, 0.095) .

(17)

These values are used in the following validation and application of the optimization procedure.
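With the weights of Eqs. 16 and 17 fixed, the objective function of Eq. 11 can be sketched in a few lines of Python. The trajectory layout (`c[s][i]` as a list of `[x, y, z]` frames) and the synthetic test curves are assumptions for illustration only. Because the mass weights and the plane weights each sum to 1, a model that underestimates every trajectory by 10% yields Γ = 0.1 in this sketch.

```python
import math

G_D = (0.5, 0.1, 0.4)                           # Eq. 16, dimensions x, y, z
G_P = (0.1, 0.1, 0.3, 0.3, 0.2)                 # Eq. 16, planes s = 1..5
G_M = (0.095, 0.1, 0.11, 0.1, 0.095) * 2        # Eq. 17, masses i = 1..10

def weighted_norm(v):
    """Euclidean norm of a 3D vector after weighting each dimension by G_D."""
    return math.sqrt(sum((g * x) ** 2 for g, x in zip(G_D, v)))

def objective(c, c_model):
    """Eq. 11: weighted, normalized error between data c and model output c_model."""
    total = 0.0
    for s in range(5):
        for i in range(10):
            num = sum(weighted_norm([a - b for a, b in zip(fa, fb)])
                      for fa, fb in zip(c[s][i], c_model[s][i]))
            den = sum(weighted_norm(fa) for fa in c[s][i])
            total += G_M[i] * G_P[s] * num / den
    return total

def toy_curve(scale, n_frames=40):
    """Synthetic sinusoidal trajectory used only to exercise the function."""
    return [[scale * math.sin(2 * math.pi * n / n_frames),
             0.2 * scale * math.sin(2 * math.pi * n / n_frames),
             0.8 * scale * math.cos(2 * math.pi * n / n_frames)]
            for n in range(n_frames)]

data = [[toy_curve(1.0 + 0.1 * s + 0.05 * i) for i in range(10)] for s in range(5)]
shrunk = [[[[0.9 * x for x in frame] for frame in curve] for curve in plane]
          for plane in data]

gamma_perfect = objective(data, data)       # identical trajectories
gamma_shrunk = objective(data, shrunk)      # model amplitudes 10% too small
```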

A second criterion for judging the quality of fit is the correlation coefficient κ. It measures how closely the shapes of the optimized 3D trajectories match the experimental 3D trajectories. Thus, the similarity and phases of the trajectories are taken into account:

\begin{matrix} κ : = \sum_{s = 1}^{5} \sum_{i = 1}^{10} [g_{i}^{m} \cdot g_{s}^{p} \cdot (g_{x}^{d} \cdot κ_{i, s, x} + g_{y}^{d} \cdot κ_{i, s, y} + g_{z}^{d} \cdot κ_{i, s, z})], \end{matrix}

(18)

with

κ_{i, s, μ} : = \frac{〈 c_{i, s, μ} [n], c_{i, s, μ}^{M} [n] 〉}{{∥ c_{i, s, μ} [n] ∥}_{2} {∥ c_{i, s, μ}^{M} [n] ∥}_{2}} \cdot 100 \% .

(19)

Now, the condition for the quality of the optimization procedure is defined as

(Γ < 0.2) \land (κ \geq 75 %) .

(20)

If this condition is fulfilled, the optimization quality can be regarded as sufficient. The lower bound of 75% for κ is based on the results of Wurzbacher et al.28 The upper bound of 0.2 for Γ is defined in accordance with the relative endoscopic HS image processing error.28
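The correlation measure of Eqs. 18 and 19 and the acceptance test of Eq. 20 can be sketched as follows. As written in Eq. 19, the per-dimension term is a normalized inner product (without mean removal), so it is insensitive to a uniform amplitude scaling of the model output. The trajectory layout and the synthetic curves are assumptions for illustration.

```python
import math

G_D = (0.5, 0.1, 0.4)                           # Eq. 16, dimensions x, y, z
G_P = (0.1, 0.1, 0.3, 0.3, 0.2)                 # Eq. 16, planes s = 1..5
G_M = (0.095, 0.1, 0.11, 0.1, 0.095) * 2        # Eq. 17, masses i = 1..10

def kappa_component(u, v):
    """Eq. 19: normalized inner product of two scalar time series, in percent."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 100.0 * dot / (norm_u * norm_v)

def kappa(c, c_model):
    """Eq. 18: weighted sum of the per-dimension correlations over all masses."""
    total = 0.0
    for s in range(5):
        for i in range(10):
            comps = list(zip(*c[s][i]))          # x, y, z series of the data
            comps_m = list(zip(*c_model[s][i]))  # x, y, z series of the model
            inner = sum(G_D[mu] * kappa_component(comps[mu], comps_m[mu])
                        for mu in range(3))
            total += G_M[i] * G_P[s] * inner
    return total

def fit_is_sufficient(gamma, kap):
    """Eq. 20: both the error bound and the correlation bound must hold."""
    return gamma < 0.2 and kap >= 75.0

def toy_curve(scale, n_frames=40):
    """Synthetic trajectory used only to exercise the functions."""
    return [[scale * math.sin(2 * math.pi * n / n_frames),
             0.2 * scale * math.sin(2 * math.pi * n / n_frames),
             0.8 * scale * math.cos(2 * math.pi * n / n_frames)]
            for n in range(n_frames)]

data = [[toy_curve(1.0 + 0.1 * s) for _ in range(10)] for s in range(5)]
flipped = [[[[-x for x in frame] for frame in curve] for curve in plane]
           for plane in data]

kappa_same = kappa(data, data)          # identical shapes
kappa_flipped = kappa(data, flipped)    # opposite phase in every dimension
```

For example, the hemilarynx result quoted in the abstract (Γ = 0.16, κ = 84%) satisfies `fit_is_sufficient(0.16, 84.0)`.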

Optimization algorithm

Since the resulting optimization problem is nonconvex,7 applying only local (gradient) optimization techniques will not yield sufficiently accurate results. Therefore, various combinations of global and local optimization algorithms are applied. During the optimization procedure, the global optimization algorithm is used to determine preliminary parameter values. Then, on the basis of the preliminary minimum, a local optimization algorithm is applied to find parameter values that are even closer to the global minimum of the objective function Γ. However, reaching the global minimum cannot be guaranteed in nonconvex optimization problems. To control whether the global optimization is invoked, the heuristic condition κ < 80% is defined. If this condition is fulfilled, both the global and local optimization algorithms are applied sequentially. Otherwise, only the local optimization algorithm is carried out. This switch condition is evaluated once at the beginning of each phase within each optimization sub-procedure, which is introduced in the next section.

For the selection of the optimization algorithm, conjugate gradient algorithms (e.g., Fletcher-Reeves) and variable metric algorithms (e.g., Broyden-Fletcher-Goldfarb-Shanno)36 have proven to be inadequate and relatively time-consuming because of the complex partial derivatives of the objective function. Thus, in this work, Particle Swarm Optimization (PSO) and Simulated Annealing (SA) are used as the more suitable global methods for minimizing the objective function. Briefly, PSO is a population-based stochastic optimization technique.34, 51 It was first used to model the social behavior of bird flocking, bee swarming, or fish schooling.52 Similar to Genetic Algorithms (GA),53 PSO uses many evolutionary computation techniques,54 including random searches, fitness values, iteration time, and so on. The system is initialized with a population of potential solutions and searches for optima by updating generations. Each individual (particle) traverses the problem space with its own velocity vector, which determines its next trajectory following the optimum particle. However, no evolution operators such as crossover and mutation are applied in PSO. Therefore, PSO has the advantage that its parameters are simple to control and robust to adjust. SA is a generic probabilistic metaheuristic for large-scale optimization problems, especially those in which a desired global extremum is hidden among many local extrema. A detailed description of Simulated Annealing can be found in Ingber.35 Accordingly, we choose a combination of PSO and SA to globally seek the minimum of the objective function, since in a study of metaheuristic methods54 Ercan confirmed that PSO-SA combinations outperform the basic PSO algorithm. A heuristic condition (Γ≥0.1)∨(κ≤90%) is defined as a switch to the PSO-SA combination. Namely, if it is fulfilled, the PSO is run first, followed by the SA (with the results of the PSO used as initial solutions for the SA). Otherwise, only the SA is executed. Moreover, for the local optimization algorithm, Powell’s Direction Set Method (PDSM)30 (which belongs to the class of conjugate direction algorithms) is adopted, which in this instance has proven to be faster and more stable than other optimization algorithms such as the Nelder-Mead algorithm.7

Compared to the use of any single algorithm (PSO, SA, or PDSM), the consecutive use of different optimization algorithms improves the consistency and stability in approximating the global minimum. However, it may also pose a computational burden. To better clarify the proposed optimization algorithm, a flow chart is sketched in Fig. 2. It is implemented in each phase of the optimization procedure, which is introduced in the following section.

Flow chart of the combined optimization algorithms. PSO, SA, and PDSM are applied to optimize the parameters Q to fit the model-generated dynamics $c_{i, s}^{M} [n]$ to the vocal fold vibrations $c_{i, s} [n]$. This is performed in each optimization phase (see Fig. 3).
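The two switch conditions described in the text can be written down as a small Python sketch. The function names and the returned list of algorithm labels are hypothetical, and the exact point at which the flow chart evaluates each condition is not specified here, so this is only one literal reading of the prose.

```python
def needs_global_search(kappa):
    """Switch for the global stage: applied when the correlation is below 80%."""
    return kappa < 80.0

def use_pso_before_sa(gamma, kappa):
    """Switch for the PSO-SA combination: (gamma >= 0.1) or (kappa <= 90%)."""
    return gamma >= 0.1 or kappa <= 90.0

def plan_phase(gamma, kappa):
    """Return the ordered list of algorithms to run in one optimization phase."""
    steps = []
    if needs_global_search(kappa):
        if use_pso_before_sa(gamma, kappa):
            steps.append("PSO")   # PSO result seeds the SA
        steps.append("SA")
    steps.append("PDSM")          # the local refinement always runs
    return steps

poor_fit = plan_phase(gamma=0.30, kappa=60.0)
good_fit = plan_phase(gamma=0.05, kappa=95.0)
```

If both conditions are evaluated on the same κ, any κ below 80% also satisfies κ ≤ 90%, so the SA-only branch of this sketch is reached only when the two checks use values from different points of the procedure.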

Progressive optimization procedure

In optimizing the 3DM, a progressive concept is introduced, wherein the search space consists of the following optimization parameters: $(Q_{p}, Q_{i, s}, Q_{r})$. Because each mass element is located in 3D, three aspects (plane, side, cross section) are taken into account, which affect the optimization of the 3DM. In accordance with various combinations among the three aspects, the so-called progressive optimization procedure consists of three main coarse optimization sub-procedures (Fig. 3). The basic idea of the optimization procedure is to gradually adapt the model parameters from a rough state to a more refined state, to overcome the difficulties of multidimensional optimization. In this work, the simple global optimization approach has proven to be inconsistent in fitting model-generated dynamics to experimental data. However, using a hierarchical coarse optimization, the global minimum can be consistently and stably approached.

General flow chart of the hybrid optimization procedure. $({\tilde{Q}}_{p}, {\tilde{Q}}_{i, s}, {\tilde{Q}}_{r})$ represent the initial optimization parameters. $({\hat{Q}}_{p}, {\hat{Q}}_{i, s}, {\hat{Q}}_{r})$ represent the best optimization parameters. There are three coarse sub-procedures and a fine optimization process. Each coarse sub-procedure has three phases. In this flow chart, m, n, l temporarily denote the index of the coarse sub-procedure, the phase, and the loop, respectively.

Within each sub-procedure, three phases are implemented one after another (see Fig. 4). Moreover, within each phase, the combination (Sec. 2B3) of the global optimization algorithm (PSO-SA) and the local optimization algorithm (PDSM) is implemented. Switching from the current phase to the next phase occurs when the objective function Γ falls below a predefined threshold or when the maximum iteration value is reached. The configuration of each individual optimization algorithm is detailed below:

(1) PSO. The number of particles is set to 100. The maximum number of iterations is 1000. The inertia weight of each particle is 0.73. The two acceleration constants of the particle, which represent its cognitive and social learning, are both set to 1.49.
(2) SA. The number of steps is set to 2. The maximum iteration number at each step is 300. The temperature is initialized to 10. The density of the randomly chosen points of the initial simplex is 0.1. For determining a local extremum, a so-called fractional tolerance is set to 0.001, which measures the difference between the objective function values in two consecutive steps. The non-linear constant is set to 2. The more nonlinear the objective function is, the lower the non-linear constant should be set.
(3) PDSM. The maximum iteration number is set to 600. A parabolic extrapolation algorithm (“Parabolic-Bracketing”) is used as the bracketing algorithm. To determine where an extremum value lies, a minimum distance between the end points of any bracketing interval is set to 0.5. The localization algorithm in PDSM is Brent’s algorithm.36 Similarly, a fractional tolerance is set to 0.001. In addition, a tolerance value which is used to check the exit condition for PDSM is set to 0.5. As a rule, the smaller the tolerance value, the more accurate the result of the PDSM algorithm. However, the algorithm also becomes more computationally demanding.

Block charts for the three coarse optimization sub-procedures. Each sub-procedure is divided into three phases. Along each row, the intercommunities are vertical, lateral, and longitudinal, respectively.

The value assignments are based on heuristic analysis and experience. More detailed descriptions of the applied algorithms can be found in Refs. 34, 36.
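The PSO settings listed above (inertia weight 0.73, acceleration constants 1.49) can be illustrated with a minimal sketch. The source gives only the settings, so the update rule below is assumed to be the textbook inertia-weight form, and `pso_minimize` and `toy_error` are hypothetical names. The toy objective merely stands in for the normalized error Γ, and the particle and iteration counts are reduced from the 100 and 1000 quoted above so that the example runs quickly.

```python
import random

# Settings quoted in the text. The update rule is the textbook
# inertia-weight PSO, which is an assumption about the variant used.
INERTIA = 0.73
C_COGNITIVE = 1.49
C_SOCIAL = 1.49


def pso_minimize(objective, bounds, n_particles=100, max_iter=1000, seed=0):
    """Minimize `objective` over the box `bounds` = [(lo, hi), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far

    for _ in range(max_iter):
        for k in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (INERTIA * vel[k][d]
                             + C_COGNITIVE * r1 * (best_pos[k][d] - pos[k][d])
                             + C_SOCIAL * r2 * (g_pos[d] - pos[k][d]))
                lo, hi = bounds[d]
                # Keep the particle inside the search box.
                pos[k][d] = min(max(pos[k][d] + vel[k][d], lo), hi)
            val = objective(pos[k])
            if val < best_val[k]:
                best_val[k], best_pos[k] = val, pos[k][:]
                if val < g_val:
                    g_val, g_pos = val, pos[k][:]
    return g_pos, g_val


def toy_error(q):
    # Hypothetical stand-in for the objective; its minimum 0 lies at q = (1, 1).
    return sum((x - 1.0) ** 2 for x in q)


best_q, best_err = pso_minimize(toy_error, [(0.5, 3.0), (0.5, 3.0)],
                                n_particles=20, max_iter=100)
```

In the procedure described here, the result of such a global search would then be handed to the SA and PDSM stages; that hand-over is not shown.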

Additionally, in each phase, control parameters $α_{ν} : = (α_{ν}^{a}, α_{ν}^{l}, α_{ν}^{υ}, α_{ν}^{r})$ serve as scale factors to control the modification of the optimization parameters Q:

$Q_{new} = α_{ν} \cdot Q_{old},$ (21)

where Q_old and Q_new are the optimization parameters before and after each phase, respectively. Thus, the optimization parameters can be continuously updated in each phase.

The concept of three sub-procedures and their individual conditions for switching to the next sub-procedure are introduced in the following sections. Each sub-procedure (coarse 1–3) is terminated if its cycle index reaches a recommended limit value of l = 10.

In addition, at the end of the optimization procedure a fine optimization process (Fig. 3) is implemented. In this process, all of the model parameters are one-to-one optimized, in order to further improve the optimization results.

Overall, in each phase of the optimization procedure, the objective function Γ either decreases or remains momentarily unchanged. Γ is bounded below by 0, its greatest lower bound, which indicates no error.

Sub-procedure coarse 1

The purpose of this sub-procedure is to roughly adapt the model dynamics to the experimental data. It is continuously performed until the following condition is no longer fulfilled:

$(Γ_{l} < Γ_{l - 1}) \land (κ < 92\%) \land (l < 10),$ (22)

where the subscript l denotes the index of the loop, so that l − 1 and l represent two consecutive loops. The threshold value of κ is defined as 92%, based on experience gained in former work.28 The sub-procedure is separated into three phases [Figs. 4a, 4b, 4c]:

(1)
Vertical optimization. Parameters for each plane are modified with the same factor α_ν, with ν = 1,…,5.
(2)
Lateral optimization. Parameters for each side are modified with the same factor α_ν, with
$ν = \begin{cases} 1 & \text{for } i = 1, \ldots, 5, \\ 2 & \text{for } i = 6, \ldots, 10. \end{cases}$ (23)
(3)
Longitudinal optimization. Parameters for each cross section are modified with the factor α_ν, with
$ν = \begin{cases} i & \text{for } i = 1, \ldots, 5, \\ i - 5 & \text{for } i = 6, \ldots, 10. \end{cases}$ (24)
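The three phases of coarse 1 differ only in how a mass element (i, s) is mapped to its shared scale factor. The sketch below illustrates this under stated assumptions: the vertical phase is read as ν = s (one factor per plane), the parameters are stored in a plain dictionary keyed by (i, s), and all function names are hypothetical. The numerical factors are invented for illustration.

```python
def nu_vertical(i, s):
    # Assumed reading of the vertical phase: one factor per plane s = 1..5.
    return s


def nu_lateral(i, s):
    # Eq. (23): one factor per side.
    return 1 if i <= 5 else 2


def nu_longitudinal(i, s):
    # Eq. (24): one factor per cross section, shared by both sides.
    return i if i <= 5 else i - 5


def apply_phase(q_old, alpha, nu):
    """Eq. (21) element-wise: Q_new[i, s] = alpha[nu(i, s)] * Q_old[i, s]."""
    return {(i, s): alpha[nu(i, s)] * q for (i, s), q in q_old.items()}


# 10 mass elements per plane (i = 1..10), 5 planes (s = 1..5), all starting at 1.0.
q = {(i, s): 1.0 for i in range(1, 11) for s in range(1, 6)}

# Illustrative lateral phase: scale one side up and the other side down.
q = apply_phase(q, {1: 1.2, 2: 0.8}, nu_lateral)
```

Because each phase searches over only 5, 2, or 5 factors instead of 50 individual parameters, the search space stays small while the model is still far from the data.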

Sub-procedure coarse 2

The parameters resulting from coarse 1 are set as initial values for this sub-procedure. As in coarse 1, the sub-procedure is repeated as long as the following condition is fulfilled:

$(Γ_{l} < Γ_{l - 1}) \land (Γ > 0.04) \land (l < 10),$ (25)

where 0.04 is a heuristic threshold value for the normalized error, in accordance with results presented in Wurzbacher et al.28 Three phases are employed as follows [see Figs. 4d, 4e, 4f]:

(1)
Vertical-lateral optimization. Parameters for each side of each plane are modified with the same factor α_ν, with
$ν = \begin{cases} s & \text{for } i = 1, \ldots, 5, \\ s + 5 & \text{for } i = 6, \ldots, 10. \end{cases}$ (26)
(2)
Lateral-longitudinal optimization. Parameters for each side of each cross section are modified with the same factor α_ν, with ν = i.
(3)
Longitudinal-vertical optimization. Parameters for each pair of mass-spring-oscillators are modified with the same factor α_ν, with
$ν = \begin{cases} 5 \cdot (s - 1) + i & \text{for } i = 1, \ldots, 5, \\ 5 \cdot (s - 2) + i & \text{for } i = 6, \ldots, 10. \end{cases}$ (27)
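The index maps of coarse 2 can be checked with a short sketch. It assumes the same indexing as in the text (i = 1,…,10 within a plane, s = 1,…,5 across planes) and uses hypothetical function names. Counting the distinct values of ν shows how many independent scale factors each phase exposes.

```python
def nu_vertical_lateral(i, s):
    # Eq. (26): one factor per side of each plane.
    return s if i <= 5 else s + 5


def nu_lateral_longitudinal(i, s):
    # One factor per side of each cross section: nu = i.
    return i


def nu_longitudinal_vertical(i, s):
    # Eq. (27): one factor per pair of oscillators.
    return 5 * (s - 1) + i if i <= 5 else 5 * (s - 2) + i


def count_factors(nu):
    """Number of distinct scale factors over the 10 x 5 grid of mass elements."""
    return len({nu(i, s) for i in range(1, 11) for s in range(1, 6)})


n_vertical_lateral = count_factors(nu_vertical_lateral)
n_lateral_longitudinal = count_factors(nu_lateral_longitudinal)
n_longitudinal_vertical = count_factors(nu_longitudinal_vertical)
```

Under this reading the phases expose 10, 10, and 25 factors, and Eq. (27) assigns the same index to elements i and i + 5 of a plane, which is consistent with the "pair of mass-spring-oscillators" wording.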

Sub-procedure coarse 3

The aim of this sub-procedure is to adapt the optimization parameters even more thoroughly, on the basis of the preceding coarse 2. As before, the sub-procedure is repeated as long as the following condition is fulfilled:

$(Γ_{l} < Γ_{l - 1}) \land (κ < 95\%) \land (l < 10),$ (28)

where 95% is defined as the threshold value of κ in this sub-procedure, based on former work28 as well as the preceding sub-procedures. Three phases are implemented as follows [see Figs. 4g, 4h, 4i]:

(1)
Plane optimization. Parameters for the 10 mass elements (m_1,s,…, m_10,s) at the current plane s are modified with different control parameters α_ν, with ν = 1,…,10, while the parameters for the mass elements at the other four planes are not adjusted.
(2)
Side optimization. Parameters for the 25 mass elements on the current side are modified with different control parameters α_ν, with ν = 1,…,25.
(3)
Cross section optimization. Parameters for the 10 mass elements at the current cross section are modified with different control parameters α_ν, with ν = 1,…,10, while the parameters for the mass elements at the other cross sections are not modified.
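The loop conditions of the three coarse sub-procedures, Eqs. (22), (25), and (28), share one pattern: keep looping while the error still improves, the quality target is not yet met, and the loop limit has not been reached. A minimal sketch of that pattern follows. The function name and argument layout are hypothetical, and κ is passed as a fraction rather than a percentage.

```python
MAX_LOOPS = 10  # recommended limit for the cycle index l


def continue_coarse(sub_procedure, gamma_prev, gamma, kappa, loop):
    """Return True while the given coarse sub-procedure should run another loop."""
    improving = gamma < gamma_prev
    if sub_procedure == 1:
        target_missed = kappa < 0.92   # Eq. (22)
    elif sub_procedure == 2:
        target_missed = gamma > 0.04   # Eq. (25)
    elif sub_procedure == 3:
        target_missed = kappa < 0.95   # Eq. (28)
    else:
        raise ValueError("sub_procedure must be 1, 2, or 3")
    return improving and target_missed and loop < MAX_LOOPS


# Coarse 1 keeps going: error improved, correlation still below 92%.
keep_going = continue_coarse(1, gamma_prev=0.30, gamma=0.25, kappa=0.80, loop=3)
```

A sub-procedure therefore ends for one of three reasons: the error stopped improving, the target for κ or Γ was met, or the tenth loop was reached.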

Validation and application

Synthetic data

In order to evaluate the reliability and capability of the optimization procedure, synthetically generated data are analyzed. Fifty sets of synthetic, laterally-symmetric 3D trajectories are produced by the 3DM with known optimization parameter values (see Table 1). To simulate a wide variety of vocal fold vibrations (i.e., frequencies for males and females, amplitudes between 0.3 and 1.2 mm),55 the scale factors of the predefined model parameters are randomly set in the following ranges: $Q_{p}^{S} \in \{0.5, 0.6, \ldots, 2.7\}$, $Q_{i, s}^{S} \in \{0.5, 0.6, \ldots, 3.1\}$, $Q_{r}^{S} \in \{0.6, 0.7, \ldots, 1.2\}$. By using these values, the masses and elastic modulus of the model, the subglottal pressure, and the rest positions are modified correspondingly.
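The random choice of scale factors can be sketched as follows. Several details are assumptions rather than statements from the source: the factors are drawn uniformly from the grids, each mass element gets its own $Q_{i, s}^{S}$, a single $Q_{r}^{S}$ is used per data set, and lateral symmetry is imposed by copying the factor of element i to element i + 5. The function names are hypothetical.

```python
import random


def grid(lo_tenths, hi_tenths):
    # Build 0.1-spaced grids from integers to avoid accumulating float error.
    return [k / 10 for k in range(lo_tenths, hi_tenths + 1)]


Q_P_GRID = grid(5, 27)    # {0.5, 0.6, ..., 2.7}
Q_IS_GRID = grid(5, 31)   # {0.5, 0.6, ..., 3.1}
Q_R_GRID = grid(6, 12)    # {0.6, 0.7, ..., 1.2}


def sample_synthetic_factors(rng):
    """Draw one set of predefined scale factors (uniform choice is an assumption)."""
    q_p = rng.choice(Q_P_GRID)
    q_r = rng.choice(Q_R_GRID)
    q_is = {}
    for s in range(1, 6):
        for i in range(1, 6):
            value = rng.choice(Q_IS_GRID)
            q_is[(i, s)] = value
            q_is[(i + 5, s)] = value  # mirrored side, assumed lateral symmetry
    return q_p, q_is, q_r


# Fifty reproducible draws, one per synthetic data set.
data_sets = [sample_synthetic_factors(random.Random(seed)) for seed in range(50)]
```

Seeding each draw makes a data set reproducible, which matters when the optimized parameters are later compared against the known values.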

Table 1.

Fifty synthetic data sets with predefined parameters $(Q_{p}^{S}, Q_{i, s}^{S}, Q_{r}^{S})$ to be optimized. The data are separated by glottal closure types (Fig. 5) and gender.

GCT	Gender	No.	$Q_{p}^{S}$	$Q_{i, s}^{S}$	$Q_{r}^{S}$	Frequency (Hz)
RA	male	5	0.8 ± 0.2	0.9 ± 0.2	0.9 ± 0.2	110 ± 10
RA	female	5	2.2 ± 0.3	1.5 ± 0.6	0.8 ± 0.2	216 ± 22
HG	male	5	0.7 ± 0.1	0.9 ± 0.2	0.9 ± 0.1	110 ± 11
HG	female	5	2.1 ± 0.4	1.4 ± 0.5	0.8 ± 0.1	206 ± 19
TPD	male	5	0.7 ± 0.1	0.9 ± 0.1	0.9 ± 0.2	114 ± 6
TPD	female	5	1.9 ± 0.3	1.5 ± 0.6	1.0 ± 0.2	209 ± 15
TPV	male	5	0.9 ± 0.1	0.9 ± 0.2	0.9 ± 0.2	113 ± 10
TPV	female	5	1.8 ± 0.1	1.4 ± 0.5	0.9 ± 0.1	204 ± 17
CV	male	5	0.9 ± 0.3	0.9 ± 0.2	1.0 ± 0.1	114 ± 5
CV	female	5	2.0 ± 0.6	1.5 ± 0.6	0.8 ± 0.2	226 ± 33


From a physiological and clinical point of view, the glottal closure type (GCT) is described as an important aspect of laryngeal behavior.38, 56 Hence, in generating synthetic data, five well-known glottal closure types are considered: rectangle (RA), hourglass (HG), triangular-pointed dorsal (TPD), triangular-pointed ventral (TPV) and convex (CV) (see Fig. 5 and Table 1). Since the mean fundamental frequency for males is considerably lower than for females,57 the frequency range of the 50 synthetic dynamics is roughly divided into two groups: ≤120 Hz for male, ≥180 Hz for female. Results from Sulter et al.58, 59 indicate that GCT and glottal chink locations are different for males and females. Thus, in this work, GCT and gender are regarded as two factors that influence optimization accuracy. A variety of vocal fold behaviors was produced for different GCT and gender types, in order to validate optimization of the 3DM as thoroughly as possible.

Schematic representation of occurring glottal closure-types and corresponding modeling: (a) rectangle (RA), (b) hourglass (HG), (c) triangular-pointed dorsal (TPD), (d) triangular-pointed ventral (TPV), (e) convex (CV).

An accuracy value is defined to measure how closely the optimized parameters ( ${\hat{Q}}_{p}, {\hat{Q}}_{i, s}, {\hat{Q}}_{r}$ ) approximate the predefined values $(Q_{p}^{S}, Q_{i, s}^{S}, Q_{r}^{S})$ :

$λ := \left(1 - \frac{| \hat{Q} - Q^{S} |}{Q^{S}}\right) \cdot 100\%.$ (29)

Overall, the assessments of similarity between the adapted dynamics $c_{i, s}^{M} [n]$ and the synthetic dynamics $c_{i, s}^{S} [n]$ are implemented by using the following: the accuracy λ of optimization parameters, the correlation coefficient κ verifying the similarity between both 3D trajectories, and the objective function (i.e., normalized error) Г. In this way, the performance of the optimization procedure and the corresponding reproducibility of the optimization results could be evaluated.
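The accuracy measure of Eq. (29) is a relative deviation expressed as a percentage. A minimal sketch follows, with hypothetical function names and invented example values. Averaging over several parameters is shown as one plausible way to obtain a global value, not as the exact aggregation used in this work.

```python
def accuracy_percent(q_hat, q_true):
    """Eq. (29): 100% means the optimized value equals the predefined one."""
    return (1.0 - abs(q_hat - q_true) / q_true) * 100.0


def mean_accuracy(pairs):
    """Average accuracy over (optimized, predefined) pairs; assumed aggregation."""
    values = [accuracy_percent(q_hat, q_true) for q_hat, q_true in pairs]
    return sum(values) / len(values)


# Invented example: optimized 0.9 against a predefined 1.0 gives about 90%.
example = accuracy_percent(0.9, 1.0)
overall = mean_accuracy([(0.9, 1.0), (2.0, 2.0)])
```

Note that λ is not bounded below: an estimate that deviates from the predefined value by more than 100% yields a negative accuracy.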

Hemilarynx experimental data

As a prelude to the future optimization of both in vivo and in vitro data with physiological relevance,4 the optimization procedure is also applied to an in vitro hemilarynx experimental data set, which includes recorded 3D trajectories of 25 marker-points attached along the medial vocal fold surface. Vertical distance between marker-points was approximately 1.7 ± 0.2 mm, and horizontal distance was approximately 2.0 ± 0.2 mm. A detailed description of the hemilarynx experiments can be found in Refs. 4, 25, and 31. For comparison with the dynamics of the 3DM (which includes both left and right vocal folds), bilateral symmetry was assumed for the in vitro hemilarynx data set, since the dynamics of only one vocal fold could be observed in hemilarynx experiments. The fundamental frequency was 120 Hz, and the subglottal pressure was 22.6 cm H₂O.

With the proposed parameter optimization procedure, a set of optimization parameters $({\hat{Q}}_{p}, {\hat{Q}}_{i, s}, {\hat{Q}}_{r})$ is derived from the adapted 3DM. To adapt the 3D rest positions of the 50 mass elements (i.e., for each mass element in each dimension) 150 ${\hat{Q}}_{r}$ are computed, since the adaptations of the complex marker-point locations would not be accurate if only one scale factor were used to modify all of the 3D rest positions.

RESULTS

Validation of optimization

Global measures

Table 2 shows the accuracy λ, the correlation coefficient κ and the normalized error Γ for the fifty synthetic data sets. The best value of Γ occurred in the glottal closure type RA with 0.05 ± 0.03, while the worst occurred in HG and TPV with 0.09 ± 0.03. For κ, the best value (97% ± 2%) occurred in TPD and CV, while the worst (94% ± 2%) was in HG. The value of λ was the highest in RA with 92% ± 5%, while the lowest (89% ± 3%) was in HG. The measures by different genders were essentially equal, except for Γ (better value for males). Overall, the measures (Γ, κ, λ) were 0.07 ± 0.03, 96% ± 2% and 91% ± 4%, averaged over all 50 synthetic data sets. The corresponding ranges for these values were 0.007–0.153, 90%–100%, and 83%–99%, respectively. Additionally, the given right-to-left symmetry ratio (perfect symmetry = 1) was reproduced as 1.01 ± 0.08, within an acceptable range (0.8 to 1.2).

Table 2.

Optimization results for the synthetic data: The global accuracy λ, correlation coefficient κ and objective function Γ exhibit sufficiently good performance. Each glottal closure type covers 10 subjects (5 male, 5 female).

GCT	Γ	κ	λ
RA	0.05 ± 0.03	96% ± 2%	92% ± 5%
HG	0.09 ± 0.03	94% ± 3%	89% ± 3%
TPD	0.07 ± 0.03	97% ± 2%	90% ± 5%
TPV	0.09 ± 0.03	95% ± 3%	92% ± 2%
CV	0.05 ± 0.02	97% ± 2%	91% ± 3%
total	0.07 ± 0.03	96% ± 2%	91% ± 4%

Gender	Γ	κ	λ
male	0.05 ± 0.03	96% ± 2%	91% ± 5%
female	0.09 ± 0.03	96% ± 3%	91% ± 3%


To illustrate the optimization results, Fig. 6 compares the adapted 3D trajectories $c_{4, 4}^{M} [n]$ and the synthetic 3D trajectories $c_{4, 4}^{S} [n],$ corresponding to the mass element m_4,4 of a subject with a rectangular glottal closure pattern and a fundamental frequency of 120 Hz.

(Color online) Exemplary results of the 3D dynamics of the mass element m_4,4 are given. Synthetic trajectories (solid lines) and optimized trajectories (dotted lines) are presented for all three displacement directions. The corresponding accuracy values are: (Γ, κ, λ) = (0.05, 97%, 96%).

Reproducibility of parameters

To determine the reproducibility of the parameters, the individual accuracies (λ_p, λ_i,s, λ_r) for the predefined optimization parameters $(Q_{p}^{S}, Q_{i, s}^{S}, Q_{r}^{S})$ were derived as 92% ± 5%, 91% ± 9%, and 97% ± 3%, averaged over all of the synthetic subjects. The lowest value of the individual accuracy was for $Q_{i, s}^{S},$ whereas the highest was for $Q_{r}^{S} .$ To facilitate an intuitive feel for the reproducibility of the parameters, the relationship between the adapted optimization parameters and the predefined optimization parameters is illustrated in Fig. 7. In each chart, the regression lines with their respective regression functions are sketched. The corresponding regression fits were high (i.e., close to one), and their disturbance terms were low. Additionally, the 95% confidence interval is very narrow: the distances between the regression lines and the boundaries of the regions were 0.28, 0.27, and 0.08.

Comparison between the adapted optimization parameters $({\hat{Q}}_{p}, {\hat{Q}}_{i, s}, {\hat{Q}}_{r})$ and pre-defined values $(Q_{p}^{S}, Q_{i, s}^{S}, Q_{r}^{S})$ . The solid lines indicate the regression lines for the optimization parameters. The corresponding regression functions are shown including the 95% confidence interval. The distances between the regression lines and the confidence interval boundaries are indicated (Δ_a, Δ_b, Δ_c exhibiting the reliability of the algorithm).

Influence from weighting coefficients

In order to examine whether the weighting coefficients influenced the measures of the optimization results, the variations of the individual accuracies λ_i,s at different planes are shown in Fig. 8. Higher accuracies were achieved at the superior planes s = 3, 4 with λ_i,3 = 93% ± 6% and λ_i,4 = 95% ± 5% averaged over all mass elements, while the lowest was at the inferior plane s = 1. Additionally, statistical analysis (analysis of variance) showed that optimization accuracy was statistically significantly dependent on plane (p < 0.01).

Boxplots of the accuracies λ_i,s averaged over all 10 mass elements (i = 1,…, 10) at each plane (s = 1,…, 5) for the 50 synthetic data sets. The mean values are marked with *. Higher performances occur at the superior planes, while lower values occur at the inferior planes.

Influence from GCT and gender

As shown in Table 2, the optimization results may be affected by GCT and gender. To further investigate how the results are affected by these two factors, a two-way analysis of variance (ANOVA) was applied. It shows that GCT and gender both have statistically significant effects on Γ at the 99% confidence level (significance level α = 0.01, the probability of a Type I error), since the F ratios (6.61 for GCT, 26.68 for gender) are larger than the corresponding critical values $F_{0.99}^{crit} = 3.83$ and 7.31, respectively. The larger the F ratio, the stronger the effect of the corresponding factor. Also, the corresponding low p-values (3.5E-04 < 0.01 for GCT, 7.0E-06 < 0.01 for gender) provided sufficient evidence for such a conclusion. Additionally, for the analysis of κ and λ, no statistically significant differences based on GCT or gender, or the interaction between the two, were found, due to smaller F ratios ( $< F_{0.95}^{crit}$ ) and larger p-values (> 0.05). Overall, the analysis of variance for (Γ, κ, λ) confirmed the assumption suggested by Table 2 that GCT and gender partly influence the optimization results, namely the normalized error Γ.

Application to an in vitro hemilarynx experiment

In order to adapt the model dynamics to a hemilarynx data set, the initial rest positions of the mass elements were first estimated in accordance with the mean values of the 3D movements of the mounted marker points. Figure 9 shows the initial rest positions at one side of the 3DM. The average values of the computed optimization parameters $({\hat{Q}}_{p}, {\hat{Q}}_{i, s}, {\hat{Q}}_{r})$ were 0.79, 0.75 ± 0.36, 0.94 ± 0.26. The corresponding global measures (Г, κ) were 0.16 and 84%, respectively. Accuracy of the optimization parameters could not be reported, since no predefined optimization parameters existed for the experimental data. However, the optimized subglottal pressure (19.0 cm H₂O) and fundamental frequency (120 Hz) were sufficiently accurately reproduced.

The rest positions of the 25 mass elements for one side are estimated in accordance with the mean values of the marker points placed on the hemilarynx. The marks × denote the rest positions of the 25 mass elements.

Figure 10 shows the high similarity between the hemilarynx experimental 3D trajectories c_3,3[n] and the adapted theoretical 3D trajectories $c_{3, 3}^{M} [n]$ of the mass element m_3,3, which corresponds to the marker-point located in the third vertical column and fourth sagittal line in the hemilarynx. The corresponding fundamental frequency was 120 Hz. The normalized errors Γ for trajectories in lateral, longitudinal, and vertical directions were 0.04, 1.5, and 0.26, respectively. Their correlations κ were 99%, 68%, and 97%, respectively. The smallest error with highest correlation occurred in the lateral direction, while the highest error with lowest correlation occurred in the longitudinal direction.

(Color online) Results of the adapted 3DM (dotted lines) of the mass element m_3,3 located in the median cross section at the plane s = 3 on the right side of the model, compared to the hemilarynx trajectories (solid lines) at the corresponding suture-point. The fundamental frequency is 120 Hz.

Figure 11 shows the optimized model parameters applied to the hemilarynx experimental data set. The derived masses ${\hat{m}}_{i, s}$ decreased from inferior to superior (s = 1 to s = 5) for all five coronal cross-sections [see Fig. 11a]. They ranged between 5.7 × 10⁻³ and 0.14 g. Additionally, from dorsal to ventral, the highest value occurred at the most ventral cross section i = 5 for four planes (s = 2 to s = 5), except the most inferior plane s = 1 (which had the highest value at i = 1 near the vocal fold dorsal extreme). The lowest values at the superior planes (s = 4, s = 5) were at the most dorsal cross section i = 1, while the inferior three planes (s = 1, 2, 3) had the lowest values in cross section i = 4. A large difference occurred at the transition from the most superior plane s = 5 to the second plane s = 4 [see Fig. 11a].

Logarithmic charts for the optimized model parameters after application to a hemilarynx experimental data set. (a)–(d) describe the mass ${\hat{m}}_{i, s}$ , anchor stiffness ${\hat{k}}_{i, s}^{a}$ , longitudinal stiffness ${\hat{k}}_{i, s}^{l}$ , and the vertical stiffness ${\hat{k}}_{i, s}^{v}$ at∕between different transverse planes s and coronal cross-sections i. ${\hat{k}}_{i, s}^{v}$ occurs between the current plane and the next upper plane. ${\hat{k}}_{i, s}^{l}$ occurs between the current cross section and the next one. The achieved subglottal pressure ${\hat{P}}^{sub}$ was equal to 19.0 cm H₂O, and the estimated glottis length was 13.0 mm. i = 1,…, 5 denotes the coronal cross-sections from dorsal to ventral. s = 1,…, 5 denotes the transverse planes from inferior to superior. In general, the values of the model parameters at inferior plane (s = 1) are higher than those at superior plane (s = 5).

In Fig. 11b, at the superior planes (s = 3 to s = 5), the anchor spring ${\hat{k}}_{i, s}^{a}$ decreased from inferior to superior for all five coronal cross-sections (i = 1 to i = 5). Moreover, the highest value occurred at the most inferior plane s = 1, i = 4. The lowest value was found at the most superior plane s = 5 near the ventral extreme (i = 5). Values of ${\hat{k}}_{i, s}^{a}$ at the two inferior planes (s = 1, 2) were nearly equal at the dorsal cross-sections i = 1, 2. Also, planes s = 2, 3 had nearly equal values at i = 4. The values ranged between 1.63 and 29.37 N∕m.

Figure 11c shows that the longitudinal spring ${\hat{k}}_{i, s}^{l}$ had comparatively lower values at the most superior plane s = 5. For planes (s = 2, 5), there was an increase from dorsal (i = 1, 2) to ventral (i = 4, 5). In general, the lowest value of ${\hat{k}}_{i, s}^{l}$ occurred at the most superior plane s = 5 between the two dorsal cross-sections (i = 1, 2), while the highest was found at the most inferior plane s = 1 between cross-sections i = 2 and i = 3, shown in Fig. 11c. The stiffnesses ranged between 17.98 and 1954.36 N∕m.

In Fig. 11d, the vertical spring ${\hat{k}}_{i, s}^{υ}$ obviously decreased from inferior to superior for all five coronal cross-sections. A decrease occurred from dorsal to ventral for the superior spring (s = 4, 5), and the inferior spring (s = 1, 2). Values for the spring (s = 3, 4) were nearly equal in all five coronal cross-sections. The highest value was found in the spring (s = 1, 2) near the dorsal extreme (i = 1), while the lowest was in s = 4, 5 near the ventral extreme (i = 5). The stiffnesses ranged between 24.53 and 1035.50 N∕m.

DISCUSSION

In this work, a mathematical optimization approach for approximating 3D vocal fold dynamics by a biomechanical model was suggested and verified. Validation of the optimization with fifty synthetic data sets and application to an in vitro human hemilarynx experiment reflect its suitability and applicability. Among other things, the proposed method may be used to objectively quantify vocal fold asymmetries. A long-term goal of this method is to support clinical diagnosis and treatment of voice disorders through a better knowledge of the biomechanical parameters underlying the disorders.