The Journal of the Acoustical Society of America. 2021 Mar 11;149(3):1712–1723. doi: 10.1121/10.0003561

A one-dimensional flow model enhanced by machine learning for simulation of vocal fold vibration

Zheng Li 1, Ye Chen 1, Siyuan Chang 1,a), Bernard Rousseau 2, Haoxiang Luo 1,b)
PMCID: PMC7954577  PMID: 33765799

Abstract

A one-dimensional (1D) unsteady and viscous flow model that is derived from the momentum and mass conservation equations is described, and to enhance this physics-based model, a machine learning approach is used to determine the unknown modeling parameters. Specifically, an idealized larynx model is constructed and ten cases of three-dimensional (3D) fluid–structure interaction (FSI) simulations are performed. The flow data are then extracted to train the 1D flow model using a sparse identification approach for nonlinear dynamical systems. As a result of training, we obtain the analytical expressions for the entrance effect and pressure loss in the glottis, which are then incorporated in the flow model to conveniently handle different glottal shapes due to vocal fold vibration. We apply the enhanced 1D flow model in the FSI simulation of both idealized vocal fold geometries and subject-specific anatomical geometries reconstructed from magnetic resonance images of rabbit larynges. The 1D flow model is evaluated in both of these setups and shown to have robust performance. Therefore, it provides a fast simulation tool that is superior to the previous 1D models.

I. INTRODUCTION

Computational modeling of the fluid–structure interaction (FSI) for vocal fold vibration is useful as it may provide a computer-based tool for clinical management of voice disorders, e.g., surgical planning for vocal fold paralysis.1 Although the underlying physical principle of vocal fold vibration is straightforward and can be modeled simply using lumped-mass models, high-fidelity modeling of the tissue's dynamic deformation remains very challenging, especially when patient-specific features must be simulated so that the modeling tools can capture individual differences in laryngeal anatomy and tissue properties.

With the tremendous growth in computing power and improvements in modeling approaches, computational models for the FSI of vocal fold vibration have advanced substantially in recent years. These physics-based models typically couple a two-dimensional (2D) or three-dimensional (3D) glottal airflow model and a finite-element representation of the vocal fold tissue.2–13 In addition, they adopt increasingly high resolution and have provided insightful fundamental understanding about this FSI system such as eigenmodes,2 mechanics of posturing,3 sensitivity to the geometry and material properties,5,8–10 vortex flow and pressure on the vocal fold surface,6,7,13 energy transfer,4,7 and acoustic wave propagation.11,12 In Refs. 2–13, the vocal fold was often represented by a schematic that captures only the overall characteristics of the laryngeal geometry. Such models are obviously not sufficient for subject-specific representation. Medical imaging technologies, e.g., computed tomography (CT) and magnetic resonance imaging (MRI), allow the computational models to incorporate more realistic or even subject-specific laryngeal geometries. These imaging tools may provide detailed 3D anatomy of the larynx as well as the interior structure of the tissue.14–17 Coupled with a 3D airflow solver in the FSI simulation, the anatomical models of the larynx represent a significant step toward patient-specific modeling of the vocal fold vibration, which may be needed for the clinical care of voices of individual patients. In recent years, such patient-specific models have been developed to study vocal fold vibration by several researchers.18–20

On the other hand, the unknown tissue properties for individual subjects, which cannot be identified from current imaging technologies, limit the application of patient-specific modeling. There have been prior efforts to derive the vocal fold tissue properties using finite-element method (FEM)–based models combined with experimental tests,21–23 but they were limited to ex vivo conditions. Although it is possible in principle to determine the in vivo, subject-specific tissue properties by running the high-fidelity FSI simulations and solving an inverse problem, assuming that all other aspects in the FSI model match the corresponding in vivo experiment (e.g., the anatomy and boundary conditions), such an approach is still not practical because the 3D airflow simulation is too expensive even with high-performance parallel computing. More practically, one could use a simplified flow model coupled with a realistic FEM representation of the vocal fold for the FSI simulation to determine the elastic properties with much lower computational cost. Such efforts have been made recently in a few studies.20,24–26 In practice, this kind of simplified FSI model could also be combined with the high-fidelity models to increase the overall modeling accuracy.20

Bernoulli-based flow equations have long been used for vocal fold vibration. Decker and Thomson27 compared the Bernoulli equation with the Navier–Stokes equation for the simulation of the vocal fold vibration, and they found that the Bernoulli models could be highly dependent on the ad hoc assumption of the flow separation in the glottis. Chang28 also reached a similar conclusion, and found that the Bernoulli model may lead to a significantly different vibration mode of the vocal fold from a full Navier–Stokes model. In general, the main limitations of the Bernoulli principle result from the assumption of ideal flow in the glottis and the a priori unknown location of the flow separation. To address the limitations of the Bernoulli equation, in our recent works,29,30 we developed a one-dimensional (1D) momentum equation–based flow model that was originally designed to solve separated flow in the collapsible tube31,32 and recently introduced for modeling of vocal fold dynamics.33 In this model, we have included the effect of pressure loss, which is caused by the flow separation and the viscous effect. Furthermore, we have included an entrance effect, which is due to an inertial flow entering the glottis from a rapidly converging shape in the subglottal region. This 1D flow model was coupled with the 3D tissue model for the FSI simulation of both idealized and anatomical vocal fold geometries from rabbits. In the idealized vocal fold cases, the reduced FSI model achieved consistent results with those from the full 3D FSI model for different medial vocal fold thicknesses, subglottal pressures, tissue models, and tissue stiffness properties. In the anatomical models, vibration results from the reduced FSI model agreed well with the experimental data of the evoked in vivo rabbit phonation.29,30

However, the pressure loss and entrance effect in the glottal airflow depend on the overall shape of the glottis as well as its instantaneous deformation during vibration. In addition, the Reynolds number plays a significant role. Since no analytical expression exists to describe these effects precisely, in the previous work,29,30 we chose to use constant parameters based on the knowledge learned from the 3D flow simulations. These parameters may need adjustment, depending on specific vocal fold geometry, e.g., a large or small medial thickness, to achieve good accuracy. This limitation reduces generalization of the flow model. To overcome the limitation, we set variable parameters for the pressure loss and entrance effect and seek to express them in a functional form that is convenient to use.

Machine learning techniques, which have gained popularity in fluid mechanics in recent years,34 provide a useful approach to help determine the functional form of a physical effect especially when general characteristics of such an effect have been understood. This feature applies well to the situation that we are considering. Here, we use the 1D Navier–Stokes equation to describe the flow physics and leave only the undescribed effects to be determined empirically. These effects, i.e., the pressure loss and entrance effect in a converging-diverging glottis, can be expressed as functions of only a few dimensionless parameters such as the instantaneous shape of the glottis and the Reynolds number of the flow. Once these functions are determined using a training data set through machine learning, they can be incorporated into the 1D model for flow simulation of arbitrary glottal shapes. Recently, there have been other studies that also incorporated machine learning into modeling of vocal fold dynamics.26,35,36 Compared to those studies, the present work takes advantage of the physics-based model, i.e., the unsteady viscous Navier–Stokes equation, albeit within the limitations of 1D, as much as possible and only resorts to machine learning for the remaining undescribed effects that include the entrance effect.

In fluid mechanics, many machine learning techniques have been developed to identify the governing equation directly from the data.34,37–39 Here, we adopt the sparse identification of nonlinear dynamical systems (SINDy),37 which is a regression algorithm suitable for the physical systems having only a few relevant terms to define the dynamics. Specifically, we will use the SINDy method to identify the pressure loss and entrance effect at locations near the glottal exit and express them as polynomial functions of the Reynolds number, the channel length, and the convergent or divergent ratios of the glottis. To generate the training data for machine learning, we will use 3D FSI simulations of an idealized vocal fold with different medial thicknesses, stiffness properties, and subglottal pressures. Additional 3D simulation cases will be performed for validation of the machine learning. To further assess the performance of the new flow model, we will compare other quantities, such as the pressure distribution, flow rate, and vibratory characteristics, in the idealized model against the 3D simulation. Furthermore, we will apply the new model to the FSI simulation of anatomical vocal fold geometries that are based on excised rabbit larynges. The simulation results will be compared with high-speed video data from in vivo phonation of the same larynx samples prior to the excision. The descriptions of the model setup, machine learning procedure, and results from the machine learning and FSI simulations are provided in Secs. II A–II E.

II. THE FLOW MODEL AND TRAINING CASE SETUP

A. The 1D viscous flow model

A 2D schematic of the geometry in the transverse plane of the larynx is shown in Fig. 1 where the glottis is depicted as a converging-diverging channel. To facilitate our discussion, we use xa and xb to mark the locations of the glottal inlet and exit, respectively, and xc shows the location of the minimal cross-section area in the glottis. Note that xc varies between xa and xb when the vocal fold is vibrating, and it could coincide with xb such that the glottis is purely convergent. In practice, the location of xb is straightforward to choose as the cross section typically experiences a sudden expansion at the glottal exit. On the other hand, the location of xa sometimes is not obvious because the subglottal region may narrow down gradually rather than abruptly. We point out from our tests that the present flow model is not sensitive to the inlet location if the glottis does not have a clear entrance location in which case an approximate choice of xa would be sufficient. This is because the entrance effect of a gradually convergent section is small anyway and, in addition, the pressure loss primarily takes effect in the diverging section in the present model and has little dependence on the inlet location.

FIG. 1.


(Color online) Schematic of airflow entering the glottis, which generally has a converging-diverging shape. The flow experiences loss of energy and also a vena contracta effect in which the effective area, A, may be smaller than the actual cross section area, A0. The red-dashed lines indicate the boundary layers.

When the air flows through the glottis during vibration, the pressure generally decreases before the minimal area section at xc due to the Bernoulli effect. After this section, the pressure would increase along with the area expansion. However, the pressure will not recover to its full extent because of the possible separation in the divergent section and also because of the viscous effect in the entire glottis. Therefore, accounting for the pressure loss in the flow model will help overcome limitations of the Bernoulli equation. Considering the mass and momentum conservation equations, Cancelli and Pedley31 developed a 1D flow model to describe a collapsible tube. In the momentum equation, they included the viscous loss and separation effects. To generalize the pressure loss, we combine the viscous loss and separation effects as one single loss term represented by the shear stress, τ, in the following equation:

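$$\frac{\partial A}{\partial t} + \frac{\partial (Au)}{\partial x} = 0, \qquad \rho\frac{\partial u}{\partial t} + \rho u\frac{\partial u}{\partial x} = -\frac{\partial p}{\partial x} + \frac{\partial \tau}{\partial x}, \tag{1}$$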

where ρ, u, and p are, respectively, the density, velocity, and pressure, and A is the effective area of the cross section. We will discuss the calculation of the shear stress τ(x) later using a machine learning approach. For the boundary conditions of the 1D flow, we set a specified subglottal pressure, Psub, and the pressure at the glottal exit, Pe. Equation (1) represents a nonlinear boundary value problem and can be solved using a shooting method once we have an expression for τ. Its numerical procedure was described in Li et al.29

Besides the loss term, the vena contracta effect of flow entering an expansion was introduced in our flow model.29,30 In particular, as the air flows into the glottis, and especially the diverging section, it tends to focus to the center under its inertia rather than following the exact shape of the channel. Thus, we use the effective cross-section area, A, in Eq. (1) for the mass conservation equation. This area is smaller than the actual cross-sectional area, A0, as illustrated in Fig. 1. Without such an entrance effect, the negative pressure (gage pressure) at the minimum section could be overestimated, leading to an inaccurate pressure load on the vocal fold surface. To calculate the effective area, A, we introduce a correctional coefficient, α(x), so that

A(x)=α(x)A0(x). (2)

Note that α is a function of the streamwise location, x.

In our previous work,29,30 we estimated α(x) based on the 3D simulation of the FSI problem by calculating it from α(x)=uavg/u, where uavg is the average streamwise velocity in the cross section and u is the maximum streamwise velocity. We further assumed a quadratic function form for α(x) with a single free parameter to be determined through machine learning as will be discussed in Sec. II C. The quadratic function represents narrowing down of the effective area due to the growth of the boundary layer along the glottis.

B. Input variables for machine learning

To outline the flow model for machine learning, we use a simplified but characteristic geometry of the glottis and define the input–output variables for the machine learning module. Figure 2(a) shows a 2D schematic of the glottis. We define five locations along the flow, which are (1) the glottal inlet xa, (2) the glottal exit xb, (3) the narrowest area location xc, (4) a point in the subglottal region xd, and (5) an intermediate location in the divergent section xe. The average gap width at the narrowest section is denoted as H; therefore, H = A(xc)/L, where A(xc) is the cross-section area at xc and L is the longitudinal length of the glottis. These locations, xa, xb, and xc, and the corresponding cross-sectional areas, A(xa), A(xb), and A(xc), describe the overall converging-diverging shape of the glottis. In addition to these variables, xd and A(xd) are used to describe the slope of the subglottal region, which is useful in measuring the extent to which the flow is focused when entering the glottis. A previous study has shown that the geometry of the glottal entrance has a significant influence on the intraglottal pressure distribution.40 Here, we set xd at a distance of 5H from xa to capture the slope of the subglottal shape. Furthermore, the intermediate point xe is added so that the pressure loss at this location will be determined as one output variable as described in Sec. II C. We set xe to be closer to xb with a distance ratio of 3:1 between |xc − xe| and |xb − xe| because the pressure loss increases more quickly near the glottal exit as illustrated in Fig. 2(b). It is worth pointing out that the exact locations of xd and xe are not crucial as xd is used to calculate the subglottal slope of the vocal fold and xe is used to provide another data point for the pressure loss estimate.

FIG. 2.


(a) Geometric description of the glottis used in the machine learning. (b) Functions of the pressure loss, τ(x), and the area correction coefficient, α(x), along the glottis.

In terms of nondimensional parameters, the geometric variables along with the Reynolds number are defined as

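$$r_b = \frac{A(x_b)}{A(x_c)}, \quad l_{bc} = \frac{L_{bc}}{H}, \quad r_a = \frac{A(x_a)}{A(x_c)}, \quad l_{ac} = \frac{L_{ac}}{H}, \quad r_d = \frac{A(x_d)}{A(x_a)}, \quad Re^{1/3} = \left[\frac{\rho u_c H}{\mu}\right]^{1/3}. \tag{3}$$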

Among these six variables, rb, ra, and rd are the area ratios, and lbc and lac are the normalized distances. The Reynolds number is defined using the velocity at the narrowest section, uc, and raised to the power 1/3 so that this variable has a similar order of magnitude to the other five variables.
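As a minimal sketch of Eq. (3), the six inputs can be computed from the instantaneous glottal geometry as shown here. The function and argument names are hypothetical, and taking Lbc = |xb − xc| and Lac = |xc − xa| is an assumption, since the text does not spell out these lengths.

```python
def sindy_inputs(x_a, x_b, x_c, A_a, A_b, A_c, A_d, L, rho, u_c, mu):
    """Return (r_b, l_bc, r_a, l_ac, r_d, Re^(1/3)) following Eq. (3)."""
    H = A_c / L                      # average gap width at the narrowest section
    r_b = A_b / A_c                  # exit-to-minimum area ratio
    l_bc = abs(x_b - x_c) / H        # assumed definition of L_bc
    r_a = A_a / A_c                  # inlet-to-minimum area ratio
    l_ac = abs(x_c - x_a) / H        # assumed definition of L_ac
    r_d = A_d / A_a                  # subglottal-to-inlet area ratio
    re_third = (rho * u_c * H / mu) ** (1.0 / 3.0)
    return r_b, l_bc, r_a, l_ac, r_d, re_third
```

All arguments are expected in consistent units (e.g., SI), so that each returned quantity is dimensionless.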

C. Output variables for machine learning

To determine the pressure loss or the shear stress term in Eq. (1), we assume that τ = 0 and dτ/dx=0 at the glottal inlet xa, τ=τe at the intermediate point xe, and τ=τb at the exit xb. The two unknown variables, τe and τb, will be determined using machine learning as functions of the six input variables described in Eq. (3). Once τe and τb are determined from machine learning, we assume a cubic distribution for τ(x) from the glottal inlet xa to the exit xb as shown in Fig. 2(b). This assumption is made based on observation of general characteristics of the pressure loss from our 3D flow simulations.30 Note that a higher order distribution is also possible using the same strategy if more output variables are used from machine learning.
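The four conditions above fix the cubic completely. As a worked sketch, with the shifted coordinate s = x − xa and the coefficients a and b introduced here only for illustration (they are not the authors' notation):

$$\tau(s) = a s^2 + b s^3, \qquad s_e = x_e - x_a, \quad s_b = x_b - x_a,$$

$$b = \frac{\tau_b/s_b^2 - \tau_e/s_e^2}{s_b - s_e}, \qquad a = \frac{\tau_e}{s_e^2} - b\,s_e.$$

The conditions τ = 0 and dτ/dx = 0 at xa remove the constant and linear terms, and the two remaining coefficients follow from τ(xe) = τe and τ(xb) = τb.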

For the entrance effect, we need to determine the area correction coefficient, α(x), along the glottis. Similar to our previous publications,29,30 we assume a quadratic distribution for α(x) between xc and xb. However, in the present study, we will only need to determine α/α(xc) because only the relative area ratio is needed when solving Eq. (1). Furthermore, α(x) is assumed to have a zero derivative at xb. Therefore, we will only need to determine α(xb)/α(xc) through machine learning. In summary, there are three output variables for the machine learning process, which are τe, τb, and α(xb)/α(xc).

D. The SINDy method for machine learning

For machine learning, we use the SINDy method,37 which is a data regression approach to discover governing equations for nonlinear dynamical systems, including fluid flows. In particular, SINDy uses sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data, and this results in parsimonious models that balance accuracy with model complexity to avoid overfitting.37 The key assumption in this approach is that for many systems of interest, the governing equation consists of only a few terms, making it sparse in the space of possible functions.37 Using training data that will be described in Sec. II E, SINDy can determine a generic output variable, f, as a polynomial function of the six input variables defined in Eq. (3), i.e.,

$$f = f\left(r_b, l_{bc}, r_a, l_{ac}, r_d, Re^{1/3}, r_b^2, r_b l_{bc}, \ldots, Re^{2/3}, \ldots\right). \tag{4}$$

The software package of SINDy in MATLAB is freely available from the authors37 and is used here for our study, and only the terms up to the third order are retained in this polynomial function.
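To make the regression step concrete, the following is an independent Python sketch of sequentially thresholded least squares, the sparse regression scheme commonly associated with SINDy. It is not the MATLAB package used in this study; the function names, the threshold value, and the synthetic two-input data are all assumptions chosen for illustration.

```python
from itertools import combinations_with_replacement


def poly_library(x, degree):
    """All monomials of the entries of x up to `degree`, constant term first."""
    row = []
    for d in range(degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for i in idx:
                term *= x[i]
            row.append(term)
    return row


def solve(M, b):
    """Solve M c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        for r in range(k + 1, n):
            f = aug[r][k] / aug[k][k]
            for col in range(k, n + 1):
                aug[r][col] -= f * aug[k][col]
    sol = [0.0] * n
    for k in reversed(range(n)):
        s = sum(aug[k][j] * sol[j] for j in range(k + 1, n))
        sol[k] = (aug[k][n] - s) / aug[k][k]
    return sol


def least_squares(Theta, y, cols):
    """Least-squares fit of y using only the library columns in `cols`."""
    M = [[sum(row[i] * row[j] for row in Theta) for j in cols] for i in cols]
    b = [sum(row[i] * yi for row, yi in zip(Theta, y)) for i in cols]
    return solve(M, b)  # normal equations; adequate for a small sketch


def stlsq(Theta, y, threshold, n_iter=10):
    """Refit, drop terms with small coefficients, and repeat until stable."""
    n_terms = len(Theta[0])
    active = list(range(n_terms))
    coef = [0.0] * n_terms
    for _ in range(n_iter):
        if not active:
            break
        fit = least_squares(Theta, y, active)
        coef = [0.0] * n_terms
        for i, c in zip(active, fit):
            coef[i] = c
        kept = [i for i in active if abs(coef[i]) >= threshold]
        if kept == active:
            break
        active = kept
    return [c if abs(c) >= threshold else 0.0 for c in coef]


# Synthetic check with two inputs (not the paper's data): y = 2 + 3*x0*x1.
samples = [(a, b) for a in (0.5, 1.0, 1.5, 2.0) for b in (0.2, 0.7, 1.1, 1.9)]
Theta = [poly_library(s, 2) for s in samples]   # columns: 1, x0, x1, x0^2, x0*x1, x1^2
y = [2.0 + 3.0 * a * b for a, b in samples]
coef = stlsq(Theta, y, threshold=0.1)           # only the constant and x0*x1 terms should survive
```

For the six inputs of Eq. (3) with terms up to third order, `poly_library(x, 3)` would generate the candidate terms. A production implementation would normally use a QR- or SVD-based least-squares solver rather than the normal equations, which are used here only to keep the sketch dependency-free.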

E. Setup of the 3D FSI model and data generation

To generate training data for machine learning, we use a previous 3D setup of the FSI of an idealized vocal fold geometry as illustrated in Fig. 3. This model has been described in our previous publications9,29,30 and is only briefly summarized here. The airflow is driven by a constant subglottal pressure Psub, and the outlet has a reference pressure of Pout = 0 kPa for all of the cases in consideration. The air is assumed to be incompressible and governed by the viscous Navier–Stokes equation. A pair of vocal fold bands are placed symmetrically in the channel with a length, width, and depth of L = 20 mm, W = 13 mm, and D = 10 mm, respectively. The medial thickness, T, has significant effects on the flow and vocal fold vibration28 as the medial surfaces are the primary loading surfaces for the sustained vibration. The vocal fold here is assumed to be isotropic and homogeneous and is governed by a hyperelastic, two-parameter Mooney-Rivlin model. The strain energy density function for this model is given as

$$W = \alpha_{10}(\bar{I}_1 - 3) + \alpha_{01}(\bar{I}_2 - 3) + \frac{K}{2}(J - 1)^2, \tag{5}$$

where K represents the bulk modulus, α10 and α01 are material constants related to the distortional response, and J=det(F) where F stands for the deformation gradient. In addition, I¯1 and I¯2 are invariants based on J and the principal stretches of the deformation gradient. Further detail of this model for the vocal fold can be found in previous work by our group.41 Anisotropic tissue behavior, or a multi-layer structure as proposed in many previous works,2,7,10 would be a better representation of the real tissue of the vocal fold. However, we only need characteristic vocal fold deformation and corresponding flow data for the training purpose; thus, the specific material model for the vocal fold tissue is not essential in this study.

FIG. 3.


(Color online) The vocal fold model and computational domain used for 3D FSI simulation and data generation.

To solve the 3D FSI, an in-house immersed-boundary method is employed for the flow simulation, whereas the tissue deformation is solved with a FEM.41 In total, we solved ten simulation cases after a careful mesh independence study.29,30 Table I shows the details for all of the test cases, which contain variations in these parameters: the medial thickness T, the subglottal pressure Psub, and the material stiffness constants α10 and α01. The tissue density is ρs = 1040 kg/m³ and the mass damping is 0.05 s⁻¹ in all of the cases. The air density is ρ = 1.13 kg/m³. Thus, the characteristic intraglottal velocity is V = √(2(P0 − Pout)/ρ) = 42.1 m/s. We define the jet Reynolds number using ReJ = ρVd/μ, where d ≈ 1 mm is the characteristic glottal gap during the opening phase and μ is the air viscosity. In the current study, we set ReJ = 210.
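The quoted characteristic velocity can be checked with a few lines of arithmetic. In this sketch, P0 = 1 kPa is an assumption (the text does not define P0, but this value, equal to the training-case Psub, reproduces the stated 42.1 m/s), and d is taken as exactly 1 mm. The viscosity computed at the end is simply the value implied by ReJ = 210 under those assumptions.

```python
import math

rho = 1.13      # air density, kg/m^3 (from the text)
P0 = 1000.0     # assumed reference pressure of 1 kPa, in Pa
P_out = 0.0     # outlet reference pressure, Pa (from the text)
d = 1.0e-3      # characteristic glottal gap, m (taken as exactly 1 mm)
Re_J = 210.0    # jet Reynolds number set in the study

V = math.sqrt(2.0 * (P0 - P_out) / rho)   # characteristic intraglottal velocity, m/s
mu = rho * V * d / Re_J                   # viscosity implied by Re_J = rho*V*d/mu, Pa*s

print(f"V = {V:.1f} m/s")                 # V = 42.1 m/s
print(f"implied mu = {mu:.3e} Pa*s")
```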

TABLE I.

FSI cases setup for data generation. Cases 1–6 are used for training, and cases 7–10 are used for additional validation.

Case        1     2     3     4     5     6     7     8     9     10
T (mm)      1.75  1.75  1.75  3.50  3.50  3.50  1.75  1.75  3.50  3.50
Psub (kPa)  1.00  1.00  1.00  1.00  1.00  1.00  1.25  0.75  1.25  0.75
α10 (kPa)   2.29  2.58  9.16  2.29  4.58  9.16  2.29  2.29  2.29  2.29
α01 (kPa)   0.25  0.50  1.00  0.25  0.50  1.00  0.25  0.25  0.25  0.25

Cases 1–6 are utilized in SINDy as training data, which have the same Psub but two different medial thicknesses and three stiffness constants. After steady vibration is established in the 3D FSI simulations, 80 time frames of data in each case, which cover at least 2 vibration cycles (from 8 to 16 ms, depending on the frequency), are used for training. That is, at each chosen time frame, the instantaneous values of the six input variables and the three output variables represent a data point for machine learning. To calculate these input and output variables, we extract the flow velocity and pressure along the centerline of the 3D flow field. In total, there are 480 data points for all 6 cases together for training. We will compare the output from the regression (i.e., the equations derived from machine learning) with those provided for training. To extend the validation, we consider cases 7–10, in which the subglottal pressure is different and whose data have not been used for training, for further assessment. For all ten cases, we will also use the machine learning enhanced 1D flow model to replace the 3D flow and perform FSI simulations, and we will compare the vibration frequency, amplitude, and phase delay of the vocal fold between the 3D FSI and simplified FSI models. The entire procedure is shown using a flow chart in the supplementary material.42

III. RESULTS AND DISCUSSION

A. Results from machine learning

After the training process, we obtain the explicit expressions of the output variables as defined in Sec. II C. These expressions are given in Appendix A. Results of the data regression are shown in Fig. 4 for τb/(ρuc²), τe/(ρuc²), and α(xb)/α(xc), where uc is the centerline velocity at the minimum section xc. In Fig. 4, the x axis represents the 3D FSI results of cases 1–10 and the y axis represents the predicted value based on the trained regression model, i.e., Eqs. (A1)–(A3). Red symbols represent training data from cases 1–6, whereas the blue symbols are the validation data from cases 7–10. Ideally, the predicted value is equal to the 3D FSI value so that all data points would fall on the dashed line y = x in Fig. 4. However, as a result of the error in the fitting process, the data are scattered around the line. From Fig. 4, we can see that both the training data and the validation data mostly cluster around y = x; thus, the predicted results from regression agree reasonably well with the 3D results. We calculate the mean error between the machine learning result and the benchmark result from the 3D simulations for each of these three output variables, i.e., the mean of |ϕML − ϕ3D|. For τb/(ρuc²), the mean errors of the training and validation data are 0.029 and 0.033, respectively; for τe/(ρuc²), they are 0.057 and 0.086, respectively; and for α(xb)/α(xc), they are 0.048 and 0.078, respectively.

FIG. 4.


(Color online) Comparison between 3D FSI results and the predicted results from data regression. Data in cases 1–6 (red symbols) are used as training data; data in cases 7–10 (blue symbols) are only used for validation. (a) Pressure loss at the glottal exit, τb/(ρuc²), (b) pressure loss at xe, τe/(ρuc²), and (c) the entrance effect ratio at xb, α(xb)/α(xc). The dashed line represents the ideal fit.

In Figs. 4(a) and 4(b), we see that the loss coefficients in the glottis, τb/(ρuc²) and τe/(ρuc²), vary from nearly zero to nearly three. A value of zero means that there is no loss in the flow, whereas a value near three represents a significant loss. Such a large loss occurs when the glottis is nearly closed and the flow speed becomes small, which is analogous to a closing mechanical valve in a pipe flow.

In Fig. 4(c), we see that α(xb)/α(xc) is clustered above 0.5, indicating that the entrance effect at the glottal exit is not necessarily significant at those time frames. Further examination shows that in those situations, the glottis has a small divergent angle or a short divergent section, which does not create a strong entrance effect. However, there are also a number of data points in Fig. 4(c) where α(xb)/α(xc) is below 0.5 and even close to 0.2. For those situations, the divergent section is typically long and/or has a large diverging angle, which causes early flow separation and a strong entrance effect. The distribution of the data points in Fig. 4(c), hence, covers a wide range of situations for the flow, which is preferred for the training purpose. To further illustrate the data variation and validation of the machine learning, we have added a figure in the supplementary material, plotting the three output variables against time for the validation cases.42

Once the expressions for τb, τe, and α(xb)/α(xc) are derived from machine learning, we can apply them in the 1D flow model. In doing so, τ(x) and α(x) in the glottis take the assumed distribution as in Sec. II C. That is, we assume a cubic function for τ(x) as shown in Fig. 2(b) where τ = 0 and dτ/dx = 0 at xa. When xc is very close to xb, i.e., the glottis is almost purely convergent, the cubic interpolation may result in overshoot of the function. Thus, when |xc − xb| is less than H, we disregard τe and switch the cubic function for τ(x) to the quadratic function with the same boundary conditions at xa. For the area correction coefficient α(x), we assume a quadratic function between xc and xb while requiring dα/dx = 0 at xb as shown in Fig. 2(b). Between the glottal inlet xa and the minimum section xc, the thickness of the boundary layer changes only slightly; therefore, we assume that α(x)/α(xc) = 1 in that region.
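The two assumed distributions can be sketched as small functions. This is an illustrative reading of the rules stated above, not the authors' implementation; the function names and argument order are hypothetical, and the quadratic fallback is taken to be the parabola through τb at xb with zero value and zero slope at xa.

```python
def tau_profile(x, x_a, x_b, x_c, x_e, tau_e, tau_b, H):
    """Loss term tau(x) on [x_a, x_b] with tau = dtau/dx = 0 at x_a."""
    s, s_e, s_b = x - x_a, x_e - x_a, x_b - x_a
    if abs(x_c - x_b) < H:
        # Nearly convergent glottis: disregard tau_e and use a quadratic.
        return tau_b * (s / s_b) ** 2
    # Cubic a*s^2 + b*s^3 passing through (s_e, tau_e) and (s_b, tau_b).
    b = (tau_b / s_b**2 - tau_e / s_e**2) / (s_b - s_e)
    a = tau_e / s_e**2 - b * s_e
    return a * s**2 + b * s**3


def alpha_ratio(x, x_b, x_c, ratio_b):
    """alpha(x)/alpha(x_c): 1 upstream of x_c, quadratic on [x_c, x_b]."""
    if x <= x_c or x_b == x_c:
        return 1.0
    z = (x_b - x) / (x_b - x_c)
    # Equals 1 at x_c and ratio_b at x_b, with zero slope at x_b.
    return ratio_b + (1.0 - ratio_b) * z**2
```

Here `ratio_b` stands for the learned value of α(xb)/α(xc), and `tau_e` and `tau_b` for the learned loss values at xe and xb.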

To verify the 1D flow model enhanced by machine learning, we first compared the pressure distribution predicted by this model against the 3D results generated from cases that are not used in the training process. More specifically, we use the instantaneous glottal shape obtained from the 3D FSI simulations of cases 7–10 and calculate the pressure distribution using the 1D flow model in Eq. (1). Then, the result is compared with the pressure at the centerline extracted from the 3D flow field.

Figure 5 shows such a comparison of the pressure distributions at the vocal fold opening and closing phases from cases 7 and 8, which have the same medial thickness of T = 1.75 mm but different Psub values. It can be seen that the pressure produced by the 1D flow model is close to that from the 3D simulation for different Psub values, including the negative pressure during the opening phase when the glottis is of divergent shape. From our previous study,29 the correct prediction of the negative pressure in this case of small medial thickness is important; otherwise, the vocal fold may exhibit a different vibration mode that is associated with the first eigenmode of the tissue structure.29

FIG. 5.


Comparison of the pressure distribution along the centerline between the 3D FSI and 1D flow model for small medial thickness T cases. (a) Closing phase in case 7, (b) opening phase in case 7, (c) closing phase in case 8, and (d) opening phase in case 8. The inset shows the corresponding vocal fold deformation (solid lines) from its initial configuration (dashed lines).

Figure 6 shows a comparison of the pressure distribution for cases 9 and 10, which have the same medial thickness of T = 3.50 mm but different Psub values. In these cases, the glottis is relatively long and may form diverging, converging, and converging-diverging shapes at different vibration phases. From our previous study,29 if a Bernoulli equation–based model is used, then an inappropriate setting of the location for flow separation may lead to an exceedingly negative pressure that might destabilize the FSI simulation. In the present study, the 1D flow model correctly predicts the negative pressure zone [e.g., Figs. 6(c) and 6(f)] and captures the pressure reasonably well at different glottal shapes.

FIG. 6.


Comparison of the pressure distribution along the centerline between the 3D FSI and 1D flow model for large medial thickness T cases. (a) Closing phase, (b) opening phase, and (c) maximum opening in case 9; (d) closing phase, (e) opening phase, and (f) maximum opening in case 10. The inset is explained in Fig. 5.

In Figs. 5 and 6, we also include the pressure distributions calculated by the untrained 1D flow model with ad hoc assumptions in the pressure loss and entrance effect.29,30 Overall, the current 1D flow model trained by machine learning achieves better accuracy as compared with those reference results.

Besides the pressure distribution, the flow rate is also important for the glottal airflow. Figure 7 compares the volume flow rate between the 3D FSI and 1D flow models for cases 7–10, which are not used in the training process. Similar to the pressure distribution, we calculate the volume flow rate in the 1D flow model using the same glottal shape from the 3D simulation. Two representative cycles are selected for each case comparison. In cases 9 and 10, in which the medial thicknesses are larger, the vocal folds have better closure and, thus, the flow rates have greater oscillations and are reduced to nearly zero at closure. On the other hand, in cases 7 and 8, in which the medial thicknesses are smaller, the vocal folds maintain a significant gap at the closing phases, which leads to a high flow rate and low-magnitude oscillations during vibration. In all cases under consideration, the flow rate from the 1D flow model agrees well with the 3D FSI result.

FIG. 7.

Comparison of the volume flow rate between the 3D FSI and 1D flow model for (a) case 7, (b) case 8, (c) case 9, and (d) case 10, which are not in the training data set.

As a reference, we also include the volume flow rates in cases 7–10 calculated using the untrained 1D flow model.29,30 Compared with the reference result, the flow rate from the present model shows significantly better agreement with the 3D FSI result.

Therefore, the present 1D flow model provides reliable prediction of the volume flow rate during the vocal fold vibration.

B. Application in the FSI of idealized vocal fold models

After verifying the 1D flow model for the flow calculation only, we then apply it in the FSI simulation by coupling it with the 3D idealized vocal fold model described in Sec. III A. We compare the vibration characteristics from this 1D-flow/3D-solid hybrid FSI simulation with those from the full 3D FSI simulation. For this comparison, all ten cases in Table I are considered, which include variations in the medial thickness, stiffness properties, and subglottal pressure. As mentioned before, cases 7–10 were not used in the training data but are included here as an additional validation.

We use the vibration amplitude, frequency, and phase delay for quantitative comparison between the two sets of simulations. The vibration amplitude is defined as the maximum y-displacement of the vocal fold measured at the glottal exit in a cycle. In cases 1–3, the tissue stiffness increases while the other parameters are held the same. As shown in Fig. 8(a), the vibration amplitude decreases with increasing tissue stiffness in cases 1–3, which have a smaller medial thickness. Similar trends can be seen in cases 4–6, which have a larger medial thickness. Cases 7, 1, and 8 have different subglottal pressures, ordered from high to low. Correspondingly, their vibration amplitudes decrease in the same order. A similar result can be seen for cases 9, 4, and 10, in which Psub is decreased. In all ten cases, the two sets of FSI simulations produce results in close agreement.

FIG. 8.

(Color online) Comparison between the 3D FSI and reduced-order model. (a) Amplitude, (b) frequency, and (c) phase delay.

The comparison of the vibration frequency is shown in Fig. 8(b). The second-eigenmode type vibration7 is established in all of the cases in which the vocal fold oscillation is primarily in the lateral or y-direction, and this mode is captured by the 1D-flow FSI simulation. Thus, the frequency predicted by this hybrid FSI matches that predicted by the full 3D FSI. From Fig. 8(b), the effect of the tissue stiffness on the vibration frequency is clear, e.g., in cases 1–3 and 4–6 in which an increase in the tissue stiffness leads to an increase in the vibration frequency.

The phase delay is calculated using the temporal difference between the glottal inlet and the exit in the mid xy-plane in terms of the displacement. Good agreement is again achieved in the comparison between the two sets of simulations for all cases. For this quantity, the most influential parameter is the medial thickness as the longer glottis would create a greater phase difference between the glottal inlet and the exit and, therefore, lead to a more pronounced mucosal wave along the glottis.
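
The three metrics used in this comparison can be extracted from displacement time series in a few lines. The following is only a minimal sketch, not the post-processing used in this study: it assumes uniformly sampled displacement signals at the glottal inlet and exit, estimates the frequency from upward zero crossings of the mean-removed exit signal, and reports the phase delay as positive when the exit lags the inlet (the sign convention is an assumption). The function names and the synthetic sinusoidal signals are hypothetical.

```python
import math

def upward_zero_crossings(t, y):
    """Times at which y crosses zero going upward (linear interpolation)."""
    times = []
    for i in range(1, len(y)):
        if y[i - 1] < 0.0 <= y[i]:
            frac = -y[i - 1] / (y[i] - y[i - 1])
            times.append(t[i - 1] + frac * (t[i] - t[i - 1]))
    return times

def vibration_metrics(t, y_inlet, y_exit):
    """Return (amplitude, frequency in Hz, phase delay in degrees)."""
    # Amplitude: maximum displacement at the glottal exit over the record.
    amplitude = max(y_exit)
    # Remove the mean so that zero crossings mark one event per cycle.
    mean_in = sum(y_inlet) / len(y_inlet)
    mean_ex = sum(y_exit) / len(y_exit)
    t_in = upward_zero_crossings(t, [v - mean_in for v in y_inlet])
    t_ex = upward_zero_crossings(t, [v - mean_ex for v in y_exit])
    if len(t_ex) < 2 or not t_in:
        raise ValueError("need at least two cycles in the record")
    # Frequency: inverse of the mean period between exit crossings.
    period = (t_ex[-1] - t_ex[0]) / (len(t_ex) - 1)
    frequency = 1.0 / period
    # Phase delay: mean time lag (exit minus inlet) converted to degrees
    # and wrapped into [-180, 180).
    n = min(len(t_in), len(t_ex))
    lag = sum(t_ex[k] - t_in[k] for k in range(n)) / n
    phase = (360.0 * frequency * lag + 180.0) % 360.0 - 180.0
    return amplitude, frequency, phase

# Synthetic check: 150 Hz oscillation, exit lagging the inlet by 30 degrees.
f0, dt, n_samples = 150.0, 1.0e-5, 2000
t = [i * dt for i in range(n_samples)]
y_inlet = [0.60 * math.sin(2.0 * math.pi * f0 * ti - 0.5) for ti in t]
y_exit = [0.65 * math.sin(2.0 * math.pi * f0 * ti - 0.5 - math.radians(30.0))
          for ti in t]
amp, freq, phase = vibration_metrics(t, y_inlet, y_exit)
print(f"amplitude = {amp:.3f}, frequency = {freq:.1f} Hz, phase = {phase:.1f} deg")
```

Zero crossings are used here only because they are simple; a cross-correlation or spectral estimate would serve equally well for noisier signals.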

To compare the performance of the present flow model with that of the untrained flow model, we include the results from the previous studies.29,30 This comparison is shown in Table II. For brevity, only one case each for small T and large T is shown. We also include the average differences for the frequencies, amplitudes, and phase delays in cases 1–10. If we only consider cases 7–10, the average errors are 3.3%, 6.3%, and 5.2° for the frequency, amplitude, and phase delay, respectively, which are consistent with the overall averages in Table II. The comparison shows that the trained flow model predicts the vibration more accurately than the previous untrained model.

TABLE II.

1D-flow FSI results compared with 3D FSI in terms of vibration frequency f, amplitude d, and phase delay ϕ. Results using the untrained model in Ref. 29 are also included.

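| Case | Model | f (Hz) | Difference | d (mm) | Difference | ϕ (°) | Difference (°) |
|---|---|---|---|---|---|---|---|
| Case 1 | 3D FSI | 132 | | 0.69 | | −19 | |
| | 1D-flow FSI | 135 | 2.3% | 0.65 | 5.8% | −15 | 4 |
| | Ref. 29 | 144 | 9.1% | 0.58 | 15.9% | −15 | 4 |
| Case 4 | 3D FSI | 140 | | 1.02 | | 157 | |
| | 1D-flow FSI | 136 | 2.9% | 1.04 | 2.0% | 159 | 2 |
| | Ref. 29 | 144 | 2.9% | 1.00 | 2.0% | 161 | 4 |
| Average error (cases 1–10) | 1D-flow FSI | | 2.3% | | 5.3% | | 6 |
| | Refs. 29 and 30 | | 3.9% | | 9.1% | | 11 |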
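
The "Difference" columns in Table II appear to be relative differences from the 3D FSI value for the frequency and amplitude, and absolute differences in degrees for the phase delay. As a small worked check under that assumption, the case 1 entries can be reproduced as follows (the helper name is hypothetical):

```python
def percent_difference(value, reference):
    """Relative difference of `value` from `reference`, in percent."""
    return abs(value - reference) / abs(reference) * 100.0

# Case 1 entries of Table II: 3D FSI, 1D-flow FSI, and Ref. 29.
f_3d, f_1d, f_ref29 = 132.0, 135.0, 144.0        # frequency (Hz)
d_3d, d_1d, d_ref29 = 0.69, 0.65, 0.58           # amplitude (mm)
phi_3d, phi_1d, phi_ref29 = -19.0, -15.0, -15.0  # phase delay (degrees)

freq_diff_1d = percent_difference(f_1d, f_3d)
freq_diff_ref29 = percent_difference(f_ref29, f_3d)
amp_diff_1d = percent_difference(d_1d, d_3d)
amp_diff_ref29 = percent_difference(d_ref29, d_3d)
phase_diff_1d = abs(phi_1d - phi_3d)        # absolute difference in degrees
phase_diff_ref29 = abs(phi_ref29 - phi_3d)

print(f"frequency: {freq_diff_1d:.1f}% (1D-flow), {freq_diff_ref29:.1f}% (Ref. 29)")
print(f"amplitude: {amp_diff_1d:.1f}% (1D-flow), {amp_diff_ref29:.1f}% (Ref. 29)")
print(f"phase:     {phase_diff_1d:.0f} deg (1D-flow), {phase_diff_ref29:.0f} deg (Ref. 29)")
```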

C. Application in the subject-specific vocal fold models

Other than the idealized vocal fold geometry, we also apply the 1D flow model in the FSI simulation of the subject-specific vocal fold models that were generated based on a 3D scan of a rabbit's larynx (Fig. 9). In the present study, we use the vocal fold models created previously in Chang et al.20 and will validate the simulation results against the experimental data of evoked in vivo phonation. The same models were also used in Chen et al.30 to validate the 1D flow model without machine learning. Readers are referred to Chang et al.20 for the details of how these anatomical models were created and how the in vivo measurement of the vocal fold vibration was conducted. Only a brief summary is given here to provide the context.

FIG. 9.

(Color online) Subject-specific vocal fold model. (a) Reconstructed larynx geometry from a superior view and (b) profiles of the vocal fold cover (blue) and body (red) segmented from the MRI scan.

In the experiment,17 live rabbits were used in the study; their vocal folds were surgically sutured to achieve adduction, and phonation was evoked by introducing pressurized air from their trachea. High-speed videos of the vocal fold vibration were taken during the experiment, which provide the vibration frequency, magnitude, and waveform as the validation data for our current study.

After the phonation experiment, the rabbit larynx was excised and high-resolution MRI was performed to obtain details of the morphology of the vocal fold while the vocal fold maintained the adducted phonatory position. The 3D anatomical vocal fold model was generated for each of the five samples after manual segmentation from the MRI data and surface mesh reduction/smoothing.20 Furthermore, the tissue properties were estimated in that study through simulations, and these properties of the five samples can be found in our previous publications.20,30

To perform the hybrid 1D-flow/3D-solid FSI simulation, we couple the FEM representation of each anatomical vocal fold with the present trained 1D flow model. The subglottal pressure in each case is obtained from the experiment and is in the range of 0.72–1.05 kPa.20 Figure 10 shows a comparison of the normalized glottal gap width in a sequence of vocal fold oscillations between the FSI simulation and the experiment. Because the high-speed imaging does not provide a length scale, we use the normalized gap width, d/dmax, for comparison where d is the gap width of the glottis measured at the mid-section and dmax is its peak value. From Fig. 10, it can be seen that the waveform obtained from the simulation agrees generally well with the experiment. In modeling the vocal fold contact, we maintain a minimum glottal gap of 0.02 mm between the sides for the flow.6 Thus, the waveform from the FSI simulation does not have full closure. For quantitative comparison, we further compute the normalized root-mean-square (rms) error of the waveform between the simulation data and experiment for each sample. The result is listed for all five samples in Table III, which shows that the error is within 15% for all cases. As shown in Table III, these rms errors are lower than the results in the previous work in which the untrained flow model was used.30 Therefore, the trained flow model provides improved results and reasonable prediction of the vibration for these subject-specific models.
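
The exact normalization behind the rms error is not spelled out here. As a minimal sketch, assuming each waveform is first scaled by its own peak (d/dmax) and the rms of the pointwise difference is then taken over samples at matching times, the metric could be computed as below. The function name and the sample values are hypothetical, with the experimental values given in arbitrary image units and the simulated waveform kept above a small minimum gap.

```python
import math

def normalized_rms_error(sim, exp):
    """RMS of (sim - exp) after each waveform is scaled by its own peak."""
    if len(sim) != len(exp) or not sim:
        raise ValueError("waveforms must be non-empty and of equal length")
    sim_peak = max(sim)
    exp_peak = max(exp)
    sim_n = [v / sim_peak for v in sim]   # d/dmax for the simulation
    exp_n = [v / exp_peak for v in exp]   # d/dmax for the experiment
    mse = sum((a - b) ** 2 for a, b in zip(sim_n, exp_n)) / len(sim_n)
    return math.sqrt(mse)

# Hypothetical samples over one cycle at matching time instants.
gap_sim = [0.02, 0.28, 0.50, 0.28, 0.02]   # mm, never fully closed
gap_exp = [0.0, 12.0, 24.0, 12.0, 0.0]     # arbitrary image units
err = normalized_rms_error(gap_sim, gap_exp)
print(f"normalized rms error = {err:.1%}")
```

Because both waveforms are scaled to a peak of one, the missing length scale in the high-speed images does not enter the result.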

FIG. 10.

Waveforms of the normalized glottal gap width from the in vivo phonation experiment and the FSI simulation for the rabbit sample R1.

TABLE III.

The normalized root-mean-square (rms) error of the gap width waveform for each rabbit sample. Results from a previous study (Ref. 30) with the untrained flow model are also included.

| Sample | R1 | R2 | R3 | R4 | R5 |
|---|---|---|---|---|---|
| Error | 13.7% | 10.3% | 12.7% | 14.5% | 14.0% |
| Error (Ref. 30) | 14.3% | 11.3% | 15.3% | 15.9% | 16.9% |

Figure 11 further shows a quantitative comparison between the experiment and the FSI simulation for all five samples in terms of the vibration frequency and normalized amplitude, dmax/L, where L is the vocal fold length. In the experiment, each sample had three trials to analyze the standard deviations.17,20 From Fig. 11, both the frequency and amplitude from the simulation fall within the range of the experimental data for all of the five samples despite significant variations among the individual subjects. This result, again, confirms the performance of the present 1D flow model in the FSI simulation.

FIG. 11.

(Color online) Frequency and amplitude comparison between the experimental and numerical results for five rabbit samples.

D. Discussion

In contrast with 3D computational fluid dynamics models that employ extensive computing resources and require substantially longer simulation time, the drastically simplified flow models, such as the present 1D model, offer much faster turnaround and may be used in conjunction with the 3D models in a complementary fashion for model-based prediction. The performance of such models could be evaluated in terms of their accuracy, robustness, and required information during practical implementation.

Using a similar set of partial differential equations based on momentum and mass conservation, the present 1D flow model retains the viscous and entrance effects of the flow model from the previous studies of Luo and co-workers29,30 and, hence, offers similar advantages shown therein in comparison with the traditional Bernoulli-based models. In the present model, we have incorporated machine learning to generalize the pressure loss and entrance effect in the glottis. Those effects were only assumed in an ad hoc manner in the untrained model29,30 and may need adjustment in practical use. For example, the shear stress related to flow separation, $\tau_\chi$, was previously modeled as $\tau_\chi = (A/s)(1-\chi)\,\rho u\,(\partial u/\partial x)$, where s is the perimeter around the cross section, A is the effective cross-section area, and $0 \le \chi \le 1$ is a constant representing the pressure recovery [see Eq. (2) in Ref. 30]. In its application, the value of χ has to be adjusted empirically for a divergent channel to achieve matching results to the 3D model. In addition, the area correctional coefficient, α(x), has an adjustable constant C1 in Eq. (4) of Ref. 30. In the present study, the parameters in the shear stress and the area correctional coefficient have been expressed as explicit functions of the Reynolds number and a few geometrical parameters describing the instantaneous shape of the glottis through a data regression procedure and, thus, generalize better.
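
To make the role of the pressure-recovery constant concrete, the separation shear stress of the untrained model can be evaluated directly. This is only an illustrative sketch, assuming the form τχ = (A/s)(1 − χ)ρu(∂u/∂x) quoted from Ref. 30; the function name and all numerical values are hypothetical and chosen for readability, not taken from the simulations.

```python
def separation_shear_stress(area, perimeter, chi, rho, u, du_dx):
    """Shear stress tau_chi = (A/s) * (1 - chi) * rho * u * du/dx.

    chi = 1 corresponds to full pressure recovery (no added stress);
    smaller chi gives a larger contribution from flow separation.
    """
    if not 0.0 <= chi <= 1.0:
        raise ValueError("chi must lie in [0, 1]")
    return (area / perimeter) * (1.0 - chi) * rho * u * du_dx

# Hypothetical values for a decelerating flow in a divergent glottis (SI units).
area = 2.0e-5        # effective cross-section area A (m^2)
perimeter = 4.0e-2   # perimeter s around the cross section (m)
rho = 1.2            # air density (kg/m^3)
u = 30.0             # cross-section-averaged velocity (m/s)
du_dx = -2000.0      # velocity gradient along the glottis (1/s)

tau_partial = separation_shear_stress(area, perimeter, 0.8, rho, u, du_dx)
tau_full = separation_shear_stress(area, perimeter, 1.0, rho, u, du_dx)
print(f"chi = 0.8: tau = {tau_partial:.2f} Pa; chi = 1.0: tau = {tau_full:.2f} Pa")
```

The sketch shows why a fixed χ is restrictive: the stress scales linearly with (1 − χ), so any mismatch in χ for a given glottal shape carries over directly into the predicted pressure.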

Using the idealized vocal fold geometry, we have demonstrated that the new flow model provides a better prediction of the pressure distribution and flow rate than the previous untrained model (Sec. III A). When applied to the FSI simulation, the new flow model leads to a clearly more accurate prediction of the vibration frequency, amplitude, and phase delay in the vocal fold dynamics (Sec. III B). However, when it is applied to the subject-specific vocal fold geometries, the new flow model offers only limited improvement over the previous untrained model (Sec. III C), with the normalized rms error for all samples ranging from 10% to 15%. The reason for this limitation is most likely the presence of many other uncertainties related to this type of subject-specific model, e.g., the quality of the MRI data, the segmentation errors, the assumed tissue properties, and the experiment itself, which begin to dominate over the numerical model's own error. In that case, improving the numerical model alone will not further increase the overall accuracy.

Despite being unable to substantially improve the accuracy in the subject-specific cases, the present flow model is still advantageous as compared with similar existing models. We emphasize that from the idealized geometries (including significant variations in the medial thickness, subglottal pressure, and the tissue stiffness) to the anatomical geometries, the current 1D flow model uses exactly the same pressure loss and entrance effect functions that are derived from machine learning, and there is no need to make any parameter adjustment. This robustness, the overall improved accuracy in all of the tests, and the fact that the present model does not require any additional input information indicate that the present flow model has significantly better performance than the previous untrained model.29,30

IV. CONCLUSION

In this study, we have presented a new 1D flow model for the glottal airflow, which is based on the viscous flow assumption. As compared with similar models in the previous studies, in the current flow model we derive the pressure loss and the entrance effect using a machine learning approach and express them as explicit functions of the Reynolds number and the parameters describing the characteristic shape of the glottis at any instant. Unlike previous models in which the parameters need to be modified ad hoc for different cases, such as different vocal fold geometries, the present machine-trained model can be used for more general situations without the need to modify its parameters. We have tested the performance of this 1D flow model in three scenarios. First, we use this flow model to calculate the pressure distribution and the volume flow rate using the glottal configuration from the 3D FSI simulations not included in the training process. The results agree well with those directly from the 3D simulations, and they are significantly better than those from the previous untrained model. Second, the 1D flow model is coupled with the 3D idealized vocal fold geometry to perform the hybrid 1D-flow/3D-solid FSI simulation, and the results show that the vibration characteristics match the full 3D FSI simulation results significantly better than those from the untrained model in terms of the vibration frequency, amplitude, and phase delay. Third, we apply the 1D flow model to the subject-specific vocal fold models constructed from the rabbit larynx, and the FSI simulation results are compared against the previous in vivo experimental data. Even though the improvement in this case is limited, likely due to the presence of other uncertainties, the new model achieves this accuracy without the need to adjust its parameters.

In summary, we conclude that the present 1D glottal airflow model enhanced by machine learning is more accurate and robust than the similar models in the previous studies and could be useful for the efficient modeling of vocal fold dynamics, e.g., an estimate of the unknown tissue properties of an individual subject's vocal fold using model-based simulations, or design optimization of the surgical implant inserted into a paralyzed vocal fold.

ACKNOWLEDGMENT

This research was supported by National Institutes of Health (NIH) Grant No. 5 R01 DC016236 03 from the National Institute on Deafness and Other Communication Disorders (NIDCD).

APPENDIX A: EXPRESSIONS FROM SPARSE REGRESSION

The following formulas are the expressions derived from the SINDy method for the pressure losses τb, τe, and the area correction coefficient α(x) at xb.

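$$
\begin{aligned}
\frac{\tau_b}{\rho u_c^2} ={}& 5.1282\,r_b - 0.1902\,l_{bc} - 2.2354\,Re^{1/3} + 3.3258\,r_d + 0.2611\,r_a - 0.0553\,l_{ac} \\
& - 0.1955\,r_b^2 - 0.9508\,r_b r_d - 0.2648\,r_b r_a + 0.0606\,l_{bc} r_a + 0.0759\,Re^{2/3} + 0.2523\,Re^{1/3} r_a \\
& - 0.2052\,r_d^2 - 0.2450\,r_d r_a + 0.2834\,r_a^2 + 0.0680\,r_b r_d^2 - 0.1170\,r_b r_a^2 - 0.0361\,Re^{1/3} r_a^2,
\end{aligned}
\tag{A1}
$$

$$
\begin{aligned}
\frac{\tau_e}{\rho u_c^2} ={}& 20.8293\,r_b - 3.4297\,l_{bc} + 0.0784\,Re^{1/3} - 0.2519\,r_d - 1.8603\,r_a - 0.0637\,l_{ac} \\
& - 2.0506\,r_b^2 + 0.2090\,r_b l_{bc} - 1.9726\,r_b Re^{1/3} - 1.7743\,r_b r_d - 1.1751\,r_b r_a \\
& + 0.0292\,l_{bc} Re^{1/3} + 0.8267\,l_{bc} r_d + 0.5027\,l_{bc} r_a + 0.2645\,r_d r_a + 0.5762\,r_a^2 \\
& + 0.0636\,r_b^3 + 0.1096\,r_b^2 Re^{1/3} + 0.0548\,r_b^2 r_d - 0.0316\,r_b l_{bc} r_d + 0.0593\,r_b Re^{2/3} \\
& + 0.1785\,r_b Re^{1/3} r_a + 0.1384\,r_b r_d^2 - 0.2707\,r_b r_a^2 - 0.0544\,l_{bc} r_d^2 - 0.0796\,l_{bc} r_d r_a \\
& - 0.0300\,Re^{1/3} r_a^2,
\end{aligned}
\tag{A2}
$$

$$
\begin{aligned}
\frac{\alpha(x_b)}{\alpha(x_c)} ={}& 1.6652\,r_b - 3.4873\,l_{bc} + 0.0285\,Re^{1/3} + 1.1238\,r_d + 0.3850\,r_a - 0.8358\,r_b^2 \\
& + 0.9378\,r_b l_{bc} - 0.0559\,r_b Re^{1/3} - 1.1604\,r_b r_d - 0.5005\,r_b r_a + 0.6961\,l_{bc} r_d \\
& + 0.8935\,l_{bc} r_a - 0.0344\,Re^{1/3} l_{ac} - 0.0938\,r_d^2 - 0.1952\,r_d r_a + 0.0449\,r_d l_{ac} \\
& + 0.0716\,r_b^3 - 0.0551\,r_b^2 l_{bc} + 0.1134\,r_b^2 r_d + 0.0926\,r_b^2 r_a - 0.0788\,r_b l_{bc} r_d \\
& - 0.1336\,r_b l_{bc} r_a + 0.0373\,r_b Re^{1/3} l_{ac} + 0.0900\,r_b r_d^2 + 0.2041\,r_b r_d r_a \\
& - 0.0504\,r_b r_d l_{ac} - 0.0353\,l_{bc} r_d^2 - 0.0896\,l_{bc} r_d r_a - 0.0565\,l_{bc} r_a^2.
\end{aligned}
\tag{A3}
$$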
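
Expressions of this kind, with only a subset of candidate terms retained, are what sparse regression produces. The following is a minimal sketch of sequentially thresholded least squares, the regression step described for SINDy in Ref. 37, written in plain Python for self-containment. It is not the authors' training code: the function names, the small polynomial library, the threshold, and the synthetic data are all hypothetical, and a practical implementation would use a numerical linear algebra library and the glottal shape parameters and Reynolds number as inputs.

```python
def solve_least_squares(columns, y):
    """Least-squares coefficients via normal equations and Gaussian elimination."""
    m = len(columns)
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(m)]
           + [sum(a * b for a, b in zip(columns[i], y))] for i in range(m)]
    for k in range(m):
        # Partial pivoting, then elimination below the pivot.
        pivot = max(range(k, m), key=lambda r: abs(aug[r][k]))
        aug[k], aug[pivot] = aug[pivot], aug[k]
        for r in range(k + 1, m):
            factor = aug[r][k] / aug[k][k]
            for c in range(k, m + 1):
                aug[r][c] -= factor * aug[k][c]
    coef = [0.0] * m
    for k in reversed(range(m)):
        rhs = aug[k][m] - sum(aug[k][c] * coef[c] for c in range(k + 1, m))
        coef[k] = rhs / aug[k][k]
    return coef

def stlsq(columns, y, threshold, n_iter=10):
    """Fit, drop terms with |coefficient| < threshold, refit on the rest."""
    n_terms = len(columns)
    active = list(range(n_terms))
    coef = [0.0] * n_terms
    for _ in range(n_iter):
        fit = solve_least_squares([columns[i] for i in active], y)
        coef = [0.0] * n_terms
        for i, c in zip(active, fit):
            coef[i] = c
        kept = [i for i in active if abs(coef[i]) >= threshold]
        if kept == active:
            break                      # no term was dropped: converged
        for i in active:
            if i not in kept:
                coef[i] = 0.0          # discard the small terms
        active = kept
        if not active:
            break
    return coef

# Synthetic data generated from y = 1.5 x - 0.8 x^3 (hypothetical).
x = [0.1 * i for i in range(1, 21)]
y = [1.5 * xi - 0.8 * xi ** 3 for xi in x]
names = ["1", "x", "x^2", "x^3"]
library = [[1.0] * len(x), x, [xi ** 2 for xi in x], [xi ** 3 for xi in x]]
coef = stlsq(library, y, threshold=0.05)
print({name: round(c, 4) for name, c in zip(names, coef)})
```

The threshold controls the trade-off between sparsity and fit; in this sketch it is simply fixed by hand.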

References

1. Mittal R., Zheng X., Bhardwaj R., Seo J. H., Xue Q., and Bielamowicz S., “Toward a simulation-based tool for the treatment of vocal fold paralysis,” Front. Physiol. 2, 19 (2011). doi: 10.3389/fphys.2011.00019
2. Alipour F., Berry D. A., and Titze I. R., “A finite-element model of vocal-fold vibration,” J. Acoust. Soc. Am. 108(6), 3003–3012 (2000). doi: 10.1121/1.1324678
3. Hunter E. J., Titze I. R., and Alipour F., “A three-dimensional model of vocal fold abduction/adduction,” J. Acoust. Soc. Am. 115(4), 1747–1759 (2004). doi: 10.1121/1.1652033
4. Thomson S. L., Mongeau L., and Frankel S. H., “Aerodynamic transfer of energy to the vocal folds,” J. Acoust. Soc. Am. 118(3), 1689–1700 (2005). doi: 10.1121/1.2000787
5. Cook D. D. and Mongeau L., “Sensitivity of a continuum vocal fold model to geometric parameters, constraints, and boundary conditions,” J. Acoust. Soc. Am. 121(4), 2247–2253 (2007). doi: 10.1121/1.2536709
6. Luo H., Mittal R., Zheng X., Bielamowicz S. A., Walsh R. J., and Hahn J. K., “An immersed-boundary method for flow–structure interaction in biological systems with application to phonation,” J. Comput. Phys. 227(22), 9303–9332 (2008). doi: 10.1016/j.jcp.2008.05.001
7. Luo H., Mittal R., and Bielamowicz S. A., “Analysis of flow-structure interaction in the larynx during phonation using an immersed-boundary method,” J. Acoust. Soc. Am. 126(2), 816–824 (2009). doi: 10.1121/1.3158942
8. Shurtz T. E. and Thomson S. L., “Influence of numerical model decisions on the flow-induced vibration of a computational vocal fold model,” Comput. Struct. 122, 44–54 (2013). doi: 10.1016/j.compstruc.2012.10.015
9. Chang S., Tian F.-B., Luo H., Doyle J. F., and Rousseau B., “The role of finite displacements in vocal fold modeling,” J. Biomech. Eng. 135(11), 111008 (2013). doi: 10.1115/1.4025330
10. Zhang Z., “Effect of vocal fold stiffness on voice production in a three-dimensional body-cover phonation model,” J. Acoust. Soc. Am. 142(4), 2311–2321 (2017). doi: 10.1121/1.5008497
11. Yang J., Wang X., Krane M., and Zhang L. T., “Fully-coupled aeroelastic simulation with fluid compressibility-for application to vocal fold vibration,” Comput. Methods Appl. Mech. Eng. 315, 584–606 (2017). doi: 10.1016/j.cma.2016.11.010
12. Valášek J., Kaltenbacher M., and Sváček P., “On the application of acoustic analogies in the numerical simulation of human phonation process,” Flow, Turbul. Combust. 102(1), 129–143 (2019). doi: 10.1007/s10494-018-9900-z
13. Sadeghi H., Kniesburges S., Kaltenbacher M., Schützenberger A., and Döllinger M., “Computational models of laryngeal aerodynamics: Potentials and numerical costs,” J. Voice 33(4), 385–400 (2019). doi: 10.1016/j.jvoice.2018.01.001
14. Madruga de Melo E. C., Lemos M., Aragão Ximenes Filho J., Sennes L. U., Nascimento Saldiva P. H., and Tsuji D. H., “Distribution of collagen in the lamina propria of the human vocal fold,” The Laryngoscope 113(12), 2187–2191 (2003). doi: 10.1097/00005537-200312000-00027
15. Pickup B. A. and Thomson S. L., “Flow-induced vibratory response of idealized versus magnetic resonance imaging-based synthetic vocal fold models,” J. Acoust. Soc. Am. 128(3), EL124–EL129 (2010). doi: 10.1121/1.3455876
16. Wu L. and Zhang Z., “A parametric vocal fold model based on magnetic resonance imaging,” J. Acoust. Soc. Am. 140(2), EL159–EL165 (2016). doi: 10.1121/1.4959599
17. Novaleski C. K., Kojima T., Chang S., Luo H., Valenzuela C. V., and Rousseau B., “Nonstimulated rabbit phonation model: Cricothyroid approximation,” The Laryngoscope 126(7), 1589–1594 (2016). doi: 10.1002/lary.25559
18. Mittal R., Erath B. D., and Plesniak M. W., “Fluid dynamics of human phonation and speech,” Annu. Rev. Fluid Mech. 45, 437–467 (2013). doi: 10.1146/annurev-fluid-011212-140636
19. Xue Q., Zheng X., Mittal R., and Bielamowicz S., “Subject-specific computational modeling of human phonation,” J. Acoust. Soc. Am. 135(3), 1445–1456 (2014). doi: 10.1121/1.4864479
20. Chang S., Novaleski C. K., Kojima T., Mizuta M., Luo H., and Rousseau B., “Subject-specific computational modeling of evoked rabbit phonation,” J. Biomech. Eng. 138(1), 011005 (2016). doi: 10.1115/1.4032057
21. Schmidt B., Stingl M., Leugering G., Berry D. A., and Döllinger M., “Material parameter computation for multi-layered vocal fold models,” J. Acoust. Soc. Am. 129(4), 2168–2180 (2011). doi: 10.1121/1.3543988
22. Schmidt B., Leugering G., Stingl M., Hüttner B., Agaimy A., and Döllinger M., “Material and shape optimization for multi-layered vocal fold models using transient loadings,” J. Acoust. Soc. Am. 134(2), 1261–1270 (2013). doi: 10.1121/1.4812253
23. Alipour F., Finnegan E. M., and Jaiswal S., “Phonatory characteristics of the excised human larynx in comparison to other species,” J. Voice 27(4), 441–447 (2013). doi: 10.1016/j.jvoice.2013.03.013
24. Döllinger M., Gómez P., Patel R. R., Alexiou C., Bohr C., and Schützenberger A., “Biomechanical simulation of vocal fold dynamics in adults based on laryngeal high-speed videoendoscopy,” PloS One 12(11), e0187486 (2017). doi: 10.1371/journal.pone.0187486
25. Hadwin P. J., Motie-Shirazi M., Erath B. D., and Peterson S. D., “Bayesian inference of vocal fold material properties from glottal area waveforms using a 2D finite element model,” Appl. Sci. 9(13), 2735 (2019). doi: 10.3390/app9132735
26. Zhang Z., “Estimation of vocal fold physiology from voice acoustics using machine learning,” J. Acoust. Soc. Am. 147(3), EL264–EL270 (2020). doi: 10.1121/10.0000927
27. Decker G. Z. and Thomson S. L., “Computational simulations of vocal fold vibration: Bernoulli versus Navier–Stokes,” J. Voice 21(3), 273–284 (2007). doi: 10.1016/j.jvoice.2005.12.002
28. Chang S., “Computational fluid-structure interaction for vocal fold modeling,” Ph.D. thesis, Vanderbilt University, 2016.
29. Li Z., Chen Y., Chang S., and Luo H., “A reduced-order flow model for fluid-structure interaction simulation of vocal fold vibration,” J. Biomech. Eng. 142, 021005 (2020). doi: 10.1115/1.4044033
30. Chen Y., Li Z., Chang S., Rousseau B., and Luo H., “Reduced-order flow model for vocal fold vibration: From idealized to subject-specific models,” J. Fluids Struct. 94, 102940 (2020). doi: 10.1016/j.jfluidstructs.2020.102940
31. Cancelli C. and Pedley T., “A separated-flow model for collapsible-tube oscillations,” J. Fluid Mech. 157, 375–404 (1985). doi: 10.1017/S0022112085002427
32. Anderson P., Fels S., and Green S., “Implementation and validation of a 1D fluid model for collapsible channels,” J. Biomech. Eng. 135(11), 111006 (2013). doi: 10.1115/1.4025326
33. Vasudevan A., Zappi V., Anderson P., and Fels S., “A fast robust 1D flow model for a self-oscillating coupled 2D FEM vocal fold simulation,” in Interspeech (2017), pp. 3482–3486.
34. Brunton S. L., Noack B. R., and Koumoutsakos P., “Machine learning for fluid mechanics,” Annu. Rev. Fluid Mech. 52, 477–508 (2020). doi: 10.1146/annurev-fluid-010719-060214
35. Gómez P., Schützenberger A., Semmler M., and Döllinger M., “Laryngeal pressure estimation with a recurrent neural network,” IEEE J. Transl. Eng. Health Med. 7, 1–11 (2019). doi: 10.1109/JTEHM.2018.2886021
36. Zhang Y., Zheng X., and Xue Q., “A deep neural network based glottal flow model for predicting fluid-structure interactions during voice production,” Appl. Sci. 10(2), 705 (2020). doi: 10.3390/app10020705
37. Brunton S. L., Proctor J. L., and Kutz J. N., “Discovering governing equations from data by sparse identification of nonlinear dynamical systems,” Proc. Natl. Acad. Sci. 113(15), 3932–3937 (2016). doi: 10.1073/pnas.1517384113
38. Raissi M., Perdikaris P., and Karniadakis G. E., “Machine learning of linear differential equations using Gaussian processes,” J. Comput. Phys. 348, 683–693 (2017). doi: 10.1016/j.jcp.2017.07.050
39. Berg J. and Nyström K., “Data-driven discovery of PDEs in complex datasets,” J. Comput. Phys. 384, 239–252 (2019). doi: 10.1016/j.jcp.2019.01.036
40. Li S., Scherer R. C., Wan M., and Wang S., “The effect of entrance radii on intraglottal pressure distributions in the divergent glottis,” J. Acoust. Soc. Am. 131(2), 1371–1377 (2012). doi: 10.1121/1.3675948
41. Tian F.-B., Dai H., Luo H., Doyle J. F., and Rousseau B., “Fluid–structure interaction involving large deformations: 3D simulations and applications to biological systems,” J. Comput. Phys. 258, 451–469 (2014). doi: 10.1016/j.jcp.2013.10.047
42. See supplementary material at https://www.scitation.org/doi/suppl/10.1121/10.0003561 for the machine learning flow chart, and plots of the output variables against time.
