Analysis of Tongue Muscle Strain During Speech From Multimodal Magnetic Resonance Imaging

Muhan Shao; Fangxu Xing; Aaron Carass; Xiao Liang; Jiachen Zhuo; Maureen Stone; Jonghye Woo; Jerry L Prince

doi:10.1044/2022_JSLHR-22-00329

2023 Jan 30;66(2):513–526. doi: 10.1044/2022_JSLHR-22-00329

PMCID: PMC10023187 PMID: 36716389

Abstract

Purpose:

Muscle groups within the tongue in healthy and diseased populations show different behaviors during speech. Visualizing and quantifying strain patterns of these muscle groups during tongue motion can provide insights into tongue motor control and adaptive behaviors of a patient.

Method:

We present a pipeline to estimate the strain along the muscle fiber directions in the deforming tongue during speech production. A deep convolutional network estimates the crossing muscle fiber directions in the tongue using diffusion-weighted magnetic resonance imaging (MRI) data acquired at rest. A phase-based registration algorithm is used to estimate motion of the tongue muscles from tagged MRI acquired during speech. After transforming both muscle fiber directions and motion fields into a common atlas space, strain tensors are computed and projected onto the muscle fiber directions, forming so-called strains in the line of action (SLAs) throughout the tongue. SLAs are then averaged over individual muscles that have been manually labeled in the atlas space using high-resolution T2-weighted MRI. Data were acquired from a cohort of eight healthy controls and two glossectomy patients, and the pipeline was run on all 10 subjects.

Results:

The crossing muscle fibers reconstructed by the deep network show orthogonal patterns. The strain analysis results demonstrate consistency of muscle behaviors among some healthy controls during speech production. The patients show irregular muscle patterns, and their tongue muscles tend to show more extension than the healthy controls.

Conclusions:

The study showed visual evidence of correlation between two muscle groups during speech production. Patients tend to have different strain patterns compared to the controls. Analysis of variations in muscle strains can potentially help develop treatment strategies in oral diseases.

Supplemental Material:

https://doi.org/10.23641/asha.21957011

The human tongue plays an essential role in many vital behaviors including swallowing, breathing, and speech production (Pierre et al., 2014; Stone et al., 2018). In speech production, the tongue helps form sounds by changing its surface shape via highly complex deformation patterns (Buchaillard et al., 2009; Sanguineti et al., 1997) that are produced by sequential activation of different tongue muscles. Understanding and interpreting the relationship between individual muscle groups and the complex functions performed by the tongue have been previously investigated (Gomez et al., 2020; Takemoto, 2001; Xing et al., 2019), but this research has been hampered by the inability to directly measure muscle activations in the tongue. Electromyography (EMG; Pittman & Bailey, 2009; Sasaki et al., 2016; Tankisi et al., 2013) has been used to record the electrical activity of tongue muscle during rest and contraction for diagnostic purposes. The EMG signals are measured by placing electrodes over muscle groups of interest to record the electrical activity. However, the surface or hooked wire electrodes that are used to make these measurements affect natural speech motion.

Strain represents the relative change in the position of points within the tongue body and can be used to quantify the deformation of the tongue during speech production. Measurements of three-dimensional strain throughout the tongue, acquired using tagged magnetic resonance imaging (MRI), have been used as proxy measurements by which muscle shortening (negative strain) is used to infer muscle activity (Gomez et al., 2020; Xing et al., 2018) since muscle activity usually causes fiber shortening. However, we are aware that muscles can shorten passively, and conversely, they can be stretched by external means despite being activated. Therefore, we are mindful in our interpretations of MRI-based strain. Measurements of strain made this way do not interfere with natural speech production (other than requiring the speaker to lie in the supine position), but previous inferences about the relationships between strain and specific muscle groups have been limited by absent or inaccurate identification of tongue muscle fibers within the tongue. Such identification is challenging because the tongue contains a collection of extensively interdigitated muscles that are nearly orthogonal in three dimensions (Kier & Smith, 1985; Stone et al., 2018). As a result, most regions in the tongue contain two fiber directions crossing at approximately a right angle, whereas the remaining regions contain a single fiber direction.

In the study of speech production, analyzing the interaction and cooperation between tongue muscles is of great scientific and clinical interest (Hiiemae & Palmer, 2003; Woo et al., 2019). During speech, the tongue shape changes by deformation patterns that are time variant and spatially inhomogeneous. Assessment of these patterns can be beneficial to understand the mechanism of tongue movement and interpret clinical observations. For glossectomy patients whose tongue has been partly removed due to oral cancer, use of the modified muscle might result in an imperfect coordination pattern. Therefore, understanding the adaptive behaviors of the tongue in patients with prior glossectomies is of interest to oral surgeons and speech pathologists (Bressmann et al., 2004, 2009; Chuanjun et al., 2002; Rentschler & Mann, 1980).

MRI is a noninvasive medical imaging technique that reveals anatomical structures. In the past, diffusion-weighted MRI (DWI) has been used to image tongue muscle architecture and interpret its function (Gaige et al., 2007; Shinagawa et al., 2008; Voskuilen et al., 2019). DWI is capable of quantifying the diffusion properties of the underlying tissue by measuring the attenuation of the magnetic resonance (MR) signal along selected gradient directions. The commonly used DWI technique, diffusion tensor imaging (DTI), can only describe a single fiber direction within each voxel; this is not adequate for tongue fiber direction reconstruction because of the highly interdigitated muscles in the tongue. To discriminate crossing fiber directions in the tongue, high angular resolution diffusion imaging (HARDI; Tuch et al., 2002) can be used. The HARDI technique acquires diffusion data with a larger number of gradient directions determined by spherical sampling schemes. Previous studies have investigated techniques to reconstruct the crossing fiber directions based on DWI. For example, the constrained spherical deconvolution algorithm models the fiber direction distribution as the deconvolution of the measured DWI signal and a response function (Tournier et al., 2007), and a deep convolutional neural network has been used to estimate the fiber direction distribution function (Lin et al., 2019). However, most existing fiber reconstruction algorithms were designed for brain tissue and do not account for the orthogonal nature of the crossing muscle fibers in the human tongue. Furthermore, although the DWI data of the tongue were acquired at a static state, similar to that of brain data, the reconstructed fiber directions are applied to tongue motion data acquired during speech production to assist in analyzing the dynamic patterns of tongue motion.

Estimation of tongue motion has previously been performed using tagged MRI (Parthasarathy et al., 2007; Xing et al., 2017, 2019). In tagged MRI acquisition, a temporary pattern that can deform with tissue during motion is made to appear in the images using manipulation of the magnetic field (Zerhouni et al., 1988). Tongue motion is quantified by processing the deformed tag pattern (Liu et al., 2011; Xing et al., 2017). Previous studies have used this method to compare the motion fields produced by speech in healthy controls and postpartial glossectomy patients (Stone et al., 2014; Xing et al., 2016). To study the interaction of tongue muscles, Xing et al. (2018) analyzed strain in tongue muscle fibers and showed contraction–extension patterns of individual muscles during speech. Xing et al. (2019) identified the muscle coordination patterns among various tongue muscles in a normalized space to investigate speech motor control. Gomez et al. (2020) presented a model-based approach to study the mechanical cooperation of the tongue muscles.

Although use of both DWI and tagged MRI in the tongue is possible, there is very little work on their joint use. Existing works that do consider both together (Xing et al., 2018, 2021) do not characterize different strains in the orthogonal fibers that interdigitate within the tongue, which constitutes about two thirds of the entire tongue. Also, the strains that characterize specific muscles are not properly distinguished since their interdigitated fiber groups are not separately associated with their underlying tongue muscles. Without these capabilities, it has not been previously possible to characterize contractions and elongations in whole tongue muscles during speech, and accordingly, it has not been possible to characterize the coordinated interactions between tongue muscles in their generation of tongue deformations for speech production.

To fill this important gap, we developed a comprehensive pipeline that jointly analyzes diffusion MRI, tagged MRI, and cine MRI to reveal strain components in muscle fibers throughout the tongue during speech production. This article presents a measurement and analysis process that estimates fiber directions throughout the tongue, including regions of crossing fibers, and determines strain components along all these computed fiber directions. This permits a far more detailed analysis of tongue muscle strains throughout the tongue during speech production. We employed some previously reported methods including (a) a deep convolutional neural network to directly reconstruct the tongue muscle fiber directions using HARDI data (Shao et al., 2021a); (b) a fiber matching algorithm that yields more consistent tongue muscle fiber arrangements (Shao et al., 2021b); and (c) a tongue motion estimation algorithm based on incompressible diffeomorphic image registration (Xing et al., 2017). Given these measurements, the motion fields were temporally and spatially aligned across all subjects before further analyses, and strains in the line of action (SLAs) at each tongue voxel and for each fiber direction were calculated by projecting the computed strains onto the local fiber directions. The SLA is defined as the ratio of the deformed length of a local tissue line element along the muscle fiber direction to its length at rest. Finally, using a manually labeled high-resolution muscle mask, the SLA patterns in two individual muscle groups were analyzed to reveal the cooperation between these muscles. We performed these steps on a cohort of eight healthy controls and two postpartial-glossectomy patients and obtained both qualitative and quantitative results. We analyzed and compared muscle cooperation pattern differences between the healthy controls and glossectomy patients in speech production. We obtained these insights using imaging methods that permit characterization of muscle fiber directions, muscle group delineations, and muscle strain analysis. This new approach has the potential to aid researchers in better understanding speech production and in learning about the adaptive behaviors of patients with glossectomies. The code for the processing pipeline is available at https://github.com/muhanshao/TongueStrain.

Materials and Method

Data Collection

The data set in this study includes eight healthy controls and two postpartial-glossectomy patients with tongue flaps. The subjects were native speakers of American English. The eight healthy controls were between ages 24 and 32 years (five men and three women). The two patients were 38 and 40 years old, both women. Written informed consent was obtained from all participants, and the study was approved by the University of Maryland Baltimore Institutional Review Board (protocol number: HP-00042060). All images were acquired on a 3T Prisma MR scanner (Siemens Healthcare), with a 64-channel head/neck coil. To reconstruct the tongue muscle fiber direction, we collected HARDI data for each subject in a static state. The HARDI tongue data were acquired using a single-shot echo-planar diffusion MRI sequence with real-time motion detection and re-acquisition (Elsaid et al., 2019; Liang et al., 2021) with echo time (TE) = 48 ms and repetition time (TR) = 2600 ms. The image resolution is 2.5 mm isotropic, and 34 axial slices were acquired to cover the entire tongue. There were 14 non–diffusion-weighted images (b0 images) and 200 diffusion gradient directions with a b value of 500 s/mm².

To capture the tongue's motion during speech, the participants were asked to say the phrase “a thing” in repeated speech cycles while tagged and cine MR images were acquired with TE = 1.47 ms and TR = 36 ms. The field of view was 240 × 240 mm² with an in-plane resolution of 1.875 × 1.875 mm² and a slice thickness of 6 mm. Each data set contains a stack of images that covers the whole tongue and the surrounding tissues. The phrase was designed to start with a neutral tongue position /ə/, moving the tongue tip forward to make /θ/, and ending with the upward motion of the tongue body into /ŋ/. The motion into the /θ/ uses the most tongue tip protrusion of any English sound. Although tagged MRI provides information about internal tongue deformation, its images have low spatial resolution. Therefore, we use high-resolution MRI to characterize tongue anatomy and cine MRI to characterize tongue surface motion. Both tagged and cine data were collected over multiple repetitions of the speech task, timed to a metronome repeated every 2 s. The duration of the recorded phrase is 1 s, and 26 time frames (TFs) were acquired over this period. The tagged MRI data were collected using a complementary spatial modulation of magnetization pulse sequence (Fischer et al., 1993) in both sagittal and axial orientations. In the sagittal acquisitions, vertical and horizontal tags were used to capture the anterior–posterior and superior–inferior motions, respectively. The left–right motion was captured using vertical tags in the axial acquisitions. The cine MRI data were collected in axial, coronal, and sagittal orientations, and these acquisitions were combined using a superresolution algorithm (Woo et al., 2012) to form a single high-resolution image volume.

Strain Analysis Pipeline

The overall processing pipeline is illustrated in Figure 1. The input to the pipeline is HARDI data as well as tagged and cine MR images of each subject. The output is the SLA associated with the muscle fiber directions. The details of each method are described in the following sections.

Muscle Fiber Direction Reconstruction

To reconstruct the tongue muscle fiber directions from HARDI data, we used a deep convolutional neural network to directly estimate the crossing fiber directions (Shao et al., 2021a). The fiber direction reconstruction workflow is illustrated in Figure 2. First, the HARDI DWI data of a subject's tongue are transformed and expressed as spherical harmonics (SH), that is, basis functions on the surface of a sphere (Arscott, 1969). This process uses least-squares fitting (Tournier et al., 2007) to produce a set of images containing SH coefficients. The reason for using an SH framework is to ensure that the network can be applied to HARDI data with a different number of gradient directions, as determined by the scanning protocol. In the next step of this workflow, the SH coefficients are input to a fiber direction reconstruction network to produce both the number of fibers (one or two) found at each voxel and their directions. To produce smoother and more informative fiber directions, this network processes the SH coefficients in a patch of voxels: the voxel under consideration and its 26 neighbors. If multiple directions are found within a given voxel, a so-called “separation loss function” encourages those (crossing) muscle fiber directions to be orthogonal. This is an important characteristic of the interdigitated muscle fibers in the tongue that is not characteristic of the white matter fibers in the brain (the anatomical target of most existing DWI acquisition and analysis methods). The reconstructed fiber directions are represented by unit vectors d₁ and d₂, where d₁ is generally associated with the more diffusive direction according to the reconstructed SH coefficients. If there is only a single fiber direction in the voxel, it is assigned to d₁, and d₂ is set equal to the zero vector. An illustration of both single fiber and crossing fiber directions is shown in the right-hand block in Figure 2.

Figure 2. — Workflow of the fiber direction reconstruction. The SH coefficients are computed from the HARDI tongue data and input to the fiber direction reconstruction network. The output at each voxel is the number of the fiber directions (single or crossing) and the corresponding fiber directions. HARDI = high angular resolution diffusion imaging; SH = spherical harmonics.
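The SH fitting step can be illustrated with a small least-squares sketch. This is not the authors' implementation: the basis is truncated at order 2 for brevity (the order used in the pipeline is not stated here), the coefficient values are made up, and `real_sh_basis_order2`, `fit_sh`, and `random_unit_vectors` are hypothetical helper names. The sketch only shows why an SH representation does not depend on how many gradient directions were sampled.

```python
import numpy as np

def real_sh_basis_order2(dirs):
    """Real, even-order spherical harmonics up to order 2 (6 functions),
    evaluated at unit vectors `dirs` of shape (n, 3)."""
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    c0 = 0.5 * np.sqrt(1.0 / np.pi)
    c1 = 0.5 * np.sqrt(15.0 / np.pi)
    c2 = 0.25 * np.sqrt(5.0 / np.pi)
    c3 = 0.25 * np.sqrt(15.0 / np.pi)
    return np.stack([
        c0 * np.ones_like(x),        # l = 0
        c1 * x * y,                  # l = 2, m = -2
        c1 * y * z,                  # l = 2, m = -1
        c2 * (3.0 * z**2 - 1.0),     # l = 2, m = 0
        c1 * x * z,                  # l = 2, m = 1
        c3 * (x**2 - y**2),          # l = 2, m = 2
    ], axis=1)

def fit_sh(signal, dirs):
    """Least-squares SH coefficients for one voxel's signal."""
    basis = real_sh_basis_order2(dirs)
    coeffs, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    return coeffs

def random_unit_vectors(n, seed):
    """Hypothetical stand-in for a scanner's gradient table."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

# Made-up coefficients, sampled with two gradient sets of different size.
true_coeffs = np.array([1.0, 0.2, -0.1, 0.4, 0.05, -0.3])
dirs_a = random_unit_vectors(30, seed=0)
dirs_b = random_unit_vectors(200, seed=1)
coeffs_a = fit_sh(real_sh_basis_order2(dirs_a) @ true_coeffs, dirs_a)
coeffs_b = fit_sh(real_sh_basis_order2(dirs_b) @ true_coeffs, dirs_b)
```

Both fits recover the same six coefficients even though the gradient sets differ in size, which is the property that lets a network trained on SH coefficients accept HARDI data from different protocols.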

Since it is nearly impossible to manually annotate the ground truth fiber orientations from the diffusion data, the fiber direction reconstruction network was trained on synthetic data. The synthetic DW images were created by the multitensor model. The network was quantitatively evaluated on synthetic tongue HARDI data with different noise levels. On synthetic HARDI data with signal-to-noise ratio being 20, the fiber direction reconstruction network achieved mean angular error of 6° in single-fiber regions and 13° in crossing-fiber regions (Shao et al., 2021a).
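As a rough illustration of how such synthetic data and the angular error can be produced, the sketch below simulates a noise-free multitensor signal and measures the angle between two fiber directions. The diffusivities `d_par` and `d_perp` and the equal volume fractions are assumed values chosen for illustration, not those used by the authors; only the b value of 500 s/mm² comes from the acquisition described earlier. The function names are hypothetical.

```python
import numpy as np

def multitensor_signal(bval, grads, fiber_dirs, fractions,
                       d_par=1.7e-3, d_perp=0.3e-3, s0=1.0):
    """Noise-free multitensor DW signal.

    grads: (n, 3) unit gradient directions; fiber_dirs: list of 3-vectors;
    fractions: volume fraction per fiber. Diffusivities (mm^2/s) are
    assumed values for illustration only.
    """
    signal = np.zeros(len(grads))
    for d, f in zip(fiber_dirs, fractions):
        d = np.asarray(d, dtype=float)
        d = d / np.linalg.norm(d)
        # Axially symmetric tensor with principal axis d.
        tensor = d_perp * np.eye(3) + (d_par - d_perp) * np.outer(d, d)
        signal += f * np.exp(-bval * np.einsum("ni,ij,nj->n", grads, tensor, grads))
    return s0 * signal

def angular_error_deg(d_est, d_true):
    """Angle between two fiber directions, ignoring sign (d and -d are the same fiber)."""
    cos = abs(np.dot(d_est, d_true)) / (np.linalg.norm(d_est) * np.linalg.norm(d_true))
    return np.degrees(np.arccos(np.clip(cos, 0.0, 1.0)))

# Two orthogonal fibers (as in crossing regions of the tongue), b = 500 s/mm^2.
grads = np.eye(3)
crossing = multitensor_signal(500.0, grads, [[1, 0, 0], [0, 1, 0]], [0.5, 0.5])
single = multitensor_signal(500.0, grads, [[1, 0, 0]], [1.0])
err_flip = angular_error_deg(np.array([1.0, 0.0, 0.0]), np.array([-1.0, 0.0, 0.0]))
err_orth = angular_error_deg(np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))
```

The absolute value in `angular_error_deg` reflects that fiber directions are axes rather than oriented vectors, so a sign flip counts as zero error.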

Fiber Directions Refinement

The assignment of the first and second fiber directions to d₁ and d₂ in a given voxel or group of voxels might disagree with the assignments in neighboring voxels (see Figure 3a). This can happen either because of noisy data or because the diffusivities of the two directions are similar. Since we want to study the strain patterns among muscle groups, it is important that muscle fiber direction assignments are smoothly and consistently defined throughout the tongue. We therefore process the initial fiber assignments using a fiber direction matching algorithm to produce more consistent fiber directions (Shao et al., 2021b). The input of the matching algorithm is a pair of vector images that represent the fiber directions generated from the fiber direction reconstruction network. In the voxels with crossing fiber directions, the matching algorithm decides whether to leave the assignment of d₁ and d₂ as it is or switch the assignments between d₁ and d₂. In the voxels with a single fiber direction, the matching algorithm decides whether to leave d₁ as it is or to reassign it to d₂. The refinement decisions are made by solving a quadratic unconstrained binary optimization problem that seeks a globally optimal assignment throughout the entire tongue. Figure 3b shows the result of applying the fiber direction matching algorithm on the data shown in Figure 3a. It should be noted that the computed directions are not changed; all that changes is the assignment of the directions to either d₁ or d₂. It is observed that tongue muscles are much more clearly delineated after this assignment. Importantly, average strains along the fiber directions d₁ and d₂ within different muscle groups are made meaningful only after this reassignment is made.

Figure 3. — Example of fiber directions mismatching and refinement on a control data. (a) Midsagittal view of first and second fiber direction images with some inconsistent fiber assignments (yellow contours). This pair of images is the input to the fiber direction matching algorithm. (b) The results after applying the fiber direction matching algorithm to (a). The fiber directions are conventionally color coded (red: right–left; green: anterior–posterior; and blue: inferior–superior). S = superior; P = posterior.
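The idea of choosing one binary swap variable per voxel can be sketched on a toy problem. The example below is a simplified stand-in, not the published matching algorithm: it uses a one-dimensional chain of voxels, an assumed pairwise cost (one minus the absolute cosine between like-numbered directions in neighboring voxels), and exhaustive search instead of a dedicated solver. The names `pair_cost` and `match_chain` are hypothetical.

```python
import itertools
import numpy as np

def pair_cost(d1_v, d2_v, d1_w, d2_w):
    """Assumed dissimilarity between the assignments of two neighboring voxels."""
    return (1.0 - abs(d1_v @ d1_w)) + (1.0 - abs(d2_v @ d2_w))

def match_chain(d1, d2):
    """Find swap flags (1 = exchange d1 and d2) minimizing the total pairwise cost.

    d1, d2: lists of 3-vectors along a chain of voxels. The first voxel is
    kept fixed to remove the trivial 'swap everything' symmetry.
    """
    n = len(d1)
    best_energy, best_swaps = np.inf, None
    for tail in itertools.product((0, 1), repeat=n - 1):
        swaps = (0,) + tail
        a1 = [d2[v] if s else d1[v] for v, s in enumerate(swaps)]
        a2 = [d1[v] if s else d2[v] for v, s in enumerate(swaps)]
        energy = sum(pair_cost(a1[v], a2[v], a1[v + 1], a2[v + 1])
                     for v in range(n - 1))
        if energy < best_energy:
            best_energy, best_swaps = energy, swaps
    return best_swaps, best_energy

# Three voxels; the middle one has its two directions assigned the other way round.
x = np.array([1.0, 0.0, 0.0])
z = np.array([0.0, 0.0, 1.0])
swaps, energy = match_chain([x, z, x], [z, x, z])
```

Here the search flips only the middle voxel. As in the text, the directions themselves are untouched and only their labels change. Exhaustive search scales as 2ⁿ and is usable only for toy sizes, which is why a whole-tongue problem needs a proper binary optimization solver.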

Tongue Motion Estimation and Alignment

To estimate tongue motion during speech, tagged MR images were processed by a phase vector incompressible registration algorithm (PVIRA; Xing et al., 2017). For each subject, the result is a dense three-dimensional incompressible motion field at each TF, and there are 26 motion fields in total for the phrase. In this work, we estimated the motion field frame by frame, which means the reference state for each TF is its previous TF. Then, the motion field from any TF to another TF can be computed by composing the multiple motion fields.
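Composing frame-to-frame motion fields can be sketched in one dimension. This is an illustrative assumption about the representation (displacement fields sampled on a regular grid in the coordinates of their own reference frame, with linear interpolation), not the PVIRA implementation; `compose` and `accumulate` are hypothetical names.

```python
import numpy as np

def compose(u_ab, u_bc, grid):
    """Displacement from frame a to frame c, sampled on frame-a coordinates.

    A point X in frame a moves to X + u_ab(X) in frame b, where the second
    field must be sampled: u_ac(X) = u_ab(X) + u_bc(X + u_ab(X)).
    np.interp clamps outside the grid, so values near the border are unreliable.
    """
    moved = grid + u_ab
    return u_ab + np.interp(moved, grid, u_bc)

def accumulate(fields, grid):
    """Chain a list of consecutive frame-to-frame fields into one field."""
    total = np.zeros_like(grid)
    for u in fields:
        total = compose(total, u, grid)
    return total

# Two consecutive 10% stretches compose to a 21% stretch (1.1 * 1.1 = 1.21).
grid = np.linspace(0.0, 10.0, 101)
step = 0.1 * grid
total = accumulate([step, step], grid)
```

Note that the second field is evaluated at the moved positions rather than simply added at the original ones; summing the two fields directly would give a 20% stretch instead of the correct 21%.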

To yield a consistent analysis across subjects, we must align the subjects' motion fields in both space and time. Temporal alignment is important because subjects speak at different rates. There are four critical TFs (i.e., /ə/, /θ/, /i/, and /ŋ/) in the phrase “a thing.” These times are determined for each subject by manual inspection of the midsagittal slices of the cine MRI throughout the image sequence. We then follow the procedure of Xing et al. (2019) to align the subjects temporally. Briefly, we first reassign the four critical TFs to a set of predefined critical time indices t_ə, t_θ, t_i, and t_ŋ, which are 1, 8, 14, and 20 in our experiments. We then interpolate the remaining TFs between these critical ones by linear interpolation using the two closest TFs in the subject's original data. This yields a set of time-aligned volumetric images (both cine and tagged MR images) of each subject saying the phrase “a thing.”
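A minimal sketch of this temporal alignment is given below. The target indices 1, 8, 14, and 20 come from the text; the subject's critical frames (1, 6, 11, 18) are hypothetical, as are the function names and the choice to resample only between the first and last critical indices.

```python
import numpy as np

TARGET = np.array([1, 8, 14, 20])  # aligned indices for /ə/, /θ/, /i/, /ŋ/

def aligned_to_original(subject_critical, target=TARGET):
    """Map each aligned time index to a fractional original frame (1-based)."""
    t = np.arange(target[0], target[-1] + 1)
    return t, np.interp(t, target, subject_critical)

def resample_frames(frames, subject_critical, target=TARGET):
    """Linearly interpolate between the two closest original frames.

    frames: array of shape (n_frames, ...) where frames[k] is original TF k + 1.
    """
    _, s = aligned_to_original(subject_critical, target)
    lo = np.floor(s).astype(int)
    hi = np.minimum(lo + 1, len(frames))
    w = (s - lo).reshape((-1,) + (1,) * (frames.ndim - 1))
    return (1.0 - w) * frames[lo - 1] + w * frames[hi - 1]

# Toy data: each "image" holds its own original frame number, so the
# resampled value directly shows which original time it came from.
frames = np.arange(1, 27, dtype=float).reshape(26, 1)
subject_critical = [1, 6, 11, 18]   # hypothetical manually identified TFs
aligned = resample_frames(frames, subject_critical)
```

In this toy case, aligned indices 1, 8, 14, and 20 map back to the subject's original frames 1, 6, 11, and 18, and the frames in between are blends of their two nearest original neighbors.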

To spatially align the motion fields of our subjects, we construct a cine atlas at time index t_ə and deform the motion fields from different subjects to the atlas. Briefly, the cine atlas is constructed by creating a group average of all the cine MR images at t_ə from all the healthy controls (Avants et al., 2011). The atlas was initialized by averaging the original cine images at t_ə from all the healthy subjects. Next, the cine images of individuals are registered to this initial average atlas, deformed to the average atlas, and averaged to create an updated average atlas. This process of updating the average atlas is then repeated several times until convergence to create our groupwise average atlas. Then, we register the cine MR images at t_ə of each subject, including the patients, to the cine atlas using diffeomorphic deformable image registration (Avants et al., 2008). The deformation field between each subject and the cine atlas is referred to as ϕ. After that, each subject's PVIRA motion fields can be deformed to the cine atlas space by composing ϕ and a series of PVIRA motion fields in the original subject's space (Xing et al., 2019). After all the subjects' motion fields are temporally and spatially aligned, we can compute the strain tensor for further SLA analysis.

Tongue Muscle Labeling

Analysis of the averaged SLAs in different muscle groups enables us to study how particular muscles interact and collaborate over time during speech production. A manually labeled tongue muscle mask was generated by one expert observer, the sixth author, on a high-resolution T2-weighted (T2w) MRI atlas, which was constructed by a groupwise registration among 20 normal subjects. Complete details of the atlas generation and manual labeling can be found in the study of Woo et al. (2015). The first step is to find a deformation field between the T2w atlas and each subject's cine MR image at t_ə by rigid and deformable registration between the tongue masks in the two images. Next, the manually labeled muscle mask in the T2w atlas space is transformed by this deformation field to generate muscle masks for each subject. The rationale for using tongue masks instead of the intensity images to perform the registration is that the T2w MRI was acquired in a static state while the tagged/cine data were acquired during speech production. The air gap between the top of the tongue and palate in the tagged/cine data makes registration to the static-state intensity images inaccurate. For the patient data, individual muscle masks, including the flap regions, were manually delineated in the patient's T2w MR image and were deformed to the cine space using the same registration method as in the healthy subjects. The ratio of the flap volume to the whole tongue volume is 0.17 for Patient 1 and 0.26 for Patient 2.

We focus on one extrinsic muscle, the genioglossus (GG) muscle, and one intrinsic muscle, the transverse (T) muscle, to demonstrate the pipeline. These two muscles were further subdivided into anterior (GGa, Ta), medium (GGm, Tm), and posterior (GGp, Tp) parts. Figure 4 shows the manual labeling of the GG and T muscle groups on the T2w MRI atlas (see Figure 4a) and the two glossectomy patients (see Figures 4b and 4c). The flap region is also shown in the patient data.

Figure 4. — High-resolution T2w MR images and the manual labeling of the genioglossus (GG) and transverse (T) muscles using the T2w images. The two muscles are subdivided into anterior (a), medium (m), and posterior (p) parts. In each row, the left subfigure shows a sagittal slice of the T2w MR image. The middle and right subfigures show three-dimensional rendering of the GG and T muscles, respectively. Flap region is also shown in the patients. (a) Manual labeling of the muscles on the T2w MR image atlas. (b) and (c) Manual labeling of the muscles and flap on the two glossectomy patients P1 and P2, respectively. S = superior; I = inferior; A = anterior; P = posterior; R = right; L = left; T2w = T2-weighted; MR = magnetic resonance.

Compute SLAs

Given the previous steps, the strain tensor can be now computed from the aligned motion fields and projected onto the local fiber directions to calculate the SLA. The SLA is defined as a ratio of a local tissue line element that deforms along the muscle fiber direction compared to its length in the reference TF, and it is calculated as follows:

\lambda(d_m(i)) = \sqrt{d_m(i)^{T} \, C_{ij} \, d_m(i)}, \quad m \in \{1, 2\}, \qquad (1)

where C_ij is the right Cauchy–Green deformation tensor, defined as C_ij = F^T F, and d_m(i) is the first or second muscle fiber direction at the ith TF in the cine atlas space. Here, the subscripts i and j index TFs, not tensor components. The F in the definition of C_ij is the deformation gradient tensor from the ith TF to the jth one and is defined as

F = \frac{d x_j}{d X_i}, \qquad (2)

where x_j is the coordinate of a tissue point in the deformed TF j and X_i is the coordinate of the same tissue point in the reference TF i. SLA values greater than 1 represent extension, and SLA values less than 1 represent contraction; for example, when SLA has the value 1.1, this can be interpreted as a 10% extension, whereas an SLA value of 0.9 can be interpreted as a 10% contraction. In this work, we apply the Lagrangian framework, and all the motion fields are measured in the reference TF, which is t_ə in the experiments. Another choice is the Eulerian framework, where all the motion fields are measured in the deformed TF. Since the HARDI data were acquired in static state, it is better to map and display the SLA quantity computed at each TF to a reference TF.
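Equations 1 and 2 can be checked numerically with a short sketch. It assumes the motion field is stored as a displacement u(X) = x − X on a regular voxel grid in the reference frame, so that F = I + ∂u/∂X is approximated by finite differences; this storage convention and the names `deformation_gradient` and `sla` are assumptions for illustration, not the pipeline's code.

```python
import numpy as np

def deformation_gradient(u, spacing=(1.0, 1.0, 1.0)):
    """F = I + du/dX for a displacement field u of shape (3, nx, ny, nz).

    Returns an array of shape (nx, ny, nz, 3, 3) with F[..., a, b] = dx_a/dX_b.
    """
    rows = []
    for a in range(3):
        partials = np.gradient(u[a], *spacing, axis=(0, 1, 2))  # du_a/dX_b, b = 0..2
        rows.append(np.stack(partials, axis=-1))
    return np.eye(3) + np.stack(rows, axis=-2)

def sla(F, d):
    """Strain in the line of action: sqrt(d^T C d) with C = F^T F.

    d is a unit fiber direction, shape (3,) or (..., 3). A zero vector
    (no second fiber in the voxel) yields 0 and should be masked out.
    """
    C = np.swapaxes(F, -1, -2) @ F
    return np.sqrt(np.einsum("...i,...ij,...j->...", d, C, d))

# Homogeneous, volume-preserving stretch: 10% extension along x,
# compensated by equal shortening along y and z.
X, Y, Z = np.meshgrid(np.arange(4.0), np.arange(4.0), np.arange(4.0), indexing="ij")
k = 1.0 / np.sqrt(1.1) - 1.0
u = np.stack([0.1 * X, k * Y, k * Z])
F = deformation_gradient(u)
sla_x = sla(F, np.array([1.0, 0.0, 0.0]))
sla_y = sla(F, np.array([0.0, 1.0, 0.0]))
```

A fiber along x gives an SLA of 1.1 (10% extension), while a fiber along y gives about 0.953 (shortening), matching the interpretation of values above and below 1 in the text.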

Since all the motion fields have been spatially aligned to a cine atlas space (see the Tongue Motion Estimation and Alignment section), we need to map the muscle fiber directions, d_m, m ∈ {1, 2}, at each voxel to the same cine atlas space to calculate the SLA. For each subject, we first find the transformation between the tongue masks in the HARDI b0 image and the cine atlas by applying rigid and diffeomorphic image registration. Then, the muscle fiber directions d_m, m ∈ {1, 2}, estimated on the HARDI data, are deformed into the cine atlas space by this transformation. The reason for using tongue masks instead of MR images is the same as described in the Tongue Muscle Labeling section.

Results

Tongue Muscle Crossing-Fiber Direction Result

The fiber reconstruction network and the fiber matching algorithm were applied to the HARDI data of all 10 subjects. The eight healthy controls are referred to as C1, C2, …, C8, and the two patients are referred to as P1 and P2. A whole tongue mask was manually drawn for each subject to restrict the processing region. The reconstructed fiber directions for the subjects were manually assessed by an expert observer by comparison with real tongue anatomy. Figure 5 shows a visual comparison of the predicted fiber directions, overlaid on a b0 image, of a healthy control (see C8, shown in Figure 5a) and a patient (see P2, shown in Figure 5b). In each subfigure, the upper row shows the first fiber direction and the lower row shows the second fiber direction. In the midsagittal slice (see Figure 5a, top), we can observe that in the healthy control the GG fiber direction, as expected, is fan shaped and covers the whole body of the tongue (see the blue and green lines). In Figure 5a, bottom, the T muscle, represented by red dots, is arc shaped, following the upper surface shape of the tongue, and left–right in orientation. Note that the algorithm does not calculate a second fiber direction in the region of the tongue that is composed solely of GG fibers. The fiber directions of Ta (bottom) intersect with the GG fiber directions (top) at approximately right angles. In the patient data (see Figure 5b, top and bottom), the muscle fibers show similar overall directions but are inconsistent in the right-anterior part of the tongue (yellow circle), presumably because of the presence of the flap and changes in muscle position due to surgery. The whole flap region is shown in Figure 4c.

Figure 5. — Midsagittal slice of the reconstructed fiber directions on data from a healthy tongue (Control 8, (a)) and a glossectomy patient (Patient 2, (b)). The results are restricted by manually delineated tongue masks. In each subject, the top row shows the first fiber direction image, and the bottom row shows the second fiber direction image. The yellow circle outlines the approximate position of the flap in the patient's tongue. The fiber directions are conventionally color coded (red: right–left; green: anterior–posterior; and blue: inferior–superior). S = superior; P = posterior.

Analysis of SLAs

Comparison of Healthy Controls and Patients

To observe the patterns of the SLA for each muscle, Figure 6 shows the midsagittal slice of the tongue in four subjects, including two healthy controls (a: C1 and b: C8) and the two patients (c: P1 and d: P2). The underlying image for each subject is the cine image at time t_ə deformed to the cine atlas space for easier comparison. The lines within the tongue indicate the direction of fibers that are shortening in the first direction (top) and second direction (bottom) during the motion from /ə/ to /θ/. The line colors represent the amount of compression (not direction); maximum lengthening is red, maximum shortening is blue, and green is neutral, which means no compression or expansion. In each column, the top and bottom figures show the SLAs associated with the first and second fiber directions, respectively.

In the SLAs along the first fiber direction, the controls (see Figures 6a and 6b, top) show horizontal SLAs across the middle of the tongue and oblique ones in the upper and anterior tongue, consistent with GG fiber shortening that pulls the tongue body forward from /ə/ to /θ/. In the tip, the upper surface of the tongue shortens lengthwise (anterior–posterior) to a large extent (dark blue). The lower portion of the tip lengthens vertically (red), consistent with superior–inferior expansion resulting from the anterior–posterior shortening. The /θ/ gesture requires the tongue to flatten and approximate the lower edge of the upper teeth but not strongly contact them. The /θ/ gesture is rare in the world's languages, suggesting that its construction is subtle and challenging. The gesture, especially following /ə/, requires elevation of the tongue and extension of the tip. Could the tongue's contact with the teeth result in passive shortening? Upward elevation into the teeth could produce vertical compression, but we see vertical extension instead. Forward extension into the teeth could produce horizontal compression, and this may contribute to the long thin region of shortening on the upper surface. However, the fine control needed to create this constriction discourages interpreting this region as purely passive compression. It seems that the superior longitudinal muscle, which is involved in tongue tip elevation, would also need to contribute to the fine shaping and positioning of the tongue surface against the teeth. This compression pattern is consistent with that contribution. The patients (see Figures 6c and 6d, top) show more chaotic patterns along the first fiber direction and exhibit many small, disconnected regions of lengthening (red) and compression (blue). This is true in SLA patterns along the second fiber direction of the patients as well (see Figures 6c and 6d, bottom). The SLAs in the patients reflect multiple, possibly compensatory muscle behaviors, and also deformation of the flap tissue.

To quantitatively compare the SLAs in the controls and patients, we calculated the bottom and top 10% SLA values along the first and second fiber directions (d₁ and d₂) from /ə/ to each of the other three phonemes (/θ/, /i/, and /ŋ/) for each subject. Figure 7 shows the histograms of the calculated SLAs at /θ/ for the eight controls grouped together (green), and the two patients each displayed alone (P1 in yellow, P2 in pink). The histograms of the bottom and top 10% of SLAs for all three phonemes are shown in Supplemental Material S1. The number of SLA values used to generate the histograms is in the range of 1200–2000 for each control and 700–800 for each patient along the first fiber direction. Along the second fiber direction, this number is in the range of 650–1200 for each control and 450–500 for each patient. The data include the flap tissue and the muscles. Figures 7a and 7b show that the bottom 10% SLA values along the first and second fiber directions in the two patients are lower than those in the controls. Figures 7c and 7d show that the top 10% SLA values in the two patients are higher than those in the controls. That is, the patients have more extreme SLA values than the grouped controls do.
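The selection of extreme SLA values described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `extreme_sla`, the bin count, and the synthetic input values are all assumptions made for the example, and the paper does not state how the 10% cutoff was implemented.

```python
import numpy as np

def extreme_sla(sla_values, fraction=0.10):
    """Return the lowest and highest `fraction` of SLA samples (hypothetical helper)."""
    values = np.sort(np.asarray(sla_values, dtype=float).ravel())
    # Number of samples in each tail; at least one sample is kept.
    k = max(1, int(round(fraction * values.size)))
    return values[:k], values[-k:]

# Synthetic SLA samples for illustration only (not study data).
rng = np.random.default_rng(0)
sla = rng.normal(loc=1.0, scale=0.1, size=1500)

bottom, top = extreme_sla(sla)

# Histogram each tail separately, as one might do for a plot of the tails.
counts_bottom, edges_bottom = np.histogram(bottom, bins=20)
counts_top, edges_top = np.histogram(top, bins=20)
```

Sorting once and slicing both tails keeps the two sets the same size, which makes histograms of different subjects easier to compare when their voxel counts differ.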

Figure 7. — (a) and (b) Histograms of the bottom 10% SLA values along the first and second fiber directions from /ə/ to /θ/. (c) and (d) Histograms of the top 10% SLA values along the first and second fiber directions from /ə/ to /θ/. The control data are grouped together. The ratio of the flap volume to the whole tongue volume is 0.17 and 0.26 for Patients 1 and 2, respectively. SLA = strain in the line of action.

SLA Patterns in Muscle Groups

As described in the Tongue Muscle Labeling section, we used a tongue muscle mask in a T2-w MRI atlas to generate muscle masks for our subjects. The generated muscle masks were used to compute the average SLAs within the anterior and posterior GG and T muscle groups from /ə/ to each aligned TF. Figures 8 and 9 present the time series of the average SLAs in the 10 subjects, including eight healthy controls (C1–C8) and two glossectomy patients (P1 and P2). All the SLA values were calculated in the Lagrangian framework with the reference TF being t_ə. SLA values greater than 1 represent extension, and SLA values less than 1 represent contraction. The lines represent the average SLAs for each subject and muscle, and the similarly colored shaded regions represent the standard deviations.
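The per-muscle averaging described above can be sketched as follows. This is a hedged illustration under stated assumptions: the array layout (time frames first, then three spatial axes), the helper names `muscle_sla_series` and `describe`, the tolerance, and the toy volume are all hypothetical and are not taken from the study's pipeline.

```python
import numpy as np

def muscle_sla_series(sla, mask):
    """Mean and standard deviation of SLA inside a muscle mask, per time frame.

    sla  : array of shape (T, X, Y, Z), one SLA map per time frame (assumed layout)
    mask : boolean array of shape (X, Y, Z) marking the muscle voxels
    """
    sla = np.asarray(sla, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    inside = sla[:, mask]  # shape (T, number of masked voxels)
    return inside.mean(axis=1), inside.std(axis=1)

def describe(mean_sla, tol=1e-6):
    """Label each frame using the convention that 1 is the reference length."""
    labels = []
    for value in mean_sla:
        if value > 1 + tol:
            labels.append("extension")
        elif value < 1 - tol:
            labels.append("contraction")
        else:
            labels.append("neutral")
    return labels

# Toy example: 3 time frames on a 2x2x2 grid, with a 4-voxel mask.
sla = np.ones((3, 2, 2, 2))
mask = np.zeros((2, 2, 2), dtype=bool)
mask[0] = True
sla[1][mask] = 0.9  # shortening relative to the reference frame
sla[2][mask] = 1.1  # lengthening relative to the reference frame

mean_sla, std_sla = muscle_sla_series(sla, mask)
labels = describe(mean_sla)
```

The first frame plays the role of the reference time frame, so its mean is 1 by construction; the mean and standard deviation per frame correspond to the line and shaded band of a time series plot.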

Figure 9. — Average strains in the line of action (SLAs) along the genioglossus (GG) and transverse (T) muscles calculated in the posterior part of the tongue from eight healthy controls (C1–C8) and two glossectomy patients (P1 and P2). All the SLAs were calculated in the Lagrangian framework, and the undeformed time frame is at /ə/. The solid lines are the mean SLA values, and the shades show the standard deviations. GGp = genioglossus posterior; Tp = transverse posterior.

In Figure 8, the SLAs show that the GGa muscle (blue) shortens into the /θ/ and then lengthens throughout the rest of the word for most of the controls and P1. The GGa muscle in C2 and C3 shows an initial small extension, then shortens and lengthens, but maximum shortening still occurs around the time of the /θ/. For P2, GGa maintains its resting length for part of the motion into /θ/ and then lengthens.

For the Ta muscle (orange), controls C5–C8 show an oppositional pattern to GGa in which Ta shortens at or after the /θ/, consistent with inactivity during the /θ/ followed by narrowing the tongue to facilitate its upward expansion during the /i/ and /ŋ/. C2–C4 also have a Ta length greater than 1 until /i/ or /ŋ/. C1 is the outlier whose pattern strongly resembles that of P1, in which both muscles shorten into the /θ/ and then lengthen, suggesting that the Ta is used to help GGa extend the tongue tip. For P2, the average SLA values for Ta are always at or above 1, although it does shorten into the /i/ before lengthening further, like the other subjects.

In Figure 9, the GGp muscle (coral) shows peak shortening at the /i/ or /ŋ/ for all 10 subjects; even C2, who lengthens the muscle for most of the word, shortens it a bit before /i/, which is consistent with creating an anterior tongue root to elevate the tongue body. Shortening occurs also for /θ/ in C1, C3, C4, C6, C7, C8, and P2, consistent with pulling the tongue root anteriorly to help position the tongue tip at the teeth. The Tp muscle (green) shows increased amounts of extension toward the end of the word in C1, C2, C3, C4, C5, and C8. C6 and C7 show shortening into the /i/ and /ŋ/ (just like Ta), suggesting that the muscle is behaving similarly throughout its length. In the patients, the GGp and Tp muscles show average SLA values that move similarly to each other.

Discussion

In this study, we visualized and analyzed the cooperation between certain tongue muscle groups during speech production. We performed the SLA analysis on eight healthy controls and two glossectomy patients. We compared the patterns of both muscle fiber directions and SLA patterns between controls and patients. Results showed some common patterns in the controls and differences in the patients.

The fiber direction reconstruction network can reconstruct most fiber directions in the tongue and predict orthogonal fiber directions. The fiber direction matching algorithm provides clean fiber directions in most regions of the tongue. The controls have smooth and consistent muscle fiber directions. We can see clear fan-shaped GG muscle fiber directions in the first fiber direction image and arc-shaped T muscle fiber directions in the second fiber direction image in the sagittal view, which is consistent with tongue anatomy (Miyawaki, 1974). In the patient data, we can still observe the general patterns of the GG and T muscles. The muscle fiber directions in regions far from the flap look similar to those in the controls, whereas in regions near the flap, the fiber directions look irregular (upper–left in Figure 5b). This behavior is expected because the flap region does not contain muscles.

From the visualization of the SLAs overlaid on subjects' cine images in the atlas space, we can observe and compare the way tongue muscles contract or extend at key TFs during speech production across different subjects. This visualization benefits from the fiber direction matching algorithm as we can observe and compute the average SLAs within specific muscle groups. The two controls (see Figures 6a and 6b) are shown to have similar SLA patterns from /ə/ to /θ/ with only small inconsistencies; the phoneme /θ/ is characterized by a large tongue protrusion. In order to produce /θ/, both subjects show contraction in the upper surface of the tongue tip and extension below the tip along the first fiber direction (see Figures 6a and 6b, top). In the tongue body, the tongue experiences shortening along the GG muscle, which likely activates to facilitate the tongue–tooth contact. At the same time, a small extension along the T muscle is observed throughout the tongue showing that for /θ/ tongue widening also occurs as the lateral tongue contacts the molars (see Figures 6a and 6b, bottom). The SLAs in the patients (see Figures 6c and 6d) show different patterns. Patient 2 shows similar tip behavior as the controls: shortening above and lengthening below. However, both the patients show larger extension (dark red) than the controls in some healthy muscle tissue regions, which can be seen in the superior region of the tongue.

Quantitative measurements provide insight into the different patterns of the SLAs between controls and patients. Histograms in Figure 7 show the distributions of the bottom and top 10% of SLA values for our subjects. One immediate observation is that the SLA values in the patients have wider ranges in these plots: the top 10% are higher and the bottom 10% are lower, indicating that there are more extreme SLA values in the patients' data. Patient 2 has a larger relative flap volume than Patient 1 and shows a wider range of SLA values. One possible reason is that, under the condition of muscle degeneration, the remaining muscles of the patient's tongue must work harder to compensate for the muscle loss. Compared to the controls, the patients' tongues make larger deformations during speech production, showing greater compression and extension.

The time series plots enable us to analyze the cooperation between the GG and T muscle groups during speech production. Figures 8 and 9 show that the behaviors of the two muscles across controls have some similarities and some differences. In Figure 8, the time series of C4–C8 show some negative correlation between the GGa and Ta muscles: one muscle tends to shorten while the other extends. The remaining controls, C1, C2, and C3, show a positive correlation between the GGa and Ta muscles, as they shorten and extend nearly simultaneously in our speech task. This suggests two strategies by which GGa and Ta extend the tip. Our results also show that the SLAs along the GGa muscle are lowest near t_θ in our controls; even C4 pauses GGa extension at that time. This supports the idea that in healthy controls, the GGa muscle actively shortens for the /θ/ sound, possibly to ensure a local bend at the tongue blade as the superior longitudinal muscle pulls the tongue tip back. Without GGa assistance, superior longitudinal shortening would merely pull the tongue tip straight back (Kier & Smith, 1985).
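The positive and negative relationships between the GGa and Ta time series are reported here from visual inspection. One way such a relationship could be quantified is a Pearson correlation between two average-SLA series, sketched below. The function name `series_correlation` and the toy series are assumptions for illustration; no correlation coefficients are reported in this study.

```python
import numpy as np

def series_correlation(a, b):
    """Pearson correlation between two equally long SLA time series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])

# Toy average-SLA series over five time frames (not study data).
gga = [1.00, 0.95, 0.90, 0.95, 1.00]          # shortens, then returns to rest
ta_opposed = [1.00, 1.03, 1.06, 1.03, 1.00]   # lengthens while GGa shortens
ta_together = [1.00, 0.97, 0.94, 0.97, 1.00]  # shortens together with GGa

r_opposed = series_correlation(gga, ta_opposed)
r_together = series_correlation(gga, ta_together)
```

In this toy case `r_opposed` is negative and `r_together` is positive, mirroring the two patterns described for the controls. With only a handful of time frames per subject, such coefficients would be descriptive rather than statistically conclusive.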

In Figure 9, all the subjects except C2 and P1 shorten GGp during the motion into the /i/ and /ŋ/ gestures; the GGp muscle shows the lowest average SLA values near t_i and t_ŋ. This pattern supports our understanding that the generation of /i/ and /ŋ/, which require elevating the tongue body, is often accomplished by shortening the GGp muscle. In most subjects, we see a negative correlation between GGp and Tp, where the GGp shortens and the Tp lengthens. This is consistent with allowing the tongue to widen as it elevates to create lingual contact with the lateral teeth (inner edge of molars) for both /i/ and /ŋ/. Subject C2, who does not shorten GGp, is the only control who does not appear to use the GGp muscle during this word, although there is a brief shortening at the /θ/ sound.
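Locating the time frame of peak shortening and relating it to the nearest phoneme can be sketched as below. This is an illustrative assumption, not the authors' procedure: the helper `peak_shortening`, the phoneme frame indices, and the toy GGp series are all hypothetical.

```python
import numpy as np

def peak_shortening(mean_sla, phoneme_frames):
    """Return (frame index, SLA value, nearest phoneme) at the minimum average SLA.

    phoneme_frames maps a phoneme label to its time frame index (assumed input).
    """
    mean_sla = np.asarray(mean_sla, dtype=float)
    frame = int(np.argmin(mean_sla))
    nearest = min(phoneme_frames, key=lambda p: abs(phoneme_frames[p] - frame))
    return frame, float(mean_sla[frame]), nearest

# Toy GGp series over eight time frames and hypothetical phoneme timings.
ggp = [1.00, 0.99, 0.98, 0.97, 0.95, 0.93, 0.94, 0.96]
phoneme_frames = {"ə": 0, "θ": 2, "i": 5, "ŋ": 7}

frame, value, phoneme = peak_shortening(ggp, phoneme_frames)
```

For this toy series the minimum falls on the frame assigned to /i/, which is the kind of observation summarized in the text for GGp.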

The patients' SLA patterns show different characteristics from those of the controls. Figure 8 shows that both anterior muscles either shorten or stay level at the /θ/ for both patients; the patients appear to engage these muscles, as the controls do, to control the tip. After that, neither patient shortens either anterior muscle. Figure 9 shows positively correlated SLA patterns for the posterior muscles, which could indicate reduction in motor control, or the flaps might inhibit tongue deformation. Both patients extend both muscles through the /θ/, showing no strong usage of the posterior muscles for /θ/, with stable length (P1), or shortening (P2) into the /i/ and /ŋ/.

In P2, GGa, Ta, and Tp show extension in general. However, the GGa shortens before the /i/, and the GGp shortens toward the end of the phrase, reflecting the usage of these muscles by P2 but less so by P1. The standard deviations of the SLAs in the two patients are larger than those of the controls. This is unsurprising as patients are often more variable than controls. The two muscle groups also tend to extend and compress more locally compared to the controls, as shown in Figure 6. It should be remembered that the patients are at all times moving an incompressible, inert flap, which affects the shapes, the durations, and the variability of their gestures.

With our strain analysis pipeline, we can observe the patterns of cooperation between muscle groups evidenced by simultaneous and antagonistic behaviors within each subject during speech production. We can also see that some patterns are consistent in control subjects, although there exist unique patterns in individuals. We also see that glossectomy patients show different SLA patterns than controls. The comparison between the controls and patients could provide some insights about the adaptive behaviors in speech production of the glossectomy patients.

In this study, the fiber directions were estimated based on DWI data acquired at a single static tongue position. In the future, the accuracy of the reconstructed fiber directions could be improved by acquiring DWI data of the static tongue in multiple positions and using them to validate the pipeline-predicted motion between them.

Conclusions

In this study, we presented a workflow to analyze SLA associated with tongue muscle fiber directions in speech sound production using multiple MRI modalities. A deep convolutional network is used to directly reconstruct the crossing tongue muscle fiber direction from diffusion MRI. A fiber direction matching algorithm is used to refine the assignments of the fiber directions. Tagged MRI is used to quantify the tongue motion in speech production. SLAs are computed within the tongue muscle masks. We performed analysis on a cohort of eight healthy controls and two glossectomy patients. Visual evidence of correlation between two muscle groups is presented in the analysis. Results show consistency of muscle behaviors among some healthy controls during speech production. Patients tend to have somewhat different strain patterns than the controls. The proposed pipeline provides a solution to quantitatively analyze the cooperation between tongue muscles during speech production.

Data Availability Statement

The data sets analyzed during this study are available from the corresponding author on reasonable request.

Supplementary Material

Supplemental Material S1. Histograms of the bottom and top 10% SLA values along the first and second fiber directions from /ə/ to /θ/, /i/, and /ŋ/. The control data are grouped together. The ratio of the flap volume to the whole tongue volume is 0.17 and 0.26 for Patients 1 and 2, respectively.

Acknowledgments

This work was supported in part by National Institute on Deafness and Other Communication Disorders Grants R01DC014717 (PI: Jerry L. Prince) and R01DC018511 (PI: Jonghye Woo). The study was conducted at the University of Maryland School of Medicine Center for Innovative Biomedical Resources, Translational Research in Imaging @ Maryland, Baltimore.

References

Arscott, F. M. (1969). Spherical harmonics: An elementary treatise on harmonic functions with applications. By T. M. MacRobert. Pp. xviii, 349. £5. 1967. (Pergamon press.). The Mathematical Gazette, 53(386), 452–453. https://doi.org/10.2307/3612534
Avants, B. B., Epstein, C. L., Grossman, M., & Gee, J. C. (2008). Symmetric diffeomorphic image registration with cross-correlation: Evaluating automated labeling of elderly and neurodegenerative brain. Medical Image Analysis, 12(1), 26–41. https://doi.org/10.1016/j.media.2007.06.004
Avants, B. B., Tustison, N. J., Song, G., Cook, P. A., Klein, A., & Gee, J. C. (2011). A reproducible evaluation of ANTs similarity metric performance in brain image registration. NeuroImage, 54(3), 2033–2044. https://doi.org/10.1016/j.neuroimage.2010.09.025
Bressmann, T., Jacobs, H., Quintero, J., & Irish, J. C. (2009). Speech outcomes for partial glossectomy surgery: Measures of speech articulation and listener perception. Head and Neck Cancer, 33(4), 204.
Bressmann, T., Sader, R., Whitehill, T. L., & Samman, N. (2004). Consonant intelligibility and tongue motility in patients with partial glossectomy. Journal of Oral and Maxillofacial Surgery, 62(3), 298–303. https://doi.org/10.1016/j.joms.2003.04.017
Buchaillard, S., Perrier, P., & Payan, Y. (2009). A biomechanical model of cardinal vowel production: Muscle activations and the impact of gravity on tongue positioning. The Journal of the Acoustical Society of America, 126(4), 2033–2051. https://doi.org/10.1121/1.3204306
Chuanjun, C., Zhiyuan, Z., Shaopu, G., Xinquan, J., & Zhihong, Z. (2002). Speech after partial glossectomy: A comparison between reconstruction and nonreconstruction patients. Journal of Oral and Maxillofacial Surgery, 60(4), 404–407. https://doi.org/10.1053/joms.2002.31228
Elsaid, N. M., Prince, J. L., Roys, S., Gullapalli, R. P., & Zhuo, J. (2019). Phase image texture analysis for motion detection in diffusion MRI (PITA-MDD). Magnetic Resonance Imaging, 62, 228–241. https://doi.org/10.1016/j.mri.2019.07.009
Fischer, S. E., McKinnon, G. C., Maier, S. E., & Boesiger, P. (1993). Improved myocardial tagging contrast. Magnetic Resonance in Medicine, 30(2), 191–200. https://doi.org/10.1002/mrm.1910300207
Gaige, T. A., Benner, T., Wang, R., Wedeen, V. J., & Gilbert, R. J. (2007). Three dimensional myoarchitecture of the human tongue determined in vivo by diffusion tensor imaging with tractography. Journal of Magnetic Resonance Imaging: An Official Journal of the International Society for Magnetic Resonance in Medicine, 26(3), 654–661. https://doi.org/10.1002/jmri.21022
Gomez, A. D., Stone, M. L., Woo, J., Xing, F., & Prince, J. L. (2020). Analysis of fiber strain in the human tongue during speech. Computer Methods in Biomechanics and Biomedical Engineering, 23(8), 312–322. https://doi.org/10.1080/10255842.2020.1722808
Hiiemae, K. M., & Palmer, J. B. (2003). Tongue movements in feeding and speech. Critical Reviews in Oral Biology & Medicine, 14(6), 413–429. https://doi.org/10.1177/154411130301400604
Kier, W. M., & Smith, K. K. (1985). Tongues, tentacles and trunks: The biomechanics of movement in muscular-hydrostats. Zoological Journal of the Linnean Society, 83(4), 307–324. https://doi.org/10.1111/j.1096-3642.1985.tb01178.x
Liang, X., Su, P., Patil, S. G., Elsaid, N. M., Roys, S., Stone, M., Gullapalli, R. P., Prince, J. L., & Zhuo, J. (2021). Prospective motion detection and re-acquisition in diffusion MRI using a phase image–based method—Application to brain and tongue imaging. Magnetic Resonance in Medicine, 86(2), 725–737. https://doi.org/10.1002/mrm.28729
Lin, Z., Gong, T., Wang, K., Li, Z., He, H., Tong, Q., Yu, F., & Zhong, J. (2019). Fast learning of fiber orientation distribution function for MR tractography using convolutional neural network. Medical Physics, 46(7), 3101–3116. https://doi.org/10.1002/mp.13555
Liu, X., Abd-Elmoniem, K. Z., Stone, M., Murano, E. Z., Zhuo, J., Gullapalli, R. P., & Prince, J. L. (2011). Incompressible deformation estimation algorithm (IDEA) from tagged MR images. IEEE Transactions on Medical Imaging, 31(2), 326–340. https://doi.org/10.1109/TMI.2011.2168825
Miyawaki, K. (1974). A study of the muscular of the human tongue. Annual Bulletin Research Institute of Logopedics and Phoniatrics, 8, 23–50.
Parthasarathy, V., Prince, J. L., Stone, M., Murano, E. Z., & NessAiver, M. (2007). Measuring tongue motion from tagged cine-MRI using harmonic phase (HARP) processing. The Journal of the Acoustical Society of America, 121(1), 491–504. https://doi.org/10.1121/1.2363926
Pierre, C. S., Dassonville, O., Chamorey, E., Poissonnet, G., Riss, J.-C., Ettaiche, M., Peyrade, F., Benezery, K., Chand, M.-E., & Leyssalle, A. (2014). Long-term functional outcomes and quality of life after oncologic surgery and microvascular reconstruction in patients with oral or oropharyngeal cancer. Acta Oto-Laryngologica, 134(10), 1086–1093. https://doi.org/10.3109/00016489.2014.913809
Pittman, L. J., & Bailey, E. F. (2009). Genioglossus and intrinsic electromyographic activities in impeded and unimpeded protrusion tasks. Journal of Neurophysiology, 101(1), 276–282. https://doi.org/10.1152/jn.91065.2008
Rentschler, G. J., & Mann, M. B. (1980). The effects of glossectomy on intelligibility of speech and oral perceptual discrimination. Journal of Oral Surgery, 38(5), 348–354.
Sanguineti, V., Laboissiere, R., & Payan, Y. (1997). A control model of human tongue movements in speech. Biological Cybernetics, 77(1), 11–22. https://doi.org/10.1007/s004220050362
Sasaki, M., Onishi, K., Stefanov, D., Kamata, K., Nakayama, A., Yoshikawa, M., & Obinata, G. (2016). Tongue interface based on surface EMG signals of suprahyoid muscles. ROBOMECH Journal, 3(1), Article No. 9. https://doi.org/10.1186/s40648-016-0048-0
Shao, M., Carass, A., Gomez, A. D., Zhuo, J., Liang, X., Stone, M., & Prince, J. L. (2021a). Direct reconstruction of crossing muscle fibers in the human tongue using a deep neural network. In Computational diffusion MRI (pp. 69–80). Springer.
Shao, M., Gomez, A. D., Zhuo, J., Liang, X., Stone, M., Carass, A., & Prince, J. L. (2021b). Reconstruction and refinement of crossing muscle fibers in the human tongue. Medical Imaging 2021: Image Processing, 11596.
Shinagawa, H., Murano, E. Z., Zhuo, J., Landman, B., Gullapalli, R. P., Prince, J. L., & Stone, M. (2008). Tongue muscle fiber tracking during rest and tongue protrusion with oral appliances: A preliminary study with diffusion tensor imaging. Acoustical Science and Technology, 29(4), 291–294. https://doi.org/10.1250/ast.29.291
Stone, M., Langguth, J. M., Woo, J., Chen, H., & Prince, J. L. (2014). Tongue motion patterns in post-glossectomy and typical speakers: A principal components analysis. Journal of Speech, Language, and Hearing Research, 57(3), 707–717. https://doi.org/10.1044/1092-4388(2013/13-0085)
Stone, M., Woo, J., Lee, J., Poole, T., Seagraves, A., Chung, M., Kim, E., Murano, E. Z., Prince, J. L., & Blemker, S. S. (2018). Structure and variability in human tongue muscle anatomy. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 6(5), 499–507. https://doi.org/10.1080/21681163.2016.1162752
Takemoto, H. (2001). Morphological analyses of the human tongue musculature for three-dimensional modeling. Journal of Speech, Language, and Hearing Research, 44(1), 95–107. https://doi.org/10.1044/1092-4388(2001/009)
Tankisi, H., Otto, M., Pugdahl, K., & Fuglsang-Frederiksen, A. (2013). Spontaneous electromyographic activity of the tongue in amyotrophic lateral sclerosis. Muscle & Nerve, 48(2), 296–298. https://doi.org/10.1002/mus.23781
Tournier, J.-D., Calamante, F., & Connelly, A. (2007). Robust determination of the fibre orientation distribution in diffusion MRI: Non-negativity constrained super-resolved spherical deconvolution. NeuroImage, 35(4), 1459–1472. https://doi.org/10.1016/j.neuroimage.2007.02.016
Tuch, D. S., Reese, T. G., Wiegell, M. R., Makris, N., Belliveau, J. W., & Wedeen, V. J. (2002). High angular resolution diffusion imaging reveals intravoxel white matter fiber heterogeneity. Magnetic Resonance in Medicine: An Official Journal of the International Society for Magnetic Resonance in Medicine, 48(4), 577–582. https://doi.org/10.1002/mrm.10268
Voskuilen, L., Mazzoli, V., Oudeman, J., Balm, A. J., van der Heijden, F., Froeling, M., de Win, M. M., Strijkers, G. J., Smeele, L. E., & Nederveen, A. J. (2019). Crossing muscle fibers of the human tongue resolved in vivo using constrained spherical deconvolution. Journal of Magnetic Resonance Imaging, 50(1), 96–105. https://doi.org/10.1002/jmri.26609
Woo, J., Lee, J., Murano, E. Z., Xing, F., Al-Talib, M., Stone, M., & Prince, J. L. (2015). A high-resolution atlas and statistical model of the vocal tract from structural MRI. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 3(1), 47–60. https://doi.org/10.1080/21681163.2014.933679
Woo, J., Murano, E. Z., Stone, M., & Prince, J. L. (2012). Reconstruction of high-resolution tongue volumes from MRI. IEEE Transactions on Biomedical Engineering, 59(12), 3511–3524. https://doi.org/10.1109/TBME.2012.2218246
Woo, J., Xing, F., Prince, J. L., Stone, M., Green, J. R., Goldsmith, T., Reese, T. G., Wedeen, V. J., & El Fakhri, G. (2019). Differentiating post-cancer from healthy tongue muscle coordination patterns during speech using deep learning. The Journal of the Acoustical Society of America, 145(5), EL423–EL429. https://doi.org/10.1121/1.5103191
Xing, F., Liu, X., Reese, T., Stone, M., Wedeen, V., Prince, J. L., El Fakhri, G., & Woo, J. (2021). Muscle strain analysis from diffusion tractography and dynamic magnetic resonance imaging of the moving tongue. The Journal of the Acoustical Society of America, 150(4), A190–A190. https://doi.org/10.1121/10.0008082
Xing, F., Prince, J. L., Stone, M., Reese, T. G., Atassi, N., Wedeen, V. J., El Fakhri, G., & Woo, J. (2018). Strain map of the tongue in normal and ALS speech patterns from tagged and diffusion MRI. Medical Imaging 2018: Image Processing, 10574, 1057411.
Xing, F., Stone, M., Goldsmith, T., Prince, J. L., El Fakhri, G., & Woo, J. (2019). Atlas-based tongue muscle correlation analysis from tagged and high-resolution magnetic resonance imaging. Journal of Speech, Language, and Hearing Research, 62(7), 2258–2269. https://doi.org/10.1044/2019_JSLHR-S-18-0495
Xing, F., Woo, J., Gomez, A. D., Pham, D. L., Bayly, P. V., Stone, M., & Prince, J. L. (2017). Phase vector incompressible registration algorithm for motion estimation from tagged magnetic resonance images. IEEE Transactions on Medical Imaging, 36(10), 2116–2128. https://doi.org/10.1109/TMI.2017.2723021
Xing, F., Woo, J., Lee, J., Murano, E. Z., Stone, M., & Prince, J. L. (2016). Analysis of 3-D tongue motion from tagged and cine magnetic resonance images. Journal of Speech, Language, and Hearing Research, 59(3), 468–479. https://doi.org/10.1044/2016_JSLHR-S-14-0155
Zerhouni, E. A., Parish, D. M., Rogers, W. J., Yang, A., & Shapiro, E. P. (1988). Human heart: Tagging with MR imaging—A method for noninvasive assessment of myocardial motion. Radiology, 169(1), 59–63. https://doi.org/10.1148/radiology.169.1.3420283

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Click here for additional data file.^{(1.4MB, jpg)}

Data Availability Statement

The data sets analyzed during this study are available from the corresponding author on reasonable request.

[bib1] Arscott, F. M. (1969). Spherical harmonics: An elementary treatise on harmonic functions with applications. By T. M. MacRobert. Pp. xviii, 349. £5. 1967. (Pergamon press.). The Mathematical Gazette, 53(386), 452–453. https://doi.org/10.2307/3612534 [Google Scholar]

[bib2] Avants, B. B. , Epstein, C. L. , Grossman, M. , & Gee, J. C. (2008). Symmetric diffeomorphic image registration with cross-correlation: Evaluating automated labeling of elderly and neurodegenerative brain. Medical Image Analysis, 12(1), 26–41. https://doi.org/10.1016/j.media.2007.06.004 [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib3] Avants, B. B. , Tustison, N. J. , Song, G. , Cook, P. A. , Klein, A. , & Gee, J. C. (2011). A reproducible evaluation of ANTs similarity metric performance in brain image registration. NeuroImage, 54(3), 2033–2044. https://doi.org/10.1016/j.neuroimage.2010.09.025 [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib4] Bressmann, T. , Jacobs, H. , Quintero, J. , & Irish, J. C. (2009). Speech outcomes for partial glossectomy surgery: Measures of speech articulation and listener perception. Head and Neck Cancer, 33(4), 204. [Google Scholar]

[bib5] Bressmann, T. , Sader, R. , Whitehill, T. L. , & Samman, N. (2004). Consonant intelligibility and tongue motility in patients with partial glossectomy. Journal of Oral and Maxillofacial Surgery, 62(3), 298–303. https://doi.org/10.1016/j.joms.2003.04.017 [DOI] [PubMed] [Google Scholar]

[bib6] Buchaillard, S. , Perrier, P. , & Payan, Y. (2009). A biomechanical model of cardinal vowel production: Muscle activations and the impact of gravity on tongue positioning. The Journal of the Acoustical Society of America, 126(4), 2033–2051. https://doi.org/10.1121/1.3204306 [DOI] [PubMed] [Google Scholar]

[bib7] Chuanjun, C. , Zhiyuan, Z. , Shaopu, G. , Xinquan, J. , & Zhihong, Z. (2002). Speech after partial glossectomy: A comparison between reconstruction and nonreconstruction patients. Journal of Oral and Maxillofacial Surgery, 60(4), 404–407. https://doi.org/10.1053/joms.2002.31228 [DOI] [PubMed] [Google Scholar]

[bib8] Elsaid, N. M. , Prince, J. L. , Roys, S. , Gullapalli, R. P. , & Zhuo, J. (2019). Phase image texture analysis for motion detection in diffusion MRI (PITA-MDD). Magnetic Resonance Imaging, 62, 228–241. https://doi.org/10.1016/j.mri.2019.07.009 [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib9] Fischer, S. E. , McKinnon, G. C. , Maier, S. E. , & Boesiger, P. (1993). Improved myocardial tagging contrast. Magnetic Resonance in Medicine, 30(2), 191–200. https://doi.org/10.1002/mrm.1910300207 [DOI] [PubMed] [Google Scholar]

[bib10] Gaige, T. A. , Benner, T. , Wang, R. , Wedeen, V. J. , & Gilbert, R. J. (2007). Three dimensional myoarchitecture of the human tongue determined in vivo by diffusion tensor imaging with tractography. Journal of Magnetic Resonance Imaging: An Official Journal of the International Society for Magnetic Resonance in Medicine, 26(3), 654–661. https://doi.org/10.1002/jmri.21022 [DOI] [PubMed] [Google Scholar]

[bib11] Gomez, A. D. , Stone, M. L. , Woo, J. , Xing, F. , & Prince, J. L. (2020). Analysis of fiber strain in the human tongue during speech. Computer Methods in Biomechanics and Biomedical Engineering, 23(8), 312–322. https://doi.org/10.1080/10255842.2020.1722808 [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib12] Hiiemae, K. M. , & Palmer, J. B. (2003). Tongue movements in feeding and speech. Critical Reviews in Oral Biology & Medicine, 14(6), 413–429. https://doi.org/10.1177/154411130301400604 [DOI] [PubMed] [Google Scholar]

[bib13] Kier, W. M. , & Smith, K. K. (1985). Tongues, tentacles and trunks: The biomechanics of movement in muscular-hydrostats. Zoological Journal of the Linnean Society, 83(4), 307–324. https://doi.org/10.1111/j.1096-3642.1985.tb01178.x [Google Scholar]

[bib14] Liang, X. , Su, P. , Patil, S. G. , Elsaid, N. M. , Roys, S. , Stone, M. , Gullapalli, R. P. , Prince, J. L. , & Zhuo, J. (2021). Prospective motion detection and re-acquisition in diffusion MRI using a phase image–based method—Application to brain and tongue imaging. Magnetic Resonance in Medicine, 86(2), 725–737. https://doi.org/10.1002/mrm.28729 [DOI] [PMC free article] [PubMed] [Google Scholar]

[bib15] Lin, Z., Gong, T., Wang, K., Li, Z., He, H., Tong, Q., Yu, F., & Zhong, J. (2019). Fast learning of fiber orientation distribution function for MR tractography using convolutional neural network. Medical Physics, 46(7), 3101–3116. https://doi.org/10.1002/mp.13555

[bib16] Liu, X., Abd-Elmoniem, K. Z., Stone, M., Murano, E. Z., Zhuo, J., Gullapalli, R. P., & Prince, J. L. (2011). Incompressible deformation estimation algorithm (IDEA) from tagged MR images. IEEE Transactions on Medical Imaging, 31(2), 326–340. https://doi.org/10.1109/TMI.2011.2168825

[bib17] Miyawaki, K. (1974). A study on the musculature of the human tongue. Annual Bulletin Research Institute of Logopedics and Phoniatrics, 8, 23–50.

[bib18] Parthasarathy, V., Prince, J. L., Stone, M., Murano, E. Z., & NessAiver, M. (2007). Measuring tongue motion from tagged cine-MRI using harmonic phase (HARP) processing. The Journal of the Acoustical Society of America, 121(1), 491–504. https://doi.org/10.1121/1.2363926

[bib19] Pierre, C. S., Dassonville, O., Chamorey, E., Poissonnet, G., Riss, J.-C., Ettaiche, M., Peyrade, F., Benezery, K., Chand, M.-E., & Leyssalle, A. (2014). Long-term functional outcomes and quality of life after oncologic surgery and microvascular reconstruction in patients with oral or oropharyngeal cancer. Acta Oto-Laryngologica, 134(10), 1086–1093. https://doi.org/10.3109/00016489.2014.913809

[bib20] Pittman, L. J., & Bailey, E. F. (2009). Genioglossus and intrinsic electromyographic activities in impeded and unimpeded protrusion tasks. Journal of Neurophysiology, 101(1), 276–282. https://doi.org/10.1152/jn.91065.2008

[bib21] Rentschler, G. J., & Mann, M. B. (1980). The effects of glossectomy on intelligibility of speech and oral perceptual discrimination. Journal of Oral Surgery, 38(5), 348–354.

[bib22] Sanguineti, V., Laboissiere, R., & Payan, Y. (1997). A control model of human tongue movements in speech. Biological Cybernetics, 77(1), 11–22. https://doi.org/10.1007/s004220050362

[bib23] Sasaki, M., Onishi, K., Stefanov, D., Kamata, K., Nakayama, A., Yoshikawa, M., & Obinata, G. (2016). Tongue interface based on surface EMG signals of suprahyoid muscles. ROBOMECH Journal, 3(1), Article No. 9. https://doi.org/10.1186/s40648-016-0048-0

[bib24] Shao, M., Carass, A., Gomez, A. D., Zhuo, J., Liang, X., Stone, M., & Prince, J. L. (2021a). Direct reconstruction of crossing muscle fibers in the human tongue using a deep neural network. In Computational diffusion MRI (pp. 69–80). Springer.

[bib25] Shao, M., Gomez, A. D., Zhuo, J., Liang, X., Stone, M., Carass, A., & Prince, J. L. (2021b). Reconstruction and refinement of crossing muscle fibers in the human tongue. Medical Imaging 2021: Image Processing, 11596.

[bib26] Shinagawa, H., Murano, E. Z., Zhuo, J., Landman, B., Gullapalli, R. P., Prince, J. L., & Stone, M. (2008). Tongue muscle fiber tracking during rest and tongue protrusion with oral appliances: A preliminary study with diffusion tensor imaging. Acoustical Science and Technology, 29(4), 291–294. https://doi.org/10.1250/ast.29.291

[bib27] Stone, M., Langguth, J. M., Woo, J., Chen, H., & Prince, J. L. (2014). Tongue motion patterns in post-glossectomy and typical speakers: A principal components analysis. Journal of Speech, Language, and Hearing Research, 57(3), 707–717. https://doi.org/10.1044/1092-4388(2013/13-0085)

[bib28] Stone, M., Woo, J., Lee, J., Poole, T., Seagraves, A., Chung, M., Kim, E., Murano, E. Z., Prince, J. L., & Blemker, S. S. (2018). Structure and variability in human tongue muscle anatomy. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 6(5), 499–507. https://doi.org/10.1080/21681163.2016.1162752

[bib29] Takemoto, H. (2001). Morphological analyses of the human tongue musculature for three-dimensional modeling. Journal of Speech, Language, and Hearing Research, 44(1), 95–107. https://doi.org/10.1044/1092-4388(2001/009)

[bib30] Tankisi, H., Otto, M., Pugdahl, K., & Fuglsang-Frederiksen, A. (2013). Spontaneous electromyographic activity of the tongue in amyotrophic lateral sclerosis. Muscle & Nerve, 48(2), 296–298. https://doi.org/10.1002/mus.23781

[bib31] Tournier, J.-D., Calamante, F., & Connelly, A. (2007). Robust determination of the fibre orientation distribution in diffusion MRI: Non-negativity constrained super-resolved spherical deconvolution. NeuroImage, 35(4), 1459–1472. https://doi.org/10.1016/j.neuroimage.2007.02.016

[bib32] Tuch, D. S., Reese, T. G., Wiegell, M. R., Makris, N., Belliveau, J. W., & Wedeen, V. J. (2002). High angular resolution diffusion imaging reveals intravoxel white matter fiber heterogeneity. Magnetic Resonance in Medicine, 48(4), 577–582. https://doi.org/10.1002/mrm.10268

[bib33] Voskuilen, L., Mazzoli, V., Oudeman, J., Balm, A. J., van der Heijden, F., Froeling, M., de Win, M. M., Strijkers, G. J., Smeele, L. E., & Nederveen, A. J. (2019). Crossing muscle fibers of the human tongue resolved in vivo using constrained spherical deconvolution. Journal of Magnetic Resonance Imaging, 50(1), 96–105. https://doi.org/10.1002/jmri.26609

[bib34] Woo, J., Lee, J., Murano, E. Z., Xing, F., Al-Talib, M., Stone, M., & Prince, J. L. (2015). A high-resolution atlas and statistical model of the vocal tract from structural MRI. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 3(1), 47–60. https://doi.org/10.1080/21681163.2014.933679

[bib35] Woo, J., Murano, E. Z., Stone, M., & Prince, J. L. (2012). Reconstruction of high-resolution tongue volumes from MRI. IEEE Transactions on Biomedical Engineering, 59(12), 3511–3524. https://doi.org/10.1109/TBME.2012.2218246

[bib36] Woo, J., Xing, F., Prince, J. L., Stone, M., Green, J. R., Goldsmith, T., Reese, T. G., Wedeen, V. J., & El Fakhri, G. (2019). Differentiating post-cancer from healthy tongue muscle coordination patterns during speech using deep learning. The Journal of the Acoustical Society of America, 145(5), EL423–EL429. https://doi.org/10.1121/1.5103191

[bib37] Xing, F., Liu, X., Reese, T., Stone, M., Wedeen, V., Prince, J. L., El Fakhri, G., & Woo, J. (2021). Muscle strain analysis from diffusion tractography and dynamic magnetic resonance imaging of the moving tongue. The Journal of the Acoustical Society of America, 150(4), A190. https://doi.org/10.1121/10.0008082

[bib38] Xing, F., Prince, J. L., Stone, M., Reese, T. G., Atassi, N., Wedeen, V. J., El Fakhri, G., & Woo, J. (2018). Strain map of the tongue in normal and ALS speech patterns from tagged and diffusion MRI. Medical Imaging 2018: Image Processing, 10574, 1057411.

[bib39] Xing, F., Stone, M., Goldsmith, T., Prince, J. L., El Fakhri, G., & Woo, J. (2019). Atlas-based tongue muscle correlation analysis from tagged and high-resolution magnetic resonance imaging. Journal of Speech, Language, and Hearing Research, 62(7), 2258–2269. https://doi.org/10.1044/2019_JSLHR-S-18-0495

[bib40] Xing, F., Woo, J., Gomez, A. D., Pham, D. L., Bayly, P. V., Stone, M., & Prince, J. L. (2017). Phase vector incompressible registration algorithm for motion estimation from tagged magnetic resonance images. IEEE Transactions on Medical Imaging, 36(10), 2116–2128. https://doi.org/10.1109/TMI.2017.2723021

[bib41] Xing, F., Woo, J., Lee, J., Murano, E. Z., Stone, M., & Prince, J. L. (2016). Analysis of 3-D tongue motion from tagged and cine magnetic resonance images. Journal of Speech, Language, and Hearing Research, 59(3), 468–479. https://doi.org/10.1044/2016_JSLHR-S-14-0155

[bib42] Zerhouni, E. A., Parish, D. M., Rogers, W. J., Yang, A., & Shapiro, E. P. (1988). Human heart: Tagging with MR imaging, a method for noninvasive assessment of myocardial motion. Radiology, 169(1), 59–63. https://doi.org/10.1148/radiology.169.1.3420283
