Author manuscript; available in PMC: 2023 Feb 9.
Published in final edited form as: Proc SPIE Int Soc Opt Eng. 2022 Apr 4;12032:1203205. doi: 10.1117/12.2610989

Measuring Strain in Diffusion-Weighted Data Using Tagged Magnetic Resonance Imaging

Fangxu Xing a,*, Xiaofeng Liu a, Timothy G Reese a, Maureen Stone b, Van J Wedeen a, Jerry L Prince c, Georges El Fakhri a, Jonghye Woo a
PMCID: PMC9911263  NIHMSID: NIHMS1870863  PMID: 36777787

Abstract

Accurate strain measurement in a deforming organ is essential to motion analysis using medical images. In recent years, in vivo motion and strain of internal tissue have mostly been computed through dynamic magnetic resonance (MR) imaging. However, such data lack information on the tissue’s intrinsic fiber directions, preventing computed strain tensors from being projected onto a direction of interest. Although diffusion-weighted MR imaging excels at providing fiber tractography, it yields static images that are not matched with dynamic MR data. This work reports an algorithm workflow that estimates strain values in the diffusion MR space by matching corresponding tagged dynamic MR images. We focus on processing a dataset of various human tongue deformations in speech. The geometry of tongue muscle fibers is provided by diffusion tractography, while spatiotemporal motion fields are provided by tagged MR analysis. The tongue’s deforming shapes are determined by segmenting a synthetic cine dynamic MR sequence generated from tagged data using a deep neural network. Estimated motion fields are transformed into the diffusion MR space using diffeomorphic registration, eventually leading to strain values computed in the direction of muscle fibers. The method was tested on 78 time volumes acquired during three sets of specific tongue deformations including both speech and protrusion motion. Strain in the line of action of seven internal tongue muscles was extracted and compared both intra- and inter-subject. The resulting compression and stretching patterns revealed the unique behavior of individual muscles and their potential activation patterns.

Keywords: Tongue function, speech, internal muscles, diffusion MRI, motion, strain, tagged MRI, deep learning

1. INTRODUCTION

To accurately quantify the status of a deforming internal organ, medical imaging-based motion analysis has been an essential tool in both clinical practice and research studies. Strain of a deforming organ describes the internal change in the shape of an infinitesimally small cube of tissue[1]. It is a major quantity to be estimated as part of the motion analysis process. Theoretically, once the motion fields of an internal organ are computed, strain tensors can be directly computed, and principal strains indicating the main orthogonal deformation directions can be found by an eigendecomposition[1]. In practice, since deforming tissues are typically composed of muscles interleaved in a complex internal muscular structure, the strain value projected along the line of muscle fibers is of more interest. This quantity describes the relative stretching or compression of a local muscle fiber, which is a potential indicator of muscle activation[2]. When a muscle is activated, it usually shows global shortening, indicated by a series of successive local compressions along its entire length. Therefore, studying strain in the line of muscle fibers is an essential step toward understanding muscle behaviors and motor control from a medical imaging point of view.
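The difference between principal strains and strain along a fiber can be illustrated with a short NumPy sketch. The tensor values and the direction `d` below are made up for illustration and are not taken from the data in this work.

```python
import numpy as np

# Hypothetical symmetric strain tensor at a single voxel (values are made up).
E = np.array([[0.10,  0.02,  0.00],
              [0.02, -0.05,  0.01],
              [0.00,  0.01, -0.03]])

# eigh is meant for symmetric matrices; eigenvalues are returned in ascending order.
# The columns of principal_dirs are the corresponding orthonormal directions.
principal_strains, principal_dirs = np.linalg.eigh(E)

# Strain along an arbitrary unit direction d (standing in for a local fiber direction).
d = np.array([1.0, 1.0, 0.0])
d /= np.linalg.norm(d)
fiber_strain = d @ E @ d

print(principal_strains, fiber_strain)
```

The strain along any unit direction always lies between the smallest and largest principal strain, so the principal values bound what a fiber can experience, but only the projection says what a particular fiber does experience.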

Over the past decades, dynamic magnetic resonance (MR) imaging has been developed into a very accurate and effective tool to capture motion[3]. In particular, tagged MR imaging has been widely used for internal tissue motion data acquisition[4,5]. Acquired MR slices from orthogonal directions can be combined into three-dimensional (3D) volumes, where motion extraction algorithms are applied, producing spatiotemporal dynamic motion fields. Although principal strains can be computed from these motion fields, such data lack fiber geometry information and local tissue anatomy information. On the other hand, diffusion-weighted MR imaging provides fiber directions through tractography, but it is a static image volume and is usually acquired in a separate imaging protocol with different imaging parameters from tagged MR data[6]. As a result, computed motion data are not aligned with tractography. The incompatibility between the two datasets, which differ in spatial resolution and physical coordinates, is the main difficulty in combining the two sources of information. Consequently, strain in the line of muscles has usually been computed with synthetic or artificial fiber directions[7,8].

In this work, we present an algorithm workflow that computes strain in the diffusion MR space by matching corresponding tagged MR data. Strain estimation during speech is a major step in understanding the functions of the human vocal tract in various oromotor behaviors such as speech, swallowing, and respiration. Therefore, we apply this method to tongue motion analysis. First, we synthesize a sequence of cine MR images from tagged data using a deep neural network to provide a set of tongue masks. Motion fields are computed based on these masks and transformed into the diffusion space using diffeomorphic registration. Strain is computed in the direction of local fibers provided by tractography. The method was tested on 78 image volumes acquired during three sets of specific tongue deformations. Strain in the line of action of seven internal tongue muscles was eventually computed and compared in all three deformation scenarios over time. Unique behaviors of individual tongue muscles are observed and analyzed.

2. METHODS

2.1. Data acquisition

Diffusion MR data were collected in one scan session on a human subject with a b-value of 500 s/mm² and diffusion weightings applied in 64 directions[9], resulting in an image volume with a resolution of 2.25×2.25×4.6 mm³. In each dynamic MR acquisition session, on the other hand, the participating subject was instructed to perform a certain speech task designed to reflect specific tongue motion patterns[10]. Tagged MR data were collected in repeated tongue motion cycles at a temporal rate of 26 frames/s. Their in-plane resolution for each slice was higher, at 1.88×1.88 mm².

2.2. Cine synthesis and segmentation

To specify the positions of the deforming tongue volume over time, segmentation of all dynamic MR volumes is needed. A sequence of segmented tongue masks is necessary in the following motion estimation step. However, tagged images serve the sole purpose of capturing internal tissue deformation, and their intrinsically low anatomical resolution makes it difficult to segment anatomical structures[10]. Therefore, we apply a deep neural network built upon a dual-cycle-constrained bijective variational autoencoder generative adversarial network (VAE-GAN) to synthesize a corresponding cine MR image sequence from each tagged dataset[11]. Compared to other VAE-based synthesis methods, the network provides relatively realistic cine MR sequences due to its VAEs with cycle reconstruction-constrained adversarial training. Using the synthesized images, we can create 3D super-resolved volumes from synthesized cine slices in all three orthogonal directions[12]. Segmentation is performed on the synthesized cine super-resolution volumes to yield a tongue mask at each time frame with a resolution of 1.88×1.88×1.88 mm³. The above process is illustrated in Figure 1 with the deep network structure, synthesized tagged-to-cine images, and a segmented tongue mask.

Figure 1.


(a) The dual-cycle VAE-GAN tagged-to-cine synthesis network showing encoders, decoders, and discriminators with parallel modules: tagged MRI x_t and cine MRI x_c. (b) Examples of synthesized sagittal cine MR slices from tagged slices with the tongue in red boxes. (c) Segmented tongue mask based on a synthesized cine volume at one time frame.

2.3. Motion estimation and spatial transformation

Given the tagged MR sequence and its tongue mask sequence, the phase vector incompressible registration algorithm is applied to yield an incompressible dense motion field at each time frame[13]. We denote the motion field as u_t(X), sampled at image coordinates X and time frame t. Note that u_t(X) is only computed in the original physical space where tagged MR data are acquired and super-resolution volumes are constructed. In order to match it with tractography data computed in the diffusion-weighted MR space, a spatial transformation ϕ(·) between the two spaces is needed. We choose not to warp diffusion data to the tagged MR space because tractography is susceptible to numerical errors if spatially transformed. If we use μ_t to denote the tissue deformation caused by the motion field u_t(X), we have

$$\mu_t(X) = X + u_t(X). \tag{1}$$

To find the transformation between the two spaces, we use diffeomorphic image registration between the segmented masks of the synthesized cine volume and the b0 image of diffusion MR data[14]. Registration provides both a forward transform ϕ(·) and its inverse transform ϕ^{-1}(·) in a symmetric form. Composing these transformations with the deformation μ_t yields the same tissue deformation, denoted ν_t, relocated in the diffusion-weighted image space[15], i.e.,

$$\nu_t(\cdot) = \phi \circ \mu_t \circ \phi^{-1}(\cdot). \tag{2}$$
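The composition in Eq. (2) can be sketched in a few lines of NumPy. This is only an illustration under strong assumptions: the transform between spaces is replaced by a made-up affine map (the workflow itself uses diffeomorphic registration), the motion field is a uniform 1 mm shift, and all function names are hypothetical.

```python
import numpy as np

# Assumption: a toy affine map stands in for the registration transform
# (tagged MR space -> diffusion space). Values are made up.
A = np.array([[1.0, 0.1, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])
b = np.array([5.0, -3.0, 1.0])

def phi(x):
    """Map points (N, 3) from the tagged space to the diffusion space."""
    return x @ A.T + b

def phi_inv(y):
    """Map points (N, 3) from the diffusion space back to the tagged space."""
    return (y - b) @ np.linalg.inv(A).T

def u_t(x):
    """Hypothetical motion field in the tagged space: 1 mm shift along the third axis."""
    return np.broadcast_to(np.array([0.0, 0.0, 1.0]), x.shape)

def mu_t(x):
    """Deformation in the tagged space, Eq. (1)."""
    return x + u_t(x)

def nu_t(y):
    """Deformation relocated in the diffusion space, Eq. (2)."""
    return phi(mu_t(phi_inv(y)))

# Grid points in the diffusion space and the motion field defined on them.
Y = np.array([[10.0, 4.0, 6.0],
              [0.0, 0.0, 0.0]])
v_t = nu_t(Y) - Y
print(v_t)
```

In this toy case the 1 mm shift becomes a 2 mm shift along the third axis, because the made-up map stretches that axis by a factor of two. The point is that displacement vectors are not simply resampled; they are re-expressed in the coordinates of the target space.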

2.4. Strain computation and projection

In practice, the deformation ν_t(X) = X + v_t(X) is saved in the form of a new motion field v_t(X) defined on the grid locations X in the diffusion space. These grid locations match those of the tractography data indicating tongue fiber directions. Therefore, the new motion field v_t(X) is used to compute the Lagrangian strain in the diffusion space by[1]

$$E_t(X) = \frac{1}{2}\left(\left(I + \frac{d v_t(X)}{dX}\right)^{T}\left(I + \frac{d v_t(X)}{dX}\right) - I\right). \tag{3}$$
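As a rough illustration of Eq. (3), the sketch below computes the Lagrangian strain tensor from a displacement field sampled on a regular grid, using finite differences for the displacement gradient. The function name, the array layout, and the test field (a uniform 10% stretch along the first axis) are assumptions made for this example; the voxel spacing is the diffusion resolution reported above.

```python
import numpy as np

def lagrangian_strain(v, spacing):
    """Lagrangian strain from a displacement field.

    v:       array of shape (3, nx, ny, nz), displacement components in mm.
    spacing: voxel size (dx, dy, dz) in mm.
    Returns an array of shape (nx, ny, nz, 3, 3).
    """
    # grad[..., i, j] = d v_i / d X_j, estimated with finite differences.
    grad = np.stack(
        [np.stack(np.gradient(v[i], *spacing), axis=-1) for i in range(3)],
        axis=-2,
    )
    F = np.eye(3) + grad                       # deformation gradient I + dv/dX
    C = np.einsum('...ki,...kj->...ij', F, F)  # F^T F at every voxel
    return 0.5 * (C - np.eye(3))

# Made-up test field: uniform 10% stretch along the first axis.
spacing = (2.25, 2.25, 4.6)
nx, ny, nz = 4, 4, 3
X = np.arange(nx) * spacing[0]
v = np.zeros((3, nx, ny, nz))
v[0] = 0.1 * X[:, None, None]

E = lagrangian_strain(v, spacing)
print(E[0, 0, 0])
```

For this field the only nonzero entry is E_xx = 0.5 × (1.1² − 1) = 0.105, slightly larger than the 10% stretch because the Lagrangian strain keeps the quadratic term of the displacement gradient.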

Finally, strain tensor Et(X) is projected onto the direction of internal fibers d(X) by the quadratic form[16]

$$e_t(X) = d(X)^{T} E_t(X)\, d(X). \tag{4}$$

Note that d(X) has a normalized length of 1 mm for all locations in X. It is also directly provided by tractography, so it already exists in the diffusion space. e_t(X) is the strain value along the direction of the local muscle fibers. A positive value indicates local tissue expansion along the fiber direction and a negative value indicates local compression. Figure 2 shows examples of an estimated motion field both in its original tagged MR space and after being warped into the diffusion MR space. It also shows a diffusion tractography image from the sagittal view. Note that the warped motion field appears sparser because the diffusion data space has an intrinsically lower resolution of 2.25×2.25×4.6 mm³ compared to the dynamic data at 1.88×1.88×1.88 mm³. Also, all motion fields are downsampled by a factor of two for display to improve visibility.
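The projection in Eq. (4) amounts to a per-voxel quadratic form. The sketch below is a minimal NumPy version; the function name, array shapes, and tensor values are assumptions for illustration only.

```python
import numpy as np

def project_strain_on_fibers(E, d, eps=1e-12):
    """Project strain tensors onto fiber directions.

    E: array of shape (..., 3, 3), strain tensor per voxel.
    d: array of shape (..., 3), fiber direction per voxel (any nonzero length).
    Returns an array of shape (...) with the strain along each fiber.
    """
    # Normalize defensively so the result does not depend on the length of d.
    norm = np.linalg.norm(d, axis=-1, keepdims=True)
    d_hat = d / np.maximum(norm, eps)
    return np.einsum('...i,...ij,...j->...', d_hat, E, d_hat)

# Two made-up voxels sharing the same tensor but with different fiber directions.
E = np.zeros((2, 3, 3))
E[:, 0, 0] = 0.105   # expansion along the first axis
E[:, 1, 1] = -0.05   # compression along the second axis
d = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

e = project_strain_on_fibers(E, d)
print(e)  # positive for the first voxel, negative for the second
```

Because the direction enters twice, flipping the sign of d leaves the result unchanged, so the arbitrary orientation of a tractography direction vector does not affect the projected strain.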

Figure 2.


(a) Motion field from tagged data in its original space. (b) Same field warped into the diffusion MR space. (c) Tractography of the tongue region from the sagittal view.

3. RESULTS

The proposed workflow was carried out on a dataset of the tongue performing three specific motion tasks, each lasting 26 time frames over one second. Two tasks studied speech data during the pronunciation of the utterance “asa,” showing a forward tongue motion, and the utterance “atha,” showing a dental fricative motion. The third task studied a simple forward tongue protrusion. All 78 dynamic MR volumes were processed with their motion fields transformed into the diffusion MR space. After strain in the direction of muscle fibers was computed using Eq. (4), it was extracted with muscle masks to reveal the specific behavior of individual muscles.

A speech expert delineated seven internal tongue muscles. All muscle labels were drawn on each image slice and combined into a three-dimensional rendering of individual muscle locations. In the diffusion space, we denote a muscle label by L and its masked region by M_L(X), whose value is 1 for voxels inside L and 0 otherwise. Strain in the direction of a specific muscle L in the diffusion space at time frame t is

$$e_{t,L}(X) = M_L(X)\, e_t(X). \tag{5}$$

The values of e_{t,L}(X) in all time frames and all seven muscles are plotted in Figure 3(b)-(d). All mask locations are shown in a 3D rendering in Figure 3(a).
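Eq. (5) is a voxel-wise multiplication by a binary mask. The sketch below applies it to a made-up strain sequence and then reduces each frame to a single number by averaging inside the mask. The averaging step is an assumption added for illustration; how the masked values are summarized for plotting is not specified here. The grid size, mask location, and strain values are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up fiber strain e_t(X): 26 time frames on a small diffusion-space grid.
T, shape = 26, (4, 4, 3)
e = rng.normal(scale=0.05, size=(T, *shape))

# Hypothetical binary mask M_L(X) for one muscle label L.
M_L = np.zeros(shape)
M_L[1:3, 1:3, :] = 1

# Eq. (5): the mask is broadcast over the time axis.
e_L = M_L * e

# Assumed summary: mean strain inside the mask at each time frame.
mean_strain = e_L.sum(axis=(1, 2, 3)) / M_L.sum()
print(mean_strain.shape)
```

Dividing by the number of mask voxels rather than averaging over the whole grid keeps the zeros outside the muscle from diluting the result.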

Figure 3.


Strain along the internal tongue muscle fiber directions during the motion of pronouncing “asa,” “atha,” and a forward protrusion. A total of seven muscles in 26 time frames are involved in strain computation.

4. DISCUSSION

Results on three tongue deformations show a major difference between the forward protrusion and the two speech tasks. For speech data, constant compression is shown for either the geniohyoid or the genioglossus muscle, while the other muscles appear to stretch to a larger extent, especially the digastric muscle. For the protrusion motion, all muscles appear to converge to a smaller range of expansion, indicating that they work together in a more cooperative pattern to elongate the tongue globally during protrusion.

Comparing the two utterances, most muscles work in more than one pattern. Except for the digastric muscle, which shows consistent expansion, the muscles each deform in their own way. For “asa,” the genioglossus muscle shows the most compression to squeeze the tongue vertically for its forward motion, while the other muscles reach maximum expansion gradually around the last time frame. For “atha,” the geniohyoid muscle shows the most compression to form a dental fricative motion, while the other muscles reach maximum expansion during the middle time interval of the utterance, which is around where the /th/ consonant was pronounced.

Given different acquisition protocols for diffusion and dynamic MR data, the proposed workflow is a feasible way to combine real diffusion tractography with real motion data in post-processing. The ideal approach is for the two modalities to share a consistent physical space during data acquisition, which is being actively studied. In the best scenario, once diffusion data and dynamic data are intrinsically aligned, numerical errors from warping image spaces can be fundamentally avoided. However, acquisition cost and the welfare of the human participants need to be considered as well. For example, if multiple speech tasks are to be involved in a study, acquiring all dynamic data in the same scan session as the diffusion data may become impractical. Any resulting inconsistencies between the two datasets could potentially be addressed with the proposed workflow.

5. CONCLUSION

In this work, we proposed a method that estimates motion strain in the diffusion tractography space from dynamic MR images. Resulting strain patterns on seven internal muscles showed that some muscles stayed stretched or compressed over all time frames, while other muscles switched between stretching and compression depending on the specific tongue deformation. The proposed method enables assessment of muscle strains using real diffusion data, which helps in understanding the cooperative mechanics of internal muscles on top of their anatomical differences, providing further interpretation of clinical observations and speech motor control.

Acknowledgements

This work was supported by NIH R01DC014717, R01DC018511, R21DC016047, R00DC012575, R01CA133015.

REFERENCES

[1] Spencer AJM, [Continuum Mechanics], Courier Corporation, North Chelmsford (2004).
[2] Xing F, Ye C, Woo J, Stone M, and Prince J, “Relating speech production to tongue muscle compressions using tagged and high-resolution magnetic resonance imaging,” Medical Imaging 2015: Image Processing, Vol. 9413, p. 94131L (2015).
[3] Wang H and Amini AA, “Cardiac motion and deformation recovery from MRI: a review,” IEEE Transactions on Medical Imaging, 31(2), 487–503 (2011).
[4] Park J, Metaxas D, Young AA, and Axel L, “Deformable models with parameter functions for cardiac motion analysis from tagged MRI data,” IEEE Transactions on Medical Imaging, 15(3), 278–289 (1996).
[5] Parthasarathy V, Prince JL, Stone M, Murano EZ, and NessAiver M, “Measuring tongue motion from tagged cine-MRI using harmonic phase (HARP) processing,” The Journal of the Acoustical Society of America, 121(1), 491–504 (2007).
[6] Wedeen VJ, Hagmann P, Tseng WYI, Reese TG, and Weisskoff RM, “Mapping complex tissue architecture with diffusion spectrum magnetic resonance imaging,” Magnetic Resonance in Medicine, 54(6), 1377–1386 (2005).
[7] Gomez AD, Stone ML, Woo J, Xing F, and Prince JL, “Analysis of fiber strain in the human tongue during speech,” Computer Methods in Biomechanics and Biomedical Engineering, 23(8), 312–322 (2020).
[8] Xing F, Ye C, Woo J, Stone M, and Prince J, “Relating speech production to tongue muscle compressions using tagged and high-resolution magnetic resonance imaging,” Medical Imaging 2015: Image Processing, Vol. 9413, p. 94131L (2015).
[9] Lee E, Xing F, Ahn S, Reese TG, Wang R, Green JR, … and Woo J, “Magnetic resonance imaging based anatomical assessment of tongue impairment due to amyotrophic lateral sclerosis: a preliminary study,” The Journal of the Acoustical Society of America, 143(4), EL248–EL254 (2018).
[10] Xing F, Woo J, Lee J, Murano EZ, Stone M, and Prince JL, “Analysis of 3-D tongue motion from tagged and cine magnetic resonance images,” Journal of Speech, Language, and Hearing Research, 59(3), 468–479 (2016).
[11] Liu X, Xing F, Prince JL, Carass A, Stone M, El Fakhri G, and Woo J, “Dual-cycle constrained bijective VAE-GAN for tagged-to-cine magnetic resonance image synthesis,” IEEE 18th International Symposium on Biomedical Imaging (ISBI), pp. 1448–1452 (2021).
[12] Woo J, Murano EZ, Stone M, and Prince JL, “Reconstruction of high-resolution tongue volumes from MRI,” IEEE Transactions on Biomedical Engineering, 59(12), 3511–3524 (2012).
[13] Xing F, Woo J, Gomez AD, Pham DL, Bayly PV, Stone M, and Prince JL, “Phase vector incompressible registration algorithm (PVIRA) for motion estimation from tagged magnetic resonance images,” IEEE Transactions on Medical Imaging, 36(10), 2116–2128 (2017).
[14] Avants BB, Tustison NJ, Song G, Cook PA, Klein A, and Gee JC, “A reproducible evaluation of ANTs similarity metric performance in brain image registration,” Neuroimage, 54(3), 2033–2044 (2011).
[15] Ehrhardt J, Werner R, Schmidt-Richberg A, and Handels H, “Statistical modeling of 4D respiratory lung motion using diffeomorphic image registration,” IEEE Transactions on Medical Imaging, 30(2), 251–265 (2011).
[16] Knutsen AK, Gomez AD, Gangolli M, Wang WT, Chan D, Lu YC, … and Pham DL, “In vivo estimates of axonal stretch and 3D brain deformation during mild head impact,” Brain Multiphysics, 1, 100015 (2020).
