Author manuscript; available in PMC: 2012 Jan 1.
Published in final edited form as: J Voice. 2010 Mar 29;25(1):67–75. doi: 10.1016/j.jvoice.2009.08.001

Comparison of neck tension palpation rating systems with surface electromyographic and acoustic measures in vocal hyperfunction

Cara E Stepp 1,2,§, James T Heaton 2,3, Maia N Braden 2,a, Marie E Jetté 2, Tara K Stadelman-Cohen 2, Robert E Hillman 1,2,3
PMCID: PMC2913165  NIHMSID: NIHMS193356  PMID: 20347260

Abstract

Objectives/Hypothesis

The purpose of this study was to evaluate current neck tension palpation rating systems to determine inter-rater reliability and possible correlation with neck surface electromyography (sEMG, collected from three electrode recording locations) and measures of the third formant for /a/ during various vocal behaviors.

Study Design

This prospective study examined the neck muscle tension of 16 participants before and after a single session of voice therapy.

Methods

Inter-rater reliability and relationships between palpation ratings and objective measures of sEMG (anterior neck) and the third formant for /a/ were assessed using Pearson’s correlations (r).

Results

Inter-rater reliability was relatively low as measured by Pearson’s correlations, although Wilcoxon Signed Rank Test results were similar to those in a previous study. Correlations between palpation ratings and sEMG, and between ratings of laryngeal height and the third formant for /a/, were generally low. Correlations increased between anterior neck sEMG and ratings of suprahyoid muscle tension when examined in a reduced set of individuals with higher inter-rater reliability.

Conclusions

Palpation rating scales do not reliably capture changes that may occur in neck muscle tension of typical voice therapy patients over one session. Consequently, little can be concluded from correlations between sEMG and palpation ratings.

Keywords: voice, muscle tension, laryngeal palpation, surface electromyography, acoustic measures

Introduction

When individuals demonstrate increased intrinsic laryngeal muscle contraction (vocal hyperfunction), it is thought that they often simultaneously contract the extrinsic laryngeal muscles and other superficial neck muscles in a similar hyperfunctional manner.1 Strap muscle tension can be assessed through both visual and tactile inputs. A study by Altman and colleagues reported on 150 patients who had been diagnosed with muscle tension dysphonia (MTD; a voice disorder with symptoms of vocal hyperfunction and no known structural change to the vocal fold or neurogenic disease of the larynx).2 Based on a speech pathology evaluation of these patients, 70% were found to have “obvious cervical neck tension visible”.2 Practitioners have previously reported that observation of the inferior bellies of the omohyoid muscle crossing the supraclavicular fossae may show them to be tense and prominent during speech3, while further information about the extent of muscle tension can be gained by palpation of the larynx at rest and during voicing.3 Excessive tension in disordered individuals has been noted via palpation over the major horns of the hyoid bone, over the superior cornu of the thyroid cartilage, along the anterior border of the sternocleidomastoid muscle, and throughout the suprahyoid musculature.4

While palpation of neck musculature is a routine clinical procedure in the assessment and management of vocal hyperfunction1, 5–8, only a few standardized rating scales have been developed. As part of a surface electromyography (sEMG) study, one speech-language pathologist rated “laryngeal-area tonicity” on a 1 – 5 linear scale, and a high correlation was found between this single clinician’s scores and mean sEMG during vowel production.9 Angsuwarangsee and Morrison (2002) developed a linear 0 – 3 grading system of neck muscle tension for research use, based on the experiences and work of Lieberman5, in which each muscle group is graded based on specific text descriptors (see Table 1).10 Kooijman et al. (2005) modified the system proposed by Angsuwarangsee and Morrison to include more muscle categories, as well as documentation about body posture.11 Mathieson et al. recently proposed a new rating system in which the muscle resistance of four categories is rated on a linear scale of 1 – 5 and laryngeal position is noted as being one of the following: high held, neutral, lowered, or forced lowered (see Figure 1).12

Table 1.

Neck tension palpation system. Reprinted with permission from Angsuwarangsee T, Morrison M. Extrinsic laryngeal muscular tension in patients with voice disorders. Journal of Voice. 2002; 16:333–343.10

Rating Description
Suprahyoid muscles
0 soft at rest, may slightly contract on phonation
1 soft at rest, mild low-pitch and moderate high-pitch contraction
2 some tension at rest, tense with jaw protrusion on phonation
3 tense all the time, maximally tight on phonation
Thyrohyoid muscles
0 no muscular contraction at rest, mild on phonation
1 soft thyrohyoid space at rest, some contraction on phonation
2 tense, narrow thyrohyoid space at rest, moderate contraction on phonation
3 very tense with closed thyrohyoid space all the time
Cricothyroid muscles
0 normal cricothyroid space and phonatory movement
1 narrowing of cricothyroid space at rest, some movement on phonation
2 anterior displacement of cricoid cartilage with narrowing of cricothyroid space at rest, closing of the space on phonation
3 closed cricothyroid space all the time
Pharyngolaryngeal muscles
0 soft, easy to rotate the larynx for 90° and palpate posterior cricoarytenoid (PCA) muscle and arytenoids movement on sniffing
1 slightly tense, cannot palpate PCA muscle movement on sniffing
2 moderately tense, difficult to rotate the larynx but still can palpate the posterior edge of thyroid cartilage
3 very tense, cannot rotate the larynx at all

Figure 1.

Palpation Rating Scale. Adapted with permission from Mathieson L, Hirani SP, Epstein R, Baken RJ, Wood G, Rubin JS. Laryngeal Manual Therapy: A Preliminary Study to Examine its Treatment Effects in the Management of Muscle Tension Dysphonia. Journal of Voice. In Press.12

Angsuwarangsee and Morrison10 assessed their rating system on 57 successive voice patients, with two independent investigators (otolaryngologists) examining each patient. Inter-rater reliability was reported as p-values from Wilcoxon Signed-Ranks Tests. Only one category, pharyngolaryngeal, showed a statistically significant difference between raters (p < 0.05), which the authors interpreted as low inter-rater reliability for that category. Mathieson et al.12 used palpatory evaluations in 10 individuals with MTD pre- and post-laryngeal manual therapy. Inter-rater reliability could not be assessed because only one clinician performed the evaluations. Further, it appears that the evaluator (a speech-language pathologist) was the same individual providing therapy.

The work of Redenbaugh and Reich9 is the only study to explore the relationship between sEMG and neck palpation ratings. In their study, laryngeal-area tonicity was evaluated by a single speech-language pathologist during tidal breathing, production of the vowel /a/ for 15 seconds, and reading aloud. Laryngeal-area tonicity was rated using a 5-point, equal-appearing-interval scale. The Pearson’s correlations between the palpation score and the sEMG during the vowel and speech tasks were found to be 0.86 and 0.9, respectively. No inter-rater reliability measures were attempted given that there was only one rater. The study examined seven individuals with MTD and seven individuals with healthy normal voice. Because this sample was bimodal, with the participants likely representing opposite ends of the spectrum of neck muscle tension related to voice disorder, the correlation values may be inflated. Although the palpation procedure and scale utilized by Redenbaugh and Reich9 were published in 1989, they have not been the subject of further published research and, to our knowledge, are not widely used in the clinic. Further, this previous study utilized only one electrode position, overlying the thyrohyoid membrane. In order to understand the relationships between sEMG and clinical ratings of palpation, it is necessary to determine which electrode recording locations and vocal behaviors correlate most accurately with more widely used clinical ratings.

Roy and Ferguson13 examined changes in formant frequencies pre-therapy versus post-therapy in 75 participants with functional dysphonia, finding significant decreases in the first, second, and third formants after therapy. The authors interpreted this finding as evidence of laryngeal lowering as a result of therapeutic intervention. The neck palpation rating system proposed by Mathieson et al.12 requires the evaluator to note laryngeal position of the participant as being high held, neutral, lowered, or forced lowered using a nominal scale. Their study applied this system to 10 participants and found no significant changes post-therapy in average laryngeal height. Acoustic analysis at both time points found a trend of increased second formant during vowel production post-therapy, which would be inconsistent with laryngeal lowering. Their study did not attempt to correlate changes in perceived larynx height with formant changes.

The third formant shows less variation across different vowel productions than the second formant, which is more likely to be affected by changes in vowel articulation. Moreover, recent work has shown that treatment for MTD can also affect articulation, leading to increased vowel space.14 The third formant should be more correlated with vocal tract length, such that changes to the third formant may offer objective confirmation of changes in laryngeal position, especially due to short-term therapy effects. Assessing the relationship between judgments of laryngeal height and corresponding changes in the third formant may offer more useful information about the utility of this clinical scale for assessing laryngeal position.
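The link between the third formant and vocal tract length can be illustrated with the textbook uniform-tube idealization, which is not part of this study's analysis. A tube closed at the glottis and open at the lips resonates at F_n = (2n − 1)c / (4L). The sketch below assumes a speed of sound of 350 m/s and hypothetical tract lengths; real vocal tracts are not uniform, so the values are only indicative.

```python
def tube_formant_hz(n, length_m, c=350.0):
    """n-th resonance of an idealized uniform tube closed at one end.

    Assumptions (not from the study): uniform cross-section and
    c = 350 m/s for the speed of sound in warm, humid air.
    """
    if n < 1 or length_m <= 0:
        raise ValueError("n must be >= 1 and length_m must be positive")
    return (2 * n - 1) * c / (4.0 * length_m)


# Hypothetical tract lengths: 17.5 cm, then 1 cm longer (a lowered larynx).
f3_short = tube_formant_hz(3, 0.175)
f3_long = tube_formant_hz(3, 0.185)
print(round(f3_short), round(f3_long))
```

Under these assumptions, lengthening the tube by 1 cm lowers the idealized third formant from about 2500 Hz to about 2365 Hz, which is the direction of change expected with laryngeal lowering.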

Clinically, the presence of excessive neck tension is noted as a sign of vocal hyperfunction, informing both diagnosis and treatment.4–8, 10–12 However, current methods of assessment of neck muscle tension10, 12 depend on tactile measures, which are subjective and lack a large dynamic range of measurement. The use of sEMG and objective acoustic methods to monitor changes in neck tension and/or laryngeal position in patients with voice disorders could lead to more standardized care, as well as improved information about patient progress. It is currently unknown whether neck sEMG recordings or formant changes correlate well with clinical palpation-based ratings. Also, in order to use sEMG optimally, it is necessary to determine the electrode recording locations and vocal behaviors that correlate most accurately with clinical ratings. The purpose of this study was to evaluate the neck tension palpation rating systems of Angsuwarangsee and Morrison10 and Mathieson et al.12 to determine whether reproducible results could be obtained, as measured by inter-rater reliability measures (Pearson’s correlations), when administered by speech-language pathologists previously unfamiliar with these systems. A further goal of this study was to ascertain whether the systems were correlated with objective measures of neck tension (sEMG) and laryngeal height (third formant for /a/) of individuals receiving therapy for voice disorders related to vocal hyperfunction. These two scales were compared with acoustic changes in the third formant and with neck sEMG collected from multiple electrode recording locations during various vocal behaviors to understand how differences in scale structure may affect correlations with objective measures.

Methods

Participants

Participants were 16 adult volunteers (13 females, 3 males) with a mean age of 24.9 years (range 18–41 years) receiving voice therapy due to a voice disorder related to vocal hyperfunction (e.g., muscle tension dysphonia, vocal nodules). Table 2 lists the diagnoses of the participants, as well as age and sex. Participants varied in their progress in voice therapy, with their research participation taking place during one of multiple visits in the course of their therapy.

Table 2.

Participant diagnosis and demographic information.

Participant Age Sex Diagnosis
P1 31 M muscle tension dysphonia
P2 32 F vocal fold nodules
P3 22 F muscle tension dysphonia
P4 22 F vocal fold nodules
P5 18 F muscle tension dysphonia
P6 26 M muscle tension dysphonia
P7 27 F muscle tension dysphonia
P8 19 F muscle tension dysphonia
P9 22 F muscle tension dysphonia
P10 24 F muscle tension dysphonia
P11 21 F vocal fold nodules
P12 41 M muscle tension dysphonia
P13 20 F muscle tension dysphonia
P14 22 M vocal fold nodules
P15 26 F vocal fold nodules
P16 25 F muscle tension dysphonia

Clinical Palpation Methodology

Two of three total independent certified speech-language pathologists assessed each participant before and after therapy using the two clinical palpation ratings of Angsuwarangsee and Morrison10 and Mathieson et al.12 The ‘primary’ rater was the same clinician providing therapy to the participant. A second rater was another speech-language pathologist who was unfamiliar with the patient.

A total of three certified speech-language pathologists who specialize in voice participated in the clinical assessment portion of this study. All three completed their clinical fellowship training in a specialized voice clinic, and all had at least one year of experience working full-time in a specialized voice clinic, with case loads consisting exclusively of patients with voice disorders. All of them had extensive experience with laryngeal palpation and manipulation as a part of clinical practice prior to the initiation of this study. Participation among the three speech-language pathologists was approximately equal, with each completing pre-therapy and post-therapy assessments for between 8 and 12 participants. The speech-language pathologists were trained internally by reading the primary literature behind the rating systems10, 12 as well as the chapter on techniques of manual therapy5, and then each applying the two neck tension rating systems to the same individual, comparing rating decisions, and discussing scoring issues at length. This internal training lasted approximately 1.5 hours. After official recruitment and recording of participants had been initiated, no feedback was given to participating clinicians regarding their agreement with one another.

sEMG and Acoustic Recording Methodology

The sEMG and acoustic recordings consisted of a brief vocal assessment of the participant, which included three trials of the vowel /a/, read speech (The Rainbow Passage15), and spontaneous running speech. Spontaneous speech was elicited in response to the investigator asking the participant a probing question (e.g., “Can you tell me what you do in a typical therapy session?”). After completion of these speech tasks, maximal voluntary contraction (MVC) maneuvers were performed. These consisted of asking the participants to perform neck contraction against manual resistance for the purpose of normalizing sEMG data (see following Data Analysis section). In order to ensure that systematic differences did not exist between the MVC forces produced in the pre-therapy and post-therapy recordings, a dynamometer (Chatillon DPP-50, Ametek, Inc., Paoli, PA) was used during neck muscle contraction against manual resistance for all but three participants, and the maximal force was recorded. The MVC forces ranged from 14 – 42 lbf across participants, and there was no statistically significant difference between pre-therapy and post-therapy MVC forces (Paired Student’s t-test, df = 12, p = 0.85).

Simultaneous neck sEMG and acoustic signals from a lavalier microphone (Sennheiser MKE2-P-K, Wedemark, Germany) were recorded digitally with Delsys™ (Boston, Massachusetts) hardware (Bagnoli Desktop System) and software (EMGworks 3.3) at 20 kHz. The sEMG signals in this study were recorded and analyzed in view of current European standards.16 Participants’ necks were prepared for electrode placement by cleaning the neck surface with an alcohol pad and “peeling” with tape to reduce electrode-skin impedance, noise, DC voltages, and motion artifacts. The neck sEMG was recorded using two 2-channel Bagnoli systems (Delsys™ Inc, Boston, Massachusetts) with three Delsys™ 3.1 double differential surface electrodes placed parallel to the underlying muscle fibers of the 1) thyrohyoid, omohyoid, and sternohyoid muscles, 2) cricothyroid and sternohyoid muscles, and 3) sternocleidomastoid muscle (see Figure 2). The Delsys™ 3.1 double differential surface electrodes consisted of three 10-mm silver bars with inter-electrode distances of 10 mm. Double differential electrodes were chosen to increase spatial specificity of the sEMG recordings and to minimize electrical crosstalk, a risk given the electrode proximity.

Figure 2.

Schematic of sEMG electrode recording locations.

Electrode 1 was centered about 1 cm lateral to the neck midline, as far superior as was possible without impeding jaw opening of the participant. Electrode 2 was centered on the gap between the cricoid and thyroid cartilages of the larynx, and centered at 1 cm lateral to the midline, contralateral to Electrode 1. Electrode 3 was centered one-third of the distance from the sternal notch of each participant to his or her mastoid process following the recommendations of Falla et al.17 A ground electrode was placed on the superior aspect of the participant’s left shoulder. The sEMG signals were pre-amplified and filtered using Delsys™ Bagnoli systems set to a gain of 1000 with a band-pass filter (roll-off frequencies of 20 Hz and 450 Hz).

Data Analysis

So that the EMG data could be compared between pre-therapy and post-therapy recordings, the variability associated with differences in neck surface electrode contact and placement was minimized by normalizing the sEMG to a reference contraction at MVC. All EMG data were computed as the root-mean-square (RMS) and then normalized via MVC (in RMS) in windows of 1 s using custom software written in MATLAB® (Mathworks Inc., Natick, MA). While studies have shown that for simple, one-joint systems, sub-maximal contractions are more reliable for normalization18, 19, it has been shown that the MVC reference is more reliable for anterior neck musculature.20 Consequently, all sEMG data were analyzed in terms of % MVC. The third formant during three trials of the vowel /a/ was estimated using the linear predictive coding analysis in Praat acoustic analysis software.21 All of the formants found were consistent with the expected ranges specified in the literature (e.g., 22).
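The normalization described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB software; the non-overlapping windows and the use of the largest 1 s window of the MVC recording as the reference are assumptions, since the text does not specify either.

```python
import math


def windowed_rms(signal, fs, window_s=1.0):
    """RMS of consecutive non-overlapping windows (assumed layout).

    A trailing partial window is dropped.
    """
    n = int(round(fs * window_s))
    if n <= 0:
        raise ValueError("window must contain at least one sample")
    return [
        math.sqrt(sum(x * x for x in signal[i:i + n]) / n)
        for i in range(0, len(signal) - n + 1, n)
    ]


def percent_mvc(task_signal, mvc_signal, fs, window_s=1.0):
    """Express each task-window RMS as a percentage of the MVC RMS.

    Assumption: the reference is the largest windowed RMS of the
    MVC recording.
    """
    mvc_rms = max(windowed_rms(mvc_signal, fs, window_s))
    return [100.0 * r / mvc_rms for r in windowed_rms(task_signal, fs, window_s)]


# Toy signals at a hypothetical 4 Hz sampling rate (not study data).
task = [1, -1, 1, -1, 2, -2, 2, -2]
mvc = [4, -4, 4, -4]
normalized = percent_mvc(task, mvc, fs=4)
```

Expressing every window relative to the same participant's MVC is what allows pre-therapy and post-therapy recordings to be compared despite differences in electrode contact.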

Correlations were calculated between the normalized RMS sEMG and clinical ratings of various muscle groups to ascertain the level of association between the assorted measures. Inter-rater reliability measures were calculated with Pearson’s correlation for most elements of the two clinical rating systems using the assessment of the two speech-language pathologists. To compare these data with previous reports of inter-rater reliability, Wilcoxon Signed-Ranks Tests were also performed between raters. Inter-rater reliability of the larynx position measure of the Mathieson et al.12 palpation system was assessed using Cohen’s Kappa due to the nominal nature of the scale. A two-factor ANOVA was used to examine the effect of rater and perceived larynx height change (the larynx position measure of the Mathieson et al.12 palpation system) on the measured changes in the third formant for the /a/ vowel. Statistical analysis was performed using Minitab® Statistical Software (Minitab Inc., State College, PA).
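For readers unfamiliar with the two agreement statistics, the sketch below computes Pearson's r for ordinal ratings and Cohen's kappa for nominal ratings from two raters. It is a plain-Python illustration of the standard definitions, not the Minitab procedure used in the study, and the example ratings are hypothetical.

```python
import math
from collections import Counter


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = sum(1 for u, v in zip(a, b) if u == v) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal category frequencies.
    p_e = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical larynx-position judgments from two raters (not study data).
rater_a = ["high", "high", "neutral", "neutral"]
rater_b = ["high", "neutral", "neutral", "neutral"]
kappa = cohen_kappa(rater_a, rater_b)
```

In this toy case the raters agree on 3 of 4 judgments, chance agreement is 0.5, and kappa is therefore 0.5.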

Results

Inter-rater Reliability

Inter-rater reliability between the two raters of neck tension using all pre-therapy and post-therapy judgments was assessed using Pearson’s correlations for all categories of the Angsuwarangsee & Morrison system, and for the first four categories of the Mathieson et al. system. Pearson’s correlations were generally poor, but differed slightly as a function of muscle group. For comparison with the work of Angsuwarangsee & Morrison10, Wilcoxon Signed-Ranks tests were also performed on rater judgments. The Pearson’s correlations and p-values from the Wilcoxon Signed-Ranks Tests are shown in Figure 3.

Figure 3.

Inter-rater reliability for pre-therapy and post-therapy palpation. Palpation measures marked with an (A) are part of the Angsuwarangsee & Morrison system10; those marked with an (M) are part of the Mathieson et al. system.12 Asterisks note those measures for which the Pearson’s correlation was significantly (p < 0.05) greater than 0.

None of the Pearson’s correlations were greater than 0.6, with the lowest at 0.23. The Wilcoxon Signed-Ranks Tests showed no significant difference between raters for most categories, the exceptions being the cricothyroid and pharyngolaryngeal measures of the Angsuwarangsee & Morrison system.

When judging laryngeal position, no raters used the designations for “Lowered” or “Forced Lowered”, essentially creating a binary rating system of “High held” or “Neutral.” Of the 32 assessments of laryngeal position, a total of 22 matched perfectly (69%). Cohen’s Kappa was calculated for each response (“High held” and “Neutral”), equaling 0.38 for both.
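As a rough check on these figures, the standard two-rater definition of kappa relates observed agreement p_o to chance agreement p_e. Assuming the reported kappa follows that definition, the 69% raw agreement implies a chance agreement of about 50%, which is what would be expected when two categories are used about equally often:

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad p_o = \frac{22}{32} \approx 0.69
\quad\Rightarrow\quad
p_e = \frac{p_o - \kappa}{1 - \kappa} \approx \frac{0.69 - 0.38}{1 - 0.38} \approx 0.50
```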

Inter-rater reliability between the two raters of neck tension using the change between pre- and post-therapy judgments was also assessed. The Pearson’s correlations and p-values from the Wilcoxon Signed-Ranks Tests are shown in Figure 4. Several of the Pearson’s correlations were near zero or even negative, although the left SCM of the Mathieson et al. system had a Pearson’s correlation greater than 0.6. The Wilcoxon Signed-Ranks Tests showed no significant difference between raters for any category. Of the 16 assessments of laryngeal position change, a total of 10 matched perfectly (63%). Cohen’s Kappa was calculated for each response, equaling 0.02 for both.

Figure 4.

Inter-rater reliability for the change between pre-therapy and post-therapy judgments. Palpation measures marked with an (A) are part of the Angsuwarangsee & Morrison system10; those marked with an (M) are part of the Mathieson et al. system.12

Correlation between Palpation Ratings and sEMG

The left panel of Figure 5 shows the Pearson’s correlations between each palpation measure and sEMG from all possibly relevant electrode locations during rest, read speech, and spontaneous speech. The pharyngolaryngeal measure was not included in the correlation analysis since there were no appropriate electrode locations. The sEMG from electrode positions 1 and 2 were compared with suprahyoid, thyrohyoid, and cricothyroid ratings from the Angsuwarangsee & Morrison system, and the supralaryngeal and lateral pressure ratings from the Mathieson et al. system. The sEMG from electrode position 3 was compared with both left and right SCM ratings, despite the fact that sEMG was collected only from the patient’s left SCM.

Figure 5.

Pearson’s correlations between palpation measures and sEMG. Palpation measures marked with an (A) are part of the Angsuwarangsee & Morrison system10; those marked with an (M) are part of the Mathieson et al. system.12 The left panel (A, C, E, G, I, K, M) is for the entire set of participants (N = 16); the right panel (B, D, F, H, J, L, N) is for the reduced set of “high reliability participants” (N = 8).

To reduce the effects of poor inter-rater reliability on correlations between sEMG and palpation ratings, participants whose pre- to post-therapy change differed between raters by 2 or more scale points on any dimension were excluded, resulting in a reduced set of N = 8 “high reliability” participants. The right panel of Figure 5 shows the Pearson’s correlations for the reduced set.
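The exclusion rule can be expressed as a simple filter. The sketch below is one possible reading of it, with hypothetical data structures and participant labels: each rater's pre- to post-therapy change is stored per participant and per numeric rating dimension, and a participant is kept only if the two raters' changes differ by less than 2 points on every dimension.

```python
def high_reliability_participants(rater1, rater2, threshold=2):
    """Return IDs whose rating changes agree within `threshold` everywhere.

    rater1, rater2: dict mapping participant ID -> dict mapping dimension
    name -> (post-therapy rating - pre-therapy rating) for that rater.
    """
    kept = []
    for pid, changes1 in rater1.items():
        changes2 = rater2[pid]
        if all(abs(changes1[dim] - changes2[dim]) < threshold for dim in changes1):
            kept.append(pid)
    return kept


# Hypothetical rating changes for three participants (not study data).
rater1 = {"PA": {"suprahyoid": -1, "thyrohyoid": 0},
          "PB": {"suprahyoid": 0, "thyrohyoid": -2},
          "PC": {"suprahyoid": 1, "thyrohyoid": 0}}
rater2 = {"PA": {"suprahyoid": 0, "thyrohyoid": 0},
          "PB": {"suprahyoid": 0, "thyrohyoid": 0},
          "PC": {"suprahyoid": -1, "thyrohyoid": 0}}
kept = high_reliability_participants(rater1, rater2)
```

Here only the first participant is retained, because the other two each have one dimension on which the raters' changes differ by 2 points.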

Relationship between Perceived Laryngeal Height and the third formant

Laryngeal height (the larynx position measure of the Mathieson et al.12 palpation system) was most frequently rated as the same in both pre-therapy and post-therapy recordings. In some cases, one or both raters felt that a participant moved from “high held” to “neutral” during the course of therapy. Changes in the third formant averaged 1 Hz, ranging from −164 Hz (indicating a lower larynx post-therapy) to 281 Hz (indicating a higher larynx post-therapy). These changes did not appear to be associated with perceived laryngeal height. A two-factor ANOVA assessing the effect of rater and perceived larynx height change on the measured changes in the third formant showed no effect of either variable (p > 0.05).

Discussion

Inter-rater Reliability

Inter-rater reliability based on single time-point assessments as measured with Pearson’s correlation was generally low across all dimensions of both scales, and did not improve with the use of pre- and post-therapy differenced data. The highest reliabilities were seen for the thyrohyoid and pharyngolaryngeal assessments of the Angsuwarangsee and Morrison system and the left SCM assessment of the Mathieson et al. system. No systematic differences in the inter-rater reliability emerged between the two systems. The difference in reliability between the right (r = 0.30) and left (r = 0.49) SCM assessments is puzzling given that clinicians tended to use both hands during both SCM assessments. One possibility is that patient asymmetries affected the variability of palpable muscle tension, leading to reduced inter-rater reliability, but there is no evidence here to support overall right-left asymmetry.

Angsuwarangsee and Morrison10 used Wilcoxon Signed-Ranks Tests as a measure of inter-rater reliability, finding the only significant differences (p < 0.05) in judgments for the pharyngolaryngeal assessment (interpreted by them as poor reliability). Similarly, we found p-values greater than 0.05 for all assessments except the cricothyroid and pharyngolaryngeal assessments of the Angsuwarangsee and Morrison system, indicating that rater performance was not dissimilar to that in their study. However, the relationship between Pearson’s correlation values and the p-values resulting from the Wilcoxon Signed-Ranks testing calls into question the appropriateness of using the Wilcoxon Signed-Ranks Test as a measure of inter-rater reliability. The Wilcoxon Signed-Ranks Test assesses whether the differences between paired measures are systematically shifted away from zero; it does not assess reliability. Greater overall variance between two judges (lack of agreement) would therefore increase p-values, whereas it could lower Pearson’s correlations, and vice versa. Raters who are highly unreliable, but whose disagreements have no consistent direction, would have a high p-value due to the large variance in their differences, but a low Pearson’s correlation. As an example, in the inter-rater reliability data shown in Figure 3, the pharyngolaryngeal measure has both the highest Pearson’s correlations (indicative of high reliability) and the smallest p-values (indicative of a non-zero difference between raters). This throws into doubt the high inter-rater reliability reported by Angsuwarangsee and Morrison10, since the interpretation was based on the use of Wilcoxon Signed-Ranks Tests, and no Pearson’s correlations were reported.
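The argument above can be made concrete with two hypothetical pairs of raters (illustrative values only, not study data). In the first pair, one rater always scores one point higher than the other; in the second, disagreements are large but have no consistent direction. The sketch computes Pearson's r and an exact two-sided Wilcoxon signed-rank p-value by enumerating all sign assignments, which is feasible only for small samples.

```python
import math
from itertools import product


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def wilcoxon_exact_p(x, y):
    """Exact two-sided Wilcoxon signed-rank p-value.

    Zero differences are dropped and tied absolute differences
    receive average ranks.
    """
    d = [a - b for a, b in zip(x, y) if a != b]
    if not d:
        return 1.0
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank of the tie group
        i = j + 1
    half = sum(ranks) / 2
    observed = abs(sum(r for r, v in zip(ranks, d) if v > 0) - half)
    # Count sign assignments at least as extreme as the observed one.
    extreme = sum(
        1
        for signs in product((0, 1), repeat=len(d))
        if abs(sum(r for r, s in zip(ranks, signs) if s) - half) >= observed - 1e-12
    )
    return extreme / 2 ** len(d)


# Pair 1: rater B is always exactly one point higher than rater A.
offset_a = [0, 1, 2, 0, 1, 2, 1, 0]
offset_b = [1, 2, 3, 1, 2, 3, 2, 1]
# Pair 2: large disagreements with no consistent direction.
scatter_a = [0, 3, 1, 2, 0, 3, 1, 2]
scatter_b = [2, 1, 3, 0, 1, 2, 0, 3]

r_offset, p_offset = pearson_r(offset_a, offset_b), wilcoxon_exact_p(offset_a, offset_b)
r_scatter, p_scatter = pearson_r(scatter_a, scatter_b), wilcoxon_exact_p(scatter_a, scatter_b)
```

The first pair yields a correlation of 1 with a significant Wilcoxon result (p below 0.01), while the second yields a correlation of 0 with a p-value of 1. A non-significant Wilcoxon result therefore says nothing about whether two raters rank patients consistently.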

The nominal scale used to assess larynx position in the Mathieson et al. system showed moderately low values of Cohen’s kappa, with non-significant p-values to assess the likelihood of kappa > 0. Kappa values range from −1 to 1 where a kappa of 1 indicates perfect agreement between raters, and a kappa of 0 indicates agreement the same as that expected by chance. The results of kappa analysis indicate that the inter-rater agreement of laryngeal height is not significantly higher than that which would be due to chance. One possible factor in this lack of agreement is the prevalence of different internal definitions of laryngeal height: some clinicians may associate a high larynx position with merely a high hyoid, whereas others might require the entire larynx to be raised.

Regardless, the low values of kappa indicate that these scales do not provide reliable indications of laryngeal height.

Correlations between palpation ratings and objective measures

Using the full dataset, correlations between sEMG and palpation ratings were generally low, with many near zero or even negative. This is not surprising given the low inter-rater reliability of the palpation ratings. There does not appear to be an effect of task on correlations, with resting sEMG resulting in correlations similar to those for sEMG collected during running speech. Repeating correlation analyses on the high reliability participants resulted in much higher correlations overall. In particular, correlations between sEMG from electrode positions 1 and 2 and suprahyoid/supralaryngeal ratings of both systems increased. Also, correlations between sEMG from electrode position 3 and both left and right SCM ratings increased. One interpretation is that there is an underlying correlation between these sets of ratings and corresponding sEMG that was made clearer with the elimination of some of the variance in the palpation scoring. However, we cannot rule out the possibility that these increases are merely an artifact of our manipulation of the dataset.

No association was seen between mean changes in the third formant of the /a/ vowels and the larynx position palpation rating changes pre- and post-therapy. Mathieson and colleagues also found a lack of changes in formant frequencies (first and second) pre- and post- manual therapy in 10 patients with MTD.12

Participants in this study were current therapy patients reporting for one of a number of recommended therapy sessions. Unlike so-called functional dysphonia patients for whom voice quality frequently changes drastically over the course of a single therapy session, it is more likely that these individuals displayed patterns of voice production and muscle tension that were more resistant to change. Further, therapy sessions were not necessarily directly targeting muscle tension (e.g., laryngeal massage), but varied as a function of individual patient needs. The lack of association between palpation ratings and objective measures could, therefore, also result from tension changes between the pre-therapy and post-therapy conditions that were too small to detect, given that the study was conducted over only a single session. However, patients who report for multiple therapy sessions over time are those for whom a reliable palpation scale and/or objective assessment protocol would be most useful as a way of marking therapeutic progress.

Issues with respect to Clinical Adoption of Palpation Rating Scales

Although neck muscle palpation for assessment and management of vocal hyperfunction is commonplace in specialized voice clinics1, 5–8, formal documentation of neck tension is not widely practiced. The reliable recording of neck tension through palpation ratings or objective measures could lead to more standardized and well-informed patient care. Obstacles to the advised use of the two scales evaluated here stem from poor inter-rater reliability. The clinicians who participated as raters in this study described several major flaws that they perceived with these scales, including overly broad distinctions, general lack of bilateral (left versus right) discriminations, neglect of essential categories, and inappropriate guiding text. None of the raters in this study felt that either system was a valuable addition to their current (qualitative) protocol for monitoring neck muscle tension across the course of therapy.

A major criticism of both systems was the limited discrimination they allowed. The 4- and 5-point scales were often insensitive to within-therapy changes, even when the clinician believed that they could palpate a change in muscle tension. It is possible that a scale with more divisions, or a visual analog scale such as the one employed by the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V23), could result in more reliable within-therapy results. However, more studies should be performed on this matter, given that increasing sensitivity from a 4-point scale to a visual analog scale can, in some cases, result in decreased inter-rater reliability.24
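The insensitivity of a coarse scale to small changes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: perceived tension is modeled as a single number between 0 and 1, the function names `rate_ordinal` and `rate_vas` are hypothetical, and the pre- and post-therapy values are invented rather than taken from the study data.

```python
def rate_ordinal(tension, levels):
    """Map a perceived tension in [0, 1] onto an ordinal scale 0..levels-1.

    Assumption: equal-width categories; real raters need not behave this way.
    """
    if not 0.0 <= tension <= 1.0:
        raise ValueError("tension must lie in [0, 1]")
    return min(int(tension * levels), levels - 1)


def rate_vas(tension, length_mm=100):
    """Map the same perceived tension onto a visual analog line, in millimeters."""
    if not 0.0 <= tension <= 1.0:
        raise ValueError("tension must lie in [0, 1]")
    return round(tension * length_mm)


# Hypothetical perceived tension before and after one session (not study data).
pre, post = 0.60, 0.52

change_4_point = rate_ordinal(post, 4) - rate_ordinal(pre, 4)
change_vas_mm = rate_vas(post) - rate_vas(pre)

print("4-point scale change:", change_4_point)
print("visual analog change (mm):", change_vas_mm)
```

In this sketch both values fall in the same category of the 4-point scale, so the recorded change is zero, whereas the visual analog line registers a difference. The sketch says nothing about reliability: a finer scale also gives two raters more room to disagree, which is consistent with the caution raised above.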

Vocal hyperfunction often causes patients to present with imbalanced muscular patterns5. These patterns cause asymmetry that may be evident during laryngoscopy as well as through palpation. However, with the exception of the right and left SCM categories of the Mathieson et al. scale, no categories distinguish between right and left muscle behaviors. The raters of the present study often felt significant differences bilaterally, leading to rater confusion given the limited options for ratings. Likewise, lack of discrimination between anterior and posterior stiffness for the thyrohyoid category of Angsuwarangsee and Morrison’s system also led to rater confusion, since the accompanying text referred to both muscular contraction and differences in the thyrohyoid space. Further, in palpation of the SCM (right and left), differences were often felt between the superior and inferior ends of the muscle. Ratings based on the ‘average’ muscle tension in cases like these might mask clinically relevant changes in muscle tension that speech-language pathologists have the ability to palpate.

The accompanying text of the Angsuwarangsee and Morrison system often caused frustration for the raters of this study. The text descriptions of this system have multiple parts, and raters frequently found that, within the same patient, parts of text descriptors belonging to different numerical ratings applied. One example of this was seen more than once for the thyrohyoid measure: agreement with “some contraction on phonation” for a rating of “1”, as well as agreement with “tense, narrow thyrohyoid space at rest” for a rating of “2” (see Table 1 for reference to this system). In some cases, raters even identified with text descriptors of non-adjacent ratings (e.g., agreeing with text for a rating of 0, as well as a rating of 2). In the particular case of the CT text descriptors, raters in this study felt that the emphasis on the size of the cricothyroid space rather than the tension felt in the cricothyroid muscle was misplaced. Likewise, raters felt that the description for the pharyngolaryngeal category, which asks the rater to attempt to rotate the larynx a full 90°, was in most cases inappropriate. The general consensus of the raters of this study, all of whom had several years of experience working exclusively in voice, was that the text descriptions were a distraction. It is possible, however, that the text descriptors in this system may be of more use to clinicians with less experience in voice therapy, for whom specific text descriptors may serve as a much-needed guide.

Conclusions

This study examined two recently published clinical neck tension palpation rating systems in individuals receiving a single session of voice therapy for hyperfunction-related disorders to determine whether the systems could produce reliable results when administered by speech-language pathologists previously unfamiliar with them. The study further attempted to determine whether either of the systems was correlated with objective measures of neck tension (sEMG and change in the third formant for the vowel /a/). For the 16 individuals studied, Pearson’s correlations between raters were generally low, and little correspondence was found between ratings and objective measures. However, a smaller set of subjects with greater inter-rater agreement showed a stronger relationship between palpation ratings of the supralaryngeal area and sEMG measured on the anterior neck. These scales may be helpful in providing guidance for beginning voice practitioners and may be useful to mark long-term progress from a disordered to a fully rehabilitated state. The current results indicate, however, that they may not be sensitive enough for use as monitoring tools across individual sessions in the course of therapy, and their clinical use is not recommended for this purpose.

References

  • 1. Aronson AE. Clinical Voice Disorders: An Interdisciplinary Approach. 1st ed. New York: Thieme-Stratton, Inc; 1980.
  • 2. Altman KW, Atkinson C, Lazarus C. Current and emerging concepts in muscle tension dysphonia: a 30-month review. J Voice. 2005 Jun;19(2):261–267. doi: 10.1016/j.jvoice.2004.03.007.
  • 3. Morrison M. Pattern recognition in muscle misuse voice disorders: how I do it. J Voice. 1997 Mar;11(1):108–114. doi: 10.1016/s0892-1997(97)80031-8.
  • 4. Roy N, Ford CN, Bless DM. Muscle tension dysphonia and spasmodic dysphonia: the role of manual laryngeal tension reduction in diagnosis and management. Annals of Otology, Rhinology, & Laryngology. 1996 Nov;105(11):851–856. doi: 10.1177/000348949610501102.
  • 5. Lieberman J. Principles and Techniques of Manual Therapy: Applications in the Management of Dysphonia. In: T H, S H, J R, Howard DM, editors. The Voice Clinic Handbook. London: Whurr Publishers Ltd; 1998. pp. 91–138.
  • 6. Roy N, Bless DM. Manual circumlaryngeal techniques in the assessment and treatment of voice disorders. Current Opinion in Otolaryngology & Head and Neck Surgery. 1998;6:151–155.
  • 7. Roy N, Bless DM, Heisey D, Ford CN. Manual circumlaryngeal therapy for functional dysphonia: an evaluation of short- and long-term treatment outcomes. J Voice. 1997 Sep;11(3):321–331. doi: 10.1016/s0892-1997(97)80011-2.
  • 8. Roy N, Leeper HA. Effects of the manual laryngeal musculoskeletal tension reduction technique as a treatment for functional voice disorders: perceptual and acoustic measures. J Voice. 1993 Sep;7(3):242–249. doi: 10.1016/s0892-1997(05)80333-9.
  • 9. Redenbaugh MA, Reich AR. Surface EMG and related measures in normal and vocally hyperfunctional speakers. Journal of Speech and Hearing Disorders. 1989 Feb;54(1):68–73. doi: 10.1044/jshd.5401.68.
  • 10. Angsuwarangsee T, Morrison M. Extrinsic laryngeal muscular tension in patients with voice disorders. J Voice. 2002 Sep;16(3):333–343. doi: 10.1016/s0892-1997(02)00105-4.
  • 11. Kooijman PG, de Jong FI, Oudes MJ, Huinck W, van Acht H, Graamans K. Muscular tension and body posture in relation to voice handicap and voice quality in teachers with persistent voice complaints. Folia Phoniatr Logop. 2005 May-Jun;57(3):134–147. doi: 10.1159/000084134.
  • 12. Mathieson L, Hirani SP, Epstein R, Baken RJ, Wood G, Rubin JS. Laryngeal Manual Therapy: A Preliminary Study to Examine its Treatment Effects in the Management of Muscle Tension Dysphonia. Journal of Voice. 2009;23(3):353–366. doi: 10.1016/j.jvoice.2007.10.002.
  • 13. Roy N, Ferguson N. Formant frequency changes following manual circumlaryngeal therapy for functional dysphonia: Evidence of laryngeal lowering? Journal of Medical Speech-Language Pathology. 2001;9(3):169–175.
  • 14. Roy N, Nissen SL, Dromey C, Sapir S. Articulatory changes in muscle tension dysphonia: evidence of vowel space expansion following manual circumlaryngeal therapy. J Commun Disord. 2009 Mar-Apr;42(2):124–135. doi: 10.1016/j.jcomdis.2008.10.001.
  • 15. Fairbanks G. Voice and Articulation Drillbook. 2nd ed. New York: Harper and Row; 1960.
  • 16. Hermens HJ, Freriks B, Merletti R, et al. European Recommendations for Surface ElectroMyoGraphy: Results of the SENIAM project. Enschede: Roessingh Research and Development b.v.; 1999.
  • 17. Falla D, Dall'Alba P, Rainoldi A, Merletti R, Jull G. Location of innervation zones of sternocleidomastoid and scalene muscles: a basis for clinical and research electromyography applications. Clin Neurophysiol. 2002 Jan;113(1):57–63. doi: 10.1016/s1388-2457(01)00708-8.
  • 18. Allison GT, Marshall RN, Singer KP. EMG Signal Amplitude Normalization Technique in Stretch-shortening Cycle Movements. J Electromyogr Kinesiol. 1993;3(4):236–244. doi: 10.1016/1050-6411(93)90013-M.
  • 19. Yang JF, Winter DA. Electromyography reliability in maximal and submaximal isometric contractions. Arch Phys Med Rehabil. 1983 Sep;64(9):417–420.
  • 20. Netto KJ, Burnett AF. Reliability of normalisation methods for EMG analysis of neck muscles. Work. 2006;26(2):123–130.
  • 21. Praat: doing phonetics by computer. 2008 [computer program]. Version 5.0.20: http://www.praat.org/
  • 22. Stevens KN. Acoustic Phonetics. Cambridge, MA: MIT Press; 2000.
  • 23. Kempster GB, Gerratt BR, Verdolini Abbott K, Barkmeier-Kraemer J, Hillman RE. Consensus Auditory-Perceptual Evaluation of Voice: Development of a Standardized Clinical Protocol. Am J Speech Lang Pathol. 2009 Oct 16;18:124–132. doi: 10.1044/1058-0360(2008/08-0017).
  • 24. Wuyts FL, De Bodt MS, Van de Heyning PH. Is the reliability of a visual analog scale higher than an ordinal scale? An experiment with the GRBAS scale for the perceptual evaluation of dysphonia. J Voice. 1999 Dec;13(4):508–517. doi: 10.1016/s0892-1997(99)80006-x.