Abstract
Objective
The purpose of this study was to examine the relation of perceptual ratings of nasality by experienced listeners, measures of nasalance, and the size of the nasal port opening for three simulated English corner vowels, /i/, /u/, and /ɑ/.
Design
Samples were generated using a computational model that allowed for exact control of nasal port size and a direct measure of nasalance. Perceptual ratings were obtained using a paired-stimulus presentation.
Participants
Five experienced listeners.
Main Outcome Measures
Measures of nasalance and perceptual nasality ratings.
Results
Differences in nasalance and perceptual ratings of nasality were noted among the three vowels, with values being greater for the high vowels /i/ and /u/ compared to the low vowel /ɑ/. Listeners detected nasality for the high and low vowels simulated with nasal port areas of 0.01 and 0.0175 cm2, respectively. Correlations between ratings of nasality and nasalance were high for all three vowels.
Conclusions
Results of the present study show a high correlation between ratings of nasality and measures of nasalance for nasal port areas ranging from 0 to 0.05 cm2. The correlations were based on sustained vowel samples. The restricted speech sample limits generalization of the findings to clinical data; however, the results demonstrate the usefulness of modeling for understanding the perceptual phenomena of nasality.
Keywords: nasal port area, nasalance, nasality
Perceptual ratings of oral-nasal resonance are part of a complete assessment for patients with a variety of speech disorders. Differences in oral-nasal resonance may result from either congenital or acquired disorders and may be noted in both children and adults. It has been well documented that the perceptual rating of oral-nasal resonance, commonly referred to as nasality, exists on a continuum such that listeners can detect different degrees of normal and abnormal nasality (e.g., Brancamp et al., 2010). The relation of perceptual ratings of nasality and measures of nasal port area (i.e., velopharyngeal orifice size) has been subject to speculation for many years. Modest correlations between ratings and measures can be found in both physical studies (Watterson and Emanuel, 1981) and studies of patient populations (e.g., Warren et al., 1994; Kummer et al., 2003). Kummer et al. (2003) have suggested that the relation between perception and gap size is not linear, based on a retrospective medical chart review; however, the relationship has not been tested directly. The goal of the present study was to use a model of the vocal tract with varying nasal port areas to produce acoustic signals that can be rated by listeners as a direct way to study the relation between the perceptual characteristics and nasal port area.
Concerns about the reliability of scaled ratings of nasality have been presented in the literature (e.g., Zraick and Liss, 2000; Whitehill et al., 2002). One concern about reliability has been related to the type of psychophysical scale used for the ratings (e.g., equal-appearing interval scaling versus direct magnitude estimation). Zraick and Liss (2000) report a large difference between ratings based on the two scales, whereas Brancamp et al. (2010) report no differences. The authors of both studies suggest that further research on the acoustic-perceptual correlates of nasality is needed. Perceptual uncertainty, or the lower-than-expected reliability in listener ratings of nasality, has been explained as variation in the acoustical foundation of hypernasality across speech samples and subjects (Sherman, 1954; Carney and Sherman, 1971; Moore and Sommers, 1973; Watterson and Emanuel, 1981; Watterson et al., 1993). The nasal vowel is produced by acoustic coupling of the oral and nasal cavities at a point that is about halfway along the vocal tract between the glottis and the lips. The effect of this acoustic coupling is a shift in the formant frequencies of the vocal tract (i.e., the formant frequencies for the nonnasal vowel) and the addition of pole-zero pairs to the vocal-tract transfer function (House and Stevens, 1956; Fant, 1960; Fujimura, 1960, 1961; Fujimura and Lindqvist, 1971; Lonchamp, 1979). The primary consequence on the acoustic spectrum of the vowel appears to be at the low frequencies, in the vicinity of the first formant. Issues with reliability, therefore, have been reported to arise when listeners do not pay attention to cues in the acoustic signal in the same way (Fletcher, 1970).
Hawkins and Stevens (1983) have suggested that there are basic acoustic properties of nasality, independent of the vowel, to which the auditory system responds in a distinctive way, and therefore all listeners should be reliable in their ratings of nasality. They assert that when the cross-sectional area of the velopharyngeal opening is gradually increased, a shift is seen in the frequency of the first formant as well as an increase in the formant bandwidth. A pole-zero pair is added in the vicinity of the first formant, and the spacing between the pole and zero increases as the velopharyngeal opening increases, with the additional pole showing increased spectral prominence with larger openings (Hawkins and Stevens, 1983). Based on data from listeners across languages, they concluded that a measure of the degree of prominence of the spectral peak in the vicinity of the first formant represents the acoustic properties of nasality, independent of the vowel, to which the auditory system responds in a distinctive way regardless of language background. This is consistent with a statement put forth by McDonald and Koepp-Baker (1951), who hypothesized that there is a critical point in the degree of velar closure where the balance of oral and nasal resonance shifts from normal to abnormal based on changes in the acoustic characteristics.
Given the reported clinical concerns about perceptual reliability for ratings of nasality and the need to document nasality prior to management, objective measures of nasal versus oral airflow and pressure are used clinically to support the perceptual ratings. Similar to acoustic measures, these measures have been shown to vary by vowel and dialect as well as by language; consequently, multiple sets of reference values have been published in the literature (e.g., Anderson, 1996; Dalston et al., 1993). One measure, which has been studied extensively in relation to the perceptual rating of nasality and is used most commonly in clinical settings, is nasalance (Stelck et al., 2011). Nasalance is the relative proportion of sound within a specified frequency band emitted from the mouth and nose during speech. Screening values of nasalance have been proposed for use in clinical settings with high sensitivity and specificity to guide management decisions. For example, using a nasalance cutoff score of 0.32, sensitivity is reported as 0.89 and specificity as 0.95 (Dalston et al., 1991). Similarly, Hardin et al. (1992) found that with a cutoff score of 0.27, sensitivity is 0.76 and specificity is 0.86. Differences in the absolute values were believed to be related to the speech material used in the research. Measures of nasality have also been reported for tracking changes over time and in response to management within subjects (Tachimura et al., 2004; Zemann et al., 2006; Pereira et al., 2008). A few studies have documented moderate correlations between perceptual ratings of nasality and nasalance (e.g., Hardin et al., 1992; Nellis et al., 1992; Brancamp et al., 2010). Dalston et al. (1991) reported a correlation (r = .82) between Nasometer data and listener ratings of nasality.
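To make the screening figures above concrete, the following minimal sketch shows how sensitivity and specificity follow from a nasalance cutoff. The scores and labels are invented for illustration and are not data from any cited study; the function name and the choice to flag scores at or above the cutoff are assumptions.

```python
def sensitivity_specificity(nasalance_scores, is_hypernasal, cutoff):
    """Return (sensitivity, specificity) when scores >= cutoff are flagged.

    nasalance_scores: nasalance as a proportion (0 to 1), one per speaker.
    is_hypernasal:    listener-based reference label for each speaker.
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(nasalance_scores, is_hypernasal):
        flagged = score >= cutoff  # assumption: cutoff itself counts as positive
        if truth and flagged:
            tp += 1
        elif truth:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented example data (hypothetical, for illustration only).
scores = [0.10, 0.20, 0.28, 0.35, 0.30, 0.40, 0.55, 0.33]
labels = [False, False, False, False, True, True, True, True]
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.32)
```

Lowering the cutoff generally raises sensitivity at the expense of specificity, which is one reason different speech materials can lead to different recommended cutoffs.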
Two studies have attempted to relate perceptual ratings to nasal port area size. Dalston et al. (1991) correlated measures of nasalance, perceptual nasality ratings, and aerodynamic estimates of velopharyngeal orifice size and reported weak correlations (r = .32). In a retrospective medical chart review, Kummer et al. (2003) reported that based on a logit regression analysis, perceptual characteristics correctly predicted the category of gap size in 70% of the patients (121 of 173 subjects). Gap sizes were listed as small, medium, or large based on videofluoroscopy and/or nasopharyngoscopy evaluation. No direct measures of velopharyngeal orifice size were made. The perceptual ratings included the presence or absence of nasal emission or nasal rustle and presence of hypernasality rated on a 3-point scale as mild, moderate, or severe. Conclusions from the study suggest that predicting velopharyngeal gap size based on an individual’s speech is not an exact science; however, some predictions can be made that can be used to guide management. Confidence was reported to be greatest when the opening was small, perceived as a nasal rustle, or large, perceived as severe hypernasality.
One study has attempted to investigate the effects of oral-nasal coupling on nasality using a physical appliance that permitted alteration of the size of the velopharyngeal opening (Watterson and Emanuel, 1981). The speech tasks included two English vowels, /i/ and /u/, in a whispered and voiced condition. Listeners rated the scaled degree of nasality for each test vowel using a 5-point scale, with 1 representing the least nasality and 5 the most nasality. Velopharyngeal opening was set to 1 of 5 circular areas. Areas included 12.57, 28.27, 50.26, 78.53, and 153.94 mm2. Results showed that ratings were erratic and did not increase or decrease systematically as oral-nasal coupling was increased. Watterson and Emanuel concluded that increases in oral-nasal orifice area do not necessarily result in increased vowel nasality, in contrast to reports from other authors (e.g., Stevens and House, 1956; Hawkins and Stevens, 1983). They further suggest that the acoustic-perceptual impact of coupling may be influenced markedly by source-spectrum characteristics, although there are no data reported to directly support this claim.
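As a quick check on the units, the five reported areas are consistent with circular openings of 4, 6, 8, 10, and 14 mm in diameter. These diameters are inferred here from the reported areas and are not stated in the text, so they should be read as an assumption. The sketch below computes the areas and converts them to cm2 for comparison with nasal port areas expressed in that unit.

```python
import math


def circle_area_mm2(diameter_mm):
    """Area of a circular opening, in mm^2, from its diameter in mm."""
    return math.pi * (diameter_mm / 2.0) ** 2


# Assumed diameters (inferred from the reported areas, not given in the text).
diameters_mm = [4, 6, 8, 10, 14]
reported_mm2 = [12.57, 28.27, 50.26, 78.53, 153.94]

areas_mm2 = [circle_area_mm2(d) for d in diameters_mm]
areas_cm2 = [a / 100.0 for a in areas_mm2]  # 1 cm^2 = 100 mm^2

for d, a_mm2, a_cm2 in zip(diameters_mm, areas_mm2, areas_cm2):
    print(f"d = {d:2d} mm  area = {a_mm2:7.2f} mm2 = {a_cm2:.4f} cm2")
```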
For clinical populations, ratings of nasality are frequently corroborated with visualization of the velopharyngeal mechanism via nasopharyngoscopy. Although this does allow the clinicians to calculate a port size and make decisions regarding management, errors related to visual distortion and the perceptual consequences are possible. In addition, any attempt to relate the perception of nasality to port size (via presentation of recorded audio to listeners) is compromised by either the presence of the endoscope during audio recording or by using audio recordings collected during a different production of sound than the one for which port size was measured. In contrast, a computational model representative of the structure and function of the speech production system can be used to produce human-like speech sounds for which all parameters are under experimental control. For instance, a set of vowel sounds could be generated for which the nasal port size ranges from small to large, but all other aspects of producing the vowel, such as vocal-fold vibratory characteristics and vocal-tract shape, are guaranteed to be unchanged from sample to sample. In this way, the effect of systematic parametric variation (i.e., a continuum of parameter values) can be related to the output signal(s) as well as to a listener’s perception of those signals.
The purpose of this study was to examine the relation of perceptual ratings of nasality by trained listeners, measures of nasalance, and the size of the nasal port opening for three English corner vowels, /i/, /u/, and /ɑ/. Samples were generated using a computational model that allowed for exact control of nasal port size and a direct measure of nasalance.
Method
The institutional review board at the University of Arizona approved all procedures.
Simulation of Audio Samples
A computational model of speech production was used to generate vowel samples with varying degrees of nasal port coupling. Although phrase- and sentence-level speech can be produced by the model, it requires a complex time-dependent variation of the model parameters and complicates the interpretation of acoustic and perceptual measures. In this preliminary study, vowels were chosen so that the temporal complexity of connected speech would not affect the results.
The voice-source component of the model used to simulate all vowel samples consisted of a kinematic representation of the medial surface of the vocal folds (Titze, 1984, 2006) for which fundamental frequency, surface bulging, and adduction are control parameters. This produces a glottal area signal that is coupled to the acoustic pressures and air flows in the trachea and vocal tract through aerodynamic and acoustic considerations as prescribed by Titze (2002). The resulting glottal flow was determined by the interaction of the glottal area with the time-varying pressures present just inferior and superior to the glottis. The vocal tract shape, which extended from glottis to lips, was specified by an area function representative of an /i/, /u/, or /ɑ/ vowel (Story, 2008); the tracheal shape was also specified by an area function that extended from the glottis to bronchi (Story, 1995). The nasal tract consisted of a multiple-path system of airways that included a nasal port coupling area for entry in the velopharyngeal space, which then branched into the left and right nasal passages. The sphenoid sinus and the left and right maxillary sinuses were also included as side branch resonators. Dimensions for the nasal tract were based on Story (1995). It is noted that the model represents the coupling of the vocal and nasal tracts only as a cross-sectional area and does not account for the possible changes in the vocal tract shape due to lowering of the velar structure. Acoustic wave propagation in subglottal, supraglottal, and nasal airspaces was computed with a wave-reflection model (Liljencrants, 1985; Story, 1995) that included energy losses due to yielding walls, viscosity, heat conduction, and radiation at the lips (Story, 1995). The sound production system is illustrated in Figure 1.
The glottis is located at zero on the x-axis and the tracheal shape extends leftward in the negative direction, whereas the vocal tract, configured here as an /ɑ/ vowel, extends rightward in the positive direction. The nasal coupling port, indicated by a_n, is located 9.1 cm from the glottis. To keep the figure simple, the nasal tract is shown only as a label rather than as a set of area functions. The acoustic pressure signals radiated at the lips and nares are summed to form the composite output signal. This is analogous to a microphone signal recorded from a real talker.
Each of the three vowels /i/, /u/, and /ɑ/ was simulated with 21 equally incremented values of the nasal port area (a_n in Fig. 1) that ranged from 0 to 0.05 cm2 (0.0025-cm2 increments). Each vowel sample was generated with a total duration of 0.5 seconds, and the fundamental frequency (F0) was varied according to the contour shown in the top panel of Figure 2. The F0 began at a frequency of 89 Hz, increased to a maximum value of 105 Hz at 0.25 seconds, and then decreased to a final value of 95 Hz. For each vowel the respiratory pressure was ramped from 1000 to 7840 dyne/cm2 in the initial 5 milliseconds with a cosine function, and then ramped down from 7840 to 1000 dyne/cm2 over the final 50 milliseconds of the utterance. Other model parameters were set to constant values throughout the time course of each vowel. An example waveform consisting of the sum of the radiated pressure signal at the lips and nares is shown in the lower panel of Figure 2 for the vowel /ɑ/ and a nasal coupling area of 0.03 cm2.
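The stimulus parameters described above can be sketched as follows. This is an illustration only, not the simulation code used in the study: the function and variable names are hypothetical, the ramps are written as raised-cosine segments, and the offset ramp is assumed to have the same cosine shape as the onset ramp. The F0 contour is not reproduced because its exact shape is given only in Figure 2.

```python
import math

N_AREAS = 21
STEP_CM2 = 0.0025

# 21 equally spaced nasal port areas from 0 to 0.05 cm2.
port_areas_cm2 = [round(i * STEP_CM2, 4) for i in range(N_AREAS)]


def respiratory_pressure(t, duration=0.5, p_low=1000.0, p_high=7840.0,
                         onset=0.005, offset=0.050):
    """Respiratory pressure (dyne/cm2) at time t (seconds).

    Rises from p_low to p_high over the first `onset` seconds and falls
    back to p_low over the last `offset` seconds, using raised-cosine ramps.
    """
    if t < 0.0 or t > duration:
        raise ValueError("t must lie within the utterance")
    if t < onset:
        frac = 0.5 * (1.0 - math.cos(math.pi * t / onset))
    elif t > duration - offset:
        frac = 0.5 * (1.0 + math.cos(math.pi * (t - (duration - offset)) / offset))
    else:
        frac = 1.0
    return p_low + (p_high - p_low) * frac


# Pressure sampled every millisecond across the 0.5-second vowel.
envelope = [respiratory_pressure(n / 1000.0) for n in range(501)]
```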
Waveforms such as the one shown in Figure 2 were generated for each vowel condition, converted to wav format audio files, and used as stimuli for the listening experiment. Three audio files containing all stimuli for the three different vowels (Audio 1, Audio 2, and Audio 3) are available as supplemental online content associated with this article.
Measures of Nasalance
As shown in Figure 1, a radiated pressure signal is available at each naris and at the lips for any given simulated vowel. These signals were used to calculate the nasalance as a percentage according to the equation

\[ \text{Nasalance} = \frac{P_n}{P_n + P_o} \times 100, \]
where Pn and Po are the root mean square (RMS) pressures at the nares and lips (oral pressure), respectively. Nasalance values were calculated twice for each vowel along the continuum of the nasal port area: first with the raw signals, referred to as full bandwidth, and then with the pressure signals at the nares and lips passed through a fourth-order Butterworth band-pass filter with cutoff frequencies of 350 and 650 Hz before the RMS values of Pn and Po were determined, referred to as limited bandwidth. The second condition was included to simulate the filtering done by the Nasometer system (Fletcher et al., 1989).
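A minimal sketch of the full-bandwidth calculation is given here, assuming the conventional definition of nasalance: nasal RMS pressure divided by the sum of nasal and oral RMS pressures, multiplied by 100. The function names and the toy sinusoidal signals are hypothetical. For the limited-bandwidth condition, the same function would be applied to signals that had already been band-pass filtered; the filter itself is not implemented in this sketch.

```python
import math


def rms(signal):
    """Root mean square of a sequence of pressure samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def nasalance_percent(nasal_signal, oral_signal):
    """Nasalance (%) from the pressure signals radiated at the nares and lips."""
    p_n = rms(nasal_signal)
    p_o = rms(oral_signal)
    return 100.0 * p_n / (p_n + p_o)


# Toy signals (not simulation output): the nasal signal has one quarter of
# the oral amplitude, so nasalance should be 0.25 / 1.25 = 20%.
fs = 44100
t = [n / fs for n in range(fs // 10)]
oral = [1.00 * math.sin(2 * math.pi * 500 * x) for x in t]
nasal = [0.25 * math.sin(2 * math.pi * 500 * x) for x in t]
value = nasalance_percent(nasal, oral)
```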
Auditory Perceptual Scaling of Nasality
The listening panel consisted of five experienced speech-language pathologists. Experience was defined as having 3 or more years of clinical experience working with children with resonance disorders. Three of the listeners had between 3 and 5 years of experience and two had 20-plus years of experience. Although experience varied within the listener group, data suggest that trained listeners, regardless of years of experience, have similar reliability for judgments of nasality (Laczi et al., 2005). All listeners were women, were native English speakers, and passed a hearing screening. Each listener performed the evaluation alone while seated in a sound-treated room in front of a PC that randomly presented the speech stimuli through a loudspeaker. The loudspeaker was located about 100 cm from the listener, and signals were presented at a sound pressure level of 70 to 75 dBA. Alvin (Hillenbrand and Gayvert, 2005), an open-source, Windows-based program for controlling listening experiments, was used for stimulus presentation and response recording. Alvin allows the listener to control the pace of stimulus presentation and response. Each listener was presented a series of trials. Each trial contained a pair of vowel productions. The samples were separated by a 400-millisecond pause. The pair included one sample where the nasal port opening was set to zero (i.e., not coupled to the vocal tract) and one where the nasal tract was coupled to the vocal tract with an opening between 0.0025 and 0.05 cm2. Sample order was randomized and balanced for order of presentation; this created a total of 42 pairs per vowel for presentation (i.e., 21 different nasal port opening settings × two orders). While the stimulus pair was presented, the computer screen displayed a horizontally oriented “slider scale,” with the slider button positioned at the scale’s midpoint. The ends of the slider scale were labeled Sample 1 (left side) and Sample 2 (right side).
The following instructions were read to the listeners prior to commencing the experiment.
“You will be presented with a series of paired words read by a number of different speakers. For each presentation, the words will have differing degrees of nasality. Your task is (1) to indicate whether sample one or sample two is more nasal, and (2) by how much. You will use the mouse to move the marker from the middle of the scale towards the presentation that has the most nasality. You will indicate the degree of nasality difference by how close you move the marker to the speech sample, the closer to either end, the larger the difference in nasality. Try to make each choice match the nasality, as you perceive it.”
No practice trials were offered; however, the investigator did demonstrate the computer interface to each listener using three vowel pairs that were not part of the study. The three pairs included one vowel with the nasal port area set to zero and one where the nasal port area was larger than those used in the study (0.5 cm2). A total of 504 signal pairs were presented (four repetitions × three vowels × 42 pairs). Listeners were allowed to play each stimulus pair as many times as they wished. Vowel order was randomized. The listening experiment lasted 30 minutes.
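The trial count can be reproduced with a short sketch. It takes the reported 21 settings at face value (0 to 0.05 cm2 in 0.0025-cm2 steps, which includes a pair in which both samples have a closed port) and shuffles all trials together; the exact randomization scheme used in the experiment is not fully specified, so the full shuffle, the dictionary keys, and the function name are assumptions.

```python
import random

VOWELS = ["i", "u", "a"]  # "a" stands in for the vowel /ɑ/
N_SETTINGS = 21
STEP_CM2 = 0.0025
REPETITIONS = 4
REFERENCE_AREA = 0.0  # closed nasal port


def build_trials(seed=0):
    """Return a shuffled list of paired-comparison trials."""
    rng = random.Random(seed)
    trials = []
    for vowel in VOWELS:
        for k in range(N_SETTINGS):
            target_area = round(k * STEP_CM2, 4)
            for target_first in (True, False):  # balance presentation order
                for rep in range(REPETITIONS):
                    if target_first:
                        first, second = target_area, REFERENCE_AREA
                    else:
                        first, second = REFERENCE_AREA, target_area
                    trials.append({
                        "vowel": vowel,
                        "first_area_cm2": first,
                        "second_area_cm2": second,
                        "target_first": target_first,
                        "repetition": rep,
                    })
    rng.shuffle(trials)  # assumption: one shuffle across all trials
    return trials


trials = build_trials()
```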
The Alvin program recorded the listener response to each stimulus pair as an integer ranging from −500 to +500. For example, a score of −500 would indicate that the listener perceived the first sample to be much more nasal than the second sample. A rating of +500, on the other hand, would indicate that the listener perceived the second sample as much more nasal than the first sample. A rating of zero would indicate that the listener could not distinguish the nasality of the two samples. Because one of the samples in each pair was always the reference (nasal port area = 0 cm2), the sign of each rating was adjusted so that a positive value indicated that the target sample was judged more nasal than the reference. There were 13 ratings where the reference signal was judged to be more nasal than the target signal; these ratings were removed from the data set. For each stimulus pair, a listener rating was derived on the basis of the mean of that listener’s four ratings. A panel rating was then derived on the basis of the mean rating across the five listeners. Interrater reliability was assessed using the intraclass correlation coefficient or ICC(2,k) (Shrout and Fleiss, 1979). The ICC(2,k) for the listener group was .97.
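The scoring steps can be illustrated with the sketch below. It is an interpretation of the procedure described here, not the analysis code used in the study: the function names are hypothetical, and the ICC(2,k) function follows the usual two-way random-effects, average-measures form of Shrout and Fleiss (1979), computed from the mean squares of a complete targets-by-raters table.

```python
from statistics import mean


def target_rating(raw, target_first):
    """Re-sign a raw slider value (-500..+500) so that a positive value
    means the target was judged more nasal than the reference."""
    return -raw if target_first else raw


def listener_rating(raw_trials):
    """Mean re-signed rating for one listener and one stimulus pair.

    raw_trials: list of (raw_value, target_first) tuples.
    Ratings where the reference was judged more nasal (negative after
    re-signing) are dropped, as described in the text.
    """
    resigned = [target_rating(raw, tf) for raw, tf in raw_trials]
    kept = [r for r in resigned if r >= 0]
    return mean(kept) if kept else None


def icc_2k(table):
    """ICC(2,k) for table[target][rater], with no missing cells."""
    n, k = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    row_means = [mean(row) for row in table]
    col_means = [mean(row[j] for row in table) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)              # between-targets mean square
    jms = ss_cols / (k - 1)              # between-raters mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (jms - ems) / n)


# Hypothetical raw responses for one pair in which the target was played first.
example = [(-100, True), (-200, True), (50, True), (-300, True)]
example_mean = listener_rating(example)
```

In this example the third response is dropped because, after re-signing, it indicates that the reference was judged more nasal, and the listener rating is the mean of the remaining three values.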
Results
Nasalance
Measures of nasalance plotted against nasal port area are shown in the two panels of Figure 3 for the full bandwidth (Fig. 3a) and limited bandwidth (Fig. 3b) for the three English vowels. The limited bandwidth is comparable to that used by the Nasometer (Fletcher et al., 1989). As expected, nasalance increased as the nasal port area grew larger. The most rapid rise in nasalance was noted for /i/ in both bandwidth conditions, and the slowest rise was for /ɑ/. Differences in nasalance were noted for all three vowels when the full bandwidth was compared with the Nasometer bandwidth. The slopes (rise in nasalance) were steeper for the Nasometer bandwidth compared with the full bandwidth; however, the relative order of the vowels remained the same.
The horizontal bold line at 32% nasalance is included as a reference based on the clinical cutoff value proposed by Dalston et al. (1991). For the full bandwidth, the vowel /i/ reached the threshold at a nasal port area of 0.03 cm2, whereas /u/ did not reach the same nasalance level until the nasal port area was 0.0375 cm2. The nasalance value for /ɑ/ did not reach the clinical threshold level for the range of areas tested (0 to 0.05 cm2). Based on the limited (Nasometer) bandwidth, the /i/ vowel reached the cutoff threshold at a smaller nasal port area (0.0175 cm2) compared with both /u/ and /ɑ/. For /u/ and /ɑ/, nasal port areas of 0.03 cm2 and 0.0475 cm2, respectively, correspond to nasalance scores of 32%.
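Reading a threshold crossing off a nasalance curve amounts to finding the first sampled port area at which nasalance meets or exceeds the cutoff. The sketch below shows this on a placeholder curve; the linear values are invented for illustration and are not the nasalance data plotted in Figure 3, and the function name is hypothetical.

```python
def first_area_reaching(areas_cm2, nasalance_percent, threshold=32.0):
    """Smallest sampled nasal port area whose nasalance is >= threshold.

    Returns None if the curve never reaches the threshold, as reported
    for /ɑ/ in the full-bandwidth condition.
    """
    for area, value in zip(areas_cm2, nasalance_percent):
        if value >= threshold:
            return area
    return None


areas = [round(i * 0.0025, 4) for i in range(21)]

# Placeholder curves (NOT the study's data).
rising_curve = [2.0 * i for i in range(21)]  # rises 2 percentage points per step
flat_curve = [1.0] * 21                      # never reaches the cutoff

crossing = first_area_reaching(areas, rising_curve)
no_crossing = first_area_reaching(areas, flat_curve)
```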
Auditory Perceptual Scaling of Nasality
Mean ratings across the listener group are shown in Figure 4 where nasal port area is plotted against nasality rating for each of the three vowels. Visually, the overall shape of the curve for the vowels /i/ and /u/ resembles the plots of nasalance shown in Figure 3. For these two vowels, nasality ratings increased quickly once the nasal port area reached 0.01 cm2. This nasal port area size represented the point where listeners detected nasality and above which an increasing severity was noted. This nasal port area is smaller than the nasal port area matching the clinical threshold for the corresponding nasalance value (0.0175 cm2). A second jump in the nasality ratings was noted at a nasal port area of 0.02 cm2 for /i/ and /u/. The perceptual ratings appeared to plateau for the nasal port opening sizes greater than 0.045 cm2; this may represent a saturation point for the listeners. It is interesting to note, however, that no raters used values above 300 on the scale for any of the samples presented, even though a range of 0 to 500 was available.
For the vowel /ɑ/, an increase in nasality ratings was not seen until a nasal port area of 0.0175 cm2 was achieved, and in contrast to the other vowels, the ratings for /ɑ/ were relatively flat for nasal port areas greater than 0.02 cm2. This was below the nasal port area that corresponded to the clinical threshold value for nasalance scores (0.0325 cm2 for the limited bandwidth measure). Overall, ratings for /ɑ/ did not reach the magnitude seen for both /i/ and /u/.
Data for individual listeners are shown in the panels of Figure 5. These data show the variability in ratings across the listeners. Three of the five listeners limited their scale use to a range of 0 to 200. This means that they used roughly half of the visual analog scale available to them. The other two listeners used a range of 0 to 350, only slightly larger. Differences in ratings between /i/ and /u/ versus /ɑ/ noted in group data shown in Figure 4 were evident in individual data for listeners 2, 3, and 4, even though the absolute values of the ratings were different for these listeners. No differences in ratings between vowels were noted for listener 5. For listener 1, ratings for /ɑ/ and /u/ were similar, but there was a difference noted in ratings for /i/. Listeners 2 and 5 had the most clinical experience.
Nasalance and Nasality
Correlation coefficients between nasalance (full bandwidth and limited bandwidth) and mean nasality ratings were calculated for all three vowels. Correlations were high in all cases. For /i/, the correlation coefficient was .98 for the full bandwidth and .97 for the limited bandwidth. For /u/ and /ɑ/, the values were .93 and .97, respectively, for both bandwidths.
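The text does not name the type of correlation coefficient. Assuming a Pearson product-moment coefficient computed across the sampled nasal port areas for one vowel, the calculation can be sketched as follows; the function name and the small input lists are hypothetical and are not the study's measurements.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration: nasalance (%) and mean panel rating
# at a few nasal port areas for a single vowel.
nasalance = [0.0, 5.0, 12.0, 20.0, 31.0, 40.0]
mean_rating = [0.0, 20.0, 60.0, 110.0, 150.0, 190.0]
r = pearson_r(nasalance, mean_rating)
```

Because both measures rise monotonically with port area in a design like this, a high coefficient mainly indicates that the two curves share a similar shape over the sampled range.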
Discussion
The current study examined the relation among perceptual ratings of nasality, measures of nasalance, and nasal port area. The samples were simulations of three English corner vowels, /i/, /u/, and /ɑ/, generated with a computational model. Modeling was desirable because it allowed nasal port area to be explicitly specified so the relation between perceptual and acoustic measures could be determined. Results of the study can be compared with previously published data speculating on the relation among perception, nasalance, and nasal port area.
Vowel Differences
Differences between vowels were noted for both nasality ratings and nasalance measures. For example, /i/ and /u/ had a steeper nasalance slope as nasal port area increased compared with /ɑ/ in the present study. Nasalance increased slowly for /ɑ/, especially when the measure was taken from the full bandwidth. These findings are consistent with previous research for speech samples from speakers with and without velopharyngeal impairment, where it was reported that nasalance scores obtained from the high front vowel /i/ were markedly higher than nasalance scores obtained from low back vowel /ɑ/ in sentence-level material (MacKay and Kummer, 1994). Similar findings were reported by Lewis et al. (2000) based on a study that included both vowels in isolation as well as vowels extracted from sentence repetitions. They reported a higher nasalance value for /i/ in isolation (20%) compared with the other isolated vowels (/u/, /æ/, and /ɑ/) (10%). In the sentence sample, nasalance values were similar for /i/ and /u/, which were higher than values for /æ/ and /ɑ/. Neither of these studies investigated the relation between listener judgments of nasality and nasalance scores. Thus it is not known whether there would have been perceptual differences across stimuli weighted with high and low vowels.
In the present study, perceptual ratings of nasality across listeners were similar for /i/ and /u/, with jumps in the ratings noted at nasal port areas of 0.01 and 0.02 cm2. The jump in ratings at a nasal port area of 0.01 cm2 corresponded to the point where listeners were able to detect nasality in the vowel samples. This jump occurred at a nasal port area of 0.0175 cm2 for /ɑ/. The measured nasalance value was approximately 5% for all three vowel samples when nasality was detected by the listeners. The second jump in ratings appeared to relate to an increase in severity. At this nasal port area, the measured nasalance was 20%, well below the clinical threshold of 32% put forth by Dalston et al. (1991) or 27% reported by Hardin et al. (1992). A caveat, however, is that the clinical threshold values were developed using connected speech, whereas isolated vowels were tested in the present study. Nasality ratings continued to increase for nasal port areas greater than 0.02 cm2 for both /i/ and /u/. For /ɑ/, ratings fluctuated but were fairly flat. Generally, the literature suggests that high vowels are perceived as more nasal than low vowels for speakers who are hypernasal (Hess, 1959; Spriestersbach and Powers, 1959; Carney and Sherman, 1971). For normal speakers, however, low vowels are perceived as more nasal than high vowels (Lintz and Sherman, 1961). In the present study, nasality ratings for the low vowel /ɑ/ were never as high as ratings for either /i/ or /u/, contrary to what would be predicted from Lintz and Sherman (1961). One difference is the use of connected speech versus sustained vowels. Furthermore, nasal port area was maintained at a constant area throughout the sustained vowel. Speakers may vary the size of the port within the vowel nucleus, and this would affect the acoustics. If the nasal port area had increased past 0.05 cm2, it is not known how ratings of nasality would change.
It is possible that listeners were expecting samples with greater degrees of nasality because they used only a portion of the perceptual scale range available. It has been suggested, however, that ratings of quality are perhaps more of a ranking of samples rather than a measure of perceptual distance (Shrivastav and Sapienza, 2005). In this view, the absolute magnitude of the nasality ratings is not important but may simply indicate an ordering of relative severity. Further study of how ratings change with larger nasal port area is needed in order to understand how listeners use the rating scale for nasality.
Nasality Ratings and Nasalance
Understanding the relation between the perceptual phenomenon of nasality and nasalance scores is of great interest. Kent (1996) noted that “auditory-perceptual judgments are typically the final arbiter in clinical decision-making and often provide the standards against which instrumental, so-called ‘objective’ measures are evaluated” (p. 7). Correlation coefficients calculated for the measures of nasalance (full bandwidth and limited bandwidth) and ratings of nasality in the present study were high (range, .93 to .98). This finding is somewhat surprising given reports in the literature. For example, Watterson et al. (1993) reported modest correlations (range, .24 to .49) between judgments of nasality and measures of nasalance that they speculated were related to the limited bandwidth used by the Nasometer (500 ± 150 Hz; Fletcher et al., 1989). They reported that listeners had access to the full bandwidth to arrive at judgments of nasality. Correlations in the present study were similar for measures of nasalance taken from both the full bandwidth and the limited bandwidth compared with perceptual ratings. It is also likely, as put forth by Watterson and colleagues, that listeners take advantage of suprasegmental features, such as vowel intensity, vocal pitch, context, and articulation to rate nasality. None of these variables are included in measures of nasalance. Additional factors such as oral and nasal impedance and respiratory effort may also play an important role in the perception of nasality. In the present study, sustained vowel samples were used, and therefore, with the exception of a slight rise and fall in the fundamental frequency contour, no suprasegmental variables were varied. Additional factors were also held constant. Further modeling research examining connected speech samples and different contexts, as well as varied speaker characteristics, could be used to investigate this further.
In addition, spectral analysis of the stimuli may allow for a more complete exploration of the relation between nasality and the acoustic characteristics of nasalized vowels.
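The difference between full-bandwidth and limited-bandwidth nasalance discussed above can be sketched as follows. This is a minimal illustration, not the procedure used in the present study or by the Nasometer: it assumes nasalance is computed as the nasal RMS amplitude divided by the sum of nasal and oral RMS amplitudes, times 100, and it approximates the 500 ± 150 Hz band with an ideal frequency-domain band (350 to 650 Hz). The function names and the synthetic signals are hypothetical.

```python
import cmath
import math


def band_rms(signal, fs, band=None):
    """RMS of a real signal, optionally restricted to a (low, high) band in Hz.

    The band-limited value is obtained from a direct DFT via Parseval's
    relation, so it behaves like an ideal band-pass filter.
    """
    n = len(signal)
    if band is None:
        return math.sqrt(sum(v * v for v in signal) / n)
    low, high = band
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if low <= freq <= high:
            coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                        for i, v in enumerate(signal))
            # DC and Nyquist bins occur once; all others have a mirror image.
            weight = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
            power += weight * abs(coeff) ** 2
    return math.sqrt(power) / n


def nasalance(oral, nasal, fs, band=None):
    """Nasalance in percent: 100 * nasal / (nasal + oral), using RMS amplitude."""
    nasal_rms = band_rms(nasal, fs, band)
    oral_rms = band_rms(oral, fs, band)
    return 100.0 * nasal_rms / (nasal_rms + oral_rms)


# Hypothetical test signals (not study data): the oral channel carries energy
# at 500 Hz and 900 Hz, the nasal channel only at 500 Hz.
fs = 2000
n = 200
t = [i / fs for i in range(n)]
oral = [math.sin(2 * math.pi * 500 * x) + math.sin(2 * math.pi * 900 * x) for x in t]
nasal = [0.5 * math.sin(2 * math.pi * 500 * x) for x in t]

full_band = nasalance(oral, nasal, fs)
limited_band = nasalance(oral, nasal, fs, band=(350.0, 650.0))
print(full_band, limited_band)
```

In this constructed case the limited-bandwidth score is higher than the full-bandwidth score because the oral energy at 900 Hz is excluded from the denominator. Whether the two measures diverge for real or simulated vowels depends on how oral and nasal energy are distributed across frequency.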
Nasal Port Area
The nasal port areas included in the present study, 0 to 0.05 cm2, were limited compared with reports of previous investigations of nasality and degree of velopharyngeal impairment. Our initial modeling included nasal port areas up to 0.2 cm2, but it was found that nasalance values for both bandwidths reached a plateau just above 0.05 cm2. Although it is not known whether nasality ratings would plateau at the same point, this was assumed a priori so that the number of samples used in the listening experiment was not excessive. Given that listeners did not use the full range of the perceptual scale during the rating task, however, it is possible that ratings could continue to increase with larger port areas even though the acoustic measure of nasalance was found to be essentially flat. It is also of clinical relevance to know whether the plateau in nasalance is an artifact of static vowel production or whether it would also occur when nasalance is calculated over the duration of a phrase or sentence, as is often done during a clinical assessment. Warren (1979) reported that for speakers with velopharyngeal openings from 0.05 to 0.1 cm2, listeners do perceive speech as hypernasal; however, essentially normal aerodynamic patterns are seen in connected speech. For velopharyngeal openings greater than 0.1 cm2, pressure and airflow patterns are different from normal. Velopharyngeal opening sizes were based on the pressure-flow technique proposed by Warren and DuBois (1964). Differences such as decreased intraoral pressure (Dalston et al., 1988), increased respiratory effort (Warren, 1967), and excessive nasal emission (Warren et al., 1989) have been reported. Despite these observations, Warren et al. (1994) found only moderate correlations between hypernasality ratings and velopharyngeal orifice size. They hypothesized that timing of velopharyngeal closing may have had a significant effect on perceived nasality in connected speech. This was not investigated in the present study.
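For readers unfamiliar with the pressure-flow technique cited above, the orifice-area estimate is commonly written as a hydrokinetic equation: area equals nasal airflow divided by k times the square root of twice the oral-nasal pressure difference over air density. The sketch below assumes that commonly quoted form, with a correction factor k of 0.65 and an air density of 0.001 g/cm3; these constants, the unit conversion, and the function name are assumptions made for illustration and are not taken from the present study.

```python
import math

# Assumed constants (not from the present study).
CM_H2O_TO_DYN_PER_CM2 = 980.665   # pressure of 1 cm of water in dyn/cm^2
AIR_DENSITY_G_PER_CM3 = 0.001     # commonly quoted density of air
FLOW_COEFFICIENT = 0.65           # commonly quoted correction factor k


def orifice_area_cm2(nasal_flow_ml_per_s, pressure_drop_cm_h2o,
                     k=FLOW_COEFFICIENT, rho=AIR_DENSITY_G_PER_CM3):
    """Estimate orifice area (cm^2) from nasal airflow and differential pressure.

    area = flow / (k * sqrt(2 * delta_p / rho)), with flow in cm^3/s and
    delta_p converted from cm H2O to dyn/cm^2.
    """
    if pressure_drop_cm_h2o <= 0:
        raise ValueError("pressure drop must be positive")
    delta_p = pressure_drop_cm_h2o * CM_H2O_TO_DYN_PER_CM2
    return nasal_flow_ml_per_s / (k * math.sqrt(2.0 * delta_p / rho))


# Hypothetical measurement: 100 mL/s of nasal airflow at a 5 cm H2O pressure drop.
area = orifice_area_cm2(100.0, 5.0)
print(round(area, 3))
```

Because the estimate scales linearly with airflow and with the inverse square root of the pressure difference, the same area can arise from quite different combinations of flow and pressure.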
Conclusions
Results of the present study show a high correlation between ratings of nasality and measures of nasalance for nasal port areas ranging from 0 to 0.5 cm2. The correlations were based on sustained vowel samples. The restricted speech sample limits generalization of the findings; however, the results are a demonstration of the usefulness of modeling to understand the perceptual phenomena of nasality. It is known that nasality can be affected by the ratio of oral and nasal cavity impedances, and this impedance relationship changes with effort, articulatory configuration, size of the vocal tract, relative degree of oral-nasal coupling, and timing of closure, as well as other suprasegmental variables. Further studies in which structural or kinematic variables can be manipulated independently in a computational model would be useful for investigating how changes in speech production behaviors, both in isolation and in combination, ultimately affect the perception of speech.
Acknowledgments
This work was supported by NIH/NIDCD R01 DC004789.
References
- Anderson RT. Nasometric values for normal Spanish-speaking females: a preliminary report. Cleft Palate J. 1996;33:333–336. doi: 10.1597/1545-1569_1996_033_0333_nvfnss_2.3.co_2.
- Brancamp TU, Lewis KE, Watterson T. The relationship between nasalance scores and nasality ratings obtained with equal appearing interval and direct magnitude estimation scaling methods. Cleft Palate Craniofac J. 2010;47:631–637. doi: 10.1597/09-106.
- Carney P, Sherman D. Severity of nasality in three selected speech tasks. J Speech Hear Res. 1971;14:396–407. doi: 10.1044/jshr.1402.396.
- Dalston RM, Neiman GS, Gonzalez-Landa G. Nasometric sensitivity and specificity: a cross-dialect and cross-culture study. Cleft Palate Craniofac J. 1993;30:285–291. doi: 10.1597/1545-1569_1993_030_0285_nsasac_2.3.co_2.
- Dalston RM, Warren DW, Dalston E. The use of nasometry as a diagnostic tool for identifying patients with velopharyngeal impairment. Cleft Palate Craniofac J. 1991;28:184–189. doi: 10.1597/1545-1569_1991_028_0184_uonaad_2.3.co_2.
- Dalston RM, Warren DW, Morr KE, Smith LR. Intraoral pressure and its relationship to velopharyngeal inadequacy. Cleft Palate J. 1988;25:210–219.
- Fant G. Acoustic Theory of Speech Production. The Hague: Mouton; 1961.
- Fletcher SG. Theory and instrumentation for quantitative measurement of nasality. Cleft Palate J. 1970;7:601–609.
- Fletcher SG, Adams LE, McCutcheon JJ. Cleft palate speech assessment through oral nasal acoustic measures. In: Bzoch KR, editor. Communicative Disorders Related to Cleft Lip and Palate. Boston: Little Brown; 1989. pp. 246–257.
- Fujimura O. Spectra of nasalized vowels. MIT Res Lab Electronic Q Prog Rep. 1960;58:214–218.
- Fujimura O. Analysis of nasalized vowels. MIT Res Lab Electronic Q Prog Rep. 1961;6:191–192.
- Fujimura O, Lindqvist J. Sweep-tone measurement of vocal tract characteristics. J Acoust Soc Am. 1971;49:541–558. doi: 10.1121/1.1912385.
- Hardin MA, Van Demark DR, Morris HL, Payne MM. Correspondence between nasalance scores and listener judgments of hypernasality. Cleft Palate J. 1992;29:346–351. doi: 10.1597/1545-1569_1992_029_0346_cbnsal_2.3.co_2.
- Hawkins S, Stevens KN. A cross-language study of the perception of nasalized vowels. J Acoust Soc Am. 1983;73(suppl 1):S54.
- Hess DA. Pitch, intensity, and cleft palate voice quality. J Speech Hear Res. 1959;2:113–125. doi: 10.1044/jshr.0202.113.
- Hillenbrand J, Gayvert RT. Open source software for experiment design and control. J Speech Hear Res. 2005;48:45–60. doi: 10.1044/1092-4388(2005/005).
- House AS, Stevens KN. Analog study of the nasalization of vowels. J Speech Hear Dis. 1956;21:218–232. doi: 10.1044/jshd.2102.218.
- Kent RD. Hearing and believing: some limits to the auditory-perceptual assessment of speech and voice disorders. Am J Speech Lang Pathol. 1996;5:7–23.
- Kummer AW, Briggs M, Lee L. The relationship between the characteristics of speech and velopharyngeal gap size. Cleft Palate Craniofac J. 2003;40:590–596. doi: 10.1597/1545-1569_2003_040_0590_trbtco_2.0.co_2.
- Laczi E, Sussman J, Stathopoulos E, Huber J. Perceptual evaluation of hypernasality compared to HONC measures: the role of experience. Cleft Palate Craniofac J. 2005;42:203–211. doi: 10.1597/03-011.1.
- Lewis KE, Watterson T, Quint T. The effect of vowels on nasalance scores. Cleft Palate Craniofac J. 2000;37:584–589. doi: 10.1597/1545-1569_2000_037_0584_teovon_2.0.co_2.
- Liljencrants J. Speech Synthesis With a Reflection-Type Line Analog [dissertation]. Stockholm: Royal Institute of Technology; 1985.
- Lintz LB, Sherman D. Phonetic elements and perception of nasality. J Speech Hear Res. 1961;4:381–396. doi: 10.1044/jshr.0404.381.
- Lonchamp F. Analyse acoustique des voyelles nasales françaises. Verbum. 1979;2:9–54.
- MacKay IR, Kummer AW. Simplified Nasometric Assessment Procedures. Lincoln Park, NJ: Kay Elemetrics; 1994.
- McDonald ET, Koepp-Baker H. Cleft palate speech: an integration of research and clinical observation. J Speech Hear Dis. 1951;16:9–20. doi: 10.1044/jshd.1601.09.
- Moore WH, Sommers RK. Phonetic contexts: their effects on perceived nasality in cleft palate speakers. Cleft Palate J. 1973;10:72–83.
- Nellis JL, Neiman GS, Lehman JA. Comparison of Nasometer and listener judgments of nasality in the assessment of velopharyngeal function after pharyngeal flap surgery. Cleft Palate J. 1992;29:157–163. doi: 10.1597/1545-1569_1992_029_0157_conalj_2.3.co_2.
- Pereira V, Sell D, Ponniah A, Evans R, Dunaway D. Midface osteotomy versus distraction: the effect on speech, nasality, and velopharyngeal function in craniofacial dysostosis. Cleft Palate Craniofac J. 2008;45:353–363. doi: 10.1597/07-042.1.
- Sherman D. The merits of backward playing of connected speech. J Speech Hear Dis. 1954;19:312–321. doi: 10.1044/jshd.1903.312.
- Shrivastav R, Sapienza C. Application of psychometric theory to the measurement of voice quality using rating scales. J Speech Hear Res. 2005;48:323–335. doi: 10.1044/1092-4388(2005/022).
- Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull. 1979;86:420–428. doi: 10.1037//0033-2909.86.2.420.
- Spriestersbach DC, Powers GR. Nasality in isolated vowels and connected speech of cleft palate speakers. J Speech Hear Res. 1959;2:40–45. doi: 10.1044/jshr.0201.40.
- Stelck EH, Boliek CA, Hagler PH, Rieger JM. Current practices for evaluation of resonance disorders in North America. Semin Speech Lang. 2011;32:58–68. doi: 10.1055/s-0031-1271975.
- Story BH. Physiologically Based Speech Simulation Using an Enhanced Wave Reflection Model of the Vocal Tract [dissertation]. Iowa City, IA: University of Iowa; 1995.
- Story BH. Comparison of magnetic resonance imaging-based vocal tract area functions obtained from the same speaker in 1994 and 2002. J Acoust Soc Am. 2008;123:327–335. doi: 10.1121/1.2805683.
- Tachimura T, Kotani Y, Wada T. Nasalance scores in wearers of palatal lift prosthesis in comparison with normative data for Japanese. Cleft Palate Craniofac J. 2004;41:315–319. doi: 10.1597/02-107.1.
- Titze IR. Parameterization of the glottal area, glottal flow, and vocal fold contact area. J Acoust Soc Am. 1984;75:570–580. doi: 10.1121/1.390530.
- Titze IR. Regulating glottal airflow in phonation: application of the maximum power transfer theorem to a low dimensional phonation model. J Acoust Soc Am. 2002;111:367–376. doi: 10.1121/1.1417526.
- Titze IR. The Myoelastic Aerodynamic Theory of Phonation. Iowa City, IA: National Center for Voice and Speech; 2006. pp. 197–214.
- Warren DW. Nasal emission of air and velopharyngeal function. Cleft Palate J. 1967;4:148–155.
- Warren DW. Perci: a method for rating palatal efficiency. Cleft Palate J. 1979;16:279–285.
- Warren DW, Dalston RM, Mayo R. Hypernasality and velopharyngeal impairment. Cleft Palate Craniofac J. 1994;31:257–262. doi: 10.1597/1545-1569_1994_031_0257_havi_2.3.co_2.
- Warren DW, Dalston RM, Morr KE, Hairfield WM, Smith LR. The speech regulating system: temporal and aerodynamic responses to velopharyngeal inadequacy. J Speech Hear Res. 1989;32:566–575.
- Warren DW, DuBois A. A pressure-flow technique for measuring velopharyngeal orifice area during continuous speech. Cleft Palate J. 1964;1:52–71.
- Watterson T, Emanuel F. Effects of oral-nasal coupling on whispered vowel spectra. Cleft Palate J. 1981;18:24–38.
- Watterson T, McFarlane SC, Wright DS. The relationship between nasalance and nasality in children with cleft palate. J Commun Disord. 1993;26:13–28. doi: 10.1016/0021-9924(93)90013-z.
- Whitehill TL, Lee ASY, Chun JC. Direct magnitude estimation and interval scaling of hypernasality. J Speech Hear Res. 2002;45:80–88. doi: 10.1044/1092-4388(2002/006).
- Zemann W, Feichtinger M, Santler G, Karcher H. Effects of Le Fort I osteotomy on nasalance scores [in German]. Mund Kiefer Gesichtschir. 2006;10:221–228. doi: 10.1007/s10006-006-0001-0.
- Zraick RI, Liss JM. A comparison of equal-appearing interval scaling and direct magnitude estimation of nasal voice quality. J Speech Hear Res. 2000;43:979–988. doi: 10.1044/jslhr.4304.979.