Abstract
Purpose
Video games provide a promising platform for rehabilitation of speech disorders. Although video games have been used to train speech perception in foreign language learners and have been proposed for aural rehabilitation, their use in speech therapy has been limited thus far. We present feasibility results from at-home use in a case series of children with velopharyngeal dysfunction (VPD) using an interactive video game that provided real-time biofeedback to facilitate appropriate nasalization.
Method
Five participants were recruited across a range of ages, VPD severities, and VPD etiologies. Participants completed multiple weeks of individual game play with a video game that provides feedback on nasalization measured via nasal accelerometry. Nasalization was assessed before and after training by using nasometry, aerodynamic measures, and expert perceptual judgments.
Results
Four participants used the game at home or school; the fifth was unwilling to have the nasal accelerometer secured to his nasal skin, perhaps due to his young age. The four participants who used the game showed a tendency toward decreased nasalization after training, particularly for the words explicitly trained in the video game.
Conclusion
Results suggest that video game–based systems may be a useful rehabilitation platform for delivering real-time feedback of speech nasalization in VPD.
This special issue contains selected papers from the March 2016 Conference on Motor Speech held in Newport Beach, CA.
Home rehabilitation programs have been proposed to increase therapy intensity (Novak, Cusick, & Lannin, 2009), particularly as high therapy intensity has been shown to improve treatment outcomes in a variety of speech disorders, including velopharyngeal dysfunction (VPD; Albery & Enderby, 1984). The interactive and engaging nature of video games makes them a promising platform for at-home rehabilitation. Thus, video games have been used to augment therapy and increase the dosage of treatment for a wide array of speech, language, and hearing purposes, including to train speech perception in foreign language learners (Lim & Holt, 2011), to give feedback about vowel content (Tan, Johnston, Ballard, Ferguson, & Perera-Schulz, 2013; Tan, Johnston, Bluff, Ferguson, & Ballard, 2014a, 2014b), to rehabilitate voice disorders (King, Davis, Lehman, & Ruddy, 2012; Lv, Esteve, Chirivella, & Gagliardo, 2015a, 2015b), and for aural rehabilitation (Cano, Peñeñory, Collazos, Fardoun, & Alghazzawi, 2015; Loaiza et al., 2013; Navarro-Newball et al., 2014; Whitton, Hancock, & Polley, 2014). Here, we present results from a case series in children with resonance disorders who used a video game at home to facilitate appropriate nasalization.
VPD is a resonance disorder that can result from lack of appropriate closure and/or opening between the nasal and oral cavities, the velopharyngeal port, during speech tasks. VPD can be structural, neurological, or functional in origin. Regardless of etiology, VPD can result in hyper- and/or hyponasality, as well as nasal air emission (NAE; occurs when air leaks from the oral cavity into the nasal cavity at times when the port should be tightly closed). VPD symptoms can negatively affect intelligibility (Hodge & Gotzke, 2007) and listener impressions (Addington, 1968; Blood, Blood, & Danhauer, 1978; Blood & Hyman, 1977; Blood, Mahan, & Hyman, 1979; Lallh & Rochet, 2000; Rieger et al., 2006).
The assessment of the consequences of VPD (hypernasality, hyponasality, and NAE) is complex, as the acoustic markers of nasalization are subtle. Although auditory-perceptual judgments of nasality are commonly used clinically, nasality ratings show low inter- and intrarater reliability (42%–62%; Brunnegård, Lohmander, & van Doorn, 2012). The perception of nasality is well correlated with the acoustic measure nasalance (e.g., Brancamp, Lewis, & Watterson, 2010; Brunnegård et al., 2012). The instrumentation used to measure nasalance, called a nasometer, was developed by Fletcher (1970) and is marketed most prominently by PENTAX Medical (Montvale, NJ). Measuring nasalance involves a large headset with a baffle placed on the philtrum, with one microphone placed on the nasal side of the baffle and a second placed on the oral side. Nasalance is calculated as the acoustic energy of the sound from the nasal microphone divided by the total acoustic energy of both microphones and, as such, is recorded as a percentage from 0 to 100. However, total nasalance scores increase when speech samples are known to include NAE (Karnell, 1995), leading to a reduction in the correlation between nasalance and nasality (Dalston, Warren, & Dalston, 1991). Thus, separate measures of nasalance, the perception of nasality, and NAE provide complementary rather than redundant information.
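The nasalance calculation described above can be written compactly. The symbols $E_{\text{nasal}}$ and $E_{\text{oral}}$ are shorthand introduced here (not notation from the cited sources) for the acoustic energy recorded at the nasal and oral microphones:

```latex
\[
\text{nasalance} \;=\; 100 \times \frac{E_{\text{nasal}}}{E_{\text{nasal}} + E_{\text{oral}}}
\]
```

Under this definition, a sound with all energy at the oral microphone scores 0, and a sound with all energy at the nasal microphone scores 100.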
When VPD is caused by structural or neurological anomalies, rehabilitation usually involves surgical or prosthetic intervention. However, when VPD is functional in origin either due to hearing impairment or otherwise, speech therapy approaches may be effective in remediating the aberrant resonance (Kummer, 2008). There is no standard protocol for speech therapy for VPD (Yorkston et al., 2001), but approaches often rely on providing the client with either enhanced auditory or visual feedback. Enhanced auditory feedback includes cues from the clinician on the basis of his or her auditory perception, as well as devices to magnify auditory-perceptual cues of the speaker. Enhanced visual feedback includes observation of fogging on a cold mirror held under the nose, real-time nasalance calculations with the nasometer (Fletcher & Higgins, 1980; Steinhauer & Grayhack, 2000), and endoscopy (Brunner, Stellzig-Eisenhauer, Proschel, Verres, & Komposch, 2005; Van Lierde, Claeys, De Bodt, & Van Cauwenberge, 2004; Witzel, Tobe, & Salyer, 1988; Yamaoka, Matsuya, Miyazaki, Nishio, & Ibuki, 1983; Ysunza, Pamplona, Femat, Mayer, & García-Velasco, 1997). Although these approaches have shown success, they often rely on auditory perception (which is particularly unreliable for nasalization), are invasive or uncomfortable, or are expensive or otherwise inappropriate for home use.
On the other hand, a nasal accelerometer can measure skin acceleration caused by sound and air traveling through the nasal passages. The Horii Oral-Nasal Coupling Index (HONC; Horii, 1980) quantifies nasalization as the ratio of nasal acceleration to total acoustic output captured by a microphone. This ratio correlates with perceived nasality in healthy adults (Laczi, Sussman, Stathopoulos, & Huber, 2005) and children with VPD (Laczi et al., 2005; Redenbaugh & Reich, 1985). Recent work has led to filtered HONC, in which the accelerometer and microphone signals are bandpass filtered, allowing for comparison of HONC values across participants and vowels (Thorp, Virnik, & Stepp, 2013; Varghese, Mendoza, Braden, & Stepp, 2014).
We have implemented an algorithmic estimation of nasalization via filtered HONC; feedback on the basis of these estimates has been shown to effect changes in nasalization in healthy speakers over two sessions (Heller Murray, Mendoza, Gill, Perkell, & Stepp, 2016). This same algorithm has been implemented in a video game format, which has been shown to be usable by children in the laboratory in one session (Cler, Voysey, & Stepp, 2015). In this study, we examine the feasibility of video game–based HONC feedback for children with VPD during at-home or at-school use over many sessions.
Method
This study used a video game–based therapy format for children with VPD, which provided interactive feedback regarding speech nasalization measured via nasal accelerometry. Nasalization was assessed before and after multiple weeks of individual game play training (see Table 1). Measures included nasometry, auditory perception, and, for two participants, nasal airflow.
Table 1.
Participant characteristics, training, and assessments.
| Identification | Age and gender | Concurrent therapyᵃ | Diagnosis | Game play | Assessments |
|---|---|---|---|---|---|
| P0 | 4M | Unknown | VPD (unknown etiology) | 0 sessions | Pretrainingᶜ |
| P1 | 9M | SLP | Craniofacial anomalies (postrepair) and VPD due to hearing impairment | Seven sessions over 3 months, one session per week | Pretraining and posttraining |
| P2 | 15M | SLP for language | VPD and concurrent language and social impairments (unknown etiology) | 20 sessions over 4 months, one to six sessions per week | Pretraining and partial posttrainingᶜ |
| P3 | 9M | Social issues | VPD (unknown etiology) | 80 sessions over 5 months, three to six sessions per week | Pretrainingᵇ and posttrainingᵇ |
| P4 | 9M | Unknown | No diagnosis | 53 sessions over 3 months, three to six sessions per week | Pretrainingᵇ and posttrainingᵇ |
Note. P = participant; M = male; VPD = velopharyngeal dysfunction; SLP = speech-language pathologist.
ᵃ Self-reported; unrelated to current study.
ᵇ Assessment included additional aerodynamic measures.
ᶜ Discontinued participation before full posttraining assessment could be collected.
Participants
Five male pediatric participants, aged 4–15 years, were recruited for the study. Participants were referred by expert speech-language pathologists (SLPs) at Boston University's Academic Speech, Language & Hearing Center, Boston Children's Hospital, and The Learning Center for the Deaf. Participants were eligible if they had a diagnosis of VPD with no known structural anomalies and could comply with the protocol.
Participant 1 (P1) was a 9-year-old male child who was postrepair of craniofacial anomalies and had VPD due to a hearing impairment. He completed training during his weekly speech therapy sessions. A 15-year-old male, Participant 2 (P2), exhibited VPD of unknown etiology and concurrent language and social impairments and completed video game training at home; he discontinued his participation before a full posttest assessment could be completed. Two 9-year-old male identical twins, Participants 3 and 4 (P3 and P4), also participated. P3 had VPD with no known etiology, and P4 had not completed a full assessment for VPD but was noted by parents to be hypernasal. P3 and P4 completed video game training at home. In addition, Participant 0 (P0), a 4-year-old male child diagnosed with VPD, was initially accepted for participation and completed a subset of the pretraining assessment protocol. However, his parents withdrew him from the study, citing he was unable to comply due to his age, and no further assessment or analysis was completed. For each participant, a parent completed written consent in compliance with the Boston University Institutional Review Board. Participants aged 7–17 years completed verbal or written assent, as appropriate. Parents were compensated at $10/hr, and children were provided a small toy for each assessment session.
Data Collection
Participants completed pretraining and posttraining assessments of nasalization. As nasalization is difficult to assess, several complementary measures were collected: nasalance; expert auditory perception of hypernasality, hyponasality, and NAE; and a quantitative measure of NAE consisting of the nasal airflow during the hold periods of stops. Nasalance and speech acoustics were captured during two separate repetitions of the assessment materials. Assessment stimuli included the MacKay-Kummer SNAP-R Simplified Nasometric Assessment Procedures Test (SNAP test; MacKay & Kummer, 2005), with additional nasal and nonnasal consonant–vowel–consonants (CVCs). The SNAP test includes nasal and nonnasal stimuli, ranging from sustained sounds (/ɑ/, /i/, /m/) and strings of consonant–vowel tokens (/pɑpɑpɑpɑpɑ/, /nininini/) to short sentences (“Pick up the book.” “Take a teddy.”) and reading passages loaded with bilabial plosives (“Bobby and Billy play ball.”) and sibilant fricatives (“Suzy eats cereal or toast for breakfast.”). The SNAP test is normed on children, ranging in age from 3 to 9 years, with no apparent speech or language problems (MacKay & Kummer, 2005).
Additional stimuli consisted of 12 trained CVCs that were used as stimuli in the video game (man, bag, mean, bead, nine, guide, nun, bug, mom, dog, noon, and dude) and 16 untrained CVCs that were not included in the video game (/mɑm/, /bɑb/, /nɑn/, /dɑd/, /mæm/, /bæb/, /næn/, /dæd/, /mim/, /bib/, /nin/, /did/, /mum/, /bub/, /nun/, /dud/). During pretraining and posttraining assessments for P3 and P4, an additional measurement of nasal airflow was conducted to detect NAE during the hold periods of the stop consonants of CVCVCV strings (e.g., /pɑpɑpɑpɑpɑpɑ/).
Speech Samples
Speech acoustics were measured by using a standard headset microphone (WH20; Shure, Niles, IL) placed approximately 6–10 cm from the mouth at a 45° angle from the midline. Samples were recorded by using Audacity software (Audacity Team, 2014) and an external soundcard (either Native Instruments Komplete Audio 6 Interface, Native Instruments GmbH, Berlin, Germany, or UltraLite-mk3 Hybrid, MOTU Audio, Cambridge, MA) at 44100 Hz.
Nasalance
Nasalance was measured by using a Nasometer II (model 6450; PENTAX Medical, Montvale, NJ). Elements of the SNAP test were produced, and nasalance was measured per utterance (e.g., /m/, “Pick up the pie”; MacKay & Kummer, 2005). Trained and untrained CVCs were evaluated with one nasalance value over the entire three repetitions of each word.
Nasal Airflow During Hold Periods
Nasal airflow was measured by using a Phonatory Aerodynamic System (model 6600; PENTAX Medical, Montvale, NJ). Recordings were obtained for utterance strings with stop consonants /p/, /t/, and /k/ and low and high vowels (e.g., /pɑpɑpɑpɑpɑpɑ/, /kikikikikiki/). When recording, the bottom of the mask was placed on the participant's philtrum, so only airflow from the nose was captured. The airflow was captured at 200 Hz and the microphone embedded in the Phonatory Aerodynamic System captured audio at 22050 Hz. The microphone signal was viewed in Praat (Boersma & Weenink, 2014) as a waveform and as a spectrogram to identify the burst of each stop consonant. The airflow for each CV was then calculated in liters per second over the hold period of the consonant (i.e., starting at the end of voicing of the previous vowel and ending at the burst of the consonant) to capture any NAE.
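The hold-period airflow calculation described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function name, the hand-marked event times, and the example trace are all hypothetical, and only the 200 Hz sampling rate comes from the text.

```python
from statistics import mean

AIRFLOW_RATE_HZ = 200  # airflow sampling rate reported in the text


def mean_hold_airflow(airflow_lps, voicing_end_s, burst_s, rate_hz=AIRFLOW_RATE_HZ):
    """Mean nasal airflow (L/s) from the end of voicing to the stop burst.

    `voicing_end_s` and `burst_s` are event times in seconds, assumed to
    have been marked by hand (e.g., from a waveform and spectrogram).
    """
    start = round(voicing_end_s * rate_hz)
    stop = round(burst_s * rate_hz)
    if not 0 <= start < stop <= len(airflow_lps):
        raise ValueError("hold period must lie inside the recording")
    return mean(airflow_lps[start:stop])


# Hypothetical 0.5 s trace: zero flow except for a leak during the hold period.
trace = [0.0] * 40 + [0.03] * 20 + [0.0] * 40
hold_flow = mean_hold_airflow(trace, voicing_end_s=0.20, burst_s=0.30)

# Compare against the 0.02 L/s level cited later in the text for moderate NAE.
print(hold_flow, hold_flow >= 0.02)
```

Averaging only within the hold period excludes airflow during the vowels, where some nasal flow is not necessarily a sign of NAE.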
Auditory Perceptual Analysis
Perceptual ratings of hypernasality, hyponasality, and NAE were collected through an online survey. Two certified SLPs specializing in resonance disorders were recruited for the perceptual study. The two raters completed written consent in compliance with the Boston University Institutional Review Board. They received the perceptual survey via a confidential e-mail link and completed the study remotely by using their personal computers and headphones.
Speech samples were obtained from microphone signals recorded during the participants' pre- and posttraining assessments for the three participants who fully completed both assessments. There were a total of 12 samples, with each sample containing a sequence of nasal and nonnasal CVCs (three participants × pre-/post- × trained/untrained CVCs). The untrained CVCs (/mɑm/, /bɑb/, /nɑn/, /dɑd/, /mæm/, /bæb/, /næn/, /dæd/, /mim/, /bib/, /nin/, /did/, /mum/, /bub/, /nun/, /dud/) were presented as spoken (i.e., /mɑm mɑm mɑm/, /bɑb bɑb bɑb/, /nɑn nɑn nɑn/). The trained CVCs were presented similarly, with three repetitions in a row of alternating nasal and nonnasal CVCs (man man man, bag bag bag; followed by three repetitions each of mean, bead, nine, guide, nun, bug, mom, dog, noon, and dude). Sequences were the same for each child and always alternated a sequence of three nasal word repetitions with three nonnasal word repetitions. Full sets were presented for listeners in order for them to hear the differentiation between nasals and nonnasals that each child produced.
The speech samples were presented to raters via an online survey (SurveyGizmo 2016, Boulder, CO). The survey was made up of three modules, each consisting of the same set of 12 speech samples in randomized order, with three samples repeated to measure intrarater reliability. Raters were instructed to rate one percept per module (hyponasality, hypernasality, and NAE) and were instructed to disregard any other speech or resonance abnormalities in the samples. Raters used a Likert scale for each sample (1 = normal speech, 5 = severe deviation/occurring always or close to always). Intra- and interrater reliability were measured via Spearman correlation. Perceptual raters showed intrarater reliabilities of ρ = .60 and ρ = .66, with an interrater reliability of ρ = .37.
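For readers unfamiliar with the reliability statistic used above, the sketch below computes a Spearman correlation as the Pearson correlation of tie-averaged ranks, which suits Likert ratings where ties are common. It is a pure-Python illustration under that standard definition; the function names and the example ratings are hypothetical and are not data from this study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks.

    Undefined (division by zero) if either rater gives every sample the
    same rating.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical 1-5 Likert ratings of the same samples by two raters.
rater_a = [1, 2, 2, 3, 4, 5, 3, 1]
rater_b = [1, 3, 2, 3, 5, 4, 2, 2]
print(spearman_rho(rater_a, rater_b))
```

The same function serves for intrarater reliability by passing one rater's first and repeated ratings of the same samples.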
Video Game Training
Hardware
During game play, acoustic signals were recorded with a WH20 XLR microphone (Shure Incorporated, Niles, IL), and nasal skin vibration was recorded with a Hot Spot accelerometer (K&K Sound, Coos Bay, OR) attached to the nasal skin with double-sided tape (see top of Figure 1). Both signals were preamplified and digitized via a USB Dual Pre external sound card (ART ProAudio, Niagara Falls, NY) at a sampling rate of 44.1 kHz using custom C# code.
Figure 1.
Top: Hardware needed for the game, including nasal accelerometer, microphone, soundcard, and computer. Bottom: Example screenshot from the game. Player is prompted with a nonnasal token and given feedback to lower nasalization relative to an individualized threshold.
Game Play
Users were prompted to repeat nasal and nonnasal stimuli. Filtered HONC scores were automatically calculated over the center vowel of each word (see full details in Cler et al., 2015). In brief, the signals are filtered, the root-mean-square of the accelerometer signal is divided by the root-mean-square of the total acoustic output signal, and this score is then normalized by the same ratio calculated over an /m/. During an initial no-feedback stage, these scores were used to set nasal and nonnasal targets such that the user produced speech within the nasal and nonnasal targets 70% of the time by using their typical speech (Cler et al., 2015; Macmillan & Creelman, 1991); this percentage ensured that participants continued to be challenged while providing enough success to minimize frustration. Subsequent feedback stages prompted participants to increase their nasalization on nasal stimuli and decrease their nasalization on nonnasal stimuli (see bottom of Figure 1 and Supplemental Material S1). The no-feedback stage was required once per session; thus, targets were recalculated each day. Players were instructed to complete as many iterations of the feedback stage as they wished.
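The score calculation summarized above can be sketched in a few lines. This is an illustration only, not the game's implementation (which the text says was written in custom C#). It assumes the accelerometer and microphone segments have already been bandpass filtered, because the filter settings are not given in this section; all function names and example values are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def honc_ratio(accel, mic):
    """Ratio of accelerometer RMS to microphone RMS over one segment.

    Both inputs are assumed to be already bandpass filtered.
    """
    return rms(accel) / rms(mic)


def normalized_honc(vowel_accel, vowel_mic, m_accel, m_mic):
    """Vowel ratio divided by the same ratio measured over an /m/."""
    return honc_ratio(vowel_accel, vowel_mic) / honc_ratio(m_accel, m_mic)


# Hypothetical segments: the /m/ reference has strong nasal vibration,
# while the vowel of a nonnasal word has much less.
m_accel, m_mic = [0.5, -0.5] * 4, [1.0, -1.0] * 4
vowel_accel, vowel_mic = [0.1, -0.1] * 4, [1.0, -1.0] * 4
score = normalized_honc(vowel_accel, vowel_mic, m_accel, m_mic)
print(round(score, 3))
```

Because the vowel ratio is divided by the speaker's own /m/ ratio, a score near 1 indicates vowel nasalization comparable to that speaker's nasal consonant, and lower scores indicate less nasalization.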
Users play the game as a ninja fighting evil robots by using just their speech. Game play for the feedback and no-feedback stages follows the same sequence: a robot appears, and the user is prompted to repeat one of the stimulus CVCs (e.g., mom, nun, bag, bead) three times (see bottom of Figure 1 and Supplemental Material S1). During the feedback stages, smiley or frowny faces appear after each repetition to indicate whether the production was within the target nasalization range. After the three repetitions, the ninja performs one action (jumps, ducks, swings a sword, throws a fireball, or says “ouch”). During the feedback stages, the action is related to how many of the three productions were within the target nasalization region. During the no-feedback stage, one of the positive actions is chosen pseudorandomly to be displayed.
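One way to picture the target setting and per-trial feedback is sketched below. The exact calibration procedure is given in Cler et al. (2015), not here, so the percentile rule used in this sketch is an assumption chosen only to reproduce the stated 70% baseline success rate. The function names and baseline scores are hypothetical, and the mapping from the in-target count to a specific ninja action is deliberately left out because the text does not specify it.

```python
import math


def nonnasal_ceiling(baseline_scores, success_percent=70):
    """Lowest ceiling that `success_percent` of baseline scores fall at or below."""
    ordered = sorted(baseline_scores)
    k = math.ceil(success_percent * len(ordered) / 100)
    return ordered[k - 1]


def nasal_floor(baseline_scores, success_percent=70):
    """Highest floor that `success_percent` of baseline scores fall at or above."""
    ordered = sorted(baseline_scores, reverse=True)
    k = math.ceil(success_percent * len(ordered) / 100)
    return ordered[k - 1]


def in_target_count(scores, threshold, nasal):
    """Number of productions inside the target region for one prompt."""
    if nasal:
        return sum(s >= threshold for s in scores)
    return sum(s <= threshold for s in scores)


# Hypothetical no-feedback (baseline) scores from one session.
nonnasal_baseline = [0.10, 0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.30, 0.35, 0.40]
nasal_baseline = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]

ceiling = nonnasal_ceiling(nonnasal_baseline)
floor = nasal_floor(nasal_baseline)

# Three repetitions of a nonnasal word during a feedback stage.
hits = in_target_count([0.20, 0.30, 0.24], ceiling, nasal=False)
print(ceiling, floor, hits)
```

Recomputing the thresholds from each session's baseline, as the text describes, means the targets track the speaker's current typical speech rather than a fixed norm.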
Player and Guardian Instruction
Participants played the video game under researcher supervision in the laboratory after their pretest assessment. During this time, guardians (SLP for P1 and parents of P2, P3, and P4) and children were instructed in how to set up the equipment (laboratory-provided laptop, sound card, microphone, and accelerometer) and how to properly position the microphone and accelerometer. Guardians and children were given an instructional booklet with setup instructions and common troubleshooting steps. Participants were instructed to call or e-mail the laboratory whenever needed. One participant called for setup assistance in the first few weeks related to computer settings; the rest did not need further assistance. P1 agreed to play the game once per week under his SLP's instruction. The remaining participants agreed to play the game three to five times per week and were instructed that the more they could practice, the better.
Results
Four of the five participants used the video game outside of the laboratory. Compliance, age, and VPD etiology were variable among the participants (see Table 1), and results for those four participants are thus reported individually. Results are reported in terms of overall compliance, changes in varied nasalization measures after training, and parent and child feedback.
Out-of-Laboratory Compliance
P1 used the game under the supervision of his SLP at school once per week over 3 months. He completed seven sessions, with 120 total productions with automated feedback (see Figure 2A). P2 used the game at home over 4 months, zero to six times a week, with variable compliance (see Figure 2B), with a total of 20 sessions and 1,080 productions with feedback. P3 used the game at home over 5 months, three to six times a week for a total of 80 sessions and 5,370 total productions (see Figure 2C). P4, the identical twin of P3, used the game at home over 3 months, three to six times a week, for a total of 53 sessions and 3,900 productions with feedback (see Figure 2D).
Figure 2.
Compliance for all participants in sessions per week over the length of the experiment. Dotted vertical lines indicate time of pre- and postassessment. Asterisks denote in-laboratory assessments (no data shown but may explain some of the variation in compliance leading up to and immediately following these check-ins). Note. P1 = Participant 1; P2 = Participant 2; P3 = Participant 3; P4 = Participant 4.
Changes in Nasalization Measures After Training
Due to the nature of this feasibility study, especially given the range of participants' ages, VPD etiology, severity, and sessions with the game, no statistical tests were completed, and the participants' results are reported individually.
Results for P1
Nasalance. During the pretraining assessment, P1's nasalance was above normal limits on many nonnasal tasks but within normal ranges for all nasal tasks. Figure 3, Panel P1-A shows that P1's nasalance on a subset of the nonnasal, untrained tasks decreased after training, often to within normal ranges. Figure 3, Panel P1-B shows reduced nasalance for both nasal and nonnasal trained CVCs in this participant.
Figure 3.
Pre- to posttraining changes in different measures of nasalization for all participants. Each row shows results for one participant. Column A shows nasalance of untrained MacKay-Kummer SNAP-R Simplified Nasometric Assessment Procedures Test components (sustained /ɑ/ in circles, average of nasalance over sequences of /pɑpɑpɑpɑpɑpɑ, tɑtɑtɑtɑtɑtɑ, kɑkɑkɑkɑkɑkɑ, sɑsɑsɑsɑsɑsɑ, and ʃɑʃɑʃɑʃɑʃɑʃɑ/ in squares, and the average over 12 sentences loaded with nonnasals in triangles) before and after training; error bars are standard error of the mean (SEM). Brackets in the lower left of each plot show norm ranges for each task. Column B shows nasalance of the consonant–vowel–consonants (CVCs) trained in the game, before and after training; feedback was given to increase nasalization of nasal tokens (red circles) and decrease nasalization of nonnasal tokens (blue squares). Means are the average nasalance over the six different CVCs, and error bars are SEM. Column C shows mean nasal airflow during hold periods of stop consonants before and after training; error bars are standard deviation. Column D shows average perceptual ratings (over two raters) of trained and untrained speech samples. Hypernasality ratings are in orange, hyponasality ratings in blue, and nasal air emission ratings in grey; ratings of trained CVCs are in solid lines on the left and untrained in dashed lines on the right. P1 = Participant 1; P2 = Participant 2; P3 = Participant 3; P4 = Participant 4.
Perceptual ratings. Changes in perceptual ratings before and after training (see Figure 3, Panel P1-D) were not as consistent as the nasalance results. The mean perceptual rating for hypernasality on the trained CVCs remained constant at 2.5 (between slight deviations/single occurrences and mild deviations/some occurrences) but decreased on untrained CVCs from a mean rating of 1.5 to 1 (normal speech). Although P1's results on the nasal portions of the SNAP test remained within normal ranges, perceptual ratings of hyponasality increased for trained CVCs (1.5 to 3) and remained constant (2.5) on untrained CVCs. P1 was not perceived as having NAE before or after training.
Results for P2
Nasalance. P2 was above normal limits on all of the nonnasal tasks of the SNAP test (+5–14 SDs) and was within normal limits on all nasal tasks (i.e., not hyponasal) during the pretraining assessment. P2 showed mixed results on the nonnasal untrained tasks (see Figure 3, Panel P2-A). Figure 3, Panel P2-B shows nasalance modulated in the correct directions for both the nasal and nonnasal trained CVCs. Although this participant completed the pretraining assessment and some video game training, he chose to discontinue his participation; thus, a full posttraining assessment was not completed. Therefore, no additional measures were collected or reported.
Results for P3
Nasalance. P3 was above clinical norms on all of the nonnasal tasks of the SNAP test (+5–13 SDs) during the pretraining assessment and was not hyponasal on any nasal tasks. After training, P3 showed decreased nasalance on nonnasal, untrained tasks (see Figure 3, Panel P3-A), as well as nasalance modulated in the correct directions on the trained nasal and nonnasal CVCs (see Figure 3, Panel P3-B).
Nasal airflow. Nasal airflow was also measured before and after training (see Figure 3, Panel P3-C). Before training, P3 had a mean nasal airflow during hold periods of 0.092 L/s; individuals with moderate NAE have nasal airflow of 0.02 L/s or greater (Dotevall, Lohmander-Agerskov, Ejnell, & Bake, 2002). After training, the nasal airflow of P3 was reduced to 0.0016 L/s.
Perceptual ratings. Perceived hypernasality on trained CVCs (see Figure 3, Panel P3-D) decreased slightly from 2 to 1.5, while untrained CVC ratings remained constant at 3. Perceived hyponasality remained low for both untrained (consistent at 1) and trained (1 to 1.5) CVCs. Perceived NAE was reduced in this participant in both trained (4 to 1) and untrained (3.5 to 1.5) CVCs.
Results for P4
Nasalance. P4, the identical twin of P3, was hypernasal on 85% of the nonnasal tasks (+1–12 SDs). After training, P4 showed reduced nasalance on some but not all of the untrained nonnasal tasks (see Figure 3, Panel P4-A) and showed appropriately reduced nasalance on the explicitly trained nonnasal CVCs (see Figure 3, Panel P4-B).
Nasal airflow. The nasal airflow of P4 decreased posttraining from 0.020 L/s to 0.0021 L/s (see Figure 3, Panel P4-C), consistent with a change from moderate NAE (airflow ≥ 0.02 L/s; Dotevall et al., 2002) to typical ranges.
Perceptual ratings. Perceptual ratings (see Figure 3, Panel P4-D) of hypernasality on trained CVCs decreased (3 to 1) but remained consistent for untrained CVCs (2). Perceived NAE decreased from 1.5 to 1 on trained CVCs.
Child and Guardian Feedback
Overall, the technology and game were well received by the participants and their guardians. Feedback was solicited during in-laboratory check-ins and via written logs from P3 and P4. The parent of the 4-year-old P0 indicated that she found the game rudimentary due to the graphics and hardware needed and requested that we recontact her when the game was in its final implementation. Many parents and others questioned the need for the external hardware and microphones and suggested that the game be modified to use just one microphone or be accessible on mobile devices. Two sensors (the microphone and the nasal accelerometer) are necessary, unfortunately, as the acoustic cues of nasalization are subtle and thus measuring with one microphone is unreliable at best. Many of the children noted that the nasal accelerometer was sometimes uncomfortable or annoying, so developments in this area would be beneficial. Despite the hardware requirements, P2 said, “It was easier to set up than I thought.”
Compliance issues differed across age groups. P0 (4 years old) could not continue due to being unwilling to wear the nasal accelerometer. P2 (15 years old) discontinued his participation before expected and stated that the game was geared too much to kids and took up too much time; therefore, he did not wish to play anymore. Although noted by caregivers, SLPs, and our nasalance assessments to be extremely hypernasal, P2 did not notice issues with his speech and was uninterested in working on it further. He asked during one session whether he would always need to think so hard about making his speech sound right or if it would become automatic, suggesting that he found the task difficult.
The participants were all inquisitive children interested in learning how to push the bounds of the game. P3 reported that he discovered that he could make the game easier by turning the knobs on the soundcard to decrease the signal amplitude of the nasal accelerometer (thus automatically lowering the HONC score) and needed a reminder not to change the settings on the soundcard once the game was calibrated. P1 was inclined to shout words, which can modulate the HONC score if the signal begins clipping. P4 reported after several weeks of practice that he preferred to attach the accelerometer to his nasal skin upside down. He also noticed that the game does not catch if the player remains silent instead of attempting the word. Many of these issues can be resolved with modifications to the software or increased training for the guardian or supervisor.
The final category of feedback from the children related to the game play itself. As noted previously, the game play remains the same throughout all sessions. P3 diplomatically suggested that we “should make levels because it might be boring.” P3 and P4 both noted that they had trouble focusing for the entire session; this would also likely be improved with more engaging game play.
Discussion
Participants completed seven to 80 total sessions of game play, with feedback on 120–5,370 repetitions of nasal and nonnasal CVCs. After training, participants showed a tendency toward decreased nasalization on nonnasal trained and untrained tasks of the SNAP test, with the exception of P2, whose nasalance for untrained tasks did not consistently decrease. The perceptual ratings also indicated decreased levels of hypernasality on trained items for P3 and P4 and on untrained items for P1. Although participants were not explicitly trained to modulate NAE, two participants who were rated as having NAE (P3 and P4) demonstrated decreased NAE ratings, along with decreased nasal airflow during hold periods following training. These results suggest that video game rehabilitation of speech nasalization is a promising avenue for further development.
Evaluation of the efficacy of the video game rehabilitation platform presented should include individuals with a wider range of impairments. With current avenues of recruitment, our participants were primarily individuals with severe VPD with unknown etiology, although one user had VPD due to hearing impairment. Before enrolling in the study, participants were screened by expert VPD teams, but it is possible that some of them had latent structural or neurological issues that prevented them from achieving full velopharyngeal closure. Additional evaluation in a larger cohort of participants should include individuals with mild VPD across etiologies, of varying ages, with and without hearing impairments. Individuals with hearing impairment or repaired cleft palate may be more likely to respond to this type of biofeedback, which presents an alternate modality of feedback (i.e., visual instead of auditory), because the cause of their VPD is known and is due to mislearning rather than a structural or neurological issue. In addition, future evaluation of efficacy should implement a consistent schedule of training and assessment, including probes and retention assessments.
Future game development will include higher quality graphics, multiple game play levels, and user-initiated trials to provide a more immersive and age-appropriate gaming experience for older children and to optimize motivation for game play. In addition, some of the effects mentioned previously (e.g., shouting, changing the gain) could be managed with changes to the software and game play. Similarly, rather than calibrating the nasalization targets once, during an initial session with random positive feedback, the game could reset targets continuously. This would also compensate for small changes over time caused by shifts in microphone position and would continue to challenge the player to improve. These changes are vital to increase player engagement and motivation, as the goal of this research is to facilitate the frequent practice that is essential to motor learning (Robbins et al., 2008) and thus to improve nasalization. This type of biofeedback-based video game provides an excellent venue to promote such practice, both because video games may facilitate neural plasticity following perceptual learning (Bao, Chan, & Merzenich, 2001; Koepp et al., 1998) and because providing a form of feedback that does not require a trained listener's real-time perception gives players a higher dosage of practice and feedback than is typically available outside of speech therapy. The inexpensive instrumentation used here (<$150 vs. $3,000 for a Nasometer II) also facilitates the use of this video game to supplement traditional therapy. We do not intend the game to supplant traditional therapy but rather to be used in concert with it, or perhaps between semesters, to keep clients' attention on their nasalization. Further research is needed to optimize the game, participant characteristics, dosage, and other therapy constraints for maximal benefit.
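The continuous resetting of targets proposed above can be illustrated with a short sketch. This is only one possible scheme, not the game's implementation: it assumes a single scalar nasalization score per trial for which lower values are better on nonnasal words, and it resets the target to the median of a sliding window of recent scores. The class name `AdaptiveTarget`, the window length, and the median rule are all hypothetical choices.

```python
from collections import deque
from statistics import median


class AdaptiveTarget:
    """Hypothetical running target for nonnasal words (lower score is better)."""

    def __init__(self, initial_target, window=20):
        self.target = initial_target          # threshold used for the next trial
        self.recent = deque(maxlen=window)    # sliding window of recent scores

    def update(self, score):
        """Judge one trial against the current target, then reset the target."""
        success = score <= self.target
        self.recent.append(score)
        # Assumption: the median of recent scores keeps roughly half of the
        # trials successful while following drift and improvement over time.
        self.target = median(self.recent)
        return success


if __name__ == "__main__":
    tracker = AdaptiveTarget(initial_target=0.5, window=3)
    for score in (0.25, 0.75, 0.125):
        print(score, tracker.update(score), tracker.target)
```

Because the target follows recent performance, a gradual shift in sensor or microphone position moves the target along with the scores, and a player who improves faces a progressively stricter target. A deployed version would probably also need limits on how far the target may move in either direction.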
In addition, the video game platform presented here was designed to be modular. That is, the speech-processing algorithms, stimuli, and graphics engine are all separate. This enables the game to be modified to address a variety of rehabilitation issues. For example, the speech-processing algorithm could be replaced with a quantitative measure of NAE, as in Cler et al. (2016), to give players feedback to reduce their NAE severity. In a similar way, any speech feature that can be reliably measured with microphones and a signal-processing algorithm could be elicited and used to generate appropriate biofeedback for players.
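To make the modular design concrete, the sketch below keeps a speech feature, a stimulus, and a rendering callback separate so that any one of them can be swapped without touching the others. It is an illustration under assumptions rather than the game's actual code: `Trial`, `run_trial`, and `nasal_to_voice_ratio` are hypothetical names, and the RMS ratio of the nasal channel to the voice channel is only a stand-in for whatever measure the speech-processing module computes.

```python
import math
from dataclasses import dataclass
from typing import Callable, Sequence

Samples = Sequence[float]
# Any feature is a function of the two recorded channels that returns one number.
Feature = Callable[[Samples, Samples], float]


@dataclass
class Trial:
    word: str      # stimulus shown to the player
    target: float  # the feature must be at or below this value to succeed


def rms(signal: Samples) -> float:
    """Root-mean-square amplitude of a non-empty signal."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def nasal_to_voice_ratio(nasal: Samples, voice: Samples) -> float:
    """Placeholder feature; assumes the voice channel is not silent."""
    return rms(nasal) / rms(voice)


def run_trial(trial: Trial, nasal: Samples, voice: Samples,
              feature: Feature,
              render: Callable[[str, bool], None]) -> bool:
    """Score one utterance and hand the outcome to the graphics layer."""
    success = feature(nasal, voice) <= trial.target
    render(trial.word, success)
    return success


if __name__ == "__main__":
    show = lambda word, ok: print(word, "success" if ok else "try again")
    run_trial(Trial("bob", 0.5), [0.1, -0.1], [1.0, -1.0],
              nasal_to_voice_ratio, show)
```

Replacing `nasal_to_voice_ratio` with another function of the same shape, for example a measure of NAE, would change the feedback without any change to the stimuli or to the rendering code.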
Conclusion
This technical report presents the evaluation of a video game that provides real-time feedback of speech nasalization via nasal accelerometry. Five participants with a range of ages, VPD severities, and VPD etiologies were recruited. One participant was unable to complete video game training because he did not tolerate the nasal accelerometer, likely owing to his young age. The remaining participants completed multiple weeks of game play and showed a tendency toward decreased nasalization following training, particularly for words that were explicitly trained. On the basis of findings from this case series, the video game platform is promising for providing real-time feedback of speech nasalization for the rehabilitation of VPD.
Acknowledgments
Thanks to Elizabeth Heller Murray, Danielle Sturgeon, and Delsys, Inc., for their support and to Dave Anderson, Andrew Brughera, Will Cunningham, Ran Gong, and Jake Hermann for their development work. This work was supported, in part, by the Diane M. Bless Endowed Chair, Division of Otolaryngology–Head and Neck Surgery at University of Wisconsin, Madison (awarded to M. Braden) and National Institutes of Health Grant DC014872 (awarded to G. Cler).
References
- Addington D. W. (1968). The relationship of selected vocal characteristics to personality perception. Speech Monographs, 35, 492–503.
- Albery L., & Enderby P. (1984). Intensive speech therapy for cleft palate children. British Journal of Disorders of Communication, 19, 115–124.
- Audacity Team. (2014). Audacity (Version 2.1.2) [Computer software]. Retrieved from http://audacity.sourceforge.net/
- Bao S., Chan V. T., & Merzenich M. M. (2001). Cortical remodelling induced by activity of ventral tegmental dopamine neurons. Nature, 412, 79–83.
- Blood G. W., Blood I. M., & Danhauer J. L. (1978). Listeners' impressions of normal-hearing and hearing-impaired children. Journal of Communication Disorders, 11, 513–518.
- Blood G. W., & Hyman M. (1977). Children's perception of nasal resonance. Journal of Speech and Hearing Disorders, 42, 446–448.
- Blood G. W., Mahan B. W., & Hyman M. (1979). Judging personality and appearance from voice disorders. Journal of Communication Disorders, 12, 63–67.
- Boersma P., & Weenink D. (2014). Praat: Doing Phonetics By Computer (Version 5.3.80) [Computer program]. Retrieved from http://www.praat.org
- Brancamp T. U., Lewis K. E., & Watterson T. (2010). The relationship between nasalance scores and nasality ratings obtained with equal appearing interval and direct magnitude estimation scaling methods. The Cleft Palate–Craniofacial Journal, 47, 631–637.
- Brunnegård K., Lohmander A., & van Doorn J. (2012). Comparison between perceptual assessments of nasality and nasalance scores. International Journal of Language & Communication Disorders, 47, 556–566. https://doi.org/10.1111/j.1460-6984.2012.00165.x
- Brunner M., Stellzig-Eisenhauer A., Proschel U., Verres R., & Komposch G. (2005). The effect of nasopharyngoscopic biofeedback in patients with cleft palate and velopharyngeal dysfunction. The Cleft Palate–Craniofacial Journal, 42, 649–657. https://doi.org/10.1597/03-044.1
- Cano S., Peñeñory V., Collazos C. A., Fardoun H. M., & Alghazzawi D. M. (2015, October). Training with Phonak: Serious game as support in auditory—Verbal therapy for children with cochlear implants. Paper presented at the Third 2015 Workshop on ICTs for improving Patients Rehabilitation Research Techniques, Lisbon, Portugal.
- Cler M. J., Lien Y.-A. S., Braden M. N., Mittelman T., Downing K., & Stepp C. E. (2016). Objective measure of nasal air emission using nasal accelerometry. Journal of Speech, Language, and Hearing Research, 59, 1018–1024. https://doi.org/10.1044/2016_JSLHR-S-15-0407
- Cler M. J., Voysey G. E., & Stepp C. E. (2015, April). Video game speech rehabilitation for velopharyngeal dysfunction: Feasibility and pilot testing. Paper presented at the Seventh Annual International Institute of Electrical and Electronics Engineers/Engineering in Medicine and Biology Society (IEEE/EMBS) Conference on Neural Engineering, Montpellier, France. https://doi.org/10.1109/NER.2015.7146747
- Dalston R. M., Warren D. W., & Dalston E. T. (1991). A preliminary investigation concerning the use of nasometry in identifying patients with hyponasality and/or nasal airway impairment. Journal of Speech and Hearing Research, 34, 11–18.
- Dotevall H., Lohmander-Agerskov A., Ejnell H., & Bake B. (2002). Perceptual evaluation of speech and velopharyngeal function in children with and without cleft palate and the relationship to nasal airflow patterns. The Cleft Palate–Craniofacial Journal, 39, 409–424. https://doi.org/10.1597/1545-1569(2002)039<0409:PEOSAV>2.0.CO;2
- Fletcher S. G. (1970). Theory and instrumentation for quantitative measurement of nasality. The Cleft Palate Journal, 7, 601–609.
- Fletcher S. G., & Higgins J. M. (1980). Performance of children with severe to profound auditory impairment in instrumentally guided reduction of nasal resonance. Journal of Speech and Hearing Disorders, 45, 181–194.
- Heller Murray E. S., Mendoza J. O., Gill S. V., Perkell J. S., & Stepp C. E. (2016). Effects of biofeedback on control and generalization of nasalization in typical speakers. Journal of Speech, Language, and Hearing Research, 59, 1025–1034.
- Hodge M., & Gotzke C. L. (2007). Preliminary results of an intelligibility measure for English-speaking children with cleft palate. The Cleft Palate–Craniofacial Journal, 44, 163–174. https://doi.org/10.1597/05-035.1
- Horii Y. (1980). An accelerometric approach to nasality measurement: A preliminary report. The Cleft Palate Journal, 17, 254–261.
- Karnell M. P. (1995). Nasometric discrimination of hypernasality and turbulent nasal airflow. The Cleft Palate–Craniofacial Journal, 32, 145–148.
- King S. N., Davis L., Lehman J. J., & Ruddy B. H. (2012). A model for treating voice disorders in school-age children within a video gaming environment. Journal of Voice, 26, 656–663.
- Koepp M. J., Gunn R. N., Lawrence A. D., Cunningham V. J., Dagher A., Jones T., … Grasby P. M. (1998). Evidence for striatal dopamine release during a video game. Nature, 393, 266–268.
- Kummer A. (2008). Cleft palate and craniofacial anomalies: Effects on speech and resonance (2nd ed.). Clifton Park, NY: Thomson Delmar Learning.
- Laczi E., Sussman J. E., Stathopoulos E. T., & Huber J. (2005). Perceptual evaluation of hypernasality compared to HONC measures: The role of experience. The Cleft Palate–Craniofacial Journal, 42, 202–211. https://doi.org/10.1597/03-011.1
- Lallh A. K., & Rochet A. P. (2000). The effect of information on listeners' attitudes toward speakers with voice or resonance disorders. Journal of Speech, Language, and Hearing Research, 43, 782–795.
- Lim S. J., & Holt L. L. (2011). Learning foreign sounds in an alien world: Videogame training improves non-native speech categorization. Cognitive Science, 35, 1390–1405. https://doi.org/10.1111/j.1551-6709.2011.01192.x
- Loaiza D., Oviedo C., Castillo A., Portilla A., Alvarez G., Linares D., … Ivarez G. A. (2013, September). A video game prototype for speech rehabilitation. Paper presented at the Fifth International Conference on Games and Virtual Worlds for Serious Applications, Bournemouth, England.
- Lv Z., Esteve C., Chirivella J., & Gagliardo P. (2015a, May). Clinical feedback and technology selection of game based dysphonic rehabilitation tool. Paper presented at the Ninth International Conference on Pervasive Computing Technologies for Healthcare, Istanbul, Turkey.
- Lv Z., Esteve C., Chirivella J., & Gagliardo P. (2015b, March). A game based assistive tool for rehabilitation of dysphonic patients. Paper presented at the Third Institute of Electrical and Electronics Engineers Virtual Reality International Workshop on Virtual and Augmented Assistive Technology, Arles, France.
- MacKay I. R., & Kummer A. W. (2005). The MacKay-Kummer SNAP Test-R Simplified Nasometric Assessment Procedures [Revised 2005]. Kay Elemetrics Corp, Instruction manual: Nasometer Model 6450 (pp. 115–124). Lincoln Park, NJ: Kay Elemetrics Corp.
- Macmillan N. A., & Creelman C. D. (1991). Detection theory: A user's guide. Cambridge, England: Cambridge University Press.
- Navarro-Newball A., Loaiza D., Oviedo C., Castillo A., Portilla A., Linares D., & Álvarez G. (2014). Talking to Teo: Video game supported speech therapy. Entertainment Computing, 5, 401–412.
- Novak I., Cusick A., & Lannin N. (2009). Occupational therapy home programs for cerebral palsy: Double-blind, randomized, controlled trial. Pediatrics, 124, e606–e614.
- Redenbaugh M. A., & Reich A. R. (1985). Correspondence between an accelerometric nasal/voice amplitude ratio and listeners' direct magnitude estimations of hypernasality. Journal of Speech and Hearing Research, 28, 273–281.
- Rieger J., Dickson N., Lemire R., Bloom K., Wolfaardt J., Wolfaardt U., & Seikaly H. (2006). Social perception of speech in individuals with oropharyngeal reconstruction. Journal of Psychosocial Oncology, 24, 33–51. https://doi.org/10.1300/J077v24n04_03
- Robbins J., Butler S. G., Daniels S. K., Gross R. D., Langmore S., Lazarus C. L., … Rosenbek J. (2008). Swallowing and dysphagia rehabilitation: Translating principles of neural plasticity into clinically oriented evidence. Journal of Speech, Language, and Hearing Research, 51(Suppl.), S276–S300.
- Steinhauer K., & Grayhack J. P. (2000). The role of knowledge of results in performance and learning of a voice motor task. Journal of Voice, 14, 137–145.
- Tan C. T., Johnston A., Ballard K., Ferguson S., & Perera-Schulz D. (2013, September). sPeAK-MAN: Towards popular gameplay for speech therapy. Paper presented at the Proceedings of the Ninth Australasian Conference on Interactive Entertainment: Matters of Life and Death, Melbourne, Victoria, Australia.
- Tan C. T., Johnston A., Bluff A., Ferguson S., & Ballard K. J. (2014a, December). Retrogaming as visual feedback for speech therapy. Paper presented at the SIGGRAPH Asia 2014 Mobile Graphics and Interactive Applications, Shenzhen, China.
- Tan C. T., Johnston A., Bluff A., Ferguson S., & Ballard K. J. (2014b). Speech invaders & yak-man: Retrogames for speech therapy. Paper presented at the SIGGRAPH Asia 2014 Mobile Graphics and Interactive Applications.
- Thorp E. B., Virnik B. T., & Stepp C. E. (2013). Comparison of nasal acceleration and nasalance across vowels. Journal of Speech, Language, and Hearing Research, 56, 1476–1484. https://doi.org/10.1044/1092-4388(2013/12-0239)
- Van Lierde K. M., Claeys S., De Bodt M., & Van Cauwenberge P. (2004). Outcome of laryngeal and velopharyngeal biofeedback treatment in children and young adults: A pilot study. Journal of Voice, 18, 97–106. https://doi.org/10.1016/j.jvoice.2002.09.001
- Varghese L. A., Mendoza J. O., Braden M. N., & Stepp C. E. (2014). Effects of spectral content on Horii Oral-Nasal Coupling scores in children. The Journal of the Acoustical Society of America, 136, 1295–1306. https://doi.org/10.1121/1.4892791
- Whitton J. P., Hancock K. E., & Polley D. B. (2014). Immersive audiomotor game play enhances neural and perceptual salience of weak signals in noise. Proceedings of the National Academy of Sciences, 111, E2606–E2615.
- Witzel M. A., Tobe J., & Salyer K. (1988). The use of nasopharyngoscopy biofeedback therapy in the correction of inconsistent velopharyngeal closure. International Journal of Pediatric Otorhinolaryngology, 15, 137–142.
- Yamaoka M., Matsuya T., Miyazaki T., Nishio J., & Ibuki K. (1983). Visual training for velopharyngeal closure in cleft palate patients; a fibrescopic procedure (preliminary report). Journal of Maxillofacial Surgery, 11(1), 191–193.
- Yorkston K. M., Spencer K., Duffy J., Beukelman D., Golper L. A., Miller R., … Sullivan M. (2001). Evidence-based practice guidelines for dysarthria: Management of velopharyngeal function. Journal of Medical Speech-Language Pathology, 9(4), 257–274.
- Ysunza A., Pamplona M., Femat T., Mayer I., & García-Velasco M. (1997). Videonasopharyngoscopy as an instrument for visual biofeedback during speech in cleft palate patients. International Journal of Pediatric Otorhinolaryngology, 41, 291–298.



