Abstract
Purpose:
Somatosensory targets and feedback are instrumental in ensuring accurate speech production. Individuals differ in their ability to access and respond to somatosensory information, but there is no established standard for measuring somatosensory acuity. The primary objective of this study was to determine which of three measures of somatosensory acuity had the strongest association with change in production accuracy in a vowel learning task, while controlling for the better-studied covariate of auditory acuity.
Method:
Three somatosensory tasks were administered to 20 female college students: an oral stereognosis task, a bite block task with auditory masking, and a novel phonetic awareness task. Individual scores from the tasks were compared to their performance on a speech learning task in which participants were trained to produce novel Mandarin vowels with visual biofeedback.
Results:
Of the three tasks, only bite block adaptation with auditory masking was significantly associated with performance in the speech learning task. Participants with weaker somatosensory acuity tended to demonstrate larger increases in production accuracy over the course of training.
Conclusions:
The bite block adaptation task measures proprioceptive awareness rather than tactile acuity and assesses somatosensory knowledge implicitly, with limited metalinguistic demands. This small-scale study provides preliminary evidence that these characteristics may be desirable for the assessment of oral somatosensory acuity, at least in the context of vowel learning tasks. Well-normed somatosensory measures could be of clinical utility by informing diagnosis/prognosis and treatment planning.
Keywords: somatosensory, speech production, adults, stereognosis, phonetic awareness, bite block
INTRODUCTION
During speech production, the brain receives somatosensory feedback from the vocal tract, in parallel with auditory feedback. Somatosensory feedback can be divided into tactile and proprioceptive types of information. Tactile somatosensory feedback involves the activation of mechanoreceptors in the skin when the surface of a structure is touched, as when the tongue makes contact with the alveolar ridge. Proprioceptive somatosensory feedback involves the body’s awareness of the position and movement trajectories of its own structures. This information arises from multiple sources, including skin mechanoreceptors and muscle spindles that provide information about vocal tract muscle movements (Guenther, 2016). Research on hand and limb function suggests that tactile and proprioceptive inputs make unique contributions to the experience and use of somatosensory feedback (e.g., Berryman, Yau, & Hsiao, 2006). Somatosensory and auditory feedback are central to current theoretical frameworks describing speech production (e.g., Guenther, 2016; Hickok, 2012; Houde & Nagarajan, 2011; Parrell, Ramanarayanan, Nagarajan, & Houde, 2019). In such models, production begins with a stored target representation, including somatosensory knowledge about articulator placement, that will be translated into motor commands to produce a given sound. This somatosensory target information is compared to incoming tactile and proprioceptive somatosensory feedback. If there is a discrepancy, corrective commands are generated to update the motor plan during production, creating a somatosensory feedback loop.
There is evidence that somatosensory feedback is instrumental to accurate speech production. When speakers undergo topical or local anesthesia on the articulators, which has the effect of blocking somatosensory feedback, their speech tends to be marked by articulation errors and reduced movement precision (Gammon, Smith, Daniloff, & Kim, 1971; Putnam & Ringel, 1976; Ringel & Steer, 1963). In a contrasting case, many postlingually deafened individuals who therefore lack auditory feedback can maintain intelligible speech, a phenomenon that may be partly attributable to the continued availability of somatosensory feedback (Nasir & Ostry, 2008). Taken together, this evidence suggests that somatosensory feedback plays a crucial role in specifying articulatory targets of speech and adjusting articulator movements under varying circumstances. The ability to use this feedback depends on one’s threshold to detect fine-grained differences in this domain, or one’s somatosensory acuity (Attanasio, 1987). It should be noted that “somatosensory acuity” is used throughout the present study as a convenient but imperfect term. This paper describes three tasks that each tap somatosensory function, but also rely on abilities in other domains to different extents.
It is plausible that clinical management of various populations could be strengthened through a deeper understanding of somatosensory feedback and somatosensory acuity. Developmental misarticulations of certain sounds, such as /ɹ/ and /s/, persist past 8–9 years of age in a subset of children, at which point they may be labeled residual speech errors; such errors continue through adulthood in an estimated 1–2% of individuals (Flipsen, 2015). Some studies have suggested that, on average, individuals with residual speech errors perform worse than typical speakers on somatosensory tasks assessing oral stereognosis (Fucci & Robertson, 1971), oral vibrotactile thresholds (Fucci, 1972), and oral two-point discrimination (McNutt, 1977). However, other research has found no reliable relationship between articulation errors and performance on oral stereognosis tasks (Arndt, Elbert, & Shelton, 1970). Previous research has also shown that people who stutter exhibit reduced somatosensory acuity, suggesting that stuttering may be interpreted in terms of an oral proprioception deficiency, in addition to a motor deficit (Fucci, Petrosino, Gorman, & Harris, 1985; Loucks & De Nil, 2006; Stewart, Evans, & Fitch, 1985). Among typical speakers, there is evidence that individuals with high somatosensory acuity exhibit more precise production of phonetic contrasts (e.g., greater acoustic contrast distance between the sibilants /s/ and /ʃ/) than those with low somatosensory acuity (Ghosh et al., 2010). Taken together, these findings suggest that research aimed at measuring and even altering somatosensory acuity could contribute to the diagnosis and treatment of speech disorders. However, a deeper understanding of typical oral somatosensory ability (henceforth, “somatosensory acuity”) is necessary before the clinical potential of somatosensory measurement can be fully realized.
Somatosensory acuity has been measured in various ways, including oral stereognosis tasks and oral form discrimination and identification tasks. Oral stereognosis measures an individual’s ability to recognize the shape of an object through the sense of touch in the oral cavity (Fucci & Robertson, 1971). For example, Steele, Stokely, & Peladeau-Pigeon (2014) presented participants with plastic strips bearing raised letters of different sizes and asked them to use the tongue tip to identify each letter. Alternatively, oral form discrimination can be measured by placing two shapes on the tongue in succession, and then asking the individual whether the shapes were the same or different (Attanasio, 1987; McNutt, 1977). Another task involves manipulating the orientation of grooved domes that are pressed on a speech structure while participants are asked to identify groove orientations as horizontal, vertical, left diagonal, or right diagonal (Ghosh et al., 2010). Still other tasks involve oral two-point discrimination (Etter, Miller, & Ballard, 2017; McNutt, 1977), which measures the minimum distance between two pointed or disc-shaped stimuli that an individual can perceive as two separate points, and measurement of oral vibrotactile thresholds (Fucci, 1972), or the vibration frequency needed for individuals to sense vibratory stimuli on an oral structure. More recently, Etter and colleagues (2017) investigated the use of Von Frey hairs, a set of elastic columns with nylon filaments that bend at a specific force when pressed onto a surface. The monofilaments were pressed against participants’ tongue and lower lip to measure tactile detection thresholds, or the target force level at which participants were able to detect pressure from the monofilaments, as well as tactile discrimination thresholds, or the target force levels at which participants were able to indicate which of two trials presented a “stronger” pressure on their lip and tongue. 
All of these tasks measure the tactile aspect of somatosensation.
Studies evaluating the proprioceptive aspect of somatosensation have typically introduced perturbations to the vocal tract, either in the form of a bite block or palatal prosthesis (e.g., Baum & McFarland, 1997; Zandipour et al., 2006) or through robotic manipulation of articulator trajectories (e.g., Feng, Gracco, & Max, 2011; Lametti, Nasir, & Ostry, 2012; Nasir & Ostry, 2006; Tremblay, Shiller, & Ostry, 2003). While many such studies have examined response to mechanical perturbation when both somatosensory and auditory feedback are available, others have isolated somatosensory contributions by introducing perturbations that do not influence speech acoustics (Nasir & Ostry, 2006; Tremblay, Shiller, & Ostry, 2003) or blocking auditory feedback with masking noise (Zandipour et al., 2006). These studies have shown that talkers adapt their oral speech movements to such perturbations, indicating that proprioception has an important role in the planning and control of speech movements.
Finally, a novel approach using phonetic awareness to evaluate oral proprioception was developed as part of an ongoing project investigating sensory profiles in children with and without residual speech errors (McAllister, Preston, Hitchcock, & Hill, 2020). This task was inspired by proprioceptive awareness tasks developed to evaluate limb control (e.g., Fuentes & Bastian, 2010; Proske, Tsay, & Allen, 2014), as well as by research on lingual control in non-speech contexts (Ouni, 2014). In the limb control literature, participants undergo passive movements of one limb outside of their field of view and are then asked to indicate its position by pointing or matching its placement with the other limb (Proske et al., 2014). In the context of lingual control, Ouni (2014) examined individual abilities to perform various volitional non-speech movements of the tongue, such as moving the tongue body back and up. Success in performing the task was evaluated through ultrasound imaging, but participants were not provided with visual feedback during the assessment and instead had to rely on proprioceptive feedback. McAllister et al. (2020) developed a phonetic awareness task that adapts the notion of conscious reflection on the relative position of body structures to the speech context. In this task, as described in more detail below, speakers are asked to produce various sound pairs and make a judgment regarding the relative position of their tongues (e.g., higher, lower, more front).
In summary, a range of measures have been established that assess different aspects of somatosensory acuity. Across multiple studies, there is evidence that differences in somatosensory acuity correlate with differences in speech production, both in typical speakers (e.g., Ghosh et al., 2010) and across diagnostic groups (e.g., Attanasio, 1987; Fucci, 1972; Fucci & Robertson, 1971; McNutt, 1977). This suggests that measures of somatosensory acuity could have clinical utility to inform prognostic judgments and guide treatment planning for individuals with speech disorders. Despite this potential, somatosensory acuity remains under-studied relative to auditory acuity and other aspects of speech-motor control. In particular, there is a lack of research comparing the various somatosensory tasks developed in previous literature in order to evaluate which task best represents different aspects of somatosensory acuity. The current study took a step toward addressing this limitation by comparing individual performance across three somatosensory measures: an oral stereognosis task (Steele et al., 2014), a bite block task with auditory masking (Zandipour et al., 2006), and the above-described phonetic awareness task (McAllister et al., 2020). Twenty participants from a previous speech-motor learning study (Li, Ayala, Harel, Shiller, & McAllister, 2019) completed all three measures, and their performance was examined in relation to the outcome measure from the speech task in the original study.
Li et al. (2019) examined several individual-level factors, including somatosensory acuity, as potential predictors of performance in a task in which English speakers were trained to produce the Mandarin vowels /u/ and /y/. The participants imitated native Mandarin speakers producing the vowels in isolation in a probe that was administered at baseline and again after a period of training incorporating visual biofeedback. Biofeedback intervention for speech involves using technology to generate a real-time visual display of some aspect of speech for the learner to manipulate in an effort to match a visual target representing correct production. It has been associated with positive learning outcomes in children and adolescents with residual speech errors (e.g., McAllister Byun & Hitchcock, 2012) or Childhood Apraxia of Speech (e.g., Preston, Brick, & Landi, 2013), as well as typical adult second-language learners (e.g., Kartushina, Hervais-Adelman, Frauenfelder, & Golestani, 2015). In Li et al. (2019), participants were randomly assigned to receive one of two different types of biofeedback. In the visual-acoustic biofeedback condition (e.g., McAllister Byun & Hitchcock, 2012), participants viewed a real-time display of the acoustic spectrum of the speech signal, while in the ultrasound biofeedback condition (e.g., Sugden, Lloyd, Lam, & Cleland, 2019), participants saw a real-time image of the surface of the tongue within the oral cavity. Li and colleagues found that participants’ Mandarin vowel production accuracy improved in both biofeedback training conditions. However, neither somatosensory acuity nor auditory acuity was a significant predictor of response to training. One of the stated limitations of the original study was that somatosensory acuity was only measured using an oral stereognosis task. This task primarily evaluates the tactile acuity of the tongue tip,1 which may be more applicable to consonants than the vowel targets studied by Li et al. (2019), since vowel production likely provides less tactile feedback than consonant production.
The present pilot study represents an extension of the work by Li et al. (2019). Participants from the original study were invited to complete two additional somatosensory acuity tasks: a bite block adaptation task with auditory masking and the above-described phonetic awareness task. As in the original study, the dependent variable was the magnitude of learning over the course of biofeedback-enhanced vowel production training. Our primary research objective was to determine which measure of somatosensory acuity was most strongly correlated with this measure of speech-motor learning. As noted by Li et al., there is reason to believe that a measure of proprioceptive awareness might be more closely related to outcomes in their vowel learning task than the stereognosis measure administered originally. Most of the oral somatosensory tasks investigated to date have focused on assessing tactile sensation, so there is a need for further study of oral proprioception. The two tasks that we added were selected because they are hypothesized to measure the construct of proprioceptive awareness and because they do not require specialized equipment such as robotic force-generators.
Regarding the expected direction of the relationship, previous literature has pointed out that either a positive or a negative association between sensory acuity and response to biofeedback training is theoretically defensible (Cialdella et al., 2020). Li et al. (2019) hypothesized that individuals with poor sensory acuity would derive greater benefit from the sensory enhancement provided by biofeedback and would thus show a greater magnitude of learning. (They also predicted an interaction between sensory acuity and biofeedback type, such that individuals with poor auditory acuity would derive relatively greater benefit from visual-acoustic biofeedback and individuals with poor somatosensory acuity would derive greater benefit from ultrasound biofeedback; we return to this point in the Discussion.) On the other hand, given that higher somatosensory acuity is associated with greater precision in speech production (Ghosh et al., 2010), it is possible that individuals with higher somatosensory acuity could show stronger performance in a speech-motor learning task. Finally, the present study proposed to examine pairwise correlations between the three measures administered. We predicted that the bite block and phonetic awareness tasks would show the strongest pairwise correlation, since they are both intended to measure proprioceptive awareness, whereas the stereognosis task is primarily a measure of tactile somatosensory acuity.
METHODS
Participants
For the present study, all 65 participants who completed the original study described in Li et al. (2019) were invited back for 1–2 follow-up sessions. From the original group, 20 females between the ages of 19 and 23 completed this follow-up study (mean age = 21.13 years, SD = 0.84 years). Eight of them were originally assigned to the visual-acoustic biofeedback training condition and 12 to the ultrasound condition. Two of the participants who completed the present study were excluded from the data reported by Li et al. (2019) because they fell more than two standard deviations away from the group mean performance on the vowel learning task. However, they were retained for the purpose of the present pilot study due to the difficulty of recruiting a sufficiently large sample of individuals who completed the original study. All participants were native speakers of English with no self-reported history of speech or language disorders. They also reported healthy dental status, as dental status may affect performance on an oral stereognosis task (Jacobs, Serhal, & van Steenberghe, 1998). Lastly, all participants passed a pure tone hearing screening (20 dB HL at 500, 1000, 2000, and 4000 Hz) using a portable audiometer. This study was approved by the Institutional Review Board at New York University, and each participant provided informed consent to participate.
Procedure: Original Study
The original study by Li et al. (2019) included 60 female native English speakers between the ages of 18 and 30 who attended two sessions. In the first session, the researchers administered the oral stereognosis task from Steele et al. (2014), which will be described in detail in the next section, to measure somatosensory acuity; they also administered an AXB discrimination task to measure auditory acuity. For the AXB discrimination task, the researchers generated a continuum from /y/ to /u/ by manipulating a native Mandarin speaker’s productions of these vowels in isolation using the speech algorithm STRAIGHT (Speech Transformation and Representation by Adaptive Interpolation of weiGHTed spectrogram; Kawahara, Morise, Banno, & Skuk, 2013). This procedure created 240 steps equidistant in acoustic space between the endpoints of the continuum. Participants listened to sets of three stimuli drawn from this synthesized continuum through over-the-ear headphones (Sennheiser HD 429) and were asked to identify whether the first (A) or last (B) stimulus was identical to the middle stimulus (X). The stimuli started from the /y/ end of the continuum, and the distance between the stimuli in each trial was adjusted following an adaptive staircase model to arrive at an estimate of the just noticeable distance (JND) between stimuli that each participant could detect consistently. Further detail on computing the JND for each participant is provided below under Measurement.
A speech learning task took place during the second session. Before and after training, participants produced 20 repetitions each for the Mandarin vowels /y/ and /u/ in isolation, imitating a native speaker’s model for each trial. The audio stimuli used for this imitation task, as well as for the biofeedback training, included six productions of each vowel by three female Mandarin speakers, presented in a random order through headphones (Sennheiser HD 429). After the baseline production task, participants completed 30 minutes of production training for each of the two target vowels. The order in which vowels were trained was counter-balanced across participants, and participants were randomly assigned to one of the two biofeedback training conditions. Before training began, participants were matched on the basis of formant frequencies to a native Mandarin speaker whose productions would be used for visual targets in biofeedback training. In the visual-acoustic biofeedback condition, participants were cued to make a real-time LPC spectrum line up with a spectral template of each vowel as produced by the target speaker. In the ultrasound biofeedback condition, participants were cued to match a trace of a tongue shape obtained from the target talker during production of each vowel. See Li et al. (2019) for the complete protocol, including the selection of visual targets as well as training procedures.
Procedure: Somatosensory measures
Three somatosensory measures were obtained from all participants in the current study. As noted above, the oral stereognosis task was administered as part of the original study, while the other two somatosensory tasks were collected in one or two follow-up sessions. In the oral stereognosis task, which measures tactile acuity, participants used their tongue tip to identify the form of a letter embossed on a plastic strip following a protocol established in Steele et al. (2014). After verbal and visual explanations that the top of the letter would be oriented toward the back of the mouth, participants were instructed to place each letter strip in their mouth with the embossed side facing down and to use only their tongue tip to feel the letter. The letters were sans serif capitals (A, I, J, L, T, U, and W) and although the embossment height was constant across letters, the letters themselves ranged in size from 2.5–8.0 mm (2.5, 3, 4, 5, 6, 7 and 8 mm) from top to bottom (See Figure 1). Stimulus presentation started by randomly choosing a strip with a medium-sized letter (5 mm). Following an adaptive staircase model, a participant’s correct identification of the embossed letter resulted in a one-step decrease in letter size, whereas an incorrect response resulted in a one-step increase in letter size. Participants wore dark sunglasses to reduce the visibility of the letters as the investigator handed each strip to them face-down. The administrator then recorded the participant’s verbal responses on a scoring sheet.
Figure 1.

Letter strips from oral stereognosis task by Steele et al. (2014). Image copyright Steele et al. (2014).
The phonetic awareness task measures articulator proprioception by asking participants to judge the relative position of the tongue while producing different sounds. The task was administered through PsychoPy3 (version 13; Peirce et al., 2019) on a laptop (MacBook Air). The participants wore over-the-ear headphones (Audio-Technica ATH-M20x) and listened to prerecorded instructions and prompts while viewing the same text on the screen. The task consisted of four blocks with nine stimuli in each block; see supplemental materials for a full list of stimuli. In each block, the participant was provided with both orthographic and verbal models for a target sound, both in the context of a real word and in isolation (e.g., “oo like in hoot”). Participants were prompted to repeat each target sound in isolation three times in order to form a mental image of the sensation accompanying the production of those targets. In the first part of the task, participants classified consonants as being produced with the front or the back of the tongue (e.g., “Say /θ/ like in ‘think.’ Do you use the front or the back of your tongue?”). For the second, third, and fourth parts of this task, two vowels were modeled and the participants had to produce them in an alternating manner three times (e.g., “ah-ee-ah-ee-ah-ee”), then answer questions about the relative position of their tongue for each vowel (e.g., “which vowel is higher/lower/further back”). (See Figure 2 for a visual representation of the dimensions probed in this task.) If the participant produced a sound incorrectly (e.g., /u/ instead of /ʊ/ for the vowel in “put”), another verbal model was provided and another attempt was allowed, after which the task proceeded regardless of the accuracy of the participant’s production. The participant provided verbal responses to the questions, and the administrator entered the responses on a keyboard.
Figure 2.

Depiction of questions asked about tongue placement in the oral cavity during phonetic awareness task.
The bite block task with auditory masking measures an individual’s ability to compensate for articulator perturbation in a context where auditory feedback is not available. It consists of three parts. In the baseline condition, the participants read the words “heed,” “had,” “who’d,” and “hod” eight times each (32 total) with masking noise but no bite block perturbation. These words were displayed in pseudorandomized order on a laptop (MacBook Air) using a custom-made program. In the bite block condition, the participants held a standard tongue depressor between their front incisors in two different ways (See Figure 3). For the low vowels /æ/ and /ɑ/ (“had” and “hod”), the tongue depressor was placed horizontally, creating a relatively closed jaw position, while for the high vowels /u/ and /i/ (“who’d” and “heed”), the tongue depressor was placed vertically, creating 1.75 cm of jaw aperture. Tongue depressor placement prevented speakers from using their habitual motor plans for speech production, allowing us to observe how they adjusted their motor plans using only somatosensory feedback. Each of the bite block conditions (horizontal and vertical) elicited 16 repetitions of each word in randomized order, for a total of 32 words in each bite block condition. The order of elicitation of the horizontal versus vertical condition was counterbalanced across participants.
Figure 3.

Bite block setup with tongue depressor between front incisors in vertical orientation and auditory masking through air and bone conduction.
To prevent participants from using auditory feedback to adjust their speech to compensate for the presence of the bite block, they listened to loud masking noise through two sets of headphones. Auditory masking included pink noise played via Praat (Boersma & Weenink, 2019) through Bluetooth-enabled bone conduction headphones (Z8), and multi-talker babble played via Audacity (Audacity Team, 2019) through in-the-ear noise-isolating earphones (Etymotic Research HF5). Masking noise was presented at 75 dB in the insert headphones and at maximum device volume in the bone conduction headphones; all participants confirmed that the combined noise volume was loud but comfortable. To further ensure participants could not hear themselves while speaking, they were cued to speak at a low vocal volume through visual feedback in the form of a digital sound level meter (Kay Pentax, Computerized Speech Lab, Model 4500). If a participant’s volume exceeded a set threshold on the visual display, the experimenter prompted them to speak more quietly, but not in a whisper. All productions were recorded to the Computerized Speech Lab system through a table-mounted microphone (Shure SM48) placed five inches away from the participant’s mouth. Samples were digitized at a 44 kHz sampling rate with 16-bit encoding. The first 16 participants were recorded in a sound-attenuated booth, but due to changes in lab space availability, the remaining participants were recorded in a quiet room with minimal background noise.
Measurement
Original Study: Production Accuracy.
In Li et al. (2019), the first and second formant frequencies (F1 and F2) from the midpoint of each vowel at baseline and post-training time points were measured in Praat (Boersma & Weenink, 2019). Formant frequencies were automatically extracted from a 50 ms Gaussian window using a Praat script (Lennes, 2003) and were transformed into the psychoacoustic Bark scale using the “vowels” package (Kendall, Thomas, & Kendall, 2018) in the R software environment (R Core Team, 2019). To measure production accuracy, the researchers computed the Euclidean distance (ED) in F1-F2 space from each production to the center of the distribution of productions by the participant’s assigned target talker (see above). A smaller ED value indicated higher accuracy. Median ED values across 20 repetitions were calculated for each participant and vowel at baseline and post-training. Change in median ED from baseline to post-training was also calculated, with a negative value for change indicating improvement over the course of training.
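As a concrete illustration of this accuracy metric, the computation can be sketched in a few lines. The original analysis used Praat and the R “vowels” package; the Python sketch below is a reconstruction that assumes the Traunmüller (1990) Hz-to-Bark formula as the conversion, and the formant values are invented purely for illustration.

```python
import numpy as np

def bark(f_hz):
    """Convert frequency in Hz to the Bark scale (Traunmüller, 1990)."""
    f = np.asarray(f_hz, dtype=float)
    return 26.81 * f / (1960.0 + f) - 0.53

def median_ed(formants_hz, target_center_hz):
    """Median Euclidean distance (ED) in Bark-transformed F1-F2 space.

    formants_hz: (n_tokens, 2) array of [F1, F2] per production.
    target_center_hz: [F1, F2] center of the target talker's distribution.
    """
    dists = np.linalg.norm(bark(formants_hz) - bark(target_center_hz), axis=1)
    return np.median(dists)

# Hypothetical formant values for one participant and vowel, illustration only
baseline = np.array([[350.0, 1600.0], [360.0, 1550.0], [340.0, 1650.0]])
post = np.array([[310.0, 1900.0], [300.0, 1950.0], [320.0, 1880.0]])
target = [300.0, 2000.0]

# Negative change indicates productions moved toward the target over training
change = median_ed(post, target) - median_ed(baseline, target)
```

In this fabricated example the post-training tokens sit closer to the target in Bark space than the baseline tokens, so the change score is negative, the direction interpreted as improvement in the original study.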
Because two vowels were trained, each participant had separate scores for change in median ED for /y/ versus /u/. For the purpose of the present study, only scores for the /y/ vowel were used. This decision was made because the auditory acuity measure started from the /y/ end of the continuum, as described below, and may therefore be more closely related to performance for /y/ than /u/. In addition, performance on Mandarin /y/ is of interest because this vowel does not have a counterpart in the English vowel space and therefore may encourage learners to explore new articulatory-acoustic mappings. The Mandarin /u/ is phonetically more back than English /u/ (Chen, Robb, Gilbert, & Lerman, 2001) but the two categories are perceptually similar, which could lead participants to reuse their existing motor plan for English /u/ during the experimental task. See discussion from Li et al. (2019) for more detail.
Original Study: Auditory Acuity.
As noted above, the AXB discrimination task yielded a just noticeable distance score (JND) representing the threshold at which a participant could detect a difference in vowel formants between pairs of stimuli. Based on Villacorta, Perkell, & Guenther (2007), an adaptive staircase procedure was used to enable precise calculation of the JND. In the first trial, the first stimulus was the /y/ end of the continuum and the second stimulus was 50 steps away. In subsequent trials, the distance between stimuli decreased by eight steps after a correct response and increased by four steps after an incorrect response. After every four reversals in the direction of stimulus change, the step sizes were halved. The task ended after 12 reversals or 80 trials, whichever came first. The JND score was determined by calculating the mean distance between the stimuli at the final four reversals, such that a smaller JND corresponds with higher auditory acuity.
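The staircase rules just described can be expressed as a short simulation. The Python sketch below is illustrative only: the parameter names are our own, and the simulated listener, who reliably detects differences of 12 or more continuum steps, is a deterministic stand-in for real perceptual responses.

```python
def run_staircase(respond_correct, start=50, down=8, up=4,
                  max_reversals=12, max_trials=80):
    """Adaptive staircase: the inter-stimulus distance shrinks by `down`
    steps after a correct response and grows by `up` steps after an
    incorrect one; both step sizes are halved after every four reversals;
    the run ends after `max_reversals` reversals or `max_trials` trials;
    the JND is the mean distance at the final four reversals."""
    distance = start
    reversals = []          # distance recorded at each direction reversal
    last_direction = None   # -1 = narrowing, +1 = widening
    for _ in range(max_trials):
        direction = -1 if respond_correct(distance) else +1
        if last_direction is not None and direction != last_direction:
            reversals.append(distance)
            if len(reversals) % 4 == 0:
                down, up = down / 2, up / 2
            if len(reversals) >= max_reversals:
                break
        last_direction = direction
        distance = max(1, distance - down if direction == -1 else distance + up)
    return sum(reversals[-4:]) / len(reversals[-4:])

# Hypothetical listener with a true threshold of 12 continuum steps
jnd = run_staircase(lambda d: d >= 12)
```

With this deterministic listener the estimated JND converges near the simulated threshold, which is the behavior the staircase is designed to produce.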
Somatosensory measures.
Following Steele et al. (2014), the outcome measure for the oral stereognosis task was determined based on the mean letter size (MLS) across all correct responses. Task administration was completed after 28 trials or eight reversals in the adaptive staircase described above, whichever came first. As smaller letter sizes are harder to identify, a lower MLS is interpreted as representing a higher degree of somatosensory acuity.
The outcome measure for the phonetic awareness task was the percentage of correct responses across all trials, such that higher scores indicate higher proprioceptive awareness.
For the bite block adaptation task, the first and second formants (F1 and F2) were measured within the vowel from each production during both baseline and bite block trials. Trained research assistants examined recordings of each participant in Praat and selected formant settings (e.g., 5 formants with a 5500 Hz ceiling) that were judged to yield the best alignment between automated formant tracking and the visible areas of energy concentration in the spectrogram for each target vowel. A representative point within the steady-state region of each vowel was marked and formants were extracted and Bark-transformed as described above for the original study. Mean F1 and F2 frequencies in the baseline condition were calculated for each vowel target for each participant. Then we calculated the Euclidean distance in F1-F2 space between each token produced in the bite block condition and the baseline mean for that vowel target and participant. This value was averaged across tokens and across the four vowels elicited for each individual. Thus, the primary outcome measure for the bite block task was mean ED in F1-F2 space, where a smaller ED indicates a greater degree of compensation for the bite block perturbation. Reliability was evaluated by having a second trained student re-measure 20% of the tokens; the correlation between the two raters' measurements was 0.86, suggesting adequate inter-rater agreement.
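As a rough sketch of this outcome computation (illustrative only; the Bark formula shown is Traunmüller's (1990) transform, a common choice, and the exact transform used in the original study is assumed rather than confirmed):

```python
import math

def bark(f_hz):
    # Traunmüller (1990) Bark transform -- one common formula; the
    # original study's exact transform is an assumption here.
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def mean_bite_block_ed(baseline, bite_block):
    """baseline / bite_block: dicts mapping vowel -> list of (F1, F2)
    tuples in Hz (illustrative data structures). Returns the mean
    Euclidean distance in Bark-scaled F1-F2 space between each bite
    block token and the speaker's baseline mean for that vowel,
    averaged across tokens and then across vowels. Smaller values
    indicate more complete compensation for the perturbation."""
    per_vowel = []
    for vowel, tokens in bite_block.items():
        base = [(bark(f1), bark(f2)) for f1, f2 in baseline[vowel]]
        mu1 = sum(b[0] for b in base) / len(base)
        mu2 = sum(b[1] for b in base) / len(base)
        eds = [math.hypot(bark(f1) - mu1, bark(f2) - mu2)
               for f1, f2 in tokens]
        per_vowel.append(sum(eds) / len(eds))
    return sum(per_vowel) / len(per_vowel)
```

A bite block token close to the baseline mean yields an ED near zero; tokens far from baseline inflate the score.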
Analysis
To test the hypothesis of an association between somatosensory acuity and performance on the vowel learning task, three linear regression models were fit in R (R Core Team, 2019), one for each somatosensory measure (stereognosis, phonetic awareness, or bite block). In each model, the dependent variable was the change in production accuracy (median ED between the participant’s production and the native speaker target) from pre- to post-training for the vowel /y/, as measured in the original study by Li et al. (2019). The independent variables were somatosensory acuity (as measured by the oral stereognosis, phonetic awareness, or bite block tasks) and auditory acuity (as measured by the discrimination task from the original study). Although our primary research question pertains to somatosensory acuity, we included auditory acuity as a controlled covariate because of its central role in models of sensorimotor control of speech (e.g., Guenther, 2016). In addition, previous empirical research has established the importance of auditory perception in producing the sounds of a second language (Bradlow, Pisoni, Akahane-Yamada, & Tohkura, 1997; Flege, 1995). No interactions of theoretical interest were predicted, and therefore, none were included. No correction for multiple comparisons was undertaken due to the small sample size of this pilot investigation, which represents a limitation of the present study.
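The model structure can be illustrated as follows. The authors fit these models in R; this Python ordinary-least-squares sketch is an illustrative equivalent, not the study's actual analysis code, and the function name is hypothetical.

```python
import numpy as np

def fit_change_model(change_ed, somato, auditory):
    """OLS fit mirroring the structure of the three R models: change in
    /y/ production accuracy regressed on one somatosensory measure plus
    auditory acuity (JND), with no interaction term. Returns
    (intercept, beta_somatosensory, beta_auditory)."""
    # Design matrix: intercept column plus the two acuity predictors.
    X = np.column_stack([np.ones(len(somato)), somato, auditory])
    betas, *_ = np.linalg.lstsq(X, np.asarray(change_ed, dtype=float),
                                rcond=None)
    return tuple(betas)
```

On noiseless synthetic data the fit recovers the generating coefficients exactly, which is a convenient sanity check of the design matrix.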
The Akaike and Bayesian Information Criteria (AIC/BIC) were used to select the best-fitting of the three regression models. AIC and BIC are goodness-of-fit measures that take into account the number of predictors in each model (Cohen, Cohen, West, & Aiken, 2013). Both AIC and BIC penalize the log-likelihood of the data by accounting for the cost of estimating the parameters included in the model, with lower values interpreted as indicating a better fit.
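For a Gaussian linear model, both criteria can be computed directly from the residual sum of squares. The helper below is an illustrative sketch of those standard formulas (AIC = 2k − 2logL, BIC = k·ln(n) − 2logL), not code from the study; k counts all estimated parameters, including the residual variance.

```python
import math

def aic_bic(rss, n, k):
    """AIC and BIC for a Gaussian linear model, computed from the
    residual sum of squares (rss), sample size (n), and parameter
    count (k). For the models here, k = 4: intercept, two slopes,
    and the residual variance. Lower values = better fit after
    penalizing model complexity."""
    # Concentrated Gaussian log-likelihood with sigma^2 = rss / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    return aic, bic
```

Note that with n = 20, ln(20) ≈ 3 > 2, so BIC penalizes each parameter more heavily than AIC, as reflected in Table 2.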
Finally, to test the hypothesis that the bite block and phonetic awareness tasks would show the strongest pairwise correlation, we explored pairwise correlations among the three somatosensory measures. Complete data and code to reproduce all figures and analyses can be retrieved at https://osf.io/2h9jr/; complete model output can be found in the online supplement to this paper.
RESULTS
Descriptive Statistics
Table 1 provides descriptive statistics for all measures obtained in the present study, including the three somatosensory tasks, auditory acuity, and change in production accuracy. The scatterplots in Figure 4 show the relationship between each somatosensory measure and change in production accuracy.
Table 1.
Mean and SD Values for All Measured Variables from Respective Tasks
| Task | Measure | Mean | SD |
|---|---|---|---|
| Stereognosis | Mean letter size | 4.31 | 1.08 |
| Phonetic Awareness | % correct | 76.80 | 12.30 |
| Bite Block | Euclidean Distance | 0.87 | 0.25 |
| Auditory Acuity | Just Noticeable Difference | 18.30 | 15.70 |
| Production Accuracy | Change in Euclidean Distance | −0.36 | 0.67 |
Figure 4.

Relationships between somatosensory scores and change in production accuracy.
Inferential Statistics
In the model examining stereognosis as the measure of somatosensory acuity, there was no significant association between mean letter size and change in production accuracy (β = −0.167, SE = 0.142, p = 0.256). The controlled covariate of auditory acuity was also nonsignificant (β = 0.016, SE = 0.010, p = 0.114). Likewise, in the model examining the phonetic awareness task as the measure of somatosensory acuity, there was no effect of percentage accuracy (β = 0.011, SE = 0.013, p = 0.395) and no effect of auditory acuity (β = 0.015, SE = 0.010, p = 0.138). However, in the model examining performance on the bite block task, the association between the somatosensory measure (mean ED in the bite block condition) and change in production accuracy was significant (β = −1.582, SE = 0.605, p = 0.018). The negative direction of the coefficient indicates that speakers who showed a relatively large mean ED in the bite block condition (suggestive of weaker somatosensory acuity) tended to show a greater magnitude of change in the speech learning task (see Figure 4). There was no effect of auditory acuity in this model (β = 0.000, SE = 0.010, p = 0.984).
Table 2 displays the AIC and BIC values for the three regression models. AIC and BIC values were comparable across the models examining performance on the stereognosis and phonetic awareness tasks, but both AIC and BIC were lower for the model examining performance on the bite block task (see Table 2). Although the observed differences in AIC and BIC are small, they align with the results of the hypothesis tests reported above in suggesting that performance on the bite block task was more successful than the other two measures in accounting for variance in performance on the speech learning task.
Table 2.
AIC and BIC Values for the Three Models
| Somatosensory Acuity Measures | AIC | BIC |
|---|---|---|
| Stereognosis | 44.01 | 47.99 |
| Phonetic Awareness | 44.69 | 48.68 |
| Bite Block | 38.81 | 42.79 |
A final set of analyses examined pairwise correlations between scores on the various somatosensory measures. There was no significant association between performance on the stereognosis and phonetic awareness tasks (ρ(18) = −0.431, p = 0.058), between performance on the stereognosis and bite block tasks (ρ(18) = 0.188, p = 0.427), or between performance on the phonetic awareness and bite block tasks (ρ(18) = 0.092, p = 0.699). The scatterplots in Figure 5 depict the pairwise relationships among the three somatosensory measures. Please see Supplemental Materials for full model output.
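The ρ notation above indicates Spearman's rank correlation. A minimal tie-free implementation (an illustrative sketch, not the study's R code) computes the Pearson correlation of the rank orders:

```python
def spearman_rho(x, y):
    """Spearman's rank correlation without tie handling (adequate for
    continuous acuity measures): the Pearson correlation computed on
    the rank orders of x and y."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den
```

Any strictly monotonic relationship yields ρ = ±1 regardless of whether the underlying association is linear, which is why a rank-based measure suits acuity scores measured on different scales.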
Figure 5.

Pairwise relationships between somatosensory scores across tasks.
DISCUSSION
The primary purpose of this study was to determine which of three measures of somatosensory acuity (stereognosis task, phonetic awareness task, or bite block task) was most strongly associated with a measure of speech learning ability, while controlling for auditory acuity. To do this, we examined participants’ performance on these three somatosensory tasks in relation to performance on a lab-based Mandarin vowel learning task incorporating visual biofeedback. Of the three measures administered, only the bite block task showed a significant association with speech learning performance. The direction of the association was negative, suggesting that individuals whose performance was consistent with lower somatosensory acuity exhibited a greater magnitude of change over the course of the vowel learning task. There were no significant pairwise correlations between the three somatosensory measures, suggesting that each task tapped into a distinct somatosensory skill. More research with a larger sample size is needed to determine whether the results observed in the present pilot study are robust.
Comparison among Somatosensory Measures
Among the three somatosensory tasks administered to the 20 adult learners in the present study, only the bite block task exhibited a significant association with magnitude of change in the vowel learning task from Li et al. (2019); it was also favored by AIC/BIC model comparison criteria. Participants were observed to vary widely in their performance on this task, with some exhibiting large amounts of perturbation in vowel formant frequencies with the bite block in place and others showing relatively little perturbation. This suggests that some participants were able to compensate for the presence of the bite block even in the absence of auditory feedback, consistent with previous research (Zandipour et al., 2006). Interpreting this result within a speech-motor control framework, it appears that these participants were successful in using somatosensory feedback to adjust their feedforward motor plan for each vowel target.
Although larger-scale data collection is needed to substantiate these pilot results, the finding that the bite block task was the best predictor of speech learning performance is not entirely surprising. As noted above, we reasoned that tasks evaluating oral proprioceptive awareness would be more likely than tactile tasks to show a significant association with change in production accuracy for the vowel targets from Li et al. (2019). Both the bite block task and the phonetic awareness task were intended to assess proprioceptive awareness of the tongue, whereas stereognosis is a tactile measure. Comparing the bite block task and the phonetic awareness task against one another, it is noteworthy that the phonetic awareness task requires explicit metalinguistic reflection on the position of the tongue for different sounds, whereas the bite block task taps participants’ ability to make implicit adjustments for the presence of a mechanical perturbation. It is reasonable to think that the latter process is more closely related to the ability to acquire new sensorimotor mappings as part of natural language acquisition. As a final possibility, we note that among the three tasks studied, only the bite block task used a measure of speech production as the outcome variable. This could also have contributed to the finding that this task was the best predictor of speech motor learning.
Finally, we examined pairwise correlations between tasks and hypothesized that the two proprioceptive measures, the bite block task and the phonetic awareness task, would show the strongest correlation. In fact, no between-task correlations were significant in our small sample, and the strongest nonsignificant correlation was between the stereognosis task and the phonetic awareness task. Again, the implicit versus explicit nature of the tasks may be relevant to this finding. Both the stereognosis and phonetic awareness tasks required participants to make an explicit selection from a pool of responses, which is inherently quite different from the implicit articulatory adjustments required by the bite block task. Thus, it is possible that the shared variance between the phonetic awareness and stereognosis tasks reflected the degree of attention and strategy that participants brought to the performance of experimental tasks.
Relationship between Sensory Acuity and Speech Learning
In the introduction, we noted that either of two possible associations between somatosensory acuity and speech-motor learning performance could be theoretically defended. One possibility was that speakers with higher acuity would show better performance in the speech learning task, since previous research has indicated that higher somatosensory acuity tends to support greater precision in speech production (e.g., Ghosh et al., 2010). The alternative possibility was that individuals with weaker somatosensory acuity would derive greater relative benefit from the sensory enhancement provided by biofeedback training, and they would thus exhibit a greater magnitude of change in the production training task in the original study by Li et al. (2019). The significant negative association between somatosensory acuity and change in production accuracy that was observed for the bite block task was consistent with the latter hypothesis. While the other two somatosensory measures did not show a statistically significant relationship to change in production accuracy, it is worth noting that the nonsignificant trend for each was in a direction theoretically consistent with the result from the bite block task (see Figure 4). For the stereognosis task, speakers with relatively high mean letter size, suggestive of lower somatosensory acuity, tended to show a relatively large change in production accuracy. Likewise, speakers with relatively poor performance on the phonetic awareness task tended to show a greater magnitude of change in production accuracy than speakers with higher scores. Although these nonsignificant trends do not constitute strong evidence, all of the results of the present study point toward an interpretation where participants with relatively low somatosensory acuity tended to derive greater benefit from the training provided in Li et al. (2019) than participants with higher somatosensory acuity.
It is worthwhile at this juncture to reflect further on the nature of the training provided to participants in Li et al. (2019). Recall that participants in the original study were assigned to one of two biofeedback training conditions: visual-acoustic or ultrasound. As described previously, visual-acoustic biofeedback provides a real-time display of the spectrum of speech, which participants attempted to match to a visual target representing a native speaker’s F1 and F2 for the vowel in question. Ultrasound biofeedback provides a real-time image of the tongue that the participant attempted to align with a visual trace of a native speaker’s tongue for each vowel. Because these two biofeedback types differ in the type of information provided, Li et al. (2019) hypothesized an interaction between sensory acuity and biofeedback type. Specifically, they predicted that response to visual-acoustic biofeedback, which augments feedback in the auditory-acoustic domain, would show a significant association with auditory acuity but not somatosensory acuity. Likewise, they reasoned that ultrasound biofeedback provides visual information about articulator placement that serves to augment feedback that is ordinarily received through somatosensory channels; they thus predicted that response to ultrasound biofeedback treatment would show a significant association with somatosensory acuity but not auditory acuity. However, Li et al. (2019) did not find evidence in support of the hypothesized interaction in their original study. They also found no significant difference in training outcomes between the two biofeedback conditions. Because of these null results, and because statistical power was limited by the small sample size of this follow-up study, we did not examine biofeedback type as a predictor in the present research. Nevertheless, we acknowledge the theoretical rationale behind the claim that different forms of biofeedback target different sensory domains.
Pooling across the two biofeedback conditions may have thus diminished our capacity to detect an association between somatosensory acuity and treatment response. Such an association might be more robustly observed in a study where all participants received ultrasound biofeedback training.
Limitations and Future Directions
There are important limitations that should be considered when interpreting the results from this study. First, the three tasks studied were selected partly on the basis of ease of administration. While this has the potential to increase the clinical relevance of the research, it also means that some compromises were made with regard to construct validity. For example, as noted above, the phonetic awareness task recruits a significant degree of metalinguistic skill, meaning that participants who have had greater exposure to the concept of articulator placement (e.g., through second-language pronunciation instruction) may have heightened awareness of their articulation and therefore perform better on this task.
To measure oral tactile acuity, we used the stereognosis task from Steele et al. (2014) because the stimulus materials in that study were described in a level of detail that allowed us to reproduce them.2 However, the use of letters rather than other shapes represents a limitation: the plastic strips with embossed letters are presented upside down (i.e., with the top of the letter pointing to the back of the mouth), which occasionally leads to confusion regarding the orientation of the letter. This means that individuals with strong spatial awareness skill, facilitating mental rotation of letters, may be more successful at this task. In addition, the task from Steele et al. (2014) uses the same seven letters for each step in the adaptive staircase, whereas previous studies have used as many as 20 distinct forms (e.g., Fucci & Robertson, 1971). A wider response set is desirable to prevent participants from developing guessing strategies based on the limited set of response options. Future research should also compare different measures of oral tactile acuity, such as stereognosis, vibrotactile sensation thresholds, and two-point discrimination, both against one another and in relation to speech learning outcomes.
For the bite block task, auditory masking was used to minimize participants’ access to auditory feedback, but it was not possible to guarantee complete masking in all cases. In particular, while participants were prompted to maintain a quiet vocal volume, they were not all equally successful in meeting this requirement. Individuals who spoke louder would potentially have a greater degree of self-hearing than those who were more successful in maintaining a quiet vocal volume. Thus, individual differences in performance on this task may reflect differences in the availability of auditory feedback in addition to differences in somatosensory acuity.
A second limitation pertains to the small size of the sample (n = 20) in this pilot study. A minimum sample size of 30 is often recommended for statistical inference (e.g., Hogg, Tanis, & Zimmerman, 2010). However, it was not possible to achieve this standard in the present study because recruitment was restricted to the sample of participants who completed the original study by Li et al. (2019). In addition, due to the difficulty of achieving an adequate sample size, two participants were retained who were excluded from the data reported by Li et al. because they represented outliers relative to group mean performance. A larger sample size would also allow for a wider range of analyses beyond the present investigation of change in accuracy from pre- to post-training; for instance, we could examine post-training accuracy as the outcome variable, with independent variables of pre-training accuracy and sensory acuity. Finally, the analyses reported here were not corrected for multiple comparisons due to the low statistical power of this pilot study, which raises the risk of a false positive finding; future studies should introduce such corrections.
The present sample is also limited in that all participants were female, which prevents generalization of findings to the larger population. Subsequent research should use a larger sample including both male and female participants, especially in light of previous findings suggesting that patterns of sensory response may differ by sex (Cialdella et al., 2020; Hammer & Krueger, 2014). A follow-up study including child participants is also desirable as a means to understand whether differences in somatosensory acuity are present across different age groups. In addition, future research should administer these somatosensory tasks to matched groups of individuals with and without speech disorder. This would allow us to replicate existing findings of group differences in oral tactile acuity (e.g., Attanasio, 1987; Fucci, 1972; Fucci & Robertson, 1971; McNutt, 1977) and ask whether such differences can also be found in tasks assessing the proprioceptive domain. Finally, it will be important to conduct research examining how differences in somatosensory acuity are associated with response to different forms of treatment. In the present small-scale study, participants with poor performance on the bite block task tended to respond well to the biofeedback-enhanced training provided in Li et al. (2019). Clinically, we would like to know if treatment with enhanced sensory feedback, such as ultrasound biofeedback, can be recommended for individuals with low somatosensory acuity. However, before making any judgments on this subject, it will be necessary to evaluate how individuals with different somatosensory profiles respond to treatment both with and without biofeedback.
CONCLUSION
This pilot study offers preliminary insights into the measurement of oral somatosensory acuity in relation to speech learning ability. Out of the three tasks evaluated, only one—compensation for a bite block in the presence of auditory masking—was significantly associated with performance on a task of learning to produce non-native vowel sounds. If this result proves robust in larger-scale studies, it would suggest that tasks measuring proprioceptive somatosensory feedback may be preferable to tactile acuity measures, at least when vowels are the targets of interest. In addition, tasks that assess somatosensory knowledge implicitly, such as the bite block task, may be more suitable than tasks with a substantial metalinguistic component, such as the phonetic awareness task used here. Our findings also suggest that participants with lower somatosensory acuity tended to show a greater magnitude of change over the course of a brief period of biofeedback-enhanced training than individuals with higher somatosensory acuity. Further research is needed to understand how individuals with different profiles respond to different forms of speech production training. Although a considerable amount of research is still needed on this topic, this study represents one step toward a deeper understanding of somatosensory acuity that could ultimately inform the clinical management of speech disorders.
Supplementary Material
Supplemental material 1: Full model output.
Supplemental material 2: Full list of stimuli in the phonetic awareness task.
Acknowledgements
The authors gratefully acknowledge members of the Biofeedback Intervention Technology for Speech Lab (BITS Lab) at NYU for assistance with data collection and analysis, and all participants for their time. Special thanks to Laine Cialdella and Lauren Bergman for formant measurements and to Twylah Campbell for recruiting and scheduling all participants.
Funding:
This research was supported by the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health under Grant R01DC017476 (T. McAllister, PI) and Grant F31DC018197 (H. Kabakoff, PI).
Footnotes
Conflict of Interest Statement: Authors have no relevant conflicts of interest to report.
1. Oral stereognosis tasks are not perfect measures of tactile acuity because they recruit other skills such as spatial awareness, as discussed in more detail below. However, we include this task because it has been widely adopted as a measure of tactile function in previous literature (e.g., Trudeau-Fisette, Ito, & Ménard, 2019).
2. The commercially available materials described in Etter et al. (2017) also meet this criterion and should be considered in future research along the lines of the present study.
REFERENCES
- Adler-Bock M, Bernhardt BM, Gick B, & Bacsfalvi P (2007). The use of ultrasound in remediation of North American English /r/ in 2 adolescents. American Journal of Speech-Language Pathology, 16(2), 128–139. 10.1044/1058-0360(2007/017) [DOI] [PubMed] [Google Scholar]
- Arndt WB, Elbert M, & Shelton RL (1970). Standardization of a test of oral stereognosis. In Second Symposium on Oral Sensation and Perception (pp. 379–383). Springfield, IL: Charles C. Thomas. [Google Scholar]
- Attanasio JS (1987). Relationships between oral sensory feedback skills and adaptation to delayed auditory feedback. Journal of Communication Disorders, 20(5), 391–402. [DOI] [PubMed] [Google Scholar]
- Audacity Team. (2019). Audacity(R): Free audio editor and recorder (Version 2.3.3) [Computer application]. Retrieved from https://audacityteam.org/.
- Baum SR, & McFarland DH (1997). The development of speech adaptation to an artificial palate. The Journal of the Acoustical Society of America, 102(4), 2353–2359. [DOI] [PubMed] [Google Scholar]
- Berryman LJ, Yau JM, & Hsiao SS (2006). Representation of object size in the somatosensory system. Journal of Neurophysiology, 96(1), 27–39. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Boersma P, & Weenink D (2019). Praat: Doing phonetics by computer (Version 6.1.14) [Computer program] Retrieved from http://www.praat.org/. [Google Scholar]
- Bradlow AR, Pisoni DB, Akahane-Yamada R, & Tohkura Y (1997). Training Japanese listeners to identify English /r/ and /l/: IV. Some effects of perceptual learning on speech production. The Journal of the Acoustical Society of America, 101(4), 2299–2310. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chen Y, Robb M, Gilbert H, & Lerman J (2001). Vowel production by Mandarin speakers of English. Clinical Linguistics & Phonetics, 15(6), 427–440. [Google Scholar]
- Cialdella L, Kabakoff H, Preston J, Dugan S, Spencer C, Boyce S, Tiede M, Whalen D, & McAllister T (2020). Auditory-perceptual acuity in rhotic misarticulation: Baseline characteristics and treatment response. Clinical Linguistics & Phonetics, 1–24. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Cohen J, Cohen P, West SG, & Aiken LS (2013). Applied multiple regression/correlation analysis for the behavioral sciences. New York, NY: Routledge. [Google Scholar]
- Etter NM, Miller OM, & Ballard KJ (2017). Clinically available assessment measures for lingual and labial somatosensation in healthy adults: Normative data and test reliability. American Journal of Speech-Language Pathology, 26(3), 982–990. [DOI] [PubMed] [Google Scholar]
- Feng Y, Gracco VL, & Max L (2011). Integration of auditory and somatosensory error signals in the neural control of speech movements. Journal of Neurophysiology, 106(2), 667–679. 10.1152/jn.00638.2010 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Flege JE (1995). Second language speech learning: Theory, findings, and problems. Speech Perception and Linguistic Experience: Issues in Cross-Language Research, 92, 233–277. [Google Scholar]
- Flipsen P (2015). Emergence and prevalence of persistent and residual speech errors. Seminars in Speech and Language, 36(4), 217–223. 10.1055/s-0035-1562905 [DOI] [PubMed] [Google Scholar]
- Fucci D (1972). Oral vibrotactile sensation: An evaluation of normal and defective speakers. Journal of Speech and Hearing Research, 15(1), 179–184. [DOI] [PubMed] [Google Scholar]
- Fucci DJ, & Robertson JH (1971). “Functional” defective articulation: An oral sensory disturbance. Perceptual and Motor Skills, 33(3), 711–714. [DOI] [PubMed] [Google Scholar]
- Fucci D, Petrosino L, Gorman P, & Harris D (1985). Vibrotactile magnitude production scaling: A method for studying sensory-perceptual responses of stutterers and fluent speakers. Journal of Fluency Disorders, 10(1), 69–75. [Google Scholar]
- Fuentes CT, & Bastian AJ (2010). Where is your arm? Variations in proprioception across space and tasks. Journal of Neurophysiology, 103(1), 164–171. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Gammon SA, Smith PJ, Daniloff RG, & Kim CW (1971). Articulation and stress/juncture production under oral anesthetization and masking. Journal of Speech and Hearing Research, 14(2), 271–282. 10.1044/jshr.1402.271 [DOI] [PubMed] [Google Scholar]
- Ghosh SS, Matthies ML, Maas E, Hanson A, Tiede M, Ménard L, Guenther FH, Lane H, & Perkell JS (2010). An investigation of the relation between sibilant production and somatosensory and auditory acuity. The Journal of the Acoustical Society of America, 128(5), 3079–3087. 10.1121/1.3493430 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Guenther FH (2016). Neural Control of Speech. MIT Press. [Google Scholar]
- Hammer MJ, & Krueger MA (2014). Voice-related modulation of mechanosensory detection thresholds in the human larynx. Experimental Brain Research, 232(1), 13–20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hickok G (2012). Computational neuroanatomy of speech production. Nature Reviews Neuroscience, 13(2), 135–145. 10.1038/nrn3158 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hogg RV, Tanis EA, & Zimmerman DL (2010). Probability and statistical inference. Pearson/Prentice Hall. [Google Scholar]
- Houde JF, & Nagarajan SS (2011). Speech production as state feedback control. Frontiers in Human Neuroscience, 5(82). 10.3389/fnhum.2011.00082 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Hutchinson JM, & Ringel RL (1975). The effect of oral sensory deprivation on stuttering behavior. Journal of Communication Disorders. 10.1016/0021-9924(75)90017-9 [DOI] [PubMed] [Google Scholar]
- Jacobs R, Serhal CB, & van Steenberghe D (1998). Oral stereognosis: A review of the literature. Clinical Oral Investigations, 2(1), 3–10. [DOI] [PubMed] [Google Scholar]
- Kartushina N, Hervais-Adelman A, Frauenfelder UH, & Golestani N (2015). The effect of phonetic production training with visual feedback on the perception and production of foreign speech sounds. The Journal of the Acoustical Society of America, 138(2), 817–832. 10.1121/1.4926561 [DOI] [PubMed] [Google Scholar]
- Kawahara H, Morise M, Banno H, & Skuk VG (2013). Temporally variable multi-aspect N-way morphing based on interference-free speech representation. Paper presented at the Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific. [Google Scholar]
- Kendall T, Thomas ER, & Kendall MT (2018). Package “vowels” in R (Version 1.2.2) Retrieved from https://cran.r-project.org/web/packages/vowels/vowels.pdf. [Google Scholar]
- Lametti DR, Nasir SM, & Ostry DJ (2012). Sensory preference in speech production revealed by simultaneous alteration of auditory and somatosensory feedback. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 32(27), 9351–9358. 10.1523/JNEUROSCI.0404-12.2012 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lennes M (2003). Collect_formant_data_from_files.praat [Praat script], from http://www.helsinki.fi/~lennes/praat-scripts/public. [Google Scholar]
- Li JJ, Ayala S, Harel D, Shiller DM, & McAllister T (2019). Individual predictors of response to biofeedback training for second-language production. The Journal of the Acoustical Society of America, 146(6), 4625–4643. 10.1121/1.5139423 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Loucks TMJ, & De Nil LF (2006). Oral kinesthetic deficit in adults who stutter: A target-accuracy study. Journal of Motor Behavior, 38(3), 238–247. [DOI] [PubMed] [Google Scholar]
- McAllister Byun T, & Hitchcock ER (2012). Investigating the use of traditional and spectral biofeedback approaches to intervention for /r/ misarticulation. American Journal of Speech-Language Pathology, 21(3), 207–221. [DOI] [PubMed] [Google Scholar]
- McAllister T, Preston JL, Hitchcock ER, & Hill J (2020). Protocol for Correcting Residual Errors with Spectral, ULtrasound, Traditional Speech therapy Randomized Controlled Trial (C-RESULTS RCT). BMC Pediatrics, 20(1), 66. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McNutt JC (1977). Oral sensory and motor behaviors of children with /s/ or /r/ misarticulations. Journal of Speech and Hearing Research, 20(4), 694–703. [DOI] [PubMed] [Google Scholar]
- Nasir SM, & Ostry DJ (2006). Somatosensory precision in speech production. Current Biology, 16(19), 1918–1923. 10.1016/j.cub.2006.07.069 [DOI] [PubMed] [Google Scholar]
- Nasir SM, & Ostry DJ (2008). Speech motor learning in profoundly deaf adults. Nature Neuroscience, 11, 1217–1222. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ouni S (2014). Tongue control and its implication in pronunciation training. Computer Assisted Language Learning, 27(5), 439–453. 10.1080/09588221.2012.761637 [DOI] [Google Scholar]
- Parrell B, Ramanarayanan V, Nagarajan S, & Houde J (2019). The FACTS model of speech motor control: Fusing state estimation and task-based control. PLOS Computational Biology, 15(9), e1007321. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Peirce JW, Gray JR, Simpson S, MacAskill MR, Höchenberger R, Sogo H, Kastman E, & Lindeløv JK (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods, 51(1), 195–203. 10.3758/s13428-018-01193-y [DOI] [PMC free article] [PubMed] [Google Scholar]
- Preston JL, Brick N, & Landi N (2013). Ultrasound biofeedback treatment for persisting childhood apraxia of speech. American Journal of Speech-Language Pathology, 22(4), 627–643. 10.1044/1058-0360(2013/12-0139) [DOI] [PubMed] [Google Scholar]
- Proske U, Tsay A, & Allen T (2014). Muscle thixotropy as a tool in the study of proprioception. Experimental Brain Research, 232(11), 3397–3412. [DOI] [PubMed] [Google Scholar]
- Putnam AHB, & Ringel RL (1976). A cineradiographic study of articulation in two talkers with temporarily induced oral sensory deprivation. Journal of Speech and Hearing Research, 19(2), 247–266. 10.1044/jshr.1902.247 [DOI] [PubMed] [Google Scholar]
- R Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/. [Google Scholar]
- Ringel RL, & Steer MD (1963). Some effects of tactile and auditory alterations on speech output. Journal of Speech and Hearing Research, 6(4), 369–378. [DOI] [PubMed] [Google Scholar]
- Steele CM, Hill L, Stokely S, & Peladeau-Pigeon M (2014). Age and strength influences on lingual tactile acuity. Journal of Texture Studies, 45(4), 317–323. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Stewart C, Evans WB, & Fitch JL (1985). Oral form perception skills of stuttering and nonstuttering children measured by stereognosis. Journal of Fluency Disorders, 10(4), 311–316. 10.1016/0094-730X(85)90029-4 [DOI] [Google Scholar]
- Sugden E, Lloyd S, Lam J, & Cleland J (2019). Systematic review of ultrasound visual biofeedback in intervention for speech sound disorders. International Journal of Language & Communication Disorders, 54(5), 705–728. [DOI] [PubMed] [Google Scholar]
- Tremblay S, Shiller DM, & Ostry DJ (2003). Somatosensory basis of speech production. Nature, 423(6942), 866–869. 10.1038/nature01710 [DOI] [PubMed] [Google Scholar]
- Trudeau-Fisette P, Ito T, & Ménard L (2019). Auditory and somatosensory interaction in speech perception in children and adults. Frontiers in Human Neuroscience, 13. 10.3389/fnhum.2019.00344 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Villacorta VM, Perkell JS, & Guenther FH (2007). Sensorimotor adaptation to feedback perturbations of vowel acoustics and its relation to perception. The Journal of the Acoustical Society of America, 122(4), 2306–2319. 10.1121/1.2773966 [DOI] [PubMed] [Google Scholar]
- Zandipour M, Perkell J, Guenther F, Tiede M, Honda K, & Murano E (2006). Speaking with a bite-block: Data and modeling. Proceedings of the 7th International Seminar on Speech Production, 361–368. [Google Scholar]
Associated Data
Supplementary Materials
Supplemental material 1: Full model output.
Supplemental material 2: Full list of stimuli in the phonetic awareness task.
