The Journal of the Acoustical Society of America. 2023 Nov 14;154(5):3089–3100. doi: 10.1121/10.0022387

Spectral analysis of strident fricatives in cisgender and transfeminine speakers

Nichole Houle, Mackenzie P Lerario, Susannah V Levi
PMCID: PMC10651311  PMID: 37962405

Abstract

The spectral features of /s/ and /ʃ/ carry important sociophonetic information regarding a speaker's gender. Often, gender is misclassified as a binary of male or female, but this excludes people who may identify as transgender or nonbinary. In this study, we use a more expansive definition of gender to investigate the acoustics (duration and spectral moments) of /s/ and /ʃ/ across cisgender men, cisgender women, and transfeminine speakers in voiced and whispered speech and the relationship between spectral measures and transfeminine gender expression. We examined /s/ and /ʃ/ productions in words from 35 speakers (11 cisgender men, 17 cisgender women, 7 transfeminine speakers) and 34 speakers (11 cisgender men, 15 cisgender women, 8 transfeminine speakers), respectively. In general, /s/ and /ʃ/ center of gravity was highest in productions by cisgender women, followed by transfeminine speakers, and then cisgender men speakers. There were no other gender-related differences. Within transfeminine speakers, /s/ and /ʃ/ center of gravity and skewness were not related to the time proportion expressing their feminine spectrum gender or their Trans Women Voice Questionnaire scores. Taken together, the acoustics of /s/ and /ʃ/ may signal gender group identification but may not account for within-gender variation in transfeminine gender expression.

I. INTRODUCTION

Gender-affirming voice training involves behaviorally adjusting speech and voice quality to alter the perceived femininity, masculinity, or androgyny of speakers. These adjustments may involve, but are not limited to, pitch, resonance, and intonation (Adler et al., 2018; Davies and Goldberg, 2006), which correspond to the acoustic measures of fundamental frequency (fo), formant frequencies, and fo contours. Particular goals vary across speakers, as gender-affirming care is client-specific and highly individualized.

Vowel articulation has received considerable attention within gender-affirming voice care for transfeminine speakers. Transfeminine speakers are encouraged to use a forward tongue carriage (Carew et al., 2007; Kawitzky and McAllister, 2020) and increase lip spreading (Carew et al., 2007; Cosyns et al., 2014; Kawitzky and McAllister, 2020) to increase their formant frequencies and auditory-perceptions of vocal femininity (Carew et al., 2007; Kawitzky and McAllister, 2020). Additionally, transfeminine speakers are encouraged to speak with greater precision. Focusing on only vowel articulation, this may include using easy onset of phonation (Adler et al., 2018; Davies et al., 2015) and longer vowels (Adler et al., 2018). The rationale for altering articulation and vocal gender expression is based on cisgender speech production and not studies that include transfeminine speakers. This is problematic as it centers anatomy as the primary factor affecting articulation. It excludes the influence of sociolinguistic factors (e.g., speech patterns allowing for group identification within some same-sex speakers) on speech production. Recently, Leyns et al. (2021) reported a bite block may also encourage speakers to raise formant frequencies and result in increased auditory-perceptual ratings of femininity. Taken together, these studies illustrate that vowel articulation is an important aspect of gender-affirming voice feminization, but also highlight a gap in the evidence for addressing consonant articulation in gender-affirming voice care. Within the current study, we report on the acoustics of two fricatives, /s/ and /ʃ/, produced by cisgender men, cisgender women, and transfeminine speakers in both voiced and whispered speech. Within transfeminine speakers, we also investigate the relationship between measures of gender expression and acoustic measures of /s/ and /ʃ/.

A. Vocal tract feature factors and acoustics of /s/ and /ʃ/ in voiced and whispered speech

The strident, voiceless, fricative /s/ has been identified as a sociophonetic cue used by speakers to index several aspects of identity. Differences in fricative production have been reported based on a binary definition of speaker sex [i.e., female versus male, excluding intersex persons (Fox and Nissen, 2005; Fuchs and Toda, 2010; Heffernan, 2004; Romeo et al., 2013)], gender identity [e.g., cisgender, transgender, nonbinary, agender (Hazenberg, 2016; Zimman, 2013)], sexual orientation (Mack and Munson, 2012; Munson et al., 2006; Zimman, 2013), language background (Fuchs and Toda, 2010; Gordon et al., 2002), socioeconomic status (Stuart-Smith, 2007, 2020), region [e.g., urban, rural (Calder and King, 2020; Podesva and Van Hofwegen, 2014)], and race (Calder and King, 2020). In other words, the acoustics of /s/ are not only influenced by anatomical differences but are controlled by speakers as one potential way to express their in-group versus out-group status regarding a particular identity they hold. The phoneme /ʃ/ has not been as closely linked with social identity as /s/, despite binary sex differences being reported (Fox and Nissen, 2005; Jongman et al., 2000; Romeo et al., 2013). The exact reason for this is unclear, but we may speculate that /s/ has a greater contribution to the active construction of gender expression than does /ʃ/. If true, the acoustics of /ʃ/ will provide insight into vocal tract dimensions involved in speech production whereas /s/ will provide insight into the active construction of gender expression.

When producing the fricatives /s/ and /ʃ/, a constriction between the tongue and palate is made either at or behind the alveolar ridge. The length of the resonating cavity in front of the constriction determines the resonance, with the more anterior /s/ having more energy at higher frequencies and the more posterior /ʃ/ having more energy at lower frequencies (Perkell et al., 2004; Yoshinaga et al., 2017). Speakers can also manipulate the length of the front resonating cavity by altering the degree of lip rounding [e.g., /ɑ/ or /i/ (Nittrouer, 1995; Shadle and Scully, 1995)].
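The link between front-cavity length and resonance can be illustrated with a quarter-wavelength resonator, a textbook idealization that is not part of the study itself. The sketch below assumes a uniform tube closed at the constriction and open at the lips, a speed of sound of about 35 000 cm/s, and hypothetical cavity lengths chosen only for illustration.

```python
# Quarter-wavelength approximation: f = c / (4 * L).
# Assumption: the front cavity behaves like a uniform tube that is
# closed at the constriction and open at the lips.
SPEED_OF_SOUND_CM_PER_S = 35000.0  # approximate value for warm, humid air


def front_cavity_resonance_hz(cavity_length_cm):
    """Return the lowest resonance (Hz) of a front cavity of the given length."""
    if cavity_length_cm <= 0:
        raise ValueError("cavity length must be positive")
    return SPEED_OF_SOUND_CM_PER_S / (4.0 * cavity_length_cm)


# Hypothetical lengths: a shorter (more anterior) and a longer
# (more posterior or lip-rounded) front cavity.
for length_cm in (1.5, 2.5):
    print(length_cm, "cm ->", round(front_cavity_resonance_hz(length_cm)), "Hz")
```

Under these assumptions, shortening the front cavity raises the resonance, which matches the direction of the /s/ versus /ʃ/ contrast described above. Real fricative spectra depend on more than a single tube resonance.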

Whispered speech is acoustically distinct from voiced speech (Eklund and Traunmüller, 1997; Ito et al., 2005; Jovičić and Šarić, 2008; Sharifzadeh et al., 2012), but speakers alter the vowel space areas similarly regardless of gender (Houle and Levi, 2020). One of the key differences between voiced and whispered vowels is that the second formant frequency (F2) is raised in whispered speech (Eklund and Traunmüller, 1997; Houle and Levi, 2020; Ito et al., 2005; Sharifzadeh et al., 2012). One study investigating Polish speech reported that voiceless fricatives were produced with higher frequencies in whispered speech than in voiced speech (Żygis et al., 2017). Taken together, converging evidence suggests that speakers may use a more forward tongue carriage and a smaller anterior oral cavity when producing whispered speech compared to voiced speech. To further study this association between vocal tract factors and the gendered qualities of speech production, we examined the acoustic properties (duration, center of gravity, variance, skew, kurtosis) of these two strident, voiceless fricatives in voiced and whispered speech. Our prespecified hypothesis is that /s/ may be more strongly related to the sociophonetic representation of gender expression and /ʃ/ more strongly related to properties of the vocal tract. Therefore, we expected that the acoustics of /s/ would be affected by the interaction of gender expression and speech condition, but the acoustics of /ʃ/ would not. Further, this study examined the relationship between two of these spectral characteristics (center of gravity and skew) of /s/ and /ʃ/ and feminine gender expression (proportion of time spent expressing a feminine spectrum gender, satisfaction with feminine vocal gender expression) within transfeminine speakers.

The following describes studies that vary in their terminology around the constructs of sex and gender. Unless otherwise specified, we will assume that the studies used a binary definition for sex and gender and presented participants with only two options for sex: female or male. To avoid imposing a gender identity, we will refer to participants in these studies by their sex assigned at birth (assigned female, assigned male). We will only use gendered terms (e.g., cisgender woman, transgender man) when referring to studies where gender identity and expression are described using gender-inclusive terminology.

B. Sociocultural variables in sex differences in /s/ and /ʃ/ fricatives

The spectral frequencies of /s/ and /ʃ/ may vary based on speaker sex. In English speaking adults, speakers assigned female tend to produce these phonemes with higher frequencies than those assigned male (Fox and Nissen, 2005; Fuchs and Toda, 2010; Heffernan, 2004; Jongman et al., 2000; Romeo et al., 2013). This has been attributed to the role of sexual dimorphism on vocal tract length (Fox and Nissen, 2005; Kahane, 1982), but the degree of the difference may be moderated by social or cultural factors (Avery and Liss, 1996; Calder and King, 2020; Li et al., 2016; Podesva and Van Hofwegen, 2014). Fuchs and Toda (2010) used electropalatography to investigate articulatory and acoustic differences in production of /s/ based on sex within English and German speakers. Although there were no sex differences in palate length, adults assigned female produced /s/ more anteriorly and with higher frequencies than those assigned male. This suggests that beyond anatomical differences between speakers assigned female and assigned male, social or cultural factors may influence /s/ and /ʃ/ production.

Sex differences in voiceless fricatives may differ cross-linguistically (Fuchs and Toda, 2010; Gordon et al., 2002; Heffernan, 2004). For example, Fuchs and Toda (2010) reported that English speakers had a larger sex difference in /s/ productions than German speakers. Similarly, Heffernan (2004) reported that English speakers had larger sex differences in /s/ productions than Japanese speakers. In contrast, Gordon et al. (2002) examined the voiceless fricatives (including /s/ and /ʃ/) of seven languages and reported sex differences in only one language. Given that definitions of gender may vary across cultures (Lang and Kuhnle, 2008), cross-linguistic differences may reflect cultural differences in the understanding and expression of gender.

For English speakers, sex differences in fricative production emerge early in life, even before the development of sexually dimorphic changes to the vocal tract (Munson et al., 2022). Munson et al. (2022) examined the production of speech in preschool children as young as 2.5 years. Children assigned female produced /s/ with higher frequencies than children assigned male. This is consistent with previous studies that identified sex differences in fricatives in speakers as young as 4 years (Li et al., 2016), an age that aligns with the onset of gender identity development (Bem, 1993). The magnitude of the frequency differences in /s/ increases as speakers age (Fox and Nissen, 2005; Li et al., 2016; Nittrouer, 1995; Romeo et al., 2013).

Fricative production also varies within sex. Avery and Liss (1996) investigated auditory-perceptual ratings of masculinity and their acoustic correlates within speakers assigned male. Speakers were recorded reading a standardized passage and listeners rated the masculinity of the recordings. Based on the masculinity ratings, speakers were divided into two groups: (a) less masculine and (b) more masculine. The results indicated that the more masculine group produced /s/ with lower frequencies than the less masculine speakers. The two groups did not differ in their /ʃ/ acoustics.

That said, sex differences in the production of /s/ in English may be moderated by racial identity. A recent study by Calder and King (2022) examined the acoustics of /s/ in African American and White speakers from Bakersfield, CA. The results indicated that White speakers assigned female produced /s/ with significantly higher frequencies than White speakers assigned male. In contrast, African American speakers assigned male and assigned female did not differ from each other. Further, African American speakers assigned male produced /s/ with similar frequencies to African American and White speakers assigned female, suggesting that White speakers assigned male were using a speech pattern that diverged from other members of society. Given that many of the studies cited in this paper do not report racial identity, we cannot comment on whether this is a common pattern found across speaker race. A similar trend was noted by Hazenberg (2016) when comparing /s/ productions by heterosexual cisgender men to other speaker groups. He hypothesized that by using lower /s/ frequencies, heterosexual cisgender men were separating themselves from non-heterosexual cisgender men speakers to preserve the power and social prestige inherent to their identities (Hazenberg, 2016). Overall, sex differences in the production of /s/ in English are likely the result of the multi-dimensional construction of identity rather than pre-determined by vocal tract features.

C. Sociocultural variables of gender expression and /s/ and /ʃ/ acoustics

The acoustics of /s/ and /ʃ/ signal more than sex assigned at birth and may be linked with other sociocultural variables, such as gender expression. Gender refers to the biological and social constructs used to describe identities and expression based on genetic, neuroanatomical, hormonal, biological, social, behavioral, cultural, and psychological traits (Butler, 1988, 2004; Muehlenhard and Peterson, 2011; Munson and Babel, 2019; Polderman et al., 2018; Ristori et al., 2020). Gender identity refers to a person's deeply-held, core concept of self, and gender expression is the manner in which a person signals their gender to the world around them (Butler, 2004). Interestingly, /s/ and /ʃ/ may vary based on a speaker's gender expression and signal identification within a social group.

Zimman (2013) investigated differences in /s/ production across speakers with masculine spectrum gender identities who varied in their sexual orientations. Three speaker groups were investigated: heterosexual cisgender men, homosexual cisgender men, and heterosexual transgender men. Speakers were recorded reading a standardized passage and only /s/ acoustics were measured. The results indicated that heterosexual cisgender men and transgender men produced /s/ with lower frequencies than homosexual cisgender men. The /s/ productions of heterosexual cisgender men and transgender men did not differ from each other. Considering that transgender men were assigned female, this result indicates that /s/ production varied by gender identity and sexual orientation rather than sex assigned at birth. One caveat was that the use of gender-affirming testosterone was not reported within the sample of transgender men. Gender-affirming testosterone has been associated with acoustic differences (Hodges-Simeon et al., 2021; Ziegler et al., 2018) indicative of anatomical changes to the larynx and vocal tract in speakers who were assigned female. However, it is unlikely that gender-affirming testosterone affected /s/ production. More recently, Zimman (2017) identified only a weak, positive correlation between /s/ production and gender-affirming hormones. The correlation was primarily attributed to speaker-specific changes. After at least 54 weeks on testosterone, only three out of 15 speakers produced /s/ with significantly lower frequencies. Taken together, the results indicated that sexual orientation rather than sex assigned at birth or the effects of gender-affirming testosterone accounted for differences in /s/ production within people who have masculine spectrum gender identities.

Expanding on this, Hazenberg (2016) examined the intersection of gender expression and sexual orientation on /s/ production. Speakers identified as heterosexual cisgender women, queer women, transgender women, heterosexual cisgender men, queer men, or transgender men. Although this study was inclusive of diverse sexual orientations and gender identities, it is unclear if the participants who identified as queer women and queer men identified as queer cisgender or queer transgender individuals, or both. Recordings of spontaneous speech were collected and productions of /s/ from the word “so” were analyzed. The results indicated that speakers with feminine gender identities produced /s/ with higher frequencies than speakers with masculine gender identities. The three groups with feminine spectrum gender identities produced higher frequencies than the three groups of masculine spectrum gender identities, indicating that speakers were signaling within-group identification for gender identity and sexual orientation within their /s/ productions. Further, heterosexual cisgender speakers produced the most extreme values, potentially differentiating themselves from the other sexual orientation and gender groups to reinforce the social prestige of hetero- and cis- normativity.

In summary, fricative production is affected by anatomical factors, such as vocal tract length and oral cavity size, and cultural and social factors, such as language background, gender identity, and sexual orientation. Speakers with feminine spectrum gender identities tend to use higher frequencies to produce /s/ and /ʃ/ than speakers with masculine spectrum gender identities. For the current study, we are interested in the duration and spectral characteristics of /s/ and /ʃ/ production across gender identity and whether gender identity is also signaled in whispered speech. We are also interested in how two of those spectral characteristics relate to gender expression (proportion of time spent expressing a feminine spectrum gender and satisfaction with vocal gender) within transfeminine speakers.

D. Current study

The current study investigated gender expression on duration and spectral features of phonologically voiceless, strident fricatives /s/ and /ʃ/ between voiced and whispered speech. Spectral analyses of fricatives typically examine four features: center of gravity (COG), variance, skewness, and kurtosis. As described in Jongman et al. (2000), COG reflects the frequency at which there is a primary concentration of energy and the variance is the standard deviation, or the distribution of acoustic energy around the mean. Skewness is an indicator of the distribution's symmetry or the spectral tilt. Positive skewness will indicate that there is a concentration of energy in lower frequencies and a negative spectral tilt whereas negative skewness indicates a concentration of energy in the higher frequencies and a positive spectral tilt. Kurtosis is an indicator of the peakedness of the distribution. Greater or positive kurtosis indicates a clearly defined distribution. Smaller values or negative kurtosis reflects a relatively flat distribution without clearly defined peaks (Forrest et al., 1988; Jongman et al., 2000). COG and skewness are the spectral measures most commonly linked to sex and gender differences where a higher COG and more skewed distribution are associated with feminine gender identities (Flipsen et al., 1999; Fox and Nissen, 2005; Fuchs and Toda, 2010; Heffernan, 2004). Gender expression was examined across three groups: cisgender men, cisgender women, and transfeminine speakers. Consistent with previous findings, we expect cisgender men speakers will produce /s/ and /ʃ/ with the lowest COG and least skew (closer to 0) whereas cisgender women speakers will produce /s/ and /ʃ/ with the highest COG and greatest skew. Transfeminine speakers will produce /s/ and /ʃ/ with greater COG and more skew than cisgender men, but less than cisgender women speakers, although we are unsure whether these differences will be significant.
We do not expect to find gender differences in the duration, variance, or kurtosis of /s/ and /ʃ/ productions. Additionally, we expect /s/ and /ʃ/ will be longer in whispered speech and characterized by higher COGs with greater skew. We do not expect variance or kurtosis to differ within whispered speech. Further, we expect that the acoustics of /s/ will be affected by the interaction of gender identity and speech condition, but the acoustics of /ʃ/ will not, based on our previous speculation that /s/ was more strongly related to the sociophonetic representation of gender identity and /ʃ/ was more strongly related to properties of the vocal tract.
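The four spectral moments described above can be sketched by treating a power spectrum as a probability distribution over frequency. This is a minimal illustration in Python, not the R scripts used in the study. It assumes the "variance" is reported as a standard deviation in Hz (as the definition above states) and that kurtosis is expressed as excess kurtosis (raw kurtosis minus 3), so that a flat spectrum can take negative values. The function name and the toy spectrum are hypothetical.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (cog, sd, skewness, excess_kurtosis) for a power spectrum.

    The power values are normalized to sum to 1 and treated as weights
    of a probability distribution over the given frequencies.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    weights = [value / total for value in power]

    # First moment: center of gravity (weighted mean frequency), in Hz.
    cog = sum(f * w for f, w in zip(freqs_hz, weights))

    # Central moments of order 2, 3, and 4 around the COG.
    m2 = sum(w * (f - cog) ** 2 for f, w in zip(freqs_hz, weights))
    m3 = sum(w * (f - cog) ** 3 for f, w in zip(freqs_hz, weights))
    m4 = sum(w * (f - cog) ** 4 for f, w in zip(freqs_hz, weights))
    if m2 == 0:
        raise ValueError("all energy is at one frequency; skew and kurtosis are undefined")

    sd = math.sqrt(m2)               # spread around the COG, in Hz
    skewness = m3 / m2 ** 1.5        # > 0: energy concentrated at lower frequencies
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # assumption: normalized by subtracting 3
    return cog, sd, skewness, excess_kurtosis


# Toy spectrum that is symmetric around 6000 Hz, so skewness is 0.
print(spectral_moments([4000.0, 6000.0, 8000.0], [1.0, 2.0, 1.0]))
```

A spectrum with most of its energy at low frequencies and a thin high-frequency tail yields positive skewness under this definition, which is the pattern the text associates with a negative spectral tilt.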

The secondary aim of this study was to examine the effect of gender expression within transfeminine speakers on the COG and skewness of /s/ and /ʃ/. Gender expression was measured in two ways. The first measure was the self-reported amount of time per week the speaker spent expressing their feminine spectrum gender identity. We hypothesized that speakers who spent more time expressing their feminine gender identity were more likely to produce /s/ and /ʃ/ with higher COG and greater negative skew than transfeminine speakers who spent less time expressing their feminine gender identity. The second measure was a standardized questionnaire about satisfaction with transfeminine gender expression and voice, the Trans Women Voice Questionnaire (TWVQ), formerly named Transsexual Voice Questionnaire Male to Female (Dacakis et al., 2017; Dacakis et al., 2016). Lower scores indicate fewer transfeminine voice concerns. Transfeminine speakers with lower TWVQ scores may report fewer voice concerns and therefore, may experience less gender dysphoria regarding their voice and are more satisfied with their gender expression (Novais Valente Junior and Mesquita de Medeiros, 2020). Assuming that transfeminine speakers who are more satisfied with their vocal gender expression use more feminine speech patterns than transfeminine speakers who are less satisfied, we hypothesize that transfeminine speakers with lower TWVQ scores will produce /s/ and /ʃ/ with higher COG and greater negative skew than transfeminine speakers with higher TWVQ scores.

II. METHODS

A. Sample

Speakers were chosen from a corpus of normally phonated and whispered speech collected from 2015 to 2017. Inclusion criteria were as follows: (a) native speakers of American English, (b) never lived outside the USA for more than 6 months, and (c) 18–40 years of age. A total of 36 speakers met these criteria. All speakers except for one transfeminine speaker denied a history of speech, language, and hearing services. Race and ethnicity are reported in Table I. The number of speakers differed between the two word lists because not all speakers had a set of whispered sVd or ʃVd words. As a result, the /s/ analysis included 35 speakers (11 cisgender men, 17 cisgender women, 7 transfeminine speakers) and the /ʃ/ analysis included 34 speakers (11 cisgender men, 15 cisgender women, 8 transfeminine speakers). Speaker age and height are provided for each analysis in Table II.

TABLE I.

Race and ethnicity demographic information for all speakers.

| Categories | | Cisgender men (n = 11) | Cisgender women (n = 17) | Transfeminine speakers (n = 8) |
| --- | --- | --- | --- | --- |
| Race | Asian | 1 | 2 | 1 |
| | Black or African American | 2 | 0 | 0 |
| | White | 6 | 11 | 4 |
| | More than one race^a | 1 | 3 | 3 |
| | Do not wish to respond | 1 | 1 | 0 |
| Ethnicity | Hispanic or Latino | 3 | 2 | 2 |
| | Not Hispanic or Latino | 8 | 13 | 4 |
| | Do not wish to respond | 0 | 2 | 2 |

^a Two participants identified with two racial categories, one as Black or African American and White, and the second as Asian and Native Hawaiian or other Pacific Islander. These speakers were recategorized by the researchers into “more than one race.”

TABLE II.

Speaker age and height by gender identity and fricative. Values are means with standard deviations in parentheses.

| Measure | Group | /s/ | /ʃ/ |
| --- | --- | --- | --- |
| Age (years) | Cisgender men | 25 (5.4) | 25 (5.4) |
| | Cisgender women | 24 (6.3) | 24 (6.5) |
| | Transfeminine speakers | 29 (7.8) | 28 (8.0) |
| Height (inches) | Cisgender men | 70 (3.4) | 70 (3.4) |
| | Cisgender women | 64 (2.9) | 64 (2.9) |
| | Transfeminine speakers | 70 (1.3) | 69 (3.0) |

Transfeminine speakers answered additional questions about gender expression and completed the TWVQ, formerly named Transsexual Voice Questionnaire Male to Female (Dacakis et al., 2017; Dacakis et al., 2016). Additional questions about gender expression included: time spent presenting feminine gender, history of gender-affirming hormone therapy, and history of participation in behavioral gender-affirming voice services. Although transfeminine speakers varied in their history with gender-affirming hormone therapy (typically estrogen and progesterone), feminizing hormones have little to no effect on the structures involved in voice and articulation (T'Sjoen et al., 2019). As only four of the transfeminine speakers included in this analysis received gender-affirming voice services, we did not examine the relationship between gender-affirming voice services and /s/ and /ʃ/ production. The TWVQ is a validated questionnaire composed of 30 questions related to transfeminine voice use to gather self-perception of voice. Lower TWVQ scores indicate fewer transfeminine voice concerns. One transfeminine speaker also reported a history of a lisp treated with speech therapy. There were no perceptual errors at the time of recording. Transfeminine speaker specific demographic information is presented in Table III.

TABLE III.

Percentage of time spent expressing their feminine spectrum gender identity and TWVQ. N/A, not applicable.

| Transfeminine speaker | /s/ analysis | /ʃ/ analysis | Time expressing feminine gender identity | Gender-affirming voice services | TWVQ score |
| --- | --- | --- | --- | --- | --- |
| 1 | Yes | Yes | 20% | N/A | 97 |
| 2 | Yes | Yes | 90% | 7 months | 54 |
| 3 | Yes | Yes | 0% | N/A | 67 |
| 4 | Yes | Yes | 90% | N/A | 80 |
| 5 | No | Yes | 5% | N/A | 47 |
| 6 | Yes | Yes | 100% | 3 months | 86 |
| 7 | Yes | Yes | 75% | 2 months | 74 |
| 8 | Yes | Yes | 0% | 2 months | 57 |

Transmasculine speakers were excluded from the current study for several reasons. First, there were only four speakers who identified as transmasculine within our corpus of speech. This was partially due to recruitment methods designed to recruit transfeminine speakers. Transfeminine speakers are more likely to seek gender-affirming voice care from a speech-language pathologist than transmasculine speakers, in part because gender-affirming testosterone has a masculinizing effect on the voice (Bultynck et al., 2017; Hodges-Simeon et al., 2021). Additionally, transfeminine speakers may experience greater pressure to express their feminine gender identity in a manner similar to cisgender women due to the intersection of transgender identity and misogyny, known as transmisogyny (Serano, 2007).

B. Stimuli

Speakers completed two days of recordings. Words were presented on a screen and speakers were asked to read the words using voiced speech (first day of recording) and then whispered (second day of recording). Two wordlists (see Table IV) were used to gather three repetitions of initial /s/ and /ʃ/ in twelve vowel contexts when followed by a voiced or whispered vowel resulting in 72 tokens (12 words × 3 repetitions × 2 speech conditions) per word list per speaker. Speakers were instructed to read the words at a comfortable pitch and loudness when producing voiced speech. In whispered speech, speakers were provided a model and were asked to demonstrate a quiet whisper to ensure that participants did not produce a stage whisper. Stimuli from each wordlist were presented in random order. Productions were judged to be perceptually accurate by consensus of two listeners trained in phonetics. While syllable structure and manner of articulation of subsequent phonemes are mostly consistent across lists, we prioritized using real words in the sVd and ʃVd wordlists when possible. Although there are differences in the phonetic contexts between wordlists, analyses are made within each wordlist, comparing voiced and whispered productions of the same vowel.

TABLE IV.

Stimuli presented by wordlist.

| sVd | ʃVd |
| --- | --- |
| seed | she'd |
| sid | ship |
| say | shade |
| said | shed |
| sad | shab |
| sud | shut |
| sowed | showed |
| sued | shoed |
| sod | shod |
| sawed | shaw |
| sir | shirt |
| soot | should |
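The token count stated above (12 words × 3 repetitions × 2 speech conditions = 72 per wordlist per speaker) can be checked with a short sketch that builds a randomized presentation order from the Table IV words. This is an illustration only: the function name is hypothetical, and shuffling all repetitions together within a condition is an assumption, since the text says only that stimuli were presented in random order.

```python
import random

S_WORDS = ["seed", "sid", "say", "said", "sad", "sud",
           "sowed", "sued", "sod", "sawed", "sir", "soot"]
SH_WORDS = ["she'd", "ship", "shade", "shed", "shab", "shut",
            "showed", "shoed", "shod", "shaw", "shirt", "should"]
CONDITIONS = ["voiced", "whispered"]  # recorded on day 1 and day 2
REPETITIONS = 3


def build_schedule(words, seed=None):
    """Return {condition: shuffled token list} for one wordlist."""
    rng = random.Random(seed)
    schedule = {}
    for condition in CONDITIONS:
        # Every word appears REPETITIONS times within each condition.
        tokens = [word for word in words for _ in range(REPETITIONS)]
        # Assumption: repetitions are shuffled together, not blocked.
        rng.shuffle(tokens)
        schedule[condition] = tokens
    return schedule


schedule = build_schedule(S_WORDS, seed=0)
print(sum(len(tokens) for tokens in schedule.values()))  # 72
```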

C. Acoustic analysis

Acoustic analyses were conducted using Praat (Boersma and Weenink, 2017) for duration and a series of R scripts (Reidy, 2013) for a multitaper spectral analysis from which the spectral moments (COG, variance, skewness, and kurtosis) were derived. A multitaper spectral analysis was chosen as a more robust alternative to a single discrete Fourier transform (DFT) given the limitations identified by Shadle (2023). Multitaper spectra were created using the middle 40 ms of the fricative. Each fricative was downsampled to 22 050 Hz and a high-pass filter at 1000 Hz was applied prior to measuring the spectral moments. Fricative boundaries, defined by the onset and offset of frication, were marked within Praat by a research assistant. Tokens were removed from the analysis due to clicks (visualized as high energy across frequencies) or inability to distinguish the onset of frication from preceding silence or ambient noise (specifically for whispered speech). All boundaries were reviewed by the primary investigator. The fricatives were extracted from the sound file and saved into individual sound files prior to running the multitaper script.
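The windowing and multitaper steps described above can be sketched in plain Python. This is not the cited R script. Several simplifications are assumptions made for illustration: sine tapers stand in for whatever tapers the original scripts use, the number of tapers (k = 4) is arbitrary, the 1000 Hz high-pass is approximated by discarding spectral bins below the cutoff rather than by filtering the waveform, and the input is assumed to already be at the target sampling rate. All function names are hypothetical.

```python
import cmath
import math


def middle_window(samples, sample_rate_hz, window_ms=40.0):
    """Return the central window_ms of a fricative's samples."""
    n_win = int(round(sample_rate_hz * window_ms / 1000.0))
    if n_win > len(samples):
        raise ValueError("fricative is shorter than the analysis window")
    start = (len(samples) - n_win) // 2
    return samples[start:start + n_win]


def sine_tapers(n, k):
    """Return k orthonormal sine tapers of length n (assumed stand-in)."""
    scale = math.sqrt(2.0 / (n + 1))
    return [[scale * math.sin(math.pi * (j + 1) * (t + 1) / (n + 1))
             for t in range(n)]
            for j in range(k)]


def multitaper_spectrum(samples, sample_rate_hz, k=4, min_hz=1000.0):
    """Average the power spectra of k tapered copies of the signal.

    Returns (freqs_hz, power) for bins at or above min_hz only.
    Uses a direct DFT, which is slow but dependency-free.
    """
    n = len(samples)
    tapers = sine_tapers(n, k)
    freqs, power = [], []
    for b in range(n // 2 + 1):
        freq = b * sample_rate_hz / n
        if freq < min_hz:
            continue  # crude stand-in for the 1000 Hz high-pass filter
        w = -2j * math.pi * b / n
        estimate = 0.0
        for taper in tapers:
            coeff = sum(x * h * cmath.exp(w * t)
                        for t, (x, h) in enumerate(zip(samples, taper)))
            estimate += abs(coeff) ** 2
        freqs.append(freq)
        power.append(estimate / k)
    return freqs, power


# Toy check with a 2000 Hz tone sampled at 8000 Hz (100 ms long).
sr = 8000
tone = [math.sin(2 * math.pi * 2000 * t / sr) for t in range(800)]
freqs, power = multitaper_spectrum(middle_window(tone, sr), sr)
print(freqs[power.index(max(power))])  # close to 2000 Hz
```

Averaging across several tapers trades some frequency resolution for a lower-variance spectral estimate, which is the usual motivation for preferring a multitaper spectrum over a single DFT for noisy signals such as frication.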

D. Statistical analysis

Separate linear mixed-effects models were fit to examine differences in fricative duration and spectral moments (COG, variance, skewness, kurtosis) for each wordlist (sVd, ʃVd) using the lme4 package (Bates et al., 2015) in R (RStudio, 2012). Each model included fixed effects for speaker group (cisgender men, cisgender women, transfeminine speaker), speech condition (voiced, whispered), and their two-way interaction. The models included by-speaker slopes and intercepts for speech condition and by-vowel slopes and intercepts. All fixed effects were sum-coded. Significant effects (p ≤ 0.05) were investigated using the emmeans function (Lenth, 2018) and adjusted p-values are denoted as p*. Omega squared effect sizes (ω²) with bias correction for small sample sizes were calculated for significant effects using the omega_squared() function in the effectsize package (Ben-Shachar et al., 2020) and effect sizes of 0.02, 0.13, and 0.16 were interpreted as small, medium, and large effects, respectively (Cohen, 1992).
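Sum coding of the two fixed factors can be illustrated with a small sketch that builds the fixed-effects part of a design row. It is written in Python rather than R and does not fit any model. The level order (and therefore which level receives −1) is an assumption, since the paper does not state it; the convention follows R's `contr.sum`, in which the last level is coded −1 in every column.

```python
def sum_contrasts(levels):
    """Map each level to its sum-coded (deviation) contrast vector.

    With k levels there are k - 1 columns. The last level is coded -1
    in every column, so each column sums to zero across levels.
    """
    k = len(levels)
    if k < 2:
        raise ValueError("need at least two levels")
    coding = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            coding[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            coding[level] = [-1] * (k - 1)
    return coding


# Assumed level order; the paper does not specify it.
GROUP = sum_contrasts(["cisgender men", "cisgender women", "transfeminine"])
CONDITION = sum_contrasts(["voiced", "whispered"])


def design_row(group, condition):
    """Return [intercept, group columns, condition column, interaction columns]."""
    g = GROUP[group]
    c = CONDITION[condition]
    interaction = [gi * ci for gi in g for ci in c]
    return [1] + g + c + interaction


print(design_row("cisgender women", "voiced"))
print(design_row("transfeminine", "whispered"))
```

With this coding, the intercept corresponds to the unweighted mean of the cell means, and each main effect is interpreted as averaged over the levels of the other factor rather than at a reference level.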

To investigate the relationships between two fricative acoustic measures (COG and skewness) and vocal gender expression, linear mixed-effects models were conducted. Each model included either a fixed effect for percentage of time spent expressing feminine gender or TWVQ score, speech condition (voiced, whispered), and their two-way interaction. The models included by-speaker slopes and intercepts for speech condition and by-vowel slopes and intercepts. All categorical fixed effects were sum-coded. As above, omega squared effect sizes were calculated for significant fixed effects. Full model outputs for all models and data visualizations are provided in supplementary material.

III. RESULTS

A. Acoustics of /s/ and /ʃ/

Descriptive statistics for each of the acoustic measures for /s/ and /ʃ/ speech are presented in Tables V and VI, respectively.

TABLE V.

Mean (standard deviation) for duration and spectral moments for /s/.

| Speech condition | Speaker group | Duration (ms) | COG (Hz) | Variance (Hz) | Skew | Kurtosis |
| --- | --- | --- | --- | --- | --- | --- |
| Voiced | Cisgender man | 204 (45) | 6649 (1234) | 1689 (380) | 0.57 (1.19) | 3.87 (6.69) |
| | Cisgender woman | 189 (36) | 8079 (1120) | 1677 (474) | −0.26 (1.10) | 4.92 (9.56) |
| | Transfeminine speakers | 185 (38) | 6902 (1126) | 1791 (438) | 0.13 (1.09) | 3.25 (6.72) |
| Whispered | Cisgender man | 184 (40) | 5798 (1052) | 1587 (525) | 2.06 (2.32) | 21.00 (38.22) |
| | Cisgender woman | 170 (33) | 7150 (1127) | 1784 (588) | 0.53 (1.83) | 9.60 (30.73) |
| | Transfeminine speakers | 172 (42) | 6219 (1048) | 1954 (554) | 0.94 (1.68) | 9.54 (20.37) |

TABLE VI.

Mean (standard deviation) for duration and spectral moments for /ʃ/. Standard deviations are presented in parentheses.

| Speech condition | Group | Duration (ms) | COG (Hz) | Variance (Hz) | Skew | Kurtosis |
|---|---|---|---|---|---|---|
| Voiced | Cisgender man | 190 (50) | 3327 (405) | 1066 (322) | 3.27 (1.18) | 24.04 (22.27) |
| Voiced | Cisgender woman | 187 (32) | 3582 (406) | 968 (236) | 3.15 (1.37) | 24.48 (28.03) |
| Voiced | Transfeminine speakers | 180 (35) | 3402 (312) | 852 (192) | 3.70 (1.46) | 35.99 (30.31) |
| Whispered | Cisgender man | 168 (34) | 3100 (274) | 923 (244) | 5.81 (2.70) | 79.05 (66.02) |
| Whispered | Cisgender woman | 165 (35) | 3369 (353) | 857 (216) | 5.63 (2.53) | 81.23 (73.91) |
| Whispered | Transfeminine speakers | 166 (34) | 3133 (308) | 813 (312) | 7.65 (3.98) | 140.34 (141.24) |

For /s/ and /ʃ/, duration was significantly shorter in the whispered condition than the voiced condition (/s/: p < 0.001, ω² = 0.29; /ʃ/: p = 0.001, ω² = 0.32). There were no significant effects of speaker group or two-way interactions between speaker group and speech condition (all p > 0.282).

For the COG of /s/ and /ʃ/, the main effects of speaker group (/s/: p < 0.001, ω² = 0.43; /ʃ/: p = 0.037, ω² = 0.13) and speech condition (/s/: p < 0.001, ω² = 0.67; /ʃ/: p < 0.001, ω² = 0.67) were significant, but not their interaction (both p > 0.495). As expected, cisgender women speakers produced /s/ with significantly higher COGs than cisgender men and transfeminine speakers (both p* < 0.007), who did not differ from each other (p* = 0.577). Cisgender women speakers also produced /ʃ/ with significantly higher COGs than cisgender men speakers (p* = 0.047). Transfeminine speakers did not differ from either cisgender men or cisgender women speakers (both p* > 0.240). When producing /s/ and /ʃ/, speakers had a higher COG in voiced speech than in whispered speech by an average of 843 and 252 Hz, respectively.
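For orientation, the voiced-minus-whispered COG difference can also be read directly off the descriptive means in Tables V and VI. The short Python sketch below does only that arithmetic; the dictionary layout and the function name are assumptions made for illustration. Because these are raw group means rather than model-based estimates, the resulting differences need not equal the averages of 843 and 252 Hz reported above.

```python
# Mean COG (Hz) by speaker group and speech condition, copied from Table V (/s/).
cog_s = {
    "cisgender men": {"voiced": 6649, "whispered": 5798},
    "cisgender women": {"voiced": 8079, "whispered": 7150},
    "transfeminine": {"voiced": 6902, "whispered": 6219},
}

# Mean COG (Hz) copied from Table VI (/ʃ/).
cog_sh = {
    "cisgender men": {"voiced": 3327, "whispered": 3100},
    "cisgender women": {"voiced": 3582, "whispered": 3369},
    "transfeminine": {"voiced": 3402, "whispered": 3133},
}


def voiced_minus_whispered(table):
    """Return the voiced-minus-whispered difference in Hz for each group."""
    return {group: row["voiced"] - row["whispered"] for group, row in table.items()}


diff_s = voiced_minus_whispered(cog_s)
diff_sh = voiced_minus_whispered(cog_sh)

for group in cog_s:
    print(f"{group}: /s/ {diff_s[group]} Hz, /ʃ/ {diff_sh[group]} Hz")
```

In every group the difference is positive for both fricatives, which matches the direction of the speech condition effect described in the text.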

For the variance of /s/, there was a significant two-way interaction between speaker group and speech condition (p = 0.041, ω² = 0.12), but the variance did not differ by either speaker group or speech condition alone (both p > 0.202). Post hoc pairwise comparisons of the interaction indicated that within speaker group, the /s/ variance did not change significantly from voiced to whispered speech (all p* > 0.086), but the direction of the change differed across speaker groups. From voiced to whispered productions, the variance decreased for cisgender men and increased for cisgender women and transfeminine speakers. In other words, cisgender men had less energy centered around the COG in their /s/ productions in voiced speech than in whispered speech, whereas cisgender women and transfeminine speakers had more. The variance of /ʃ/ differed by speech condition (p = 0.003, ω² = 0.22), but not speaker group or their two-way interaction (both p > 0.099). Productions of /ʃ/ had greater variance, or a greater spread of energy around the COG, in voiced speech than in whispered speech (p* = 0.003).

Skewness of energy within /s/ and /ʃ/ productions was not consistently affected by speaker group or speech condition. Skewness of /s/ differed by speaker group (p = 0.003, ω² = 0.24) and speech condition (p < 0.001, ω² = 0.60), but not by the interaction between speaker group and speech condition (p = 0.052). Cisgender women speakers produced /s/ with more negative skewness or greater spectral tilt than cisgender men (p* = 0.003), indicating that cisgender women speakers produce /s/ with greater concentration of energy in higher frequencies than cisgender men, similar to the COG results. Transfeminine speakers produced /s/ with similar degrees of skewness as cisgender men (p* = 0.681) and cisgender women (p* = 0.554). Productions of /s/ in voiced speech had greater negative skew than in whispered speech (p* < 0.001). The skewness of the energy in /ʃ/ was only affected by speech condition (p < 0.001, ω² = 0.69). Productions of /ʃ/ in voiced speech were less positively skewed than in whispered speech (p* < 0.001). Neither speaker group nor the interaction between speech condition and speaker group was significant (both p > 0.076).

There was a significant main effect of speech condition (p < 0.001, ω² = 0.38) and two-way interaction between speaker group and speech condition (p = 0.013, ω² = 0.17) for the kurtosis of /s/, but no significant main effect of speaker group (p > 0.152). Post hoc analysis indicated that kurtosis was significantly greater in whispered speech than in voiced speech (p* < 0.001). The interaction between speaker group and speech condition was driven by cisgender men, who had the largest difference in the kurtosis or peakedness of their productions between speech conditions (p* < 0.001). Neither cisgender women nor transfeminine speakers significantly differed in the kurtosis of /s/ productions between speech conditions (both p* > 0.092).

Kurtosis of /ʃ/ differed by speaker group (p = 0.023, ω² = 0.15) and speech condition (p < 0.001, ω² = 0.69), but not their interaction (p = 0.060). Transfeminine speakers produced /ʃ/ with the greatest peakedness, which differed significantly from cisgender women's productions (p* = 0.035), but not cisgender men's productions (p* = 0.061). Cisgender men and cisgender women did not produce /ʃ/ with significantly different peakedness (p* = 0.997). Productions of /ʃ/ were significantly more peaked in whispered speech than in voiced speech (p* < 0.001).

B. Relationship between fricative acoustics and vocal gender expression

A secondary aim of the analysis was to examine two spectral moments and their relationship with vocal gender expression. Vocal gender expression in transfeminine speakers was measured as either the self-reported percent of time spent expressing feminine spectrum gender or self-reported satisfaction with voice as measured by the TWVQ. As /s/ and /ʃ/ COG and skew differed by speech condition, speech condition and the two-way interaction between vocal gender expression and speech condition were included in all models as control variables.

There were no significant relationships between the two spectral measures of /s/ and /ʃ/, COG and skew, and vocal gender expression in transfeminine speakers, nor was there an interaction with speech condition (all p > 0.288). Given the results of the acoustic analysis, it is not surprising that speech condition was a significant predictor of COG or skew in several models (see Table VII). As speech condition was only included to control for acoustic differences in COG and skew, we will not discuss this further.

TABLE VII.

P-values (effect sizes) from models investigating the relationship between gender expression and two spectral measures of /s/ and /ʃ/.

| Phoneme | Spectral measure | Gender expression measure | Gender expression | Speech condition | Interaction |
|---|---|---|---|---|---|
| /s/ | COG | Percentage of time^a | 0.929 | 0.010 (0.55)^b | 0.503 |
| /s/ | COG | TWVQ score | 0.705 | 0.226 | 0.677 |
| /s/ | Skew | Percentage of time^a | 0.614 | 0.047 (0.35)^b | 0.786 |
| /s/ | Skew | TWVQ score | 0.619 | 0.104 | 0.288 |
| /ʃ/ | COG | Percentage of time^a | 0.818 | 0.005 (0.59)^b | 0.554 |
| /ʃ/ | COG | TWVQ score | 0.378 | 0.566 | 0.585 |
| /ʃ/ | Skew | Percentage of time^a | 0.522 | 0.006 (0.55)^b | 0.497 |
| /ʃ/ | Skew | TWVQ score | 0.314 | 0.058 | 0.288 |

^a Percentage of time spent expressing feminine gender identity.

^b P-values significant at p < 0.05. Omega effect sizes for significant predictors are provided in parentheses.

IV. DISCUSSION

Historically in the peer-reviewed literature, sex and gender erroneously have been conflated and constrained to binary definitions where research participants are only permitted to self-identify as male or female based on their sex assigned at birth (Muehlenhard and Peterson, 2011). Such research excludes the experiences of trans, nonbinary, and intersex people (Butler, 1988, 2004). To help understand the unique client experiences of and needs for speech-language pathology services within transfeminine people, our study included multiple transgender participants who represent a diverse range of gender expressions across the feminine spectrum.

The current study examined gender differences across the duration and spectral features of /s/ and /ʃ/ within voiced and whispered speech. Gender differences were found in the COG of /s/ and /ʃ/ with medium to large effect sizes. As expected, the skewness of /s/ differed by gender. Together, these results support the presence of gender differences in the acoustics of /s/ and /ʃ/. Despite the phonologically voiceless production of /s/ and /ʃ/, both phonemes differed by speech condition with large effect sizes. Specifically, voiced productions had longer durations, higher COGs, more negative or less positive skew, and lower kurtosis than whispered productions. Consistent with our expectations, only /s/ acoustics had significant interactions between speaker group and speech condition. Specifically, transfeminine speakers' productions of /s/ had the largest change in variance and cisgender men had the largest change in kurtosis between speech conditions, suggesting that gender differences in voiced speech were not consistently produced in whispered speech. Our results also provide insight into how voice may be used to express gender and which speech characteristics contribute to vocal satisfaction within transfeminine speakers. Exploring the sociophonetic markers of gender and their relationship with transfeminine gender identities provides critical insight into gender expression and may inform the delivery of specialized speech-language pathology services, such as gender-affirming voice care.

A. Acoustics of /s/ and /ʃ/

Duration of /s/ and /ʃ/ differed by speech condition, but not speaker group. This is unsurprising as previous research rarely reports sex differences (assigned male versus female) for duration of /s/ and /ʃ/ (Fox and Nissen, 2005; Gordon et al., 2002). Also, whispered speech is acoustically distinct from voiced speech. Phonologically voiceless phonemes tend to be altered to a lesser degree than phonologically voiced phonemes, but whispered speech tends to be longer than voiced speech (Jovičić and Šarić, 2008). Our results contradict these previous findings, although the average differences for /s/ and /ʃ/ were 17 and 15 ms, respectively. Of note, reducing the duration of a fricative affects the spectral frequencies produced by a speaker, typically resulting in lower COGs and reduced negative skew (Calder, 2019).

The COG reflects the spectral frequencies with the greatest intensity within a fricative. Our results were consistent for /s/ and /ʃ/, indicating that both voiceless, strident fricatives differed across gender groups. As expected, cisgender women speakers produced the highest COG. Interestingly, cisgender men and transfeminine speakers produced similar COGs, although transfeminine speaker /ʃ/ COG did not differ from either cisgender women or cisgender men. The cisgender men speakers produced /s/ with higher COGs than those produced by heterosexual cisgender men reported in Hazenberg (2016), but their COGs were consistent with those produced by heterosexual cisgender men in Zimman (2013). The similarity between the /s/ COG produced by cisgender men and transfeminine speakers may partially relate to the intersection of sexual orientation and gender identity. The cisgender men speaker group may have included not only heterosexual speakers but also speakers who identify as asexual, bisexual, homosexual, pansexual, or another orientation not listed here. Unfortunately, we cannot speculate further because sexual orientation was not collected at the time of recording.

In addition to group differences, COG was lower in whispered speech than in voiced speech. Across gender groups, COG for /s/ and /ʃ/ was reduced by approximately 20% and 10% within whispered speech, respectively. As COG is typically related to oral cavity size (Perkell et al., 2004; Shadle, 1990), a lower COG suggests that speakers have larger anterior oral cavity sizes in whispered speech than in voiced speech. This is surprising as vowel formant frequencies, particularly the second formant frequency, tend to rise in whispered speech (Eklund and Traunmüller, 1997; Houle and Levi, 2020; Ito et al., 2005; Sharifzadeh et al., 2012). Producing a higher second formant frequency indicates the speaker is using a more anterior tongue placement (Carew et al., 2007; Kawitzky and McAllister, 2020) and thus a shorter anterior oral cavity length. Most likely, the lower COG was associated with the shorter fricative durations in whispered speech (Calder, 2019).

Variance reflects the distribution of energy around the COG. In contrast to previous research (Jongman et al., 2000), our data did not reveal a difference in variance by speaker group. On average, variance was greater for /ʃ/ in voiced speech than in whispered speech, indicating a greater spread of energy in voiced speech. In general, we can attribute this change to the features of whispered speech. Given the lack of vocal fold vibration and reliance on only supraglottic constrictions as a sound source, whispered speech is lower in intensity than voiced speech (Ito et al., 2005; Jovičić and Šarić, 2008). To maintain the phonemic target in a speech signal that is generally less intense, speakers may need to concentrate the energy in the spectrum so that listeners can still perceive the intended target. In other words, the changes in intensity may contribute to changes in the concentration of energy.
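To make the four spectral moments concrete, the sketch below computes them from a power spectrum by treating the spectrum as a probability distribution over frequency. This is a generic illustration under stated assumptions, not the authors' measurement procedure: the function name, the power weighting, the use of excess kurtosis (subtracting 3), and the synthetic Gaussian-shaped spectra are all choices made for this example, and conventions differ between analysis tools.

```python
import numpy as np


def spectral_moments(freqs_hz, power):
    """Return (cog, sd, skewness, excess_kurtosis) of a power spectrum.

    Each frequency bin is weighted by its share of the total power, so the
    moments describe where the energy sits and how it is spread.
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    power = np.asarray(power, dtype=float)
    weights = power / power.sum()

    cog = np.sum(weights * freqs_hz)      # first moment: center of gravity (Hz)
    dev = freqs_hz - cog
    m2 = np.sum(weights * dev**2)         # second central moment (Hz^2)
    m3 = np.sum(weights * dev**3)
    m4 = np.sum(weights * dev**4)

    sd = np.sqrt(m2)                      # spread around the COG, back in Hz
    skewness = m3 / m2**1.5               # > 0: energy tail toward high frequencies
    excess_kurtosis = m4 / m2**2 - 3.0    # > 0: more peaked than a Gaussian
    return cog, sd, skewness, excess_kurtosis


# Synthetic spectra (hypothetical values, not data from the study).
freqs = np.linspace(0.0, 14000.0, 1401)

# Symmetric peak at 7000 Hz: skewness and excess kurtosis are close to 0.
symmetric = np.exp(-0.5 * ((freqs - 7000.0) / 1000.0) ** 2)
cog_sym, sd_sym, skew_sym, kurt_sym = spectral_moments(freqs, symmetric)

# Low-frequency peak with a weaker high-frequency shoulder: positive skew.
tilted = (np.exp(-0.5 * ((freqs - 3000.0) / 500.0) ** 2)
          + 0.2 * np.exp(-0.5 * ((freqs - 6000.0) / 1500.0) ** 2))
cog_tilt, sd_tilt, skew_tilt, kurt_tilt = spectral_moments(freqs, tilted)
```

In this framing, a spectrum with most of its energy at low frequencies and a tail toward higher frequencies yields positive skew, which is the pattern described for /ʃ/ in this study, while a narrower, more sharply defined peak yields higher kurtosis.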

Consistent with previous research (Fox and Nissen, 2005; Jongman et al., 2000), cisgender women speakers produced /s/ with the most negative or least positive skew and cisgender men speakers produced /s/ with the least negative or most positive skew. In contrast to our expectations, there was no interaction between speaker group and speech condition, indicating that this gender difference was maintained in both voiced and whispered speech. The reduction in skew reflects a shallower spectral tilt, indicating less concentration of energy within higher frequencies. It is difficult to suggest a single reason for why skewness may decrease between speech conditions. The decrease may be related to the decrease in COG, which can account for less energy in the higher frequencies of /s/ in whispered speech. Additionally, the variance of /s/ decreased indicating that speakers produced /s/ with less concentration of energy in whispered speech than in voiced speech. Taken together, speakers used /s/ in the construction of gender identity and were able to maintain these differences between voiced and whispered speech.

As predicted, the skewness of /ʃ/ did not differ by gender but was affected by speech condition. These results are not surprising given our previous hypothesis that /ʃ/ is more closely tied to vocal tract features rather than the sociophonetic construction of gender identity. Also, the skewness of /ʃ/ has not been associated with sex differences (Fox and Nissen, 2005). Given that /ʃ/ tends to be positively skewed, indicative of more energy in lower frequencies than in higher frequencies, it is interesting that the positive skew was increased within whispered speech. Similar to /s/, there are most likely several contributing factors for the increase in skewness, such as changes to COG, intensity, and sound source for the production of whispered speech.

As both /s/ and /ʃ/ are strident fricatives, we expected relatively high levels of kurtosis. High values of kurtosis indicate that the peak of the distribution is well defined. Kurtosis is not always reported in spectral analyses of /s/ and /ʃ/ [e.g., Li et al. (2016)], but speakers assigned female tend to produce similar (Fox and Nissen, 2005) or higher values for kurtosis (Jongman et al., 2000) than speakers assigned male. Kurtosis was significantly greater for /s/ and /ʃ/ in whispered speech, indicating a more defined distribution of energy in whispered speech. While there were no differences in the kurtosis of /s/ productions across speaker groups, cisgender men had the largest change in the peakedness of their productions between speech conditions. In contrast, for /ʃ/, cisgender women had significantly less peaked distributions than transfeminine speakers, but gender identity was marked similarly between speech conditions. It is unclear how to interpret this finding, given that previous studies tend to report that the kurtosis of /s/ is higher than that of /ʃ/ (Avery and Liss, 1996; Jongman et al., 2000; Nittrouer, 1995), although similar values were reported for adult speakers (no information on sex provided) by Fox and Nissen (2005).

B. Relationship between fricative acoustics and vocal gender expression

The secondary aim of this study was to examine the relationships between transfeminine gender expression and two spectral measures of /s/ and /ʃ/. Contrary to our hypothesis, neither time spent expressing a feminine spectrum gender nor TWVQ scores were related to the COG or skewness of /s/ and /ʃ/ productions. This is unsurprising given the highly individualized nature of gender identity and expression and how these relate to speech production. We used a small sample size of transfeminine speakers, and thus, our findings are not generalizable to a larger population. Further research could elucidate the contribution of individual variation in transfeminine vocal gender expression to our results by using validated measures of gender identity or expression and by classifying transgender women and nonbinary transfeminine speakers as separate samples [e.g., The GenderQueer Identity Scale (McGuire et al., 2019)]. Additionally, many speech parameters can be altered to express gender (Calder, 2019; Cartei et al., 2019). It is possible transgender women and transfeminine nonbinary speakers will focus on different aspects of their voice (Hosbach-Cannon et al., 2022). Emerging research including nonbinary speakers suggests that nonbinary speakers may use fine-grained phonetic detail to express gender identity, signaling that they exist outside of the gender binary, as would be expected (Merritt, 2023).

C. Limitations

The current study has several limitations given the small sample size of transfeminine speakers (/s/: n = 7; /ʃ/: n = 8) and the lack of inclusion of transmasculine speakers. The small sample size of transfeminine speakers limited the generalizability of the current study. We may not have had a sufficient number of speakers to identify a group trend, or the transfeminine population may be composed of heterogeneous speakers. Our study was not powered to analyze the speech patterns of transgender women and transfeminine nonbinary speakers separately. Nonbinary people comprise approximately a third of the gender minority community and report different health needs (Scandurra et al., 2019) and different experiences with gender dysphoria (Galupo et al., 2021) than do transgender men and women who identify within the gender binary. Given that not all transgender persons experience gender dysphoria and that transgender women and transfeminine nonbinary speakers may differ in their desired voice quality (Kennedy and Thibeault, 2020), the generalizability of our results is limited. Further research should be conducted to explore the relationship between spectral characteristics of /s/ and /ʃ/ and voice-related concerns within transgender speakers with larger sample sizes. Although our sample is small, its size is similar to that of other studies (Hazenberg, 2016; Zimman, 2013).

The exclusion of transmasculine speakers limits the generalization of these findings. As gender expression and acoustic measures do not have a one-to-one relationship, we cannot extrapolate whether transmasculine speakers differentiate their /s/ and /ʃ/ production from the speakers studied here or whether /s/ and /ʃ/ production would be related to masculine gender expression and related voice concerns. Future research should examine the perception of a speaker's intended gender expression based on /s/ and /ʃ/ production and how speech-language pathologists can best incorporate the use of gendered consonant sounds within their transgender and nonbinary clients' treatment goals. Additionally, further efforts should be taken by researchers in general to include members of the transgender and gender nonconforming community in all aspects of research [i.e., community-based participatory action research (Adams et al., 2017)]. Inclusion in research will be beneficial for improving researcher-community relationships on several levels, including trust building (Christopher et al., 2008; Jagosh et al., 2015), improving quality of research (Jagosh et al., 2015), and implementation of research findings into our systems of care.

Along these lines, this study used relatively standard inclusion and exclusion criteria for research in speech-language pathology. The primary purpose of this was to restrict our analysis to a subset of adult, American English speakers, most of whom had no history of a speech, language, or hearing disorder. Historically, this has centered normative data on middle class, White speakers and defined any deviation from that as a disability (Holt, 2022; St. Pierre and St. Pierre, 2018). Further, White (Calder and King, 2020) and cisgender (Hazenberg, 2016) speakers tend to have the largest differences between the COG for /s/ productions. Given that we have unbalanced samples with mostly cisgender speakers who are White (61%), we may have biased our findings to a specific sociophonetic construction of gender identity. This may be particularly apparent as only four of the eight (50%) transfeminine speakers identified as White. Additionally, the intersectionality of constructing racial and gender identities may affect the results of this study. Similarly, we cannot comment on the associations between fricative production and sexual orientation, socioeconomic status, or geographic location. In response, a larger, prospective study is warranted to obtain a more nuanced understanding of the role of /s/ and /ʃ/ in the phonetic construction of identity, accounting for the multi-faceted nature of social identity.

D. Clinical implications

The relationship between gender expression and speech articulation has been of increasing interest within gender-affirming voice care. Researchers typically focus on vowel articulation (Carew et al., 2007; Davies et al., 2015; Kawitzky and McAllister, 2020; Leyns et al., 2021) and promote a more anterior tongue position during vowel production to raise F2 (Carew et al., 2007; Kawitzky and McAllister, 2020). Higher F2 values are associated with higher listener ratings of femininity (Carew et al., 2007; Kawitzky and McAllister, 2020), with femininity most likely referring to a singular, Western stereotype of feminine gender expression [e.g., high pitched voice, light or soft tone, demure, etc. (Houle, 2021)]. The spectral characteristics of /s/ and /ʃ/ are also associated with the size of the anterior oral cavity (Perkell et al., 2004; Shadle, 1990) and a more anterior tongue position corresponds to higher COG (Perkell et al., 2004). Similar to F2, a higher COG is associated with increased listener perceptions of femininity (Munson et al., 2022) and decreased listener perceptions of masculinity (Avery and Liss, 1996). Given the parallels between vowel F2 and fricative COG, transfeminine speakers may benefit from using a more forward tongue position when producing /s/ and /ʃ/ to promote a more feminine speech pattern [e.g., intersections of race, location, gender identity (Calder and King, 2020)].

That said, not all transfeminine speakers wish to conform to binary, cisgender norms. Some transgender or nonbinary speakers may wish for a more neutral gender expression (Hosbach-Cannon et al., 2022; Kennedy and Thibeault, 2020; Merritt, 2023), one that does not conform to presumably White (Calder and King, 2022), middle socioeconomic status (Stuart-Smith, 2007), heterosexual (Hazenberg, 2016; Zimman, 2013), and cisgender (Calder and King, 2020; Hazenberg, 2016) norms. In these clients, promoting a less extreme (i.e., less backed and less fronted) tongue position for /s/ and /ʃ/ may be preferred. Consistent with current best practices, we encourage a client-centered approach for gender-affirming voice care, where the client is educated on potential targets, such as /s/ and /ʃ/, and empowered to set their own individualized goals.

V. CONCLUSION

The current study examined gender differences in the acoustics of /s/ and /ʃ/ within voiced and whispered speech. It also investigated the relationship between /s/ and /ʃ/ and two measures of gender expression within transfeminine speakers. The results supported gender differences in the spectral features of /s/ and /ʃ/, although the relative effect was larger for /s/ than it was for /ʃ/. Although /s/ and /ʃ/ are phonologically voiceless, gender differences did not manifest similarly between voiced and whispered speech. Within transfeminine speakers, we did not find significant relationships between /s/ and /ʃ/ COG and skew and time expressing feminine spectrum gender or TWVQ scores. Overall, these findings suggest that /s/, and to a lesser extent /ʃ/, are important sociophonetic markers of gender that may signal identification with gender groups but vary within gender.

ACKNOWLEDGMENTS

This work was partially supported by a National Institutes of Health (NIH) funded postdoctoral fellowship to N.H. (Grant No. F32DC020627), an NIH funded postdoctoral fellowship position at Boston University to N.H. (Grant No. T32DC013017), and a New York University Steinhardt IDEA grant to S.V.L. Its contents are solely the responsibility of the authors and do not represent the official views of the NIH, New York University, Boston University, Fordham University, or Greenburgh Pride.

a) Portions of this work were submitted to the 184th Meeting of the Acoustical Society of America as a poster presentation entitled, “Gender expression in productions of /s/ and /ʃ/.”

Footnotes

1 See supplementary material at https://doi.org/10.1121/10.0022387 for the full model output for the effect of speaker group and speech condition on /s/ and /ʃ/ acoustics and graphical representation of the acoustic measures /s/ and /ʃ/ by speaker group, speech condition, and their relationship with feminine gender expression.

References

  • 1. Adams, N. , Pearce, R. , Veale, J. , Radix, A. , Castro, D. , Sarkar, A. , and Thom, K. C. (2017). “ Guidance and ethical considerations for undertaking transgender health research and institutional review boards adjudicating this research,” Transgender Health 2(1), 165–175. 10.1089/trgh.2017.0012 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 2. Adler, R. K. , Hirsch, S. , and Pickering, J. (2018). Voice and Communication Therapy for the Transgender/Gender Diverse Client: A Comprehensive Clinical Guide ( Plural Publishing, San Diego, CA: ). [Google Scholar]
  • 3. Avery, J. D. , and Liss, J. M. (1996). “ Acoustic characteristics of less‐masculine‐sounding male speech,” J. Acoust. Soc. Am. 99(6), 3738–3748. 10.1121/1.414970 [DOI] [PubMed] [Google Scholar]
  • 4. Bates, D. , Maechler, M. , Bolker, B. , and Walker, S. (2015). “ Fitting linear mixed-effects models using lme4,” J. Stat. Softw. 67(1), 1–48. 10.18637/jss.v067.i01 [DOI] [Google Scholar]
  • 5. Bem, S. L. (1993). The Lenses of Gender: Transforming the Debate on Sexual Inequality ( Yale University Press, New Haven, CT: ). [Google Scholar]
  • 6. Ben-Shachar, M. S. , Makowski, D. , and Ludecke, D. (2020). “ Compute and interpret indices of effect size,” CRAN, https://github.com/easystats/effectsize (Last viewed 11/7/2023).
  • 7. Boersma, P. , and Weenink, D. (2017). “ Praat: Doing phonetics by computer” [computer program], http://www.praat.org/ (Last viewed 12/13/2017).
  • 8. Bultynck, C. , Pas, C. , Defreyne, J. , Cosyns, M. , den Heijer, M. , and T'Sjoen, G. (2017). “ Self‐perception of voice in transgender persons during cross‐sex hormone therapy,” Laryngoscope 127(12), 2796–2804. 10.1002/lary.26716 [DOI] [PubMed] [Google Scholar]
  • 9. Butler, J. (1988). “ Performative acts and gender constitution: An essay in phenomenology and feminist theory,” Theatre J. 40(4), 519–531. 10.2307/3207893 [DOI] [Google Scholar]
  • 10. Butler, J. (2004). Undoing Gender ( Psychology Press, East Sussex, UK: ). [Google Scholar]
  • 11. Calder, J. (2019). “ The fierceness of fronted /s/: Linguistic rhematization through visual transformation,” Lang. Soc. 48(1), 31–64. 10.1017/S004740451800115X [DOI] [Google Scholar]
  • 12. Calder, J. , and King, S. (2020). “ Intersections between race, place, and gender in the production of/s,” Univ. Penn. Work. Papers Ling. 26(2), 31–38. [Google Scholar]
  • 13. Calder, J. , and King, S. (2022). “ Whose gendered voices matter?: Race and gender in the articulation of /s/ in Bakersfield, California,” J. Sociolinguistics 26(5), 604–623. 10.1111/josl.12584 [DOI] [Google Scholar]
  • 14. Carew, L. , Dacakis, G. , and Oates, J. M. (2007). “ The effectiveness of oral resonance therapy on the perception of femininity of voice in male-to-female transsexuals,” J. Voice 21, 591–603. 10.1016/j.jvoice.2006.05.005 [DOI] [PubMed] [Google Scholar]
  • 15. Cartei, V. , Garnham, A. , Oakhill, J. , Banerjee, R. , Roberts, L. , and Reby, D. (2019). “ Children can control the expression of masculinity and femininity through the voice,” R. Soc. Open Sci. 6(7), 190656. 10.1098/rsos.190656 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 16. Christopher, S. , Watts, V. , McCormick, A. K. , and Young, S. (2008). “ Building and maintaining trust in a community-based participatory research partnership,” Am. J. Public Health 98(8), 1398–1406. 10.2105/AJPH.2007.125757 [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 17. Cohen, J. (1992). “ Quantitative methods in psychology: A power primer,” Psychol. Bull. 112(1), 155–159. 10.1037/0033-2909.112.1.155 [DOI] [PubMed] [Google Scholar]
  • 18. Cosyns, M., Van Borsel, J., Wierckx, K., Dedecker, D., Van de Peer, F., Daelman, T., Laenen, S., and T'Sjoen, G. (2014). “Voice in female-to-male transsexual persons after long-term androgen therapy,” Laryngoscope 124(6), 1409–1414. doi: 10.1002/lary.24480
  • 19. Dacakis, G., Oates, J. M., and Douglas, J. M. (2016). “Further evidence of the construct validity of the transsexual voice questionnaire (TVQMtF) using principal components analysis,” J. Voice 31, 142–148. doi: 10.1016/j.jvoice.2016.07.001
  • 20. Dacakis, G., Oates, J. M., and Douglas, J. (2017). “Associations between the Transsexual Voice Questionnaire (TVQMtF) and self-report of voice femininity and acoustic voice measures,” Intl. J. Lang. Comm. Disor. 52(6), 831–838. doi: 10.1111/1460-6984.12319
  • 21. Davies, S., and Goldberg, J. M. (2006). “Clinical aspects of transgender speech feminization and masculinization,” Int. J. Transgenderism 9(3/4), 167–196. doi: 10.1300/J485v09n03_08
  • 22. Davies, S., Papp, V. G., and Antoni, C. (2015). “Voice and communication change for gender nonconforming individuals: Giving voice to the person inside,” Int. J. Transgenderism 16(3), 117–159. doi: 10.1080/15532739.2015.1075931
  • 23. Eklund, I., and Traunmüller, H. (1997). “Comparative study of male and female whispered and phonated versions of the long vowels of Swedish,” Phonetica 54(1), 1–21. doi: 10.1159/000262207
  • 24. Flipsen, P., Shriberg, L., Weismer, G., Karlsson, H., and McSweeny, J. (1999). “Acoustic characteristics of /s/ in adolescents,” J. Speech. Lang. Hear. Res. 42(3), 663–677. doi: 10.1044/jslhr.4203.663
  • 25. Forrest, K., Weismer, G., Milenkovic, P., and Dougall, R. N. (1988). “Statistical analysis of word-initial voiceless obstruents: Preliminary data,” J. Acoust. Soc. Am. 84(1), 115–123. doi: 10.1121/1.396977
  • 26. Fox, R. A., and Nissen, S. L. (2005). “Sex-related acoustic changes in voiceless English fricatives,” J. Speech. Lang. Hear. Res. 48(4), 753–765. doi: 10.1044/1092-4388(2005/052)
  • 27. Fuchs, S., Toda, M., and Zygis, M. (2010). “Do differences in male versus female /s/ reflect biological or sociophonetic factors?,” in Turbulent Sounds: An Interdisciplinary Guide, edited by M. Zygis (De Gruyter Mouton, Berlin, Germany), Vol. 21, pp. 281–302. doi: 10.1515/9783110226584
  • 28. Galupo, M. P., Pulice-Farrow, L., and Pehl, E. (2021). “‘There is nothing to do about it’: Nonbinary individuals' experience of gender dysphoria,” Transgender Health 6(2), 101–110. doi: 10.1089/trgh.2020.0041
  • 29. Gordon, M., Barthmaier, P., and Sands, K. (2002). “A cross-linguistic acoustic study of voiceless fricatives,” J. Int. Phon. Assoc. 32(2), 141–174. doi: 10.1017/S0025100302001020
  • 30. Hazenberg, E. (2016). “Walking the straight and narrow: Linguistic choice and gendered presentation,” Gender Lang. 10(2), 270–294. doi: 10.1558/genl.v10i2.19812
  • 31. Heffernan, K. (2004). “Evidence from HNR that /s/ is a social marker of gender,” Toronto Working Papers in Linguistics, Vol. 23, https://twpl.library.utoronto.ca/index.php/twpl/article/view/6208 (Last viewed 11/7/2023).
  • 32. Hodges-Simeon, C. R., Grail, G. P. O., Albert, G., Groll, M. D., Stepp, C. E., Carré, J. M., and Arnocky, S. A. (2021). “Testosterone therapy masculinizes speech and gender presentation in transgender men,” Sci. Rep. 11(1), 3494. doi: 10.1038/s41598-021-82134-2
  • 33. Holt, Y. (2022). “Reflecting on the role of gender and race in speech-language pathology,” Perspect. ASHA Spec. Int. Groups 7(6), 2158–2168. doi: 10.1044/2022_PERSP-22-00019
  • 34. Hosbach-Cannon, C. J., Miholics, K., and Zendano, A. (2022). “Self-report of voice model usage within the nonbinary and gender-nonconforming populations,” Perspect. ASHA Spec. Int. Groups 7, 1750–1756. doi: 10.1044/2022_PERSP-22-00084
  • 35. Houle, N. (2021). “Influence of stimuli and task on auditory-perception of vocal gender and femininity/masculinity,” Ph.D. thesis, New York University, New York, NY.
  • 36. Houle, N., and Levi, S. V. (2020). “Acoustic differences between voiced and whispered speech in gender diverse speakers,” J. Acoust. Soc. Am. 148(6), 4002–4013. doi: 10.1121/10.0002952
  • 37. Ito, T., Takeda, K., and Itakura, F. (2005). “Analysis and recognition of whispered speech,” Speech Commun. 45, 139–152. doi: 10.1016/j.specom.2003.10.005
  • 38. Jagosh, J., Bush, P. L., Salsberg, J., Macaulay, A. C., Greenhalgh, T., Wong, G., Cargo, M., Green, L. W., Herbert, C. P., and Pluye, P. (2015). “A realist evaluation of community-based participatory research: Partnership synergy, trust building and related ripple effects,” BMC Public Health 15(1), 725. doi: 10.1186/s12889-015-1949-1
  • 39. Jongman, A., Wayland, R., and Wong, S. (2000). “Acoustic characteristics of English fricatives,” J. Acoust. Soc. Am. 108(3), 1252–1263. doi: 10.1121/1.1288413
  • 40. Jovičić, S. T., and Šarić, Z. (2008). “Acoustic analysis of consonants in whispered speech,” J. Voice 22, 263–274. doi: 10.1016/j.jvoice.2006.08.012
  • 41. Kahane, J. C. (1982). “Growth of the human prepubertal and pubertal larynx,” J. Speech. Lang. Hear. Res. 25(3), 446–455. doi: 10.1044/jshr.2503.446
  • 42. Kawitzky, D., and McAllister, T. (2020). “The effect of formant biofeedback on the feminization of voice in transgender women,” J. Voice 34(1), 53–67. doi: 10.1016/j.jvoice.2018.07.017
  • 43. Kennedy, E., and Thibeault, S. L. (2020). “Voice-gender incongruence and voice health information-seeking behaviors in the transgender community,” Am. J. Speech. Lang. Pathol. 29(3), 1563–1573. doi: 10.1044/2020_AJSLP-19-00188
  • 44. Lang, C., and Kuhnle, U. (2008). “Intersexuality and alternative gender categories in non-Western cultures,” Horm. Res. Paediatr. 69(4), 240–250. doi: 10.1159/000113025
  • 45. Lenth, R. V. (2018). “Estimated marginal means, aka Least-squares Means,” R Package Version 1.2.
  • 46. Leyns, C., Corthals, P., Cosyns, M., Papeleu, T., Van Borsel, J., Morsomme, D., T'Sjoen, G., and D'Haeseleer, E. (2021). “Acoustic and perceptual effects of articulation exercises in transgender women,” J. Voice (published online). doi: 10.1016/j.jvoice.2021.06.033
  • 47. Li, F., Rendall, D., Vasey, P. L., Kinsman, M., Ward-Sutherland, A., and Diano, G. (2016). “The development of sex/gender-specific /s/ and its relationship to gender identity in children and adolescents,” J. Phon. 57, 59–70. doi: 10.1016/j.wocn.2016.05.004
  • 48. Mack, S., and Munson, B. (2012). “The influence of /s/ quality on ratings of men's sexual orientation: Explicit and implicit measures of the ‘gay lisp’ stereotype,” J. Phon. 40(1), 198–212. doi: 10.1016/j.wocn.2011.10.002
  • 49. McGuire, J. K., Beek, T. F., Catalpa, J. M., and Steensma, T. D. (2019). “The genderqueer identity (GQI) scale: Measurement and validation of four distinct subscales with trans and LGBQ clinical and community samples in two countries,” Int. J. Transgenderism 20(2-3), 289–304. doi: 10.1080/15532739.2018.1460735
  • 50. Merritt, B. (2023). “Speech beyond the binary: Some acoustic-phonetic and auditory-perceptual characteristics of non-binary speakers,” JASA Express Lett. 3(3), 035206. doi: 10.1121/10.0017642
  • 51. Muehlenhard, C., and Peterson, Z. (2011). “Distinguishing between sex and gender: History, current conceptualizations, and implications,” Sex Roles 64(11-12), 791–803. doi: 10.1007/s11199-011-9932-5
  • 52. Munson, B., and Babel, M. (2019). “The phonetics of sex and gender,” in The Routledge Handbook of Phonetics (Taylor and Francis, New York), pp. 499–525.
  • 53. Munson, B., Lackas, N., and Koeppe, K. (2022). “Individual differences in the development of gendered speech in preschool children: Evidence from a longitudinal study,” J. Speech. Lang. Hear. Res. 65(4), 1311–1330. doi: 10.1044/2021_JSLHR-21-00465
  • 54. Munson, B., McDonald, E. C., DeBoe, N. L., and White, A. R. (2006). “The acoustic and perceptual bases of judgments of women and men's sexual orientation from read speech,” J. Phon. 34, 202–240. doi: 10.1016/j.wocn.2005.05.003
  • 55. Nittrouer, S. (1995). “Children learn separate aspects of speech production at different rates: Evidence from spectral moments,” J. Acoust. Soc. Am. 97(1), 520–530. doi: 10.1121/1.412278
  • 56. Novais Valente Junior, C., and Mesquita de Medeiros, A. (2020). “Voice and gender incongruence: Relationship between vocal self-perception and mental health of trans women,” J. Voice 36, 808–813. doi: 10.1016/j.jvoice.2020.10.002
  • 57. Perkell, J. S., Matthies, M. L., Tiede, M., Lane, H., Zandipour, M., Marrone, N., Stockmann, E., and Guenther, F. H. (2004). “The distinctness of speakers’ /s/–/ʃ/ contrast is related to their auditory discrimination and use of an articulatory saturation effect,” J. Speech. Lang. Hear. Res. 47(6), 1259–1269. doi: 10.1044/1092-4388(2004/095)
  • 58. Podesva, R. J., and Van Hofwegen, J. (2014). “How conservatism and normative gender constrain variation in inland California: The case of /s/,” Univ. Penn. Work. Papers Ling. 20(2), 129–137.
  • 59. Polderman, T. J., Kreukels, B. P., Irwig, M. S., Beach, L., Chan, Y.-M., Derks, E. M., Esteva, I., Ehrenfeld, J., Den Heijer, M., Posthuma, D., Raynor, L., Tishelman, A., and Davis, L. K., on behalf of the International Gender Diversity Genomics Consortium (2018). “The biological contributions to gender identity and gender diversity: Bringing data to the table,” Behav. Genet. 48(2), 95–108. doi: 10.1007/s10519-018-9889-z
  • 60. Reidy, P. (2013). “An introduction to random processes for the spectral analysis of speech data,” Work. Papers Ling. 60, 67–116.
  • 61. Ristori, J., Cocchetti, C., Romani, A., Mazzoli, F., Vignozzi, L., Maggi, M., and Fisher, A. D. (2020). “Brain sex differences related to gender identity development: Genes or hormones?,” Int. J. Mol. Sci. 21(6), 2123. doi: 10.3390/ijms21062123
  • 62. Romeo, R., Hazan, V., and Pettinato, M. (2013). “Developmental and gender-related trends of intra-talker variability in consonant production,” J. Acoust. Soc. Am. 134(5), 3781–3792. doi: 10.1121/1.4824160
  • 63. RStudio (2012). “RStudio,” version 3.4, http://www.rstudio.org/ (Last viewed 11/7/2023).
  • 64. Scandurra, C., Mezza, F., Maldonato, N. M., Bottone, M., Bochicchio, V., Valerio, P., and Vitelli, R. (2019). “Health of non-binary and genderqueer people: A systematic review,” Front. Psychol. 10, 1453. doi: 10.3389/fpsyg.2019.01453
  • 65. Serano, J. (2007). Whipping Girl: A Transsexual Woman on Sexism and the Scapegoating of Femininity (Seal Press, New York).
  • 66. Shadle, C. H. (1990). “Articulatory-acoustic relationships in fricative consonants,” in Speech Production and Speech Modelling (Springer, Berlin), pp. 187–209.
  • 67. Shadle, C. H. (2023). “Alternatives to moments for characterizing fricatives: Reconsidering Forrest et al. (1988),” J. Acoust. Soc. Am. 153(2), 1412–1426. doi: 10.1121/10.0017231
  • 68. Shadle, C. H., and Scully, C. (1995). “An articulatory-acoustic-aerodynamic analysis of [s] in VCV sequences,” J. Phon. 23(1), 53–66. doi: 10.1016/S0095-4470(95)80032-8
  • 69. Sharifzadeh, H. R., McLoughlin, I. V., and Russell, M. J. (2012). “A comprehensive vowel space for whispered speech,” J. Voice 26(2), e49–e56. doi: 10.1016/j.jvoice.2010.12.002
  • 70. St. Pierre, J., and St. Pierre, C. (2018). “Governing the voice: A critical history of speech-language pathology,” Foucault Stud. 24, 151–184. doi: 10.22439/fs.v0i24.5530
  • 71. Stuart-Smith, J. (2007). “Empirical evidence for gendered speech production: /s/ in Glaswegian.”
  • 72. Stuart-Smith, J. (2020). “Changing perspectives on /s/ and gender over time in Glasgow,” Ling. Vanguard 6(s1), 20180064. doi: 10.1515/lingvan-2018-0064
  • 73. T'Sjoen, G., Arcelus, J., Gooren, L., Klink, D. T., and Tangpricha, V. (2019). “Endocrinology of transgender medicine,” Endocrine Rev. 40(1), 97–117. doi: 10.1210/er.2018-00011
  • 74. Yoshinaga, T., Nozaki, K., and Wada, S. (2017). “Effects of tongue position in the simplified vocal tract model of Japanese sibilant fricatives /s/ and /ʃ/,” J. Acoust. Soc. Am. 141(3), EL314–EL318. doi: 10.1121/1.4978754
  • 75. Ziegler, A., Henke, T., Wiedrick, J., and Helou, L. B. (2018). “Effectiveness of testosterone therapy for masculinizing voice in transgender patients: A meta-analytic review,” Int. J. Transgenderism 19(1), 25–45. doi: 10.1080/15532739.2017.1411857
  • 76. Zimman, L. (2013). “Hegemonic masculinity and the variability of gay-sounding speech: The perceived sexuality of transgender men,” J. Lang. Sexuality 2(1), 1–39. doi: 10.1075/jls.2.1.01zim
  • 77. Zimman, L. (2017). “Variability in /s/ among transgender speakers: Evidence for a socially grounded account of gender and sibilants,” Linguistics 55(5), 993–1019. doi: 10.1515/ling-2017-0018
  • 78. Żygis, M., Pape, D., Koenig, L. L., Jaskuła, M., and Jesus, L. M. T. (2017). “Segmental cues to intonation of statements and polar questions in whispered, semi-whispered and normal speech modes,” J. Phon. 63, 53–74. doi: 10.1016/j.wocn.2017.04.001

Articles from The Journal of the Acoustical Society of America are provided here courtesy of Acoustical Society of America
