Abstract
Humans and many non-human primates exhibit large sexual dimorphisms in vocalizations and vocal anatomy. In humans, same-sex competitors and potential mates attend to acoustic features of male vocalizations, but vocal masculinity especially increases perceptions of physical prowess. Yet, the information content of male vocalizations remains obscure. We therefore examined relationships between sexually dimorphic acoustic properties and men's threat potential. We first introduce a new measure of the structure of vocal formant frequencies, ‘formant position’ (Pf), which we show is more sexually dimorphic and more strongly related to height than is the most widely used measure of formant structure, ‘formant dispersion’, in both a US sample and a sample of Hadza foragers from Tanzania. We also show large sexual dimorphisms in the mean fundamental frequency (F0) and the within-utterance standard deviation in F0 (F0 − s.d.) in both samples. We then explore relationships between these acoustic parameters and men's body size, strength, testosterone and physical aggressiveness. Each acoustic parameter was related to at least one measure of male threat potential. The most dimorphic parameters, F0 and Pf, were most strongly related to body size in both samples. In the US sample, F0 predicted testosterone levels, Pf predicted upper body strength and F0 − s.d. predicted physical aggressiveness.
Keywords: formant frequency, fundamental frequency, sexual selection
1. Introduction
The acoustic properties of adult vocalizations are sexually dimorphic in many primates [1–3], including humans [4]. Males vocalize at a lower fundamental frequency (F0), the rate of vocal fold vibration and the acoustic parameter closest to pitch, and with lower, more closely spaced formant frequencies [2,4], frequencies of high energy that affect vocal timbre. In humans, F0 may also vary less across an utterance in men than in women, resulting in a more monotone voice [5].
Except for F0 variation, the anatomical and physiological substrates for these vocal sex differences are well understood. At puberty, elevated testosterone levels [6] acting through androgen receptors in the vocal folds [7] cause the vocal folds to grow longer and thicker in boys than in girls [8]. Men's larger vocal folds consequently vibrate at approximately half the F0 of women's during phonation. Similarly, boys' larynges descend at puberty [9], producing a longer vocal tract and lower, more closely spaced formant frequencies [10].
2. Sexual selection and human voices
Recent research has focused on the evolutionary reasons for the pubertal development of vocal sex differences in humans [11] and non-human primates [12]. Of particular interest has been sexual selection, which operates through both mate choice, favouring ornaments and sexual displays for attracting mates, and contest competition, favouring size, strength, aggression, weapons and threat displays [13].
Male mate choice of females may have contributed to higher fundamental and formant frequencies in women than in men [14–16]. However, sexual selection was probably stronger in shaping men's traits [17], and sexually selected traits tend to emerge at puberty [13], when precipitous changes in male, but not female, vocal anatomy occur. Moreover, low speaking F0 and a monotone voice predicted men's mating success in a US sample [18,19], and low speaking F0 predicted men's reproductive success among Hadza foragers of Tanzania [20], suggesting selection on men's voices.
Although some evidence implicates female mate choice in favouring deeper, more monotone voices in men [18,21–23], theoretical predictions and phylogenetic and functional data suggest that male contests played a larger role than female choice in shaping men's phenotypes [24]. Masculine fundamental and formant frequencies in particular increase perceptions of dominance and fighting ability among men more effectively than they increase attractiveness to women [11,25–27]. Likewise, a monotone voice appears more strongly related to perceptions of dominance than attractiveness [23]. Moreover, modulation of F0 affects and reflects dominance and submissiveness across animal species [28], including humans [11].
3. Information content of male vocal signals
Although masculine voices evoke perceptions of dominance and fighting ability in men, it is less clear why they should do so. Does vocal masculinity advertise physical prowess? Attempts to answer this question have obtained mixed results.
Bruckert et al. [29] found that testosterone levels negatively predicted men's formant frequencies, and some have found that testosterone negatively predicted men's F0 [30,31], but others have not [29]. Because androgen is related to physical aggressiveness and competitiveness [32–34], a masculine voice may signal threat potential.
However, Sell et al. [35] found no relationships between men's fundamental or formant frequencies and upper body strength. Although some studies have found relationships between F0 and men's height [36] and weight [37], most have not [2,22,29,35,38–40]. Similarly, some studies have found relationships between formant structure and men's height [2,35,37,41], but others have not [22,29,42], and some have found relationships between formant structure and weight [37,42], but most have not [2,22,29,35]. To our knowledge, relationships between voice monotonicity and measures of physical threat potential are unexplored.
Rendall et al. [43] suggested that inferences of body size from F0 are misattributions resulting from (i) over-generalizing age or sex differences, (ii) applying broader environmental sound-size associations to speech, or (iii) associations between behavioural traits (e.g. dominance) and both size and F0. Similar reasoning could be applied to the inconsistent relationships between formant structure and men's physical characteristics. Alternatively, sexually dimorphic acoustic parameters may predict physical aggressive potential, including size, but varied results across studies could reflect differences in sample characteristics, small samples and varied methodologies.
Some failures to replicate associations with formant frequencies may result from the method of measuring formant structure. The most commonly used measure is ‘formant dispersion’ (Df), the average distance between successive formants [44]. Formant dispersion is given by

$$D_f = \frac{\sum_{i=1}^{n-1} (F_{i+1} - F_i)}{n-1}$$
where Fi is the ith formant frequency, and n is the number of formant frequencies measured. Although Df is highly sexually dimorphic in humans and has been ‘the most reliable known acoustic index of body size in humans, as well as other primates’ [35], it is not without problems. Perhaps most importantly, Df surprisingly contains no information about the middle formants used in its calculation. Consider the case where the first four formants are measured. Formant dispersion, then, is

$$D_f = \frac{(F_4 - F_3) + (F_3 - F_2) + (F_2 - F_1)}{3}$$
This reduces to (F4 − F1)/3, so Df actually measures only the distance between the highest and lowest formants measured. In addition, because the variance in a higher formant (e.g. F4) greatly exceeds the variance in F1 [2], and because the formant values are not standardized, Df is mainly driven by the value of the highest formant measured. Finally, given that longer vocal tracts produce lower formants, the sum of the first n formants should be inversely related to vocal tract length. Adding one formant and subtracting another, as in Df calculation, captures information about formant spacing but partly cancels information about formant positions.
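The cancellation of the middle formants can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names and the two sets of formant values are hypothetical and chosen only for illustration.

```python
def formant_dispersion(formants):
    """Average distance between successive formants (Hz)."""
    gaps = [hi - lo for lo, hi in zip(formants, formants[1:])]
    return sum(gaps) / len(gaps)


def endpoint_spacing(formants):
    """(Fn - F1) / (n - 1): uses only the lowest and highest formants."""
    return (formants[-1] - formants[0]) / (len(formants) - 1)


# Hypothetical F1..F4 values in Hz, for illustration only.
voice_a = [450.0, 1500.0, 2500.0, 3500.0]
# Same F1 and F4 as voice_a, but with different middle formants.
voice_b = [450.0, 1200.0, 2900.0, 3500.0]

df_a = formant_dispersion(voice_a)
df_b = formant_dispersion(voice_b)

# Both voices get the same Df even though F2 and F3 differ.
print(df_a, df_b, endpoint_spacing(voice_a))
```

Because the successive differences telescope, any two voices that share F1 and F4 receive the same Df, whatever their F2 and F3.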
The present study therefore seeks to clarify relationships between sexually dimorphic acoustic properties (F0, monotonicity, formant structure) and threat potential in men. Such evidence would elucidate the processes of sexual selection in humans and non-human primates and contribute to a key objective of acoustic communication research: clarifying the information content of vocal signals [45]. We first introduce a new measure of formant structure, ‘formant position’ (Pf), which we show is more sexually dimorphic and more strongly related to body size than is Df in both a US sample and in a sample of Hadza foragers from Tanzania. We then explore relationships between vocal masculinity and body size, strength, testosterone and physical aggressiveness in these samples.
4. Study 1: US sample
(a). Methods
(i). Participants
One hundred and seventy-six male (mean age = 20.1 ± 1.7, range = 18–26) and 268 female (139 taking hormonal contraception: mean age = 20.4 ± 1.6, range = 18–26; 129 normally cycling: mean age = 19.6 ± 1.5, range = 18–24) students from a large, northeastern US university participated in this IRB-approved study. Participants identified as 91.4 per cent White, 3.4 per cent Asian, 1.8 per cent Hispanic or Latino, 1.4 per cent Black or African American, 0.5 per cent American Indian or Alaska Native, 0.2 per cent Native Hawaiian or Other Pacific Islander and 1.4 per cent ‘other’.
(ii). Procedures
Male and hormonally contracepting female participants were scheduled for 1 h morning sessions (beginning between 08.20 and 10.00 h) and evening sessions (beginning between 17.20 and 19.00 h) one week apart. Normally cycling women attended sessions between 13.00 and 16.00 h and were scheduled based on self-reported menstrual cycle data to attend one session during the late follicular phase and one session during the mid-luteal phase. Session order (morning or evening first, or follicular or mid-luteal first) was randomized. Voice, anthropometric, hormonal (in this study, men's testosterone) and psychometric data were collected at both sessions.
(iii). Voice recording and measurement
Participants were recorded reading an excerpt from a standard voice passage, the Rainbow Passage [46], in an anechoic, soundproof booth using a Shure SM58 vocal cardioid microphone. A curved wire projection from the microphone stand kept the participant's mouth approximately 9.5 cm from the microphone. Voices were recorded into a computer using Goldwave software in mono at a sampling rate of 44 100 Hz and 16 bit quantization, and saved as uncompressed ‘.wav’ files.
Each recording (mean duration = 30.6 ± 3.8 s) was analysed using Praat software (v. 4.4.11). For mean F0 and F0 variation across the utterance (standard deviation in F0, F0 − s.d.), pitch floors were 75 and 100 Hz, and pitch ceilings were 300 and 500 Hz, for men and women, respectively, in accordance with the programmers' recommendations [47]. Otherwise, default settings were used.
We measured F1 through F4 at each glottal pulse (automated detection by Praat, mean glottal pulses per recording = 2972 ± 889) and averaged across measurements. This facilitated formant measurement along the entire utterance, thus sampling a greater range of vocal tract configurations than measuring individual vowels. In addition, this method sampled only voiced speech and avoided fricatives, which artificially decrease apparent vocal tract length because the sound source of fricatives is turbulence in the mouth rather than vocal fold vibration [48]. Because Praat occasionally shifts formants (e.g. misattributing F2 as F1), we omitted all formant measurements from glottal pulses for which any value exceeded a predetermined threshold (less than 2% of pulses). Thresholds were based on published data [2] for male and female formants and were selected to eliminate only clearly erroneous measurements. For men, thresholds were 1000, 2850, 3750 and 4500 Hz for F1 through F4, and for women, thresholds were 1250, 3350, 4150 and 5100 Hz. Formant measurements obtained by this method correlated highly with measurements obtained by measuring and averaging across 11 individual vowels each from nine randomly selected men and nine randomly selected women: 0.93 ≤ r ≤ 0.98 (both sexes), 0.86 ≤ r ≤ 0.94 (men only).
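The exclusion rule for shifted formants can be sketched as follows. This is an illustrative reimplementation, not the authors' script: the function name, the data layout (one tuple of F1 to F4 per glottal pulse) and the example pulses are assumptions, while the thresholds are those stated above.

```python
# Upper thresholds (Hz) for F1..F4, as stated in the text.
THRESHOLDS = {
    "male": (1000.0, 2850.0, 3750.0, 4500.0),
    "female": (1250.0, 3350.0, 4150.0, 5100.0),
}


def mean_formants(pulses, sex):
    """Average F1..F4 over glottal pulses, dropping any pulse in which
    at least one formant exceeds its threshold.

    pulses: list of (F1, F2, F3, F4) tuples in Hz (assumed layout).
    Returns (tuple of mean formants, number of pulses dropped).
    """
    limits = THRESHOLDS[sex]
    kept = [p for p in pulses if all(f <= lim for f, lim in zip(p, limits))]
    if not kept:
        raise ValueError("no pulses left after exclusion")
    means = tuple(sum(column) / len(kept) for column in zip(*kept))
    return means, len(pulses) - len(kept)


# Hypothetical pulses for one male speaker. In the third pulse the
# tracker has reported F2 as F1, so every formant is shifted upward.
pulses = [
    (430.0, 1450.0, 2440.0, 3390.0),
    (440.0, 1460.0, 2450.0, 3400.0),
    (1460.0, 2450.0, 3400.0, 4400.0),
]
means, dropped = mean_formants(pulses, "male")
print(means, dropped)
```

Dropping the whole pulse, rather than only the offending value, keeps the four formant means based on the same set of pulses.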
We then computed formant dispersion (Df) and formant position (Pf). We define Pf as the average standardized formant value for the first n formants, where formants are standardized using between-sex means and standard deviations. Thus,

$$P_f = \frac{\sum_{i=1}^{n} F'_i}{n}$$

where $F'_i$ is the standardized ith formant, and n is the number of formants measured. This method effectively assigns each standardized formant a unit weight rather than, for example, a beta weight obtained via regressing formants on sex or height. We chose this approach following Cohen [49], who suggested that unit weights have better predictive power than beta weights derived from moderate-sized samples. So as not to bias the mean or standard deviation towards either sex during standardization, we obtained means and standard deviations using bootstrapping methods, randomly selecting (with replacement) 10 000 samples of 129 men and all 129 normally cycling women who provided voice recordings for both sessions. With both sexes included, mean F1 = 472.3 ± 49.7 Hz, mean F2 = 1594.4 ± 146.7 Hz, mean F3 = 2604.5 ± 184.4 Hz and mean F4 = 3612.8 ± 267.9 Hz. These values were then used to standardize formants for all participants.
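As a worked illustration of the definition, the sketch below computes Pf from the between-sex means and standard deviations reported above. The function name is hypothetical, and applying the formula to the group-mean formants from table 1 (rather than to individual speakers) is done here only as a consistency check.

```python
# Between-sex means and SDs (Hz) for F1..F4, as reported for the US sample.
US_MEANS = (472.3, 1594.4, 2604.5, 3612.8)
US_SDS = (49.7, 146.7, 184.4, 267.9)


def formant_position(formants, means=US_MEANS, sds=US_SDS):
    """Pf: the unit-weighted mean of the standardized formants."""
    z = [(f - m) / s for f, m, s in zip(formants, means, sds)]
    return sum(z) / len(z)


# Group-mean formants (Hz) for study 1, taken from table 1.
men_mean_formants = (434.3, 1456.0, 2442.9, 3390.9)
women_mean_formants = (511.8, 1724.9, 2781.3, 3874.0)

pf_men = formant_position(men_mean_formants)
pf_women = formant_position(women_mean_formants)
print(pf_men, pf_women)
```

Because Pf is linear in the formants, Pf of the group-mean formants equals the group-mean Pf, so the two results should land close to the table 1 values of −0.85 for men and 0.90 for women.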
(iv). Anthropometry
Flexed biceps circumference was measured at its widest point for left and right arms using a tape measure. Left- and right-hand strengths were obtained using a JAMAR hydraulic hand dynamometer. Biceps size and hand strength are good predictors of overall upper body strength [35]. Height was measured from a metre stick affixed to a wall, and weight was obtained using an electronic scale.
(v). Testosterone assays
Saliva was collected for testosterone (T) assays during morning and evening sessions. Contamination of saliva samples was minimized by having participants not eat, drink (except plain water), smoke, chew gum or brush their teeth for 1 h before their session. Participants rinsed their mouths with water before chewing a piece of sugar-free Trident gum (inert in salivary hormone assays) to stimulate saliva flow. Approximately 9 ml of saliva was collected in a sodium azide-coated polystyrene tube. The tube was capped and left upright at room temperature for 18–24 h to allow mucins to settle. Tubes were then frozen at −20°C until hormone analysis.
We obtained salivary unbound (‘free’) T concentrations, which correlate strongly with serum T concentrations [50]. The Salivary Radioimmunoassay Laboratory at the University of Western Ontario performed T radioimmunoassay on 330 male saliva samples, 175 from session 1 and 155 from session 2. All samples went through double ether extraction, followed by radioimmunoassay in duplicate using a Coat-A-Count kit for total T (Diagnostic Products, Los Angeles, CA, USA), modified for use with saliva (for details, see [51]). The average intra-assay coefficient of variation was 6.3 per cent, and sensitivity was 5–10 pg ml⁻¹.
(vi). Psychometric data collection
Following anthropometry and saliva collection, each participant completed the Buss & Perry [52] Aggression Questionnaire, and instruments not used here, at a private computer station. The Aggression Questionnaire comprises 29 items, including nine targeting physical aggression (e.g. ‘Once in a while I can't control the urge to strike another person’). Items are assessed on a 5-point scale, anchored at ‘extremely uncharacteristic of me’ and ‘extremely characteristic of me’. Scores were summed for a composite measure of physical aggression.
5. Results
(a). Correlations across sessions and sides of the body
Mean F0 and F0 − s.d. correlated across sessions r391 = 0.98 and 0.92, respectively (controlling for sex, r388 = 0.83 and 0.78, respectively). Formants correlated across sessions 0.92 ≤ r391 ≤ 0.98 (controlling for sex, 0.82 ≤ r388 ≤ 0.89). Men's anthropometric measurements (women's not reported here) correlated 0.68 ≤ r155 ≤ 0.99 across sessions and 0.74 ≤ r155 ≤ 0.96 across sides of the body. Men's physical aggression scores (Cronbach's α = 0.85) correlated across sessions r154 = 0.90.
Men's testosterone concentrations correlated more modestly across morning and evening sessions (r154 = 0.50), as expected given temporal variability in T secretion. Testosterone concentrations also decreased significantly from morning to evening sessions (paired t153 = 10.4, p < 0.0001), indicating that we were able to capture significant diurnal variation.
Because all measurements highly correlated across sessions and sides of the body, we averaged across sessions and sides of the body (tables 1 and 2). Only values from session 1 were used for participants who did not return for session 2 (9% of cases).
Table 1.
study | parameter | women | men | t | d |
---|---|---|---|---|---|
study 1 | F0 (Hz) | 201.7 ± 18.0 | 111.4 ± 14.2 | 58.8*** | 5.7 |
study 1 | F0 − s.d. (Hz) | 35.5 ± 8.5 | 15.0 ± 3.6 | 35.1*** | 3.6 |
study 1 | F1 (Hz) | 511.8 ± 34.3 | 434.3 ± 26.7 | 26.7*** | 2.6 |
study 1 | F2 (Hz) | 1724.9 ± 64.3 | 1456.0 ± 55.7 | 45.4*** | 4.3 |
study 1 | F3 (Hz) | 2781.3 ± 91.9 | 2442.9 ± 78.5 | 41.5*** | 4.1 |
study 1 | F4 (Hz) | 3874.0 ± 125.6 | 3390.9 ± 108.9 | 43.0*** | 4.3 |
study 1 | Df (Hz) | 1120.7 ± 41.6 | 985.5 ± 39.0 | 34.3*** | 3.3 |
study 1 | Pf | 0.90 ± 0.38 | −0.85 ± 0.27 | 56.9*** | 5.4 |
study 2 | F0 (Hz) | 208.5 ± 29.4 | 114.9 ± 15.3 | 17.9*** | 4.4 |
study 2 | F0 − s.d. (Hz) | 28.8 ± 11.3 | 14.7 ± 5.7 | 7.3*** | 1.8 |
study 2 | F1 (Hz) | 454.0 ± 62.4 | 418.1 ± 62.5 | 2.5* | 0.6 |
study 2 | F2 (Hz) | 1431.9 ± 123.2 | 1286.1 ± 120.8 | 5.1*** | 1.2 |
study 2 | F3 (Hz) | 2970.9 ± 153.2 | 2611.4 ± 149.4 | 10.2*** | 2.4 |
study 2 | F4 (Hz) | 4000.1 ± 121.8 | 3591.1 ± 149.9 | 13.3*** | 3.1 |
study 2 | Df (Hz) | 1174.6 ± 47.5 | 1058.9 ± 51.8 | 10.0*** | 2.3 |
study 2 | Pf | 0.68 ± 0.43 | −0.80 ± 0.40 | 15.3*** | 3.6 |

Study 1: women n = 268, men n = 176. Study 2: women n = 43, men n = 32.
*p < 0.05.
***p < 0.0001.
Table 2.
variable | US sample | Hadza sample |
---|---|---|
height (cm) | 178.8 (6.9, 158.5–197.3) | 162.5 (7.2, 145.8–179.3) |
weight (kg) | 78.7 (13.4, 53.1–131.9) | 51.4 (6.3, 38.8–67.2) |
biceps (cm) | 33.4 (3.2, 26.1–42.0) | 25.1 (1.6, 22.0–28.5) |
hand strength (kg) | 49.5 (8.1, 31.3–69.3) | 35.0 (5.6, 21.5–45.0) |
triceps skinfold (cm) | — | 5.3 (1.8, 3.0–9.7) |
physical aggression | 21.8 (6.6, 9–39) | — |
T levels (pg ml⁻¹) | 92.9 (34.2, 36.3–254.5) | — |
(b). Sexual dimorphism in the voice
All acoustic parameters were highly sexually dimorphic (2.7 ≤ d ≤ 5.7, table 1). Importantly, Pf (d = 5.4) was more sexually dimorphic than Df (d = 3.3) (test for difference between dependent correlation coefficients (correlations between sex and Pf and Df): t441 = 8.97, p < 0.0001). When sex and height were entered simultaneously to predict Pf and Df in separate general linear models, sex more strongly predicted Pf controlling for height (partial η² = 0.73 compared with 0.54 for Df, both p < 0.0001), and height more strongly predicted Pf controlling for sex (partial η² = 0.11, p < 0.0001 compared with 0.01 for Df, p = 0.021, electronic supplementary material, table S1). All other acoustic parameters remained highly sexually dimorphic after controlling for height (0.36 ≤ partial η² ≤ 0.77).
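The text does not name the specific test used to compare the two dependent correlations. One common choice is Williams's t, whose n − 3 degrees of freedom agree with the reported df of 441 for 444 participants; treating it as the test used here is an assumption. The sketch below implements that statistic. The function name is hypothetical, and the three correlations passed to it are invented for illustration, because the correlation between Pf and Df is not reported.

```python
import math


def williams_t(r_xy, r_xz, r_yz, n):
    """Williams's t for comparing two dependent correlations that share
    variable x (here, sex): r_xy versus r_xz, given r_yz and sample size n.

    Returns (t, degrees of freedom), with df = n - 3.
    """
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    r_bar = (r_xy + r_xz) / 2
    denom = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r_yz) ** 3
    t = (r_xy - r_xz) * math.sqrt((n - 1) * (1 + r_yz) / denom)
    return t, n - 3


# Hypothetical correlations (NOT values from the study):
# sex with Pf, sex with Df, and Pf with Df.
t_stat, df = williams_t(r_xy=0.93, r_xz=0.85, r_yz=0.80, n=444)
print(t_stat, df)
```

A positive t indicates that the first correlation (sex with Pf in this illustration) is the larger of the two.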
(c). Correlates of acoustic parameters
In subsequent analyses, F0 − s.d., weight, physical aggression and T levels were log-transformed to correct for positive skew. Hand strength and biceps circumference were standardized and then averaged to produce the composite measure ‘arm strength’.
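The composite and the transformation described above can be sketched as follows. This is an illustration under stated assumptions: the function names and data are hypothetical, the sample standard deviation is used for standardization, and the natural logarithm is used for the transform (the text specifies neither).

```python
import math
import statistics


def zscores(values):
    """Standardize to mean 0 and SD 1 (sample SD; an assumption)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def arm_strength(hand_strength_kg, biceps_cm):
    """Composite: mean of standardized hand strength and biceps size."""
    z_hand = zscores(hand_strength_kg)
    z_biceps = zscores(biceps_cm)
    return [(h + b) / 2 for h, b in zip(z_hand, z_biceps)]


# Hypothetical measurements for three men.
hand = [40.0, 50.0, 60.0]
biceps = [30.0, 33.0, 36.0]
composite = arm_strength(hand, biceps)

# Log transform for a positively skewed variable (natural log assumed).
weights_kg = [60.0, 80.0, 120.0]
log_weights = [math.log(w) for w in weights_kg]
print(composite, log_weights)
```

Standardizing before averaging puts hand strength (kg) and biceps circumference (cm) on a common scale, so that neither dominates the composite because of its units.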
Measures of threat potential (height, weight, arm strength, physical aggression and testosterone), along with acoustic parameters as control variables, were entered simultaneously into separate multiple regressions (table 3) to explore their independent relationships to each sexually dimorphic acoustic parameter (F0, F0 − s.d. and Pf). Age was not significantly correlated with any acoustic parameter (all p > 0.3), so it was not included in these models. All variance inflation factors (VIFs) were less than 1.9. In addition, we explored zero-order correlations between measures of threat potential and acoustic parameters (table 4).
Table 3.
outcome | predictor | study 1 t | study 1 beta | study 2 t | study 2 beta |
---|---|---|---|---|---|
F0 | height | −2.46* | −0.16 | −2.39* | −0.66 |
F0 | weight | 0.05 | <0.01 | 2.53* | 0.94 |
F0 | arm strength | 0.71 | 0.06 | −2.16* | −0.47 |
F0 | physical aggression | 1.19 | 0.08 | — | — |
F0 | testosterone | −2.87** | −0.17 | — | — |
F0 | F0 − s.d. | 9.51*** | 0.60 | 5.09*** | 0.64 |
F0 | Pf | 0.15 | 0.01 | 1.15 | 0.17 |
F0 − s.d. | height | 0.82 | 0.05 | 1.13 | 0.37 |
F0 − s.d. | weight | 1.64 | 0.13 | −1.15 | −0.53 |
F0 − s.d. | arm strength | −1.78+ | −0.14 | 1.01 | 0.26 |
F0 − s.d. | physical aggression | −2.53* | −0.16 | — | — |
F0 − s.d. | testosterone | 1.56+ | 0.09 | — | — |
F0 − s.d. | F0 | 9.51*** | 0.58 | 5.09*** | 0.80 |
F0 − s.d. | Pf | 1.52 | 0.10 | −0.33 | −0.05 |
Pf | height | −2.11* | −0.17 | 0.49 | 0.21 |
Pf | weight | −0.79 | −0.08 | −1.45 | −0.84 |
Pf | arm strength | −1.43 | −0.13 | 1.05 | 0.35 |
Pf | physical aggression | −1.31 | −0.10 | — | — |
Pf | testosterone | 1.06 | 0.08 | — | — |
Pf | F0 | 0.15 | 0.01 | 1.15 | 0.32 |
Pf | F0 − s.d. | 1.52 | 0.14 | −0.33 | −0.08 |

+p < 0.1.
*p < 0.05.
**p < 0.01.
***p < 0.001.
Table 4.
study | measure | F0 | F0 − s.d. | Pf | Df |
---|---|---|---|---|---|
study 1 | height (n = 175) | −0.17* | −0.06 | −0.24** | −0.02 |
study 1 | weight (n = 175) | 0.01 | 0.02 | −0.23** | −0.01 |
study 1 | arm strength (n = 176) | −0.04 | −0.15+ | −0.26*** | −0.02 |
study 1 | physical aggression (n = 176) | −0.04 | −0.23** | −0.19** | −0.02 |
study 1 | testosterone (n = 176) | −0.15* | 0.01 | 0.11 | 0.05 |
study 2 | height (n = 32) | −0.20 | −0.09 | −0.38* | −0.11 |
study 2 | weight (n = 32) | −0.08 | −0.04 | −0.39* | −0.04 |
study 2 | arm strength (n = 31) | −0.17 | −0.07 | −0.23 | 0.10 |

+p < 0.1.
*p < 0.05.
**p < 0.01.
***p < 0.001.
In the regression model for mean F0 (F7,167 = 15.99, p < 0.0001), testosterone (t = −2.87, beta = −0.17, p = 0.005), height (t = −2.46, beta = −0.16, p = 0.015) and F0 − s.d. (t = 9.51, beta = 0.60, p < 0.0001) were significant predictors. In zero-order correlations, mean F0 was significantly negatively related to height (r175 = −0.17, p = 0.028) and testosterone (r175 = −0.15, p = 0.044), although these correlations do not survive Bonferroni correction for multiple tests (five predictors).
Physical aggression (t = −3.40, beta = −0.20, p < 0.001) and mean F0 (t = 9.67, beta = 0.58, p < 0.0001) significantly predicted F0 − s.d. in the regression model for this parameter (F2,173 = 53.90, p < 0.0001). In zero-order correlations, F0 − s.d. was significantly negatively correlated with physical aggression (r175 = −0.23, p = 0.003) and marginally significantly correlated with arm strength (r175 = −0.15, p = 0.053).
Height (t = −2.11, beta = −0.17, p = 0.036) significantly predicted Pf in the final model for Pf (F7,167 = 4.22, p < 0.001). In zero-order correlations, Pf was significantly negatively correlated with height (r175 = −0.24, p = 0.001), weight (r175 = −0.23, p = 0.002), physical aggressiveness (r175 = −0.19, p < 0.01) and arm strength (r175 = −0.26, p < 0.001).
No variable significantly or marginally significantly predicted Df (regression model: F7,167 = 0.16, p = 0.992, zero-order correlations: all p > 0.4).
6. Study 2: Hadza sample
(a). Methods
(i). Participants
The Hadza occupy remote savannah–woodland areas in northern Tanzania around Lake Eyasi and subsist on game killed with bow and arrow and foraged foods such as berries, baobab fruit and tubers. The Hadza number approximately 1000 and live in camps of approximately 30 individuals. Data were gathered from participants in eight Hadza camps. Thirty-two Hadza men (mean age = 29.6 ± 6.2, range = 19–40) and 43 Hadza women (mean age = 28.9 ± 6.3, range = 18–39) participated in this IRB-approved study.
(ii). Voice recording and measurement
Participants were recorded speaking the word ‘hujambo’, which loosely translates to ‘hello’ in English, inside a Land Rover with a Sennheiser MKH-60 microphone. Recordings were encoded in mono directly onto computer hard disk, using Sonic Foundry's Sound Forge at 44 100 Hz sampling rate and 16 bit quantization, and saved as uncompressed ‘.wav’ files.
Voice recordings were measured using methods identical to those in study 1 (table 1). Mean number of glottal pulses per recording was 128 ± 73. Less than 3 per cent of glottal pulses were excluded owing to formant misattributions in Praat. Formant means and standard deviations used to compute Pf were obtained using bootstrapping methods from 10 000 random samples with replacement of 32 women and all 32 men. With both sexes included, mean F1 = 425.3 ± 49.8 Hz, mean F2 = 1372.8 ± 138.1 Hz, mean F3 = 2794.4 ± 221.1 Hz and mean F4 = 3779.2 ± 240.3 Hz.
(iii). Anthropometry
Height was measured using a portable stadiometer, and weight was obtained using an electronic scale (Tanita Ultimate Scale 2000). Triceps skinfold measurements for the left arm were made in triplicate (Cronbach's α = 0.96) using Mitutoyo Absolute Digimatic CD 6 calipers and then averaged. Mid-upper arm circumference for the left arm was measured using a flexible tape measure. A dynamometer was used to measure left- and right-hand strengths, which correlated r31 = 0.83 and were thus averaged. (See table 2 for descriptive statistics.) Upper arm muscle mass was computed from the formula

$$\frac{(a - \pi s)^2}{4\pi}$$

where a is the mid-upper arm circumference (cm), and s is the average of three triceps skinfold measurements (cm). Hand strength and upper arm muscle mass were then standardized and averaged to produce the composite measure arm strength.
7. Results
(a). Sexual dimorphism in the voice
All acoustic parameters were highly sexually dimorphic (1.5 ≤ d ≤ 4.4, table 1). Importantly, Pf (d = 3.6) was more sexually dimorphic than Df (d = 2.4) (test for difference between dependent correlation coefficients (correlations between sex and Pf and Df): t72 = 2.81, p = 0.003). When both sex and height were entered to predict Pf and Df in separate general linear models, sex more strongly predicted Pf controlling for height (partial η² = 0.59 compared with 0.41 for Df, both p < 0.0001), and height more strongly predicted Pf controlling for sex (partial η² = 0.11, p = 0.004 compared with partial η² < 0.01, p > 0.5 for Df, electronic supplementary material, table S1). All other acoustic parameters remained highly sexually dimorphic after controlling for height (0.20 ≤ partial η² ≤ 0.66). In subsequent analyses, F0 − s.d. was log-transformed to correct positive skew.
(b). Correlates of acoustic parameters
Measures of threat potential (height, weight and arm strength), along with acoustic parameters as control variables, were entered simultaneously into separate multiple regressions (table 3) to explore independent relationships to each sexually dimorphic acoustic parameter (F0, F0 − s.d. and Pf). Age was not significantly correlated with any acoustic parameter (all p > 0.1), so it was not included in these models. The VIF for weight was between 9.4 and 11.1, indicating that some caution should be used in interpreting results for this variable [53]. In addition, we explored zero-order correlations between measures of threat potential and acoustic parameters (table 4).
In the regression model for mean F0 (F5,25 = 8.38, p < 0.0001), height (t = −2.39, beta = −0.66, p = 0.025), weight (t = 2.53, beta = 0.94, p = 0.018), arm strength (t = −2.16, beta = −0.47, p = 0.041) and F0 − s.d. (t = 5.09, beta = 0.64, p < 0.0001) were significant predictors. In zero-order correlations, a lower, more masculine F0 was associated with greater threat potential for all measures, but no correlation attained statistical significance. Although no variable significantly predicted Pf in the model for Pf (F5,25 = 1.51, p = 0.222), a more masculine Pf was associated with a significantly greater height (r32 = −0.38, p = 0.030) and weight (r32 = −0.39, p = 0.026) in zero-order correlations. However, these relationships are only marginally statistically significant after Bonferroni correction for multiple tests (three predictors). Only mean F0 significantly predicted F0 − s.d. (t = 5.09, beta = 0.80, p < 0.0001) in the model for this variable (F5,25 = 5.65, p = 0.001), and no zero-order correlation between F0 − s.d. and measures of threat potential attained statistical significance. No variable significantly or marginally significantly predicted Df in the regression model (F5,25 = 0.34, p = 0.886) or in zero-order correlations (all p > 0.5).
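The Bonferroni statements in both studies can be checked with simple arithmetic. The sketch below multiplies each reported zero-order p-value by the number of predictors tested (five in study 1, three in study 2). The function name and dictionary labels are hypothetical; the p-values and test counts are those given in the text.

```python
def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


# (reported p, number of predictors tested), as stated in the text.
reported = {
    "study 1: F0 with height": (0.028, 5),
    "study 1: F0 with testosterone": (0.044, 5),
    "study 2: Pf with height": (0.030, 3),
    "study 2: Pf with weight": (0.026, 3),
}

adjusted = {label: bonferroni(p, k) for label, (p, k) in reported.items()}
for label, p_adj in adjusted.items():
    print(f"{label}: adjusted p = {p_adj:.3f}")
```

All four adjusted values exceed 0.05, and the two study 2 values stay below 0.10, which matches their description as marginally significant after correction.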
8. Discussion
We proposed a novel method of measuring formant structure, formant position (Pf), and showed in two diverse human samples that Pf was more sexually dimorphic than the most widely used measure of formant structure, formant dispersion (Df). In both samples, Pf was more sexually dimorphic than Df when height was held constant, and more strongly related to height when sex was held constant. Mean fundamental frequency (F0) and the variation in F0 across an utterance (F0 − s.d.) were also highly sexually dimorphic in both populations. Especially remarkable was the finding that F0 − s.d. differed by sex when measured from a single three-syllable word (hujambo). This variable was highly sexually dimorphic whether measured in Hertz or in equivalent rectangular bandwidths (ERBs), a unit that scales more nearly linearly with perception [54]. Thus, it is unlikely that women's F0 varies more dramatically simply because, at the higher frequencies at which women speak, larger changes are required for the same perceptual effects.
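The reasoning about ERBs can be illustrated with a rough calculation. The text does not state which Hz-to-ERB conversion was used, so the sketch below assumes the widely used Glasberg and Moore ERB-rate formula, and it converts a band of one standard deviation either side of the mean F0 from table 1 (study 1). This is a simplification for illustration only, not the per-utterance computation that the study reports.

```python
import math


def hz_to_erb_rate(f_hz):
    """ERB-rate scale (Glasberg and Moore formula; an assumption here)."""
    return 21.4 * math.log10(1 + 0.00437 * f_hz)


def erb_span(mean_f0, sd_f0):
    """Width in ERB units of the band mean - SD to mean + SD."""
    return hz_to_erb_rate(mean_f0 + sd_f0) - hz_to_erb_rate(mean_f0 - sd_f0)


# Mean F0 and mean F0 - s.d. (Hz) for study 1, from table 1.
women_span = erb_span(201.7, 35.5)
men_span = erb_span(111.4, 15.0)
print(women_span, men_span)
```

Under these assumptions the women's band remains clearly wider than the men's on the ERB scale, in line with the point that the sex difference in F0 variation is not just a by-product of women's higher speaking frequencies.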
Sex differences in mean F0 and formant frequencies depend largely on the anatomy of the vocal apparatus [2]. By contrast, it seems likely that the sex difference in F0 − s.d. mainly reflects neuropsychological differences affecting control of the vocal apparatus. In this regard, a highly varied F0 may be analogous to smiling—both signal affiliation and deference [23], and both are expressed to a greater degree among women [55]. Perhaps reliably emerging neurodevelopmental processes, possibly influenced by sex hormones, underlie the apparent cross-cultural ubiquity of both sex differences, but these processes are also modifiable by social learning [55,56].
Each sexually dimorphic acoustic parameter in each sample was predicted by one or more measures of men's physical threat potential, with the exception of F0 − s.d. in the Hadza sample. This last result is unsurprising, given the modest sample size and paucity of vocal information (one three-syllable word) available for assessing individual differences in F0 − s.d. In the larger US sample with more vocal information, a low (masculine) F0 − s.d. was related to increased physical aggressiveness and marginally significantly greater arm strength.
Taller men with higher testosterone levels spoke in a lower (more masculine) F0 in the US sample. Hadza men who spoke in a lower F0 were also taller and were stronger. Interestingly, they also tended to be lighter, with height and arm strength statistically controlled, suggesting less body fat. However, entering skinfold thickness into the model did not change these relationships.
Across men and women in both US and Hadza samples, controlling for sex, taller individuals spoke with a more masculine (lower) Pf, indicating longer vocal tracts. In the US sample, men who spoke with a lower Pf were taller, heavier, stronger and reported more physical aggression. Only height predicted a lower Pf when other predictors were held statistically constant via multiple regression, although the non-significance of other predictors probably reflects collinearity. For example, when height and arm strength were entered alone into a multiple regression model, both were significant predictors (p < 0.01). In the Hadza sample, taller and heavier men also spoke with a more masculine Pf, although these associations were only marginally statistically significant after Bonferroni correction for multiple tests. No measure of physical formidability significantly predicted Pf in the multiple regression model, again probably reflecting collinearity among predictor variables (0.43 < r < 0.85, for correlations among height, weight and strength).
Height significantly predicted Df only in the US dataset with both sexes included, perhaps owing to the increased sample size. Df was unrelated to any other measure of physical formidability in either US or Hadza samples.
9. Conclusion
Lower fundamental and formant frequencies in men reflect the hypertrophic growth of the male larynx [8] and its descent in the trachea a full vertebra lower than in women [57]. These pubertal changes may have evolved in order to exaggerate body size [44], but same-sex competitors and potential mates should evolve to attend to signals in proportion to the signals' reliability at communicating useful information. Our results support the notion that certain acoustic features of men's voices are honest signals of threat potential. It is possible that adult physiology and/or developmental influences such as pubertal testosterone and growth hormone affect both vocal anatomy and physical prowess. Certain psychological states (e.g. self-confidence) may also modulate the position of the larynx and vocal fold tension, and these psychological states may be related to physical formidability. These results thus help clarify widespread attributions of physical prowess to men with masculine voices. The two most sexually dimorphic acoustic features, F0 and Pf, in particular showed the strongest relationships with size, strength and testosterone levels across our samples.
Each of our samples has advantages and disadvantages: the US sample offers a larger sample size (the largest of which we are aware for exploring such relationships) and tighter methodological control, whereas the Hadza sample offers greater ecological validity. Our results agree with some previous results and contrast with others, highlighting the need for future work to address these discrepancies. One result contrasting with previous research was the significant negative relationship that we observed in both samples between men's height and F0. We note that the zero-order correlation was not significant in the Hadza sample (though it was of greater magnitude than in the US sample: r = −0.20 versus −0.17) and suggest that the discrepancy is due to sample size. Similarly, the weighted mean correlation between men's height and F0 measured from continuous speech was approximately −0.17 across four samples reported by Sell et al. [35], yet individual sample sizes were modest (20–63), and no correlation attained statistical significance.
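The sample-size explanation can be illustrated with a rough calculation. The sketch below is an illustration only, not the test used in this study: it approximates the two-tailed p-value of a Pearson correlation with the Fisher z transformation, and asks how large a sample must be before a correlation of −0.17 reaches p < 0.05. The function names are hypothetical, and the sample size of 200 is an arbitrary illustrative value; 20 and 63 are the bounds quoted above for the samples of Sell et al. [35].

```python
import math


def approx_p_value(r: float, n: int) -> float:
    """Approximate two-tailed p-value for a Pearson r (Fisher z method)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


def min_n_for_significance(r: float, alpha: float = 0.05) -> int:
    """Smallest n at which a correlation of size r gives p < alpha."""
    if r == 0:
        raise ValueError("a zero correlation never reaches significance")
    n = 4  # the Fisher z approximation needs n > 3
    while approx_p_value(r, n) >= alpha:
        n += 1
    return n


# 20 and 63 are the sample-size bounds quoted in the text;
# 200 is a hypothetical larger sample for comparison.
for n in (20, 63, 200):
    print(f"n = {n}: p = {approx_p_value(-0.17, n):.3f}")

print("minimum n for p < 0.05:", min_n_for_significance(-0.17))
```

Under this approximation, a correlation of this size needs well over 100 participants to reach conventional significance, consistent with the suggestion that non-significant results in smaller samples reflect limited power rather than a different underlying relationship.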
Future research should explore additional correlates of physical competitive ability (e.g. past success in fights and peer rankings of physical formidability), particularly in traditional societies. Future work should especially address F0 − s.d., as little evidence is available on this promising acoustic parameter, and should explore the utility of Pf as an inverse measure of apparent vocal tract length and vocal masculinity. In unisex samples, within-sex means and standard deviations may be used in calculating Pf values; doing so had a minimal effect on the present results. Finally, researchers should continue to investigate the information content of these acoustic parameters across non-human primates.
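For readers who wish to compute Pf in a unisex sample, the following minimal sketch shows one way to do so. It assumes that Pf is the mean of the standardized (z-scored) values of the first four formants, with each formant standardized against the sample mean and standard deviation for that formant; here the reference sample is a single sex, as suggested above. The formant values are invented for illustration, and the function names are hypothetical.

```python
from statistics import mean, stdev


def reference_stats(sample):
    """Per-formant mean and standard deviation across speakers."""
    columns = list(zip(*sample))  # one tuple per formant (F1, F2, ...)
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def formant_position(formants, ref_means, ref_sds):
    """Assumed definition: mean z-score across a speaker's formants."""
    return mean((f - m) / s for f, m, s in zip(formants, ref_means, ref_sds))


# Hypothetical F1-F4 values in Hz for three male speakers (not real data).
men = [
    (480.0, 1400.0, 2450.0, 3400.0),
    (500.0, 1450.0, 2500.0, 3500.0),
    (520.0, 1500.0, 2550.0, 3600.0),
]

# Within-sex standardization: the reference statistics come from men only.
means, sds = reference_stats(men)
pf_values = [formant_position(speaker, means, sds) for speaker in men]
print(pf_values)
```

In this toy sample the first speaker has the lowest formants and therefore the lowest (most masculine) Pf. To use between-sex standardization instead, the reference statistics would be computed from a pooled sample of men and women.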
References
- 1. Green S. M. 1981. Sex differences and age gradations in vocalizations of Japanese and lion-tailed monkey. Am. Zool. 21, 165–184
- 2. Rendall D., Kollias S., Ney C., Lloyd P. 2005. Pitch (F0) and formant profiles of human vowels and vowel-like baboon grunts: the role of vocalizer body size and voice-acoustic allometry. J. Acoust. Soc. Am. 117, 944–955 (doi:10.1121/1.1848011)
- 3. Mitani J. C., Gros-Louis J. 1995. Species and sex differences in the screams of chimpanzees and bonobos. Int. J. Primatol. 16, 393–411 (doi:10.1007/BF02735794)
- 4. Childers D. G., Wu K. 1991. Gender recognition from speech. II. fine analysis. J. Acoust. Soc. Am. 90, 1841–1856 (doi:10.1121/1.401664)
- 5. Daly N., Warren P. 2001. Pitching it differently in New Zealand English: speaker sex and intonation patterns. J. Sociolinguist. 5, 85–96 (doi:10.1111/1467-9481.00139)
- 6. Tossi O., Postan D., Bianculli C. 1976. Longitudinal study of children's voice at puberty. XVIth Int. Cong. Logop. Phoniatr. 1976, 486–490
- 7. Newman S. R., Butler J., Hammond E. H., Gray S. D. 2000. Preliminary report on hormone receptors in the human vocal fold. J. Voice. 14, 72–81 (doi:10.1016/S0892-1997(00)80096-X)
- 8. Harries M., Hawkins S., Hacking J., Hughes I. 1998. Changes in the male voice at puberty: vocal fold length and its relationship to the fundamental frequency of the voice. J. Laryngol. Otol. 112, 451–454 (doi:10.1017/S0022215100140757)
- 9. Fitch W. T., Giedd J. 1999. Morphology and development of the human vocal tract: a study using magnetic resonance imaging. J. Acoust. Soc. Am. 106, 1511–1522 (doi:10.1121/1.427148)
- 10. Fant G. 1960. Acoustic theory of speech production. The Hague, The Netherlands: Mouton
- 11. Puts D. A., Gaulin S. J. C., Verdolini K. 2006. Dominance and the evolution of sexual dimorphism in human voice pitch. Evol. Hum. Behav. 27, 283–296 (doi:10.1016/j.evolhumbehav.2005.11.003)
- 12. Delgado R. A. 2006. Sexual selection in the loud calls of male primates: signal content and function. Int. J. Primatol. 27, 5–25 (doi:10.1007/s10764-005-9001-4)
- 13. Andersson M. 1994. Sexual selection. Princeton, NJ: Princeton University Press
- 14. Feinberg D. R., DeBruine L. M., Jones B. C., Perrett D. I. 2008. The role of femininity and averageness of voice pitch in aesthetic judgments of women's voices. Perception 37, 615–623 (doi:10.1068/p5514)
- 15. Puts D. A., Barndt J. L., Welling L. L. M., Dawood K., Burriss R. P. 2011. Intrasexual competition among women: vocal femininity affects perceptions of attractiveness and flirtatiousness. Pers. Indiv. Differ. 50, 111–115 (doi:10.1016/j.paid.2010.09.011)
- 16. Apicella C. L., Feinberg D. R. 2009. Voice pitch alters mate-choice-relevant perception in hunter-gatherers. Proc. R. Soc. B 276, 1077–1082 (doi:10.1098/rspb.2008.1542)
- 17. Trivers R. L. 1972. Parental investment and sexual selection. In Sexual selection and the descent of man, 1871–1971 (ed. Campbell B.), pp. 136–179 London, UK: Heinemann
- 18. Puts D. A. 2005. Mating context and menstrual phase affect women's preferences for male voice pitch. Evol. Hum. Behav. 26, 388–397 (doi:10.1016/j.evolhumbehav.2005.03.001)
- 19. Hodges-Simeon C. R., Gaulin S. J., Puts D. A. 2011. Voice correlates of mating success in men: examining ‘contests’ versus ‘mate choice’ modes of sexual selection. Arch. Sex Behav. 40, 551–557 (doi:10.1007/S10508-010-9625-0)
- 20. Apicella C. L., Feinberg D. R., Marlowe F. W. 2007. Voice pitch predicts reproductive success in male hunter–gatherers. Biol. Lett. 3, 682–684 (doi:10.1098/rsbl.2007.0410)
- 21. Feinberg D. R., Jones B. C., Little A. C., Burt D. M., Perrett D. I. 2005. Manipulations of fundamental and formant frequencies affect the attractiveness of human male voices. Anim. Behav. 69, 561–568 (doi:10.1016/j.anbehav.2004.06.012)
- 22. Collins S. A. 2000. Men's voices and women's choices. Anim. Behav. 60, 773–780 (doi:10.1006/anbe.2000.1523)
- 23. Hodges-Simeon C. R., Gaulin S. J., Puts D. A. 2010. Different vocal parameters predict perceptions of dominance and attractiveness. Hum. Nat. 21, 406–427 (doi:10.1007/s12110-010-9101-5)
- 24. Puts D. A. 2010. Beauty and the beast: mechanisms of sexual selection in humans. Evol. Hum. Behav. 31, 157–175 (doi:10.1016/j.evolhumbehav.2010.02.005)
- 25. Puts D. A., Hodges C., Cardenas R. A., Gaulin S. J. C. 2007. Men's voices as dominance signals: vocal fundamental and formant frequencies influence dominance attributions among men. Evol. Hum. Behav. 28, 340–344 (doi:10.1016/j.evolhumbehav.2007.05.002)
- 26. Jones B. C., Feinberg D. R., DeBruine L. M., Little A. C., Vukovic J. 2010. A domain-specific opposite-sex bias in human preferences for manipulated voice pitch. Anim. Behav. 79, 57–62 (doi:10.1016/j.anbehav.2009.10.003)
- 27. Wolff S. E., Puts D. A. 2010. Vocal masculinity is a robust dominance signal in men. Behav. Ecol. Sociobiol. 64, 1673–1683 (doi:10.1007/s00265-010-0981-5)
- 28. Morton E. S. 1977. On the occurrence and significance of motivation-structural rules in some bird and mammal species. Am. Nat. 111, 855–869 (doi:10.1086/283219)
- 29. Bruckert L., Lienard J. S., Lacroix A., Kreutzer M., Leboucher G. 2006. Women use voice parameters to assess men's characteristics. Proc. R. Soc. B 273, 83–89 (doi:10.1098/rspb.2005.3265)
- 30. Dabbs J. M., Mallinger A. 1999. High testosterone levels predict low voice pitch among men. Pers. Indiv. Differ. 27, 801–804 (doi:10.1016/S0191-8869(98)00272-4)
- 31. Evans S., Neave N., Wakelin D., Hamilton C. 2008. The relationship between testosterone and vocal frequencies in human males. Physiol. Behav. 93, 783–788 (doi:10.1016/j.physbeh.2007.11.033)
- 32. Archer J. 1991. The influence of testosterone on human aggression. Br. J. Psychol. 82, 1–28 (doi:10.1111/j.2044-8295.1991.tb02379.x)
- 33. Clark A. S., Henderson L. P. 2003. Behavioral and physiological responses to anabolic-androgenic steroids. Neurosci. Biobehav. Rev. 27, 413–436 (doi:10.1016/S0149-7634(03)00064-2)
- 34. Archer J. 2006. Testosterone and human aggression: an evaluation of the challenge hypothesis. Neurosci. Biobehav. Rev. 30, 319–345 (doi:10.1016/j.neubiorev.2004.12.007)
- 35. Sell A., Bryant G. A., Cosmides L., Tooby J., Sznycer D., von Rueden C., Krauss A., Gurven M. 2010. Adaptations in humans for assessing physical strength from the voice. Proc. R. Soc. B 277, 3509–3518 (doi:10.1098/rspb.2010.0769)
- 36. Graddol D., Swann J. 1983. Speaking fundamental frequency: some physical and social correlates. Lang. Speech 26, 351–366
- 37. Evans S., Neave N., Wakelin D. 2006. Relationships between vocal characteristics and body size and shape in human males: an evolutionary explanation for a deep male voice. Biol. Psychol. 72, 160–163 (doi:10.1016/j.biopsycho.2005.09.003)
- 38. Kunzel H. J. 1989. How well does average fundamental frequency correlate with speaker height and weight? Phonetica 46, 117–125 (doi:10.1159/000261832)
- 39. Lass N. J., Brown W. S. 1978. Correlational study of speakers' heights, weights, body surface areas, and speaking fundamental frequencies. J. Acoust. Soc. Am. 63, 1218–1220 (doi:10.1121/1.381808)
- 40. van Dommelen W. A., Moxness B. H. 1995. Acoustic parameters in speaker height and weight identification: sex-specific behaviour. Lang. Speech 38, 267–287
- 41. Greisbach R. 1999. Estimation of speaker height from formant frequencies. Forensic Linguist. 6, 265–277 (doi:10.1558/sll.1999.6.2.265)
- 42. Gonzalez J. 2004. Formant frequencies and body size of speaker: a weak relationship in adult humans. J. Voice 32, 277–287
- 43. Rendall D., Vokey J. R., Nemeth C. 2007. Lifting the curtain on the Wizard of Oz: biased voice-based impressions of speaker size. J. Exp. Psychol. Hum. Percept. Perform. 33, 1208–1219 (doi:10.1037/0096-1523.33.5.1208)
- 44. Fitch W. T. 1997. Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques. J. Acoust. Soc. Am. 102, 1213–1222 (doi:10.1121/1.421048)
- 45. Ey E., Pfefferle D., Fischer J. 2007. Do age- and sex-related variations reliably reflect body size in non-human primate vocalizations? Rev. Primates 48, 253–267 (doi:10.1007/s10329-006-0033-y)
- 46. Fairbanks G. 1960. Voice and articulation drillbook, 2nd edn. New York, NY: Harper & Row
- 47. Boersma P., Weenink D. 2009. Praat: doing phonetics by computer, v. 5.1.03 (cited 21 March 2009). See http://www.praat.org/
- 48. Baken R. J. 1987. Clinical measurement of speech and voice. Boston, MA: Little, Brown, and Co
- 49. Cohen J. 1990. Things I have learned (so far). Am. Psychol. 45, 1304–1312 (doi:10.1037/0003-066X.45.12.1304)
- 50. Wang C., Plymate S., Nieschlag E., Paulsen C. A. 1981. Salivary testosterone in men: further evidence of a direct correlation with free serum testosterone. J. Clin. Endocrinol. Metab. 53, 1021–1024 (doi:10.1210/jcem-53-5-1021)
- 51. Moffat S. D., Hampson E. 1996. Salivary testosterone levels in left- and right-handed adults. Neuropsychologia 34, 225–233 (doi:10.1016/0028-3932(95)00090-9)
- 52. Buss A. H., Perry M. 1992. The aggression questionnaire. J. Pers. Soc. Psychol. 63, 452–459 (doi:10.1037/0022-3514.63.3.452)
- 53. Kutner M., Nachtsheim C., Neter J. 2004. Applied linear regression models, 4th edn. New York, NY: McGraw-Hill/Irwin
- 54. Moore B. C., Glasberg B. R. 1983. Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J. Acoust. Soc. Am. 74, 750–753 (doi:10.1121/1.389861)
- 55. Ellis L., Das S. 2011. Sex differences in smiling and other photographed traits: a theoretical assessment. J. Biosoc. Sci. 43, 345–351 (doi:10.1017/S0021932010000659)
- 56. Henton C. 1995. Pitch dynamism in female and male speech. Lang. Commun. 15, 43–61 (doi:10.1016/0271-5309(94)00011-Z)
- 57. Harrison D. F. N. 1995. The anatomy and physiology of the mammalian larynx. Cambridge, NY: Cambridge University Press