Scientific Data. 2026 Mar 5;13:602. doi: 10.1038/s41597-026-06976-z

A validated Mandarin Chinese Auditory Emotion Database of Subject-Personal-Pronoun Sentences (MCAE-SPPS)

Mengyuan Li 1,2, Anqi Zhou 2, Huiru Yan 2, Qiuhong Li 2, Chifen Ma 2, Chao Wu 2
PMCID: PMC13079751  PMID: 41786802

Abstract

Emotional expression in speech varies with grammatical subjects, including personal pronouns. This study reports the development and validation of a novel Mandarin Chinese auditory emotional speech dataset comprising sentences with first-, second-, and third-person pronouns. Six professionally trained actors recorded 200 semantically meaningful sentences in a neutral tone and in six basic emotions: happiness, sadness, anger, fear, disgust, and surprise. Emotional labels and intensity ratings were provided by 720 native Chinese-speaking college students. The final dataset comprises 6,675 validated recordings: neutral (1,169), sadness (1,187), anger (671), surprise (969), disgust (738), happiness (785), and fear (671). Of these, 2,729 recordings contain first-person pronouns, 2,608 contain second-person pronouns, and 1,338 contain third-person pronouns. The dataset demonstrated acceptable inter-rater reliability and robust associations between acoustic features and emotion recognition performance. Each recording includes the raw waveform file, emotion recognition rates, perceived intensity ratings, and a comprehensive set of extracted acoustic features. This validated emotional speech corpus offers a unique and valuable resource for research in linguistics, psychological science, neuroscience, and clinical rehabilitation.

Subject terms: Psychology, Research data

Background & Summary

Subject-pronoun sentences are frequently used, either consciously or unconsciously, in everyday interpersonal communication. Subject pronouns such as “I,” “we,” and “you” often carry substantial emotional salience, as they directly reference the speaker or the listener, thereby occupying a central role in both emotion expression and perception in spoken language1,2. Investigating how emotions are conveyed through pronoun-based sentences provides insight into subtle variations in prosodic features, including pitch, intonation, and temporal dynamics that characterize different emotional states, particularly in populations with mental health disorders3–5. Such knowledge is essential for the development of effective strategies for diagnosis, monitoring, and intervention of emotional and communicative impairments6,7. Moreover, the performance of automatic emotion recognition systems relies on a comprehensive understanding of emotional phonetics, including the modulatory role of subject pronouns in emotional expression8. Integrating pronoun-specific emotional cues into these systems may substantially enhance their sensitivity to human emotions and altered mental states, thereby improving their applicability in clinical assessment and human–machine interaction contexts9,10.

Emotional speech databases (ESDs) constitute essential resources for advancing research in emotion recognition11–13, linguistic analysis14,15, emotion-language processing16,17, emotionally intelligent systems18,19, and mental health monitoring and cognitive training20,21. To date, numerous countries have developed emotional speech databases in their native languages, as comprehensively summarized in previous reviews22,23. Recently established large-scale ESDs include the Italian Database of Elicited Mood in Speech (DEMoS)24, an Urdu emotional speech corpus comprising 2,500 utterances based on emotionally neutral coherent sentences25, the SUST Bangla Emotional Speech Corpus (SUBESCO)26, and a Quechua Collao corpus containing 12,420 stimuli27. Language-specific resources for Mandarin and Cantonese have also been developed, including the Mandarin Chinese Auditory Emotions Nonsense Sentences (MCAE-NS) database14, the Mandarin Chinese Auditory Emotions monosyllables (MCAE-MS) database28, and the Cantonese Audio-Visual Emotional Speech (CAVES) dataset29.

Although several existing databases include some sentences containing subject personal pronouns, none are specifically designed to examine pronoun-related emotional expression in a systematic manner. For instance, the widely used Berlin Emotional Speech Database (EmoDB; https://www.tu.berlin/en/kw/research/projects/emotional-speech) comprises a set of short German sentences, some of which begin with subject pronouns, produced by 10 professional speakers (five male and five female, aged 25–35) across seven emotional categories: happiness, boredom, sadness, fear, disgust, surprise, and neutrality. However, the inclusion of pronoun-based sentences in EmoDB is incidental rather than theoretically motivated. Similarly, the Interactive Emotional Dyadic Motion Capture database (IEMOCAP)30 contains naturalistic conversational speech that includes subject pronouns (e.g., “I”, “We”, “You”, “He”) embedded within dialogue exchanges performed by five actors (two female, three male) and annotated across ten emotional categories, including: happy, sad, angry, neutral, frustrated, excited, surprised, fearful, disgusted, and others. Although IEMOCAP offers rich contextual and multimodal information, pronoun usage is not explicitly controlled or annotated as a variable of interest. The Crowd-Sourced Emotional Multimodal Actors Dataset (CREMA-D; https://github.com/CheyneyComputerScience/CREMA-D) contains five English sentences beginning with the first-person singular “I”, spoken by 91 actors (48 male and 43 female) in six emotional states: happiness, sadness, anger, fear, disgust, and neutrality. Despite the consistent use of a subject pronoun, the dataset was not designed to systematically investigate the emotional modulation associated with different pronoun types or grammatical perspectives. In Mandarin Chinese, existing emotional speech resources such as the CASIA Emotional Speech Dataset consist of recordings from four professional actors (two male and two female) who produced 50 sentences across six emotional categories: neutrality, anger, fear, sadness, surprise, and happiness31. However, the dataset is broad in scope and does not specifically target sentences containing subject personal pronouns. Likewise, the Chinese Expressive Audio-Visual Database (CHEAVD)32 includes audio-visual recordings of multiple speakers expressing emotions such as neutrality, happiness, sadness, anger, fear, and surprise, and contains some utterances with subject pronouns. Nonetheless, these pronoun-containing sentences are neither systematically selected nor explicitly annotated for analyses focusing on pronoun-related emotional expression. Overall, while subject pronouns appear sporadically across several widely used emotional speech databases, there remains a clear lack of dedicated resources that explicitly control, annotate, and analyze the emotional prosody associated with different subject pronoun categories. This gap highlights the need for a specialized emotional speech corpus designed to investigate pronoun-specific emotional encoding and perception.

Pronouns are subtle yet powerful linguistic elements that substantially shape the perception and interpretation of emotion in spoken language. Their influence stems from their capacity to direct attentional focus, evoke social dynamics, and modulate prosodic features. Consequently, pronouns constitute a critical dimension of emotional speech analysis, with important implications for computational emotion recognition systems as well as clinical and therapeutic applications. Different classes of pronouns are associated with distinct emotional and interpersonal functions. First-person plural pronouns (e.g., “we”) tend to emphasize shared experience and social affiliation, fostering a sense of intimacy and connectedness between speakers and listeners. In contrast, first-person singular pronouns (e.g., “I,” “me”) have been linked to increased self-focus, interpersonal distress, and more intrusive communication styles, particularly in clinical populations33–35. Second-person pronouns (e.g., “you”) directly address the listener and often heighten emotional salience and perceived intensity by increasing personal relevance and engagement36–38. Third-person pronouns (e.g., “he,” “she,” “they”), by contrast, introduce psychological distance between the speaker and the referent, potentially attenuating emotional immediacy while promoting a more detached or reflective tone39–41. Neuroimaging evidence further underscores the relevance of pronouns in emotional speech processing. A meta-analysis identified functional convergence in the left posterior middle and superior temporal gyri during pronoun processing42. Additionally, Schirmer (2018) highlighted the involvement of bilateral primary and secondary temporal cortices in the processing of auditory emotional speech, suggesting that these regions may play integrative roles in decoding and expressing emotional prosody, particularly when subject pronouns are present.

Existing emotional speech databases have provided valuable insights into emotion recognition and speech processing. However, they rarely account for the role of subject personal pronouns—linguistic elements that are central to both emotional and social communication. Pronouns such as “I,” “you,” and “we” convey distinct self-referential and interpersonal meanings, which can substantially influence emotional salience and cognitive processing during speech perception. To address this gap, we developed a specialized corpus featuring six types of subject personal pronoun sentences, each annotated with detailed emotional labels. The corpus includes parallel utterances produced by six professional Mandarin actors, facilitating both speaker voice conversion (i.e., the same sentence expressed by different speakers) and emotional voice conversion (i.e., the same speaker expressing different emotions)23. This design enhances the corpus’s utility for human–machine interaction research, including emotionally responsive AI and personalized therapeutic applications43–45. Moreover, the resource enables interdisciplinary investigations across linguistics, psychology, neuroscience, and computational science by allowing exploration of pronoun-specific emotional cues, testing theoretical models in social cognition and appraisal theory, and informing emotion-based interventions aimed at improving empathy, self-awareness, and social communication.

This manuscript represents the first formal, peer-reviewed report detailing the design, validation, and application of the MCAE-SPPS database. To increase dataset visibility and encourage academic sharing, we previously created a descriptive, non-peer-reviewed metadata page on IEEE DataPort (https://ieee-dataport.org/documents/mandarin-chinese-auditory-emotions-stimulus-database-validated-set-sentences-subjective). This page does not constitute a publication, and no version of this manuscript has been published by IEEE. The IEEE DataPort entry simply redirects users to the official dataset repository hosted on OSF: 10.17605/OSF.IO/9JYZC. (View-only link: https://osf.io/9jyzc/overview?view_only=088ce4e15a914b939c8bb6bd119c7226)

Methods

Script creation

The recording script comprises 200 emotionally neutral Chinese sentences, all declarative and following a consistent subject–predicate–object structure (e.g., “I have a plan”). Each sentence contains 4–7 high-frequency Chinese characters, chosen to minimize unintended affective connotations.

The script construction included two phases. In the first phase, 40 base sentences were created using the first-person singular pronoun “I” (see the sentence corpus file available in the OSF repository: “sentence corpus information.xlsx”46). Twenty independent raters evaluated the semantic valence of each sentence on a 9-point Likert scale (1 = very sad, 5 = neutral, 9 = very happy). The sentence order was randomized for each participant, and informed consent was obtained electronically. The average semantic valence across sentences was 5.32 ± 0.95, and no sentence exceeded the neutral midpoint of 5 plus two standard deviations; therefore, all 40 sentences were retained. In the second phase, these base sentences were adapted by replacing the first-person singular pronoun with five additional subject personal pronouns: “We,” “You” (singular), “You” (plural), “He (他)”, and “They.” Only the subject pronoun was modified; verb and object components remained identical to preserve semantic equivalence. This process resulted in a total of 200 semantically neutral sentences: 160 sentences with pronouns “I,” “We,” “You” (singular), and “You” (plural) (40 sentences per pronoun) and 40 sentences with third-person pronouns (“He”(他) and “They”, 20 sentences each). Note that in Mandarin, the spoken forms of “He” (他) and “She” (她) are phonetically identical; thus, only the orthographic form “他” was used for the third-person singular condition.

Recording sessions

Recording actors

Six professional actors (Actor 1 to Actor 6; three males and three females; mean age = 32.83 ± 5.98 years), all native Mandarin speakers and graduates of the Central Academy of Drama, participated in this study. Basic information of the actors is shown in Table 1. Before the recording sessions, all actors provided written informed consent and received financial compensation for their participation.

Table 1.

Voice Actors’ Information.

Actor Age Gender Voice Acting/Performance experience
Actor 1 39 Male 21 years of voice acting and performance
Actor 2 36 Male 18 years of voice acting
Actor 3 24 Male 4 years of voice acting and performance
Actor 4 34 Female 26 years of voice acting and performance
Actor 5 37 Female 16 years of voice acting and performance
Actor 6 27 Female 4 years of voice acting

Recording procedures

Prior to the recording sessions, each actor was provided with the full set of 200 sentences to familiarize themselves with the text. The researcher briefly explained the general procedures and allowed the actors to practice expressing the target emotions using these sentences. During the recording sessions, actors were instructed to convey the six target emotions (happiness, sadness, anger, fear, disgust, and surprise) and the neutral state as authentically as possible. Each actor recorded all 200 sentences in each of the seven emotional categories, resulting in a total of 1,400 recordings per actor. The coding system comprised actor codes (1–6), emotion category codes (1–7; 1 = neutral, 2 = happiness, 3 = anger, 4 = fear, 5 = sadness, 6 = disgust, and 7 = surprise; see Table 2), and sentence codes (001–200). After excluding 21 recordings due to mispronunciation or technical errors, 8,379 recordings remained for validation (actor 1: 1,398; actor 2: 1,398; actor 3: 1,394; actor 4: 1,395; actor 5: 1,398; actor 6: 1,396). The distribution of recordings across emotional categories was as follows: neutral: 1,197; sadness: 1,197; fear: 1,196; anger: 1,199; disgust: 1,198; surprise: 1,195; and happiness: 1,197.

Recording environment and equipment

All recordings were conducted in a professional soundproof studio. Speech signals were captured using a DJI Mic-AST01 wireless microphone and digitized at a sampling rate of 48 kHz with 64-bit resolution across two channels.

Segmentation and preprocessing

The recordings were manually segmented and coded using Adobe Audition (version 13.0.5.36). All 8,379 speech samples were peak-normalized and saved individually in WAV format using MATLAB R2023a.
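
For users who wish to reproduce this preprocessing step outside MATLAB, the following is a minimal Python sketch of peak normalization for a single WAV file. It assumes the numpy and soundfile packages and hypothetical file paths; it is not the authors' original MATLAB script.

    import numpy as np
    import soundfile as sf

    def peak_normalize(in_path, out_path, target_peak=0.99):
        """Scale a waveform so its maximum absolute amplitude equals target_peak."""
        audio, sr = sf.read(in_path)          # audio: float array, sr: sampling rate (Hz)
        peak = np.max(np.abs(audio))
        if peak > 0:
            audio = audio * (target_peak / peak)
        sf.write(out_path, audio, sr)

    # Hypothetical usage:
    # peak_normalize("raw/13171_raw.wav", "normalized/13171.wav")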

Annotation procedure

Participants

A total of 818 Chinese college students were recruited through online advertisements. Ninety-seven individuals were excluded based on screening scores ≥ 10 on either the Generalized Anxiety Disorder Scale (GAD-7) or the 9-item Patient Health Questionnaire Depression Scale (PHQ-9), consistent with prior evidence indicating that anxiety and depressive symptoms may compromise emotion recognition accuracy47,48. One additional participant was excluded due to a self-reported history of a neurological disorder. The final sample included 720 participants (412 females and 308 males; mean age = 21.60 ± 2.96 years), who were included in the validation analysis.

Procedure

The validation procedure was conducted via a custom-designed website developed in Java (version JDK 1.8) using the IntelliJ IDEA integrated development environment (version 2019.3.5). Participants received remote instructions from the experimenter through WeChat or a digital instruction manual. The experimenter remained available online throughout the initial phase of the experiment to guide participants, ensure task comprehension, and resolve any questions or technical issues.

Participants completed the experiment on personal computers in a quiet environment, using a self-selected comfortable listening level. Access to the experimental platform was granted via a unique username and password. Upon clicking the “Begin” button, each vocal utterance was presented automatically. After listening to the audio stimuli, participants were instructed to identify the perceived emotion category (neutral, happiness, anger, sadness, fear, surprise, or disgust) and to rate its emotional intensity on a 9-point scale. Participants were allowed to replay each utterance as needed by clicking the play button. Once an evaluation was submitted, the subsequent utterance was presented automatically, and participants were not allowed to revise previous responses.

Given the substantial time required to evaluate all 8,379 utterances—estimated at approximately 10 hours, based on an average of 1.3 seconds for listening and 3 seconds for rating per utterance— we implemented a strategy to minimize participant fatigue and maintain attentional engagement14. The 720 participants were divided into 18 groups, each consisting of 40 participants. Each group was assigned to evaluate one-third of the recordings from a single speaker, encompassing all seven emotional categories. Consequently, each participant evaluated a representative subset of emotional utterances from one actor rather than being restricted to a single emotion category. On average, each participant rated approximately 465 utterances, which were presented in a randomized order within each group. On average, participants spent 35–40 minutes evaluating approximately one-third of the utterances from one of the six speakers. In addition to the sentence-based evaluations, participants also completed assessments of approximately 1.5 hours of monosyllabic emotional speech, as reported in our previous work28. Consequently, the total evaluation time per participant ranged from 2 to 2.5 hours. To reduce cognitive load and maintain data quality, participants were encouraged to take two to three breaks as needed throughout the experiment.

Ethics statement

All study procedures were reviewed and approved by the Peking University Biomedical Ethics Committee (IRB00001052_23144). Participants were recruited via Chinese social media platforms, including WeChat, Rednote, and Weibo. Electronic informed consent was obtained prior to participation through the SoJump platform. The consent form detailed the study objectives, data collection procedures, and assessment tasks. Raters were informed that the collected data would be shared in anonymized form for non-commercial use. All personally identifiable information was removed to ensure anonymity. Participants were also informed of the potential risks and benefits of participation and received appropriate compensation. Voice actors were specifically informed that their recordings would be anonymous and made publicly available for non-commercial use.

Data Records

The dataset is publicly available on OSF46. It contains 6,675 audio segments. An overview of the corpus is provided in Table 2. Audio files are organized into separate folders by emotion category to facilitate preview, and compressed versions of these folders are available in the directory entitled audio_zips_by_different_emotion. Each audio file is named according to a standardized convention that includes the actor code (1–6), emotion category code (1–7), and a five-digit sentence code (001–200). Detailed naming rules are described in Table 2. Comprehensive metadata for each audio file are provided in the file sentence_corpus_information.xlsx, which includes information on emotion category, perceived emotional intensity, actor code, validation results, sentence content, and acoustic features. Additional documentation is available in the Files_description.txt.

Table 2.

Summary of the Database Information.

Utterance number in each step
Number of sentences 200 sentences in total: 160 sentences using first- and second-person pronouns (“I”, “We”, “You [singular]”, and “You [plural]”; 40 sentences per pronoun) and 40 sentences using third-person pronouns (20 with “He” and 20 with “They”).
Number of utterances recorded 8,400 utterances: 200 sentences × 7 emotional categories (neutrality, happiness, anger, fear, sadness, disgust, surprise) × 6 actors (3 males, 3 females).
Number of utterances assessed 8,379 utterances (actor 1: 1,398; actor 2: 1,398; actor 3: 1,394; actor 4:1,395; actor 5: 1,398; actor 6: 1,396). Twenty-one utterances were excluded due to mispronunciation or technical errors.
Number of utterances included in the final corpus

6,675 utterances (1,704 were excluded because their recognition rate did not meet the inclusion criteria). Emotion distribution: neutral (1,169), sadness (1,187), anger (671), surprise (969), disgust (738), happiness (785), and fear (671).

Pronoun distribution: first-person: 2,729; second-person: 2,608; third-person: 1,338.

Audio annotation
Number of annotators per utterance 40
Number of utterances assessed per participant Approximately 465 utterances, randomly distributed across seven emotion categories and six subject personal pronoun types.
Participant grouping 18 groups (6 actors × 3 randomly assigned subsets)
Total number of participants 720 (18 groups × 40 participants).
Average audio duration 1.24 seconds.
Annotation tasks Emotion category identification (single-choice among seven emotions); emotion intensity rating (9-point scale)
Audio files coding rule (five digits)
1st digit Actor code (1–6); 1–3 are male, 4–6 are female
2nd digit Emotion category code (original recording emotion type, 1–7); 1: neutral, 2: happiness, 3: anger, 4: fear, 5: sadness, 6: disgust, 7: surprise
3rd–5th digits Sentence code (001–200). Sentences with “I”: 001–040; “We”: 041–080; “You” (singular): 081–120; “You” (plural): 121–160; “He”: 161–180; “They”: 181–200
Examples

13171.wav: actor 1’s (male) anger voice; speech content is “他有个计划” (He has a plan).

45021.wav: actor 4’s (female) sadness voice; speech content is “我拿着杯子” (I hold the cup).
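
As an illustration of the coding rule above, the following Python sketch decodes a file name into actor, emotion, and pronoun information and optionally loads the waveform. The emotion and pronoun mappings follow Table 2; the folder name in the commented path is hypothetical.

    import soundfile as sf

    EMOTIONS = {1: "neutral", 2: "happiness", 3: "anger", 4: "fear",
                5: "sadness", 6: "disgust", 7: "surprise"}

    def pronoun_from_sentence(code):
        """Map the 3-digit sentence code (1-200) to its subject pronoun (Table 2)."""
        if code <= 40:
            return "I"
        if code <= 80:
            return "We"
        if code <= 120:
            return "You (singular)"
        if code <= 160:
            return "You (plural)"
        if code <= 180:
            return "He"
        return "They"

    def decode(filename):
        """Decode a 5-digit file name such as '13171.wav'."""
        stem = filename.split(".")[0]
        actor = int(stem[0])                 # 1-3 male, 4-6 female
        emotion = EMOTIONS[int(stem[1])]
        sentence = int(stem[2:5])
        return {"actor": actor,
                "gender": "male" if actor <= 3 else "female",
                "emotion": emotion,
                "sentence": sentence,
                "pronoun": pronoun_from_sentence(sentence)}

    info = decode("13171.wav")  # actor 1, male, anger, sentence 171, pronoun "He"
    # audio, sr = sf.read("audio/13171.wav")  # hypothetical path to an extracted audio folder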

Technical Validation

Accuracy without speech inclusion criteria

Figure 1 presents a confusion matrix illustrating the correspondence between the emotions intended by the actors and the emotions perceived by the raters. Each cell indicates the number of times a specific emotion was selected; diagonal cells represent correct classifications, whereas off-diagonal cells reflect misclassifications. The x-axis denotes the intended emotion, and the y-axis denotes the perceived emotion. Recognition accuracy varied across emotion categories: neutral utterances showed the highest recognition accuracy (88.6%), followed by sadness (82.1%), surprise (66.0%), anger (65.8%), happiness (55.1%), fear (51.2%), and disgust (50.7%). Confusion matrices stratified by actor are presented in Table 3.

Fig. 1.

Fig. 1

Confusion matrix for emotion recognition. Numbers represent the count of trials in which each emotional stimulus was assigned to a given emotion category.

Table 3.

Confusion matrix (counts) of target emotion categories and listener-based emotion recognition across six speakers.

Speaker (gender) Target emotion category N (utterances) Neutral Happiness Anger Fear Sadness Disgust Surprise Total (responses)
1 (male) Neutral 200 6914 267 123 109 241 177 169 8000
Happiness 200 2967 3146 234 254 285 124 990 8000
Anger 200 511 124 6825 77 160 198 105 8000
Fear 200 1564 173 101 3059 2251 120 732 8000
Sadness 199 419 110 81 564 6628 95 63 7960
Disgust 199 1901 105 269 93 231 5214 147 7960
Surprise 200 829 360 186 96 142 115 6272 8000
Total 1398 15105 4285 7819 4252 9938 6043 8478 55920
2 (male) Neutral 200 6914 267 123 109 241 177 169 8000
Happiness 200 2967 3146 234 254 285 124 990 8000
Anger 200 511 124 6825 77 160 198 105 8000
Fear 200 1564 173 101 3059 2251 120 732 8000
Sadness 199 419 110 81 564 6628 95 63 7960
Disgust 199 1901 105 269 93 231 5214 147 7960
Surprise 200 829 360 186 96 142 115 6272 8000
Total 1398 15105 4285 7819 4252 9938 6043 8478 55920
3 (male) Neutral 198 6945 115 146 132 173 209 200 7920
Happiness 200 2608 2605 585 145 152 188 1717 8000
Anger 200 756 268 5807 119 94 374 582 8000
Fear 197 1446 117 95 3340 2440 111 331 7880
Sadness 200 650 33 73 430 6658 81 75 8000
Disgust 199 3623 112 778 124 325 2582 416 7960
Surprise 200 2022 1008 332 129 117 176 4216 8000
Total 1394 18050 4258 7816 4419 9959 3721 7537 55760
4 (female) Neutral 199 6477 102 145 182 576 270 208 7960
Happiness 200 2029 4613 140 105 133 90 890 8000
Anger 200 1548 553 4838 118 113 407 423 8000
Fear 200 1632 93 100 3716 1925 122 412 8000
Sadness 198 553 47 80 428 6684 62 66 7920
Disgust 200 2968 64 1267 133 250 3116 202 8000
Surprise 198 1275 842 336 321 118 144 4884 7920
Total 1395 16482 6314 6906 5003 9799 4211 7085 55800
5 (female) Neutral 200 7324 131 49 90 160 149 97 8000
Happiness 199 1842 5558 51 84 96 88 241 7960
Anger 200 453 93 6362 95 104 548 345 8000
Fear 199 512 25 53 5222 1888 101 159 7960
Sadness 200 379 41 26 1089 6335 89 41 8000
Disgust 200 1890 65 884 92 133 4814 122 8000
Surprise 200 754 341 1296 84 111 475 4939 8000
Total 1398 13154 6254 8721 6756 8827 6264 5944 55920
6 (female) Neutral 200 7245 90 90 79 145 240 111 8000
Happiness 200 1661 5739 50 67 105 52 326 8000
Anger 199 793 144 6398 66 95 286 178 7960
Fear 200 885 48 90 5797 690 72 418 8000
Sadness 200 491 44 64 513 6783 58 47 8000
Disgust 200 2518 46 374 71 105 4683 203 8000
Surprise 197 1021 895 358 148 109 154 5195 7880
Total 1396 14614 7006 7424 6741 8032 5545 6478 55840

Values represent the number of listener responses. Correct classifications are shown in bold. Underlined values indicate instances in which an emotion category was selected by more listeners than the intended (target) emotion for the corresponding vocalization.

Accuracy with speech inclusion criteria

Based on evaluations from 40 participants, we calculated the mean percentage of correct identifications for each target emotion. Recordings were considered valid if they met the following criteria: (1) a recognition accuracy of at least 43% for the target emotion, corresponding to three times the chance level in a seven-alternative forced-choice task, and (2) fewer than 43% of responses attributed to any non-target emotion category14,49.
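
To make the criterion concrete, the sketch below applies it to one utterance's response counts from its raters. It is a hedged illustration: the function name and the example counts are invented for demonstration and are not taken from the dataset.

    def is_valid(response_counts, target, threshold=0.43):
        """Inclusion rule: the target emotion is chosen by at least 43% of raters
        (three times chance in a 7-alternative task), and no non-target emotion
        is chosen by 43% or more of raters."""
        total = sum(response_counts.values())
        rates = {emo: n / total for emo, n in response_counts.items()}
        if rates.get(target, 0.0) < threshold:
            return False
        return all(rate < threshold for emo, rate in rates.items() if emo != target)

    # Hypothetical example: 40 raters judging an utterance intended as "anger"
    counts = {"neutral": 3, "happiness": 1, "anger": 28, "fear": 2,
              "sadness": 2, "disgust": 3, "surprise": 1}
    print(is_valid(counts, "anger"))  # True: 28/40 = 70% >= 43%, all other categories < 43%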

According to these inclusion criteria, Table 4 summarizes the number of perceptually valid items for each emotion category. Following item selection and reclassification, 80% of the original utterances (6,675 out of 8,379) were retained across the seven target emotion categories: neutrality (1,169), sadness (1,187), anger (671), surprise (969), disgust (738), happiness (785), and fear (671). The proportion of valid recordings across the six actors was high, with validation rates of 78%, 80%, 62%, 73%, 92%, and 92%, respectively. Recognition accuracy and intensity ratings for each emotion are presented in Table 5.

Table 4.

Number of perceptually valid utterances for each emotion category.

Speaker Neutrality Happiness Anger Fear Sadness Disgust Surprise Total Valid rate
1 (male) Original 200 200 200 200 199 199 200 1398
Removed 1 135 0 154 0 16 2 308
Valid (N) 200 65 200 46 199 183 198 1091 0.78
2 (male) Original 200 198 200 200 200 200 200 1398
Removed 5 45 6 128 1 89 10 284
Valid (N) 200 153 195 72 199 111 190 1120 0.80
3 (male) Original 198 200 200 197 200 199 200 1394
Removed 3 159 14 120 2 157 73 528
Valid (N) 197 41 186 77 198 42 127 868 0.62
4 (female) Original 199 200 200 200 198 200 198 1395
Removed 1 51 45 98 1 134 51 381
Valid (N) 199 149 155 102 197 66 147 1015 0.73
5 (female) Original 200 199 200 199 200 200 200 1398
Removed 16 8 7 17 6 23 48 125
Valid (N) 200 191 194 182 196 177 152 1292 0.92
6 (female) Original 200 200 199 200 200 200 197 1396
Removed 6 5 41 8 3 42 14 119
Valid (N) 200 186 199 192 198 159 155 1289 0.92
Valid (total) 1196 785 1129 671 1187 738 969 6675 0.80

Table 5.

Mean Recognition Accuracy and Emotional Intensity Ratings for Perceptually Valid Utterances.

Emotion Accuracy, Mean (SD) Intensity, Mean (SD)
Neutral 0.89 (0.08) 4.52 (0.33)
Happiness 0.69 (0.12) 5.38 (0.50)
Anger 0.79 (0.11) 6.27 (0.66)
Fear 0.65 (0.10) 5.56 (0.52)
Sadness 0.82 (0.09) 6.39 (0.45)
Disgust 0.63 (0.10) 4.85 (0.37)
Surprise 0.74 (0.12) 5.48 (0.46)
Average 0.76 (0.14) 5.53 (0.84)

Effect of subject pronoun type on emotion recognition

To examine the effect of subject pronoun type on recognition accuracy, we performed a 6 (Subject Pronoun Type: I, We, singular You, plural You, third-person singular, They) × 7 (Emotion Category: neutrality, happiness, sadness, fear, anger, disgust, and surprise) two-way non-parametric mixed-effects ANOVA on recognition rates, which exhibited a non-normal distribution. The analysis was conducted using the Aligned Rank Transform (ART) method implemented in the R package ARTool (https://cran.r-project.org/web/packages/ARTool/readme/README.html). Actor and rater group were included as random effects. The main effects of emotion category (F6,6620 = 814.99, p < 0.001, η2p = 0.425) and subject pronoun type (F5,3522 = 3.33, p = 0.005, η2p = 0.005) and the interaction between emotion category and subject pronoun type (F30,6622 = 2.04, p < 0.001, η2p = 0.02) were all significant (upper panel of Fig. 2). Although the interaction was significant, post hoc simple-effects analyses did not yield statistically significant contrasts after correction for multiple testing. This pattern suggests that the interaction reflects subtle, distributed differences across pronoun categories rather than large effects driven by specific contrasts.
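
The reported analysis was run with ARTool in R; for readers working in Python, the sketch below illustrates the core aligned-rank-transform idea for a two-factor design. It is a simplified, fixed-effects illustration only: it omits the random effects for actor and rater group that the reported model includes, and the column names in the usage comment are assumptions about how a per-utterance ratings table might be organized.

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    def art_anova(df, dv, f1, f2):
        """Aligned Rank Transform for a 2-factor design (fixed effects only):
        for each effect, subtract all other estimated effects, rank the aligned
        scores, then run a factorial ANOVA on the ranks and read off that effect."""
        y = df[dv]
        grand = y.mean()
        cell = df.groupby([f1, f2])[dv].transform("mean")
        m1 = df.groupby(f1)[dv].transform("mean")
        m2 = df.groupby(f2)[dv].transform("mean")
        residual = y - cell
        estimated = {f1: m1 - grand,
                     f2: m2 - grand,
                     "interaction": cell - m1 - m2 + grand}
        tables = {}
        for term, effect in estimated.items():
            aligned = (residual + effect).rank()
            tmp = df.assign(aligned_rank=aligned)
            fit = ols(f"aligned_rank ~ C({f1}) * C({f2})", data=tmp).fit()
            tables[term] = sm.stats.anova_lm(fit, typ=2)  # interpret only the row for `term`
        return tables

    # Hypothetical usage, assuming one recognition rate per utterance with
    # columns "pronoun", "emotion", and "accuracy":
    # results = art_anova(ratings, dv="accuracy", f1="pronoun", f2="emotion")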

Fig. 2.

Fig. 2

Interaction effect of sentence subject pronoun and emotion category on recognition accuracy (upper panel) and perceived emotional intensity (lower panel).

To further probe the significant interaction between emotion and subject pronouns, we conducted theory-driven planned contrasts50,51 using estimated marginal means derived from the ART-based mixed-effects model. Three contrasts were specified: first-person versus second-person, first-person versus third-person, and second-person versus third-person. Joint Wald-type F tests were performed separately within each emotion category. The planned contrasts revealed a selective and robust effect of person reference. Across all seven emotion categories, the contrast between second-person and third-person utterances was consistently significant (all F values > 118, all p values < 0.0001), suggesting that person reference exerts a stable, emotion-independent influence on auditory speech emotion processing. In contrast, no significant differences were observed between first-person and second-person utterances, nor between first-person and third-person utterances, across any emotion category (all p values > 0.39). These findings suggest that the observed emotion-by-pronoun interaction was primarily driven by a robust contrast between second-person and third-person references rather than by differences involving first-person expressions. This pattern indicates that the influence of person reference on emotion recognition reflects a qualitative distinction between listener-directed and non-listener-directed speech. Second-person utterances directly address the listener, establishing immediate interpersonal engagement that may enhance social relevance and emotional salience, thereby facilitating distinct perceptual or cognitive processing. In contrast, third-person utterances describe external agents and are more narrative, potentially eliciting weaker interpersonal involvement.

To assess the impact of subject pronoun type on emotional intensity ratings, we conducted a 6 (Subject Pronoun Type: I, We, singular You, plural You, third-person singular, They) × 7 (Emotion Category: neutrality, sadness, fear, anger, disgust, surprise, happiness) mixed-effects ANOVA on the perceived intensity ratings, which exhibited a normal distribution. The rater group was included as a random factor. The main effect of emotion category (F6,6622 = 2301.54, p < 0.001, η2p = 0.676) and the interaction effect (F30,6617 = 3.27, p < 0.001, η2p = 0.01) were significant; the main effect of subject pronoun type was not (F5,6619 = 1.40, p = 0.10, η2p = 0.001) (lower panel of Fig. 2). Planned comparisons between first-person, second-person, and third-person utterances (1st vs. 2nd, 1st vs. 3rd, and 2nd vs. 3rd) were non-significant for the neutral, sadness, and surprise categories (all p values > 0.19). In contrast, comparisons between first-person and second-person utterances (1st vs. 2nd) showed significant differences for the other four emotions (anger: estimate = −0.163, SE = 0.031, t = −5.18, p < 0.001; disgust: estimate = 0.11, SE = 0.04, t = 2.90, p = 0.004; fear: estimate = −0.09, SE = 0.04, t = −2.20, p = 0.028; happiness: estimate = 0.16, SE = 0.04, t = 4.12, p = 0.0001), and comparisons between first-person and third-person utterances (1st vs. 3rd) showed significant differences for fear (estimate = −0.16, SE = 0.04, t = −3.18, p = 0.001) and happiness (estimate = 0.15, SE = 0.04, t = 3.48, p = 0.0005). All remaining comparisons were non-significant (all p values > 0.08). These findings suggest that pronoun effects were primarily driven by contrasts involving first-person utterances. Second-person expressions were perceived as more intense for anger and fear, whereas first-person expressions were rated as more intense for disgust and happiness. This pattern may reflect differences between listener-directed and self-referential framing, indicating that pronoun type modulates perceived emotional intensity in an emotion-specific manner.

Effect of gender on emotion recognition

We conducted a 2 (speaker gender: female vs. male) × 2 (rater gender: female vs. male) two-way non-parametric mixed-effects ANOVA to investigate whether speaker and rater gender affected recognition accuracy. Figure 3 presents the interaction effects for the overall corpus, neutral utterances, and each of the six basic emotions. The main effect of rater gender (F1 = 356.48, p < 0.001; η2p = 0.027) and the interaction effect (F1 = 5.66, p = 0.017; η2p = 0.0004) were significant; the main effect of speaker gender was not (F1 = 1.37, p = 0.307). Overall, female listeners outperformed male listeners in speech emotion identification. These findings are consistent with previous research showing that women recognize52,53 and express54 emotion more accurately than men.

Fig. 3.

Fig. 3

Interaction effect of the gender of speaker and rater on the recognition rate.

Although the interaction between speaker gender and rater gender reached statistical significance, the associated effect size was extremely small, and follow-up simple effects analyses did not reveal significant differences within individual levels of either factor. This pattern suggests that the interaction reflects a subtle shift in the relative pattern of effects rather than robust differences in any single comparison: the advantage of female listeners over male listeners was slightly larger for speech produced by female speakers than for speech produced by male speakers. To further clarify the source of this interaction, separate non-parametric mixed-effects ANOVAs were conducted for each emotion category. These analyses revealed that the overall interaction was primarily driven by happiness (F1,1552 = 15.02, p = 0.0001, η2p = 0.001), anger (F1,2238 = 6.04, p = 0.014, η2p = 0.003), and disgust (F1,1472 = 11.25, p = 0.0008, η2p = 0.008). Although statistically significant, the corresponding effect sizes were small, suggesting that the interaction reflects subtle modulation rather than substantial differences in recognition performance.

Annotation consensus

Inter-rater reliability was assessed using Fleiss’ kappa (κ), which quantifies agreement on categorical emotion labels beyond chance. As shown in Table 6, the Fleiss’ κ values computed across independent listener groups indicated moderate to substantial agreement in categorical emotion judgments (κ = 0.44–0.66), reflecting overall satisfactory reliability of the labelling process55.
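
A minimal sketch of how such a kappa can be computed in Python with statsmodels, assuming a ratings matrix in which each row is one utterance and each column holds one rater's categorical label (1–7). The input layout and the random stand-in data are assumptions for illustration, not the authors' pipeline.

    import numpy as np
    from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

    # ratings: shape (n_utterances, n_raters), integer emotion labels 1-7 per rater.
    # A small random example stands in for one listener group's data.
    rng = np.random.default_rng(0)
    ratings = rng.integers(1, 8, size=(100, 40))

    # aggregate_raters converts rater-wise labels into per-item category counts.
    counts, categories = aggregate_raters(ratings)
    kappa = fleiss_kappa(counts, method="fleiss")
    print(f"Fleiss' kappa = {kappa:.2f}")  # near 0 here because the example labels are random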

Table 6.

Summary of Inter-Rater Reliability Indices for Emotion Recognition Performance.

Actor Fleiss Kappa for Rating Groups
Group 1 Group 2 Group 3
Actor 1, Male 0.66 0.58 0.48
Actor 2, Male 0.58 0.60 0.54
Actor 3, Male 0.56 0.51 0.50
Actor 4, Female 0.50 0.47 0.47
Actor 5, Female 0.50 0.44 0.62
Actor 6, Female 0.57 0.63 0.58

Acoustic validation

Following a previous emotional corpus study14, we employed the Parselmouth (Praat in Python) package56 to extract twelve acoustic features: duration (seconds); F0 mean (Hz), F0 standard deviation (SD), F0 minimum, and F0 maximum; harmonics-to-noise ratio (HNR); local jitter; local shimmer; sound intensity (dB); root-mean-square (RMS) amplitude; spectral center of gravity (COG, Hz); and spectral spread (Hz). These features were used to assess the predictive power of acoustic parameters for listeners’ recognition of each emotion category. The mean acoustic values for each emotion category, averaged across speakers, are presented in Table 7.
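
The following Python sketch shows how the listed features can be extracted with Parselmouth using standard Praat commands. The analysis parameters (pitch floor and ceiling, jitter and shimmer windows) are common defaults and may differ from the settings used to produce Table 7; the example path is hypothetical.

    import parselmouth
    from parselmouth.praat import call

    def extract_features(path, f0_floor=75, f0_ceiling=600):
        """Extract the twelve acoustic features listed above from one WAV file."""
        snd = parselmouth.Sound(path)
        pitch = call(snd, "To Pitch", 0.0, f0_floor, f0_ceiling)
        point_process = call(snd, "To PointProcess (periodic, cc)", f0_floor, f0_ceiling)
        harmonicity = call(snd, "To Harmonicity (cc)", 0.01, f0_floor, 0.1, 1.0)
        intensity = call(snd, "To Intensity", f0_floor, 0.0, "yes")
        spectrum = call(snd, "To Spectrum", "yes")
        return {
            "duration_s": call(snd, "Get total duration"),
            "f0_mean_hz": call(pitch, "Get mean", 0, 0, "Hertz"),
            "f0_sd_hz": call(pitch, "Get standard deviation", 0, 0, "Hertz"),
            "f0_min_hz": call(pitch, "Get minimum", 0, 0, "Hertz", "Parabolic"),
            "f0_max_hz": call(pitch, "Get maximum", 0, 0, "Hertz", "Parabolic"),
            "hnr_db": call(harmonicity, "Get mean", 0, 0),
            "jitter_local": call(point_process, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3),
            "shimmer_local": call([snd, point_process], "Get shimmer (local)",
                                  0, 0, 0.0001, 0.02, 1.3, 1.6),
            "intensity_db": call(intensity, "Get mean", 0, 0, "energy"),
            "rms_amplitude": call(snd, "Get root-mean-square", 0, 0),
            "spectral_cog_hz": call(spectrum, "Get centre of gravity", 2),
            "spectral_spread_hz": call(spectrum, "Get standard deviation", 2),
        }

    # features = extract_features("audio/13171.wav")  # hypothetical path to one corpus file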

Table 7.

Acoustic Features of Valid Emotional Speech Across Emotion Categories.

Emotion Duration (s) F0 mean (Hz) F0 SD (Hz) F0 max (Hz) F0 min (Hz) HNR (dB) Jitter (local) Shimmer (local) Intensity (dB) RMS amplitude Spectral COG (Hz) Spectral spread (Hz)
Neutrality 1.17 166.79 31.04 261.71 110.82 10.02 0.03 0.10 72.94 0.15 3601.55 4258.58
Happiness 1.13 272.06 61.09 420.45 161.23 11.45 0.02 0.10 73.84 0.16 3299.57 3773.32
Anger 1.07 269.11 63.38 441.58 157.89 9.91 0.02 0.12 72.16 0.14 3251.42 3615.67
Fear 1.23 215.97 28.71 297.34 160.23 9.73 0.03 0.11 74.15 0.17 4277.70 4635.36
Sadness 1.65 233.99 45.42 404.06 142.43 12.12 0.02 0.11 73.08 0.16 3433.03 4035.92
Disgust 1.20 157.87 28.79 252.81 105.19 10.37 0.02 0.11 73.15 0.14 3067.38 3930.48
Surprise 1.14 244.92 63.88 421.68 148.53 11.18 0.02 0.11 74.48 0.17 3397.82 4002.25

COG: center of gravity; HNR: harmonics-to-noise ratio; RMS: root mean square.

We conducted simultaneous multiple regression analyses for each emotion category (Gong et al., 2023; Lima et al., 2013) to examine the direct associations between acoustic features and listeners’ recognition accuracy of emotional expressions. The dependent variable was the average recognition rate for each vocalization, and the independent variables were the extracted acoustic features. Table 8 summarizes the main findings, including standardized regression coefficients (β) and adjusted variance explained. All regression models were significant, indicating that recognition of each emotion category was influenced by multiple acoustic attributes. Specifically, recognition of neutral utterances was predicted by longer duration, lower spectral COG, and smaller spectral spread. Sadness recognition was associated with shorter duration, higher F0 variability, higher HNR, greater local shimmer, and lower F0 minimum and spectral spread. Fear recognition was predicted by longer duration, higher F0 mean, greater spectral spread, and lower HNR and spectral COG. Anger recognition was linked to longer duration, higher F0 mean, and increased local jitter. Disgust recognition was predicted by lower local shimmer. Surprise recognition was associated with longer duration, lower F0 mean, higher HNR and RMS amplitude, and wider spectral spread. Happiness recognition was predicted by higher F0 maximum and lower local jitter.
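
A hedged sketch of this analysis in Python: z-score the acoustic predictors and the per-utterance recognition rate, fit an ordinary least-squares model per emotion, and apply FDR correction across predictors. The column names are assumptions about how the metadata file might be organized, and the authors' exact statistical software is not specified in the text.

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.multitest import multipletests

    FEATURES = ["duration", "f0_mean", "f0_sd", "f0_min", "f0_max", "hnr",
                "jitter_local", "shimmer_local", "rms_amplitude",
                "spectral_cog", "spectral_spread"]

    def standardized_regression(df, dv="recognition_rate"):
        """OLS of recognition rate on z-scored acoustic features; the coefficients are
        standardized betas because both predictors and outcome are z-scored."""
        cols = FEATURES + [dv]
        z = (df[cols] - df[cols].mean()) / df[cols].std()
        X = sm.add_constant(z[FEATURES])
        fit = sm.OLS(z[dv], X).fit()
        reject, p_fdr, _, _ = multipletests(fit.pvalues[FEATURES], method="fdr_bh")
        return pd.DataFrame({"beta": fit.params[FEATURES],
                             "p_fdr": p_fdr,
                             "significant": reject})

    # Hypothetical usage: one model per target emotion
    # for emotion, sub in metadata.groupby("emotion"):
    #     print(emotion, standardized_regression(sub), sep="\n")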

Table 8.

Multiple Regression Results Predicting Speech Emotion Recognition Rates from Acoustic Features.

Emotion Acoustic feature
Duration F0 Mean F0 SD F0 Min F0 Max HNR (dB) Jitter (local) Shimmer (local) RMS Energy SpectralCOG (Hz) SpectralSpread (Hz)
Neutral 0.17*** −0.01 −0.04 −0.06 0.03 −0.10 0.04 0.03 0.07 −0.19*** −0.58***
Happiness 0.06 0.06 0.05 0.05 0.17*** −0.12 −0.15*** 0.07 −0.02 −0.02 −0.13
Anger 0.14*** 0.16** −0.02 −0.03 0.01 −0.04 0.15*** −0.05 −0.04 0.06 0.04
Fear 0.23*** 0.65*** −0.12 0.04 0.07 −0.27*** −0.03 −0.01 −0.03 −0.24*** 0.23***
Sadness −0.14*** 0.01 0.13** −0.17** 0.07 0.33*** −0.02 0.19*** 0.02 0.27*** −0.32***
Disgust 0.12 −0.06 −0.12 −0.05 −0.02 0.13 0.02 −0.13* −0.01 −0.08 −0.23
Surprise 0.12* −0.12 0.03 0.01 0.01 0.30*** 0.01 0.08 0.13* −0.01 0.15*

Values represent beta weights (Standardized Coefficients); COG: center of gravity, HNR: harmonics-to-noise ratio, RMS: root mean square.

*p < 0.05; **p < 0.01; ***p < 0.001 (FDR corrected); Bold typeface indicates a significant association after the false discovery rate (FDR) correction.

Usage Notes

The MCAE-SPPS dataset provides a comprehensive resource for investigating the interplay between language, emotion, and cognition in auditory speech. It enables researchers to examine how subject personal pronouns (e.g., “I,” “You,” “He/They”) modulate the perception and recognition of emotional expressions. Potential applications include:

  1. Theoretical research: Studying pronoun-driven effects on emotional salience, self- vs. other-relevance, and listener perspective in emotion perception.

  2. Social cognition and neuroscience: Exploring how pronouns influence empathy, perspective-taking, social bonding, or neural activation patterns during emotional speech processing.

  3. Computational modeling: Improving emotion recognition algorithms by integrating pronoun-specific acoustic cues.

  4. Clinical and practical applications: Designing interventions to enhance emotional awareness or empathy, and informing AI systems such as virtual assistants or chatbots.

Users should note that the dataset reflects recordings from a limited number of speakers and emotions; pronoun and emotion distributions may influence recognition patterns. Care should be taken when generalizing findings to broader populations or different languages.

Uncertainties and limitations

The corpus focuses on six basic emotions (sadness, anger, fear, disgust, surprise, and happiness) together with neutral expressions, and therefore does not cover more complex or socially nuanced emotions such as pride, shyness, or shame. As with many acted auditory emotional databases, the recordings may contain exaggerated or less naturalistic expressions, which could limit ecological validity. In addition, the forced-choice recognition paradigm requires listeners to select a single emotion category even when they are uncertain, potentially inflating recognition accuracy and distorting confusion patterns. Future work may address this issue by incorporating alternative response options (e.g., “Cannot recognize the emotion”) to better capture perceptual uncertainty. Moreover, each listener evaluated only a subset of utterances from a single speaker, which may introduce inter-speaker variability. To address these sources of heterogeneity, we employed nonparametric mixed-effects ANOVA models that treated both listener and actor as random effects.

Acknowledgements

We would like to thank the actors and raters for their participation in the study. This work was supported by The National Natural Science Foundation of China General Project (grant number: 32271138), the General Program of the National Social Science Foundation (grant number: 21BGL229), and the Beijing Natural Science Foundation (grant number: 7202086).

Author contributions

M. Li conducted data collection, preprocessing, data analysis, visualization, results interpretation, and first-draft writing; A. Zhou conducted data collection and preprocessing; H. Yan, Q. Li, and C. Ma performed data preprocessing. C. Wu: formulated research ideas, project design and supervision, data analysis and visualization, results interpretation, and first draft writing and editing. All authors approved the final version of the manuscript.

Data availability

The corpus audio files, metadata, and corresponding rating data are publicly available on the Open Science Framework (OSF) at 10.17605/OSF.IO/9JYZC (View-only link: https://osf.io/9jyzc/overview?view_only=088ce4e15a914b939c8bb6bd119c7226). The repository includes the full set of validated audio recordings, rater demographic information, and comprehensive metadata files describing emotion labels, recognition accuracy, perceived intensity ratings, sentence content, and extracted acoustic features.

Code availability

The codes are publicly available on OSF at 10.17605/OSF.IO/9JYZC46. Please refer to the “Codes_and_RawData” folder for the data organization and analysis scripts.

Competing interests

The authors declare no competing interests.

Footnotes

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

  1. Landman, L. L. & van Steenbergen, H. Emotion and conflict adaptation: the role of phasic arousal and self-relevance. Cogn Emot 34, 1083–1096, 10.1080/02699931.2020.1722615 (2020).
  2. Hert, R., Järvikivi, J. & Arnhold, A. The Importance of Linguistic Factors: He Likes Subject Referents. Cogn Sci 48, e13436, 10.1111/cogs.13436 (2024).
  3. Liebenthal, E., Silbersweig, D. A. & Stern, E. The Language, Tone and Prosody of Emotions: Neural Substrates and Dynamics of Spoken-Word Emotion Perception. Front Neurosci 10, 506, 10.3389/fnins.2016.00506 (2016).
  4. Yang, C. et al. Emotion-dependent language featuring depression. J Behav Ther Exp Psychiatry 81, 101883, 10.1016/j.jbtep.2023.101883 (2023).
  5. Homan, S. et al. Linguistic features of suicidal thoughts and behaviors: A systematic review. Clin Psychol Rev 95, 102161, 10.1016/j.cpr.2022.102161 (2022).
  6. Iyer, R., Nedeljkovic, M. & Meyer, D. Using Voice Biomarkers to Classify Suicide Risk in Adult Telehealth Callers: Retrospective Observational Study. JMIR Ment Health 9, e39807, 10.2196/39807 (2022).
  7. Kappen, M., Vanderhasselt, M. A. & Slavich, G. M. Speech as a promising biosignal in precision psychiatry. Neurosci Biobehav Rev 148, 105121, 10.1016/j.neubiorev.2023.105121 (2023).
  8. Abedin, E. et al. Exploring intellectual humility through the lens of artificial intelligence: Top terms, features and a predictive model. Acta Psychol (Amst) 238, 103979, 10.1016/j.actpsy.2023.103979 (2023).
  9. Bae, Y. J., Shim, M. & Lee, W. H. Schizophrenia Detection Using Machine Learning Approach from Social Media Content. Sensors (Basel) 21, 10.3390/s21175924 (2021).
  10. Ryu, J. et al. A natural language processing approach reveals first-person pronoun usage and non-fluency as markers of therapeutic alliance in psychotherapy. iScience 26, 106860, 10.1016/j.isci.2023.106860 (2023).
  11. Wang, Q., Wang, M., Yang, Y. & Zhang, X. Multi-modal emotion recognition using EEG and speech signals. Comput Biol Med 149, 105907, 10.1016/j.compbiomed.2022.105907 (2022).
  12. Keshtiari, N., Kuhlmann, M., Eslami, M. & Klann-Delius, G. Recognizing emotional speech in Persian: a validated database of Persian emotional speech (Persian ESD). Behav Res Methods 47, 275–294, 10.3758/s13428-014-0467-x (2015).
  13. Bustamin, A., Rizky, A. M., Warni, E., Areni, I. S. & Indrabayu. IndoWaveSentiment: Indonesian audio dataset for emotion classification. Data in Brief 57, 111138, 10.1016/j.dib.2024.111138 (2024).
  14. Gong, B. et al. The Mandarin Chinese auditory emotions stimulus database: A validated set of Chinese pseudo-sentences. Behav Res Methods 55, 1441–1459, 10.3758/s13428-022-01868-7 (2023).
  15. Costantini, G., Parada-Cabaleiro, E., Casali, D. & Cesarini, V. The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning. Sensors (Basel) 22, 10.3390/s22072461 (2022).
  16. Movaghar, A., Page, D., Saha, K., Rynn, M. & Greenberg, J. Machine learning approach to measurement of criticism: The core dimension of expressed emotion. J Fam Psychol 35, 1007–1015, 10.1037/fam0000906 (2021).
  17. Habets, B., Ye, Z., Jansma, B. M., Heldmann, M. & Münte, T. F. Brain imaging and electrophysiological markers of anaphoric reference during speech production. Neuroscience Research 213, 110–120, 10.1016/j.neures.2025.01.001 (2025).
  18. Lei, J., Zhu, X. & Wang, Y. BAT: Block and token self-attention for speech emotion recognition. Neural Netw 156, 67–80, 10.1016/j.neunet.2022.09.022 (2022).
  19. Kingeski, R., Henning, E. & Paterno, A. S. Fusion of PCA and ICA in Statistical Subset Analysis for Speech Emotion Recognition. Sensors (Basel) 24, 10.3390/s24175704 (2024).
  20. Riad, R. et al. Automated Speech Analysis for Risk Detection of Depression, Anxiety, Insomnia, and Fatigue: Algorithm Development and Validation Study. J Med Internet Res 26, e58572, 10.2196/58572 (2024).
  21. Mobram, S. & Vali, M. Depression detection based on linear and nonlinear speech features in I-vector/SVDA framework. Comput Biol Med 149, 105926, 10.1016/j.compbiomed.2022.105926 (2022).
  22. Darcy, I. & Fontaine, N. M. G. The Hoosier Vocal Emotions Corpus: A validated set of North American English pseudo-words for evaluating emotion processing. Behavior Research Methods 52, 901–917, 10.3758/s13428-019-01288-0 (2020).
  23. Zhou, K., Sisman, B., Liu, R. & Li, H. Emotional voice conversion: Theory, databases and ESD. Speech Communication 137, 1–18, 10.1016/j.specom.2021.11.006 (2022).
  24. Parada-Cabaleiro, E., Costantini, G., Batliner, A., Schmitt, M. & Schuller, B. W. DEMoS: an Italian emotional speech corpus. Language Resources and Evaluation 54, 341–383, 10.1007/s10579-019-09450-y (2020).
  25. Asghar, A., Sohaib, S., Iftikhar, S., Shafi, M. & Fatima, K. An Urdu speech corpus for emotion recognition. PeerJ Comput Sci 8, e954, 10.7717/peerj-cs.954 (2022).
  26. Sultana, S., Rahman, M. S., Selim, M. R. & Iqbal, M. Z. SUST Bangla Emotional Speech Corpus (SUBESCO): An audio-only emotional speech corpus for Bangla. PLoS One 16, e0250173, 10.1371/journal.pone.0250173 (2021).
  27. Paccotacya-Yanque, R. Y. G., Huanca-Anquise, C. A., Escalante-Calcina, J., Ramos-Lovón, W. R. & Cuno-Parari, Á. E. A speech corpus of Quechua Collao for automatic dimensional emotion recognition. Scientific Data 9, 778, 10.1038/s41597-022-01855-9 (2022).
  28. Li, M. et al. The Mandarin Chinese auditory emotions stimulus database: A validated corpus of monosyllabic Chinese characters. Behav Res Methods 57, 89, 10.3758/s13428-025-02607-4 (2025).
  29. Chong, C. S., Davis, C. & Kim, J. A Cantonese Audio-Visual Emotional Speech (CAVES) dataset. Behavior Research Methods 56, 5264–5278, 10.3758/s13428-023-02270-7 (2024).
  30. Busso, C. et al. IEMOCAP: interactive emotional dyadic motion capture database. Language Resources and Evaluation 42, 335–359, 10.1007/s10579-008-9076-6 (2008).
  31. Zhang J., T F., Liu M., Jia H. Design of speech corpus for mandarin text to speech. The Blizzard Challenge 2008 Workshop (2008).
  32. Li, Y., Tao, J., Chao, L., Bao, W. & Liu, Y. CHEAVD: a Chinese natural emotional audio–visual database. Journal of Ambient Intelligence and Humanized Computing 8, 913–924, 10.1007/s12652-016-0406-z (2017).
  33. Zimmermann, J., Wolf, M., Bock, A., Peham, D. & Benecke, C. The way we refer to ourselves reflects how we relate to others: Associations between first-person pronoun use and interpersonal problems. Journal of Research in Personality 47, 218–225, 10.1016/j.jrp.2013.01.008 (2013).
  34. Wang, F. & Karimi, S. This product works well (for me): The impact of first-person singular pronouns on online review helpfulness. Journal of Business Research 104, 283–294, 10.1016/j.jbusres.2019.07.028 (2019).
  35. Stade, E. C., Ungar, L., Eichstaedt, J. C., Sherman, G. & Ruscio, A. M. Depression and anxiety have distinct and overlapping language patterns: Results from a clinical interview. J Psychopathol Clin Sci 132, 972–983, 10.1037/abn0000850 (2023).
  36. Sun, Z., Cao, C. C., Liu, S., Li, Y. & Ma, C. Behavioral consequences of second-person pronouns in written communications between authors and reviewers of scientific papers. Nature Communications 15, 152, 10.1038/s41467-023-44515-1 (2024).
  37. Cruz, R. E., Leonhardt, J. M. & Pezzuti, T. Second Person Pronouns Enhance Consumer Involvement and Brand Attitude. Journal of Interactive Marketing 39, 104–116, 10.1016/j.intmar.2017.05.001 (2017).
  38. Qu, J., Zhou, R., Zou, L., Sun, Y. & Zhao, M. in Human-Computer Interaction. Multimodal and Natural Interaction (ed Kurosu, M.) 234–243 (Springer International Publishing).
  39. Moser, J. S. et al. Third-person self-talk facilitates emotion regulation without engaging cognitive control: Converging evidence from ERP and fMRI. Scientific Reports 7, 4519, 10.1038/s41598-017-04047-3 (2017).
  40. Wallace-Hadrill, S. M. & Kamboj, S. K. The Impact of Perspective Change As a Cognitive Reappraisal Strategy on Affect: A Systematic Review. Front Psychol 7, 1715, 10.3389/fpsyg.2016.01715 (2016).
  41. Orvell, A. et al. Does Distanced Self-Talk Facilitate Emotion Regulation Across a Range of Emotionally Intense Experiences? Clinical Psychological Science 9, 68–78, 10.1177/2167702620951539 (2020).
  42. El Ouardi, L., Yeou, M. & Faroqi-Shah, Y. Neural correlates of pronoun processing: An activation likelihood estimation meta-analysis. Brain Lang 246, 105347, 10.1016/j.bandl.2023.105347 (2023).
  43. Massaeli, F., Bagheri, M. & Power, S. D. EEG-based detection of modality-specific visual and auditory sensory processing. J Neural Eng 20, 10.1088/1741-2552/acb9be (2023).
  44. Devillers, L., Vidrascu, L. & Lamel, L. Challenges in real-life emotion annotation and machine learning based detection. Neural Netw 18, 407–422, 10.1016/j.neunet.2005.03.007 (2005).
  45. Lee, M.-H. et al. EAV: EEG-Audio-Video Dataset for Emotion Recognition in Conversational Contexts. Scientific Data 11, 1026, 10.1038/s41597-024-03838-4 (2024).
  46. Li, M. et al. The Mandarin Chinese Auditory Emotions Stimulus Database: A Validated Set of Sentences with A Personal Pronoun as the Subject. OSF, 10.17605/OSF.IO/9JYZC (2026).
  47. De Prisco, M. et al. Differences in facial emotion recognition between bipolar disorder and other clinical populations: A systematic review and meta-analysis. Prog Neuropsychopharmacol Biol Psychiatry 127, 110847, 10.1016/j.pnpbp.2023.110847 (2023).
  48. Tseng, H. H. et al. Facial and prosodic emotion recognition in social anxiety disorder. Cogn Neuropsychiatry 22, 331–345, 10.1080/13546805.2017.1330190 (2017).
  49. Liu, P. & Pell, M. D. Recognizing vocal emotions in Mandarin Chinese: a validated database of Chinese vocal emotional stimuli. Behav Res Methods 44, 1042–1051, 10.3758/s13428-012-0203-3 (2012).
  50. Rosenthal, R. & Rosnow, R. L. Contrast analysis: Focused comparisons in the analysis of variance (1985).
  51. Kuehne, C. C. The Advantages of Using Planned Comparisons over Post Hoc Tests (1993).
  52. Lambrecht, L., Kreifelts, B. & Wildgruber, D. Gender differences in emotion recognition: Impact of sensory modality and emotional category. Cogn Emot 28, 452–469, 10.1080/02699931.2013.837378 (2014).
  53. Collignon, O. et al. Women process multisensory emotion expressions more efficiently than men. Neuropsychologia 48, 220–225, 10.1016/j.neuropsychologia.2009.09.007 (2010).
  54. Filipe, M. G., Branco, P., Frota, S., Castro, S. L. & Vicente, S. G. Affective prosody in European Portuguese: Perceptual and acoustic characterization of one-word utterances. Speech Communication 67, 58–64, 10.1016/j.specom.2014.09.007 (2015).
  55. Landis, J. R. & Koch, G. G. The measurement of observer agreement for categorical data. Biometrics 33, 159–174 (1977).
  56. Jadoul, Y., Thompson, B. & de Boer, B. Introducing Parselmouth: A Python interface to Praat. Journal of Phonetics 71, 1–15, 10.1016/j.wocn.2018.07.001 (2018).
