PLOS One. 2024 Mar 11;19(3):e0297440. doi: 10.1371/journal.pone.0297440

Sound symbolism in Japanese names: Machine learning approaches to gender classification

Chun Hau Ngai 1,*, Alexander J Kilpatrick 2, Aleksandra Ćwiek 3
Editor: Søren Wichmann4
PMCID: PMC10927153  PMID: 38466741

Abstract

This study investigates the sound symbolic expression of gender in Japanese names using machine learning algorithms. Its main goal is to explore how gender is expressed in the phonemes that make up Japanese names and whether the systematic sound-meaning mappings observed in Indo-European languages extend to Japanese. In addition, the study compares the performance of two machine learning algorithms. Random Forest and XGBoost algorithms are trained using the sounds of names as features and the typical gender of the referents as the dependent variable. Each algorithm is cross-validated using k-fold cross-validation (28 folds) and tested on samples not included in the training cycle. Both algorithms are shown to be reasonably accurate at classifying names into gender categories; however, the XGBoost model performs significantly better than the Random Forest algorithm. Feature importance scores reveal that certain sounds carry gender information. Namely, the voiced bilabial nasal /m/ and the voiceless velar consonant /k/ were associated with femininity, and the high front vowel /i/ was associated with masculinity. The associations observed for /i/ and /k/ stand contrary to typical patterns found in other languages, suggesting that Japanese is unique in the sound symbolic expression of gender. This study highlights the importance of considering cultural and linguistic nuances in sound symbolism research and underscores the advantage of XGBoost in capturing complex relationships within the data for improved classification accuracy. These findings contribute to the understanding of sound symbolism and gender associations in language.

Introduction

One of the more astonishing abilities of humans is that we can make use of specific combinations of hums, pops, clicks, and hisses to express imagery with vivid resolution. Arbitrary form-meaning mappings enable the coinage of complex and abstract terms, which in turn removes the limit on what can be communicated [1]. However, human language also contains non-arbitrary form-meaning mappings. The non-arbitrary relationship between sound and meaning, often referred to as sound symbolism, has been a topic of scientific inquiry since the early 20th century. The current paper makes no distinction between sound symbolism and (vocal) iconicity, although we acknowledge that in certain contexts vocal iconicity is reserved for direct imitation of environmental sounds [2]. One of the most widely studied examples of sound symbolism is the association between certain sounds and the perception of size. For instance, speakers of many languages associate high front vowels such as [i] and [ɪ] with smallness, while low back vowels such as [a] and [ɔ] are associated with largeness [3, 4]. The recent surge of interest in sound symbolism is reflected in the number of recent review articles on the topic [47]. In this study, we examine whether non-arbitrary relationships between sound and gender exist in Japanese names by using machine learning algorithms to classify Japanese first names into binary gender categories. We specifically examine two machine learning algorithms: the Random Forest algorithm and an extreme gradient boosted algorithm (XGBoost). The classification accuracy of each algorithm is tested to determine whether gender is expressed sound symbolically in Japanese names. Following this, feature importance is examined to determine whether associations previously reported in English are also found in Japanese; if they are, this would suggest that systematic sound symbolic expressions of gender are universal.

Sound symbolism refers to the non-arbitrary associations between sounds, or sequences of sounds, and meaning in speech. It should be noted, however, that the term sound symbolism can be misleading given that symbolism, by definition, implies an arbitrary relationship between form and meaning. Early experiments in sound symbolism made use of pseudowords to examine the psychology of iconic mapping. In a seminal study, Sapir [4] found that 96% of English speakers judged /mal/ to be associated with a larger object and /mil/ with a smaller object. More recently, studies have expanded beyond English-speaking populations and found the same association in Thai, Mandarin, Korean, and Japanese speakers [8–10]. Although it remains disputed whether sound-to-size mappings correspond to vowel height or vowel backness, Shinohara and Kawahara [9] found an effect of vowel backness on size ratings in Japanese speakers. Back vowels and voiced obstruents invoked larger imagery than front vowels and voiceless obstruents. This is also supported by typological surveys reporting that [i] is often used to express diminutive meanings and smallness [11, 12]. With the exception of the Bahnar language, for which the opposite is true [13], low vowels are typically associated with large entities and high vowels with small entities. In the present study, we train machine learning algorithms using the sounds of given names to classify names into binary gender categories. Aligning with gender stereotypes, we predict that sounds that typically reflect smallness will be important in the algorithms for female classification, while those that typically reflect largeness will contribute to male classification. Additionally, we predict that sounds typically associated with femininity (e.g., /m/) will contribute to female classification.

The systematic association between sounds and size has been proposed to have a biological basis [14, 15]. Species often convey their size by manipulating the fundamental frequency (F0) of their vocalizations, as this advertisement of size could potentially deceive adversaries [16–18]. Various animals, including birds [19], frogs [20], and mammals [21], have been observed to adjust their vocal pitch to convey submission or aggression. In the context of human vocalization, Ohala [14, 15] argues that fundamental frequencies are influenced by body size, thus establishing an association between sounds and the dimensional aspects of size. Specifically, high-frequency sounds are associated with smallness, while low-frequency sounds are associated with largeness. However, in the context of human speech, formant frequencies, not fundamental frequencies, correlate with body size [22–24]. While vowels vary across multiple dimensions (e.g., third formant and fundamental frequencies), back vowels generally have lower formant frequencies than front vowels [12, 24, 25]. In line with the frequency code hypothesis, vowels with a higher second formant, such as [i], are often associated with smaller concepts, while vowels with a low second formant, like [a], tend to convey larger concepts [11, 26, 27]. This pattern is observed across languages worldwide, where words expressing smallness are disproportionately represented by high vowels such as [i], whereas words conveying largeness tend to feature low back vowels such as [a] and [o] [11, 26, 27].

The application of the frequency code to consonants remains a puzzle, given that formant frequencies fail to provide informative cues regarding the manner and place of articulation of consonants. Moreover, disparities emerge between the consonant-to-size mappings observed in studies of fictional creations [9, 28–31] and those observed in studies of existing given names [32–35]. While all these studies reference the frequency code as the underlying factor driving the observed patterns, the patterns themselves vary across investigations. Notably, Shinohara and Kawahara [9] observed that the invoked imagery depends on the voicing of obstruents, where voiced obstruents evoke larger imagery than voiceless ones. This phenomenon has been attributed to the lower fundamental frequencies in vowels adjacent to voiced obstruents [36, 38, 39]. It is evidenced by the disproportionate number of voiced obstruents in the names of heavier, more evolved fictional Pokémon creatures [29, 37]. Although findings from Pokémon names provide support for the frequency code [14, 15], names of Pokémon are not entirely equivalent to natural language words or names, as they are often created to highlight certain characteristics of the creatures and do not undergo change. Conversely, studies investigating English first names have indicated a higher prevalence of sonorants in female names, while obstruents (both voiced and voiceless) tend to be more frequent in male names [35, 38, 39]. It has been hypothesized that the sound-to-gender association arises indirectly through size-sound associations, as female names tend to favor consonants associated with smallness, while male names exhibit a preference for consonants associated with largeness. Nevertheless, to date, no studies have endeavored to reconcile the discrepancies among these findings. As a result, it remains an open question how the frequency code [14, 15] applies to consonants.

Nevertheless, given names in English have been posited to follow the frequency code [33, 40]. Although people do not generally pick their own names, it has been theorized that parents are drawn to names that carry desirable stereotypes for a given gender [41]. Of specific interest is the physical characteristic of body size. It has been widely documented that, on average, taller men have greater reproductive success [42, 43]. Conversely, shorter and slimmer women are perceived as more fecund and attractive [44–47]. As a result, parents pick phonemes associated with the desirable stereotype of each gender for their offspring. For example, Pitcher and collaborators [33] systematically examined thousands of popular American and Australian English names. In accordance with the frequency code [14, 15], vowels with high resonating frequencies (e.g., /i/ and /e/) were commonly attested in female names, while low-frequency vowels (e.g., /u/ and /o/) were found in male names. Studies have reported systematic sound-meaning mappings in English personal names [32, 33, 35, 38, 40, 41, 48, 49]. These studies have found systematic prosodic-phonological patterns in names of different genders, including ‘consonant sonority’, ‘quality of stressed vowels’, ‘number of syllables’, ‘number of phonemes’, and ‘stress location’ (see Table 1 for details). They have reported that female names are more likely to contain more phonemes and syllables and less likely to stress the initial syllable [34, 47, 50]. For example, the name ‘Cecilia’ /sɪsiljəˈ / fits the criteria of a typical female name. It has four syllables, a non-initial stressed syllable, and a high front vowel. The opposite is true for the male name ‘Tom’ /tɑm/. This name has one closed syllable and a low back vowel.

Table 1. Systematic prosodic-phonological patterns previously reported in English names [34, 35, 47].

Pattern | Male names | Female names
Consonant sonority | More obstruents | More sonorants
Quality of stressed vowels | More palatal vowels | More velar vowels
Number of phonemes | Fewer phonemes | More phonemes
Number of syllables | Fewer syllables | More syllables
Stress location | More initial stress | More non-initial stress

These patterns generally hold in languages other than English. For example, Suire and collaborators [40] investigated vowel and consonant patterns in the most popular female and male multisyllabic French names. They found that the number of voiced plosives in the first syllable, the place of articulation of vowels, nasality, and the number of voiceless fricatives were significant predictors of gender. Male names were more likely to contain back vowels (e.g., /o/ and /ɔ/), nasal vowels (e.g., /ã/ and /ɔ̃/), and voiceless fricatives (e.g., /s/ and /ʃ/). Recently, studies have expanded to cross-language comparisons. Ackermann and Zimmer [32] examined cross-linguistic systematic sound-gender patterns in first names from 13 countries. Critically, this included languages that are geographically and culturally distant as well as typologically unrelated, such as Mandarin, Turkish, Japanese, Romanian, German, and Hebrew, among others. They found that the number of non-palatal vowels did not interact with the factor country, suggesting that only patterns pertaining to vowels are sound symbolic. Taken together, these studies illustrate that sound-gender patterns for vowels are robustly attested across languages. In comparison, patterns observed for consonants have limited generalizability.

Our study examines given names in the context of Japanese. Japanese names are multi-syllabic and are written in Kanji, Kana, or a mix of both. Latin letters and numerals are not used in given names. Of the two writing systems, Kanji are logographic characters adopted from early Chinese writing; Kana, on the other hand, is a collective term for the two syllabaries, Hiragana and Katakana [51]. Although Hiragana and Katakana are relatively phonemic, Kanji characters are not bound to a single pronunciation [52]. Instead, Kanji characters have at least two readings: the Kun-reading (the original Japanese reading) and the On-reading (the reading derived from the importation of Chinese characters into written Japanese [53]). For example, the pronunciation of the female name 紀子 is ambiguous: it could be pronounced as Kiko (/kʲiko̞/), Toshiko (/to̞ɕiko̞/), Michiko (/mʲi kʲiko̞/), or Motoko (/mo̞to̞ko̞/). Although gender is not morphologically marked in Japanese, certain suffixes are commonly associated with male or female names. Commonly found elements include -o, -ro, -to, -hiko, -ta, and -shi for male names, and -ko, -e, -yo, -ka, and -mi for female names [51]. Given the rich inventory of mimetics [54] and iconic mappings in fictional characters in Japanese [54], we expect systematic sound-gender mapping in Japanese names.

Japanese is a member of the Japonic language family, together with Ryukyuan and Hachijō. Although not unique to Japanese, one of its distinctive features is its use of the mora alongside phonemically contrastive duration. The mora (symbolized μ) is a sub-syllabic timing unit composed of an optional consonant and a vowel [55]. The assignment of morae is sensitive to the phonological notion of weight. Onset consonants aside, short vowels and coda consonants each receive one mora, while long vowels receive two morae. For example, the loanword London /ro.n.do.n/ consists of four morae. The full consonant and vowel inventories can be found in Table 2 and Fig 1. These inventories make up the features in the datasets, with the exception of the velar nasal, which was excluded to align with the literature on Japanese, in which all instances of coda nasals are counted as /ɴ/.

Table 2. Consonant inventory of Japanese [56].

Bilabial Alveolar Alveo-palatal Palatal Velar Uvular Glottal
Plosive b p t d k g
Nasals m n ɴ
Tap ɾ
Fricatives s z ʃ h
Affricates t͡s d͡z t͡ʃ
Glides w j

Fig 1. Vowel inventory of Japanese [56].

Beyond its moraic structure, Japanese is also known for its rich inventory of mimetics, which sets it apart from Indo-European languages. Mimetics, also known as ideophones, are vivid depictions of sensations, actions, and various subjective experiences through iconic sound-meaning correspondences [57]. Since these words are often poorly integrated grammatically, they are often considered a special part of the lexicon [58]. Commonly noted examples include goro-goro for the sound of rolling and pika-pika for shiny or flashing objects [59]. Given the integral role of mimetics in Japanese, it is likely that sound-gender associations are encoded in names, another core domain of the lexicon.

In addition to unraveling the sound-gender association in Japanese, the current study aims to test Random Forest algorithms against XGBoost algorithms in classifying names by the typical gender of their referents according to phonemes. Although past studies have compared accuracy between these algorithms [60, 61], it remains challenging to ascertain which model performs better because only a single iteration of each model was constructed. To address this issue, this study constructed multiple iterations of each algorithm through k-fold cross-validation, which allows for traditional statistical hypothesis testing in the form of linear regression. The Random Forest algorithm [62] and the XGBoost algorithm [63] are both ensemble machine learning algorithms that construct many decision trees to generate a model, which is then tested on a holdout subset of the data. Decision trees are non-parametric machine learning algorithms that resemble flow charts mapping the possible outcomes of a series of choices. The choices occur at nodes in the decision tree, and in classification models the outcome is determined at a terminal node by majority vote. Random Forest and XGBoost algorithms differ in a few key ways. Although both algorithms construct decision trees, XGBoost models construct trees sequentially, taking the results of earlier trees into consideration, while Random Forest constructs trees in parallel, independently of each other. In XGBoost, each new tree is trained on the residuals, or errors, of the trees that precede it. This process emphasizes the cases that earlier trees handle poorly, with the goal of correcting those specific errors. This sequential optimization refines the overall model's ability to handle diverse cases and minimizes prediction error.
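The contrast between independent (bagged) trees and sequential (boosted) trees can be illustrated with a minimal Python sketch. It is not the R code used in this study and it is not XGBoost itself, which additionally uses regularization and second-order gradient information; it is plain residual fitting with one-split regression "stumps" on hypothetical toy data. All function names and the learning rate are assumptions made for illustration.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression stump; returns (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # no valid split: predict the mean everywhere
        m = sum(ys) / len(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, lm, rm = stump
    return lm if x <= t else rm

def bagged_stumps(xs, ys, n_trees, rng):
    """Random-Forest style: every stump is fitted independently on a bootstrap sample."""
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(predict_stump(s, x) for s in stumps) / len(stumps)

def boosted_stumps(xs, ys, n_trees, lr=0.5):
    """Boosting style: every stump is fitted to the residuals of the ensemble so far."""
    base = sum(ys) / len(ys)
    stumps, preds = [], [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        s = fit_stump(xs, residuals)
        stumps.append(s)
        preds = [p + lr * predict_stump(s, x) for p, x in zip(preds, xs)]
    return lambda x: base + lr * sum(predict_stump(s, x) for s in stumps)

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy data: a curved target that a single split cannot capture.
xs = [i / 10 for i in range(40)]
ys = [(x - 2) ** 2 for x in xs]
bagged = bagged_stumps(xs, ys, 50, random.Random(0))
boosted = boosted_stumps(xs, ys, 50)
print("bagged training MSE: ", mse(bagged, xs, ys))
print("boosted training MSE:", mse(boosted, xs, ys))
```

In the boosted version the order of the trees matters, because each stump depends on the predictions accumulated before it; in the bagged version the stumps could be fitted in any order or in parallel.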

Method and materials

All data and code are available at the following online repository: https://osf.io/yrx4u/?view_only=d7fc5ef8b4ab449d8aaba591b18fdfc9.

Data

The data consists of the 1000 most common given names in Japanese from the Forebears website (REF: https://forebears.io/japan/forenames). Japanese names on Forebears are listed only in roman characters and are reportedly converted from their native script using JTALK (REF: https://forebears.io/japan/forenames). The moderators of the website revealed that the names were taken from a 2014 telephone directory consisting of 18 million names. Gender is listed as a distribution between male and female. For this study, names were assigned to gender categories through a majority split of their distribution. Four names that were not gender tagged were omitted from the final dataset, resulting in 996 samples (444 female). Since Hiragana and Katakana are reasonably phonemic, an algorithm was constructed to convert Japanese names into phone counts. A native Japanese linguist inspected the data and checked whether the transcription and gender classification produced by this process were accurate. They reported that the transcription was accurate and that, overall, gender classifications were accurate, though they did note a couple of unisex names whose majority classification did not meet their expectations. Given that this was a very small number of edge cases, we made no adjustments to the gender distribution listed on the website. Each sample therefore consisted of a gender classification followed by 34 features. The name Hiromi /hiɾomi/, for instance, is assigned to the female category because 92% of its distribution falls to that gender. It has a value of 2 for /i/ and a value of 1 each for /h/, /ɾ/, /o/, and /m/. A value of zero is applied to all other features. This method results in a dataset consisting primarily of null (zero) values (84.32%).
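The encoding described above can be sketched in a few lines of Python. This is an illustration rather than the conversion algorithm used in the study: it assumes the name has already been segmented into phonemes, uses a hypothetical abbreviated inventory of 10 phonemes in place of the 34 features, and treats a female share above 0.5 as the majority split (how exact ties are handled is an assumption).

```python
from collections import Counter

# Hypothetical, abbreviated inventory for illustration; the study uses 34 features.
INVENTORY = ["a", "i", "u", "e", "o", "h", "ɾ", "m", "k", "t"]

def phone_counts(phones, inventory=INVENTORY):
    """Return one count per inventory phoneme, with zeros for absent phonemes."""
    counts = Counter(phones)
    return [counts.get(p, 0) for p in inventory]

def majority_gender(female_share):
    """Assign a binary label from the share of female bearers (ties go to male here)."""
    return "female" if female_share > 0.5 else "male"

# Hiromi /hiɾomi/, pre-segmented into phonemes, with a 92% female distribution.
hiromi = ["h", "i", "ɾ", "o", "m", "i"]
vector = phone_counts(hiromi)
label = majority_gender(0.92)
zero_share = vector.count(0) / len(vector)

print(label, dict(zip(INVENTORY, vector)))
print(f"share of zero-valued features in this row: {zero_share:.0%}")
```

Because most names use only a handful of the available phonemes, most entries in each row are zero, which is why the full dataset is sparse.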

Machine learning algorithms

The machine learning algorithms outlined in the following were constructed in R. The Random Forest algorithms were constructed with the ranger package [61, 64] and tuned using the tuneRanger package [65]. The XGBoost algorithms were constructed using the XGBoost package [63] and tuned by supplying different values to the XGBoost tuning grid. The only hyperparameter that was not tuned in this manner was the number of trees (or rounds, in the parlance of XGBoost) for each algorithm. Given that we are comparing algorithms, other hyperparameters and algorithm features were kept identical. A series of test models for both algorithms suggested that the stability and accuracy of the models did not improve beyond 3000 trees. This value was applied to all iterations.

Because decision tree-based algorithms are prone to overfitting when dealing with datasets that have many null values [56], k-fold cross-validation was used. In k-fold cross-validation, the data is split into folds, which are then recombined into multiple testing and training subsets. The following algorithms consist of multiple k-fold models that use different combinations of samples, balanced so that every sample occurs in the testing subsets the same number of times and in the training subsets the same number of times. In the present study, the data was split into 8 folds (A-H). These were then recombined to create 3:1 splits into training and testing subsets, whereby each iteration is trained using 75% of the data and tested on the remaining 25%. For example, the first iteration of each model is trained on subsets A, B, C, D, E, and F, and tested on subsets G and H. A Latin square combining all subsets revealed 28 possible combinations, and each combination was used, resulting in 28 iterations for each algorithm. Twenty-eight iterations were used to ensure an adequate sample size for the statistical tests that explore accuracy differences between the Random Forest and XGBoost algorithms. A 3:1 split was deemed suitable because the documentation for the Random Forest algorithm suggests a 2:1 split, while the documentation for the XGBoost algorithm suggests a 4:1 split.
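The count of 28 follows from choosing 2 of the 8 folds as the testing subset. A minimal Python sketch of this recombination is given below; it is an illustration, not the R code from the repository, and the order in which it enumerates the splits is an assumption that need not match the order used in the study.

```python
from collections import Counter
from itertools import combinations

folds = list("ABCDEFGH")

# Every pair of folds serves as the testing subset once; the other six folds train.
splits = []
for test in combinations(folds, 2):
    train = [f for f in folds if f not in test]
    splits.append((train, list(test)))

test_counts = Counter(f for _, test in splits for f in test)
train_counts = Counter(f for train, _ in splits for f in train)

print(len(splits), "train/test splits")
print("times each fold is tested: ", dict(test_counts))
print("times each fold is trained:", dict(train_counts))
```

Each fold, and therefore each sample, appears in a testing subset 7 times and in a training subset 21 times, which is the balance described above.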

Aside from cross-validation, k-folds allow for more in-depth statistical analyses. For example, the present study constructs 28 iterations of both the Random Forest and XGBoost algorithms, and we are interested in testing which algorithm is better suited to the task. A single iteration of each model gives a general idea of how well each algorithm performs on one particular subset split, but multiple iterations provide a mean and standard deviation and show how the algorithms perform when dealing with the entirety of the data. This also means that statistical hypothesis testing can be applied to the accuracy of the two algorithms, such as the regression analyses presented in the following section. Both algorithms already have in-built significance tests. For Random Forest models, statistical significance tests are applied to feature importance, while for XGBoost models, significance tests are applied to the accuracy of the model. We use Fisher’s method [66] for combining p values to provide an overall significance test for the model.
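Fisher's method combines k independent p values into the statistic X² = −2 Σ ln(pᵢ), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis. The Python sketch below illustrates the calculation with hypothetical p values; it is not the R code used in the study. It relies on the closed-form chi-square survival function for even degrees of freedom, so it needs only the standard library, and it assumes every p value is strictly greater than zero.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    Returns (statistic, combined_p). The statistic is -2 * sum(ln p) and is
    compared against a chi-square distribution with 2k degrees of freedom.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Survival function of chi-square with 2k df: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total

# Hypothetical p values from three iterations of a model.
stat, combined = fisher_combined_p([0.04, 0.01, 0.20])
print(f"X^2 = {stat:.3f}, combined p = {combined:.5f}")
```

With a single p value the method returns that value unchanged, which is a convenient sanity check on the implementation.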

Results

Accuracy

Overall, the Random Forest algorithm (M = 76.36%, SD = 1.82%) was slightly less accurate and less stable than the XGBoost algorithm (M = 77.16%, SD = 1.58%). The accuracy of every iteration of the XGBoost algorithm was statistically significant (p < 0.001 in all cases), as was the combined p value from Fisher’s method (p < 0.001). Because the ranger package does not provide a significance test for Random Forest accuracy, an intercept-only linear regression model was constructed to test the accuracy of each iteration against 55.42%, the accuracy that a naïve model would achieve if it assigned every sample to the majority category (55.42% of the samples are male). The intercept-only model showed the accuracy of the Random Forest algorithm to be significant; t(27) = 59.92, p < 0.001. Both models achieved an average classification accuracy greater than 75%. Table 3 presents the confusion matrix for the Random Forest algorithm and Table 4 presents the confusion matrix for the XGBoost algorithm.

Table 3. Confusion matrix for all iterations of the Random Forest algorithm.

Sample | Predicted female | Predicted male
Female | 2212 | 896
Male | 752 | 3112

Table 4. Confusion matrix for all iterations of the XGBoost algorithm.

Sample | Classified female | Classified male
Female | 2303 | 805
Male | 787 | 3077
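Overall accuracy can be recovered from the pooled confusion matrices in Tables 3 and 4, and the naïve baseline from the class sizes. The Python sketch below shows the arithmetic; the dictionary layout and function names are assumptions made for illustration, with outer keys giving the true category and inner keys the predicted one.

```python
def accuracy(matrix):
    """Share of samples on the diagonal (true category equals predicted category)."""
    correct = matrix["female"]["female"] + matrix["male"]["male"]
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

def recall(matrix, category):
    """Share of samples of one true category that were classified correctly."""
    row = matrix[category]
    return row[category] / sum(row.values())

# Counts copied from Table 3 (Random Forest) and Table 4 (XGBoost).
random_forest = {"female": {"female": 2212, "male": 896},
                 "male": {"female": 752, "male": 3112}}
xgboost = {"female": {"female": 2303, "male": 805},
           "male": {"female": 787, "male": 3077}}

# A naive model that always predicts the majority category (552 of 996 names are male).
baseline = 552 / 996

for name, matrix in [("Random Forest", random_forest), ("XGBoost", xgboost)]:
    print(f"{name}: accuracy = {accuracy(matrix):.4f}, "
          f"female recall = {recall(matrix, 'female'):.4f}, "
          f"male recall = {recall(matrix, 'male'):.4f}")
print(f"majority-class baseline = {baseline:.4f}")
```

Each matrix sums to 6972 classifications, that is, 28 iterations with 249 test samples each, so the pooled accuracy agrees with the reported mean accuracies up to rounding.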

To examine whether the algorithms differ significantly in classifying names, a simple linear regression model was constructed based on the accuracy of individual iterations. It was constructed with models (Random Forest or XGBoost) as the predictor and accuracy as the outcome variable. The simple linear model revealed that the XGBoost algorithm was significantly more accurate than the Random Forest algorithm in classifying names according to gender; t(54) = 2.159, p = 0.035.

Feature importance

Feature importance is a measure of how important each sound was in the decision-making processes of the machine learning algorithms. For the Random Forest models, the present study uses the Altmann permutation method [67], which involves conducting many iterations for each permutation. Permutation is the randomisation of an individual feature and the observation of the resulting change in the prediction of unseen samples. The aggregated feature importance scores for the 12 most important features in the Random Forest model are presented in Table 5, where the aggregated permutation importance of each feature and Fisher’s combined p values are listed as Importance and Sig. Table 5 also lists the adjusted distribution to the female category. Because of the uneven distribution of male and female names in the sample (Female = 444; Male = 552), the percentage of each phoneme occurring in female names is adjusted. An adjusted distribution higher than 50% suggests that the phoneme predicts female names; a value below 50% suggests that it predicts male names. Since the majority of features found to be predictive of gender classification overlapped between Random Forest and XGBoost, features predictive of gender are discussed collectively in the following section.
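The core idea of permutation importance, shuffling one feature and measuring the drop in accuracy, can be sketched in Python as follows. This is a simplified illustration with a hypothetical rule-based classifier and toy data, not the procedure run in this study; in particular, it omits the additional step by which the Altmann method derives p values from a null distribution of importances.

```python
import random

def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, feature, n_repeats, rng):
    """Mean drop in accuracy when the values of one feature are shuffled across samples."""
    base = accuracy(model, rows, labels)
    drops = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        shuffled = [{**r, feature: v} for r, v in zip(rows, column)]
        drops.append(base - accuracy(model, shuffled, labels))
    return sum(drops) / n_repeats

# Hypothetical classifier: predicts "female" whenever the name contains /m/.
def toy_model(row):
    return "female" if row["m"] > 0 else "male"

# Hypothetical toy data in which /m/ fully determines the label and /t/ is irrelevant.
rows = [{"m": 1, "t": 0}, {"m": 1, "t": 1}, {"m": 2, "t": 0}, {"m": 1, "t": 2},
        {"m": 0, "t": 1}, {"m": 0, "t": 0}, {"m": 0, "t": 2}, {"m": 0, "t": 1}]
labels = ["female"] * 4 + ["male"] * 4

rng = random.Random(1)
importance_m = permutation_importance(toy_model, rows, labels, "m", 20, rng)
importance_t = permutation_importance(toy_model, rows, labels, "t", 20, rng)
print("importance of /m/:", importance_m)
print("importance of /t/:", importance_t)
```

A feature the model never consults, such as /t/ here, has an importance of exactly zero, because shuffling it cannot change any prediction.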

Table 5. Feature importance for the Random Forest model.

An adjusted distribution above 50% indicates that the phoneme is predicted to be particularly important for female names.

Phoneme | Importance | Sig | Adjusted Distribution (Female)
/i/ | 5.94% | < 0.001 | 44.97%
/m/ | 3.14% | < 0.001 | 78.67%
/a/ | 2.16% | < 0.001 | 50.83%
/o/ | 1.99% | < 0.001 | 50.29%
/k/ | 1.72% | < 0.001 | 61.69%
/d͡z/ | 1.51% | < 0.001 | 22.16%
/u/ | 1.19% | < 0.001 | 52.13%
/ʃ/ | 1.03% | < 0.001 | 30.60%
/t/ | 1.00% | < 0.001 | 38.85%
/e/ | 0.70% | < 0.001 | 69.48%
/h/ | 0.53% | < 0.001 | 33.41%
/b/ | 0.45% | < 0.001 | 43.51%
/b/ 0.45% < 0.001 43.51%

For the XGBoost models, feature importance was calculated using the default settings included in the XGBoost package. Feature importance measures are reported in relation to the most important feature in each model. In other words, the most important feature is assigned a score of 100 and every other feature is assigned a score relative to it. Importance is calculated using purity (Gini index) applied to the amount by which each attribute split improves model performance, weighted by the number of observations at the node. Table 6 reports the 12 most important features in the XGBoost model. Importance in Table 6 is the relative importance of each feature aggregated across iterations.

Table 6. Feature importance for the XGBoost model.

An adjusted distribution above 50% indicates that the phoneme is predicted to be particularly important for female names.

Phoneme | Importance | Adjusted Distribution (Female)
/i/ | 99.78 | 44.97%
/a/ | 73.25 | 50.83%
/m/ | 52.39 | 78.67%
/o/ | 46.33 | 50.29%
/k/ | 46.17 | 61.69%
/u/ | 42.43 | 52.13%
/d͡z/ | 33.65 | 22.16%
/ʃ/ | 26.20 | 30.60%
/s/ | 26.20 | 51.04%
/e/ | 25.25 | 69.48%
/t/ | 24.04 | 38.85%
/h/ | 21.03 | 33.41%

Features that were found to be important in the Random Forest model were also found to be important in the XGBoost model. Although the degree to which phonemes contribute to gender classification differs between algorithms, the same set of vowels and consonants emerged in both; five phonemes were especially important: /i/, /m/, /a/, /o/, and /k/.

The directionality of each phoneme was also examined and did not entirely align with the predictions of the frequency code. Based on previous studies of English names, one would predict that sonorants and high vowels are more commonly found in female names, while obstruents and low vowels are more likely to be found in male names [37, 41]. Among the five most important phonemes, only the voiced bilabial nasal /m/ (a sonorant) aligns with associations reported in English, as it was found to be predictive of female names. Contrary to associations observed in English, the high front vowel /i/ and the voiceless velar stop /k/ were found to be predictive of male and female names, respectively. Potential explanations for these anomalies are explored further in a post-hoc Poisson regression. For the phonemes /a/ and /o/, which are roughly equally distributed across male and female names, the current study abstains from drawing any conclusion about directionality.

Post-hoc analysis

Since the associations observed for /i/ and /k/ deviate from patterns previously reported in English, a post-hoc Poisson regression analysis was conducted to explore tentative explanations. We propose that the observed associations were masked by suffixation in names. In Japanese, certain phonemic combinations are systematically found in names of a particular gender [53]. The triphone /iʧi/ and the diphone /ko/ are often found in male and female names, respectively. The triphone /iʧi/ corresponds to the kanji 一, which translates to the number one [52]. The diphone /ko/ corresponds to the character 子 (‘child’ or ‘young’) and is frequently found in word-final position in female names [68]. The post-hoc Poisson regression supports our hypothesis. The triphone /iʧi/ was found to significantly predict male names (B = -2.13, SE = 0.423, p < 0.001). The diphone /ko/ was found to significantly predict female names (B = 1.32, SE = 0.202, p < 0.001). Both observations are reflected in the many names in the current dataset that contain /iʧi/ (e.g., Ichini /iʧini/, Koichi /koiʧi/, and Shinichi /ʃiniʧi/) or /ko/ (e.g., Akiko /akiko/, Yukiko /yukiko/, and Yoko /yoko/). Given these results, it is likely that the associations for /k/ and /i/ were in fact masked by a phonestheme. Here, a phonestheme is defined as a systematic sound-meaning mapping due to shared genealogy [68]. A commonly cited example is the phoneme sequence gl- in English words related to light or vision (e.g., glitter, gleam, and glare) [5, 69].

In addition to the phonesthemic explanation, the potential effect of position was also explored, since Ackermann and Zimmer [34] reported a bifurcation in the sound-gender association for non-palatal vowels (e.g., /a/) in word-final position, as opposed to word-initial or word-medial position. Contrary to Ackermann and Zimmer [34], a position effect was not attested. After excluding the combination /iʧi/, the predictor /i/ in word-final position (B = -0.293, SE = 0.174, z = -1.684, p = 0.092) and in non-word-final position (B = -0.288, SE = 0.167, z = -1.172, p = 0.085) did not predict names of either gender.

Discussion

Due to the sparsity of studies examining sound symbolism in the context of names, most studies have been limited to examining names from Indo-European languages [35, 47, 48]. Nevertheless, our current results demonstrate that the sound-symbolic sound-gender associations reported in Indo-European languages are also prevalent in Japanese names. With only phonological information, both machine learning algorithms were able to capitalize on systematic sound-meaning mappings to classify the gender of names with accuracy well above the majority-class baseline of 55.42% (Random Forest: M = 76.36%, SD = 1.82%; XGBoost: M = 77.16%, SD = 1.58%). The high accuracy in gender prediction suggests that sound-symbolic mapping is also present in Japanese and can be detected by machine learning algorithms.

In addition to the observed sound-gender associations in Japanese names, our findings may be further understood in the context of the bouba-kiki effect. This phenomenon, where people non-arbitrarily associate certain sounds with specific shapes [70], may extend to the perception of names. The systematic sound-meaning mappings we observed could reflect a broader linguistic phenomenon where certain phonetic qualities are generally associated with specific semantic properties, as discussed in Sidhu, Westbury, Hollis, and Pexman [71].

Of particular interest is the mapping of specific phonemes to male or female names. The sound-gender association observed for the voiced bilabial nasal /m/ aligns with that previously reported in English. In previous accounts of English and French, sonorants and front vowels were more frequently attested in female names, while obstruents and back vowels were more frequently found in male names [33, 40, 48], an observation in line with Ohala’s [14, 15] frequency code. Given that /m/ is a sonorant, its association with femininity is not surprising.

The observed association of the consonant /m/ with femininity could be attributed not only to the frequency code [14, 15], but potentially also to the concept of breast [11] or cuteness [72, 73]. The link to breast may arise from the articulatory gesture involved. Articulating /m/ requires closing both lips while air flows through the nasal cavity (i.e., bilabial place and nasality). Since /m/ is the type of sound generated during breastfeeding, an association is formed between breast and /m/ [26, 74]. Alternatively, /m/ has been found to be associated with cuteness in Japanese [72, 73]; the consonant is widely observed in brand names for baby diapers [72]. This finding also resonates with a study by Erben Johansson & Cronhamn [75], who found that nasals occur more frequently in feminine nouns. Considering the rich imagery invoked by /m/, its association with femininity may be due to a combination of these images and concepts. The bilabial and nasal qualities of /m/, reminiscent of nurturing and cuteness, may invoke such imagery cross-linguistically, transcending cultural and linguistic boundaries.

Beyond the consonant /m/, the current study also observed associations that run contrary to those reported in English. Namely, the high front vowel /i/ and the voiceless velar consonant /k/ were found to be associated with masculinity and femininity, respectively. Morphology is likely to play a crucial role in explaining these findings: Japanese does not overtly mark gender grammatically, but certain name endings subtly indicate the bearer’s gender. Japanese speakers have been found to rely on name-final syllables such as [ta], [to], [ma], [shi], [o], and [ro] to predict male names, and [ko], [ne], [ka], [e], [mi], [yo], [no], and [na] to predict female names [76]. These elements could sometimes be classified as morphemes since they adhere to Japanese mora structure.

In investigating the associations between gender and phonemes in Japanese names, we are hesitant to draw conclusions regarding the open front vowel /a/ and the close-mid back vowel /o/. This reluctance stems from the near 50 percent distribution of these vowels across male and female names in the training subset, which makes it likely that any observed association would be insubstantial and could fail to replicate. When phonemes are distributed almost equally across genders, it becomes challenging to discern clear sound-symbolic associations. In such cases, small variations in the data or the inclusion of additional names could lead to different results. Therefore, caution is warranted in interpreting the significance of these vowels in predicting the gender of names, and a larger sample size may be needed to confirm or refute potential associations.

As for the comparison of the two algorithms, the small but significantly higher classification accuracy observed for the XGBoost algorithm has valuable implications for future studies comparing the performance of machine learning algorithms. The superior performance of the XGBoost algorithm suggests that the boosting technique, which optimizes an objective function through sequential tree-building, better represents the non-linear relationships within the data. The higher accuracy achieved by the XGBoost algorithm indicates its ability to discern intricate patterns and interactions among features, which might have been challenging for the Random Forest algorithm to capture. By outperforming the Random Forest algorithm, the XGBoost algorithm demonstrates its potential to yield more precise and reliable predictions, making it a promising choice for similar classification tasks in the future. Nevertheless, careful scrutiny of specific features and relationships is needed to understand what drives the improved accuracy.
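
The idea of sequential tree-building can be illustrated with a deliberately small sketch. The code below is plain gradient boosting on squared error with one-split trees (stumps) over a single numeric feature. It is not the XGBoost algorithm itself, which adds regularization and second-order gradient information, and it is not the configuration used in this study. The data, function names, and learning rate are all hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split on xs that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave items on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]

def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean

def boost(xs, ys, n_rounds, learning_rate=0.5):
    """Build stumps one after another, each fitted to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * predict_stump(stump, x)
            for p, x in zip(predictions, xs)
        ]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)

def mean_squared_error(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy non-linear pattern that no single split can capture.
xs = list(range(8))
ys = [0, 0, 1, 1, 0, 0, 1, 1]

for rounds in (1, 5, 20):
    model = boost(xs, ys, rounds)
    print(rounds, "rounds, training MSE:", round(mean_squared_error(model, xs, ys), 4))
```

Each new stump is fitted to the errors left by the stumps before it, which is the sense in which boosting uses previous tree outputs to correct errors. A random forest, by contrast, grows its trees independently and averages them.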

Regarding the broader discourse on ethical considerations in machine learning, we acknowledge the need to explicitly address the ethical implications of our gender classification model based on Japanese name phonemes. While our primary focus has been on exploring how sounds express gender in Japanese, we recognize the importance of engaging with the ethical dimensions of AI applications, especially those related to gender. Concerns regarding potential biases and misuse underscore the need for ongoing vigilance in the development and deployment of such models. It is also crucial to highlight that our model operates within a binary classification system of male and female, reflecting the specific context of Japanese names and their phonetic attributes. We acknowledge that our model is limited in capturing the evolving nuances of gender identity as societal perspectives continue to shift towards a more nuanced understanding. In subsequent research, we aim to explore approaches that accommodate the complexities of gender identity and contribute to a more inclusive and ethically responsible use of machine learning in this domain.

In conclusion, our study provides compelling evidence that sound-symbolic sound-gender associations extend to Japanese names. The classification accuracy achieved by both machine learning algorithms, particularly the XGBoost algorithm, highlights the potential for systematic sound-meaning mappings in Japanese, detectable by these algorithms. Notably, we observed intriguing associations between specific phonemes and gender in Japanese names. The sound-gender correspondence for the voiced bilabial nasal /m/ aligns with findings from English and French, possibly linked to the frequency code and associations with breast or cuteness. However, the associations for the high front vowel /i/ and the voiceless velar consonant /k/ diverge from the patterns reported in English, indicating potential cross-linguistic differences, perhaps due to morphology. Culture may also play a role, influencing the phoneme choice in names and reflecting shifts in gender expectations over time. We remain cautious in interpreting the associations between gender and the vowels /a/ and /o/, because their near-equal distribution in male and female names makes any conclusion uncertain without a larger sample. The superior performance of the XGBoost algorithm over the Random Forest algorithm in gender prediction signifies the advantage of boosting techniques in capturing complex relationships within the data, improving accuracy by leveraging previous tree outputs to correct errors. This observation underscores the potential of the XGBoost algorithm for future classification tasks. Further investigation is needed to unveil the specific features and relationships driving the improved accuracy. Our study contributes to the growing understanding of sound symbolism and gender associations in language, while emphasizing the importance of considering cultural and linguistic nuances in such investigations.

Acknowledgments

We would like to express our sincere gratitude to all those who have contributed to this research. We extend our appreciation to the Forebears website for providing the dataset of Japanese names, as well as the moderators for their assistance in obtaining the data and linguistic insights. We are also thankful to the native Japanese linguist who meticulously inspected the dataset. Additionally, we acknowledge the developers of the R packages (Ranger, tuneRanger, and XGBoost) used in constructing and tuning the machine learning algorithms. Without their valuable contributions, this study would not have been possible.

Data Availability

All Sound Symbolism in Japanese Names: Machine Learning Approaches to Gender Classification files are available from the OSF database: https://osf.io/yrx4u/.

Funding Statement

The work included in this submission was funded by a grant from the Japan Society for the Promotion of Science (Grant Number: 20K13055). This funding source provided financial support for the publication fee. Our third author, Aleksandra Ćwiek, was also supported by the German Research Foundation (DFG) (Grant Number: CW 10/1-1).

References

  • 1. Hockett CF. The Origin of Speech. Sci Am. 1960;203: 88–97.
  • 2. Nuckolls JB. The case for sound symbolism. Annu Rev Anthropol. 1999;28: 225–252. doi: 10.1146/annurev.anthro.28.1.225
  • 3. Jespersen O. Symbolic value of the vowel i. Linguistics: Selected papers in English, French and German. Copenhagen: Levin and Munksgaard; 1933. pp. 283–303.
  • 4. Sapir E. A study in phonetic symbolism. J Exp Psychol. 1929;12: 225–239. doi: 10.1037/h0070931
  • 5. Dingemanse M, Blasi DE, Lupyan G, Christiansen MH, Monaghan P. Arbitrariness, iconicity, and systematicity in language. Trends Cogn Sci. 2015;19: 603–615. doi: 10.1016/j.tics.2015.07.013
  • 6. Lockwood G, Dingemanse M. Iconicity in the lab: A review of behavioral, developmental, and neuroimaging research into sound-symbolism. Front Psychol. 2015;6.
  • 7. Perniss P, Thompson RL, Vigliocco G. Iconicity as a general property of language: evidence from spoken and signed languages. Front Psychol. 2010;1.
  • 8. Huang YH, Pratoomraj S, Johnson RC. Universal magnitude symbolism. J Verbal Learning Verbal Behav. 1969;8: 155–156. doi: 10.1016/S0022-5371(69)80028-9
  • 9. Shinohara K, Kawahara S. A cross-linguistic study of sound symbolism: The images of size. Annu Meet Berkeley Linguist Soc. 2010;36: 396. doi: 10.3765/bls.v36i1.3926
  • 10. Johnson RC. Magnitude symbolism of English words. J Verbal Learning Verbal Behav. 1967;6: 508–511. doi: 10.1016/S0022-5371(67)80008-2
  • 11. Blasi DE, Wichmann S, Hammarström H, Stadler PF, Christiansen MH. Sound–meaning association biases evidenced across thousands of languages. Proc Natl Acad Sci. 2016;113: 10818–10823. doi: 10.1073/pnas.1605782113
  • 12. Ultan R. Size-sound symbolism. In: Greenberg J, editor. Universals of Human Language. Stanford: Stanford University Press; 1978. pp. 525–568.
  • 13. Diffloth G. I: big, a: small. In: Hinton JN, Ohala J, editors. Sound Symbolism. Cambridge University Press; 1994. pp. 107–114.
  • 14. Ohala JJ. An ethological perspective on common cross-language utilization of F0 of voice. Phonetica. 1984;41: 1–16. doi: 10.1159/000261706
  • 15. Ohala JJ. The frequency code underlies the sound-symbolic use of voice pitch. Sound Symb. 1994;2: 325–347.
  • 16. Fitch WT. Vocal tract length and formant frequency dispersion correlate with body size in rhesus macaques. J Acoust Soc Am. 1997;102: 1213–1222. doi: 10.1121/1.421048
  • 17. Vannoni E, McElligott AG. Low frequency groans indicate larger and more dominant fallow deer (Dama dama) males. PLoS One. 2008;3. doi: 10.1371/journal.pone.0003113
  • 18. Reby D, McComb K. Anatomical constraints generate honesty: Acoustic cues to age and weight in the roars of red deer stags. Anim Behav. 2003;65: 519–530. doi: 10.1006/anbe.2003.2078
  • 19. Wallschläger D. Correlation of song frequency and body weight in passerine birds. Experientia. 1980;36: 412. doi: 10.1007/BF01975119
  • 20. Gingras B, Boeckle M, Herbst CT, Fitch WT. Call acoustics reflect body size across four clades of anurans. J Zool. 2013;289: 143–150. doi: 10.1111/j.1469-7998.2012.00973.x
  • 21. Auracher J. Sound iconicity of abstract concepts: Place of articulation is implicitly associated with abstract concepts of size and social dominance. PLoS One. 2017;12: 1–25. doi: 10.1371/journal.pone.0187196
  • 22. Evans S, Neave N, Wakelin D. Relationships between vocal characteristics and body size and shape in human males: An evolutionary explanation for a deep male voice. Biol Psychol. 2006;72: 160–163. doi: 10.1016/j.biopsycho.2005.09.003
  • 23. Rendall D, Kollias S, Ney C, Lloyd P. Pitch (F0) and formant profiles of human vowels and vowel-like baboon grunts: The role of vocalizer body size and voice-acoustic allometry. J Acoust Soc Am. 2005;117: 944–955. doi: 10.1121/1.1848011
  • 24. Tsur R. Size–sound symbolism revisited. J Pragmat. 2006;38: 905–924.
  • 25. Traunmüller H, Eriksson A. Acoustic effects of variation in vocal effort by men, women, and children. J Acoust Soc Am. 2000;107: 3438–3451. doi: 10.1121/1.429414
  • 26. Johansson N, Anikin A, Carling G, Holmer A. The typology of sound symbolism: Defining macro-concepts via their semantic and phonetic features. Linguist Typology. 2020;24: 253–310. doi: 10.1515/lingty-2020-2034
  • 27. Fitch WT. Vocal tract length perception and the evolution of language. 1986. doi: 10.1093/nq/s5-IV.83.97-a
  • 28. Newman S. Further experiments in phonetic symbolism. Am J Psychol. 1933;45: 53–75.
  • 29. Kawahara S, Kumagai G. Expressing evolution in Pokémon names: Experimental explorations. J Japanese Linguist. 2019;35: 3–38.
  • 30. Godoy MC, de Souza Filho NS, de Souza JGM, França HAN, Kawahara S. Gotta Name’em All: an Experimental Study on the Sound Symbolism of Pokémon Names in Brazilian Portuguese. J Psycholinguist Res. 2020;49: 717–740. doi: 10.1007/s10936-019-09679-2
  • 31. Kawahara S, Breiss C. Exploring the nature of cumulativity in sound symbolism: Experimental studies of Pokémonastics with English speakers. Lab Phonol. 2021;12: 1–29. doi: 10.5334/LABPHON.280
  • 32. Ackermann T, Zimmer C. The sound of gender: Correlations of name phonology and gender across languages. Linguistics. 2021;59: 1143–1177. doi: 10.1515/ling-2020-0027
  • 33. Pitcher BJ, Mesoudi A, McElligott AG. Sex-biased sound symbolism in English-language first names. PLoS One. 2013;8: 1–6. doi: 10.1371/journal.pone.0064825
  • 34. Cutler A, Mcqueen J, Robinson K. Elizabeth and John: Sound patterns of men’s and women’s names. J Linguist. 2017;26: 471–482.
  • 35. Slater AS, Feinman S. Gender and the phonology of North American first names. Sex Roles. 1985;13: 429–440. doi: 10.1007/BF00287953
  • 36. Kingston J, Diehl RL. Phonetic Knowledge. Language (Baltim). 1994;70: 419. doi: 10.2307/416481
  • 37. Kawahara S, Noto A, Kumagai G. Sound symbolic patterns in Pokémon names. Phonetica. 2018;75: 219–244. doi: 10.1159/000484938
  • 38. Cassidy KW, Kelly MH, Sharoni LJ. Inferring gender from name phonology. J Exp Psychol Gen. 1999;128: 362–381. doi: 10.1037//0096-3445.128.3.362
  • 39. Pickering M, Barry G. Sentence Processing without Empty Categories. Lang Cogn Process. 1991;6: 229–259. doi: 10.1080/01690969108406944
  • 40. Suire A, Mesa AB, Raymond M, Barkat-Defradas M. Sex-biased sound symbolism in French first names. Evol Hum Sci. 2019;1: 1–17. doi: 10.1017/ehs.2019.7
  • 41. Sidhu DM, Pexman PM. What’s in a name? Sound symbolism and gender in first names. PLoS One. 2015;10: 1–22. doi: 10.1371/journal.pone.0126809
  • 42. Pawlowski B, Dunbar RI, Lipowicz A. Tall men have more reproductive success. Nature. 2000;403: 156. doi: 10.1038/35003107
  • 43. Stulp G, Pollet TV, Verhulst S, Buunk AP. A curvilinear effect of height on reproductive success in human males. Behav Ecol Sociobiol. 2012;66: 375–384. doi: 10.1007/s00265-011-1283-2
  • 44. Brown WM, Price ME, Kang J, Pound N, Zhao Y, Yu H. Fluctuating asymmetry and preferences for sex-typical bodily characteristics. Proc Natl Acad Sci U S A. 2008;105: 12938–12943. doi: 10.1073/pnas.0710420105
  • 45. Singh D. Adaptive Significance of Female Physical Attractiveness: Role of Waist-to-Hip Ratio. J Pers Soc Psychol. 1993;65: 293–307. doi: 10.1037//0022-3514.65.2.293
  • 46. Tovée MJ, Reinhardt S, Emery JL, Cornelissen PL. Optimum body-mass index and maximum sexual attractiveness. Lancet. 1998;352: 548. doi: 10.1016/s0140-6736(05)79257-6
  • 47. Sutton L. Aliens are just like us: Personal names in the legion of super-heroes. Names. 2016;64: 109–119. doi: 10.1080/00277738.2016.1159446
  • 48. Cutler A, Mcqueen J, Robinson K. Sound Patterns of men’s and women’s names. J Linguist. 1990;26: 471–482.
  • 49. Barry H, Harper AS. Increased choice of female phonetic attributes in first names. Sex Roles. 1995;32: 809–819. doi: 10.1007/BF01560190
  • 50. Slater AS, Feinman S. Gender and the phonology of North American first names. Sex Roles. 1985;13: 7. doi: 10.1007/BF00287953
  • 51. Power J. Japanese names. Indexer. 2008; c4-2-c4-8. doi: 10.1093/nq/s9-VIII.186.66-c
  • 52. Dylman AS, Kikutani M. The role of semantic processing in reading Japanese orthographies: an investigation using a script-switch paradigm. Read Writ. 2018;31: 503–531. doi: 10.1007/s11145-017-9796-3
  • 53. Barešová I. Japanese Given Names: A Window Into Contemporary Japanese Society. Palacký University Olomouc; 2020.
  • 54. Kita S. Two-dimensional semantic analysis of Japanese mimetics. Linguistics. 1997;35: 379–415.
  • 55. Tsujimura N. Mora vs syllable. An introduction to Japanese Linguistics. 2013. pp. 65–74.
  • 56. Kilpatrick AJ, Ćwiek A, Kawahara S. Random forests, sound symbolism and Pokémon evolution. PLoS One. 2023;18: e0279350. doi: 10.1371/journal.pone.0279350
  • 57. Hamano SS. The sound-symbolic system of Japanese (ideophones, onomatopoeia, expressives, iconicity). 1986.
  • 58. Tsujimura N. An introduction to Japanese Linguistics. Malden, MA: Blackwell; 2014.
  • 59. Iwasaki S. Japanese. John Benjamins Publishing; 2013.
  • 60. Abedi R, Costache R, Shafizadeh-Moghadam H, Pham QB. Flash-flood susceptibility mapping based on XGBoost, random forest and boosted regression trees. Geocarto Int. 2022;37: 5479–5496.
  • 61. Kabiraj S, Raihan M, Alvi N, Afrin M, Akter L, Sohagi SA, et al. Breast Cancer Risk Prediction using XGBoost and Random Forest Algorithm. 2020 11th Int Conf Comput Commun Netw Technol ICCCNT 2020. 2020; 2020–2023. doi: 10.1109/ICCCNT49239.2020.9225451
  • 62. Breiman L. Random Forests. Mach Learn. 2001; 5–32. doi: 10.1109/ICCECE51280.2021.9342376
  • 63. Chen T, He T. xgboost: Extreme Gradient Boosting. R Lect. 2014; 1–84.
  • 64. Wright MN, Ziegler A. Ranger: A fast implementation of random forests for high dimensional data in C++ and R. J Stat Softw. 2017;77. doi: 10.18637/jss.v077.i01
  • 65. Probst P, Wright MN, Boulesteix AL. Hyperparameters and tuning strategies for random forest. Wiley Interdiscip Rev Data Min Knowl Discov. 2019;9: 1–15. doi: 10.1002/widm.1301
  • 66. Edwards AW. RA Fisher, statistical methods for research workers. Landmark Writings in Western Mathematics. 2005. pp. 856–870.
  • 67. Altmann A, Toloşi L, Sander O, Lengauer T. Permutation importance: A corrected feature importance measure. Bioinformatics. 2010;26: 1340–1347. doi: 10.1093/bioinformatics/btq134
  • 68. Barešová I. The phenomenon of female “-ko” names in modern Japan. Gakushuin J Int Stud. 2020;6: 23–38.
  • 69. Sidhu DM, Pexman PM. Five mechanisms of sound symbolic association. Psychon Bull Rev. 2018;25: 1619–1643. doi: 10.3758/s13423-017-1361-1
  • 70. Ćwiek A, Fuchs S, Draxler C, Asu EL, Dediu D, Hiovain K, et al. The bouba/kiki effect is robust across cultures and writing systems. Philos Trans R Soc B. 2021;377. doi: 10.1098/rstb.2020.0390
  • 71. Sidhu DM, Westbury C, Hollis G, Pexman PM. Sound symbolism shapes the English language: The maluma/takete effect in English nouns. Psychon Bull Rev. 2021;28: 1390–1398. doi: 10.3758/s13423-021-01883-3
  • 72. Kumagai G. The pluripotentiality of bilabial consonants: The images of softness and cuteness in Japanese and English. Open Linguist. 2020;6: 693–707.
  • 73. Kawahara S. Sound symbolism and theoretical phonology. Lang Linguist Compass. 2020;14: 1–17. doi: 10.1111/lnc3.12372
  • 74. Wichmann S, Holman EW, Brown CH. Sound symbolism in basic vocabulary. Entropy. 2010;12: 844–858.
  • 75. Johansson NE, Cronhamn S. Vocal iconicity in nominal classification. Lang Cogn. 2023;15: 266–291.
  • 76. Otaka H. An investigation of gender classifiers in modern Japanese first names. Kwansei Gakuin Univ Humanit Rev. 2016;21: 183–200.

Decision Letter 0

Søren Wichmann

13 Sep 2023

PONE-D-23-25119

Sound Symbolism in Japanese Names: Machine Learning Approaches to Gender Classification

PLOS ONE

Dear Dr. Ngai,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

See my comments at the end of this message. Please submit your revised manuscript by Oct 28 2023 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Søren Wichmann, PhD

Academic Editor

PLOS ONE

Journal Requirements:

When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. Please note that PLOS ONE has specific guidelines on code sharing for submissions in which author-generated code underpins the findings in the manuscript. In these cases, all author-generated code must be made available without restrictions upon publication of the work. Please review our guidelines at https://journals.plos.org/plosone/s/materials-and-software-sharing#loc-sharing-code and ensure that your code is shared in a way that follows best practice and facilitates reproducibility and reuse.

3. We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. 

When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section.

4. We note that you have stated that you will provide repository information for your data at acceptance. Should your manuscript be accepted for publication, we will hold it until you provide the relevant accession numbers or DOIs necessary to access your data. If you wish to make changes to your Data Availability statement, please describe these changes in your cover letter and we will update your Data Availability statement to reflect the information you provide.

5. Please include a new copy of Table 2,4,5 in your manuscript; the current table is difficult to read. Please follow the link for more information: https://blogs.plos.org/plos/2019/06/looking-good-tips-for-creating-your-plos-figures-graphics/

6. Please include a copy of Table 7 and 8 which you refer to in your text on page 15.

7. We note you have included a table to which you do not refer in the text of your manuscript. Please ensure that you refer to Table 9 in your text; if accepted, production will need this reference to link the reader to the Table.

Additional Editor Comments:

I have not sent this out for reviewing yet because you need to provide a cleaner manuscript before bothering reviewers. On the first few pages alone I came across numerous typos and stylistic problems. I got tired of noting these when arriving at p. 8. What I noted up to then is indicated below. I also noted numerous problems in the list of references. See also below. But these are just things I happened to quickly note. It is not the case that you can just take care of these things, you need to be more thorough than that. So you need to carefully revise the manuscript taking care of these presentational issues first. Possibly you need to involve a professional copy-editor. When I get a better presented manuscript I will send it out for review.

Three typos in the abstract:

are shown to reasonably -> are shown to be reasonably

was associated -> were associated

and which -> ???

p. 4, clumsy formulation: Combined, evidence suggests a strong tendency to map certain phonemes with the imagery of size, other than speakers of the Bahnar language (13)

p. 4 could potential deceive -> could potentially deceive

p. 4 correlates with body size -> correlate with body size

p. 6 it remains an open question as to how -> it remains an open question how

p. 6 For example, the name ‘Catherine’: the way this is transcribed (which is wrong) it has four syllables; also, it is stressed on the first syllable, while the text says "non-initial stressed syllable"

p. 7 Kanji characters are logographic characters adopted from early Chinese religious texts: a bit weird statement; they are not somehow extracted from specific texts, but adopted from early Chinese writing

p. 8 Kanji characters is -> Kanji characters are

Ref. 2: what is K.A.?

Ref. 5 is garbled

Ref 11 incomplete

Ref 13 incomplete

Ref 15 incomplete

Ref 23 inconsistent use of capitalization

Ref 38 capitalization

Ref 56 capitalization

Ref 58 Forest -> forests

Ref 60 incomplete?

Ref 63 incomplete

Ref 73 capitalization

Ref 75 extra space

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

PLoS One. 2024 Mar 11;19(3):e0297440. doi: 10.1371/journal.pone.0297440.r002

Author response to Decision Letter 0


24 Oct 2023

Ngai Chun Hau

East Asian Languages and Cultures Department

Indiana University, Bloomington

12th October, 2023

Dr. Søren Wichmann

Academic Editor

PLOS ONE

Dear Dr. Wichmann,

We would like to express our sincere appreciation for the opportunity to revise our manuscript titled "Sound Symbolism in Japanese Names: Machine Learning Approaches to Gender Classification" in response to the valuable feedback provided by the editor. We have carefully considered your comments and suggestions, and we believe that the revised manuscript now adequately addresses the concerns raised prior to the review process.

We have made substantial revisions to the manuscript to improve adherence to PLOS ONE's formatting criteria. Specifically, we have addressed the following key points:

• Presentation and Stylistic Issues: We have thoroughly revised the manuscript to rectify the typos, grammatical errors, stylistic problems, and references pointed out by the editor. We have carefully proofread the entire manuscript, starting from the abstract to the references section, to ensure its accuracy and readability.

• Format Requirements: We have carefully followed PLOS ONE's style requirements, including those for file naming. We have reviewed the PLOS ONE style templates provided at the URLs https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf, and we have adjusted the format of our manuscript accordingly.

• Code Sharing: As per PLOS ONE's guidelines on code sharing, we can assure you that we will make the OSF repository publicly available once the manuscript is accepted. Currently, the code has been uploaded to the public repository https://osf.io/yrx4u/?view_only=d7fc5ef8b4ab449d8aaba591b18fdfc9

• Grant Information and Financial Disclosure: We apologize for the previous discrepancies in the 'Funding Information' and 'Financial Disclosure' sections. We have revised the manuscript to provide accurate grant numbers for the awards we received for our study in the 'Funding Information' section. Furthermore, we have included an updated financial disclosure statement in our cover letter to reflect the necessary changes.

• Tables: We have included new copies of Tables 2, 4, 5, and 7 in the manuscript, following the formatting guidelines provided by PLOS ONE. Additionally, we discovered that the references to Tables 8 and 9 were a mistake caused by an error in table and figure naming. There were no Tables 8 and 9 to begin with; these are in fact Tables 6 and 7, which present the results obtained from Random Forest and XGBoost and can be found on pages 14 and 15, respectively.

• Furthermore, we have attached a photo of Figure 3 in response to the request for additional figure files. This addition provides further clarity and support to our findings.

We would like to emphasize that we have taken great care to address all the comments and suggestions provided by the editor. We believe that the revised manuscript now meets PLOS ONE's publication criteria and contributes significantly to the field of sound symbolism in Japanese names.

Once again, we sincerely appreciate the valuable feedback provided by the editor. We are confident that the revised manuscript will be well-received by the scientific community. Thank you for your time and consideration.

Sincerely,

Ngai Chun Hau

Indiana University, Bloomington

chngai@iu.edu

Attachment

Submitted filename: Response to Reviewers.pdf

pone.0297440.s001.pdf (80.1KB, pdf)

Decision Letter 1

Søren Wichmann

20 Nov 2023

PONE-D-23-25119R1

Sound Symbolism in Japanese Names: Machine Learning Approaches to Gender Classification

PLOS ONE

Dear Dr. Ngai,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The reviewers provide many constructive comments, which are largely complementary, since one reviewer looks more at methods and the other is more focussed on the context of the general study of sound symbolism. I strongly advise paying close attention to the comments of both reviewers. Additionally, below my signature I offer some observations on stylistic issues and typos. Please take those into account as well.

Please submit your revised manuscript by Jan 04 2024 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:

  • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.

  • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.

  • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,

Søren Wichmann, PhD

Academic Editor

PLOS ONE

Additional Editor Comments:

References are to line numbers

54-56:

feature importance is examined to determine whether associations previously reported in English are found in Japanese which suggest that the systematic sound symbolic expressions of gender are universal.

->

feature importance is examined to determine whether associations previously reported in English are found in Japanese since if they are, this would suggest that the systematic sound symbolic expressions of gender are universal.

167: the loanword London /ro.n.do.n/ would be consisted of four mora ->  the loanword London /ro.n.do.n/ would consist of four mora

278: the Altmann methods [65] method -> the Altmann method [65]

325: poison regression -> Poisson regression

332: Poison regression -> Poisson regression

347: typologically unique from -> typologically distinct from

360: femineity -> femininity

369-370: in contrary to -> contrary to

367-378: "Considering that these elements adhere to the mora structure of Japanese, and seldomly encountered in other languages, it is fitting to classify these elements as morphemes." Seems to me a logical non sequitur.

404-406:

"The superior performance of the XGBoost algorithm suggests that the boosting technique, which optimizes an objective function through sequential tree-building, allowing for a more effective capture of complex non-linear relationships within the data." Maybe "allows for" rather than "allowing for" is meant? Something is wrong with this sentence.

414-415: sound symbolism sound-gender associations extend to Japanese names, a language distinct from Indo-European languages. -> sound symbolic sound-gender associations extend to Japanese names, a language distinct from Indo-European languages.

462-464:

Blasi DE, Hammarström H, Stadler PF, Christiansen MH. Sound-meaning association biases evidenced across thousands of languages. Proc Natl Acad Sci. 2016;113(39):10818–23.

->

Blasi DE, Wichmann S, Stadler PF, Hammarström H, Christiansen MH. Sound-meaning association biases evidenced across thousands of languages. Proc Natl Acad Sci. 2016;113(39):10818–23.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)

Reviewer #2: (No Response)

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Reviewer #2: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes

Reviewer #2: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes

Reviewer #2: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: The motivation of the study needs to be explained better. In some research communities (e.g., NLP, machine learning), there was a huge discussion on the ethics of doing machine learning based gender classification in the current age (e.g., see this online discussion: https://www.reddit.com/r/MachineLearning/comments/qmm6uh/d_ethical_concerns_for_ml_to_predict_race_gender/) I am surprised the paper does not even mention such issues. Further, I found the paper to describe a very baseline approach, taking an existing dataset (whose creation process is unclear), and a vague algorithm for feature extraction (my impression is that this is the novel contribution for this paper because data and models are coming from somewhere else). Hence, I felt it needs major revisions.

Three main issues I found problematic are:

1. The dataset: it is clearly taken as is from the website. "Gender is listed as a distribution between male and female." - in what data is that distribution calculated and how? You write that this data was inspected by a Japanese linguist. But, what was the inspection? What did you find out? Is the data 100% correct or are there errors? Is it totally obvious to distinguish male and female names in Japanese? In many languages, there are some gender neutral names and it is hard to guess the gender from name alone. What was your strategy for such cases?

2. The features: You write: "an algorithm was constructed to convert Japanese names into phone counts, the output of the algorithm was checked by the same native Japanese linguist" but no details are given. This seems to be the main original contribution of this paper, considering that the dataset is external, and machine learning approaches are also standard ones. But there is no further detail on this algorithm. What did the manual checking of the output reveal? Were there any disagreements between the algorithm and human evaluation? How much? What is the accuracy of this phone counting program? They should be discussed.

3. Modeling: Why are only Random Forest and XGBoost chosen? Why not other simpler approaches like nearest neighbors or logistic regression etc? Considering the size of your dataset, perhaps nearest neighbors would have worked too. In a paper like this, it would be good to see a comparison with more algorithms. In terms of feature selection too, there are approaches to estimate more predictive features irrespective of the algorithm used (e.g., looking for correlation of a feature with the predicted class). I think those should also be explored to understand the data better. The fold split also seems rather arbitrary. Why can't you just do a 5-fold or 10-fold stratified CV like most researchers report in their papers? Other than these, some error analysis showing where the algorithm failed and how to improve over this baseline approach would be good too.
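
For readers unfamiliar with the stratified cross-validation mentioned in point 3, the idea can be sketched as follows. This is a minimal standard-library illustration on hypothetical toy labels, not the authors' pipeline; the function name `stratified_k_fold` and the 60/40 class split are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions per fold."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin across folds
    # so that every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the remaining folds form the training set.
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


# Hypothetical toy labels (not the study's data): 60 female and 40 male names.
labels = ["F"] * 60 + ["M"] * 40
splits = list(stratified_k_fold(labels, k=10))
```

With these toy labels, every test fold contains 6 female and 4 male names, so each fold mirrors the overall class balance.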

Potential limitations of the paper need to be discussed too, in my opinion.

Reviewer #2: I find this manuscript both interesting and valuable for iconicity research as it demonstrates how relatively new methodologies can be applied to address the challenge of handling increasingly available large datasets. The argumentation is easy to follow, and the language is clear. The statistical analysis also appears sound, although I am not an expert in machine learning, and all data underlying the findings is fully available. My recommendation is that it should be accepted with some minor but crucial revisions.

Main points:

Throughout

Although drawing universal conclusions from the occurrence of iconicity in names used in a single language can be challenging, I would like to stress that thorough language-specific iconicity studies serve the same purpose as descriptions of poorly documented languages. The compiled data and analyses deepen our knowledge of the diversity of iconicity in the linguistic system. This, in turn, helps us gain a clearer picture of the cross-linguistic situation in the field and guides future studies. With this being said, it is also important to clearly point out that the comparison between Japanese and English/Indo-European languages arises from the relatively sparse material on name iconicity research, which is mostly confined to Indo-European languages. Without such clarification, it might sound like English/Indo-European languages are a default when it comes to expected iconic patterns and naming conventions. For example:

“Our current results demonstrate that sound symbolism sound-gender association is also prevalent in Japanese names, a language that is typologically unique from Indo-European languages.”

“In conclusion, our study provides compelling evidence that sound symbolism sound gender associations extend to Japanese names, a language distinct from Indo-European languages.”

Throughout

The predictions made based on the frequency code are fitting and the authors have included an extensive list of relevant literature. The authors also draw some conclusions based on cross-linguistic patterns, such as in: “The observed association of the consonant /m/ with femininity could be attributed not only to the frequency code [15,16], but potentially to the idea of the concept of breast [12] or cuteness [67,68].” I wonder why these are not used as predictions along with the frequency code.

Iconicity in male and female names might appear binary, but the grounds for their associated sounds do not necessarily have to be of the same origin (cf. /m/ > femininity possibly via iconicity in words for ‘breast,’ etc.). Erben Johansson & Cronhamn (2022) tested the presence of iconicity in nominal classification systems, i.e., gender and classifier systems. Based on 344 languages from 212 language families, they found that morphological markings for masculine gender involved front, central, and back vowels, while markings for femininity involved nasal and stop consonants. While not directly name data, the material touches on the same type of male/female distinctions as the manuscript, and the findings align.

Furthermore, markings for human ~grammatical gender (in human/non-human classification systems) were largely associated with the same sound features as feminine. This raises the question of markedness in female and male names as well.

It would also be possible to connect the findings to the language-specific lexicon in general, and possibly to the bouba-kiki effect. References:

Sidhu, D. M., & Pexman, P. M. (2018). Lonely sensational icons: Semantic neighbourhood density, sensory experience and iconicity. Language, Cognition and Neuroscience, 33(1), 25–31. https://doi.org/10.1080/23273798.2017.1358379

Erben Johansson, N. E., & Cronhamn, S. (2022). Vocal iconicity in nominal classification. Language and Cognition, 15(2), 266-291. https://doi.org/10.1017/langcog.2022.36

Sidhu, D. M., Westbury, C., Hollis, G., & Pexman, P. M. (2021). Sound symbolism shapes the English language: The maluma/takete effect in English nouns. Psychonomic Bulletin & Review, 28(4), 1390–1398. https://doi.org/10.3758/s13423-021-01883-3

Page 8

Since this manuscript focuses on iconicity in Japanese, it is important to define what mimetics/ideophones are and their role in the Japanese language, especially in contrast to their status in other languages. This is crucial, considering the frequent comparisons with English and Indo-European languages throughout the text. Additionally, on page 17, “phonestheme” and “phonesthemic” appear without definition and should be clarified.

Page 10

The data used seems sound and the procedure for evaluating the data is well-described, but the authors do not explain why JTALK was chosen in the first place. For convenience, because JTALK is the only available database of Japanese names, or something else?

Page 19-20

The discussion brings up many relevant factors, including morphology and culture. However, while interesting, I find the connection between naming conventions and improvements in women’s rights too speculative without any data to back it up. To draw any conclusions, a comparison to older Japanese names would be needed. Additionally, a shorter paragraph at the beginning of the manuscript summarizing naming conventions from across the world would help contextualize the analyzed material, especially if the authors wish to maintain the point about potential phonological changes in Japanese names over the last decades. For instance, politically motivated names, as seen in Mandarin Chinese, could provide a useful comparison.

Minor points:

Page 3

I think it would benefit the reader if there were a clear linguistic example of sound symbolism in the first paragraph. Not necessarily an example like “bouba-kiki”, but just an association that is cross-linguistically common and can also be found in English or some other global language for familiarity.

Page 3

Since “iconicity” is used in the manuscript, it should be stated whether the authors consider sound symbolism as the same as (vocal) iconicity to avoid confusion. A distinction does not have to be drawn, but in that case, state that you use the terms interchangeably. I also want to mention that I am glad the authors highlighted that the “symbolism” in “sound symbolism” can be misleading.

Page 4

“thatthe” > “that the”

Page 5 and other places

While iconicity research in Pokémon names across languages has increased significantly in recent years, I think these findings should be contextualized. They are created in a manner that is arguably more deliberate than names and words. Therefore, their informative value about iconicity in language ought to be lesser and/or qualitatively different.

Page 8

“Typologically, Japanese differs from many Indo-European languages in many aspects. Japanese is a member of the Japonic language family. Although not limited to Japanese, one of the distinct features of Japanese are…”

The beginning of this paragraph sounds stunted. If the sentence “Japanese is a member of the Japonic language family” were expanded upon, it would likely flow better into the next sentence. For example, “Japanese is a member of the Japonic language family, together with Ryukyuan and Hachijō”, or something similar.

Page 8-9

“alphabets” > “syllabaries”

Hiragana and Katakana are not alphabets.

Page 9-10

Several technical terms are introduced here without description. While some are explained in the method section, others, such as “weak/strong learners”, are not. It would be helpful for the reader if the authors could add a sentence indicating that these terms will be described in detail later in the manuscript.

Page 12

“A partial Latin square revealed 28 possible combinations of subsets, and each combination was used resulting in 28 iterations for each algorithm.”

This sentence should be explained in more detail for less statistics-savvy readers. Why is a partial Latin square used?
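
To illustrate how a fixed number of subset combinations can arise, the sketch below enumerates every way of holding out some subsets from a larger pool. The quoted sentence does not say how many subsets were used or how many were held out, so the values 8 and 2 (which happen to give 28 combinations) and the function name `held_out_splits` are purely hypothetical and should not be read as the authors' design.

```python
from itertools import combinations

N_SUBSETS = 8   # hypothetical: the quoted sentence does not state this number
N_HELD_OUT = 2  # hypothetical: the quoted sentence does not state this number


def held_out_splits(n_subsets, n_held_out):
    """List every (training subsets, held-out subsets) pairing."""
    subsets = range(n_subsets)
    splits = []
    for held_out in combinations(subsets, n_held_out):
        # Everything not held out is used for training in this iteration.
        train = tuple(s for s in subsets if s not in held_out)
        splits.append((train, held_out))
    return splits


splits = held_out_splits(N_SUBSETS, N_HELD_OUT)
print(len(splits))  # 28, since C(8, 2) = 28
```

Under these assumed values, each iteration trains on six subsets and evaluates on the remaining two, and no held-out pair is repeated.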

Discussion/conclusion

The post hoc analysis highlights the role of sound combinations, which I believe is crucial for understanding iconicity in spoken language data. I am not suggesting that the authors redo the entire analysis to include both phonemes and all possible phoneme combinations, but adding a sentence or two about the implications for future studies would be beneficial.

**********

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No

Reviewer #2: Yes: Niklas Erben Johansson

**********

PLoS One. 2024 Mar 11;19(3):e0297440. doi: 10.1371/journal.pone.0297440.r004

Author response to Decision Letter 1


19 Dec 2023

Ngai Chun Hau

East Asian Languages and Cultures Department

Indiana University, Bloomington

15th December 2023

Dear Editor and Reviewers,

We extend our heartfelt gratitude for the invaluable feedback and insightful observations provided during the review process. Your expertise and guidance have been instrumental in refining our manuscript to its current improved state. Below, we have provided detailed responses to your constructive comments, which we have presented indented for clarity and reference.

1. there was a huge discussion on the ethics of doing machine learning based gender classification in the current age (e.g., see this online discussion: https://www.reddit.com/r/MachineLearning/comments/qmm6uh/d_ethical_concerns_for_ml_to_predict_race_gender/) I am surprised the paper does not even mention such issues. Further, I found the paper to describe a very baseline approach, taking an existing dataset (whose creation process is unclear), and a vague algorithm for feature extraction (my impression is that this is the novel contribution for this paper).

We acknowledge the importance of engaging with the ethical dimensions surrounding AI applications, especially those related to gender. A paragraph has been added in the discussion section. In terms of the broader discourse on ethical considerations in machine learning, we recognize the imperative to explicitly address the ethical implications of our gender classification model based on Japanese name phonemes. While our primary focus has been on exploring how sounds express gender in Japanese, concerns regarding potential biases and misuse underscore the need for ongoing vigilance in the development and deployment of such models. We acknowledge the limitation of our model’s accuracy in capturing the evolving nuances of gender identity, as societal perspectives continue to shift towards a more nuanced understanding. In addition to these concerns, it is crucial to highlight that our model operates within a binary classification system of male and female, reflecting the specific context of Japanese names and their phonetic attributes.

2. The dataset: it is clearly taken as is from the website. "Gender is listed as a distribution between male and female." - in what data is that distribution calculated and how? You write that this data was inspected by a Japanese linguist. But, what was the inspection? What did you find out? Is the data 100% correct or are there errors? Is it totally obvious to distinguish male and female names in Japanese? In many languages, there are some gender-neutral names and it is hard to guess the gender from name alone. What was your strategy for such cases?

Thank you for pointing that out. The gender distribution was calculated based on the data available on the website. The dataset was inspected by a Japanese linguist to ensure the transcription and gender classification were accurate. The linguist reported that the transcription was accurate and that, overall, gender classifications were accurate, though they did note a couple of cases of unisex names whose majority classification did not meet their expectations. Given that this was a very small number of edge cases, we made no adjustments to the gender distribution listed on the website. We used the gender listed for each entry when it was available; entries without a listed gender were discarded. We acknowledge that there are some gender-neutral names in Japanese, and we did our best to ensure that the gender classification was accurate. We hope that this explanation addresses the reviewer’s concerns.

3. Why are only Random Forest and XGBoost chosen? Why not other simpler approaches like nearest neighbors or logistic regression etc? Considering the size of your dataset, perhaps nearest neighbors would have worked too. In a paper like this, it would be good to see a comparison with more algorithms. In terms of feature selection too, there are approaches to estimate more predictive features irrespective of the algorithm used (e.g., looking for correlation of a feature with the predicted class). I think those should also be explored to understand the data better. The fold split also seems rather arbitrary. Why can't you just do a 5-fold or 10-fold stratified CV like most researchers report in their papers?

We appreciate the reviewer's thoughtful feedback. While we acknowledge the merit in exploring a broader set of algorithms and alternative fold split strategies, we would like to clarify that the focus of our study was on introducing a novel approach to gender classification based on the Random Forest and XGBoost algorithms, which are known for their ability to capture intricate patterns and nonlinear relationships, providing a more sophisticated modelling approach. Given the specific goals and scope of our research, we believe that conducting further tests with simpler algorithms may dilute the primary focus of our study. We are, however, open to incorporating additional discussions on the limitations of our chosen approach in the revised version. We appreciate the reviewer's understanding and continued engagement with our work. With respect to the selection of folds, we created a Latin square that would provide 28 unique subset splits because we wanted to conduct a statistical test that was familiar to the reader (regression) and needed enough samples to make it a fair test. Given the closeness in accuracy (and SD) for the two methods, 5- or 10-fold models would likely not achieve significance. No edits were made to the manuscript.

4. Other than these, some error analysis showing where the algorithm failed and how to improve over this baseline approach would be good too.

Sound symbolism is stochastic, not prescriptive. Thus, a certain number of errors is to be expected. Misclassified samples are likely not being misclassified because of modelling errors, but rather because those names do not reflect gender sound symbolically. No edits were made to the manuscript.

5. It is also important to clearly point out that the comparison between Japanese and English/Indo-European languages arises from the relatively sparse material on name iconicity research, which is mostly confined to Indo-European languages. Without such clarification, it might sound like English/Indo-European languages are a default when it comes to expected iconic patterns and naming conventions. For example: “Our current results demonstrate that sound symbolism sound-gender association is also prevalent in Japanese names, a language that is typologically unique from Indo-European languages.”

Thank you for pointing that out. These sentences have been rephrased to better reflect the sparsity of the existing literature on names.

6. The predictions made based on the frequency code are fitting and the authors have included an extensive list of relevant literature. The authors also draw some conclusions based on cross-linguistic patterns, such as in: “The observed association of the consonant /m/ with femininity could be attributed not only to the frequency code [15,16], but potentially to the idea of the concept of breast [12] or cuteness [67,68].” I wonder why these are not used as predictions along with the frequency code.

Line 85 was added to include this prediction.

7. Johansson & Cronhamn (2022) tested the presence of iconicity in nominal classification systems, i.e., gender and classifier systems. Based on 344 languages from 212 language families, they found that morphological markings for masculine gender involved front, central, and back vowels, while markings for femininity involved nasal and stop consonants. While not directly name data, the material touches on the same type of male/female distinctions as the manuscript, and the findings align.

Lines 431 and 434 were added to relate our results to the findings of Johansson & Cronhamn (2022). Indeed, the findings of Johansson & Cronhamn (2022) resonate with our study. Erben Johansson & Cronhamn also found that nasals occur more frequently in feminine nouns. Considering the rich imagery invoked by the consonant /m/, the association between /m/ and femininity may be due to a combination of the images or concepts it invokes. The bilabial and nasal qualities of /m/, reminiscent of nurturing and cuteness, may cross-linguistically invoke certain imagery or concepts, transcending cultural and linguistic boundaries.

8. It would also be possible to connect the findings to the language-specific lexicon in general, and possibly to the bouba-kiki effect. References:

Sidhu, D. M., & Pexman, P. M. (2018). Lonely sensational icons: Semantic neighbourhood density, sensory experience and iconicity. Language, Cognition and Neuroscience, 33(1), 25–31. https://doi.org/10.1080/23273798.2017.1358379

Erben Johansson, N. E., & Cronhamn, S. (2022). Vocal iconicity in nominal classification. Language and Cognition, 15(2), 266-291. https://doi.org/10.1017/langcog.2022.36

Sidhu, D. M., Westbury, C., Hollis, G., & Pexman, P. M. (2021). Sound symbolism shapes the English language: The maluma/takete effect in English nouns. Psychonomic Bulletin & Review, 28(4), 1390–1398. https://doi.org/10.3758/s13423-021-01883-3

Lines 402 to 407, 432, and 434 are added to include a discussion related to the bouba-kiki effect. In addition to the observed sound-gender associations in Japanese names, our findings may be further understood in the context of the bouba-kiki effect. This phenomenon, where people non-arbitrarily associate certain sounds with specific shapes, may extend to the perception of names. The systematic sound-meaning mappings we observed could reflect a broader linguistic phenomenon where certain phonetic qualities are generally associated with specific semantic properties, as discussed in Sidhu, Westbury, Hollis, and Pexman.

9. Since this manuscript focuses on iconicity in Japanese, it is important to define what mimetics/ideophones are and their role in the Japanese language, especially in contrast to their status in other languages. This is crucial, considering the frequent comparisons with English and Indo-European languages throughout the text.

In response to your comment, we have revised the manuscript to include a more comprehensive explanation of mimetics, also known as ideophones, in the Japanese language. We have highlighted their vivid depictions of sensations, actions, and various subjective experiences through iconic sound-meaning correspondences. We have also emphasized their unique status in the lexicon due to their often poor grammatical integration. Examples such as ‘goro-goro’ for sounds of rolling and ‘pika-pika’ for shiny or flashing objects have been included to illustrate this concept. Furthermore, we have discussed the likelihood of sound-gender associations being encoded in names, another core domain of the lexicon, given the integral role of mimetics in Japanese.

10. Additionally, on page 17, “phonestheme” and “phonesthemic” appear without definition and should be clarified.

Thank you for pointing out the need for clarification of the terms “phonestheme” and “phonesthemic”. We understand the importance of defining key terms for the reader’s comprehension. In response to your comment, we have added a definition of “phonestheme” in the manuscript. It is now defined as systematic sound-meaning mappings due to shared genealogy [69]. To illustrate this concept, we have included a commonly cited example of the phoneme sequence ‘gl-’ in English words related to light or vision, such as glitter, gleam, and glare [6,70].

11. The data used seems sound and the procedure for evaluating the data is well-described, but the authors do not explain why JTALK was chosen in the first place. For convenience, because JTALK is the only available database of Japanese names, or something else?

In response to your comment, we have clarified in the manuscript why JTALK was chosen. The primary reason for this choice is the way the names are listed on the website. Unfortunately, their native script is not presented on the website, limiting our options. Despite this limitation, we found that the gender classifications were generally accurate. We did encounter a few cases of unisex names whose majority classification did not meet our expectations. However, given that these were very few edge cases, we decided not to adjust the gender distribution listed on the website.

12. The discussion brings up many relevant factors, including morphology and culture. However, while interesting, I find the connection between naming conventions and improvements in women’s rights too speculative without any data to back it up. To draw any conclusions, a comparison to older Japanese names would be needed. Additionally, a shorter paragraph at the beginning of the manuscript summarizing naming conventions from across the world would help contextualize the analyzed material, especially if the authors wish to maintain the point about potential phonological changes in Japanese names over the last decades. For instance, politically motivated names, as seen in Mandarin Chinese, could provide a useful comparison.

Thank you for your insightful comment. We agree with your observation about the speculative nature of the connection between naming conventions and improvements in women’s rights. In response to your feedback, we have removed the paragraph in question from the discussion. We understand the importance of providing a solid foundation for our arguments and ensuring that our conclusions are backed by data. We appreciate your suggestion about including a summary of naming conventions from across the world to contextualize the analyzed material. We will take this into consideration for future research.

13. I think it would benefit the reader if there were a clear linguistic example of sound symbolism in the first paragraph. Not necessarily an example like “bouba-kiki”, but just an association that is cross-linguistically common and can also be found in English or some other global language for familiarity.

Thank you for your suggestion to include a clear linguistic example of sound symbolism in the first paragraph. We agree that this would benefit the reader and provide a more immediate understanding of the concept.

In response to your comment, we have added an example of sound symbolism in the introduction. We chose an example that is cross-linguistically common and can be found in English, for familiarity. Specifically, we discussed the association between certain sounds and the perception of size. For instance, many language speakers associate high front vowels such as [i] and [ɪ] with smallness, while low back vowels, such as [a] and [ɔ], are associated with largeness [2,3].

14. Since “iconicity” is used in the manuscript, it should be stated whether the authors consider sound symbolism as the same as (vocal) iconicity to avoid confusion. A distinction does not have to be drawn, but in that case, state that you use the terms interchangeably. I also want to mention that I am glad the authors highlighted that the “symbolism” in “sound symbolism” can be misleading.

Thank you for your comment regarding the use of the terms “iconicity” and “sound symbolism” in our manuscript. We understand the potential for confusion and the importance of clarity in our terminology.

In response to your comment, we have added a footnote to clarify our usage of these terms. We have stated that in the context of our paper, we make no distinction between sound symbolism and (vocal) iconicity. However, we acknowledge that in certain contexts, vocal iconicity is reserved for the direct imitation of environmental sounds [6].

15. While iconicity research in Pokémon names across languages has increased significantly in recent years, I think these findings should be contextualized. They are created in a manner that is arguably more deliberate than names and words. Therefore, their informative value about iconicity in language ought to be lesser and/or qualitatively different.

Thank you for your comment regarding the use of Pokémon names in iconicity research. We agree with your observation that these names are created in a more deliberate manner than natural language words or names, and therefore, their informative value about iconicity in language might be lesser and/or qualitatively different.

In response to your comment, we have added a footnote in our manuscript to contextualize the findings from Pokémon names. We have noted that while these names did provide support for the frequency code [15,16], they are not entirely equivalent to natural language words or names. This is because they are often created to highlight certain characteristics of the creatures and do not undergo changes.

16. “Typologically, Japanese differs from many Indo-European languages in many aspects. Japanese is a member of the Japonic language family. Although not limited to Japanese, one of the distinct features of Japanese are…” The beginning of this paragraph sounds stunted. If the sentence “Japanese is a member of the Japonic language family” were expanded upon, it would likely flow better into the next sentence. For example, “Japanese is a member of the Japonic language family, together with Ryukyuan and Hachijō”, or something similar.

Thank you for your feedback on the flow of our manuscript. We agree that the paragraph you pointed out could be improved for better readability and coherence.

In response to your comment, we have rephrased the sentences in question. We now start the paragraph by acknowledging the limited scope of previous studies on sound symbolism in the context of names, which have mostly focused on Indo-European languages [37,38,50]. We then transition into discussing our current results, which demonstrate that sound symbolism and sound-gender associations reported in Indo-European languages are also prevalent in Japanese names.

17. “alphabets” > “syllabaries” Hiragana and Katakana are not alphabets.

Changed.

18. Several technical terms are introduced here without description. While some are explained in the method section, others, such as “weak/strong learners”, are not. It would be helpful for the reader if the authors could add a sentence indicating that these terms will be described in detail later in the manuscript.

Thank you for your comment regarding the introduction of technical terms in our manuscript. We understand the importance of clear definitions for reader comprehension.

In response to your comment, we have added explanations for the terms “weak/strong learners” in the context of XGBoost. We have clarified that weaker decision trees are trained on the residuals or errors of stronger decision trees. This process emphasizes areas where proficient decision trees exhibit deficiencies, with the goal of rectifying those specific errors. This collaborative optimization contributes to the overall model by refining its ability to address diverse scenarios and minimizing prediction errors.
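The residual-fitting idea described above can be illustrated with a toy example. The sketch below is not the XGBoost library and not the authors' code. It uses the common textbook formulation of boosting, in which each new weak learner (here a one-split decision stump) is fitted to the residuals of the ensemble built so far. The data, the function names `fit_stump` and `boost`, and the parameter values are all hypothetical. XGBoost itself additionally uses second-order gradient information and regularization, which this sketch omits.

```python
# Toy gradient boosting with decision stumps (squared-error loss).
# Illustrative only: hypothetical data and names, not the XGBoost API.

def fit_stump(xs, residuals):
    """Return a one-split predictor that best fits the residuals."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue  # a split must leave items on both sides
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Add stumps one at a time, each fitted to the current residuals."""
    base = sum(ys) / len(ys)          # start from the mean prediction
    preds = [base] * len(xs)
    stumps, errors = [], []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, stumps, errors

# Hypothetical one-feature data with binary targets.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 1, 0, 1, 1, 1]
base, stumps, errors = boost(xs, ys)
print(errors[0], errors[-1])  # training error after the first and last round
```

Each stump on its own is a poor predictor, but because every round targets what the ensemble still gets wrong, the training error shrinks as rounds are added.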

19. “A partial Latin square revealed 28 possible combinations of subsets, and each combination was used resulting in 28 iterations for each algorithm.” This sentence should be explained in more detail for less statistics-savvy readers. Why is a partial Latin square used?

Thank you for your comment regarding the use of a partial Latin square and the need for more detailed explanation. We understand the importance of making our methodology accessible to readers with varying levels of statistical knowledge.

In response to your comment, we have expanded upon our explanation of the use of k-fold cross-validation and the rationale behind the selection of a 3:1 split for training and testing subsets. We clarified that decision tree-based algorithms are prone to overfitting when dealing with datasets that have many null values [56], hence the use of k-fold cross-validation. In this method, the data is split into folds which are then recombined to create multiple testing and training subsets.

We also explained that our study used a Latin square to combine all subsets, revealing 28 possible combinations. Each combination was used, resulting in 28 iterations for each algorithm. This number of folds was selected to ensure an adequate sample size for the statistical tests that explore accuracy differences between the Random Forest and XGBoost algorithms.
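As a rough illustration of how recombining folds can yield 28 train/test iterations, the sketch below enumerates fold combinations. The response does not state how many folds were created or how many were held out, so the values here are assumptions: 8 folds with 2 held out per iteration is one arrangement that produces both 28 combinations and a 3:1 training-to-testing ratio. The function name `make_splits` and the item count of 80 are hypothetical, and this is not the authors' code.

```python
from itertools import combinations

N_FOLDS = 8  # assumption: the number of folds is not stated in the source
N_TEST = 2   # assumption: folds held out for testing in each iteration

def make_splits(n_items, n_folds=N_FOLDS, n_test=N_TEST):
    """Return (train, test) index lists for every choice of held-out folds."""
    # Assign items to folds in round-robin order.
    folds = [list(range(i, n_items, n_folds)) for i in range(n_folds)]
    splits = []
    for test_ids in combinations(range(n_folds), n_test):
        test = [i for f in test_ids for i in folds[f]]
        train = [i for f in range(n_folds) if f not in test_ids
                 for i in folds[f]]
        splits.append((train, test))
    return splits

splits = make_splits(80)   # 80 hypothetical items
print(len(splits))         # 28
train, test = splits[0]
print(len(train), len(test))  # 60 20
```

Under these assumptions, every iteration trains on six folds and tests on the remaining two, and no item appears in both subsets of the same iteration.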

We appreciate your thorough assessment of our work and the time and effort you have invested in ensuring the quality and rigor of our research. All other suggested edits in terms of wording and spelling are also incorporated. Your input has significantly contributed to the enhancement of our manuscript. The second reviewer made methodological suggestions regarding collecting additional data from fictional sources. While we feel that their concerns are valid, we have not undertaken any additional analyses as we feel that this would be outside the scope of the present manuscript. We have, however, addressed their concerns in an additional paragraph in the conclusion section where we discuss future directions for this research.

Thank you once again for your dedication to scholarly excellence.

Sincerely,

Ngai Chun Hau

Indiana University, Bloomington

chngai@iu.edu

Attachment

Submitted filename: Response to Reviewers.docx

pone.0297440.s002.docx (28.3KB, docx)

Decision Letter 2

Søren Wichmann

5 Jan 2024

Sound Symbolism in Japanese Names: Machine Learning Approaches to Gender Classification

PONE-D-23-25119R2

Dear Dr. Ngai,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.

Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,

Søren Wichmann, PhD

Academic Editor

PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Acceptance letter

Søren Wichmann

1 Mar 2024

PONE-D-23-25119R2

PLOS ONE

Dear Dr. Ngai,

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now being handed over to our production team.

At this stage, our production department will prepare your paper for publication. This includes ensuring the following:

* All references, tables, and figures are properly cited

* All relevant supporting information is included in the manuscript submission

* There are no issues that prevent the paper from being properly typeset

If revisions are needed, the production department will contact you directly to resolve them. If no revisions are needed, you will receive an email when the publication date has been set. At this time, we do not offer pre-publication proofs to authors during production of the accepted work. Please keep in mind that we are working through a large volume of accepted articles, so please give us a few weeks to review your paper and let you know the next and final steps.

Lastly, if your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

If we can help with anything else, please email us at customercare@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff

on behalf of

Dr. Søren Wichmann

Academic Editor

PLOS ONE

Associated Data

    This section collects any data citations, data availability statements, or supplementary materials included in this article.

    Supplementary Materials

    Attachment

    Submitted filename: Response to Reviewers.pdf

    pone.0297440.s001.pdf (80.1KB, pdf)
    Attachment

    Submitted filename: Response to Reviewers.docx

    pone.0297440.s002.docx (28.3KB, docx)

    Data Availability Statement

    All Sound Symbolism in Japanese Names: Machine Learning Approaches to Gender Classification files are available from the OSF database: https://osf.io/yrx4u/.


    Articles from PLOS ONE are provided here courtesy of PLOS
