Abstract
Cross-cultural perception of musical similarity is important for understanding musical diversity and universality. In this study we analyzed cross-cultural music similarity ratings on a global song sample from 110 participants (62 previously published from Japan, 48 newly collected from musicians and non-musicians from north and south India). Our pre-registered hypothesis that average Indian and Japanese ratings would be correlated was strongly supported (r = .80, p < .001). Exploratory analyses showed that ratings from experts in Hindustani music from the north and Carnatic music from the south showed the lowest correlations (r = .25). These analyses suggest that the correlations we found are likely due more to shared musical exposure than to innate universals of music perception.
Keywords: Cantometrics, cross-cultural, intra-cultural, perception, similarity
Introduction
Musical similarity is a central concept in music cognition and music information retrieval (Casey et al., 2008; Flexer & Grill, 2016; Müllensiefen & Frieler, 2004; Volk et al., 2016). However, like other aspects of music cognition, empirical data about music similarity has been largely restricted to perception of similarity in Western music by Western listeners (Jacoby et al., 2020). One of the few exceptions investigated perceptual similarity ratings by Japanese participants in a global sample of 30 songs (Daikoku et al., 2022). However, this study did not collect data from listeners from different societies and so was unable to address questions about the degree of diversity and universality in music similarity perception within and between societies. Without a cross-cultural understanding of such musical diversity and universality, we cannot know if or how musical data collected from one society can be generalized to other societies, limiting our scientific understanding of the role and value of music in human society (Jacoby et al., 2020).
While some large research teams have managed to collect data at global scale to address similar questions for other domains of music perception (e.g., rhythm [Jacoby et al., 2021], infant-directedness [Hilton et al., 2022]), we chose to focus on a regional scale by adding data from musicians and non-musicians from India. India has very different musical and cultural traditions from Japan and also has substantial internal musical diversity (Arnold, 2000; Daikoku et al., 2020). Historical migration in India created a linguistic and cultural divide between the north and the indigenous southern societies. While the north mainly speaks so-called “Indo-Aryan” languages like Hindi, Punjabi, Bengali, Rajasthani, and Gujarati, the south generally speaks “Dravidian” languages like Kannada, Tamil, Malayalam, and Telugu.¹ North–south distinctions are prominent in Indian classical music, with Hindustani classical music in the north and Carnatic music in the south (Arnold, 2000). This well-documented internal variability also gave us a chance to explore diversity at two levels: between cultures and within cultures (Daikoku et al., 2020; Rzeszutek et al., 2012). Note that there is no single agreed-upon definition of “culture” (Jacoby et al., 2020), and our analysis includes hierarchical, multi-layered groupings (e.g., northern and southern sub-cultures within a broader Indian macro-culture, each of which can be further divided into sub-sub-cultures with and without training in local musical traditions).
In this study we asked 48 raters from India to rate musical similarity from a global sample of 30 songs chosen from Alan Lomax’s Cantometrics Training Tapes (Lomax, 1976; Wood, 2020). We compared these results with data previously collected on the same set of songs from 62 Japanese participants (Daikoku et al., 2022) to measure the degree of agreement in similarity perception within and between cultures. We pre-registered our experimental design and our hypothesis that average similarity ratings among Japanese and Indian raters would be correlated, due to a combination of universals of music perception and shared experience listening to globalized music (Daikoku et al., 2022).
The following section of this paper describes the sample, participants, experiment design, and hypotheses, including the various logistical constraints needed to keep experiment times feasible. Then we describe our confirmatory (pre-registered) and exploratory results. Finally, we explore possible causes for differences in similarity perception within and between cultures using qualitative insights from expert musicians.
Methods and Hypothesis
Musical Sample
We chose the same 30 recordings used by Daikoku et al. (2022) to compare perceptual responses between Japanese and Indian raters on a diverse set of global recordings. The 30 recordings consist of a collection of traditional songs from Alan Lomax’s Cantometrics Consensus tapes (Lomax, 1976; Wood, 2020). The recordings were chosen to represent cultural, stylistic, and linguistic diversity to capture the full breadth of categories under Lomax’s Cantometric system.² The recordings include a cappella and accompanied samples of both group and solo performances, and were chosen for their representativeness in covering a global range of styles and categories (Figure 1). These recordings were originally designed to test human raters after being trained to use Lomax’s 37-feature Cantometric rating system (Lomax, 1968; Savage, 2018; Wood, 2020; Wood et al., 2022). We used 10 s excerpts of these 30 recordings from Daikoku et al. (2022) to allow us to directly compare results (10 s excerpts were originally chosen to ensure the experiments could be conducted within a feasible time). The 10 s excerpts were randomly sampled from each track (in cases where this random excerpt happened to include only instrumental music, these were resampled until all tracks included a comparable section of vocals). Confirmation was carried out manually by the first author.
Figure 1. Map (adapted from Daikoku et al., 2022) showing the approximate locations and cultural group names of the 30-song sample from the Cantometrics Consensus Tape.
Songs are labeled by solo vs. group singing and with vs. without instrumental accompaniment. For further details about each track, see Wood (2020).
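The excerpt selection described above amounts to a simple rejection-sampling loop. The following is a minimal sketch under stated assumptions, not the script used in the study: `has_vocals` is a hypothetical callback standing in for the manual check by the first author, and the track duration and vocal boundaries are invented for illustration.

```python
import random

EXCERPT_SECONDS = 10.0

def sample_excerpt(duration, has_vocals, rng, max_tries=1000):
    """Return (start, end) of a random 10 s window accepted by has_vocals.

    has_vocals is a hypothetical stand-in for the manual confirmation
    described in the text; it is not part of any real library.
    """
    if duration < EXCERPT_SECONDS:
        raise ValueError("track is shorter than the excerpt length")
    for _ in range(max_tries):
        start = rng.uniform(0.0, duration - EXCERPT_SECONDS)
        end = start + EXCERPT_SECONDS
        if has_vocals(start, end):
            return start, end
    raise RuntimeError("no excerpt with vocals found")

# Toy usage: pretend vocals occur only between 20 s and 60 s of a 90 s track,
# so a window is accepted when it overlaps that interval.
rng = random.Random(0)
start, end = sample_excerpt(90.0, lambda s, e: s < 60.0 and e > 20.0, rng)
```

Fixing the random seed makes the chosen excerpt reproducible, which matters when the same excerpts must be reused across studies.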
Participants
Participants were chosen from four categories, based on region and musical experience: North Indian Musician, South Indian Musician, North Indian Non-musician and South Indian Non-musician. “Musicians” were limited to those who had spent a minimum of five years as a performing musician trained in either Hindustani (north Indian music) or Carnatic (south Indian music) traditions. “Non-musicians” were limited to those with less than one year of formal music lessons. Regional distinctions were made based on geographic location (North: lived/living in Delhi, Haryana, Uttarakhand, Punjab, Himachal Pradesh, Rajasthan, Jammu or Kashmir; South: lived/living in Tamil Nadu, Kerala, Karnataka, Andhra Pradesh, or Telangana) and/or native tongue (North Indian [47% of the population]: Indo-Aryan languages, such as Hindi or Punjabi; South Indian [39% of the population]: Dravidian languages, such as Tamil, Telugu, Kannada, or Malayalam). While Daikoku et al.’s experiment had the option of English or Japanese, the current experiment only used the English version since all participants in the India study spoke English fluently. As most Indians are bilingual, their native language(s) would be their regional vernacular (e.g., Hindi, Tamil, Telugu, etc.) and English. Gender distribution was 58% male and 41% female. Selection criteria are detailed in the preregistration (see Appendix A). Power analysis³ indicated a sample size of N = 48 (North Indian Musicians = 12; North Indian Non-musicians = 12; South Indian Musicians = 12; South Indian Non-musicians = 12) for the sample of 30 recordings. Refer to Figure 2 for a detailed breakdown of sex and language by group. Note that we did not explicitly collect ages of any participants for this new experiment, and Daikoku et al. (2022) did not collect data on participant age, musicianship, or regional origin.
Figure 2. Breakdown of self-reported sex (left) and native language (right) by participant group.
For simplicity, the various Indian languages reported are grouped into higher-level family/sub-family groupings based on classifications in Glottolog (Hammarström et al., 2021). We emphasize that these groupings strictly refer to the traditionally accepted language families/sub-families (“Indo-Aryan” and “Dravidian”) and do not imply any real or imagined races.
Experiment Design
We used the same basic experimental design as Daikoku et al. (2022), who divided a set of 30 diverse global recordings (Figure 1) into 6 sets of 5 recordings to collect perceptual judgments of musical features for each recording and similarity ratings for all possible pairs of recordings from within each of the 6 sets. Like Daikoku et al. (2022), we divided the participants into 6 groups (Figure 3), where all members within each group rated the same 5 songs from the 30-song sample. Participants were assigned randomly but distributed evenly so that eight Indian participants (two from each of the four sub-groups) contributed similarity ratings for any given pair of songs. Tracks were played in a random order for each participant, and participants within each sub-group were randomly assigned to each group of songs. Table 1 summarizes participants’ listening preferences, sex, and native language, and Table 2 shows the labels used for the low, middle, and high values of each feature.
Figure 3.
The experiment is divided into three parts, rating each song by individual features (on a three-point scale; Part I), rating similarity for pairs of songs (five-point scale; Part II), and then a demographic survey (e.g., age, gender, musical experience).
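The balanced random assignment described above (two participants from each of the four sub-groups in each of the six song sets) can be sketched as follows. This is an illustrative reconstruction rather than the procedure actually used; the participant indices and the seed are hypothetical.

```python
import random
from collections import Counter

SUBGROUPS = [
    "North Indian Musician",
    "North Indian Non-musician",
    "South Indian Musician",
    "South Indian Non-musician",
]
N_SONG_SETS = 6     # six sets of five songs
PER_SUBGROUP = 12   # participants per sub-group

def assign_participants(seed=0):
    """Map each (sub-group, participant index) to one of six song sets,
    with exactly two participants per sub-group in every set."""
    rng = random.Random(seed)
    assignment = {}
    for subgroup in SUBGROUPS:
        ids = list(range(PER_SUBGROUP))
        rng.shuffle(ids)  # random order within the sub-group
        for position, pid in enumerate(ids):
            # Dealing shuffled participants round-robin keeps the sets balanced.
            assignment[(subgroup, pid)] = position % N_SONG_SETS
    return assignment

assignment = assign_participants()
raters_per_set = Counter(assignment.values())
per_cell = Counter((sg, s) for (sg, _), s in assignment.items())
```

Shuffling within each sub-group and then dealing participants out in turn gives random membership while guaranteeing eight raters per song set.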
Table 1. Full summary of participant listening preferences, sex, and native language.
| Demographic | Musical preference: Indian genres | Musical preference: Western genres | Gender: Male | Gender: Female | Native language: Indo-Aryan | Native language: Dravidian | Native language: English | Native language: Japanese |
|---|---|---|---|---|---|---|---|---|
| North Indian Musician | Bollywood, Hindustani Classical, Contemporary fusion, Indian Classical, Ghazals, World Music | K-Pop, Rock, Blues R&B, Progressive, Rock, Psychedelic Rock, Progressive Metal, Hip Hop, Western Classical, Metal, Fusion, Jazz, DnB, Pop | 11 | 1 | 12 | 0 | 0 | 0 |
| North Indian Non-musician | Bollywood, Ghazal, Classical, Old Hindi Music, Punjabi | Classic Rock, Alternative, Hip Hop, Pop, Loft, Soft Rock, Western Classical | 7 | 5 | 8 | 0 | 4 | 0 |
| South Indian Musician | Carnatic Music, Indian Classical, Hindustani, Bollywood, Devotional, World Music, Indian Folk, Film Music, Tamil Film Music, Fusion | Philharmonic, Western Classical, English Pop, Rock, EDM, Indie | 7 | 5 | 0 | 12 | 0 | 0 |
| South Indian Non-musician | Bollywood, Indian Pop, Instrumental Classical, Indian Classical, Ghazals, Indie | K-Pop, Western Classical, Rock, Blues, Ambient, Instrumental, Slow, Hip Hop, Jazz, Pop, Electronic Music | 7 | 5 | 3 | 7 | 2 | 0 |
| Japanese | J-Pop, Japanese Classical, Vocaloid, Japanese Rock, Game, Anime | Rock, Classic, Jam, Trip-Hop, Reggae, Punk Rock, Jazz, Blues, Funk, Soul, Acoustic, R&B, 70s USA, Hard Rock, Afro Beat, Swing, Orchestral Music, Brass band, EDM, A Capella, Techno, House, Electronica, Alternative Rock, Progressive Rock, Metal | 52 | 10 | 0 | 0 | 1 | 62 |
Table 2. Labels used for low, middle, and high values of each feature.
| Feature | Low | Middle | High |
|---|---|---|---|
| Ornamentation | Unornamented | Moderately ornamented | Very ornamented |
| Grooviness | Don’t want to dance | Moderately want to dance | Want to dance a lot |
| Familiarity | Unfamiliar | Somewhat familiar | Very familiar |
| Liking | Dislike | Ambivalent | Like |
| Consonance | Dissonant | Neutral | Consonant |
| Valence | Negative | Neither | Positive |
| Excitement | Calming | Moderately | Exciting |
| Tempo | Slow | Standard | Fast |
| Rhythmic regularity | Very irregular | Moderately regular | Very regular |
| Vocal range | Narrow range | Moderate range | Wide range |
| Vocal tension | Very relaxed | Moderate | Very tense |
| Vocal texture | Solo | Some singers | Many singers |
| Sound quality | Low quality | Standard quality | High quality |
Following Daikoku et al. (2022), recordings were selected from the Cantometrics training tapes available on the Global Jukebox (Wood et al., 2022), which include audio examples for each of the 37 stylistic features in the Cantometrics classification system. Daikoku et al. selected and adapted a small number of Cantometric variables that covered a variety of domains (rhythm, melody, texture, etc.) and were easily condensable into three-point scales. They also collected (although did not previously publish) data on aesthetic features, which were chosen less systematically based on commonly used features from the music cognition literature.
Following Daikoku et al. (2022), our experimental paradigm divided raters into 6 groups who each evaluated all 10 possible pairwise similarities in sets of 5 songs at a time. This left us with an incomplete similarity matrix of 60 pairwise similarities in a set of 30 songs (Figure 4), rather than a full 30×30 song similarity matrix of 435 pairwise comparisons (which would not have been feasible to collect).
Figure 4.
Red shaded cells in the bottom left half of the partial similarity matrix show average pairwise distance evaluations from Daikoku et al.’s (2022) Japanese participants collected previously on a five-point scale normalized to range from 1 (very similar) to 0 (very different). Blue shaded cells on the top right half show the matching data from Indian raters collected in the current study. To keep the experiment length feasible, participants were assigned to 1 of 6 groups, with each group evaluating all possible pairs of a 5-song subset from the full 30-song sample.
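The pair counts behind this partial design can be checked with a few lines of arithmetic. In the sketch below, the contiguous split of track numbers into sets is an assumption made only to count pairs; the actual allocation of tracks to sets is the one shown in Figure 4.

```python
from itertools import combinations
from math import comb

songs = list(range(1, 31))  # track numbers 1..30
# Assumed split into six sets of five; used here only for counting.
song_sets = [songs[i:i + 5] for i in range(0, 30, 5)]

# Pairs are rated only within a set, never across sets.
rated_pairs = [pair for s in song_sets for pair in combinations(s, 2)]

pairs_per_set = comb(5, 2)     # pairs within one 5-song set
collected = len(rated_pairs)   # pairs actually rated
full_matrix = comb(30, 2)      # pairs in a complete 30-song comparison
coverage = collected / full_matrix
```

Under this design roughly one in seven of all possible song pairs is rated, which is the trade-off that keeps each session to 20–30 min.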
Each experiment lasted approximately 20–30 min and was divided into two blocks: feature evaluation and pairwise evaluation. The one difference from Daikoku et al. (2022) was that in this experiment we did not collect triplet (odd-one-out) evaluations, because Daikoku et al. found these were less reliable and informative than the feature and pairwise similarity data. Before beginning the experiment, participants were presented with a series of reference tracks taken from the Cantometrics training tapes (Lomax, 1976) to familiarize them with the features they would be rating and the types of recordings they would be asked to rate. Participants then evaluated a set of features for each song after listening to each song at least once, after which they performed the pairwise similarity task.
Although the experiment interface was accessed online, all participants were monitored in real-time to maximize data quality. After the experiment was over, participants were asked to fill out a questionnaire asking them about their age, gender, musical experience, preferences, and exposure to non-Western music.
We collected responses from 12 North Indian musicians, 12 South Indian musicians, 12 North Indian Non-musicians and 12 South Indian Non-musicians to meet our pre-registered target of 48 participants equally balanced across these four categories (Daikoku et al., 2022). Reaching these pre-registered sample targets required that we conduct experiments with a total of 65 participants, as data from 17 participants had to be discarded due to data corruption or failure to complete the experiment.
Results
Confirmatory Results
We found a strong and significant positive correlation between average pairwise similarity ratings from Indian raters and Japanese raters (r = 0.80, p_Mantel < .001, 1,000 permutations; Figure 4), confirming our pre-registered prediction. Note that our pre-registered analysis design (Daikoku et al., 2022) used Mantel’s (1967) permutation test to account for non-independence in distance matrix data.
To examine some similarities and differences, take track 16 (Zuni rain dance) and track 18 (Māori love song [waiata aroha]), which both feature unison, a cappella singing with long, rhythmically irregular phrases. In Figure 5 we see that both Indian and Japanese raters tended to rate these as one of the most similar pairs, although the absolute similarity level differs (Japanese raters tended to rate them as “quite similar,” while Indian raters tended to rate them only as “similar” or “somewhat similar”). Conversely, both Indian and Japanese raters tended to rate both songs as very different (“not at all similar”) from track 15 (Trinidad calypso), which features solo, accompanied singing with short, rhythmically regular phrases. In Figure 7 we can see that the Japanese raters tended to take a more conservative approach to rating similarity overall, whereas the Indian raters tended to use the extremes of the scale.
Figure 5. Average pairwise ratings of similarity between Indian raters and Japanese raters show a strong correlation (r = 0.80, p_Mantel < .001).
Similarity ratings for 60 pairs among 30 songs (cf. Figure 4) on a five-point scale are averaged and normalized so that “very similar” is 0 and “not at all similar” is 1. Statistical significance is assessed using Mantel’s (1967) permutation test to account for non-independence among data points.
Figure 6.
A heatmap of the correlation coefficients between five subgroups of participants: North Indian Hindustani Musicians, South Indian Carnatic Musicians, North Indian Non-musicians, South Indian Non-musicians, and Japanese Non-musicians. In the Indian dataset, each pairwise rating is based on two raters per subgroup. Each cell shows the r value between the groups in the corresponding row and column.
Exploratory Results
The following analyses are exploratory, and thus we report only descriptive statistics, not inferential statistical significance testing. First, we compared the various sub-groups of participants. We found that, while all groups had a net positive correlation, north Indian expert musicians and south Indian expert musicians had a relatively lower correlation (r = 0.25) as shown in the heatmap plot (Figure 6). Overall, most groups tend to have a relatively high correlation with Japanese raters.
When plotting the average rating distributions by group as a density function in Figure 7, we found that Japanese raters tended to have a slightly lower average density centered at around 0.65, suggesting that their judgements are slightly less extreme on average.
Figure 7. Average similarity ratings plotted as a probability distribution function.
To test whether the results in Figure 6 could be due to the lower sample size in each subgroup rather than to group differences, we re-ran our main analysis of averaged Indian data, randomly resampling only two Indian participants per group at a time (rather than averaging all eight participants per group) and calculating the correlation between their averaged responses and the Japanese ratings. Figure 8 shows the distribution of these correlations across 200 resamples, with an average correlation coefficient of 0.69.
Figure 8.
Distribution of observed correlations between average Indian and Japanese data when Indian data are created by randomly resampling and averaging ratings from only two of the eight available participants. The mean correlation coefficient of r = 0.69 is lower than the value of 0.80 observed when using all eight participants.
Figure 9 explores the degree to which overall similarity ratings are predicted by ratings on individual features (e.g., consonance, rhythmic regularity) by comparing Euclidean distance matrices calculated from each individual feature with overall ratings of similarity. Correlation coefficients are all relatively low (r < .3). In general, few strong differences between features were observed across the groups, with a few possible exceptions discussed in the discussion section.
Figure 9.
A comparison between average distances from individual feature ratings from the first part of the experiment and average similarity ratings from the second part of the experiment. Features collected include aesthetic features (Grooviness, Familiarity, Liking, Sound Quality, Consonance, Valence and Excitement) and stylistic features (Vocal Range, Ornamentation, Tempo, Rhythmic Regularity, Vocal tension and Vocal Texture).
Discussion
Previously, Daikoku et al. (2022) found a low correlation between perceptual ratings of similarity from Japanese raters and automated algorithms but an overall higher degree of inter-rater reliability between participants. In this study we hypothesized that these ratings would be correlated cross-culturally due both to universals in music perception and to shared exposure to world music (Daikoku et al., 2022; Jacoby et al., 2020, 2021). Our results show an overall high correlation (r = 0.80, p < .001) between average similarity ratings from Japanese and Indian raters. The high correlation between Japanese and Indian raters could be due to a number of factors, including a shared exposure to Western music, lack of familiarity with world music, and the presence of universals in music perception. However, this confirmatory analysis alone cannot distinguish whether the correlation is due to innate universals or is driven by cultural exposure.
We see an overall moderately high (r = .48–.63) correlation between Japanese raters and sub-groups of Indian raters in Figure 6 in musicians and non-musicians alike. Jacoby et al.’s (2021) global analysis of rhythm perception suggests that local musicians (“bimusical” participants trained in traditional, non-Western styles in addition to general exposure to globalized Western music) tend to perceive music differently from university students from the same country without such bimusicality, who instead tend to have similar music perception to university students in other countries, presumably due to shared exposure to globally dominant popular and classical music. We suspect that the similarities we found in perception of musical similarity may also result from such shared exposure. Had the similarities in ratings been due to innate universals in music perception, we would have expected the correlations in ratings to be consistent across all sub-groups. Instead, we found that the two groups with the strongest exposure to different local musical traditions (Hindustani and Carnatic musicians) had the most different ratings (r = .25; Figure 6), implicating musical exposure as the more likely mechanism.
Such exposure can be seen in the participants’ reporting of their own listening habits (Table 1). In general, nearly all Indian participants reported that they actively listened to some form of Western-based popular music, barring the South Indian classical musicians, who listened almost exclusively to Carnatic music. However, Western elements in Indian Bollywood and Japanese J-pop are so widespread that some passive exposure is inevitable.
The lower correlations between different sub-groups within India than between India and Japan are consistent with prior research showing greater variability within than between cultures (Daikoku et al., 2020; Mehr et al., 2019; Rzeszutek et al., 2012). However, such variation may be partly due to the fact that we explicitly sought to sample participants with maximum differences in musical exposure within India (i.e., expert musicians trained in distinct Hindustani and Carnatic traditions) and may also be affected by the unbalanced sample sizes that were smallest for the Indian sub-groups (see discussion below). Future studies sampling expert Japanese musicians and sampling a broader range of musical experience may provide further nuance about the degree of within- vs. between-society variation.
When comparing correlations between individual feature ratings and perceptual similarity ratings across groups, our results are inconclusive, yet might prove useful for future research. Japanese raters seem to base their judgements slightly more on certain features, such as rhythmic regularity, while Indian raters appeared to rely slightly more on other features, such as ornamentation, which is highly emphasized in Carnatic music training. In some cases, experts and non-experts seem to have opposite correlations with similarity measurements. However, we emphasize that all these correlations are relatively weak and that these differences are minor and speculative, based on post-hoc exploratory analyses.
One of the larger limitations of this study is the small amount of data underlying the similarity ratings of the Indian sub-groups, for which each similarity rating is based on averaging only two participants. Our bootstrap resampling analysis (Figure 8) showed that re-running the correlations with Japanese data while randomly sampling only two of the eight Indian participants reduced the overall correlation from 0.80 to 0.69. This suggests that unbalanced sample sizes could be affecting some of our exploratory results shown in Figure 6. This issue does not affect our main confirmatory results (which simply confirmed our predicted positive correlation between Indian and Japanese ratings) or exploratory comparisons between different Indian sub-groups (as these all have the same sample size). However, unbalanced sample sizes might contribute to the lower correlations observed among these subgroups when compared with the larger aggregated Indian or Japanese samples. Future analyses focused on the different sub-groups would require larger and more balanced samples to be able to make strong conclusions about variation among sub-groups.
Some of the other limitations of this study involve the participant sample. Raters from Japan were almost all non-musicians employed at a single company (Yamaha), whereas raters from India included both expert musicians and non-musicians from more diverse locations and backgrounds. In future we would like to collect additional ratings, especially from expert Japanese musicians, for a more balanced comparison of the role of musical training.
Furthermore, the ratings from Indian raters were from urban/upper class Indians, who only represent a minority of all demographics in India. Future studies including Japanese expert musicians and participants from other Western and non-Western populations might help us to answer broader questions about cross-cultural universality and diversity of music perception, and perhaps to apply this knowledge through music similarity algorithms and localized instruments that are tailored to match such diversity and universality (Gómez et al., 2013; Tzanetakis et al., 2007).
Acknowledgements
We would like to thank Takuya Nakata, Jun Murai, Eiji Murata, Takashi Mori, Minoru Kitamura, Takuya Fujishima, Rento Tanase, Shinichi Ito, Tsukasa Yamashita, Masahiro Sugiura, Yasushi Sakurai, Hideki Sakanashi, and Motoichi Tamura for their support and feedback. We thank Delton Ding, Yuto Ozaki, Sam Passmore, Christopher Micheltree, Yuchen Yuan, Gakuto Chiba, Jiei Kuroyanagi and Shafagh Hadavi of the Comp Music Lab for giving feedback and doing pilot experiments. We thank Nao Tokui and the other faculty at Keio SFC who provided feedback and comments on this research. We thank Kaori Nogata, Rachna Jaggi, Manabu Ando and the staff at Keio University who helped run this project smoothly. We would also like to thank Alan Lomax and Anna Lomax Wood for setting up and publishing the Global Jukebox dataset and Cantometrics training tapes. We thank many Twitter commenters for providing discussion about appropriate language family terminology, particularly Ashish Kulkarni, whose suggestion we have adopted here (https://twitter.com/agenetics1/status/1675966744958222336).
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by funding from the Yamaha Corporation; Japanese Government, administered by the Japan Society for the Promotion of Science (KAKENHI Grant-in-Aid #19KK0064), and the New Zealand Government, administered by the Royal Society Te Apārangi (Rutherford Discovery Fellowship 22-UOA-040 and Marsden Fast-Start Grant 22-UOA-052).
Footnotes
Action Editor
Ian Cross, University of Cambridge, Faculty of Music.
Peer Review
One anonymous reviewer; Richard Widdess, SOAS University of London, Department of Music.
Author Contributions
Conceptualization/pre-registration: PES, HD, TS, SF, SH. Data curation: HD. Formal analysis: HD. Investigation/participant recruitment: HD, TS, SH. Methodology: HD, PES, TS. Visualization: HD, PES. Writing – original draft: HD, PES. Writing – review & editing: TS, SF, SH. Funding acquisition: PES, SF, TS.
Declaration of Conflicting Interests
The author(s) declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Coauthor TS is employed by the Yamaha Corporation, who funded this study. This has no effect on our interpretation, but we are declaring this in the interest of transparency.
Ethical Approval
Permission to perform this study was granted by the Keio University Shonan Fujisawa Campus Research Ethics Committee (no. 211) to PES. All participants provided informed consent and were offered payment for participation.
1. We emphasize that these “Indo-Aryan” and “Dravidian” groupings strictly refer to the traditionally accepted language families/sub-families (Hammarström et al., 2021), and do not imply any real or imagined races.
2. While other studies have aimed to use balanced samples that matched one or more participant groups’ cultural backgrounds to compare experiences with familiar vs. non-familiar musics (Balkwill & Thompson, 1999; Balkwill et al., 2004; Hadavi et al., 2023), the goal of this study is to measure perceptions of cross-cultural music similarity in a global sample, following the same protocol as a previous study (Daikoku et al., 2022) without comparing familiarity. By chance, the Cantometric sample happened to include one Indian and one Japanese song, but the sample was chosen to be globally diverse and was not specifically curated for any particular participant groups.
3. Refer to Appendix A for details of power analysis and sample size rationale. Note that the power analysis accounts for number of comparisons of songs, not participants, which is why we are able to compare the N = 62 Japanese raters with the lower N = 12 sub-groups of Indian raters. This method was selected due to logistical challenges of the experimental design, and we elaborate on potential drawbacks in the discussion section.
Data and Code Availability
All data, code, and audio are available at https://github.com/comp-music-lab/indian-music-data/
References
- Arnold A. The Garland Encyclopedia of world music volume 5: South Asia. Routledge; 2000.
- Balkwill L-L, Thompson WF. A cross-cultural investigation of the perception of emotion in music: Psychophysical and cultural cues. Music Perception. 1999;17(1):43–64. doi: 10.2307/40285811.
- Balkwill L-L, Thompson WF, Matsunaga R. Recognition of emotion in Japanese, Western, and Hindustani music by Japanese listeners. Japanese Psychological Research. 2004;46(4):337–349. doi: 10.1111/j.1468-5584.2004.00265.x.
- Casey MA, Veltkamp R, Goto M, Leman M, Rhodes C, Slaney M. Content-based music information retrieval: Current directions and future challenges. Proceedings of the IEEE. 2008;96(4):668–696. doi: 10.1109/JPROC.2008.916370.
- Daikoku H, Ding S, Benetos E, Wood ALC, Shimozono T, Sanne US, Fujii S, Savage PE. Agreement among human and automated estimates of similarity in a global music sample. Proceedings of the 2022 international folk music analysis workshop (FMA 2022); 2022. pp. 26–32.
- Daikoku H, Wood ALC, Savage PE. Musical diversity in India: A preliminary computational study using Cantometrics. Keio SFC Journal. 2020;20(2):34–61. doi: 10.14991/003.00200002-0034.
- Daikoku H, Shimozono T, Fujii S, Hegde S, Savage PE. Cross-cultural perceptual musical similarity between India and Japan. OSF Preregistration; 2022.
- Flexer A, Grill T. The problem of limited inter-rater agreement in modelling music similarity. Journal of New Music Research. 2016;45(3):239–251. doi: 10.1080/09298215.2016.1200631.
- Gómez E, Herrera P, Gómez-Martin F. Computational ethnomusicology: Perspectives and challenges. Journal of New Music Research. 2013;42(2):111–112. doi: 10.1080/09298215.2013.818038.
- Hadavi S, Kuroda J, Shimozono T, Leongómez JD, Savage PE. Cross-cultural relationships between music, emotion, and visual imagery: A comparative study of Iran, Canada, and Japan [Stage 1 Registered Report Snapshot] PsyArXiv Preprint. 2023 doi: 10.31234/osf.io/26yg5.
- Hammarström H, Forkel R, Haspelmath M, Bank S. Glottolog. Glottolog database 4.5. 2021 doi: 10.5281/zenodo.5772642.
- Hilton CB, Moser CJ, Bertolo M, Lee-Rubin H, Amir D, Bainbridge CM, Simson J, Knox D, Glowacki L, Galbarczyk A, Jasienska G, et al. Acoustic regularities in infant-directed speech and song across cultures. Nature Human Behaviour. 2022 doi: 10.1038/s41562-022-01410-x.
- Jacoby N, Margulis EH, Clayton M, Hannon E, Honing H, Iversen J, Klein TR, Mehr SA, Pearson L, Peretz I, Perlman M, et al. Cross-cultural work in music cognition: Methodologies, pitfalls, and practices. Music Perception. 2020;37(3):185–195. doi: 10.1525/mp.2020.37.3.185.
- Jacoby N, Polak R, Grahn JA, Cameron DJ, Lee KM, Godoy R, Undurraga EA, Huanca T, Thalwitzer T, Doumbia N, Goldberg D, et al. Universality and cross-cultural variation in mental representations of music revealed by global comparison of rhythm priors. PsyArXiv Preprint. 2021 doi: 10.31234/osf.io/b879v.
- Lakens D. Equivalence tests: A practical primer for t tests, correlations and meta-analyses. Social Psychological and Personality Science. 2017;8(4):355–362. doi: 10.1177/1948550617697177.
- Lomax A. Folk song style and culture. American Association for the Advancement of Science; 1968.
- Lomax A. Cantometrics: An approach to the anthropology of music. University of California Extension Media Center; 1976.
- Mantel N. The detection of disease clustering and a generalized regression approach. Cancer Research. 1967;27(2):209–220. https://aacrjournals.org/cancerres/article/27/2_Part_1/209/476508/The-Detection-of-Disease-Clustering-and-a
- Mehr SA, Singh M, Knox D, Ketter DM, Pickens-Jones D, Atwood S, Lucas C, Jacoby N, Egner AA, Hopkins EJ, Howard RM, et al. Universality and diversity in human song. Science (New York, N.Y.) 2019;366:eaax0868. doi: 10.1126/science.aax0868.
- Müllensiefen D, Frieler K. Cognitive adequacy in the measurement of melodic similarity: Algorithmic vs. Human judgments. Computing in Musicology. 2004;13(2004):147–176. doi: 10.1007/3-540-34416-0_32.
- Rzeszutek T, Savage PE, Brown S. The structure of cross-cultural musical diversity. Proceedings of the Royal Society B: Biological Sciences. 2012;279(1733):1606–1612. doi: 10.1098/rspb.2011.1750.
- Savage PE. Alan Lomax’s Cantometrics Project: A comprehensive review. Music & Science. 2018;1:1–19. doi: 10.1177/2059204318786084.
- Tzanetakis G, Kapur A, Schloss WA, Wright M. Computational ethnomusicology. Journal of Interdisciplinary Music Studies. 2007;1(2):1–24. http://musicstudies.org/wp-content/uploads/2017/01/CompEthno_JIMS_071201..pdf
- Volk A, Chew E, Hellmuth Margulis E, Anagnostopoulou C. Music similarity: Concepts, cognition and computation. Journal of New Music Research. 2016;45(3):207–209. doi: 10.1080/09298215.2016.1232412.
- Wood ALC. Songs of Earth: Aesthetic and social codes in music: Based on Cantometrics: An approach to the anthropology of music by Alan Lomax. University Press of Mississippi; 2020.
- Wood ALC, Kirby KR, Ember CR, Silbert S, Passmore S, Daikoku H, McBride J, Paulay F, Flory M, Szinger J, D’Arcangelo G, et al. The Global Jukebox: A public database of performing arts and culture. PLOS ONE. 2022;17(11):e0275469. doi: 10.1371/journal.pone.0275469.