Modification of digital music files for use in human temporary threshold shift studies

C G Le Prell; Q Yang; J G Harris

doi:10.1121/1.3630017

2011 Sep 8;130(4):EL142–EL146. doi: 10.1121/1.3630017


PMCID: PMC3189256 PMID: 21974483

Abstract

An exposure that is reproducible across clinical/laboratory environments, and appealing to subjects, is described here. Digital music files are level-equated within and across songs so that playlists deliver an exposure that is consistent across time. Modified music is more pleasant to listen to than pure tones or shaped noise, and it closely resembles the music exposures subjects may normally experience. Multiple therapeutics reduce noise-induced hearing loss in animals, but human trial design is complicated by limited access to noise-exposed subject populations. The development of standard music exposure parameters for temporary threshold shift studies would allow protection to be compared across agents in human subjects using real-world relevant stimuli.

Introduction

There is a critical need for a real-world-relevant noise exposure that can be reliably defined and reproduced across clinical/laboratory environments, and that subjects will volunteer to listen to. The level and duration must be chosen to limit changes in hearing to temporary threshold shifts (TTS), and thus minimize risk to subjects. Pure tones and filtered noise have often been used to induce TTS in human studies. These sounds can be unpleasant to listen to, and they lack real-world relevance. A few studies have used music, but levels vary significantly within and across songs when music is selected as a stimulus, which reduces investigator control of the exposure parameters. For example, in one recent study, exposure levels varied by as much as 10 dB from song to song over the course of 17 songs (Keppler et al., 2010).

The need for real-world relevant TTS models that can be used within clinical trials evaluating novel otoprotective agents is clear. TTS trials take less time to complete and cost less than long-term (multi-year) studies in which slowly progressive permanent threshold shift (PTS) is measured. There is less potential for subject drop-out in short-term TTS studies, and there may be better control over subject safety, as control subjects are not expected to develop PTS. For these and other reasons, TTS models have been used to evaluate potentially otoprotective novel therapeutics in human subjects (Attias et al., 2004; Kramer et al., 2006; Lin et al., 2010). Shortcomings of the TTS study designs used to date include real-world variability of the noise exposure within nightclubs/discotheques (up to 10-dB difference in exposure level across subject cohorts tested on different days, see Kramer et al., 2006), failure to measure robust TTS in subjects working in a noisy environment (less than 3 dB TTS in control subjects, see Lin et al., 2010), and use of a broad-band noise that is unpleasant to listen to and lacks real-world relevance (Attias et al., 2004). We therefore developed a procedure for manipulating digital music files to provide a controlled, pleasant-to-listen-to exposure with real-world relevance for use in NCT00808470 (and in studies with other agents).

Procedures

We purchased and downloaded 326 digital rock and pop songs (21.7 h) in MPEG audio layer 3 (MP3) format (Amazon.com). Each song was played from a laptop computer using iTunes®; the output was delivered through Isolator earphones (ER6I; Etymotic Research, Inc.) inserted into type 4157 Artificial Ear Simulators (Brüel & Kjær). Spectral data were sampled at 0.001 ms intervals using the PULSE system (version 12.5, Brüel & Kjær, Denmark). These virtually continuous data samples entered a multi-buffer that automatically exported average sound levels (sum of 1/3-octave bands from 20 Hz to 20 kHz). During the first series of acoustic analyses, sound levels were averaged over the previous 1/16th s (62.5 ms) interval and exported at 1 s intervals. Thus, depending on the song duration, there were up to 3500 time-level samples per song. The sound level data for the MP3 downloads shown in Table 1 clearly illustrate variability both within and across songs. These data suggest that random selection of songs will result in significant level variation across the duration of the playlist. Overall level is set during the digital song mastering process (Katz, 2007).
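The "sum of 1/3-octave bands" step can be illustrated with a short sketch. It assumes that band levels in dB are combined on an energy basis, which is the usual convention, and that time averaging is also done on an energy basis; the text does not state how the PULSE system performs either step. The function names and the band values are hypothetical.

```python
import math


def combine_band_levels(band_levels_db):
    """Energy-sum per-band levels (dB) into a single overall level (dB)."""
    total_power = sum(10 ** (level / 10) for level in band_levels_db)
    return 10 * math.log10(total_power)


def energy_average(levels_db):
    """Energy-average a sequence of levels (dB) measured over time."""
    mean_power = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_power)


# Hypothetical band levels, not data from the study.
bands = [90.0, 90.0]
overall = combine_band_levels(bands)
print(f"overall level: {overall:.1f} dB")
```

Two bands of equal level combine to a level about 3 dB above either band, which is why an overall level cannot be obtained by adding or arithmetically averaging the dB values of the bands.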

Table 1.

Song level variation measured from MP3 downloads.

Measure	Rock	Pop
Number of songs	171	155
Total duration	11.6 h	10.1 h
Song duration	4.10 ± 0.9 min	3.91 ± 0.8 min
Average song level	98.9 ± 3.5 dB A	97.5 ± 3.1 dB A
Range of average song levels	82.0–106.2 dB A (24.2 dB range)	86.9–104.5 dB A (17.6 dB range)
Within song level variation	4.9 ± 1.6 dB	5.5 ± 1.6 dB
Peak song level	107.4 ± 2.4 dB A	107.7 ± 2.4 dB A


To improve empirical control over the exposure parameters, without distorting or otherwise degrading sound quality, we developed procedures for digitally manipulating sound files using custom MATLAB (The MathWorks Inc.) program code. The MP3 files were first converted to WAV files using Goldwave (Goldwave, Inc.). All WAV files were then digitally manipulated using the MATLAB code, which first broke the songs into 50 ms non-overlapping frames. An A-weighted digital filter was applied to each frame, the original RMS power level of the filter output was calculated for each frame, and the output was scaled by P_desired/P_original such that the A-weighted power of each frame was constant. Both the left and right channels were manipulated, with equal A-weighted power levels in the two channels after manipulation. The processed frames were then reconnected to reconstruct the song in a new WAV file. The normalized WAV files were screened for potentially objectionable language or content, and for any acoustic artifact that might have been introduced during manipulation (clicks, gaps, noises, etc.). Songs selected for the final playlist were trimmed using Audacity (Sourceforge.net) to remove initial silent periods, which varied in duration across songs; the first 0.05 s of each song was then faded in and the final 0.4 s was faded out. Acoustic analyses were then repeated as described above.
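The frame-by-frame level equalization and the fades described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: it omits the A-weighting filter and equalizes unweighted RMS amplitude instead, it handles a single channel, and it assumes linear fade ramps because the text does not specify a fade shape. The function names, the target level, and the test tone are all hypothetical.

```python
import math


def equalize_frames(samples, sample_rate, target_rms, frame_ms=50):
    """Scale each non-overlapping frame so that its RMS equals target_rms."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        # Leave silent frames untouched to avoid dividing by zero.
        gain = target_rms / rms if rms > 0 else 1.0
        out.extend(x * gain for x in frame)
    return out


def apply_fades(samples, sample_rate, fade_in_s=0.05, fade_out_s=0.4):
    """Apply a linear fade-in and fade-out (ramp shape is an assumption)."""
    out = list(samples)
    n_in = min(len(out), int(sample_rate * fade_in_s))
    n_out = min(len(out), int(sample_rate * fade_out_s))
    for i in range(n_in):
        out[i] *= i / n_in
    for i in range(n_out):
        out[len(out) - 1 - i] *= i / n_out
    return out


# Hypothetical 1 s test signal: a 1 kHz tone, quiet in the first half
# and loud in the second half.
rate = 8000
tone = [
    (0.1 if n < 4000 else 0.8) * math.sin(2 * math.pi * 1000 * n / rate)
    for n in range(rate)
]
leveled = equalize_frames(tone, rate, target_rms=0.25)
faded = apply_fades(leveled, rate)
```

After equalization, every 50 ms frame of the test tone has the same RMS amplitude regardless of its original level. Applying an independent gain to each frame changes the gain abruptly at frame boundaries, which is one reason the processed files would need the kind of screening for clicks and other artifacts described above.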

Results

The time-level plots for several exemplar songs are shown in Fig. 1 to illustrate the specific changes to the song levels over time. In many cases, the songs were relatively unchanged (A), or were minimally compressed with respect to rapid dynamic amplitude change (B), (C). Songs that had slowly fluctuating levels across the duration of the individual song were modified to have more constant levels across time (D), (E), (F). Songs that were initially digitized with greater levels of energy were manipulated so that they had the same overall level as other songs (G); overall level adjustment was combined with flattening across the song if the song level slowly fluctuated (H). Variability across the songs selected for two 4 h playlists (one rock and one pop) is shown in Table 2, both prior to manipulation and after processing.

Figure 1. (Color online) Time-level plots for exemplar songs. Song levels were measured using Brüel & Kjær type 4157 artificial ear simulators, with right and left earphones measured simultaneously using separate 4157 devices. The songs shown were chosen to illustrate that in many cases the songs were relatively unchanged (A) or were only minimally compressed with respect to rapid dynamic amplitude change (B), (C). Some of the songs had slowly fluctuating levels across the duration of the individual song, and these were modified to have more constant levels across time (D)–(F). After manipulation, all songs were delivered at the same overall level. In some cases, the total amount of energy in the song was reduced across the duration of the song (G). Finally, a subset of songs required overall level adjustment in combination with flattening across the song (H).

Table 2.

Effects of dynamic level processing on song level.

	Baseline	Post-processing
Average song level (pop)	89.9–104.2 dB A	92.3–94.5 dB A
Range (pop)	14.3 dB	2.2 dB
Average song level (rock)	91.5–106.2 dB A	92.1–94.5 dB A
Range (rock)	14.7 dB	2.4 dB
Within song level variation (pop)	4.9 ± 1.4 dB	2.5 ± 0.7 dB
Within song level variation (rock)	4.5 ± 1.4 dB	2.0 ± 0.5 dB


In brief, prior to dynamic level manipulation via MATLAB processing, the average level of the individual songs had a 14 to 15 dB range and the average level variation within each song was 4–5 dB. After dynamic level manipulation, the average level of the individual songs had a 2 to 3 dB range and the within-song level variation was reduced to 2–3 dB. Taken together, overall levels were held significantly more constant across time and the amount of amplitude change within the songs was also reduced, producing a much more constant, controlled exposure across time.
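The two summary measures reported here, the range of average song levels and the within-song level variation, could be derived from per-song time-level traces along the following lines. This sketch assumes that the song average is the arithmetic mean of the dB samples and that within-song variation is the sample standard deviation of those samples; the text does not define either computation. The function name and the traces are hypothetical, not data from the study.

```python
import statistics


def summarize_playlist(traces):
    """Summarize level traces given as {song name: [level samples in dB A]}."""
    song_means = {name: statistics.mean(levels) for name, levels in traces.items()}
    level_range = max(song_means.values()) - min(song_means.values())
    within_song = [statistics.stdev(levels) for levels in traces.values()]
    return {
        "range_db": level_range,
        "mean_within_song_sd_db": statistics.mean(within_song),
    }


# Hypothetical traces for two songs.
example = {
    "song_a": [90.0, 92.0, 94.0],
    "song_b": [100.0, 100.0, 100.0],
}
summary = summarize_playlist(example)
print(summary)
```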

Discussion

The end result of the manipulations was a music playlist that is more constant across time than the sound levels that have been used in other TTS studies employing music exposure. The music playlist has greater real-world relevance than the pure-tone, broad-band, and octave-band noise insults that have been used in other human TTS studies. Importantly, the students and investigators who screened the manipulated songs did not note any qualitative changes in the songs when they were loaded onto a digital music player (iPod®) and listened to using ER6I earphones. Although changes in music quality may be more obvious to other populations, such as audio engineers or audiophiles (i.e., those who seek/appreciate high-quality audio reproduction), the current observation that the music sounds “normal” is encouraging for the potential use of these manipulated songs in studies with human subjects. In fact, these songs have now been used in a study in which 33 subjects were asked to listen to the 4 h playlists at investigator-selected levels (for data presented in abstract form, see Le Prell et al., 2010; Sakowicz et al., 2010; Le Prell et al., 2011). When surveyed, subjects in those studies similarly stated the music “sounded normal.”

The development of music stimuli with level parameters that are well controlled within and across songs is of significant utility for establishing routine exposure parameters in otoprotection studies; such studies would allow the efficacy of different agents to be compared directly in human clinical trials. The ability to compare data across studies would be of significant benefit to clinicians, given the need for scientific evidence to guide any changes in how they counsel their patients. A variety of antioxidants and other agents reduce noise-induced hearing loss in animal studies (for reviews, see Le Prell et al., 2007; Le Prell and Bao, 2011). Because the different agents have been evaluated using protocols with different species, different noise insults, and different treatment paradigms (method of delivery and duration), it is difficult, if not impossible, to directly compare or contrast efficacy across the different agents (for review, see Le Prell and Bao, 2011). Several of the agents that have been effective in pre-clinical animal models are entering human clinical trials. The specific trial designs for upcoming human studies with different agents have been largely driven by investigator-specific access to unique subject populations. Thus, without a standardized exposure, it will be equally challenging to compare efficacy of different agents across human studies.

Acknowledgments

The project was supported by Grant No. U01 DC 008423 from the National Institute on Deafness and Other Communication Disorders, National Institutes of Health. An NIH-selected Data Safety Monitoring Board, as well as Gordon Hughes at the NIH, had oversight of the development of these procedures for application to human subjects; we thank them for helpful feedback and suggestions, and their review of an earlier version of this manuscript. We are grateful for the contributions of Jim Wyatt, Marcello Pineiro, and Robert Trahoitis at Brüel & Kjær, who were instrumental in developing all acoustic measurement protocols. Finally, we thank Jason Schmitt and Lindsey Willis, who measured the song levels. Portions of this research have been presented in abstract form (Le Prell et al., 2009).

References and links

Attias, J., Sapir, S., Bresloff, I., Reshef-Haran, I., and Ising, H. (2004). “Reduction in noise-induced temporary threshold shift in humans following oral magnesium intake,” Clin. Otolaryngol. 29, 635–641.
Katz, B. (2007). Mastering Audio: The Art and the Science, 2nd ed. (Focal Press, Waltham).
Keppler, H., Dhooge, I., Maes, L., D’Haenens, W., Bockstael, A., Philips, B., Swinnen, F., and Vinck, B. (2010). “Short-term auditory effects of listening to an MP3 player,” Arch. Otolaryngol. Head Neck Surg. 136, 538–548.
Kramer, S., Dreisbach, L., Lockwood, J., Baldwin, K., Kopke, R. D., Scranton, S., and O’Leary, M. (2006). “Efficacy of the antioxidant N-acetylcysteine (NAC) in protecting ears exposed to loud music,” J. Am. Acad. Audiol. 17, 265–278.
Le Prell, C. G., and Bao, J. (2011). “Prevention of noise-induced hearing loss: Potential therapeutic agents,” in Noise-Induced Hearing Loss: Scientific Advances, Springer Handbook of Auditory Research, edited by C. G. Le Prell, D. Henderson, R. R. Fay, and A. N. Popper (Springer, New York).
Le Prell, C. G., Guire, K., Hall, J. W. I., and Holmes, A. E. (2009). “Prevalence of 6 kHz ‘notch’ in populations of adolescents and young adults,” Presented at IX European Federation of Audiology Societies (EFAS) Congress, Tenerife, Spain.
Le Prell, C. G., Yamashita, D., Minami, S., Yamasoba, T., and Miller, J. M. (2007). “Mechanisms of noise-induced hearing loss indicate multiple methods of prevention,” Hear. Res. 226, 22–43.
Le Prell, C. G., Hall, J. W. I., Sakowicz, B., Campbell, K. C. M., Kujawa, S. G., Antonelli, P. A., Green, G. E., Miller, J. M., Holmes, A. E., and Guire, K. (2010). “Temporary threshold shift subsequent to music player use: Comparison with hearing screenings in populations of adolescents and young adults,” 35th Annual National Hearing Conservation Conference-Explore the World of Hearing Loss Prevention, NHCA Spec. 27, Suppl. 1, 28.
Le Prell, C. G., Kujawa, S. G., Dell, S., Hensley, B. N., Hall, J. W. I., Campbell, K. C. M., Antonelli, P. A., Green, G. E., Miller, J. M., and Guire, K. (2011). “Temporary threshold shifts and otoacoustic emission amplitude reductions subsequent to music player use by young adults,” 36th Annual National Hearing Conservation Conference-Innovation and Technology, NHCA Spec. 28, Suppl. 1, 36–37.
Lin, C. Y., Wu, J. L., Shih, T. S., Tsai, P. J., Sun, Y. M., Ma, M. C., and Guo, Y. L. (2010). “N-Acetyl-cysteine against noise-induced temporary threshold shift in male workers,” Hear. Res. 269, 42–47.
Sakowicz, B., Le Prell, C. G., Hall, J. W. I., Campbell, K. C. M., Kujawa, S. G., Antonelli, P. A., Green, G. E., Miller, J. M., and Guire, K. (2010). “Temporary changes in hearing after digital audio player use,” Abstracts of the American Academy of Audiology, AudiologyNow!; ProgramNOW!, p. 147.

