Abstract
Objective
Distortion product otoacoustic emissions (DPOAEs) provide a rapid, noninvasive measure of outer hair cell damage associated with chemotherapy, and are a key component of pediatric ototoxicity monitoring. Serial monitoring of DPOAE levels in reference to baseline measures is one method for detecting ototoxic damage. Interpreting DPOAE findings in this context requires that test-retest differences be considered in relation to normal variability, data which are lacking in children. This study sought to (1) characterize normal test-retest variability in DPOAE level over the long time periods reflective of pediatric chemotherapy regimens for a variety of childhood ages and f2 primary frequencies using common clinical instrumentation and stimulus parameters; (2) develop level-shift reference intervals; and (3) account for any age-related change in DPOAE level or measurement error that may occur as the auditory system undergoes maturational change early in life.
Design
Serial DPOAE measurements were obtained in 38 healthy children (25 females and 13 males) with normal hearing, ranging in age from one month to 10 years at the initial (baseline) visit. On average, children were tested 5.2 times over an observation period of 6.5 months. Data were collected in the form of DP-grams in which DPOAE level was measured for f2 ranging from 1.4–10 kHz, using a fixed f2/f1 ratio of 1.22 and stimulus level of 65/55 dB SPL for L1/L2. Age effects on DPOAE level and measurement error were estimated using Bayesian regression of the longitudinal data. The raw and model-based distributions of DPOAE test-retest differences were characterized using means and standard error of the measurement (SEM) for several ages and f2s.
Results
DPOAE test-retest differences for the children in this study are at the high end of those previously observed in adults, as reflected in the associated shift reference intervals. Further, although we observe substantial child-specific variation in DPOAE level, the pattern of age-related changes is highly consistent across children. Across a wide range of f2s, DPOAE level decreases by 3–4 dB from 1 to 13 months of age, followed by a more gradual decline of < 1 dB/year. An f2 of 6 kHz shows the smallest decrease during the early rapid maturation period. DPOAE measurement error is fairly constant with age. It is 3–4 dB at most f2s and is greater (indicating poorer reliability) at 1.5, 8 and 10 kHz.
Conclusions
DPOAE level decreases with childhood age, with the greatest changes observed in the first year of life. Maturational effects during infancy, and greater measurement error at very low and high f2’s impact test-retest variability in children. An f2 of 6 kHz shows minimal maturation and measurement error, suggesting it may be an optimal sentinel frequency for ototoxicity monitoring in pediatric patients. Once validated with locally-developed normative data, reference intervals provided herein could be used to determine screen fail criteria for serial monitoring using DPOAEs. Employing state-of-the-art calibration techniques might reduce variability, allowing for more sensitive screen fail criteria.
INTRODUCTION
The antineoplastic chemotherapeutic agents, cisplatin and carboplatin, are effective in the treatment of a variety of childhood cancers, but can cause severe dose-related toxicities including irreversible hearing loss (Knight et al. 2007). Distortion-product otoacoustic emission (DPOAE) testing provides a rapid, noninvasive means to monitor the outer hair cell damage associated with chemotherapy, and is a key component of pediatric ototoxicity monitoring programs. DPOAEs arise from the nonlinear basilar membrane response to two tones at frequencies, f1 and f2 (where f1<f2). DPOAEs are influenced by damage near the tonotopic location of f2, the DPOAE frequency itself (2f1-f2) and regions basal to f2 (Kim et al. 1980; Kemp 1983; Talmadge et al. 2000; Martin et al. 2011). In addition to their value added as an objective measure, DPOAEs are considered more sensitive to ototoxicity compared with standard pure tone threshold testing, establishing their usefulness in a pediatric ototoxicity monitoring protocol (Ress et al. 1999; Stavroulaki et al. 1999; Knight et al. 2007). Serial monitoring of DPOAE levels in reference to baseline measures is one method for detecting ototoxic damage. However, this requires that any observed DPOAE test-retest shift be interpreted in relation to the normal variability observed in samples of healthy children tested under similar measurement conditions, data which are lacking.
Identifying a Minimally Acceptable DPOAE Change Based on the Distribution of Test-retest Differences
Normal test-retest variability is conventionally estimated by the average difference, and the variance of differences, between a baseline measurement and a follow-up measurement, or as the standard error of measurement (SEM), given by $s\sqrt{1-r}$, where $s$ is the standard deviation of the combined baseline and follow-up measurements and $r$ is Pearson’s correlation coefficient between baseline and follow-up measurements (Demorest & Walden 1984). There currently is no standard for identifying a “clinically significant” DPOAE change, or “screen failure” indicative of possible cochlear damage. One approach is to develop an operational definition of response change as a test-retest difference that is greater than the variability routinely observed in a healthy reference sample (Helleman & Dreschler 2012; Reavis et al. 2015). This approach assumes the screening decision is made based on results at a single frequency or frequency average. The 90% point in the distribution of test-retest differences represents a suitable shift reference limit, given as the mean shift ± 1.645 times the standard deviation of the differences. Other approaches for estimating the 90% point in the distribution of test-retest differences are also used, including $\pm 1.645 \times \sqrt{2} \times \mathrm{SEM}$ (McMillan & Hanson 2014). A test-retest level difference greater than the 90% shift reference limit, if observed in a new clinical patient, would be considered a screen fail as it would be expected to occur in only 10% of healthy children. The tolerance for false positives is a clinical decision typically based on the seriousness of the effect and its pervasiveness within a patient population. Fortunately, reference intervals of any size can be computed from an estimate of test-retest variability (McMillan et al. 2013).
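The two ways of deriving a 90% shift reference limit described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in this study: the function names are hypothetical, the inputs are paired baseline and follow-up DPOAE levels in dB SPL at a single f2, and the multiplier 1.645 is the value quoted in the text.

```python
import math
import statistics

Z_90 = 1.645  # multiplier quoted in the text for a 90% reference interval


def pearson_r(x, y):
    """Pearson correlation between paired baseline and follow-up levels."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standard_error_of_measurement(baseline, followup):
    """SEM = s * sqrt(1 - r), with s the SD of the combined measurements."""
    s = statistics.stdev(list(baseline) + list(followup))
    r = pearson_r(baseline, followup)
    return s * math.sqrt(max(0.0, 1.0 - r))  # clamp guards against rounding


def limits_from_differences(baseline, followup, z=Z_90):
    """Mean shift +/- z * SD of the (follow-up minus baseline) differences."""
    diffs = [f - b for b, f in zip(baseline, followup)]
    mean_shift = statistics.fmean(diffs)
    half_width = z * statistics.stdev(diffs)
    return mean_shift - half_width, mean_shift + half_width


def limits_from_sem(sem, z=Z_90):
    """+/- z * sqrt(2) * SEM, which assumes a mean shift of zero."""
    half_width = z * math.sqrt(2.0) * sem
    return -half_width, half_width


def is_screen_fail(shift_db, limits):
    """Flag a shift that falls outside the reference interval."""
    lower, upper = limits
    return shift_db < lower or shift_db > upper
```

For example, an SEM of 2 dB gives limits of roughly ±4.65 dB, so a 6 dB decrease from baseline would be flagged while a 3 dB change would not.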
Are Separate DPOAE Shift Reference Intervals Needed for Children?
A recent meta-analysis of 10 studies in the archival literature estimates DPOAE test-retest differences for adults using the SEM statistic (Reavis et al. 2015). The mean interval between test and retest was 3.6 days and ranged from 0 days to slightly over 15 days, with results extrapolated to 20 days. The SEM values from the individual included studies ranged from 0.57 to 3.9 dB for f2 frequencies of 1, 2, 4, and 6 kHz presented at moderate stimulus levels, with most meta-analytic SEM values concentrated near 2 dB and associated shift reference limits ranging from ±3.76 to ±5.63 dB depending on f2. Variability was greatest at the higher test frequencies analyzed, as noted previously (Roede et al. 2003; Dreisbach et al. 2006). Further, at 4 and 6 kHz, variability also increased with increasing time between tests (Reavis et al. 2015).
Only Sockalingham et al. (2007) appear to have published test-retest findings in children. The 24 healthy children aged 5–12 years (mean 9 years) were each tested twice, 13–15 days apart, at f2 frequencies from 2.5–10 kHz using moderate stimulus levels. The highest baseline-to-retest correlations were found between measurements at frequencies of 2.5 and 7 kHz and the poorest correlations for 3.5, 5, and 10 kHz. Mean shifts between trials ranged from −0.55 (SD=2.85 dB) to 0.56 (SD=5.57 dB) and associated 90% reference intervals ranged from ±5.24 to ±6.03 dB for f2 frequencies between 2.5 and 7 kHz, and ±9.72 dB at 10 kHz. This suggests that variability in school-age children is in the upper range of that found in adults over a comparable time frame. These scant data do not encompass infancy or toddlerhood, the younger ages investigated here. Nor do they cover prolonged time periods, whereas typical pediatric chemotherapy regimens can extend to half a year or longer.
Sources of DPOAE variability include patient noise, especially at the lower test frequencies and signal to noise ratios, standing wave nulls in the calibration at higher frequencies, system distortion at higher test frequencies and levels, and the probe insertion and reinsertion necessary for serial testing (Beattie et al. 2003). When testing young children, a likely further source of OAE test-retest variability is physical maturation of the external and middle ear systems, and the medial olivocochlear reflex, which develop beyond birth (Abdala et al. 2007). Babies born at young gestational ages (28 to 30 weeks) have lower DPOAE levels than those born at older gestational ages (Gorga et al. 2000) and DPOAE levels have been shown to increase throughout the preterm period (Abdala et al. 2008). Numerous studies show that DPOAE and TEOAE levels are higher in term infants than toddlers, and in children than adults (Prieve, Fitzgerald, & Schulte 1997; Prieve et al. 1997; Kon et al. 2000; Groh et al. 2006), although not all studies agree (Abdala et al. 2008). Finally, relative to adults, measurement error may be greater among children because sub-clinical middle ear dysfunction that could impact OAE levels is relatively common among children (Lonsbury-Martin et al. 1994; Prieve et al. 2008). Further, whereas young infants may be coaxed to sleep during DPOAE measurement, toddlers may be more likely to move, vocalize or resist precise placement of the measurement probe into the ear canal. In-the-ear calibration is designed to minimize variability related to probe position and ear canal geometry differences across measures; however, the SPL-based method standard in most clinical systems is imperfect. Calibration errors can occur particularly above about 3 kHz in adults and above about 6 kHz in children under 2 years (Kruger et al. 1987; Souza et al. 2014). 
Improved calibration methods, such as depth compensation or forward pressure level (FPL) calibration, minimize these frequency-specific errors and thus could reduce test-retest variability.
The overall goal of this report is to characterize the variability in DPOAE levels obtained in infants and children at a wide range of f2 primary frequencies using common clinical instrumentation and stimulus parameters over the long time periods reflective of typical chemotherapy regimens. We provide raw and model-based distributions of the DPOAE test-retest differences from which shift reference limits can be computed to assist with clinical interpretation of serial DPOAE changes. Our analysis is designed to identify any age-related changes in DPOAE level or measurement error.
MATERIALS AND METHODS
Participants
All children between the ages of 1 month and 15 years who had normal hearing were invited to participate in this study using an advertisement placed on the Oregon Health and Science University (OHSU) Child Development and Rehabilitation University Research Studies webpage. Inclusion criteria for this normative sample of infants and children were 1) normal hearing or, for infants enrolled at age < 7 months, a parent report of “pass” on an Auditory Brainstem Response (ABR) newborn hearing screening, 2) valid DPOAEs measurable and present in the range of published normative values at study entry (Gorga et al. 1997), 3) no history of ototoxic treatment, and 4) no active ear disease or history of tympanostomy tubes, ear pathology, or ear surgery. Parental consent was provided for all participants following the guidelines of the Institutional Review Board at OHSU. Assent to participate in the study was also obtained from those children who were 7 years of age and older.
Measurement Procedures
All testing was completed by the second author or another experienced pediatric audiologist at Doernbecher Children’s Hospital, which is affiliated with OHSU and located on its campus. The intent of the data collection was to obtain valid DPOAE responses at up to 12 frequencies in one or both ears, with measures taken once a month over a monitoring period of 6–9 months. The number of study visits, time interval between evaluations and duration of the observation period were designed to replicate the audiologic monitoring schedule for children treated with cisplatin at Doernbecher Children’s Hospital. The number of valid observations varied primarily due to middle ear pathology and the inability of all families to follow through with each scheduled appointment (for more details see “Participant Sample and Data Structure” in the Results section).
Prior to testing, the second author or her designate approached the child and family member(s) to obtain minimal demographic information such as the child’s age, sex, race and ethnicity, history of hearing loss and middle ear disease, and whether the child had undergone tympanostomy tube insertion in the past. Study procedures were explained, including the number and length of the test intervals. At this point, any questions were answered, assent was obtained from the child, and formal consent for testing was obtained from a parent.
Testing included bilateral otoscopy and tympanometry to ensure that ear canals were free of debris with no noticeable inflammation and that middle ear function was normal. Tympanograms were measured with the Grason-Stadler GSI Tympstar Middle Ear Analyzer (Grason-Stadler Inc, Eden Prairie, MN) using a 1000 Hz probe tone for participants 7 months and younger and a 226 Hz probe tone for participants older than 7 months. Tympanograms were classified as normal when the peak pressure was within +50 to −50 daPa and acoustic admittance was 0.2 mmho or greater.
Hearing was evaluated using age-appropriate techniques to ensure all participants had normal hearing at the baseline evaluation and that no clinically significant hearing shifts occurred over the observation period. For children younger than 7 months at enrollment, data for this project were obtained and normal hearing was confirmed at or near 7 months of age. For children between 7 and 30 months of age, visual reinforcement audiometry (VRA) was used to establish hearing within normal limits at all test frequencies. For those children between 31 months and 6 years, conditioned play audiometry was used and for those older than 6 years, standard audiometric techniques were used. Pure tone audiometric thresholds were obtained at octave frequencies 0.5–8 kHz and at 3 and 6 kHz. Normal hearing was defined as thresholds of 20 dB HL or better. A clinically significant hearing shift was defined as 20 dB or greater shift at one test frequency or 10 dB or greater at two adjacent test frequencies (ASHA 1994).
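The shift criterion above (ASHA 1994) can be written as a short check. The sketch below is illustrative only: the function name is hypothetical, thresholds are assumed to be listed in dB HL in ascending order of test frequency (so that neighbors in the list are adjacent test frequencies), and a positive shift is taken to mean a poorer threshold at follow-up.

```python
# Test frequencies in kHz, in ascending order: octaves 0.5-8 kHz plus 3 and 6 kHz.
TEST_FREQUENCIES_KHZ = [0.5, 1, 2, 3, 4, 6, 8]


def significant_hearing_shift(baseline_db_hl, followup_db_hl):
    """Return True for a shift of >= 20 dB at any one test frequency,
    or >= 10 dB at two adjacent test frequencies."""
    shifts = [f - b for b, f in zip(baseline_db_hl, followup_db_hl)]
    if any(s >= 20 for s in shifts):
        return True
    # Pair each frequency with its upper neighbor to test adjacency.
    return any(a >= 10 and b >= 10 for a, b in zip(shifts, shifts[1:]))
```

With a flat 10 dB HL baseline, a follow-up of 20 dB HL at both 1 and 2 kHz meets the criterion, whereas 10 dB shifts at 1 and 3 kHz alone do not, because those frequencies are not adjacent in the test set.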
DPOAEs with frequency 2f1-f2 were recorded using a Bio-Logic Scout AuDX system (Natus Medical Inc, Mundelein, IL). DPOAE levels were recorded as a function of f2 primary frequency, i.e., as DP-grams. Twelve log-spaced f2 frequencies between 1.453 and 10.031 kHz were presented at a fixed primary frequency ratio (f2/f1=1.22) and level (L1=65 dB SPL, L2=55 dB SPL). An in-the-ear calibration provided by the Bio-Logic test equipment was performed prior to each measurement. Specifically, voltage adjustments were applied to maintain the desired stimulus SPL across frequency as measured by the probe. This approach can provide accurate results for lower frequency primaries and in smaller ear canals (Kruger et al. 1987; Siegel, 1994). At higher frequencies, especially in larger ear canals, interactions between incident and reflected waves in the sealed ear canal cause spatial variations in sound pressure such that the SPL at the plane of the microphone can differ from that at the tympanic membrane. The exact frequencies at which these calibration errors occur depend on the half-wave resonance of the sealed ear canal and therefore vary with probe insertion depth and ear-canal length (Siegel, 1994). Measurement-based stopping rules were used that shortened the test for quiet participants by allowing averaging to stop when the measured noise reached a level below −10 dB SPL. To be considered valid for inclusion in the analysis, DPOAE measurement points had to have a signal level of at least −10 dB SPL and be associated with low noise such that the signal-to-noise ratio was at least 6 dB.
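The stimulus relationships and inclusion rule described above can be illustrated with a short Python sketch. The function names are hypothetical, and the evenly log-spaced grid is an assumption used only for illustration; the nominal frequencies generated by the clinical instrument may differ slightly from this idealized spacing.

```python
F2_F1_RATIO = 1.22       # fixed primary frequency ratio f2/f1
MIN_DP_LEVEL_DB = -10.0  # minimum DPOAE level (dB SPL) for a valid point
MIN_SNR_DB = 6.0         # minimum signal-to-noise ratio (dB) for a valid point


def primaries(f2_hz):
    """Return (f1, 2f1 - f2) in Hz for a given f2."""
    f1 = f2_hz / F2_F1_RATIO
    return f1, 2.0 * f1 - f2_hz


def log_spaced_f2(lowest_hz=1453.0, highest_hz=10031.0, n=12):
    """Idealized grid of n f2 values equally spaced on a log-frequency axis."""
    step = (highest_hz / lowest_hz) ** (1.0 / (n - 1))
    return [lowest_hz * step ** k for k in range(n)]


def is_valid_point(dp_level_db, noise_level_db):
    """Apply the level and SNR inclusion criteria to one DP-gram point."""
    snr = dp_level_db - noise_level_db
    return dp_level_db >= MIN_DP_LEVEL_DB and snr >= MIN_SNR_DB
```

For an f2 of 6000 Hz, this gives f1 of about 4918 Hz and a distortion product near 3836 Hz. A point with a DPOAE level of −5 dB SPL and a noise level of −12 dB SPL would be retained, whereas a level of −12 dB SPL would be excluded regardless of the noise.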
All audiologic equipment was calibrated annually in accordance with American National Standards Institute (ANSI) S3.6-2010 procedures. All hearing testing was done in a single-walled audiologic test booth using the Interacoustics AC-40 clinical audiometer (Interacoustics AS, Denmark) and Etymotic ER3 insert earphones (Etymotic Research Inc, Elk Grove Village, IL). DPOAEs were measured in a quiet room at the Doernbecher Audiology Clinic, outside of a sound booth. This test environment was chosen to most closely simulate DPOAE evaluations conducted in the pediatric audiology clinic or at bedside when children are receiving ototoxic medications.
Statistical Methods
The primary goal of the analysis is to estimate the distribution of test-retest differences in DPOAE level across children’s ages and f2 primary frequencies. Estimates of the age effects on test-retest variability can be obtained by computing the mean change in DPOAE level and the variance of the DPOAE changes using the raw data. However, not only does this require a great many subjects at each baseline age to get an accurate normative reference standard, but it also requires precise control over the time points at which DPOAE measurements are made. In practice this is an exceptionally challenging logistical problem, particularly among child research participants. Binning over ages or test intervals to address the associated missing data problems is common practice in the literature, but can reduce the accuracy of reference intervals where maturational changes occur within the bin interval. To avoid these challenges, we followed a statistical paradigm described by Royston (1995) for estimating shift reference intervals for a measure unrelated to this analysis, height, in growing fetuses and children (Pan & Goldstein 1997). Our approach was to develop a statistical model of the mean and variance of the longitudinal DPOAE levels, and to use the model to analytically deduce features of the test-retest variability according to standard statistical theory.
In the parlance of Royston’s paradigm, the observed longitudinal DPOAE levels at a particular f2 frequency were conceived as the sum of a smoothed population-level process that is dictated by the maturational changes of the children in the target population plus child-specific deviations from the population-level process due to variability among children in their development. Furthermore, DPOAE measurement error could be age dependent. For example, session-to-session variation in the ability of children to cooperate or in middle ear status could vary in an age-dependent manner.
Royston’s approach was implemented using a multi-level regression model of each child’s observed DP-gram assuming Gaussian errors. The Gaussian model is reasonable since each DPOAE measurement is an average over approximately 15 sweeps. In previous work (McMillan et al. 2011, 2013), we considered Manly’s exponential transformation to achieve normal errors but generally found imperceptible benefits to this added effort. Let $y_{ijl}$ denote the DPOAE level measured on the $l$th measurement occasion in the $j$th ear of the $i$th child at a particular f2 primary, where the child’s age in months on that occasion is denoted $a_{ijl}$. To simplify presentation, the f2 primary is not indexed in the model shown below, though the analysis allowed for variability across f2 frequencies. Accordingly, the data model used here is
$$y_{ijl} = \gamma + \beta_0 a_{ijl} + \theta(a_{ijl}) + \delta_i + \delta_{ij} + \beta_i a_{ijl} + \varepsilon_{ijl} \qquad (1)$$
where the population process is modeled as the sum of an intercept $\gamma$, a linear age effect $\beta_0 a_{ijl}$ and a smooth maturation process $\theta(a_{ijl})$ modeled as a cubic B-spline. The child-specific and ear-specific deviations were modeled with random intercepts $\delta_i$ and $\delta_{ij}$, while the child-specific deviation from the population maturation process was modeled by $\beta_i a_{ijl}$. The $\beta_i$ are zero-mean Gaussian random variables with variances equal to $\sigma_\beta^2$. Measurement error was also modeled as a function of age, so that $\varepsilon_{ijl}$ is a Gaussian error term with mean zero and variance $\sigma^2(a_{ijl})$, which is also a smooth function of age, such that $\ln \sigma(a_{ijl}) = \omega_0 + \omega(a_{ijl})$. Note that $\sigma(a)$ is the DPOAE measurement error at age $a$. The measurement error in the absence of age effects is commonly estimated by the SEM.
According to this model, the distribution of test-retest differences between measurements taken at ages $a$ and $a'$ is also Gaussian, with mean $\beta_0 (a' - a) + \theta(a') - \theta(a)$. In other words, the population average test-retest difference is equal to the population age effect on DPOAE level. The variance of the distribution of test-retest differences is equal to $(a' - a)^2 \sigma_\beta^2 + \sigma^2(a) + \sigma^2(a')$ (Royston 1995). This is equal to the variability among children in test-retest differences due to the maturation process plus the squared measurement errors at the test and retest ages.
If all children were effectively identical in their maturation processes, then $\sigma_\beta^2 = 0$. Similarly, if there was no age effect on DPOAE measurement error, then $\sigma^2(a) = \sigma^2$. In the event that there was no maturation effect on DPOAE levels at certain frequencies, then $\beta_0 = 0$ and $\theta(a) = 0$. Therefore, if there is no age effect on any aspect of DPOAE level measurements, then the distribution of test-retest differences is Gaussian with mean zero and variance equal to $2\sigma^2$ (McMillan et al. 2013), which is a much simpler distribution to work with. The extent to which $\sigma_\beta^2 > 0$, $\theta(a) \neq 0$, $\beta_0 \neq 0$, or $\sigma(a)$ depends on age dictates the departure of pediatric test-retest variability from the simpler model. Therefore, the focus was on using the current longitudinal data to estimate $\sigma_\beta^2$, $\beta_0$, $\theta(a)$, and $\sigma(a)$ at each f2 primary.
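To make the link between the model components and a shift reference interval concrete, the following Python sketch computes the mean and standard deviation of the test-retest difference between two ages and the corresponding interval. It is a sketch under stated assumptions rather than the study's implementation: it assumes the child-specific maturation deviation enters as a random slope on age with standard deviation `sigma_beta` (a hypothetical name), and `theta` and `sigma` are placeholder callables standing in for the fitted spline and the age-dependent measurement error.

```python
import math


def shift_distribution(age, retest_age, beta0, theta, sigma_beta, sigma):
    """Mean and SD of (retest level minus test level) under the assumed model.

    beta0      : population linear age effect (dB per unit of age)
    theta      : callable, smooth population maturation effect at a given age
    sigma_beta : SD of the child-specific random slopes (assumed form)
    sigma      : callable, measurement-error SD at a given age
    """
    dt = retest_age - age
    mean = beta0 * dt + theta(retest_age) - theta(age)
    variance = (dt * sigma_beta) ** 2 + sigma(age) ** 2 + sigma(retest_age) ** 2
    return mean, math.sqrt(variance)


def shift_reference_interval(mean, sd, z=1.645):
    """Interval of mean +/- z * SD; z = 1.645 gives the 90% interval."""
    return mean - z * sd, mean + z * sd
```

With no age effects at all (zero slope, zero spline, zero random-slope variance) and a constant measurement error of 3 dB, the standard deviation reduces to 3√2 ≈ 4.24 dB and the 90% interval to about ±6.98 dB, which is the simpler case described in the text.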
Bao et al. (2016) modeled these data using a Gaussian process in a prior approach. Their focus was on the theoretical properties of the model and on computational approaches to establishing participant-specific test-retest standards in a pediatric population. There are certain similarities with the model described here, though our focus was on inferring the age effects on the DPOAE test-retest variability.
Our analysis was conducted using a Bayesian approach. This greatly facilitated computation, which was effectively impossible with maximum likelihood methods, and enabled us to introduce some prior information on maturational effects in pediatric populations. Specifically, we applied a Gaussian prior with mean −2 and standard deviation 2 to the $\beta_0$ parameter, meaning that we expect DPOAE levels to decrease by about 2 dB ± 4 dB per year of life (Prieve et al. 1997), and a prior for $\omega_0$ centered at about 1.1 ± 2. Further details on the structure of the model and the priors used in the Bayesian analysis are given in section A of the Online Statistical Appendix.
Results of a Bayesian analysis are expressed in terms of posterior probability distributions, which reflect all of our existing data-based or experiential knowledge about parameters and functions of parameters. The posterior probability distribution is summarized by the 90% Bayesian confidence interval, which is the interval within which we are 90% certain that the parameter lies. This is analogous to a classical confidence interval, though the Bayesian interval is much more easily understood than the classical counterpart. The posterior inter-quartile range is also used to summarize the central portion of the posterior distribution, and the posterior median is a conventional point estimate of the relevant parameter. Classical concepts such as ‘p-values’ and ‘statistical significance’ do not figure into a Bayesian analysis, which favors parameter estimation and interpretation of effect sizes over statistical hypothesis testing.
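The posterior summaries named above can be computed directly from a set of posterior draws. The following Python sketch is illustrative only; the function names are hypothetical, and the percentile rule (linear interpolation between order statistics) is one common convention that may differ in small details from the one used by a given statistical package.

```python
import math


def percentile(sorted_values, p):
    """Percentile p (0-100) by linear interpolation between order statistics."""
    n = len(sorted_values)
    pos = (n - 1) * p / 100
    lower = int(math.floor(pos))
    upper = min(lower + 1, n - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def summarize_posterior(draws):
    """Posterior median, 90% interval (5th to 95th percentile) and IQR."""
    s = sorted(draws)
    return {
        "median": percentile(s, 50),
        "interval_90": (percentile(s, 5), percentile(s, 95)),
        "iqr": (percentile(s, 25), percentile(s, 75)),
    }
```

Applied to the pooled post-burn-in draws for a parameter such as the linear age effect, the returned median serves as the point estimate and the two intervals summarize the uncertainty around it.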
The model was fit using PROC MCMC in the SAS software version 9.4. Three chains were run for 50,000 iterations following a 20,000 sample burn-in period. Gelman-Rubin diagnostics (Gelman et al. 2014) for all parameters were less than 1.03, indicating convergence. Model fits for all the children are shown in section B of the Online Appendix.
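For readers unfamiliar with the convergence check, the basic Gelman-Rubin potential scale reduction factor for a single parameter can be sketched as follows. This is the textbook form of the statistic written with a hypothetical function name; it is not a reproduction of the SAS implementation, which may apply additional corrections.

```python
import math
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter.

    chains: list of equal-length lists of post-burn-in draws, one per chain.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average within-chain variance; B: between-chain variance scaled by n.
    within = statistics.fmean(statistics.variance(c) for c in chains)
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)
```

Values close to 1 indicate that the chains are sampling the same distribution, while chains stuck in different regions produce values well above the 1.03 bound reported here.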
RESULTS
Participant Sample and Data Structure
Fifty-seven children were consented to participate. All were white and only one reported Hispanic ethnicity. Twelve had baseline evaluations, but chose not to return for follow-up study visits. Two were excluded due to lack of measurable DPOAEs. Five who developed persistent negative middle ear pressure or middle ear disease were not included in this analysis. Among participants not excluded, data from 60 visits were excluded for one or both ears because of conductive hearing loss (n=14 ears) and/or abnormal tympanometry (n=56 ears). Data from 3 visits were excluded due to excessive participant noise or lack of cooperation precluding DPOAE testing (n=2 ears) or due to excessive cerumen in the ear canal (n=1 ear).
Participant characteristics are summarized in Table 1. A total of 38 children aged 1 month to 10 years (mean age 3.2 years) provided 75 ears for testing over a total of 196 test visits with an average of 5.2 test visits per child. The observation period varied from a minimum of less than one month to a maximum of just over 1 year with data collected over a 7 month period on average. Most participants were female and female participants were, on average, older than the males. To illustrate the data structure, section C of the Online Appendix displays the numbers and intervals of study visits for each individual participant including age and sex (indicated by symbols). Altogether, between 321 and 356 valid DPOAE level measurements were obtained at each f2 frequency tested.
Table 1.
Participant Characteristics.
| Characteristic | Statistic | Female | Male | All |
|---|---|---|---|---|
| N | | 25 | 13 | 38 |
| Baseline age (months) | Mean | 46.6 | 23.8 | 38.8 |
| | Min | 1 | 4 | 1 |
| | Max | 120 | 73 | 120 |
| Observation period (months) | Mean | 6.7 | 6.0 | 6.5 |
| | Min | 3.0 | 0.9 | 0.9 |
| | Max | 12.2 | 9.2 | 12.2 |
| Number of visits | Mean | 5.1 | 5.2 | 5.2 |
| | Min | 3 | 2 | 2 |
| | Max | 9 | 8 | 9 |
Characteristics of DP-grams Among Children
Figure 1 presents the DPOAE levels recorded at baseline in the form of a DP-gram. Thin lines represent data from each individual participant with the sample mean shown by the thick line. A consistent pattern of a slight decrease in level down to about 2.5 kHz is followed by an increase up to about 4 kHz, which again decreases continuously through 10 kHz. There is substantial variability in the overall DPOAE level across children, which could reflect both age- and child-specific effects, as argued previously (Spektor et al. 1991; Prieve et al. 1997; Groh et al. 2006). Fig. 1 also shows greater spread in the DPOAE data at the lowest and highest f2 frequencies.
Figure 1. Baseline DP-grams from a cohort of healthy infants and children show substantial variation in DPOAE level.
DPOAE level is plotted by f2 frequency to construct DP-grams using data from the initial (Baseline) visit. Solid lines show DPOAE level; Dashed lines show the level of the noise. Thin lines show data from individual participants; thick line shows the sample means. The variation in DPOAE level across individuals is greatest at the low and high f2 frequency extremes. Note that for a DPOAE recording to be included in the analysis, the signal to noise ratio (SNR) had to be at least 6 dB and amplitude had to be at least −10 dB SPL.
Test-retest Variability in DPOAE Level at a Range of Childhood Ages
Figure 2 shows raw DPOAE level shifts (follow-up level – baseline level) observed at several f2 frequencies from individual ears grouped by age. For illustrative purposes, the f2s shown are octave or inter-octave frequencies in the range found to generally have the lowest test-retest variability (spaghetti plots overlaying the model-based results in Figure 3 show the complete data set). Each black line is one ear’s longitudinal measurements at the f2 frequency shown on the right side of the graph. Positive values above the ‘0’ reference line indicate an increase in DPOAE level compared to baseline. Negative values indicate a decrease in DPOAE level from baseline. In each panel, a loess smoothed curve (red line) is fit to the data for each participant group. The solid grey bar is a 90% shift reference interval derived from a meta-analysis of 10 research studies in adults (Reavis et al., 2015). (Note that this meta-analysis yielded no estimates for a 3 kHz f2 primary.)
Figure 2. Test-retest variability is greater in children than previously reported for adults.
The DPOAE level shift relative to baseline is given in dB as a function of test-retest interval for participants grouped by age (columns) at a range of f2 frequencies (rows). Within a panel, thin black lines represent the DPOAE level shift in each ear and the thick red line is a loess smooth fit to the group data. The vertical distance between the red line and the horizontal line indicating zero change can be used to appreciate the magnitude of the mean shift. For comparison, the gray box in each panel at f2 ≈ 2, 4, and 6 kHz indicates the 90% shift reference interval obtained in a meta-analysis of published data from adults, for which test-retest intervals vary from <1 hr to slightly over 2 weeks (Reavis et al., 2015).
Figure 3. Model results showing population average maturational trajectories for DPOAE level at each f2 frequency.
Each panel is for a separate f2. Within a panel, the white line is the posterior median, thin error bars represent the corresponding 90% Bayesian confidence interval while thick error bars show the posterior inter-quartile range. The model results overlay the observed data measured longitudinally in each participant (thin lines). Results indicate that aging in childhood is associated with a reduction in DPOAE level and that this maturational effect, evident by the slope of the white line, tends to be greatest during the first year of life.
Several features of the observed test-retest differences stand out. One feature is the decrease in mean DPOAE level over time (predominately negative shifts) in the youngest (<1 year old) group. There is also considerable variation among ears in their longitudinal test-retest differences. The spread of the test-retest differences depends on the f2 frequency and generally increases over time, which we propose could reflect the maturational changes primarily attributed in previous studies to anatomical development of the outer and middle ear systems. These features will also be influenced by imprecision in the measurement, which we argue could also have a developmental component. Where data are sparse, possibly spurious patterns emerge such as the observed increase in DPOAE level in the 3–5 year old group at 2 kHz.
The distribution of the test-retest variability shown in Fig. 2 is tabulated in Table 2, which provides the unadjusted mean DPOAE level shift and SEM for each age category and f2 frequency. DPOAE shifts are pooled into three-month windows of longitudinal observation. Though most SEMs are within 2–4.5 dB, few are <2 dB and there is some variation in the estimates with f2 frequency and age. Additionally, a systematic effect of test-retest interval is lacking, except perhaps for infants <1 yr. Note that these estimates are based on as few as 2 to 10 measurements for the longer time intervals to as many as 15 to 29 measurements for the shorter windows.
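The pooling of raw shifts into baseline-age groups and three-month follow-up windows can be sketched as below. The function names and record layout are hypothetical, and the handling of values that fall exactly on a boundary (for example, a follow-up at exactly 3 months or a baseline age of exactly 3 years) is an assumption, since the source does not state which bin such values were assigned to.

```python
from collections import defaultdict
import statistics


def age_group(baseline_age_years):
    """Assign a baseline age to one of the four groups used for display."""
    if baseline_age_years < 1:
        return "<1 yr"
    if baseline_age_years < 3:
        return "1-3 yrs"
    if baseline_age_years < 5:
        return "3-5 yrs"
    return "5+ yrs"


def followup_window(months_since_baseline):
    """Assign a follow-up interval to a three-month window (None if outside)."""
    if months_since_baseline <= 0 or months_since_baseline > 9:
        return None
    if months_since_baseline <= 3:
        return "0-3"
    if months_since_baseline <= 6:
        return "3-6"
    return "6-9"


def pool_shifts(records):
    """records: iterable of (baseline_age_years, months_since_baseline, shift_db).

    Returns {(age group, window): (n, mean shift)} for each non-empty cell.
    """
    cells = defaultdict(list)
    for age, months, shift in records:
        window = followup_window(months)
        if window is not None:
            cells[(age_group(age), window)].append(shift)
    return {key: (len(v), statistics.fmean(v)) for key, v in cells.items()}
```

Each cell of such a table then carries its own sample size, which makes the sparsity of the longer follow-up windows explicit.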
Table 2. Distribution of raw DPOAE level test-retest differences given by baseline age and f2.
Sample size (N), mean, standard error of the measurement (SEM), and standard error of the SEM (SESEM) of the observed test-retest level shifts (DPOAE level at follow up minus DPOAE level at baseline). For illustrative purposes, data are presented using the baseline age groups from Fig. 2 (<1 yr; 1–3 yrs; 3–5 yrs; 5+ yrs) for several test-retest intervals (0–3, 3–6 and 6–9 months post baseline).
| f2 Primary (Hz) | Baseline Age | 0–3 mo: n | 0–3 mo: MeanShift | 0–3 mo: SEM | 0–3 mo: SESEM | 3–6 mo: n | 3–6 mo: MeanShift | 3–6 mo: SEM | 3–6 mo: SESEM | 6–9 mo: n | 6–9 mo: MeanShift | 6–9 mo: SEM | 6–9 mo: SESEM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2063 | <1yr | 26 | −0.2 | 2.6 | 1.9 | 28 | −2.4 | 3.6 | 2.5 | 11 | −2.4 | 3.7 | 2.7 |
| 2063 | 1–3yrs | 28 | −0.8 | 3.3 | 2.3 | 39 | 0.6 | 3.2 | 2.3 | 10 | 0.1 | 3.3 | 2.4 |
| 2063 | 3–5yrs | 16 | −1.1 | 3.8 | 2.7 | 15 | 0.3 | 3.8 | 2.7 | 5 | 5.6 | 4.8 | 3.6 |
| 2063 | 5+yrs | 23 | −0.7 | 2.2 | 1.6 | 29 | −3.6 | 4.2 | 3.0 | 8 | −0.3 | 2.6 | 1.9 |
| 3000 | <1yr | 26 | −0.6 | 2.1 | 1.5 | 29 | −1.6 | 2.2 | 1.6 | 11 | −2.5 | 2.3 | 1.7 |
| 3000 | 1–3yrs | 31 | −1.4 | 2.5 | 1.8 | 40 | 0.2 | 3.7 | 2.6 | 12 | −1.0 | 3.1 | 2.2 |
| 3000 | 3–5yrs | 15 | −0.7 | 2.2 | 1.6 | 14 | 0.3 | 1.7 | 1.2 | 5 | 2.4 | 2.8 | 2.0 |
| 3000 | 5+yrs | 25 | 0.2 | 1.8 | 1.3 | 29 | −0.5 | 2.4 | 1.7 | 8 | −1.5 | 2.5 | 1.8 |
| 4219 | <1yr | 26 | −2.2 | 2.9 | 2.1 | 29 | −2.2 | 3.1 | 2.2 | 11 | −2.9 | 3.0 | 2.1 |
| 4219 | 1–3yrs | 31 | −0.4 | 3.4 | 2.4 | 39 | −0.1 | 2.9 | 2.1 | 12 | −0.8 | 1.9 | 1.3 |
| 4219 | 3–5yrs | 17 | −0.4 | 1.6 | 1.2 | 15 | −0.2 | 1.7 | 1.2 | 5 | 2.6 | 1.3 | 1.0 |
| 4219 | 5+yrs | 25 | 0.1 | 2.4 | 1.7 | 29 | 0.0 | 2.4 | 1.7 | 8 | −1.3 | 2.0 | 1.4 |
| 6000 | <1yr | 26 | −1.4 | 3.3 | 2.4 | 29 | 0.0 | 4.4 | 3.2 | 11 | −2.1 | 5.0 | 3.6 |
| 6000 | 1–3yrs | 31 | −0.5 | 2.6 | 1.8 | 41 | 0.7 | 3.0 | 2.1 | 12 | 0.3 | 2.2 | 1.6 |
| 6000 | 3–5yrs | 17 | −1.0 | 1.9 | 1.4 | 15 | −1.6 | 1.9 | 1.4 | 5 | −0.8 | 1.0 | 0.8 |
| 6000 | 5+yrs | 25 | −1.0 | 2.8 | 2.0 | 28 | −0.8 | 2.3 | 1.7 | 8 | −1.5 | 3.2 | 2.3 |
Table 3 shows the false-referral rates for test-retest differences measured up to 2 months from baseline. A false-referral is defined as an observed test-retest difference that is outside the adult meta-analytic reference interval indicated by the grey bars in Fig. 2. The children in this sample are presumed healthy with normal auditory systems. If the adult shift reference standards were suitable for children, we would therefore expect a false-referral rate of about 10%, consistent with the 90% shift reference intervals in Fig. 2. This is clearly not the case, with false referral rates about 2 to 3 times those expected across f2 frequencies in the <1 year category, and 2 to 5 times greater across all ages at a 2 kHz f2.
Table 3. False referral rates as a function of baseline age for several f2 frequencies.
False referral rates obtained by subjecting the observed test-retest differences to pass-fail criteria based on the 90% shift reference intervals derived from a meta-analysis of 10 studies in adults (Reavis et al. 2015). Results suggest that shift reference intervals need to be developed specifically for use with children and/or long test-retest intervals.
| f2 Primary (Hz) | Statistic | <1yr | 1–3yrs | 3–5yrs | 5+yrs |
|---|---|---|---|---|---|
| 2063 | % | 29% | 16% | 50% | 18% |
| 2063 | N | 17 | 19 | 12 | 17 |
| 4219 | % | 18% | 18% | 0% | 5% |
| 4219 | N | 17 | 22 | 13 | 19 |
| 6000 | % | 35% | 5% | 8% | 26% |
| 6000 | N | 17 | 22 | 13 | 19 |
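The false-referral calculation behind Table 3 can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function name `false_referral_rate`, the example shifts, and the symmetric ±5.63 dB limits are all assumptions chosen for demonstration (5.63 dB is simply the upper end of the adult meta-analytic range quoted in the Discussion, not a limit tied to a particular f2).

```python
def false_referral_rate(shifts_db, lower_db, upper_db):
    """Fraction of test-retest shifts that fall outside [lower_db, upper_db]."""
    if not shifts_db:
        raise ValueError("need at least one test-retest shift")
    outside = sum(1 for s in shifts_db if s < lower_db or s > upper_db)
    return outside / len(shifts_db)


# Hypothetical shifts in dB (follow-up minus baseline); NOT data from this study.
example_shifts = [-7.0, -1.2, 0.4, 6.1, -2.5, 3.0, -0.8, 1.9, -6.3, 0.0]

# Hypothetical symmetric 90% limits, used here only for illustration.
rate = false_referral_rate(example_shifts, -5.63, 5.63)
print(f"False-referral rate: {rate:.0%}")
```

In healthy ears a rate near 10% would indicate that the chosen 90% limits fit the population; rates well above that, as in Table 3, suggest the limits are too narrow.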
Results of the Statistical Model
Separating the true age-related changes in population mean DPOAE levels from measurement error and sampling error is a goal of the Bayesian statistical model. According to this model, age affects test-retest variability through three distinct sources: (1) the age effects on population mean DPOAE level; (2) child-specific variation in age-induced maturation; (3) and age-specific variation in measurement error. This approach was designed to allow us to gain insight into the processes by which age affects test-retest variability, which could be important for informing clinical test-retest standards in pediatric monitoring.
Mean DPOAE Level as a Function of Age
Figure 3 displays the longitudinal DPOAE levels and model-based population mean DPOAE level at each f2 as a function of age. Recall that the population mean DPOAE level is defined in equation (1). The thin lines are the raw ear-specific DPOAE levels measured in each participant, while the posterior distribution of the population mean DPOAE level is indicated by the vertical bars at each age where data were available. The posterior intervals are considerably narrower between about 6 and 48 months, indicating greater certainty about the population average DPOAE level, which also corresponds to the region of greatest data density. There are fewer data at the older ages, above about 75 months (also see Online Appendix section C), where the posterior intervals are wider than at the younger ages, reflecting greater uncertainty about the population mean DPOAE levels at these ages.
The model shown in Fig. 3 captures an overall age-related decrease in DPOAE level at most f2 frequencies. The most rapid decreases in DPOAE level occur within about the first year of life, but the rate of decrease varies among frequencies. At 6 kHz, the decrease is gradual and roughly constant with age, while at other frequencies, such as 2.5, 5 and 10 kHz the initial decrease is much more pronounced than at the later ages. In addition, the fitted mean DPOAE initial level varies by f2 with higher frequencies having an overall lower level compared to the other frequencies, consistent with the shape of the DP-grams shown in Fig. 1.
The maturation effects are shown in more detail in Figure 4. A difference was computed between the mean DPOAE level at 13 months and one month (Panel A) and between 48 and 36 months (Panel B) at each frequency, along with 95% confidence intervals. These differences correspond to the differences in the heights of the curves shown in Fig. 3 for the relevant ages and frequencies. Fig. 4A shows an estimated loss of about 3 to 4 dB of DPOAE level across most f2’s between one and 13 months of age. The exception is f2=6 kHz, which shows little to no drop in DPOAE level. The 95% confidence intervals cover about ± 1 dB change and the posterior interquartile range is within about ± 0.5 dB change, indicating considerable certainty about the magnitude of these effects. Fig. 4B shows that the posterior average difference in DPOAE level between 36 and 48 months of life is <1 dB across frequencies, which is smaller than during the first year of life. The narrow confidence intervals indicate even greater certainty about maturation effects over this age range compared with infants, which is consistent with the observation about data density in Fig. 3.
Figure 4. Maturational effects on DPOAE level are evident for a wide range of f2 frequencies, and are less pronounced in childhood than infancy.
Plot shows the median of the posterior distribution of the mean difference in estimated DPOAE level taken between specific ages indicated as points along the thick black line from Fig. 3. Panel A shows the contrast between 1 and 13 months; Panel B shows the contrast between 36 and 48 months of life. In both panels, results are plotted by f2. The thick and thin portions of the error bars indicate the corresponding 25th – 75th and 5th – 95th percentile ranges, respectively. The estimated maturational change during the first year of life is a 2–5 dB decrease in DPOAE level at most f2’s, with no change at 6 and 7 kHz. Panel B indicates less maturational change (1 dB/year) from 36 to 48 months of age and greater certainty in the estimate.
Mean DPOAE Measurement Error as a Function of Age
The model described in equation (1) includes maturation effects on the measurement error variance σ²(age) via the log-linear model ln(σ(age)) = ω0 + ω(age). The model-based measurement error variance is more easily understood when expressed as the reliability, which is defined as the average absolute difference between a pair of consecutive measurements taken on the same subject at the same test frequency. Because the data are modeled as Gaussian, the reliability can be estimated directly from σ(age) (McMillan & Hanson 2014), and it is plotted as a function of age for each frequency in Figure 5. These are model-based estimates of measurement error for two measurements taken without any interceding effects of age on mean DPOAE level. Note that larger values on the y-axis correspond to greater mean absolute test-retest differences (i.e. less reliable measurements).
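For reference, the Gaussian identity that links the error standard deviation to a mean absolute test-retest difference can be written out. This is a sketch under stated assumptions: two measurements Y1 and Y2 on the same ear at the same f2, with independent Gaussian errors of common standard deviation σ(age) and no change in the mean between them. The symbols Y1 and Y2 are introduced here for illustration and do not appear in equation (1).

```latex
Y_1 - Y_2 \sim \mathcal{N}\!\left(0,\; 2\sigma^2(\text{age})\right)
\quad\Longrightarrow\quad
\mathbb{E}\,\lvert Y_1 - Y_2 \rvert
  = \sqrt{\tfrac{2}{\pi}}\;\sqrt{2}\,\sigma(\text{age})
  = \frac{2\,\sigma(\text{age})}{\sqrt{\pi}}
  \approx 1.13\,\sigma(\text{age})
```

Under these assumptions, a mean absolute difference of 3 to 4 dB would correspond to an error standard deviation of roughly 2.7 to 3.5 dB.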
Figure 5. Modeled measurement error trajectories for DPOAE level recordings show substantial variation with f2 frequency but no consistent age effect.
Each panel is for a separate f2. Within a panel, the thick line is the fitted mean and the shaded area represents the corresponding 95% Bayesian confidence interval. Measurement error is an estimate of reliability similar to an immediate absolute test-retest difference. Larger error is indicative of poorer reliability. Measurement error is typically within 3–4 dB except at the low and high f2 frequency extremes, which are associated with poorer reliability, but pronounced age effects are not evident.
Aside from the two highest and two lowest test frequencies, DPOAE reliability appears fairly consistent across ages. There appears to be a small increase in reliability (lower values on the y-axis) with age at an f2 primary of 2.5 kHz, but the change is only from about 4 to 3 dB average absolute test-retest difference. Overall reliability varies with f2 primary, and most average absolute test-retest differences are between about 3 and 4 dB. The reliability at both low and high frequency extremes is poorer and less precisely estimated. As was noted for the mean DPOAE level (Fig. 3), reliability for older children is less precisely estimated than for younger children, likely due to sample size differences.
Variability in DPOAE Level Maturation among Children
The model in equation (1) allows a particular child to deviate from the age-specific population mean DPOAE level, and a dedicated model parameter governs maturational variability among children. As with measurement error, we can express variability among children at a given age as the average absolute difference in DPOAE growth per year of life, which is plotted in Figure 6 for each f2 primary. With the exception of 1.5 kHz, the rate of change in DPOAE level varies among children by <1 dB per year of life, and is less than ½ dB for most f2 primary frequencies. This is to say that, overall, all children of a given age mature at about the same rate give or take ¼ to ½ dB per year.
Figure 6. Model-based average absolute difference between children in DPOAE-level decrement per year of life.
The average absolute difference in the age effect on DPOAE level between children given by f2. Units are dB SPL per year of life. The small average absolute difference in the age effect on DPOAE level between children in the sample indicates a consistent maturational effect at most f2 frequencies. The estimated age effect is variable at an f2 of 1.5 kHz, but is otherwise highly consistent across children.
Model-based Test-retest Differences and 90% Shift Reference Limits
Table 4 provides model-based estimates of test-retest variability which are less volatile and presumably are more accurate as compared with the raw data shown in Table 2. Results are provided for a subset of f2s, test intervals and ages. From these data, shift reference intervals can be generated as described in the Introduction and shown as shaded regions in Figure 7. DPOAE level shifts outside these regions could be considered a “clinically significant DPOAE change” or “screen fail”, if observed in any new clinical patient as it would be expected to occur in only 10% of healthy children. Columns contrast model-based results for an average infant age 4 months (the sample mean of participants younger than 13 months) vs. model-based results for children age 24, 48 or 96 months old (2, 4, and 8 yrs). The estimated 90% shift reference limits are clearly f2-dependent; greater test-retest variability is present at lower and higher f2’s. The shift reference limits also depend on age, with the greatest contrast found for the 4 month old infant relative to each of the other ages examined. The modeled 4 month old had an absolute difference between the upper and lower limit that ranged from about 12.5 to 17 dB, depending on the f2 and follow-up interval. The clinical ramifications of Fig. 7 should be clear: a child showing a loss of about 7 dB of DPOAE level is entirely expected during the first year of life at most f2’s, but such a change is unusual for an 8 yr old child. Further, at 6 kHz a change as great as 7 dB would be unusual even for children as young as 2 yrs.
Table 4. Model-based mean and SEM of test-retest level shifts given by f2 and retest interval provided for a variety of ages.
Results are adjusted for correlations among multiple DPOAE measures in each participant and for child-specific variation in level. Thus they are expected to be more accurate than raw data presented in Table 2. These results can be used to generate shift reference intervals using calculations provided in the Introduction. Once validated using locally-developed norms, the associated reference intervals could be used to inform clinical testing.
| f2 Primary (Hz) | Baseline Age (in months) | 2 mo: MeanShift | 2 mo: MeanSEM | 2 mo: SESEM | 4 mo: MeanShift | 4 mo: MeanSEM | 4 mo: SESEM |
|---|---|---|---|---|---|---|---|
| 2063 | 4 | −0.7 | 3.2 | 0.3 | −1.3 | 3.2 | 0.2 |
| 2063 | 24 | −0.1 | 3.3 | 0.2 | −0.1 | 3.3 | 0.2 |
| 2063 | 48 | −0.1 | 3.7 | 0.3 | −0.1 | 3.7 | 0.3 |
| 2063 | 96 | −0.1 | 2.9 | 0.3 | −0.3 | 2.9 | 0.3 |
| 3000 | 4 | −0.7 | 2.7 | 0.2 | −1.3 | 2.7 | 0.2 |
| 3000 | 24 | −0.1 | 2.9 | 0.2 | −0.3 | 2.9 | 0.2 |
| 3000 | 48 | −0.1 | 3.0 | 0.2 | −0.2 | 3.0 | 0.2 |
| 3000 | 96 | −0.1 | 2.7 | 0.3 | −0.3 | 2.7 | 0.3 |
| 4219 | 4 | −0.8 | 3.5 | 0.3 | −1.4 | 3.6 | 0.3 |
| 4219 | 24 | −0.1 | 3.5 | 0.2 | −0.1 | 3.4 | 0.2 |
| 4219 | 48 | −0.1 | 2.9 | 0.2 | −0.1 | 2.9 | 0.2 |
| 4219 | 96 | −0.1 | 2.8 | 0.3 | −0.3 | 2.8 | 0.3 |
| 6000 | 4 | −0.4 | 3.6 | 0.3 | −0.7 | 3.5 | 0.3 |
| 6000 | 24 | 0.0 | 2.9 | 0.2 | 0.0 | 2.9 | 0.2 |
| 6000 | 48 | −0.1 | 3.0 | 0.2 | −0.1 | 3.0 | 0.2 |
| 6000 | 96 | −0.2 | 3.0 | 0.3 | −0.3 | 3.0 | 0.3 |
Figure 7. Model-based 90% shift reference limits for DPOAE level by observation interval compared for a variety of f2 frequencies and ages.
The 90% shift reference limits corresponding to a given age are represented by the shaded region, with solid or dashed lines indicating the mean shift. Rows indicate f2 given in Hz. Columns indicate the age contrast, with data shown for a modeled 4 month old infant (left column) and modeled 24, 48, and 96 month old children (overlaid in the right column and distinguished by color). The shift reference limits are clearly f2-dependent; greater test-retest variability is present at lower and higher f2’s. The shift reference limits are also age-dependent, with the greatest contrast found for the 4 month old infant relative to each of the other ages examined.
DISCUSSION
During cancer therapy, children are serially monitored using behavioral and objective measures of auditory function over a period of several months to one year. The present report provides data on the long-term variability of DPOAEs in healthy infants and children recorded under typical clinical conditions; identifies important factors influencing this variability, and offers shift reference intervals for a variety of ages, f2s and test-retest intervals. Such data should be useful for the interpretation of DPOAE findings in pediatric patients being monitored for cochlear damage associated with chemotherapies. For example, once validated by locally-developed normative data, the reference intervals provided herein could be used to determine screen fail criteria for serial monitoring using DPOAEs.
Major Findings
DPOAE test-retest differences observed in a cohort of healthy children show greater variability as compared with those reported previously for adults
Examining DPOAE test-retest data from quiet, cooperative young adults provides a benchmark against which pediatric data can be compared. To increase the accuracy of DPOAE test-retest variability estimates by increasing the sample size, Reavis et al. (2015) conducted a meta-analysis using data from 10 published studies in adults. The meta-analytic shift reference limits (shown in Fig. 2) ranged from +3.76 to +5.63 dB for f2 frequencies ranging from 1–6 kHz. Table 5 provides 90% reference intervals (calculated using the formulas provided in the Introduction) for 3 of the studies reported by Reavis et al., and for 2 additional studies conducted in adults not included in Reavis because they reported only the average test-retest difference. The 2 largest N studies in adults are from Beattie et al. (2003) and Ng and McPherson (2005). They yield 90% reference limits of +5.82 and +6.14 dB, respectively, for f2 frequencies ≤6 kHz. Variability increased at the highest frequency tested, 7 kHz, which had an associated reference limit of +7.07 dB (Ng & McPherson 2005). Reference limits for data reported by Roede et al. (1993) and Dreisbach et al. (2006) also increased with f2. For example, Dreisbach et al. (2006) reported reference limits of +7.24 and +12.39 dB for f2 frequencies of 2–8 and 10–16 kHz, respectively.
Table 5. Calculated 90% reference intervals from the archival literature.
Reference intervals based on SEM across trials in adults have been previously published (Reavis et al. 2015). Here we show reference intervals for 3 of the studies reported by Reavis et al. (the one with the longest test-retest interval and the two with the largest samples). Results are also shown for studies not included in Reavis because they reported only the average test-retest difference. This allows comparison with 2 additional studies conducted in adults as well as 3 studies in children as young as age 3 years. Larger reference intervals are observed at high frequencies and possibly for longer baseline to follow-up times. Variability in studies of children is at the upper range of adults even though infants and toddlers were not tested.
| Author | Mean Age (Range) yrs | Sample Size | F2 (Hz) | L1/L2 dB SPL | Follow-up Interval | SEM | MeanShift (SD) | 90% reference interval |
|---|---|---|---|---|---|---|---|---|
| ‡Franklin et al 1992 | 30 (19–44) | 12 | 1000–8000 | 65/65 | 4 weeks | 1.17 to 2.3 | | 2.72 to 5.35 |
| Roede et al 1993 | 26 | 12 | 1000–8000 | 55/55 | 6 weeks | | 2.9 (2.7) | 7.34 |
| | | | | 70/70 | | | 1.8 (1.8) | 4.76 |
| ‡Beattie et al 2003 | 24 (19–27) | 50 | 1000–4000 | 65/65 | 5–10 days | 2.5 | | 5.82 |
| ‡Ng & McPherson 2005 | 23 (19–35) | 35 | 1000–6000 | 70/70 | 20 min-15 days | 2.64 | | 6.14 |
| | | | 7000 | | | 3.04 | | 7.07 |
| Dreisbach et al 2006 | 22 (18–29) | 25 | 2000–8000 | 60/45; 60/50; 70/55 | 4–8 weeks | | 2.8 (2.7) | 7.24 |
| | | | 10000–16000 | | | | 5.15 (4.4) | 12.39 |
| Sockalingam et al 2007 | 9 (5–12) | 24 | 2500–7000 | 65/55 | 13–15 days | | 0.52 (2.87) to −1.57 (4.62) | 5.24 to 6.03 |
| | | | 10000 | | | | −0.56 (5.57) | 8.60 |
| *Conrad & Dreisbach 2011 | 4.7 (3–6) | 30 | 1000–6000 | 65/50 | 54 (7–277) days | | 0.2 (4.63) | 7.82 |
| | | | 7000–16000 | | | | −0.11 (6.02) | 9.79 |
| *Newman & Dreisbach 2012 | 10.7 (10–12) | 37 | 1000–6000 | 65/50 | 51 (7–460) days | | −0.35 (4.88) | 7.68 |
| | | | 7000–16000 | | | | 0.14 (5.31) | 8.59 |
The 90% reference interval is calculated from the SEM, or from the average difference between trials and its standard deviation: ±1.645 × √2 × SEM, or MeanShift ± 1.645 × SD.
‡ These studies represent 3 of the 10 included in the Reavis et al. (2015) meta-analysis. Readers can refer to that study for data on the remaining 7 adult studies.
* Studies were conducted using depth-compensated in-the-ear calibration.
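The tabulated limits in Table 5 can be checked against the footnote formulas with a short script. This is a sketch, not the authors' code; the function names are hypothetical. It also assumes, based only on the tabulated numbers, that the mean-and-SD entries report the upper limit (MeanShift + 1.645 × SD).

```python
import math

Z_90 = 1.645  # normal quantile for a two-sided 90% interval (from the footnote)


def limit_from_sem(sem_db):
    """Half-width of the 90% reference interval from an SEM: 1.645 * sqrt(2) * SEM."""
    return Z_90 * math.sqrt(2) * sem_db


def upper_limit_from_mean_sd(mean_shift_db, sd_db):
    """Upper 90% reference limit from a mean shift and its SD: mean + 1.645 * SD."""
    return mean_shift_db + Z_90 * sd_db


# Values copied from Table 5.
print(round(limit_from_sem(2.5), 2))                 # Beattie et al. 2003, SEM = 2.5
print(round(limit_from_sem(2.64), 2))                # Ng & McPherson 2005, SEM = 2.64
print(round(upper_limit_from_mean_sd(2.9, 2.7), 2))  # Roede et al. 1993, 55/55 dB SPL
print(round(upper_limit_from_mean_sd(2.8, 2.7), 2))  # Dreisbach et al. 2006, 2-8 kHz
```

The factor √2 appears in the SEM form because a test-retest difference combines the error of two measurements.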
Compared with most previous results in adults, DPOAE responses were more variable among the children tested in this study. Our results resemble those obtained in previous data in children at equivalent ages and f2’s (Table 5, bottom), although no previous reports include the younger ages tested here. Short-term variability data from Sockalingam et al. (2007) yielded 90% reference limits of +5.24 to +6.03 dB for f2 frequencies between 2.5–7 kHz and +8.60 dB at 10 kHz in children aged 5–12 years. Additionally, two published abstracts describe test-retest data in children collected over longer observation periods (four trials with an average interval of 50 days) using depth-compensated simulator SPL calibration techniques to minimize the effects of standing waves in the ear canal. Data yielded reference limits of +7.82 and +9.79 dB for children aged 3–6 years for f2 frequencies of 1–6 and 7–16 kHz, respectively (Conrad & Dreisbach 2011). Among children aged 10–12 years, reference limits were +7.68 and +8.59 dB for f2 frequencies of 1–6 and 7–16 kHz, respectively (Newman & Dreisbach 2012).
There is evidence for a rapid “maturational” change in DPOAE level during the first year of life, followed by a much more gradual change over the age span examined
Our model-based results showed that the largest decrements in mean DPOAE level within the first year were about 4 dB between 1.5 and 5 kHz and at 10 kHz; the smallest mean decrement was about 2 dB at 6 kHz, whereas the mean decrements at 7 and 8 kHz were around 3 dB. This pattern is similar to that observed in a previous study by Prieve and colleagues (1997) in which infants (as opposed to neonates) were compared with other age groups. Their sample of 196 participants ranged in cross-sectional age from 1 month to 29 yrs. The mean DPOAE levels among infants less than 1 yr were higher than all other age groups (older children, teens and adults) at 2 and 3 kHz. They were also slightly higher in children aged 1–3 yr compared to levels in young adults at 3, 5 and 6 kHz, consistent with a more gradual reduction in mean DPOAE level from childhood to adulthood as shown in other previous reports (Spektor et al. 1991; Lonsbury-Martin et al. 1994). Although these age-related differences are frequency dependent, Prieve and colleagues (1997) found them to be independent of stimulus level, which they argued was evidence for a non-cochlear origin of the change. In a separate study, a model of ear canal and middle ear transmission based on observed energy reflectance and DPOAEs, indicated that the forward flow of energy into the cochlea is reduced in infants compared with adults, yet the reverse flow of power into the ear canal is much greater in infants (Keefe & Abdala 2007). These effects and the developmental loss of DPOAE level were attributed in part to developmental changes in the tympanic membrane cross sectional area. Pairing DPOAE testing with a measure of middle ear reflectance could improve interpretability of DPOAE test-retest changes (middle vs. inner ear). Further, the use of improved calibration methods (e.g., depth compensated SPL or FPL) could minimize DPOAE variability related to maturational changes in ear canal size and shape, and probe placement.
There is substantial child-specific variability in overall DPOAE level, but the developmental pattern in DPOAE level is consistent across children
We and most previous studies note the large variability in DPOAE level across individuals of any particular age. Several studies have attributed this variability at least in part to the presence of spontaneous otoacoustic emissions (SOAEs) located near the DPOAE frequency, which can significantly impact the evoked emission level (e.g. Prieve et al. 1997a,b). A similar developmental pattern of DPOAE and TEOAE changes was observed by Prieve and colleagues whether or not an infant or child had SOAEs. Our results also indicate that the developmental DPOAE trajectory is highly consistent across children.
Measurement error varies markedly with f2 frequency
The reliability data in the present study are estimates of measurement error at a given test (as opposed to test-retest differences). We find that DPOAEs are less reliable at low and high f2 frequency extremes (~1.5, 8, and 10 kHz), consistent with other studies in both children and adults (Franklin et al. 1992; Dreisbach et al. 2006; Sockalingam et al. 2007). Poorer reliability at higher f2’s may be related to variability in probe placement due to limitations in current calibration techniques for correcting standing wave effects as already described (Dreisbach & Siegel 2001; Beattie et al. 2003). Poorer reliability at lower f2’s has been attributed to increased participant noise and/or a lower response level leading to poorer signal to noise ratio (SNR) (Beattie & Bleech 2000; Thorson et al. 2012). Measurement error for f2’s between 2 and 7 kHz is generally between 3 and 4 dB among the children in this study, who were tested in a quiet area of the clinic. This is greater than measurement error typically reported for adults. For comparison, the mean meta-analytic measurement error in adults was <2 dB for f2 ranging from 1 to 6 kHz (see the intercept estimate in Table 2 of Reavis et al. 2015). We considered that some excess measurement error might be age-related, but the present data did not show a consistent trend across childhood age.
Clinical Application of Results
Pediatric ototoxicity monitoring and the case for objective metrics of ototoxicity
Cisplatin, used in the treatment of pediatric medulloblastoma, hepatoblastoma, osteosarcoma, neuroblastoma and germ cell tumors, is associated with ototoxicity rates as high as 91% in the extended high frequencies (> 8 kHz) and 60% in the conventionally-tested (≤ 8 kHz) range in children (Gilmer et al. 2005; Knight et al. 2007; Brock et al. 2012). Carboplatin, used in the treatment of lower risk neuroblastoma, brain tumors, germ cell tumors, retinoblastoma and optic gliomas, is less toxic with reported ototoxicity rates of 5–20% (Musial-Bright et al. 2011; Qaddoumi et al. 2012; Peleva et al. 2014); however, rates climb to 70–90% when co-administered with cisplatin (Kushner et al. 2006; Landier et al. 2014). The high frequency hearing loss associated with platinum ototoxicity generally increases in severity and spreads to lower frequencies with increasing cumulative dose (Blakely & Myers 1993; Dille et al. 2012). Detecting ototoxicity early can provide an opportunity to consider other less toxic chemotherapies or to adjust drug exposure before hearing loss becomes debilitating. Further, when treatment must continue, knowledge of a child’s hearing loss can enable timely provision of aural rehabilitation to help offset adverse effects on speech and language development or learning in general.
The American Speech-Language-Hearing Association (ASHA) has developed criteria for early identification of ototoxic hearing change based on shifts in the pure tone audiogram relative to a baseline (pre-exposure) test (ASHA 1994; AAA 2009). Pediatric oncologists often characterize ototoxicity using a numeric grading system based on hearing shifts following one of several approaches, including the National Cancer Institute Common Terminology Criteria for Adverse Events, CTCAE, (NCI 2010), the International Society of Pediatric Oncology Ototoxicity Scale (SIOP), the Brock grades and the Chang grading system (Chang 2011; Brock et al. 2012; Gurney & Bass 2012). All of these approaches are based on pure tone audiometry, which represents an important clinical challenge because it can be difficult to obtain reliable ear-specific hearing thresholds in children at multiple frequencies. Barriers faced by audiologists include limited access to these young patients considering all of their other competing medical appointments, fearfulness of the clinic or hospital setting during cancer treatment which can compound a child’s already limited ability to cooperate, and deterioration in the child’s health. ABR and DPOAE testing are important components of the pediatric ototoxicity monitoring evaluation because they mitigate some of these barriers. ABR threshold data are sometimes used to extrapolate pure tone thresholds for use in grading ototoxicity using one or more of the grading scales described above (Chang 2011). However, ABR has the disadvantage of being a lengthy test often requiring sedation to achieve successful results with sufficient detail. DPOAE testing has the advantage of measuring physiologic function of the sensory cells sustaining initial ototoxic damage and can be used to screen for changes in cochlear function relatively quickly with minimal patient cooperation (Hellberg et al. 2009). 
A 2-stage approach is commonly used: DPOAE testing screens for cochlear changes, and a (sedated) ABR follows only if emissions testing results in a screen failure. This approach increases efficiency, reduces unnecessary sedation and is generally well tolerated by children.
Clinical use of DPOAE shift reference intervals
In contrast to ABR, there are no broadly accepted criteria for identifying a clinically significant DPOAE change. Approaches have been proposed as described in the Introduction that are based on the distribution of test-retest differences (i.e., variability in level shifts) among healthy controls (Helleman & Dreschler 2012; Reavis et al. 2015). A caveat for use is that the statistics assume that there is a single value for comparisons across serial measures. Ototoxicity monitoring protocols often involve collecting DPOAEs using fine f2 steps and a range of levels if time permits because this is useful for interpreting results. Thus an a priori decision should be made about which frequency or bands of frequencies will be tacitly used for screening purposes; otherwise the false alarm rate will be too high (Konrad-Martin et al. 2016). Certainly, high f2s will be the most sensitive to ototoxic damage (Ress et al. 1999; Reavis et al. 2008; Dille et al. 2010; Reavis et al. 2011). Our results suggest that choosing 6 kHz as a “sentinel” frequency may be fruitful as this is the highest f2 that showed minimal maturational effects and measurement error.
We provide model-based means and SEMs in Table 4 from which any shift reference intervals can be constructed. Reference limits can be constructed for both negative (emission decrement) and positive (emission enhancement) shifts as in Fig. 7, which would be preferable for infants in whom there is a systematic drop in the mean DPOAE over time. From these data it can be seen that separate reference intervals are most critical for infants.
Fig. 7 can be used to visualize how 90% shift reference intervals might be used clinically. Consider a hypothetical case in which a two month follow-up exam revealed a number of small changes compared to baseline. Imagine an a priori decision had been made to use an f2 of 6 kHz for screening purposes because (1) the patient had robust DPOAEs near this frequency prior to embarking on a regimen of cisplatin chemotherapy, and (2) the audiologist, oncologist and parent prioritized identifying cochlear changes at test frequencies that are sensitive to ototoxicity, reliable, and important for speech and language development. Otoscopy and tympanometry findings were consistently normal, bilaterally, for this patient. At the monitor test, however, the DPOAE level at 6 kHz was 5 dB SPL in the right ear, a decrease from 12 dB SPL at baseline. If this hypothetical patient is a 4 month old infant, this 7 dB DPOAE level decrease would fall inside the DPOAE level shift reference interval appropriate for the patient’s age at baseline (4 months, which is shown in the left column), the f2 tested (6 kHz, which is shown in the bottom row) and the follow-up interval (2 months, which corresponds to the left end of the x-axis). In the absence of other compelling data, this change would not be considered a screen fail. In contrast, a change of 7 dB at an f2 of 6 kHz would constitute a screen fail for a patient who is 24 months old at baseline. It would fall just outside the DPOAE level shift reference interval representing a baseline age of 24 months (right column, pink shading), a 6 kHz f2 (bottom row) and a 2 month follow-up interval (left end of the x-axis). Notably, the pink shading in the lower right panel is not visible because it covers roughly the same range of test-retest differences as the blue and brown shading; the reference intervals for patients who are 24, 48 or 96 months old at baseline are roughly equivalent at this f2.
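The hypothetical case above can be reproduced numerically from Table 4. The sketch below is illustrative rather than a clinical tool. It assumes the 90% shift reference interval is centered on the model-based mean shift, with half-width 1.645 × √2 × SEM as in the Table 5 footnote. The function names are hypothetical.

```python
import math


def shift_reference_interval(mean_shift_db, sem_db, z=1.645):
    """Return (lower, upper) 90% shift reference limits in dB.

    Assumes limits of mean_shift +/- z * sqrt(2) * SEM.
    """
    half_width = z * math.sqrt(2) * sem_db
    return mean_shift_db - half_width, mean_shift_db + half_width


def is_screen_fail(observed_shift_db, mean_shift_db, sem_db):
    """True when the observed shift falls outside the reference interval."""
    lower, upper = shift_reference_interval(mean_shift_db, sem_db)
    return observed_shift_db < lower or observed_shift_db > upper


# Hypothetical patient: 12 dB SPL at baseline, 5 dB SPL at the 2-month follow-up.
observed = 5.0 - 12.0  # follow-up minus baseline = -7 dB

# Table 4, f2 = 6000 Hz, 2-month interval: (MeanShift, MeanSEM) by baseline age.
infant_4_months = (-0.4, 3.6)
child_24_months = (0.0, 2.9)

print(shift_reference_interval(*infant_4_months))
print(is_screen_fail(observed, *infant_4_months))  # 4-month-old at baseline
print(is_screen_fail(observed, *child_24_months))  # 24-month-old at baseline
```

With these inputs the 7 dB decrease lies inside the interval for the 4-month-old and just outside it for the 24-month-old, in line with the reading of Fig. 7 described above.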
An important caveat to the use of shift reference intervals based on this or other reports is that DPOAEs would need to be elicited using similar stimulus levels and frequencies on equipment with comparable performance. Further, we advocate an approach in which each clinician establishes a local test-retest normative data set. Confidence in published 90% shift reference limits can be established if shifts exceeding those limits are present in about 10% of controls in the clinician’s local test-retest normative data set. Local false referral rates much greater than 10% indicate either that changes are needed to reduce variability (quieter setting, more precise probe placement technique) or that wider reference limits are needed. Conversely, observed false referral rates much less than 10% suggest better reliability than that found in this study.
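The local check described above amounts to counting how often control shifts fall outside the published limits. The following Python sketch shows one way this might be done. The function name, the list of control shifts, and the limits of about ±6.58 dB are hypothetical and for illustration only; they are not data from this study.

```python
def false_referral_rate(control_shifts_db, lower_db, upper_db):
    """Fraction of control test-retest shifts falling outside [lower_db, upper_db]."""
    if not control_shifts_db:
        raise ValueError("need at least one control shift")
    outside = sum(1 for s in control_shifts_db if s < lower_db or s > upper_db)
    return outside / len(control_shifts_db)


# Hypothetical local control data: follow-up minus baseline DPOAE level (dB)
# at a single f2 in 20 healthy control ears.
local_control_shifts = [
    -1.5, 0.5, 2.0, -3.0, 7.2, 1.0, -0.5, 3.5, -2.0, 0.0,
    4.0, -8.1, 1.5, -1.0, 2.5, -4.5, 0.8, -2.7, 3.1, -0.2,
]

# Hypothetical 90% limits; in practice these would come from a published
# reference interval matched on age, f2 and follow-up interval.
rate = false_referral_rate(local_control_shifts, lower_db=-6.58, upper_db=6.58)
print(f"local false referral rate: {rate:.0%}")
```

A rate near 10% would be consistent with the published 90% limits. With a small local sample the estimated rate is itself imprecise, so modest departures from 10% should be interpreted cautiously.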
Although our data provide an important point of reference for pediatric audiologists, the sample is small. Additional studies estimating test-retest variability among children are needed. Ideally, future studies would report raw estimates of the sample average and variation, such as those presented in Table 2, so that results can be synthesized in a meta-analysis that yields a more precise shift reference interval than any single study could. Our approach for avoiding confounds associated with a small sample was to develop a statistical model of the DPOAEs based on Bayesian multiple linear regression with baseline age and other influential factors, as described in the Methods. This was an efficient use of each child’s data because it allowed us to include all available measures on each participant, even if some time points were missing. It also maximized the accuracy of estimates for several reasons. First, it allowed us to avoid binning recordings across time points, which could blur longitudinal changes. Second, by including participant- and ear-specific random effects, we avoided bias induced by unusually large or small DPOAE recordings. These random effects also allowed us to model (and adjust for) the correlation among repeated measurements on each participant across ears, f2s, and time points. Finally, all of the data inform each estimate. Model fit was good, as determined by examining residual plots and by comparing the fitted models to the observed data. Furthermore, DPOAE results in the present study are consistent with the literature for a similar range of ages and test-retest intervals (see comparison with prior studies above).
Conclusions
DPOAE levels decrease with childhood age, with the greatest changes observed in the first year of life. Maturational effects during infancy and greater measurement error at very low and high f2s both affect test-retest variability in children. An f2 of 6 kHz shows minimal maturation and measurement error, suggesting it may be an optimal sentinel frequency for ototoxicity monitoring in pediatric patients. Once validated by locally developed normative data, the reference intervals provided herein could be used to determine screen fail criteria for serial monitoring using DPOAEs. Employing state-of-the-art calibration techniques might reduce variability, allowing for more sensitive screen fail criteria.
Acknowledgments
Conflicts of Interest and Source of Funding: Drs. Konrad-Martin, McMillan and Dille are paid in part by a grant from the U.S. Department of Veterans Affairs, Office of Rehabilitation Research & Development (RR&D) Service (C0239R). The remaining authors report no conflicts of interest. K.K. designed and performed experiments and wrote the paper. D.K. analyzed data and wrote the paper. G.P.M. performed statistical analysis and wrote the paper. L.E.D. wrote the paper. E.N. performed experiments and edited the paper. M.D. wrote the paper. Portions of this work were presented at the Annual Scientific and Technology Conference of the American Auditory Society (AAS), Scottsdale, AZ, March 2014. The contents of this report do not represent the views of the U.S. government.
REFERENCES
- Abdala C, Keefe DH, Oba SI (2007). Distortion product otoacoustic emission suppression tuning and acoustic admittance in human infants: birth through 6 months. J Acoust Soc Am, 121, 3617–3627. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Abdala C, Oba SI, Ramanathan R (2008). Changes in the DP-gram during the preterm and early postnatal period. Ear Hear, 29, 512–523. [DOI] [PMC free article] [PubMed] [Google Scholar]
- American Academy of Audiology. (2009). American Academy of Audiology’s Position Statement and Practice Guidelines on Ototoxicity Monitoring. American Academy of Audiology, 1–25. [Google Scholar]
- American Speech-Language-Hearing Association. (1994). Guidelines for the audiologic management of individuals receiving cochleotoxic drug therapy. ASHA, 36 (Suppl 12), 11–19. [Google Scholar]
- Bao J, Hanson T, McMillan GP, Knight K (2017). Assessment of DPOAE test-retest difference curves via hierarchical Gaussian processes. Biometrics, 73 (1), 334–343. [DOI] [PubMed] [Google Scholar]
- Beattie RC, & Bleech J (2000). Effects of sample size on the reliability of noise floor and DPOAE. Br J Audiol, 34, 305–309. [DOI] [PubMed] [Google Scholar]
- Beattie RC, Kenworthy OT, Luna CA (2003). Immediate and short-term reliability of distortion-product otoacoustic emissions. Int J Audiol, 42, 348–354. [DOI] [PubMed] [Google Scholar]
- Blakely C, Myers SF (1993). Patterns of hearing loss resulting from cis-platin therapy. Otolaryngol Head Neck Surg, 109, 385–391. [DOI] [PubMed] [Google Scholar]
- Brock PR, Knight KR, Freyer DR, et al. (2012). Platinum-induced ototoxicity in children: A consensus review on mechanisms, predisposition, and protection, including a new International Society of Pediatric Oncology Boston Ototoxicity Scale. J Clin Oncol, 30, 2408–2417. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Chang KW (2011). Clinically accurate assessment and grading of ototoxicity. Laryngo, 121, 2649–2657. [DOI] [PubMed] [Google Scholar]
- Conrad A, & Dreisbach L (2011). Repeatability of high-frequency DPOAE measures in normal-hearing children American Auditory Society Abstracts, 36, 42. [Google Scholar]
- Demorest ME, & Walden BE (1984). Psychometric principles in the selection, interpretation, and evaluation of communication self-assessment inventories. J Speech Hear Dis, 49, 226–240. [DOI] [PubMed] [Google Scholar]
- Dille MF, Konrad-Martin D, Gallun F, et al. (2010). Tinnitus onset rates from chemotherapeutic agents and ototoxic antibiotics: Results of a large prospective study. J Am Acad Audiol, 21, 409–417. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dille MF, Wilmington D, Mcmillan GP et al. (2012). Development and validation of cisplatin dose-ototoxicity risk for adult cancer patients. J Am Acad Audiol, 23, 510–521. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Dreisbach LE, & Siegel JH (2001). Distortion-product otoacoustic emissions measured at high frequencies in humans. J Acoust Soc Am, 110, 2456–2469. [DOI] [PubMed] [Google Scholar]
- Dreisbach LE, Long KM, Lees SE (2006). Repeatability of high-frequency distortion-product otoacoustic emissions in normal-hearing adults. Ear Hear, 27, 466–479. [DOI] [PubMed] [Google Scholar]
- Franklin D, McCoy M, Martin G, et al. (1992). Test/retest reliability of distortion-product and transiently evoked otoacoustic emissions. Ear Hear, 13, 417–429. [DOI] [PubMed] [Google Scholar]
- Gelman A, Carlin JB, Stern HS, et al. Bayesian Data Analysis (3rd ed.). Boca Raton, FL: Chapman and Hall/CRC [Google Scholar]
- Gilmer Knight KR, Kraemer DF, Neuwelt EA (2005). Ototoxicity in children receiving platinum chemotherapy: Underestimating a commonly occurring toxicity that may influence academic and social development. J Clin Oncol, 23, 8588–8596. [DOI] [PubMed] [Google Scholar]
- Gorga M, Neely S, Ohlrich B, et al. (1997). From laboratory to clinic: A large scale study of distortion product otoacoustic emissions in ears with normal hearing and ears with hearing loss. Ear Hear, 19, 440–455. [DOI] [PubMed] [Google Scholar]
- Gorga MP, Norton SJ, Sininger YS, et al. (2000). Identification of neonatal hearing impairment: distortion product otoacoustic emissions during the perinatal period. Ear Hear, 21, 400–424. [DOI] [PubMed] [Google Scholar]
- Groh D, Pelanova J, Jilek M, et al. (2006). Changes in otoacoustic emissions and high-frequency hearing thresholds in children and adolescents. Hear Res, 212, 90–98. [DOI] [PubMed] [Google Scholar]
- Gurney JG, & Bass JK (2012). New International Society of Pediatric Oncology Boston Ototoxicity Grading Scale for pediatric oncology: still room for improvement. J Clin Oncol, 30, 2303–2306. [DOI] [PubMed] [Google Scholar]
- Hellberg V, Wallin I, Eriksson S, et al. (2009). Cisplatin and oxaliplatin toxicity: Importance of cochlear kinetics as a determinant for ototoxicity. J Natl Cancer Inst, 101, 37–47. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Helleman HW, & Dreschler WA (2012). Overall versus individual changes for otoacoustic emissions and audiometry in a noise-exposed cohort. Intl J Aud, 51, 362–372. [DOI] [PubMed] [Google Scholar]
- Keefe DH, & Abdala C (2007). Theory of forward and reverse middle-ear transmission applied to otoacoustic emissions in infant and adult ears. J Acoust Soc Am, 121, 978–993. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Kemp DT, & Brown AM (1983). An integrated view of cochlear mechanical nonlinearities observable from the ear canal In: de Boer E, Viergever MA (eds), Cochlear Mechanics, (pp. 75–82). The Hague, The Netherlands: Martinus Nijhoff. [Google Scholar]
- Kim DO, Molnar CE, Matthews JW (1980). Cochlear mechanics: nonlinear behavior in two-tone responses as reflected in cochlear-nerve-fiber responses and in ear-canal sound pressure. J Acoust Soc Am, 67, 1704–1721. [DOI] [PubMed] [Google Scholar]
- Knight KR, Kraemer DF, Winter C, et al. (2007). Early changes in auditory function as a result of platinum chemotherapy: use of extended high-frequency audiometry and evoked distortion product otoacoustic emissions. J Clin Oncol, 25, 1190–1195. [DOI] [PubMed] [Google Scholar]
- Kon K, Inagaki M, Kaga M (2000). Developmental changes of distortion product and transient evoked otoacoustic emissions in different age groups. Brain Develop, 22, 41–46. [DOI] [PubMed] [Google Scholar]
- Konrad-Martin D, Poling GL, Dreisbach LE, et al. (2016). Serial monitoring of otoacoustic emissions in clinical trials. Otology and Neurotology, 37(8):e286–e294. [DOI] [PubMed] [Google Scholar]
- Kruger B (1987) An update on the external ear resonance in infants and young children. Ear Hear, 8 (6), 333–336. [DOI] [PubMed] [Google Scholar]
- Kushner BH, Budnick A, Kramer K, et al. (2006). Ototoxicity from high-dose use of platinum compounds in patients with neuroblastoma. Cancer 107, 417–422. [DOI] [PubMed] [Google Scholar]
- Landier W, Knight K, Wong FL, et al. (2014) Ototoxicity in children with high-risk neuroblastoma: prevalence, risk factors, and concordance of grading scales – a report from the Children’s Oncology Group. J Clin Oncol, 32, 527–534. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lonsbury-Martin BL, Martin GK, McCoy MJ, et al. (1994). Otoacoustic emissions testing in young children: Middle-ear influences. Amer J Otol, 15, 13–20. [Google Scholar]
- Martin GK, Stagner BB, Chung YS, et al. (2011). Characterizing distortion-product otoacoustic emission components across four species. J Acoust Soc Am, 129, 3090–3103. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McMillan GP, Reavis KM, Konrad-Martin D, et al. (2013). The statistical basis for serial monitoring in audiology. Ear Hear, 34, 610–618. [DOI] [PMC free article] [PubMed] [Google Scholar]
- McMillan GP, & Hanson TE (2014). Sample size requirements for establishing clinical test-retest standards. Ear Hear, 35, 283–286. [DOI] [PubMed] [Google Scholar]
- Musial-Bright L, Fengler R, Henze G, et al. (2011). Carboplatin and ototoxicity: hearing loss rates among survivors of childhood medulloblastoma. Childs Nerv Syst, 27, 407–413. [DOI] [PubMed] [Google Scholar]
- National Cancer Institute. (2010). Common terminology criteria for adverse events (CTCAE). In: Health NIo, ed., National Institutes of Health. [Google Scholar]
- Newman S, & Dreisbach LE (2012). Repeatability of high-frequency behavioral and DPOAE measures in normal-hearing children. American Auditory Society Abstracts, 37, 57. [Google Scholar]
- Pan H, & Goldstein H (1997). Multi-level models for longitudinal growth norms. Stats in Med, 16, 2665–2678. [DOI] [PubMed] [Google Scholar]
- Peleva E, Emami N, Alzahrani M, et al. (2014). Incidence of platinum-induced ototoxicity in pediatric patients in Quebec. Pediatr Blood Cancer, 61, 2012–2017. [DOI] [PubMed] [Google Scholar]
- Prieve BA, Fitzgerald TS, Schulte LE (1997). Basic characteristics of click-evoked otoacoustic emissions in infants and children. J Acoust Soc Amer, 102, 2860–2870. [DOI] [PubMed] [Google Scholar]
- Prieve BA, Fitzgerald TS, Schulte LE, et al. (1997). Basic characteristics of distortion product otoacoustic emissions in infants and children. J Acoust Soc Amer, 102, 2871–2879. [DOI] [PubMed] [Google Scholar]
- Prieve BA, Calandruccio L, Fitzgerald T, et al. (2008). Changes in transient evoked otoacoustic emission levels with negative tympanometric peak pressure in infants and toddlers. Ear Hear, 29, 533–542. [DOI] [PubMed] [Google Scholar]
- Qaddoumi I, Bass JK, Wu J, et al. (2012). Carboplatin-associated ototoxicity in children with retinoblastoma. J Clin Oncol, 30, 1034–1041. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Reavis KM, Phillips DS, Fausti SA, et al. (2008). Factors affecting sensitivity of distortion-product otoacoustic emissions to ototoxic hearing loss. Ear Hear, 29, 875–893. [DOI] [PubMed] [Google Scholar]
- Reavis KM, McMillan G, Austin D, et al. (2011). Distortion-product otoacoustic emission test performance for ototoxicity monitoring. Ear Hear, 31, 61–74. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Reavis KM, McMillan GP, Dille MF, et al. (2015). Meta-analysis of distortion product otoacoustic retest variability for serial monitoring of cochlear function in adults. Ear Hear, 36, 251–260. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ress BD, Sridhar KS, Balkany TJ, et al. (1999). Effects of Cis-platin chemotherapy on otoacoustic emissions: The development of an objective screening protocol. Otolaryngol Head Neck Surg, 121, 693–701. [DOI] [PubMed] [Google Scholar]
- Royston P (1995). Calculation of unconditional and conditional reference intervals for fetal size and growth from longitudinal measurements. Statistics in Medicine, 14, 1417–1436. [DOI] [PubMed] [Google Scholar]
- Sockalingam R, Lee Choi J, Choi D, et al. (2007). Test-retest reliability of distortion-product otoacoustic emissions in children with normal hearing: a preliminary study. Intl Jnl Aud, 46, 351–354. [DOI] [PubMed] [Google Scholar]
- Souza NN, Dhar S, Neely ST et al. (2014). Comparison of nine methods to estimate ear canal stimulus levels. J Acoust Soc Am, 136(4), 1768. doi: 10.1121/1.4894787 [DOI] [PMC free article] [PubMed] [Google Scholar]
- Spektor Z, Leonard G, Kim DO, et al. (1991). Otoacoustic emissions in normal and hearing-impaired children and normal adults. Laryngoscope, 101, 965–976. [DOI] [PubMed] [Google Scholar]
- Stavroulaki P, Apostolopoulos N, Dinopoulou D, et al. (1999). Otoacoustic emissions--an approach for monitoring aminoglycoside induced ototoxicity in children. Int Jnl Pediatr Otorhinolaryngol, 50, 177–184. [DOI] [PubMed] [Google Scholar]
- Talmadge CL, Tubis A, Long GR, et al. (2000). Modeling the combined effects of basilar membrane nonlinearity and roughness on stimulus frequency otoacoustic emission fine structure. J Acoust Soc Amer, 108, 2911–2932. [DOI] [PubMed] [Google Scholar]
- Thorson MJ, Kopun JG, Neely ST, et al. (2012). Reliability of distortion-product otoacoustic emissions and their relation to loudness. J Acoust Soc Amer, 131, 1282–1295. [DOI] [PMC free article] [PubMed] [Google Scholar]