
This is a preprint.

It has not yet been peer reviewed by a journal.


[Preprint]. 2025 May 21:2025.05.19.654952. [Version 1] doi: 10.1101/2025.05.19.654952

Introductory notes before zebra finch song have unique timing properties while sharing acoustic properties with song

Divya Rao 1, Raghav Rajan 1,2
PMCID: PMC12139792  PMID: 40475594

Abstract

Preparatory neural activity precedes the initiation of simple movements and a key feature of this preparatory activity is its trial-by-trial correlation with features of the upcoming movement. Recent studies in the zebra finch, a songbird with a complex, naturally learned, movement sequence (song), have suggested that the repeats of short introductory notes (INs) at the start of each song bout, reflect motor preparation. However, whether IN properties correlate with upcoming song features remains poorly understood. Here, we addressed this question by recording and analyzing male zebra finch songs over a 3 year period. We found bout-to-bout correlations in the acoustic features of the last IN and the first song syllable. However, similar correlations were present between the first song syllable and the first IN and the first and second song syllable suggesting that INs are also part of the song sequence. Next, we found an age-related increase in the mean IN number before song and song tempo. If INs reflected preparation of song parameters, we expected age-related song changes to be predicted by IN-song correlations at a younger age. We did not find any such correlations. Finally, we compared INs to other repeated syllables within song and outside song bouts and found that the speeding up of intervals between successive INs is unique to INs. Overall our results showing similarities in the acoustic features of INs and song syllables suggest shared neural control of INs and song syllables, while differences in timing suggest different neural mechanisms controlling IN timing.

Significance Statement

Simple movements are believed to be “prepared” in the brain before execution and this preparatory neural activity is correlated with features of the upcoming movement. Recent studies have suggested that the short introductory notes before the complex song sequence of the zebra finch reflect motor preparation. Whether correlations exist for introductory notes and upcoming song remains poorly understood. Here we found bout-to-bout correlations between the acoustic properties of introductory notes and the first song syllable but our analyses suggest that this reflects shared neural control of the acoustics of both introductory notes and song syllables. We also found differences in the timing of introductory notes suggesting different neural mechanisms for controlling introductory note timing.

Introduction

How does the brain initiate a movement? Current research suggests a “preparatory” period before movement initiation when neural activity converges on a consistent initial state (Churchland et al., 2006b, 2010; Shenoy et al., 2011; Li et al., 2016; Svoboda and Li, 2018). Support for this hypothesis comes from delayed reaching tasks where subjects (humans or other animals) are trained to execute simple movements like reaching for an object. At the start of a trial, an instruction stimulus is provided that tells subjects where to move, but subjects are expected to withhold their movements until a “GO” cue is provided. The “GO” cue is presented after a variable delay period from presentation of the instruction stimulus. The reaction time, measured as time between GO cue presentation and movement initiation, is shorter when the delay period is longer suggesting the need for a time to “prepare” the movement (Rosenbaum, 1980; Riehle and Requin, 1989; Churchland et al., 2006b). Premotor neural activity, recorded during this task, shows changes during the delay period (Churchland et al., 2006b; Guo et al., 2014) and an important feature of delay period neural activity is a reduction in variability across trials; more variable activity at the beginning of the delay period to less variable activity just before movement onset (Fig. 1A). Thus, for simple reaching movements, the brain appears to prepare by bringing premotor activity to a consistent initial state from which the correct patterns of movement related activity are produced immediately after the GO cue (Shenoy et al., 2011; Svoboda and Li, 2018).

Fig. 1. Preparatory activity and its relationship to upcoming movement.


(A) Simple movement in monkeys: movement of finger from a central fixation point to a target as part of delayed reaching tasks. The neural activity on each trial (represented by the black traces), during the preparatory period, converges on a consistent point before movement onset. (B) Zebra finch song is an example of a complex, naturally learned movement. ‘i’s represent Introductory Notes (INs), ‘ab’ represents the song sequence. Properties of INs converge on a consistent state before the onset of first song syllable ‘a’. (C) If INs represent preparation for song, properties of INs should be correlated to properties of upcoming song. ‘iiiab’ represents the IN-SONG sequence, and the size of letters represents features that are correlated within a day. (D) If INs represent preparation for song, age-related changes to INs and SONGs should be predicted by within-day correlations.

What is the nature of motor preparation before more complex, naturally learned, motor sequences? A well-studied example of a complex, naturally learned, motor sequence is the song of the adult male zebra finch, a songbird (Fee and Scharff, 2010). Song is a sequence of sounds interleaved by silent gaps (Fig. 1B, ‘a’ and ‘b’ represent the song sequence). It is part of the courtship ritual of the male zebra finch (Sossinka and Böhner, 1980; Zann, 1996; Fee and Scharff, 2010) and is naturally learned by young birds (Immelmann, 1969; Price, 1979; Zann, 1996). Song bouts typically begin with a variable number of repetitions of a short sound called an introductory note (IN marked as ‘i’ in Fig. 1B) (Price, 1979; Sossinka and Böhner, 1980). As INs repeat, the timing and acoustic features change; intervals between successive INs become shorter and more stereotyped and the acoustic features reach a stereotyped “state” just before the start of the first song (Rajan and Doupe, 2013). This progression is unaffected immediately after deafening or peripheral nerve cuts (Rao et al., 2019). This sensory-feedback independent progression from a variable first IN to a stereotyped last IN is similar to changes in delay-period neural activity suggesting that INs reflect motor preparation for song initiation.

An important feature of neural preparatory activity is its correlation with features of the upcoming movement. In primates, preparatory neural activity is different for different movements; for example, preparatory neural activity occupies different initial states for fast and slow reaching movements. It also correlates with movement parameters on a trial-by-trial basis; within all fast reaching movements, preparatory neural activity is correlated with movement speed on that trial (Churchland et al., 2006a). Whether zebra finch INs also correlate with the upcoming song in a similar fashion remains poorly understood. Here, we addressed this by recording and analyzing songs from multiple adult, male, zebra finches across a 3 year period. Specifically, we first tested for the existence of bout-to-bout correlations between IN properties and song properties (Fig. 1C). Second, previous studies have shown age-related changes in song properties (Brainard and Doupe, 2001; Pytte et al., 2007; Glaze and Troyer, 2013; James and Sakata, 2019). If IN properties are correlated with song properties, one would also expect age-related changes in IN properties. Additionally, bout-to-bout correlations between INs and song properties, at a younger age, should predict age-related changes in IN and song properties (Fig. 1D). We tested for these by analyzing songs from the same birds at multiple different ages.

Materials and Methods

All experiments at [Author University 1] were approved by Institute Animal Ethical Committee in accordance with the guidelines of the Committee for the Control and Supervision of Experiments on Animals [Author Country]. All song recording procedures for songs recorded at [Author University 2] were approved by the [Author University 2] Institutional Animal Care and Use Committee in accordance with NIH guidelines. We recorded songs from a total of 46 adult male birds (age > 90 days post hatch) with some purchased from an outside local vendor (n=19) and some bred at [Author University 1] (n=21) or [Author University 2] (n=6). The age of all purchased birds was assumed to be 60 days post hatch (dph) as they had red beaks (beaks are black for birds < 60 dph). At times when birds were not being recorded, they were housed in large cages with 5–8 other birds in a bird colony maintained with a 14h/10h light/dark cycle. Food and water were provided at all times.

SONG RECORDING

Birds were isolated from the colony and placed in separate cages in a sound attenuation enclosure (NewTech Acoustic Systems, Bangalore) maintained on a 14 h light / 10 h dark cycle. A microphone (AKG Acoustics C417PP) was clipped on the roof of the cage to record song. Song recordings were either in ‘triggered’ or ‘continuous’ mode. Briefly, in ‘triggered’ mode, periods of recording that crossed a pre-set threshold were saved along with an additional 1–3 seconds of data flanking this period on either side. In ‘continuous’ mode, audio files were saved continuously for the entire recording period. All data were recorded and saved to disk at a 44100 Hz sampling rate using custom-written software (Python or Matlab). The different sets of birds used for different analyses and the overlap between these sets are explained below.

BIRDS USED FOR ANALYSIS OF IN-SONG CORRELATIONS

20 birds were recorded on multiple days (median 3 days [range 2–8 days]) in the age range from 89 – 1087 dph. As far as possible, we maintained the same position of the microphone for a given bird on different days of recording. A subset of birds and sessions (2 sessions each from 14/20 birds) overlapped with those analyzed for day-to-day changes in IN properties in a previous study. One bird (1/20) overlapped with the birds used for ts-cut surgery in that same study, but the recording sessions analyzed here are different and are from well before the surgical procedure. A subset of birds (5/20) were recorded earlier at [Author University 2] and have sessions from two nearby days, <5 days apart, included in this analysis. For characterizing IN changes towards song (Fig. 2) and bout-to-bout correlations of INs and song (Table 1 and Fig. 3), songs recorded from one session (<1 yr of age) in each bird were analyzed. Multiple sessions were analyzed for age-related changes in INs and songs (Fig. 46). The difference between days of recording spanned a wide range (1–812 days) and the time of recording varied across sessions for the same bird.

Fig. 2. Introductory Note (IN) properties change in the direction of the song.


(A-G) Comparison of properties between First and Last IN position (gray) and the first syllable of the upcoming song (blue). Circles joined by lines represent data from individual birds, boxes represent group data across birds. Column 1 shows the mean and column 2 shows the CV or Coefficient of Variation for intervals (A) and for syllable acoustic properties, namely duration (B), mean frequency (C), entropy (D), log amplitude (E), pitch goodness (F) and frequency modulation (G), respectively. * denotes p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.005, Repeated Measures ANOVA followed by post-hoc Tukey-Kramer test.

TABLE 1.

Correlation between IN and song syllables

Unmanipulated birds, suspended microphone (n=20)
First song syllable property | IN number | First IN | Last IN | Second song syllable
Interval | 2/20 [0.027564 – 0.11579] | 1/20 [−0.040692 -] | 6/20 [0.055634 – 0.20126] | 5/15 [0.062064 – 0.69446]
Duration | 2/16 [0.055525 – 0.14306] | 2/16 [0.050712 – 0.16412] | 4/16 [0.07711 – 0.32075] | 9/16 [0.049804 – 0.59583]
Mean Frequency | 4/16 [0.037527 – 0.17197] | 10/16 [0.07751 – 0.43468] | 11/16 [0.097598 – 0.53551] | 11/16 [0.086829 – 0.61189]
Entropy | 1/16 [−0.065818 -] | 6/16 [0.043557 – 0.43134] | 7/16 [0.089148 – 0.40611] | 7/16 [0.033528 – 0.39943]
Log Amplitude | 0/16 [-] | 15/16 [0.1022 – 0.71674] | 16/16 [0.086365 – 0.80228] | 16/16 [0.10257 – 0.90177]
Pitch Goodness | 1/16 [−0.13814 -] | 4/16 [0.093954 – 0.45981] | 7/16 [0.082209 – 0.40086] | 7/16 [0.039365 – 0.25201]
Frequency Modulation | 1/16 [−0.070169 -] | 2/16 [0.039416 – 0.079888] | 3/16 [0.054115 – 0.075381] | 2/16 [0.12817 – 0.14779]

Unmanipulated birds, head fixed microphone (n=4)
First song syllable property | IN number | First IN | Last IN | Second song syllable
Interval | 0/4 [-] | 1/4 [−0.094595 -] | 0/4 [-] | 0/3 [-]
Duration | 0/4 [-] | 1/4 [−0.10509 -] | 0/4 [-] | 1/4 [−0.41001 -]
Mean Frequency | 0/4 [-] | 0/4 [-] | 2/4 [0.2547 – 0.26094] | 3/4 [0.18096 – 0.2585]
Entropy | 0/4 [-] | 0/4 [-] | 2/4 [0.21301 – 0.23058] | 2/4 [0.13476 – 0.37133]
Log Amplitude | 0/4 [-] | 1/4 [−0.25684 -] | 3/4 [0.095298 – 0.29403] | 3/4 [0.10264 – 0.23485]
Pitch Goodness | 0/4 [-] | 1/4 [−0.55047 -] | 0/4 [-] | 0/4 [-]
Frequency Modulation | 0/4 [-] | 0/4 [-] | 0/4 [-] | 1/4 [−0.14862 -]

Surgically manipulated birds, Ts Cut (n=8)
First song syllable property | IN number | First IN | Last IN | Second song syllable
Interval | 2/7 [0.020996–0.084361] | 1/7 [−0.060756 -] | 3/7 [0.077037 – 0.20297] | 5/6 [0.059212 – 0.31928]
Duration | 2/8 [0.069903–0.094664] | 5/7 [0.019376 – 0.32247] | 3/7 [0.11523 – 0.15811] | 5/8 [0.066957 – 0.32461]
Mean Frequency | 2/8 [0.026057–0.035218] | 6/7 [0.085925 – 0.17935] | 6/7 [0.085449 – 0.32444] | 7/8 [0.17162 – 0.59137]
Entropy | 3/8 [0.033457–0.098931] | 4/7 [0.075541 – 0.35578] | 5/7 [0.074278 – 0.53467] | 7/8 [0.077904 – 0.69269]
Log Amplitude | 3/8 [0.070799–0.091607] | 7/7 [0.18765 – 0.64247] | 7/7 [0.43535 – 0.83514] | 8/8 [0.46317 – 0.78988]
Pitch Goodness | 3/8 [0.022745–0.073854] | 6/7 [0.1454 – 0.69199] | 7/7 [0.16424–0.82966] | 8/8 [0.31519 – 0.88773]
Frequency Modulation | 2/8 [0.053764–0.067706] | 4/7 [0.042636 – 0.16752] | 3/7 [0.099547 – 0.36539] | 5/8 [0.16071 – 0.22946]

Timing and acoustic properties of first song syllable (see Row Names) were correlated to IN number, or corresponding timing and acoustic properties for first IN position, last IN position or second song syllable (see Column names). Pearson correlation coefficient was calculated after removing outliers for one session in each bird and considered significant if p ≤ 0.05. Each cell indicates the proportion of birds with significant correlations along with the range of R-squared values for significant correlations, within parenthesis. Highlighted cells indicate properties for which significant correlation coefficients were observed in at least half of the birds. The subsequent columns repeat the analysis for birds with head-implanted microphone and ts-nerve cut birds.

Fig. 3. Trial-to-trial correlations between acoustic properties of IN and song.


(A-C) Example session from one bird that shows significant positive correlations for Log Amplitude of first syllable in the (first) motif along y-axis with (A) First IN position, (B) Last IN position and (C) second syllable respectively along x-axis. Circles represent data from individual trials or bouts. p ≤ 0.05, Pearson correlation coefficient calculated after removing outliers.

Fig. 4. Correlations with first song syllable.


IN number, timing and acoustic properties (duration, mean frequency, entropy, log amplitude, pitch goodness and frequency modulation for a syllable) at First IN, Last IN and second song syllable positions were correlated to timing and acoustic properties at first song syllable position across bouts. Each circle represents r-values for bout-to-bout correlations in one bird. Lines connecting circles represent data from the same bird. Circles with a ‘*’ represent significant correlations and unfilled circles represent non-significant correlations. Proportions of birds with significant correlations are indicated on top and marked in red if proportions were ≥ 0.5. A property is considered significantly correlated if proportions are ≥ 0.5. Correlation strengths across pairs of groups were compared using the signed-rank test by including only significant correlation values. Correlations to First IN were compared with correlations to last IN, and correlations to last IN were compared with correlations to second song syllable. * denotes p ≤ 0.05, ** denotes p ≤ 0.01, Wilcoxon signed-rank test.

Fig. 6. Song and IN properties simultaneously change in the first year of age.


(A - D) Changes in Motif similarity (A), motif duration (B), IN number (C) and time-to-song (D) are plotted across different age categories. Circles represent change in mean property between pairs of days for individual birds. All pairs of days are the same pairs of days selected for maximum change in IN number in Fig. 5. Squares and whiskers represent median and interquartile range for the 3 age groups. * denotes p ≤ 0.05, Kruskal-Wallis ANOVA followed by Tukey-Kramer test.

DATA OVERLAP WITH PREVIOUS STUDIES

Songs of birds from [Author University 2] were recorded as part of a different study characterizing INs. Besides characterizing the properties of INs in every sequence and the associated neural activity during INs, the study also correlated the similarity of pairs of last INs to pairs of first song syllables. However, the direct relationship between individual properties of INs and songs was not compared. Here, we used these songs to directly compare the properties of INs and songs.

The IN-song correlation analysis was repeated for one session of undirected song each from 5 birds recorded earlier with head-implanted microphones for a different study. Head-implanted microphone recordings control for amplitude measurement differences in suspended microphone recordings that may arise due to changes in the relative position of the bird and the microphone across trials. IN-song correlations of acoustic properties with head-fixed microphones were analyzed as a control to rule out the possibility that IN-song correlations were a result of differences in the relative position of the bird and the microphone. IN-song correlations were not analyzed as part of the original study.

IN-song correlation analysis was repeated for one session of undirected song each from 8 birds recorded after ts-nerve (tracheosyringeal nerve) cut surgery as part of an earlier study. This data provided an advantage to look at IN-song acoustic correlations in the absence of properties related to syllable identity as these are lost following ts-nerve cut manipulation. IN-song correlations were not analyzed as part of the original study.

DATA FOR REPEAT SYLLABLES

Songs of 17 birds in the colony were identified that specifically repeated at least one syllable within the motif. One session each from these birds was used to compare the syllable repetition properties among the different types of repeated syllables produced by the bird namely INs, motif syllable repeats and calls. One session from one bird was also analyzed for IN-song correlations. One of the birds overlapped with ts-nerve cut birds, but the songs analyzed were from before surgery. One of the birds recorded in this set was also bred and recorded earlier at [Author University 2].

DATA ANALYSIS

All the analyses were performed using custom-written scripts in MATLAB (www.mathworks.com). Audio files were processed and vocalizations were labeled. Briefly, audio files were segmented into syllables based on a user-defined amplitude threshold; syllables were defined as segments longer than 10 ms, and segments separated by inter-syllable gaps shorter than 5 ms were merged. Labels were assigned to syllables in a semi-automatic manner. Automatic labels were assigned using a modified template-matching procedure (Glaze and Troyer, 2006) or Klustakwik clustering (http://klustakwik.sourceforge.net/) of acoustic features calculated using Sound Analysis Pro (https://soundanalysispro.com/matlab-sat). Labels were then manually checked for all files. Files were split into bouts and all bouts containing songs (motifs) were selected for analysis. Typically these bouts consisted of repeating IN sequences before the first song.
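The segmentation step described above can be sketched as follows. This is a minimal illustration in Python rather than the authors' MATLAB scripts; the function name, the input format (a per-sample amplitude envelope) and the order of operations (merge short gaps first, then discard short segments) are assumptions.

```python
def segment_syllables(amplitude, fs, threshold, min_dur_ms=10.0, max_gap_ms=5.0):
    """Return (onset_s, offset_s) pairs for supra-threshold segments.

    amplitude: per-sample amplitude envelope (hypothetical input format)
    fs: sampling rate in Hz
    threshold: user-defined amplitude threshold
    """
    # Find runs of consecutive samples above the threshold.
    segments = []
    start = None
    for i, a in enumerate(amplitude):
        if a > threshold and start is None:
            start = i
        elif a <= threshold and start is not None:
            segments.append([start, i])
            start = None
    if start is not None:
        segments.append([start, len(amplitude)])

    # Merge segments separated by gaps shorter than max_gap_ms
    # (assumed here to happen before the duration filter).
    max_gap = max_gap_ms * fs / 1000.0
    merged = []
    for seg in segments:
        if merged and seg[0] - merged[-1][1] < max_gap:
            merged[-1][1] = seg[1]
        else:
            merged.append(list(seg))

    # Keep only segments longer than min_dur_ms.
    min_len = min_dur_ms * fs / 1000.0
    return [(s / fs, e / fs) for s, e in merged if e - s > min_len]
```

With a 1 kHz envelope, two 20 ms segments separated by a 3 ms gap would be merged into one syllable, while an isolated 5 ms segment would be discarded.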

Defining a bout and bout interval criteria

A bout was defined as a period of vocalizations separated by 2 seconds of silence. This criterion was applied to select bouts from recordings in ‘continuous’ mode. However, many audio files recorded in ‘triggered’ mode did not have 2 seconds of silence before the first syllable in the file. This occurred when initial vocalizations in the bout were soft and the set trigger threshold was crossed by a later vocalization. Such audio files had less than 2 seconds of silence before the first syllable in the file. However, we assumed that there was silence before the start of the file as the set trigger-threshold was not crossed and so we included such files in our analysis. The bout criterion for triggered recordings was reduced to include enough audio files (>15 bouts) for analysis and the criterion ranged from 500–1500 ms only for the beginning of the file. Within a file, we still considered 2000 ms of silence and we always considered 2000 ms of silence for continuous recordings across birds. For a given bird, we maintained the same bout criterion across sessions. This was important as we observed that mean IN number depended on the bout interval criterion; as shown previously, mean IN number reduced a little when considering a bout criterion of < 2 s (Sossinka and Böhner, 1980; Rajan and Doupe, 2013). However, mean IN number between days was not different when the same bout interval criterion was applied.
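As an illustration of the within-file bout criterion, the sketch below groups labeled syllables into bouts whenever the silent gap is shorter than 2 s. The tuple layout `(label, onset_s, offset_s)`, the function names and the motif labels are hypothetical; the relaxed 500–1500 ms criterion for the start of triggered files is not modeled.

```python
def split_into_bouts(syllables, min_silence_s=2.0):
    """Group (label, onset_s, offset_s) syllables into bouts.

    A new bout starts whenever the silence since the previous
    syllable's offset is at least min_silence_s.
    """
    bouts = []
    for syl in sorted(syllables, key=lambda s: s[1]):
        if bouts and syl[1] - bouts[-1][-1][2] < min_silence_s:
            bouts[-1].append(syl)
        else:
            bouts.append([syl])
    return bouts


def song_bouts(bouts, motif_labels):
    """Keep only bouts that contain at least one motif syllable."""
    return [b for b in bouts if any(s[0] in motif_labels for s in b)]
```

Because mean IN number depends on the silence criterion, the same `min_silence_s` value would need to be used for every session of a given bird.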

Determining minimum number of bouts for IN analysis

The number of INs varied across bouts and the number of bouts sung by birds varied across sessions and birds (median 73.5 song bouts; range 4 – 430). It was important to characterize how the number of bouts affected the estimate of mean IN number for a session, and to determine the minimum number of bouts beyond which the estimate did not change considerably. For all sessions with >100 song bouts (30 sessions), a range of different numbers of bouts was sub-sampled and the corresponding average standard deviation across 1000 iterations of Monte Carlo simulations was calculated. The minimum number of bouts for IN analysis was identified as the number of bouts beyond which increasing the number of bouts did not change the average standard deviation by more than 0.01. The standard deviation met our criterion by 10 bouts for 18/30 sessions and by 15 bouts for 28/30 sessions. Hence, across all days and birds, only sessions with at least 15 song bouts were selected for IN analysis. This excluded two sessions from a bird (1/20) with < 15 song bouts.
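One way to sketch this sub-sampling procedure is shown below. The text does not fully specify which standard deviation was averaged, so this version assumes the spread of the sub-sampled mean IN number across iterations, with sampling without replacement; the function names and candidate bout numbers are hypothetical.

```python
import random
import statistics


def sd_of_subsampled_means(in_counts, n_bouts, n_iter=1000, rng=None):
    """SD, across Monte Carlo iterations, of the mean IN number
    estimated from n_bouts randomly chosen bouts (assumed metric)."""
    if rng is None:
        rng = random.Random(0)
    means = [statistics.fmean(rng.sample(in_counts, n_bouts))
             for _ in range(n_iter)]
    return statistics.pstdev(means)


def minimum_bouts(in_counts, sizes, tol=0.01, n_iter=1000, seed=0):
    """Smallest size in `sizes` beyond which every further increase
    changes the SD by no more than tol; None if no size qualifies."""
    rng = random.Random(seed)
    sds = [sd_of_subsampled_means(in_counts, n, n_iter, rng) for n in sizes]
    for i in range(len(sizes) - 1):
        later_steps = range(i, len(sizes) - 1)
        if all(abs(sds[j + 1] - sds[j]) <= tol for j in later_steps):
            return sizes[i]
    return None
```

Here `in_counts` would hold the number of INs in each song bout of one session, and `sizes` an increasing list of candidate bout numbers such as `[5, 10, 15, 20, 25]`.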

Calculating IN and song properties

For analyzing IN and song properties across trials, only INs at the beginning of the bout and the first motif that followed immediately after the INs were considered, unless otherwise specified. For each song bout, the number of INs preceding the first motif was calculated. The timing and acoustic properties of INs and song were calculated as follows: timing of IN for first position, or the first interval, was the time interval between offset of the first IN and onset of the second IN. Timing of IN for last position, or the last interval, was calculated as the time interval between offset of the last IN and onset of the first song syllable. Song timing for the first position in the motif was calculated as the interval between offset of the first motif syllable and onset of the second motif syllable. Song timing for the second position in the motif was calculated as the interval between offset of the second motif syllable and onset of the third motif syllable. Similar to timing, the acoustic properties of INs were calculated for first and last IN and for the first and second motif syllable. For comparisons of acoustic properties involving changes in IN from first to last position, birds with two types of INs in the IN sequence were excluded as the syllable identity in these birds would be different for first and last IN. For comparisons of intervals involving first and second song position, birds with only two syllables in the motif were excluded as there were no second song intervals.
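The interval definitions above can be written out as a short sketch. It assumes a bout is a list of `(label, onset_s, offset_s)` tuples that starts with the IN sequence and continues directly into the first motif; the label `"i"` for INs and the dictionary keys are hypothetical.

```python
def in_song_timing(bout, in_label="i"):
    """Return IN number and the first/last IN and first/second song intervals.

    Intervals that cannot be computed (too few syllables) are None.
    """
    # Count the INs at the start of the bout.
    n_ins = 0
    while n_ins < len(bout) and bout[n_ins][0] == in_label:
        n_ins += 1
    motif = bout[n_ins:]

    def gap(prev, nxt):
        # Silent interval: onset of next syllable minus offset of previous.
        return nxt[1] - prev[2]

    return {
        "in_number": n_ins,
        "first_in_interval": gap(bout[0], bout[1]) if n_ins >= 2 else None,
        "last_in_interval": gap(bout[n_ins - 1], motif[0])
        if n_ins >= 1 and motif else None,
        "first_song_interval": gap(motif[0], motif[1]) if len(motif) >= 2 else None,
        "second_song_interval": gap(motif[1], motif[2]) if len(motif) >= 3 else None,
    }
```

For a bird with only two motif syllables, `second_song_interval` comes out as `None`, which mirrors the exclusion described above.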

Acoustic properties of syllables

The acoustic properties for IN and song syllables were measured as described by Sound Analysis Pro 2011 (Tchernichovski et al., 2000) (http://soundanalysispro.com/) using SAP matlab code (https://soundanalysispro.com/matlab-sat). These basic features reduce dimensionality of the complex sound spectrogram for analysis and make it easier to understand the changes in different aspects of sound. A brief intuitive understanding of the features is described below:

  • Log Amplitude: It measures the loudness of a syllable. The intensity or power in audio signal is measured relative to an arbitrary baseline for silence. It is reported in log scale units or dB.

  • Duration: Time from onset to offset of the syllable. Onset and offset are determined based on an amplitude threshold that distinguishes sound from silence in the audio signal. It is measured in units of seconds or milliseconds.

  • Mean Frequency: Audio signal can be decomposed into different frequencies present in the signal. The mean frequency is a pitch measure that assesses the center of the distribution of intensity across different frequencies. It is measured in Hz. Since mean frequency is dependent on intensity, it is an amplitude related property.

  • Pitch goodness: It measures the periodicity of harmonic pitch (frequency stacks) in the syllable. Syllables with harmonics or ‘frequency stacks’ (as observed in a spectrogram) have higher value, and syllables that are noisy or pure tone have lower value.

  • Wiener entropy or entropy: It measures the noisiness in the syllable. A noisy syllable appears as broadband or white noise in the spectrogram with intensity spread equally across all frequencies. White noise has a value of 1 and pure tone (sound intensity concentrated at one frequency) has a value of 0. However, this is converted to a log scale ranging from 0 to minus infinity. Noisier syllables have entropy values closer to 0. Since entropy is dependent on intensity, it is an amplitude related property.

  • Frequency Modulation: It is an estimate of the slope of the frequency trace on the spectrogram, measured in degrees. The steeper the slope, the higher the modulation.
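To make the log-scale entropy measure concrete, the sketch below computes Wiener entropy as the log of the ratio of the geometric mean to the arithmetic mean of a power spectrum. This is the standard definition of spectral flatness on a log scale, not the Sound Analysis Pro code itself; the natural logarithm and the requirement of strictly positive power values are assumptions of this sketch.

```python
import math


def wiener_entropy(power_spectrum):
    """Log of (geometric mean / arithmetic mean) of a power spectrum.

    Returns 0 for a flat (white-noise-like) spectrum and increasingly
    negative values as power concentrates at a single frequency.
    All power values must be strictly positive.
    """
    n = len(power_spectrum)
    log_geometric_mean = sum(math.log(p) for p in power_spectrum) / n
    arithmetic_mean = sum(power_spectrum) / n
    return log_geometric_mean - math.log(arithmetic_mean)
```

A flat spectrum such as `[1.0] * 100` gives a value of 0, whereas a spectrum with nearly all power in one bin gives a strongly negative value.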

Analyzing age-related changes in IN and song properties

Changes in IN and song properties between two different days were measured by subtracting the mean or variability calculated for the earlier of the two days from those measured on the later of the two days. For changes in INs between days, in addition to properties related to IN number, IN timing and IN acoustic properties, the time-to-song was also calculated. Time-to-song was defined as the duration of the IN sequence from onset of the first IN to onset of the first song syllable. To quantify song changes with age, parameters of song that have earlier been reported to change with age were calculated. These included motif duration and song similarity (Brainard and Doupe, 2001; Pytte et al., 2007; Glaze and Troyer, 2013; James and Sakata, 2019). Motif duration was measured as the time from the onset of the first motif syllable to the offset of the last motif syllable. Song similarity was calculated as the average similarity index between pairs of motifs from a recording day. Ten motifs of a fixed syllable sequence were randomly selected across all motifs, and each of the ten motifs was compared to the 9 other motifs using a similarity index algorithm (Mandelblat-Cerf and Fee, 2014), followed by taking the mean for all pairs. The similarity index gives a measure of how similar songs on a given day were to each other. Similar to other properties, changes between days were measured by subtracting the similarity index of an earlier day from that of a later day. Other song properties related to song sequence and individual timing and acoustic features of the first song syllable were also measured for change between days. To compare song spectral structure changes between pairs of days we measured the average similarity index of 10 randomly selected motifs on the two days.
To compare song temporal structure between days, the temporal pattern of the motif sequence was obtained by taking the amplitude profile and replacing it with ‘1s’ for the duration of syllables and ‘0s’ for the duration of intervals between syllables. Two patterns were compared up to the duration of the shorter sequence using cross correlation. Similar to the comparison of spectral structure, 10 randomly selected motifs each from the two days were compared all-to-all and averaged. Sequence Consistency and Entropy were measured as described in an earlier study (James and Sakata, 2019). The analysis was restricted to the first motif sequence in the bout. For each motif sequence, one additional syllable at both ends was included for calculating sequence consistency and entropy. Sequence Consistency (SC) measured the consistency of transitions between syllables of the motif and was calculated as the ratio of most typical transitions divided by total transitions. Sequence Entropy (SE) measured the variability of syllable transitions, and was calculated from the transition probabilities $p_i$ of all transitions from a syllable using the formula $-\sum_i p_i \log_2 p_i$. The higher the value, the more variable the transitions. The change between pairs of days was measured by subtracting the values of an earlier day from those of a later day.
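The two sequence measures can be sketched as follows. This is an illustrative reading of the definitions above, not the implementation of James and Sakata (2019): sequences are given as label strings, the "most typical" transition is taken to be the most frequent transition from each syllable, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def transition_counts(sequences):
    """Count syllable-to-syllable transitions over a list of label sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def sequence_consistency(counts):
    """Most frequent transition from each syllable, summed, over all transitions."""
    typical = sum(max(c.values()) for c in counts.values())
    total = sum(sum(c.values()) for c in counts.values())
    return typical / total


def transition_entropy(counts, syllable):
    """Entropy -sum(p_i * log2(p_i)) over transitions leaving one syllable."""
    c = counts[syllable]
    total = sum(c.values())
    return -sum((n / total) * math.log2(n / total) for n in c.values())
```

For the sequences `["iabc", "iabc", "iabd", "iabc"]`, 11 of the 12 transitions are typical, and only syllable `b` has a non-zero transition entropy.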

Age-related changes were calculated as changes in mean or variability of properties between pairs of days belonging to one of three age groups: (1) pairs of days < 5 days apart as short-term day-to-day change, (2) pairs of days with first day < 1 yr and second day > 1 yr as long-term change in the first year and (3) pairs of days with both days > 1 yr and more than 5 days apart as long-term change after one year of age. If a bird was recorded for more than one day within an age group, we compared all pairs of days with each other and, for each age-group category, we chose the pair of days with the maximum change in IN number. We reasoned that this would give us the maximum chance of detecting changes, if there were any. The same pairs of days were used to calculate age-related changes in other IN and song properties to understand the extent of age-related changes in other properties when IN number changes were maximum.
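The pair-selection rule can be illustrated with the sketch below. Several details are assumptions rather than statements from the text: one year is taken as 365 days, "maximum change" is read as the largest absolute change, a pair less than 5 days apart is always assigned to the short-term group, and the group names and input format (age in dph mapped to mean IN number) are hypothetical.

```python
from itertools import combinations


def age_group(age1, age2, year=365, short_gap=5):
    """Classify a pair of recording ages (dph, age1 < age2) into an age group."""
    gap = age2 - age1
    if gap < short_gap:
        return "short_term"
    if age1 < year and age2 > year:
        return "first_year"
    if age1 > year and age2 > year and gap > short_gap:
        return "after_first_year"
    return None


def max_change_pairs(sessions):
    """For one bird, pick the pair of days with the largest absolute change
    in mean IN number within each age group.

    sessions: dict mapping age (dph) to mean IN number on that day.
    Returns a dict mapping group to (earlier_age, later_age, change).
    """
    best = {}
    for (a1, m1), (a2, m2) in combinations(sorted(sessions.items()), 2):
        group = age_group(a1, a2)
        if group is None:
            continue
        change = m2 - m1  # later day minus earlier day
        if group not in best or abs(change) > abs(best[group][2]):
            best[group] = (a1, a2, change)
    return best
```

The pairs returned here would then be reused when computing changes in the other IN and song properties.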

Comparison of IN properties with motif duration and similarity within a session

To compare IN properties with motif duration and motif similarity within a day, we divided the bouts based on IN number at the start and then calculated the average motif duration and song similarity for bouts with same number of INs. To account for individual variation across birds, the number of INs was normalized to median IN number and motif duration was normalized to the corresponding median motif duration. We then asked whether song properties were different for the different IN numbers by comparing song properties across the different IN numbers. As another measure of number of INs, we also compared song changes with changes in the time-to-song, measured as the time from the start of the first IN to the start of the first motif syllable. For this, we split bouts into two groups based on the median value of time-to-song. The two groups were then compared across birds for differences in mean song properties corresponding to shorter or longer timing of INs. Other song properties chosen based on significant changes with age such as song sequence entropy, mean pitch goodness of first song syllable, and variability of frequency modulation of first song syllable were also compared corresponding to shorter or longer IN song timing. Similar analyses were done for comparing song properties corresponding to trials split based on properties of first or last IN position before song, namely lower and higher IN frequency modulation, less and more variable IN duration, less and more variable pitch goodness.

Analysis of repeat syllables

Repeat syllables belonging to either motifs, INs or calls were analyzed for each bird. Call repeats outside of song were considered as these were more common in all the birds we analysed. Similar to criteria used for INs in other datasets, repeat syllables with more than 15 instances were analysed. The number, interval and acoustic properties were analyzed using a similar procedure as that described for INs above. The ratio of intervals was defined as the current interval between successive repeats in the sequence divided by the next interval. Shortening of intervals is denoted by a value of <1. Median ratio of intervals was measured across trials to represent data for a syllable. The acoustic distance to last repeat was defined as an inverse measure of similarity between a repeat syllable and all last repeat syllables. The six acoustic properties defined earlier were used to calculate acoustic distance, namely duration, mean frequency, log amplitude, entropy, pitch goodness and frequency modulation. To calculate acoustic distance, all repeat instances were randomly split in half. The last repeats from one half were chosen as the reference distribution occupying a six-dimensional space formed by the six acoustic properties. The Mahalanobis distance from this distribution to each repeat syllable in the second half was calculated to give the acoustic distance for each of these repeat syllables. Similar to ratio of intervals, ratio of acoustic distance was then calculated as the distance of every repeat divided by distance of following repeat syllable. The median ratio of acoustic distance was measured across trials to represent data for a syllable. A median ratio of <1 denoted convergence towards the last repeat syllable.
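The acoustic-distance step can be sketched as follows, assuming NumPy is available. This is a minimal illustration rather than the code used in the study: the function name, the syllable-wise (rather than bout-wise) random split, and the use of a pseudo-inverse for the covariance matrix are assumptions made here. The ratio of successive distances is not included in the sketch.

```python
import numpy as np


def acoustic_distance_to_last(features, is_last, seed=0):
    """Mahalanobis distance of held-out repeats to the last-repeat distribution.

    features : (n_syllables, 6) array of the six acoustic properties
    is_last  : boolean array, True where a syllable is the last repeat
    Returns the indices of the held-out half and their distances.
    """
    X = np.asarray(features, dtype=float)
    is_last = np.asarray(is_last, dtype=bool)

    # Randomly split all repeat instances in half (assumed syllable-wise).
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    half = len(X) // 2
    ref_idx, test_idx = order[:half], order[half:]

    # Reference distribution: last repeats from the first half only.
    ref = X[ref_idx][is_last[ref_idx]]
    if len(ref) < 2:
        raise ValueError("need at least two last repeats in the reference half")
    mu = ref.mean(axis=0)
    cov = np.cov(ref, rowvar=False)
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse guards against singular covariance

    # Distance from the reference distribution to every repeat in the second half.
    diff = X[test_idx] - mu
    sq = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return test_idx, np.sqrt(np.maximum(sq, 0.0))
```

Keeping the reference and test halves separate means that no syllable contributes to the distribution against which it is measured.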

EXPERIMENTAL DESIGN AND STATISTICAL ANALYSIS

The experimental design of all groups tested for differences had comparable sample numbers. All comparisons of independent groups of unequal sample sizes were tested for differences using the non-parametric Kruskal-Wallis test. If the p-value was ≤ 0.05, the Tukey-Kramer post-hoc test was used to identify pairs of groups that were significantly different. These group comparisons included age-related changes in mean IN or song properties, age-related changes in IN number in individual birds, comparison of average song motif duration or song similarity for different IN numbers, and comparison of mean properties of repeat syllables belonging to either motifs, INs or calls. Differences between (more than two) bird-matched groups were tested using repeated-measures one-way ANOVA. If the p-value was ≤ 0.05, the Tukey-Kramer post-hoc test was used to identify pairs of groups that were significantly different. This included comparison of the mean or CV of syllable properties across bouts based on position, i.e. first IN, last IN and first song syllable. Paired-group comparisons were tested for differences using the Wilcoxon signed-rank test with a p-value criterion of 0.05 for significant differences. These comparisons included significant correlation coefficient estimates of first IN - first song syllable with those of last IN - first song syllable, motif duration or similarity comparisons between shorter and longer time-to-song bouts, and properties of repeat syllables at first and last IN positions. All bout-to-bout correlations were measured using Pearson’s correlation coefficient of corresponding IN and song property values across bouts. Prior to measuring correlations, outliers were removed (percentage of data points removed: median 2.19%; range 0–28.99%). Outliers were detected as values beyond three times the median absolute deviation from the median for continuous variables and values beyond three standard deviations from the mean for discrete variables. The correlations were considered significant if p-values were ≤ 0.05. The range of significant correlations was reported across birds along with the proportion of birds with significant correlations.
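The outlier rule and the bout-to-bout correlation can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the study's code: the function names are hypothetical, the median absolute deviation is used unscaled (some software multiplies it by about 1.4826, exposed here as the optional `scale` argument), and a bout is dropped if either of its two values is an outlier. The sketch returns only the correlation coefficient; testing significance at p ≤ 0.05 would additionally require a t-distribution, for example via `scipy.stats.pearsonr`.

```python
from statistics import fmean, median, stdev


def keep_mask(values, discrete=False, k=3.0, scale=1.0):
    """True for values that are NOT outliers.

    Continuous variables: within k * MAD of the median.
    Discrete variables:   within k standard deviations of the mean.
    """
    if discrete:
        center = fmean(values)
        spread = stdev(values)
    else:
        center = median(values)
        spread = scale * median([abs(v - center) for v in values])
    return [abs(v - center) <= k * spread for v in values]


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bout_correlation(in_values, song_values, in_discrete=False):
    """Correlate an IN property with a song property across bouts after
    removing outlier bouts; also report the percentage of bouts removed."""
    keep_x = keep_mask(in_values, discrete=in_discrete)
    keep_y = keep_mask(song_values)
    pairs = [(a, b) for a, b, kx, ky in zip(in_values, song_values, keep_x, keep_y)
             if kx and ky]
    x, y = zip(*pairs)
    removed_pct = 100.0 * (len(in_values) - len(pairs)) / len(in_values)
    return pearson_r(x, y), removed_pct
```

Setting `in_discrete=True` would apply the standard-deviation rule to a count variable such as IN number.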

DATA AND CODE ACCESSIBILITY

All data and scripts for analysis are available on request from the corresponding author.

Results

Introductory note features change in the direction of upcoming song

Introductory notes (INs) are characterized by three properties: the number of INs before song at the start of each bout, their timing measured by the intervals between successive INs, and their acoustic properties. As shown earlier (Rajan and Doupe, 2013; Rao et al., 2019), both the timing and the acoustic properties of INs changed systematically from the first to the last IN (Fig. 2). Additionally, changes in IN properties were in the direction of changes from INs to song. As INs progressed from the first IN to the last IN to the first song syllable, inter-syllable intervals got shorter (Fig. 2A, left, p < 0.001, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN, post-hoc Tukey-Kramer test) and syllables progressively became longer (Fig. 2B, left, p < 0.001, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and last IN vs. song, p = 0.02 for first IN vs. last IN, post-hoc Tukey-Kramer test), higher in frequency (Fig. 2C, left, p = 0.008, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN and p = 0.003 for last IN vs. song, post-hoc Tukey-Kramer test) and louder (Fig. 2E, left, p = 0.003, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN and p = 0.002 for last IN vs. song, post-hoc Tukey-Kramer test). In addition to changes in the mean, we also found a reduction in variability as INs approached song. Specifically, from the first IN to the last IN to song, the inter-syllable intervals (Fig. 2A, right, p = 0.048, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN, post-hoc Tukey-Kramer test), syllable duration (Fig. 2B, right, p = 0.03, Repeated Measures ANOVA, p < 0.001 for last IN vs. song, p = 0.005 for first IN vs. song, post-hoc Tukey-Kramer test), entropy (Fig. 2D, right, p < 0.001, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and for last IN vs. song, post-hoc Tukey-Kramer test) and frequency modulation (Fig. 2G, right, p = 0.002, Repeated Measures ANOVA, p < 0.001 for first IN vs. song, p = 0.04 for first IN vs. last IN and last IN vs. song, post-hoc Tukey-Kramer test) became less variable. The variability of syllable duration (Fig. 2B, right) and entropy (Fig. 2D, right) was not significantly different between the first IN and the last IN (p > 0.05, Repeated Measures ANOVA). Overall, these results showed that the mean features of INs changed and became more similar to those of the upcoming song as INs progressed from the first to the last IN. In addition, timing variability systematically reduced from first to last IN, while the variability of acoustic features only reduced with the onset of song.

IN number is not correlated with first song syllable features

If INs represent motor preparation for upcoming song, IN properties should be correlated with properties of the first song syllable on a bout-to-bout basis. We first correlated the number of INs with the properties of the first song syllable. In most of the birds, the number of INs (or the “time to song”, measured as the time between the first IN and the first song syllable) was not significantly correlated with either the interval between the first two song syllables or the acoustic properties of the first song syllable (Fig. 4, Table 1, p > 0.05, Pearson’s correlation coefficient). Previous studies have shown that the number of INs (and the “time to song”) is positively correlated with the length of the interval between the first two INs and the acoustic similarity of the first IN to the last IN (Rajan and Doupe, 2013; Rao et al., 2019). Taken together with our results showing the absence of correlations between IN number and song features, this suggests that birds sing different numbers of INs to reach the same last IN state from different initial conditions (different first IN states).

IN timing is not correlated with song syllable timing

We next examined correlations between the timing of INs and the timing of song. Specifically, we separately calculated correlations for first IN timing with song timing and last IN timing with song timing. If INs represent motor preparation, we expected an improvement in the IN-song correlations as IN timings change from first IN to last IN; i.e. we expected weak or no significant correlations between the first IN and song and strong correlations between the last IN and song. Additionally, we expected this pattern to be present in greater than 50% of the birds. Contrary to this expectation, we found significant correlations between the timing of the last IN and the timing of song in only 6/20 birds, and these correlations were weak (median: 0.14; range: 0.06–0.2; Table 1). Only 1/20 birds had significant correlations between first IN timing and song timing. Overall, a majority of birds showed neither significant correlations between IN and song intervals nor an improvement in correlations with song interval from first to last IN intervals, suggesting that changes in IN timing do not represent motor preparation for song syllable timing.

IN acoustic features are correlated with song features

Similar to timing, we examined correlations between the acoustic features of INs and the acoustic features of the first song syllable (see Fig. 3 for an example). Log amplitude and mean frequency of the last IN were correlated with the corresponding features of the first song syllable in a large proportion of birds (Fig. 4, Table 1, p < 0.05, Pearson’s correlation coefficient). Surprisingly, we also found significant correlations between the acoustic features of the first IN and the first song syllable in most of these birds (Fig. 4, Table 1, p < 0.05, Pearson’s correlation coefficient), although the strength of these correlations was slightly lower than the strength of correlations between the last IN and the first song syllable (Fig. 4; p = 0.004 for mean frequency and p = 0.015 for log amplitude, comparisons of r-values). The presence of correlations with the first IN (Fig. 4, Table 1) suggested the possibility that these correlations could alternatively be related to previously described global correlations between syllables within individual bouts (Glaze and Troyer, 2006). In support of this idea, we also found significant correlations between the acoustic features of the first two song syllables (Fig. 4, Table 1, p < 0.05, Pearson’s correlation coefficient). The correlations were similar in strength to the correlations between IN features and song features and were present in a similar proportion of birds (Table 1, Fig. 4, p > 0.05, Wilcoxon signed-rank test). In a majority of birds, significant correlations were also absent for other acoustic features related to syllable identity (duration, pitch goodness, frequency modulation), further supporting the idea that syllable feature correlations represent global control of IN and song syllables, rather than INs reflecting motor preparation for the upcoming song.

To rule out the possibility that such correlations arose because the distance from the suspended microphone to the bird varied from bout-to-bout, we also examined similar correlations in a dataset from a different set of birds with head-implanted microphones (Suri and Rajan, 2018). In these birds, the distance between the microphone and the bird was always fixed. We found similar correlations in a similar proportion of head-implanted birds (Table 1) confirming that these correlations were not just due to differences in relative distance from the microphone.

To better understand the origin of these correlations of globally controlled features, we further analysed song recordings from birds with bilateral cuts in the tracheosyringeal nerve, the nerve that carries neural input to the syringeal muscles. In these birds, syllable identities are lost as syllables are reduced to harmonic stacks, but the temporal patterning of song remains similar to pre-nerve cut as this is controlled by respiratory motor neurons (Bottjer and Arnold, 1984; Vicario, 1991; Williams and McKibben, 1992; Roy and Mooney, 2007). Significant correlations in most acoustic features were present in a similar proportion of these birds (Table 1, p < 0.05, Pearson’s correlation co-efficient). These results suggest that bout-to-bout correlations in acoustic features of INs and song syllables are driven by bout-to-bout differences in the control of respiratory pressure.

Overall, these results show the presence of significant correlations in IN acoustic features (first and last IN) with acoustic features of the first song syllable. Additionally, correlations between acoustic features of the first two song syllables suggest that the correlations between IN features and first song syllable features are not a result of motor preparatory function of INs but, instead, reflect the fact that INs are also vocalizations that are part of the song sequence and are controlled by the same neural pathways.

Song and INs show age-related changes

We next examined a second prediction of the motor preparation hypothesis: age-related song changes and their correlation with IN properties. Previous studies have shown age-related changes in song tempo and song stereotypy; songs become faster and more stereotyped as birds get older (Brainard and Doupe, 2001; Pytte et al., 2007; Glaze and Troyer, 2013; James and Sakata, 2019). If INs represent motor preparation, we would predict simultaneous changes in specific IN properties with age. In addition, we would expect age-related changes to be predicted by correlations between these IN properties and song properties at an earlier age. For instance, if IN number increases with age and songs get more stereotyped with age, we would expect higher IN number to be associated with more stereotyped songs, on a bout-to-bout basis, at an earlier age (Fig. 1D).

To test this, we first recorded birds at multiple time-points (median - 3; range - 2–8 time-points per bird) from ~90 days post-hatch to ~3 years of age (Fig. 5A). As compared to the first day of recordings, we found increases in IN number in most birds (Fig. 5A, red circles represent birds with increases over age, black circles represent birds with decreases over age and gray circles represent birds that do not change over age). We divided our recordings into 3 age categories, namely, (1) within 1 year post hatch, (2) 1–2 years post hatch and, (3) > 2 years post hatch. For some of the birds we had multiple recordings within each of these categories, typically within a few days of each other. For each bird we calculated pair-wise differences in mean IN number between the different age categories. Differences in mean IN number were largest when the first recording day was within 1 year post hatch and the second recording day greater than 1 year (Fig. 5B, bigger circles for Age 1 < 365 dph and Age 2 > 365 dph). For each bird, we chose pairs of days between age categories with the largest difference in mean IN number (see Methods for details, Fig. 5B, ‘*’ represents days chosen for further analysis) and then used changes between these pairs of days to further compare age-related changes across birds. We chose this strategy to increase our chances of testing the prediction of the motor preparation hypothesis as we expected greater correlations for these days as mentioned above. All of the analyses described below were carried out with these days.

Fig. 5. IN number increases in the first year of age.

(A) Mean number of INs recorded at different ages across birds. Each circle represents the mean IN number for one session and lines join multiple datapoints from the same bird. Vertical yellow lines mark 1 year (365 dph) and 2 years (730 dph), respectively. Red lines indicate a significant increase, black lines indicate a significant decrease and gray lines indicate no change with age for a bird. (B) Circles represent comparisons between an earlier day (Age 1 along x-axis) and a later day (Age 2 along y-axis) of recording. Horizontal and vertical gray dashed lines mark 1 year (365 dph). Diagonal gray dashed lines represent Age 1 = Age 2. The size of the circles represents the change in mean IN number between Age 1 and Age 2 and filled circles represent a significant change. Different colors represent different birds. Points marked with a star (*) were selected for maximum change in IN number for a bird in an age group and are also plotted in Fig. 6C. Kruskal-Wallis ANOVA in (A) for individual birds followed by Tukey-Kramer test to mark significant pairs of days in (B), p ≤ 0.05 considered significant.

Mean IN number and the associated “time-to-song” increased in the first year post-hatch (Fig. 6C, p = 0.02, Kruskal-Wallis test, p = 0.09 for <1 yr to >1 yr vs. day-day and p = 0.02 for <1 yr to >1 yr vs. >1 yr, post-hoc Tukey-Kramer test; Fig. 6D, p = 0.003, Kruskal-Wallis test, p = 0.04 for day-day vs. <1 yr to >1 yr and p = 0.003 for <1 yr to >1 yr vs. >1 yr, post-hoc Tukey-Kramer test). Song tempo for the first song motif in the bout (the motif that followed the INs at the beginning) increased significantly, as seen by the significant shortening of first song motif duration (Fig. 6B, p = 0.007, Kruskal-Wallis test, p = 0.005, day-day vs. <1 yr to >1 yr, post-hoc Tukey-Kramer test). In our birds, song stereotypy did not increase significantly across the 3 age-groups, although song structure itself significantly changed in the first year (Table 2, song motif similarity index and motif temporal similarity between days). Other features of songs that showed changes in the first year were reduced sequence entropy, increased pitch goodness and reduced variability in frequency modulation of the first song syllable. Significant changes were also observed for acoustic features of INs, with increased frequency modulation, decreased CV of duration and pitch goodness, and increased CV of frequency modulation of INs. We also observed some trends that approached significance, including longer intervals between the first two INs, increased pitch goodness of INs, increased number of motifs in a bout, and reduced duration of the first motif syllable (Table 2, 0.05 < p < 0.1). Overall, these results document age-related changes in IN properties that occur along with age-related song changes.

TABLE 2:

IN and Song changes with age

Columns: Property | Measured | Age groups | Individual groups (N; group mean ± SE; group median [range]; confidence intervals) | Statistical test | P-value | Post-hoc: Tukey-Kramer test

IN number Number day-day 10 MEAN +/− SE 0.0118+/−0.0216 MEDIAN [RANGE] 0.012616[−0.37937 – 0.29564] CI [−0.14249,0.16606] KruskalWallis ANOVA p value = 0.021175 Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’} {‘1yr’}  −18.847  −8.9455  0.95651  0.086335
{‘day-day’}  {‘>1 yr’}  −8.1035  1.6  11.304  0.92098
{‘1yr’}  {‘>1 yr’}  1.0856  10.545  20.005  0.024384
<1yr to >1yr 11 MEAN +/− SE 0.561+/−0.0708 MEDIAN [RANGE] 0.45195[−0.92505 – 1.73] CI [0.038162,1.0839]
>1 yr 12 MEAN +/− SE −0.0594+/−0.0355 MEDIAN [RANGE] −0.090537[−0.79488 – 0.91923] CI [−0.32993,0.21121]

IN timing mean last interval day-day 10 MEAN +/− SE −1.2846+/−0.528 MEDIAN [RANGE] −0.20482[−10.952 – 6.0383] CI [−5.0621,2.4928] KruskalWallis ANOVA p value = 0.85669
<1yr to >1yr 10 MEAN +/− SE −6.089+/−1.2398 MEDIAN [RANGE] −3.253[−35.0634 – 5.5633] CI [−14.9577,2.7797]
>1 yr 11 MEAN +/− SE −2.405+/−0.3852 MEDIAN [RANGE] −0.24286[−11.1614 – 1.7304] CI [−5.2512,0.44129]

mean Ratio of intervals day-day 10 MEAN +/− SE 0.0028+/−0.0041 MEDIAN [RANGE] −0.0041247[−0.073999 – 0.085084] CI [−0.026361,0.031913] KruskalWallis ANOVA p value = 0.9567
<1yr to >1yr 10 MEAN +/− SE −0.1424+/−0.0481 MEDIAN [RANGE] −0.034303[−1.4426 – 0.29967] CI [−0.48622,0.20146]
>1 yr 11 MEAN +/− SE −0.0054+/−0.0051 MEDIAN [RANGE] −0.0029565[−0.13038 – 0.065233] CI [−0.043185,0.032319]

mean first interval day-day 10 MEAN +/− SE −0.9244+/−2.3378 MEDIAN [RANGE] 2.8502[−30.5386 – 49.3913] CI [−17.6477,15.7989] KruskalWallis ANOVA p value = 0.068496
<1yr to >1yr 10 MEAN +/− SE 12.0475+/−3.2618 MEDIAN [RANGE] 15.3978[−66.9523 – 50.7163] CI [−11.2859,35.381]
>1 yr 11 MEAN +/− SE −15.0694+/−3.2748 MEDIAN [RANGE] −9.6375[−76.4166 – 44.6174] CI [−39.2702,9.1313]

IN acoustic progression mean Last IN acoustic distance to last day-day 9 MEAN +/− SE −0.3475+/−0.0402 MEDIAN [RANGE] −0.38907[−1.0176 – 0.2705] CI [−0.6257,−0.069276] KruskalWallis ANOVA p value = 0.16984
<1yr to >1yr 7 MEAN +/− SE 0.0377+/−0.0538 MEDIAN [RANGE] 0.1414[−0.60348 – 0.3727] CI [−0.31055,0.38586]
>1 yr 9 MEAN +/− SE −0.1991+/−0.0721 MEDIAN [RANGE] −0.12088[−1.6982 – 0.68917] CI [−0.69764,0.29937]

mean Ratio of acoustic day-day 9 MEAN +/− SE 0.019+/−0.0108 MEDIAN [RANGE] −0.0090575[−0.083166 – 0.14485] CI [−0.055698,0.093642] KruskalWallis ANOVA p value = 0.81741
<1yr to >1yr 7 MEAN +/− SE −0.0051+/−0.0136 MEDIAN [RANGE] 0.014391[−0.17221 – 0.11397] CI [−0.092935,0.082812]
>1 yr 9 MEAN +/− SE −0.0003+/−0.008 MEDIAN [RANGE] 0.0065595[−0.11984 – 0.094904] CI [−0.05576,0.055129]

mean First IN acoustic distance to Last day-day 9 MEAN +/− SE −1.0099+/−0.2076 MEDIAN [RANGE] −0.48591[−4.4112 – 0.66241] CI [−2.4461,0.4262] KruskalWallis ANOVA p value = 0.81741
<1yr to >1yr 7 MEAN +/− SE −0.5839+/−0.4119 MEDIAN [RANGE] −0.16996[−6.7354 – 1.7692] CI [−3.2506,2.0828]
>1 yr 9 MEAN +/− SE −0.6614+/−0.244 MEDIAN [RANGE] −0.58999[−3.5397 – 2.3548] CI [−2.3495,1.0267]

IN temporal progression mean IN temporal similarity day-day 10 MEAN +/− SE −0.0415+/−0.0133 MEDIAN [RANGE] −0.016053[−0.34315 – 0.12322] CI [−0.13698,0.053904] KruskalWallis ANOVA p value = 0.5976
<1yr to >1yr 10 MEAN +/− SE −0.0375+/−0.0116 MEDIAN [RANGE] −0.058391[−0.22948 – 0.15545] CI [−0.12041,0.045359]
>1 yr 10 MEAN +/− SE 0.0105+/−0.0111 MEDIAN [RANGE] −0.0099938[−0.12124 – 0.22186] CI [−0.068633,0.089703]

IN individual acoustic properties mean Duration day-day 8 MEAN +/− SE −111.05+/−48.371 MEDIAN [RANGE] 24.4624[−1050.0514 – 134.5232] CI [−434.5631,212.463] KruskalWallis ANOVA p value = 0.95198
<1yr to >1yr 7 MEAN +/− SE 291.7934+/−108.7078 MEDIAN [RANGE] 22.0911[−562.1608 – 1549.8409] CI [−411.972,995.5588]
>1 yr 9 MEAN +/− SE 150.8497+/−55.5404 MEDIAN [RANGE] −31.729[−474.1411 – 1057.3896] CI [−233.3793,535.0787]

mean MeanFrequency day-day 8 MEAN +/− SE 0.0269+/−0.0087 MEDIAN [RANGE] 0.0082835[−0.044376 – 0.15728] CI [−0.031269,0.085104] KruskalWallis ANOVA p value = 0.50265
<1yr to >1yr 7 MEAN +/− SE 0.2975+/−0.0629 MEDIAN [RANGE] 0.34665[−0.26392 – 1.0618] CI [−0.10941,0.70441]
>1 yr 9 MEAN +/− SE 0.2592+/−0.0563 MEDIAN [RANGE] 0.039671[−0.16794 – 1.3985] CI [−0.13037,0.64879]

mean Entropy day-day 8 MEAN +/− SE −0.0315+/−0.2474 MEDIAN [RANGE] 0.18696[−3.3383 – 2.7798] CI [−1.6862,1.6232] KruskalWallis ANOVA p value = 0.46338
<1yr to >1yr 7 MEAN +/− SE −9.0364+/−1.754 MEDIAN [RANGE] −13.1006[−26.8391 – 5.6277] CI [−20.3913,2.3186]
>1 yr 9 MEAN +/− SE −8.8939+/−1.8216 MEDIAN [RANGE] −2.5427[−42.3353 – 9.6014] CI [−21.4956,3.7078]

mean LogAmplitude day-day 8 MEAN +/− SE −4.1258+/−1.4467 MEDIAN [RANGE] −3.4446[−24.2639 – 9.7269] CI [−13.8016,5.55] KruskalWallis ANOVA p value = 0.28124
<1yr to >1yr 7 MEAN +/− SE 22.8571+/−5.8646 MEDIAN [RANGE] 11.4677[−35.6068 – 84.9099] CI [−15.1097,60.824]
>1 yr 9 MEAN +/− SE −5.3387+/−4.7627 MEDIAN [RANGE] 13.6997[−75.7125 – 54.7122] CI [−38.2874,27.61]

mean PitchGoodness day-day 8 MEAN +/− SE 0.7614+/−0.2263 MEDIAN [RANGE] 0.85853[−1.8487 – 3.2726] CI [−0.75213,2.2748] KruskalWallis ANOVA p value = 0.067983
<1yr to >1yr 7 MEAN +/− SE 3.2852+/−0.7066 MEDIAN [RANGE] 2.8079[−2.8372 – 12.7516] CI [−1.2896,7.86]
>1 yr 9 MEAN +/− SE −2.4962+/−0.9237 MEDIAN [RANGE] −1.8307[−18.5602 – 11.9518] CI [−8.8867,3.8943]

mean FM day-day 8 MEAN +/− SE −0.0022+/−0.0007 MEDIAN [RANGE] −0.0023104[−0.013369 – 0.0067817] CI [−0.0071321,0.0027838] KruskalWallis ANOVA p value = 0.039514 Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  −15.077  −6.5  2.0771  0.17762
{‘day-day’}  {‘>1 yr’}  −5.6639  2.3889  10.442  0.76625
{‘1yr’}  {‘>1 yr’}  0.53715  8.8889  17.241  0.033743
<1yr to >1yr 7 MEAN +/− SE 0.0079+/−0.0014 MEDIAN [RANGE] 0.0052126[−0.0065648 – 0.022539] CI [−0.00092705,0.016636]
>1 yr 9 MEAN +/− SE −0.0094+/−0.0016 MEDIAN [RANGE] −0.0074783[−0.031784 – 0.0063546] CI [−0.02058,0.0017585]

IN variability: number Variability/ CV IN Number day-day 10 MEAN +/− SE −0.0187+/−0.0039 MEDIAN [RANGE] −0.010639[−0.079494 – 0.034191] CI [−0.046518,0.0092119] KruskalWallis ANOVA p value = 0.38459
<1yr to >1yr 11 MEAN +/− SE −0.0199+/−0.007 MEDIAN [RANGE] −5.1272e-05[−0.17488 – 0.077454] CI [−0.071309,0.031595]
>1 yr 12 MEAN +/− SE 0.0199+/−0.0053 MEDIAN [RANGE] 0.024832[−0.081275 – 0.13978] CI [−0.020633,0.06047]

IN variability: timing Variability/ CV Last IN Interval day-day 10 MEAN +/− SE 0.0181+/−0.0039 MEDIAN [RANGE] 0.014022[−0.016373 – 0.12097] CI [−0.0099727,0.04619] KruskalWallis ANOVA p value = 0.43052
<1yr to >1yr 11 MEAN +/− SE 0.0285+/−0.0222 MEDIAN [RANGE] −0.011954[−0.25775 – 0.68661] CI [−0.13518,0.19224]
>1 yr 12 MEAN +/− SE −0.0094+/−0.0026 MEDIAN [RANGE] −0.0081933[−0.068218 – 0.034897] CI [−0.029477,0.010764]

Variability/ CV Ratio of intervals day-day 10 MEAN +/− SE 0.0064+/−0.0164 MEDIAN [RANGE] 0.027374[−0.35553 – 0.25091] CI [−0.11082,0.12369] KruskalWallis ANOVA p value = 0.36706
<1yr to >1yr 11 MEAN +/− SE 0.2316+/−0.0309 MEDIAN [RANGE] 0.1856[−0.21458 – 0.73979] CI [0.0034375,0.45981]
>1 yr 12 MEAN +/− SE 0.0775+/−0.0257 MEDIAN [RANGE] 0.048876[−0.32217 – 0.92068] CI [−0.1188,0.27386]

Variability/ CV First IN interval day-day 10 MEAN +/− SE −0.0137+/−0.0162 MEDIAN [RANGE] −0.016848[−0.3406 – 0.17153] CI [−0.12935,0.10192] KruskalWallis ANOVA p value = 0.18575
<1yr to >1yr 11 MEAN +/− SE 0.1268+/−0.0193 MEDIAN [RANGE] 0.14451[−0.20024 – 0.46645] CI [−0.015772,0.26936]
>1 yr 12 MEAN +/− SE −0.0053+/−0.0185 MEDIAN [RANGE] −0.037041[−0.30882 – 0.43536] CI [−0.14616,0.13548]

IN variability acoustic progression Variability/ CV Last IN acoustic distance to last day-day 10 MEAN +/− SE −0.0918+/−0.0123 MEDIAN [RANGE] −0.080348[−0.41275 – 0.020987] CI [−0.18001,−0.0035068] KruskalWallis ANOVA p value = 0.042309 Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  −17.002  −7.1  2.802  0.21266
{‘day-day’}  {‘>1 yr’}  −19.97  −10.267  −0.56316  0.0351
{‘1yr’}  {‘>1 yr’}  −12.627  −3.1667  6.2932  0.71254
<1yr to >1yr 11 MEAN +/− SE −0.0158+/−0.0245 MEDIAN [RANGE] −0.017625[−0.64494 – 0.44022] CI [−0.19671,0.16509]
>1 yr 12 MEAN +/− SE 0.0521+/−0.0166 MEDIAN [RANGE] 0.063083[−0.44492 – 0.34816] CI [−0.074458,0.17869]

Variability/ CV Ratio of acoustic distance to last day-day 10 MEAN +/− SE −0.058+/−0.0206 MEDIAN [RANGE] −0.049568[−0.36106 – 0.38739] CI [−0.20515,0.089092] KruskalWallis ANOVA p value = 0.69068
<1yr to >1yr 11 MEAN +/− SE −0.1163+/−0.0364 MEDIAN [RANGE] −0.018213[−1.1743 – 0.32172] CI [−0.3854,0.15288]
>1 yr 12 MEAN +/− SE 0.0201+/−0.037 MEDIAN [RANGE] 0.0065369[−1.0488 – 0.85784] CI [−0.26179,0.30202]

Variability/ CV First IN acoustic distance to last day-day 10 MEAN +/− SE −0.0294+/−0.0103 MEDIAN [RANGE] −0.027447[−0.25547 – 0.085092] CI [−0.10308,0.044208] KruskalWallis ANOVA p value = 0.58736
<1yr to >1yr 11 MEAN +/− SE −0.0643+/−0.0127 MEDIAN [RANGE] −0.080123[−0.2751 – 0.2042] CI [−0.15855,0.029855]
>1 yr 12 MEAN +/− SE −0.0119+/−0.0115 MEDIAN [RANGE] −0.01908[−0.27526 – 0.20089] CI [−0.099899,0.076058]

IN variability temporal progression Variability/ CV IN temporal similarity day-day 10 MEAN +/− SE 0.01+/−0.013 MEDIAN [RANGE] 0.014167[−0.25792 – 0.25117] CI [−0.08284,0.10292] KruskalWallis ANOVA p value = 0.2413
<1yr to >1yr 11 MEAN +/− SE 0.0392+/−0.0203 MEDIAN [RANGE] −0.037849[−0.11084 – 0.59477] CI [−0.11062,0.18898]
>1 yr 11 MEAN +/− SE 0.091+/−0.0128 MEDIAN [RANGE] 0.10421[−0.11398 – 0.33483] CI [−0.0037432,0.18577]

IN variability individual acoustic properties Variability/ CV Duration day-day 8 MEAN +/− SE −0.004+/−0.0031 MEDIAN [RANGE] 0.0031015[−0.046894 – 0.028636] CI [−0.024799,0.016788] KruskalWallis ANOVA p value = 0.023576 Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  −3.4699  5.1071  13.684  0.34321
{‘day-day’}  {‘>1 yr’}  −12.692  −4.6389  3.4139  0.36744
{‘1yr’}  {‘>1 yr’}  −18.098  −9.746  −1.3943  0.017162
<1yr to >1yr 7 MEAN +/− SE −0.0602+/−0.0111 MEDIAN [RANGE] −0.067725[−0.20548 – 0.020437] CI [−0.1318,0.011329]
>1 yr 9 MEAN +/− SE 0.0312+/−0.005 MEDIAN [RANGE] 0.024413[−0.012636 – 0.13136] CI [−0.0031162,0.065454]

Variability/ CV MeanFrequency day-day 8 MEAN +/− SE 0.0119+/−0.0016 MEDIAN [RANGE] 0.014156[−0.015905 – 0.026102] CI [0.0015679,0.022302] KruskalWallis ANOVA p value = 0.34252
<1yr to >1yr 7 MEAN +/− SE 0.0061+/−0.0064 MEDIAN [RANGE] 0.011416[−0.061778 – 0.073862] CI [−0.035534,0.047768]
>1 yr 9 MEAN +/− SE −0.0129+/−0.0054 MEDIAN [RANGE] −0.00155[−0.13064 – 0.040464] CI [−0.050524,0.02463]

Variability/ CV Entropy day-day 8 MEAN +/− SE 0.0019+/−0.0017 MEDIAN [RANGE] 0.00057463[−0.016068 – 0.025586] CI [−0.0096725,0.013444] KruskalWallis ANOVA p value = 0.30481
<1yr to >1yr 7 MEAN +/− SE 0.0684+/−0.0189 MEDIAN [RANGE] 0.036479[−0.068524 – 0.34817] CI [−0.053753,0.19063]
>1 yr 9 MEAN +/− SE −0.278+/−0.0914 MEDIAN [RANGE] 0.019074[−2.4435 – 0.11887] CI [−0.91053,0.3546]

Variability/ CV LogAmplitude day-day 8 MEAN +/− SE −0.0025+/−0.0033 MEDIAN [RANGE] −0.0012403[−0.043856 – 0.034072] CI [−0.024717,0.019762] KruskalWallis ANOVA p value = 0.16444
<1yr to >1yr 7 MEAN +/− SE −0.045+/−0.0113 MEDIAN [RANGE] −0.04986[−0.14515 – 0.078516] CI [−0.11793,0.02799]
>1 yr 9 MEAN +/− SE 0.0102+/−0.0039 MEDIAN [RANGE] 0.0027708[−0.026547 – 0.088177] CI [−0.016941,0.037414]

Variability/ CV PitchGoodness day-day 8 MEAN +/− SE −0.0032+/−0.0036 MEDIAN [RANGE] −0.0043208[−0.04264 – 0.054537] CI [−0.02713,0.020723] KruskalWallis ANOVA p value = 0.010684 Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  −0.95207  7.625  16.202  0.093232
{‘day-day’}  {‘>1 yr’}  −10.983  −2.9306  5.1222  0.67001
{‘1yr’}  {‘>1 yr’}  −18.907  −10.556  −2.2038  0.0085829
<1yr to >1yr 7 MEAN +/− SE −0.0551+/−0.0078 MEDIAN [RANGE] −0.048224[−0.1559 – 0.00080264] CI [−0.1056,−0.0045073]
>1 yr 9 MEAN +/− SE 0.0137+/−0.0036 MEDIAN [RANGE] 0.019179[−0.046311 – 0.050554] CI [−0.010894,0.038251]

Variability/ CV FM day-day 8 MEAN +/− SE 0.0976+/−0.0299 MEDIAN [RANGE] 0.063292[−0.27683 – 0.56986] CI [−0.10227,0.29751] KruskalWallis ANOVA p value = 0.024223 Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  −2.5592  6.0179  14.595  0.22705
{‘day-day’}  {‘>1 yr’}  −11.733  −3.6806  4.3722  0.532
{‘1yr’}  {‘>1 yr’}  −18.05  −9.6984  −1.3467  0.017846
<1yr to >1yr 7 MEAN +/− SE 5.7541+/−2.9229 MEDIAN [RANGE] −0.74238[−6.5352 – 51.8705] CI [−13.1688,24.677]
>1 yr 9 MEAN +/− SE 4.9266+/−1.3632 MEDIAN [RANGE] 0.7803[0.4878 – 37.4909] CI [−4.5038,14.357]

SONG Number of motifs mean Number of motifs day-day 10 MEAN +/− SE 0.2026+/−0.1084 MEDIAN [RANGE] −0.15016[−1.1807 – 2.5053] CI [−0.57253,0.97777] KruskalWallis ANOVA p value = 0.071358
<1yr to >1yr 11 MEAN +/− SE 1.4231+/−0.122 MEDIAN [RANGE] 1.1011[0.53571 – 4.0663] CI [0.52176,2.3244]
>1 yr 12 MEAN +/− SE 0.2657+/−0.1103 MEDIAN [RANGE] 0.20406[−2.1523 – 3.2665] CI [−0.57497,1.1063]

SONG Motif duration mean Motif duration day-day 10 MEAN +/− SE 29.228+/−8.2369 MEDIAN [RANGE] 3.9954[−15.9994 – 262.1195] CI [−29.6952,88.1512] KruskalWallis ANOVA p value = 0.0073049 Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  3.1755  12.782  22.388  0.0051686
{‘day-day’}  {‘>1 yr’}  −1.7336  7.8727  17.479  0.1328
{‘1yr’}  {‘>1 yr’}  −14.284  −4.9091  4.4657  0.437
<1yr to >1yr 11 MEAN +/− SE −36.1327+/−3.103 MEDIAN [RANGE] −45.4232[−82.1703 – 8.1005] CI [−59.0635,−13.2019]
>1 yr 11 MEAN +/− SE −11.384+/−1.4579 MEDIAN [RANGE] −11.5948[−38.336 – 13.5182] CI [−22.1579,−0.61]

SONG Motif similarity mean Similarity Index b/w days day-day 10 MEAN +/− SE 0.7533+/−0.0229 MEDIAN [RANGE] 0.82157[0.11343 – 0.8877] CI [0.58926,0.9174] KruskalWallis ANOVA p value = 0.0099177
Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  2.7028  12.309  21.915  0.0075326
{‘day-day’}  {‘>1 yr’}  −4.7518  4.8545  14.461  0.46249
{‘1yr’}  {‘>1 yr’}  −16.829  −7.4545  1.9203  0.14936
<1yr to >1yr 11 MEAN +/− SE 0.5192+/−0.0217 MEDIAN [RANGE] 0.51651[0.097511 – 0.81988] CI [0.35885,0.67963]
>1 yr 11 MEAN +/− SE 0.7251+/−0.0202 MEDIAN [RANGE] 0.77332[0.068898 – 0.89233] CI [0.5757,0.87444]

mean Motif Temporal Similarity b/w days day-day 10 MEAN +/− SE 0.8508+/−0.0067 MEDIAN [RANGE] 0.8516[0.74702 – 0.94962] CI [0.80285,0.89867] KruskalWallis ANOVA p value = 0.0061719
Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  3.2728  12.5  21.727  0.0042759
{‘day-day’}  {‘>1 yr’}  −4.0272  5.2  14.427  0.38353
{‘1yr’}  {‘>1 yr’}  −16.527  −7.3  1.9272  0.15223
<1yr to >1yr 10 MEAN +/− SE 0.4524+/−0.0375 MEDIAN [RANGE] 0.47334[−0.27436 – 0.85254] CI [0.18411,0.72078]
>1 yr 10 MEAN +/− SE 0.7465+/−0.0161 MEDIAN [RANGE] 0.81097[0.42351 – 0.89456] CI [0.63163,0.86136]

SONG First syllable timing and acoustic properties mean a-b interval day-day 10 MEAN +/− SE −0.8401+/−0.5078 MEDIAN [RANGE] 0.18811[−8.8901 – 5.5271] CI [−4.4727,2.7925] KruskalWallis ANOVA p value = 0.23329
<1yr to >1yr 11 MEAN +/− SE −4.4853+/−0.62 MEDIAN [RANGE] −5.7741[−12.1102 – 5.1788] CI [−9.0674,0.096841]
>1 yr 12 MEAN +/− SE −0.9871+/−0.3742 MEDIAN [RANGE] −2.1705[−4.6817 – 11.8113] CI [−3.8401,1.866]

mean first syll. duration day-day 10 MEAN +/− SE 6.4948+/−1.5049 MEDIAN [RANGE] 1.8102[−7.7428 – 46.3523] CI [−4.2709,17.2605] KruskalWallis ANOVA p value = 0.070662
<1yr to >1yr 11 MEAN +/− SE −5.9773+/−1.0548 MEDIAN [RANGE] −1.9745[−29.2968 – 7.3544] CI [−13.7721,1.8174]
>1 yr 11 MEAN +/− SE −2.6005+/−0.3672 MEDIAN [RANGE] −1.6526[−11.5721 – 2.257] CI [−5.3139,0.11279]

mean first syll. meanfrequency day-day 10 MEAN +/− SE −151.733+/−35.1637 MEDIAN [RANGE] 6.7708[−1084.3736 – 59.1939] CI [−403.2792,99.8132] KruskalWallis ANOVA p value = 0.15753
<1yr to >1yr 11 MEAN +/− SE 391.7428+/−83.5108 MEDIAN [RANGE] 157.6735[−632.617 – 2409.3258] CI [−225.3937,1008.8794]
>1 yr 11 MEAN +/− SE 201.7046+/−61.525 MEDIAN [RANGE] 24.7215[−400.7622 – 2184.6467] CI [−252.9587,656.3679]

mean first syll. entropy day-day 10 MEAN +/− SE −0.0022+/−0.0084 MEDIAN [RANGE] −0.0086113[−0.14802 – 0.14057] CI [−0.062035,0.057607] KruskalWallis ANOVA p value = 0.86606
<1yr to >1yr 11 MEAN +/− SE 0.1305+/−0.0436 MEDIAN [RANGE] −0.0031437[−0.43748 – 1.1387] CI [−0.19172,0.45274]
>1 yr 11 MEAN +/− SE 0.0163+/−0.0157 MEDIAN [RANGE] 0.048293[−0.25637 – 0.36555] CI [−0.099529,0.13204]

mean first syll. logamplitude day-day 10 MEAN +/− SE 1.2641+/−0.3431 MEDIAN [RANGE] 0.13582[−2.2428 – 10.1959] CI [−1.19,3.7181] KruskalWallis ANOVA p value = 0.18769
<1yr to >1yr 11 MEAN +/− SE −10.3542+/−0.9537 MEDIAN [RANGE] −14.8694[−22.9947 – 5.9059] CI [−17.4023,−3.3062]
>1 yr 11 MEAN +/− SE −5.3161+/−1.7469 MEDIAN [RANGE] 1.0588[−43.9232 – 20.7091] CI [−18.2257,7.5935]

mean first syll. pitchgoodness day-day 10 MEAN +/− SE −2.5285+/−0.3072 MEDIAN [RANGE] −1.7103[−6.4773 – 1.8701] CI [−4.7264,−0.33064] KruskalWallis ANOVA p value = 0.00057452
Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  −24.079  −14.473  −4.8664  0.0012043
{‘day-day’}  {‘>1 yr’}  −11.715  −2.1091  7.4972  0.86425
{‘1yr’}  {‘>1 yr’}  2.9888  12.364  21.738  0.0056618
<1yr to >1yr 11 MEAN +/− SE 25.7401+/−2.2324 MEDIAN [RANGE] 15.7712[2.3762 – 77.0576] CI [9.2427,42.2375]
>1 yr 11 MEAN +/− SE 1.016+/−1.6216 MEDIAN [RANGE] −1.8068[−26.5565 – 37.4611] CI [−10.9673,12.9993]

mean first syll. FM day-day 10 MEAN +/− SE −0.0678+/−0.1779 MEDIAN [RANGE] 0.51513[−4.6709 – 1.2936] CI [−1.3401,1.2046] KruskalWallis ANOVA p value = 0.24517
<1yr to >1yr 11 MEAN +/− SE 2.2977+/−0.3432 MEDIAN [RANGE] 2.4831[−3.4269 – 9.0626] CI [−0.23826,4.8337]
>1 yr 11 MEAN +/− SE 0.1021+/−0.354 MEDIAN [RANGE] −0.27577[−6.3593 – 9.0956] CI [−2.5138,2.718]

SONG Sequence mean Sequence Linearity day-day 10 MEAN +/− SE −0.0064+/−0.0014 MEDIAN [RANGE] −0.00466[−0.033023 – 0.010378] CI [−0.016217,0.003357] KruskalWallis ANOVA p value = 0.26738
<1yr to >1yr 11 MEAN +/− SE 0.0085+/−0.0051 MEDIAN [RANGE] 0.010651[−0.092121 – 0.094655] CI [−0.029464,0.046449]
>1 yr 12 MEAN +/− SE 0.0137+/−0.0031 MEDIAN [RANGE] 0.0089315[−0.040662 – 0.10559] CI [−0.010246,0.037614]

mean Sequence Consistency day-day 10 MEAN +/− SE 0.0356+/−0.0077 MEDIAN [RANGE] 4.8284e-05[−0.025719 – 0.18262] CI [−0.019339,0.090479] KruskalWallis ANOVA p value = 0.91088
<1yr to >1yr 11 MEAN +/− SE 0.0144+/−0.0115 MEDIAN [RANGE] 0.0041893[−0.15141 – 0.25843] CI [−0.070296,0.099097]
>1 yr 12 MEAN +/− SE 0.0203+/−0.0078 MEDIAN [RANGE] −0.0034457[−0.077696 – 0.24331] CI [−0.039332,0.079836]

mean Sequence Entropy day-day 10 MEAN +/− SE −0.0441+/−0.0081 MEDIAN [RANGE] −0.040476[−0.16685 – 0.10099] CI [−0.10234,0.014168] KruskalWallis ANOVA p value = 0.024716
Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  −4.7565  5.1455  15.047  0.44254
{‘day-day’}  {‘>1 yr’}  −15.52  −5.8167  3.8868  0.33832
{‘1yr’}  {‘>1 yr’}  −20.422  −10.962  −1.5023  0.018147
<1yr to >1yr 11 MEAN +/− SE −0.1381+/−0.0194 MEDIAN [RANGE] −0.17927[−0.49211 – 0.23086] CI [−0.28177,0.0054986]
>1 yr 12 MEAN +/− SE 0.0489+/−0.0102 MEDIAN [RANGE] 0.0025373[−0.058314 – 0.3006] CI [−0.02922,0.127]

SONG variability Number of motifs variability/ CV Number of motifs day-day 10 MEAN +/− SE 0.0061+/−0.0044 MEDIAN [RANGE] 0.01163[−0.060488 – 0.10553] CI [−0.02569,0.037943] KruskalWallis ANOVA p value = 0.9075
<1yr to >1yr 11 MEAN +/− SE −0.0174+/−0.0119 MEDIAN [RANGE] −0.016933[−0.20432 – 0.26592] CI [−0.10557,0.070809]
>1 yr 12 MEAN +/− SE −0.0227+/−0.0107 MEDIAN [RANGE] 0.014456[−0.27843 – 0.14339] CI [−0.10426,0.058791]

SONG variability Motif duration variability/ CV Motif duration day-day 10 MEAN +/− SE −0.0002+/−0.0003 MEDIAN [RANGE] 5.2688e-05[−0.0040215 – 0.0045341] CI [−0.0022209,0.0017378] KruskalWallis ANOVA p value = 0.79223
<1yr to >1yr 11 MEAN +/− SE −0.0004+/−0.0004 MEDIAN [RANGE] −3.4335e-05[−0.0069788 – 0.0052676] CI [−0.0034333,0.0025572]
>1 yr 11 MEAN +/− SE −0.0012+/−0.0003 MEDIAN [RANGE] 1.8438e-05[−0.006412 – 0.003475] CI [−0.0035511,0.0011614]

SONG variability Motif similarity variability/ CV Similarity Index b/w days day-day 10 MEAN +/− SE 0.0168+/−0.0036 MEDIAN [RANGE] 0.0061638[−0.021014 – 0.11002] CI [−0.0089138,0.042558] KruskalWallis ANOVA p value = 0.58138
<1yr to >1yr 11 MEAN +/− SE 0.0233+/−0.0073 MEDIAN [RANGE] −0.013012[−0.072234 – 0.18034] CI [−0.030843,0.077447]
>1 yr 11 MEAN +/− SE −0.0001+/−0.0036 MEDIAN [RANGE] 0.004499[−0.066177 – 0.06819] CI [−0.026985,0.026744]

variability/ CV Motif Temporal Similarity b/w days day-day 10 MEAN +/− SE 0.0024+/−0.0033 MEDIAN [RANGE] −0.00087396[−0.041232 – 0.062861] CI [−0.021115,0.02597] KruskalWallis ANOVA p value = 0.94379
<1yr to >1yr 11 MEAN +/− SE 0.008+/−0.0077 MEDIAN [RANGE] 0.0031895[−0.12646 – 0.15639] CI [−0.048744,0.064727]
>1 yr 11 MEAN +/− SE 0.0139+/−0.0054 MEDIAN [RANGE] 0.0061457[−0.064618 – 0.10181] CI [−0.025661,0.053547]

SONG variability First motif syllable Timing and acoustic properties variability/ CV a-b interval day-day 10 MEAN +/− SE 0.0048+/−0.0036 MEDIAN [RANGE] 0.006541[−0.068773 – 0.069437] CI [−0.020815,0.030502] KruskalWallis ANOVA p value = 0.3443
<1yr to >1yr 11 MEAN +/− SE −0.0309+/−0.0057 MEDIAN [RANGE] −0.0096783[−0.17501 – 0.034413] CI [−0.073356,0.011538]
>1 yr 12 MEAN +/− SE −0.0112+/−0.0028 MEDIAN [RANGE] −0.0046618[−0.076364 – 0.040639] CI [−0.032934,0.010525]

variability/ CV first syll. duration day-day 10 MEAN +/− SE −0.0002+/−0.0006 MEDIAN [RANGE] 0.0008949[−0.014397 – 0.008081] CI [−0.004258,0.0039024] KruskalWallis ANOVA p value = 0.35878
<1yr to >1yr 11 MEAN +/− SE −0.0112+/−0.002 MEDIAN [RANGE] −0.0056469[−0.062141 – 0.014012] CI [−0.026271,0.0038214]
>1 yr 11 MEAN +/− SE 0.0008+/−0.0016 MEDIAN [RANGE] −0.0021715[−0.023275 – 0.046759] CI [−0.010703,0.012235]

variability/ CV first syll. meanfrequency day-day 10 MEAN +/− SE −0.0055+/−0.001 MEDIAN [RANGE] −0.0025316[−0.024134 – 0.0090383] CI [−0.012463,0.0015509] KruskalWallis ANOVA p value = 0.9193
<1yr to >1yr 11 MEAN +/− SE −0.0101+/−0.0029 MEDIAN [RANGE] −0.010687[−0.059918 – 0.057253] CI [−0.031614,0.011496]
>1 yr 11 MEAN +/− SE −0.012+/−0.0019 MEDIAN [RANGE] −0.0050096[−0.050071 – 0.0089345] CI [−0.026197,0.0021871]

variability/ CV first syll. entropy day-day 10 MEAN +/− SE −0.0005+/−0.0019 MEDIAN [RANGE] 0.0050826[−0.039327 – 0.021044] CI [−0.014012,0.013049] KruskalWallis ANOVA p value = 0.38537
<1yr to >1yr 11 MEAN +/− SE −0.0142+/−0.0031 MEDIAN [RANGE] −0.0076413[−0.061403 – 0.037881] CI [−0.036817,0.0083946]
>1 yr 11 MEAN +/− SE −0.0013+/−0.0015 MEDIAN [RANGE] −0.0046465[−0.022166 – 0.028778] CI [−0.012305,0.0096898]

variability/ CV first syll. logamplitude day-day 10 MEAN +/− SE 0.049+/−0.0137 MEDIAN [RANGE] 0.0025827[−0.0047802 – 0.43934] CI [−0.049387,0.14731] KruskalWallis ANOVA p value = 0.5123
<1yr to >1yr 11 MEAN +/− SE 0.0233+/−0.0077 MEDIAN [RANGE] −0.0025583[−0.070594 – 0.16975] CI [−0.033259,0.079812]
>1 yr 11 MEAN +/− SE −0.0113+/−0.0084 MEDIAN [RANGE] −0.0010149[−0.16967 – 0.15332] CI [−0.07347,0.050869]

variability/ CV first syll. pitchgoodness day-day 10 MEAN +/− SE 0.0032+/−0.0016 MEDIAN [RANGE] 0.0039579[−0.020149 – 0.029188] CI [−0.0081618,0.014606] KruskalWallis ANOVA p value = 0.32383
<1yr to >1yr 11 MEAN +/− SE −0.0158+/−0.0036 MEDIAN [RANGE] −0.0098688[−0.089673 – 0.049601] CI [−0.042089,0.010569]
>1 yr 11 MEAN +/− SE −0.0024+/−0.0018 MEDIAN [RANGE] −0.0043676[−0.028945 – 0.03971] CI [−0.015916,0.011124]

variability/ CV first syll. FM day-day 10 MEAN +/− SE 0.0096+/−0.0018 MEDIAN [RANGE] 0.0044912[−0.0025808 – 0.057731] CI [−0.0030461,0.022322] KruskalWallis ANOVA p value = 0.0043445
Group A  Group B  Lower Limit  A-B  Upper Limit  P-value
{‘day-day’}  {‘1yr’}  1.8119  11.418  21.024  0.014767
{‘day-day’}  {‘>1 yr’}  2.5391  12.145  21.752  0.0085546
{‘1yr’}  {‘>1 yr’}  −8.6475  0.72727  10.102  0.98194
<1yr to >1yr 11 MEAN +/− SE −0.0212+/−0.0048 MEDIAN [RANGE] −0.0046254[−0.17894 – 0.0063666] CI [−0.056712,0.014238]
>1 yr 11 MEAN +/− SE −0.0068+/−0.0006 MEDIAN [RANGE] −0.006761[−0.018602 – 0.0024346] CI [−0.011559,−0.0021225]

SONG variability Similarity of motifs within bouts variability/ CV similarity within motif day-day 10 MEAN +/− SE −2.4078+/−0.7858 MEDIAN [RANGE] 0.038583[−24.7646 – 0.48545] CI [−8.0287,3.2132] KruskalWallis ANOVA p value = 0.48074
<1yr to >1yr 11 MEAN +/− SE 4.1703+/−1.0823 MEDIAN [RANGE] 0.019058[−5.192 – 36.4043] CI [−3.8279,12.1685]
>1 yr 11 MEAN +/− SE 0.041+/−0.1997 MEDIAN [RANGE] −0.21903[−3.0156 – 6.0959] CI [−1.435,1.517]

Changes in mean or variability of IN properties between pairs of days belonging to 3 age categories: day-day, <1yr to >1yr and > 1 yr. All pairs of days are the same pairs of days selected for maximum change in IN number in Fig. 5. Names of properties that significantly change with age are highlighted in bold.
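
The three-group comparison reported for each property in this table can be illustrated with a short sketch. This is a minimal, standard-library version of a Kruskal-Wallis test, not the authors' analysis code; the function names `average_ranks` and `kruskal_wallis_3` are hypothetical. It assumes exactly three groups (day-day, <1yr to >1yr, >1 yr), so that the chi-square approximation has 2 degrees of freedom and its survival function reduces to exp(-H/2). The post-hoc Tukey-Kramer comparisons are not reproduced here.

```python
import math
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_3(groups):
    """Kruskal-Wallis H and chi-square p-value for exactly three groups."""
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups (df = 2)")
    pooled = [x for g in groups for x in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    # Standard tie correction; it is undefined if every value is identical.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical")
    h /= correction
    p_value = math.exp(-h / 2)  # chi-square survival function for df = 2
    return h, p_value
```

Calling `kruskal_wallis_3([day_day, young_to_old, old])` with three lists of per-bird change values returns the H statistic and the approximate p-value. For groups as small as those in the table, an exact or permutation-based p-value may differ from this approximation.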

IN features that change with age do not show bout-to-bout correlations with song features that change with age

If INs represented motor preparation, the correlations between IN features and song features on a bout-to-bout basis should predict long-term changes in both IN and song features (Fig. 1D). For instance, our results demonstrated an age-related speeding up of song (decrease in song motif duration) and an age-related increase in mean IN number. If INs were preparatory, then these changes would be predicted by an already existing negative correlation between the number of INs in a bout and song motif duration. However, we did not find any correlation between song motif duration and either the number of INs in a bout or the time-to-song (Fig. 7A, 7B, p = 0.86, Kruskal-Wallis test and p = 0.88, Wilcoxon signed-rank test, respectively). We also compared first song motif stereotypy between bouts with fewer INs than the median IN number and bouts with more INs than the median IN number. First song motif stereotypy was not significantly different (Fig. 7C, 7D, p = 0.44, Kruskal-Wallis test and p = 0.65, Wilcoxon signed-rank test, respectively).
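
The median-split comparison described above can be sketched for a single recording session as follows. This is an illustrative, standard-library example rather than the authors' code; the function name `median_split_means` and its arguments are hypothetical. It assumes that motif duration is normalized to the session median (as in Fig. 7A) and that bouts with exactly the median IN number are left out of both groups.

```python
from statistics import mean, median


def median_split_means(in_counts, motif_durations):
    """Mean normalized motif duration for bouts below vs. above the median IN number.

    in_counts[i] and motif_durations[i] describe the same bout.
    """
    if len(in_counts) != len(motif_durations):
        raise ValueError("one IN count and one motif duration are needed per bout")
    median_in = median(in_counts)
    median_duration = median(motif_durations)
    # Bouts with exactly the median IN number are excluded (an assumption).
    low = [d / median_duration
           for n, d in zip(in_counts, motif_durations) if n < median_in]
    high = [d / median_duration
            for n, d in zip(in_counts, motif_durations) if n > median_in]
    if not low or not high:
        raise ValueError("both groups need at least one bout")
    return mean(low), mean(high)
```

Applying this per bird gives one pair of values per bird, which can then be compared across birds with a paired test. Under the motor preparation hypothesis, the second value (more INs) would be expected to be systematically smaller than the first.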

Fig. 7. Correlated IN and song changes with age are not related within a session.

(A, C) Motif length normalized to median length (A) and motif similarity (C) are plotted for different IN numbers relative to the median IN number. Black squares represent mean across birds. (B, D) Motif length normalized to median length (B) and motif similarity (D) are plotted for short and long time-to-song. Squares and whiskers represent the mean and SEM across birds. In all plots, circles joined by lines represent mean values for one session from individual birds and different colours represent different birds. p > 0.05, Kruskal-Wallis ANOVA in (A) and (C); p > 0.05, Wilcoxon signed-rank test in (B) and (D).

Similarly, we did not find any bout-to-bout correlations between other IN features that change with age and song features that change with age (Table 3). Overall, these results demonstrate that song features and IN features change independently with age. The absence of correlations between features that change does not satisfy the predictions of the motor preparation hypothesis. Rather, these changes demonstrate independent age-related changes in both INs and songs, suggesting the possibility that age affects the neural circuits controlling INs and songs.

TABLE 3:

Within-day relationship of IN and Song properties that change with age.

IN Property changing with age Measured song property Corresponding IN Property groups Individual group data Test P

N Group means Group range Confidence Intervals

Mean Time to Song First song syll. mean pitch goodness Time to Song <Median 20 MEAN +/− SE 123.5228+/−2.2636 125.5742[50.9307 – 216.3257] CI [102.335,144.7106] sign-rank p value = 0.50159
Time to Song >Median 20 MEAN +/− SE 122.706+/−2.1943 125.3481[57.3089 – 211.6025] CI [102.1666,143.2454]

Song Sequence Entropy Time to Song <Median 20 MEAN +/− SE 0.1951+/−0.0082 MEDIAN [RANGE] 0.15957[0 – 0.48637] CI [0.11845,0.27181] sign-rank p value = 0.19299
Time to Song >Median 20 MEAN +/− SE 0.2318+/−0.0071 MEDIAN [RANGE] 0.24343[0 – 0.63016] CI [0.16515,0.29843]

First song syll. CV of Freq. modulation Time to Song <Median 20 MEAN +/− SE 0.0692+/−0.0025 MEDIAN [RANGE] 0.055[0.024748 – 0.26642] CI [0.045368,0.092963] sign-rank p value = 0.97022
Time to Song >Median 20 MEAN +/− SE 0.0701+/−0.0029 0.056609[0.015444 – 0.30047] CI [0.042704,0.097529]

Mean IN Frequency modulation First song syll. mean pitch goodness IN Freq. Modulation First IN < Median 17 MEAN +/− SE 128.1796+/−2.6248 126.6775[52.6362 – 212.7263] CI [105.2376,151.1215] sign-rank p value = 0.86841
IN Freq. Modulation First IN > Median 17 MEAN +/− SE 128.131+/−2.5845 124.8967[57.6679 – 214.1074] CI [105.5405,150.7215]

First song syll. mean pitch goodness IN Freq. Modulation LastIN < Median 17 MEAN +/− SE 127.4103+/−2.5747 125.6182[50.6308 – 203.6352] CI [104.906,149.9146] sign-rank p value = 0.61915
IN Freq. Modulation LastIN > Median 17 MEAN +/− SE 128.8826+/−2.6526 126.0253[60.9129 – 224.1673] CI [105.6972,152.0679]

Song Sequence Entropy IN Freq. Modulation First IN < Median 17 MEAN +/− SE 0.1822+/−0.0068 MEDIAN [RANGE] 0.19398[0 – 0.37821] CI [0.12307,0.24139] sign-rank p value = 0.94604
IN Freq. Modulation First IN > Median 17 MEAN +/− SE 0.1749+/−0.0073 MEDIAN [RANGE] 0.20596[0 – 0.40732] CI [0.11102,0.23869]

Song Sequence Entropy IN Freq. Modulation LastIN < Median 17 MEAN +/− SE 0.1831+/−0.0069 MEDIAN [RANGE] 0.20961[0 – 0.39832] CI [0.12301,0.24313] sign-rank p value = 0.39099
IN Freq. Modulation LastIN > Median 17 MEAN +/− SE 0.1745+/−0.0083 MEDIAN [RANGE] 0.15161[0 – 0.48851] CI [0.10185,0.24723]

First song syll. CV of Freq. modulation IN Freq. Modulation First IN < Median 17 MEAN +/− SE 0.0655+/−0.0033 0.053098[0.018979 – 0.27323] CI [0.036895,0.094017] sign-rank p value = 0.16257
IN Freq. Modulation First IN > Median 17 MEAN +/− SE 0.0711+/−0.0036 0.05341[0.020524 – 0.29628] CI [0.039201,0.10298]

First song syll. CV of Freq. modulation IN Freq. Modulation LastIN < Median 17 MEAN +/− SE 0.0664+/−0.0029 0.053298[0.016997 – 0.2475] CI [0.040922,0.09195] sign-rank p value = 0.52278
IN Freq. Modulation LastIN > Median 17 MEAN +/− SE 0.068+/−0.0036 0.053593[0.02202 – 0.29521] CI [0.036477,0.099552]

Motif duration mean IN Freq. Modulation First IN < Median 17 MEAN +/− SE 540.1806+/−8.1151 507.0181[347.9723 – 794.4851] CI [469.2495,611.1117] sign-rank p value = 0.22739
IN Freq. Modulation First IN > Median 17 MEAN +/− SE 539.6074+/−8.0783 511.695[348.1255 – 793.3096] CI [468.9981,610.2168]

Motif duration mean IN Freq. Modulation LastIN < Median 17 MEAN +/− SE 540.404+/−8.1084 513.1207[347.061 – 795.4847] CI [469.5315,611.2764] sign-rank p value = 0.17729
IN Freq. Modulation LastIN > Median 17 MEAN +/− SE 539.3276+/−8.0792 505.7908[348.9808 – 792.3758] CI [468.7109,609.9444]

First song syll. mean pitch goodness IN Freq. ModulationFirst IN < variable 17 MEAN +/− SE 128.7139+/−2.6547 125.1689[56.1686 – 216.0633] CI [105.5102,151.9176] sign-rank p value = 0.90579
IN Freq. ModulationFirst IN > variable 17 MEAN +/− SE 127.5747+/−2.5569 126.4439[54.5083 – 211.0079] CI [105.2262,149.9233]

First song syll. mean pitch goodness IN Freq. ModulationLastIN < variable 17 MEAN +/− SE 128.2524+/−2.5679 125.817[58.4789 – 217.5751] CI [105.8077,150.6971] sign-rank p value = 0.43476
IN Freq. ModulationLastIN > variable 17 MEAN +/− SE 128.0654+/−2.6364 126.3588[52.979 – 209.4515] CI [105.0218,151.109]

Song Sequence Entropy IN Freq. ModulationFirst IN < variable 17 MEAN +/− SE 0.1822+/−0.0068 MEDIAN [RANGE] 0.19398[0 – 0.37821] CI [0.12307,0.24139] sign-rank p value = 0.94604
IN Freq. ModulationFirst IN > variable 17 MEAN +/− SE 0.1749+/−0.0073 MEDIAN [RANGE] 0.20596[0 – 0.40732] CI [0.11102,0.23869]

Song Sequence Entropy IN Freq. ModulationLastIN < variable 17 MEAN +/− SE 0.1831+/−0.0069 MEDIAN [RANGE] 0.20961[0 – 0.39832] CI [0.12301,0.24313] sign-rank p value = 0.39099
IN Freq. ModulationLastIN > variable 17 MEAN +/− SE 0.1745+/−0.0083 MEDIAN [RANGE] 0.15161[0 – 0.48851] CI [0.10185,0.24723]

First song syll. CV of Freq. modulation IN Freq. ModulationFirst IN < variable 17 MEAN +/− SE 0.0681+/−0.0033 0.054209[0.011482 – 0.27301] CI [0.038987,0.097192] sign-rank p value = 0.79459
IN Freq. ModulationFirst IN > variable 17 MEAN +/− SE 0.0688+/−0.0036 0.055328[0.023822 – 0.29855] CI [0.037104,0.10043]

First song syll. CV of Freq. modulation IN Freq. ModulationLastIN < variable 17 MEAN +/− SE 0.0674+/−0.0035 0.052272[0.017337 – 0.28556] CI [0.037008,0.097777] sign-rank p value = 0.35596
IN Freq. ModulationLastIN > variable 17 MEAN +/− SE 0.0694+/−0.0033 0.057171[0.022597 – 0.27334] CI [0.040961,0.09793]

Motif duration mean IN Freq. ModulationFirst IN < variable 17 MEAN +/− SE 540.1084+/−8.0747 509.8513[348.6979 – 793.6949] CI [469.5305,610.6863] sign-rank p value = 0.17729
IN Freq. ModulationFirst IN > variable 17 MEAN +/− SE 539.6431+/−8.1125 508.7642[347.3595 – 793.9809] CI [468.7348,610.5513]

Motif duration mean IN Freq. ModulationLastIN < variable 17 MEAN +/− SE 540.1361+/−8.0878 508.9897[348.602 – 793.3415] CI [469.444,610.8283] sign-rank p value = 0.17729
IN Freq. ModulationLastIN > variable 17 MEAN +/− SE 539.5851+/−8.0994 509.7336[347.3936 – 794.3277] CI [468.7917,610.3786]

CV IN Pitch Goodness First song syll. mean pitch goodness IN Pitch GoodnessFirst IN < variable 17 MEAN +/− SE 127.2711+/−2.5544 128.3044[54.126 – 208.7475] CI [104.9437,149.5984] sign-rank p value = 0.14879
IN Pitch GoodnessFirst IN > variable 17 MEAN +/− SE 128.9532+/−2.6568 125.7771[56.5934 – 218.5388] CI [105.7313,152.175]

First song syll. mean pitch goodness IN Pitch GoodnessLastIN < variable 17 MEAN +/− SE 128.3615+/−2.5988 127.4243[56.1387 – 211.9064] CI [105.6465,151.0764] sign-rank p value = 0.46311
IN Pitch GoodnessLastIN > variable 17 MEAN +/− SE 127.8869+/−2.6044 125.9434[54.8024 – 215.287] CI [105.1226,150.6513]

Song Sequence Entropy IN Pitch GoodnessFirst IN < variable 17 MEAN +/− SE 0.1846+/−0.0068 MEDIAN [RANGE] 0.16901[0 – 0.39832] CI [0.12533,0.24383] sign-rank p value = 0.58789
IN Pitch GoodnessFirst IN > variable 17 MEAN +/− SE 0.1755+/−0.008 MEDIAN [RANGE] 0.18068[0 – 0.39237] CI [0.1057,0.24528]

Song Sequence Entropy IN Pitch GoodnessLastIN < variable 17 MEAN +/− SE 0.1904+/−0.0078 MEDIAN [RANGE] 0.19053[0 – 0.43366] CI [0.12214,0.25863] sign-rank p value = 0.39099
IN Pitch GoodnessLastIN > variable 17 MEAN +/− SE 0.174+/−0.0067 MEDIAN [RANGE] 0.1985[0 – 0.37024] CI [0.11542,0.23264]

First song syll. CV of Freq. modulation IN Pitch GoodnessFirst IN < variable 17 MEAN +/− SE 0.069+/−0.0036 0.05312[0.019788 – 0.29365] CI [0.037933,0.10008] sign-rank p value = 0.86841
IN Pitch GoodnessFirst IN > variable 17 MEAN +/− SE 0.0682+/−0.0034 0.052886[0.018332 – 0.27788] CI [0.038838,0.097469]

First song syll. CV of Freq. modulation IN Pitch GoodnessLastIN < variable 17 MEAN +/− SE 0.0661+/−0.0032 0.048979[0.015658 – 0.26418] CI [0.038071,0.094113] sign-rank p value = 0.13592
IN Pitch GoodnessLastIN > variable 17 MEAN +/− SE 0.071+/−0.0037 MEDIAN [RANGE] 0.0558[0.024032 – 0.30476] CI [0.038706,0.10325]

Motif duration mean IN Pitch GoodnessFirst IN < variable 17 MEAN +/− SE 540.118+/−8.0985 510.4636[347.9831 – 794.9476] CI [469.3327,610.9033] sign-rank p value = 0.19299
IN Pitch GoodnessFirst IN > variable 17 MEAN +/− SE 539.6061+/−8.0889 508.0754[348.1141 – 792.7513] CI [468.9045,610.3076]

Motif duration mean IN Pitch GoodnessLastIN < variable 17 MEAN +/− SE 539.7045+/−8.0856 510.0202[347.1989 – 795.2668] CI [469.0315,610.3774] sign-rank p value = 0.49246
IN Pitch GoodnessLastIN > variable 17 MEAN +/− SE 540.0411+/−8.1053 508.5743[348.9418 – 792.4381] CI [469.1962,610.8859]

CV IN Duration First song syll. mean pitch goodness IN DurationFirst IN < variable 17 MEAN +/− SE 128.4482+/−2.6013 125.6802[57.8534 – 207.1889] CI [105.711,151.1854] sign-rank p value = 0.79459
IN DurationFirst IN > variable 17 MEAN +/− SE 127.874+/−2.6139 125.9007[53.434 – 220.5358] CI [105.0267,150.7213]

First song syll. mean pitch goodness IN DurationLastIN < variable 17 MEAN +/− SE 127.2207+/−2.623 125.0715[53.0123 – 211.6273] CI [104.2942,150.1472] sign-rank p value = 0.02772
IN DurationLastIN > variable 17 MEAN +/− SE 129.1233+/−2.5822 126.5474[58.4331 – 215.5743] CI [106.5534,151.6933]

Song Sequence Entropy IN DurationFirst IN < variable 17 MEAN +/− SE 0.1909+/−0.0068 MEDIAN [RANGE] 0.21116[0 – 0.39477] CI [0.13168,0.25002] sign-rank p value = 0.16772
IN DurationFirst IN > variable 17 MEAN +/− SE 0.1687+/−0.0076 MEDIAN [RANGE] 0.18119[0 – 0.39832] CI [0.10242,0.23496]

Song Sequence Entropy IN DurationLastIN < variable 17 MEAN +/− SE 0.1949+/−0.0083 MEDIAN [RANGE] 0.21285[0 – 0.437] CI [0.12244,0.26727] sign-rank p value = 0.14648
IN DurationLastIN > variable 17 MEAN +/− SE 0.1702+/−0.006 MEDIAN [RANGE] 0.19892[0 – 0.33124] CI [0.11807,0.22243]

First song syll. CV of Freq. modulation IN DurationFirst IN < variable 17 MEAN +/− SE 0.0703+/−0.0036 0.056273[0.021323 – 0.29779] CI [0.038726,0.10185] sign-rank p value = 0.20967
IN DurationFirst IN > variable 17 MEAN +/− SE 0.0663+/−0.0034 0.048202[0.018392 – 0.07986] CI [0.036368,0.096161]

First song syll. CV of Freq. modulation IN DurationLastIN < variable 17 MEAN +/− SE 0.0715+/−0.004 0.06134[0.021127 – 0.32388] CI [0.036949,0.10608] sign-rank p value = 0.52278
IN DurationLastIN > variable 17 MEAN +/− SE 0.0651+/−0.0028 0.050541[0.018461 – 0.23581] CI [0.040397,0.089759]

Motif duration mean IN DurationFirst IN < variable 17 MEAN +/− SE 540.0089+/−8.0909 508.0247[348.5905 – 793.3571] CI [469.2893,610.7285] sign-rank p value = 0.83131
IN DurationFirst IN > variable 17 MEAN +/− SE 539.7405+/−8.0971 510.8192[347.4729 – 794.3124] CI [468.9674,610.5135]

Motif duration mean IN DurationLastIN < variable 17 MEAN +/− SE 539.9145+/−8.1055 510.6374[347.7909 – 794.1965] CI [469.0675,610.7616] sign-rank p value = 0.79459
IN DurationLastIN > variable 17 MEAN +/− SE 539.8063+/−8.0804 507.8798[348.317 – 793.4885] CI [469.179,610.4336]

For IN properties (time to song, frequency modulation, CV of pitch goodness and CV of duration) and song properties (motif duration, mean pitch goodness of the first song syllable, sequence entropy of the motif, CV of frequency modulation of the first song syllable) that change with age, within-day relationships were compared. Bouts were split into two groups based on the mean or CV of the IN property, and the corresponding mean or CV of the song property was measured for each group within each bird. The paired groups were tested for statistical differences using the Wilcoxon signed-rank test.
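
The paired comparison described in this legend can be sketched as follows. This is a hedged, standard-library illustration of an exact two-sided Wilcoxon signed-rank test, not the authors' implementation; the function name `signed_rank_exact` is hypothetical. It assumes one value per bird for each of the two groups, drops zero differences, and averages ranks over ties. It enumerates every sign assignment, which is practical only for small samples such as the 17 to 20 birds here. Statistical packages may use a different method (for example, a normal approximation), so p-values need not match the table exactly.

```python
from itertools import product


def signed_rank_exact(x, y):
    """Two-sided exact Wilcoxon signed-rank p-value for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("x and y must hold one value per bird")
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    abs_sorted = sorted(abs(d) for d in diffs)

    def rank_of(value):
        first = abs_sorted.index(value)   # 0-based position of the first tie
        count = abs_sorted.count(value)   # number of tied values
        return first + (count + 1) / 2    # average 1-based rank over the ties

    ranks = [rank_of(abs(d)) for d in diffs]
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    # Under the null hypothesis each rank carries a + or - sign with equal probability.
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed + 1e-12:
            hits += 1
    return hits / 2 ** n
```

Here `x` would hold, for each bird, the song property measured in bouts below the median (or with lower CV) of the IN property, and `y` the same song property in the remaining bouts.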

IN acoustic features share similarities with other syllable repeats but speeding up of intervals between INs is unique to IN repeats

Our current results suggest that none of the IN properties, namely number, timing, and acoustic features, represent motor preparation for features of the upcoming first song syllable. Further, IN acoustic features co-vary with song syllable acoustic features. One possible reason for this correlation in acoustic features is that INs, like song syllables, are vocalizations. In fact, a recent study has shown that mean IN number and IN acoustic features are learned by young zebra finches, similar to the learning of song syllables (Kalra et al., 2021). Unlike most song syllables, INs repeat, and this repetition may drive some of the changes in the properties of INs as they progress to song. To examine this possibility, we next asked whether the properties of INs are more similar to the properties of song syllables that repeat. Such song syllable repeats are present in a small fraction of zebra finches (Fig. 8A top; example spectrogram with repeat of the song syllable ‘c’). Zebra finches also repeat calls (non-song vocalizations used for communication) outside of their song bouts (Fig. 8A bottom). To compare repeats across these three categories, we chose birds with motif syllable repeats. Neither the mean IN number nor the bout-to-bout variability of IN number was significantly different from the mean number and variability of repeats for the other categories (Fig. 8B, p = 0.049, Kruskal-Wallis test followed by post-hoc Tukey-Kramer test, p = 0.31 for IN vs. song and p = 0.67 for IN vs. calls; Fig. 8C, p = 0.20, Kruskal-Wallis test).

Fig. 8. Temporal progression of INs is not present in other repeats.

(A) Top - example spectrogram of a bird that produces repeating INs ‘i’ and repeating syllable ‘c’ within the motif. Bottom - example of a bird producing repeats of call syllables ‘z’ outside song. (B-E) Comparison of repeat number (B left), repeat number variability (B right), repeat interval (C), median ratio of repeat interval (D), and median ratio of acoustic features of successive syllables (E) between different types of repeats (motif repeats, IN repeats or call repeats). Circles represent data from individual birds, squares and whiskers represent the mean and SEM across birds. In C, lines join data from the same repeat syllable across different positions. ** denotes p < 0.01, Wilcoxon signed-rank test in (C); * denotes p < 0.05 and *** denotes p < 0.005, Kruskal-Wallis test followed by post-hoc Tukey-Kramer test in (D). Dashed lines in D and E represent a ratio of 1.

Intervals between repeat syllables differed between the three categories of repeats. For all categories of repeats, we compared the mean interval between the first two syllables in the repeat sequence (first interval) with the mean interval between the last syllable of the repeat sequence and the next syllable in the bout (last interval). As shown earlier (Rajan and Doupe, 2013; Rao et al., 2019), for IN repeats, the last interval was significantly shorter than the first interval (p = 0.004, Wilcoxon signed-rank test), while the opposite was true for call repeats, with the last interval being significantly longer than the first interval (Fig. 8C, p = 0.008, Wilcoxon signed-rank test). For motif syllable repeats, we observed differences depending on the position of the repeat within the song motif: for repeats that were the last syllable of the song motif, the last interval was significantly longer than the first interval (p = 0.002, Wilcoxon signed-rank test), while for repeats that occurred at the beginning of the song motif, the last and first intervals were not significantly different (Fig. 8C). Thus, the speeding up of intervals between successive INs was a unique feature of INs (Fig. 8D; the ratio of successive intervals is less than 1 only for IN repeats; mean +/− SE: 0.65+/−0.03). The acoustic features of all repeats changed from the first to the last syllable of the repeat, and the extent of this change was not significantly different between categories (Fig. 8E, p = 0.71), suggesting that changes in acoustic features were a property of syllable repetition. Overall, these results show that bout-to-bout variation in the number of INs and variation in the acoustic properties of successive INs are properties of all syllable repeats. However, the speeding up of intervals between successive INs is not a property of all syllable repeats but is unique to INs at the beginning of the bout.
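
The interval measures used in this comparison can be sketched as follows. This is an illustrative, standard-library example and not the authors' code; the function names are hypothetical. It assumes that an interval is the silent gap from the offset of one syllable to the onset of the next, and that the input lists cover the repeated syllables followed by the next syllable in the bout, so that the final gap corresponds to the "last interval".

```python
from statistics import median


def repeat_intervals(onsets, offsets):
    """Gaps between consecutive syllables: onset of the next minus offset of the current."""
    if len(onsets) != len(offsets) or len(onsets) < 2:
        raise ValueError("need matching onsets/offsets for at least two syllables")
    return [onsets[i + 1] - offsets[i] for i in range(len(onsets) - 1)]


def median_successive_ratio(intervals):
    """Median of interval[i + 1] / interval[i]; values below 1 indicate speeding up."""
    if len(intervals) < 2:
        raise ValueError("need at least two intervals to form a ratio")
    ratios = [intervals[i + 1] / intervals[i] for i in range(len(intervals) - 1)]
    return median(ratios)


# Hypothetical onset/offset times (ms) for three INs followed by the first song syllable.
example_onsets = [0, 200, 350, 450]
example_offsets = [50, 250, 400, 500]
example_intervals = repeat_intervals(example_onsets, example_offsets)
example_ratio = median_successive_ratio(example_intervals)
```

In this made-up example the gaps shrink with each repeat, so the median ratio falls below 1, which is the pattern reported only for IN repeats. Constant or lengthening gaps, as described for motif syllable and call repeats, would give ratios at or above 1.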

Discussion

Here, we tested two predictions of the hypothesis that the variable repeats of INs that precede song represent motor preparation for the upcoming song. First, we examined bout-to-bout correlations between the first song syllable and three properties of INs, namely, (1) number before song, (2) timing of INs and (3) acoustic features of INs. Neither the number nor the timing of INs was correlated with features of the first song syllable. Log amplitude and mean frequency of the last IN were correlated with the corresponding acoustic features of the first song syllable. However, acoustic features of the first song syllable were also correlated with those of the first IN and the second song syllable. This suggested that INs are also vocalizations that are part of the song sequence and that IN-song correlations reflect global bout-to-bout variation in acoustic features. Second, we found an age-related increase in mean IN number along with an age-related increase in the speed of song. Age-related changes in INs and song were independent and were not predicted by pre-existing bout-to-bout IN-song correlations in the same sets of features. Finally, we compared IN repeats with other types of syllable repeats. Just like IN repeats, other syllable repeats also varied in number between bouts and varied in acoustic properties as the repeats progressed. The speeding up of intervals between INs with each successive repeat was unique to IN repeats, while the intervals between other kinds of syllable repeats remained constant or increased with each successive repeat. Overall, these results show that INs and song syllables share similarities in acoustic features that suggest shared neural control, while the timing of INs and song syllables differs, highlighting possible differences in the neural control of timing.

Song control pathways may control changes in IN acoustic features

Our results show correlations between IN acoustic properties and the corresponding acoustic properties of the first song syllable. While this is expected from the motor preparation hypothesis, we also found correlations between the first IN and the first song syllable and between the first and second song syllables, suggesting the presence of global correlations between syllables within a bout. Similar correlations in the duration of syllables within a song motif have been described earlier (Glaze and Troyer, 2006). Our results show that such correlations extend to INs as well, suggesting that INs are also syllables similar to song syllables. Consistent with this idea, we also found changes in acoustic features for song syllables that repeat within song and for non-song vocalizations that repeat outside of song sequences. These results suggest that INs are controlled by the same neural circuitry that controls song syllable production.

What areas of the brain control song syllable and IN production? Song syllable structure and timing are known to be controlled by the song motor pathway consisting of premotor nucleus HVC (used as a proper name) and its projection to the robust nucleus of the arcopallium (RA) (Hahnloser et al., 2002; Long and Fee, 2008; Fee and Scharff, 2010). There is some evidence that INs might also be controlled by the same circuitry, albeit with some differences. HVC neurons and RA neurons show temporally precise bursts of neural activity during song syllables (McCasland, 1987; Yu and Margoliash, 1996; Hahnloser et al., 2002; Kozhevnikov and Fee, 2007; Long et al., 2010) and some of these neurons also show precise bursts of neural activity during INs (Hahnloser et al., 2002; Kozhevnikov and Fee, 2007; Rajan and Doupe, 2013). However, some HVC neurons also show differences in activity depending on the position of INs within the sequence (Rajan and Doupe, 2013), similar to HVC firing in other songbird species with more complex songs with syllable repeats (Fujimoto et al., 2011; Cohen et al., 2020). Cooling of HVC slows down song syllables at multiple timescales and also slows down INs, albeit to 70% of the extent to which song syllables slow down (Long and Fee, 2008). Finally, complete bilateral lesions of HVC abolish the normal production of song syllables (Nottebohm et al., 1976; Simpson and Vicario, 1990; Aronov et al., 2008; Chen et al., 2014), but whether they entirely abolish INs is unclear. Outside of the song motor pathway, a recent study has also shown that lesions of the midbrain dopaminergic nucleus A11 make birds completely mute for song (Ben-Tov et al., 2023). Given the soft amplitude of initial INs, it is unclear whether these birds produce a few INs that do not progress normally to song.
Overall, our results and the results of earlier studies suggest that IN production is also partly controlled by the song motor pathway and partly controlled by neural circuits outside of the song motor pathway.

What controls the speeding up of gaps between INs?

Our results show that IN timing was not correlated with the timing of the upcoming song syllables. However, the speeding up of intervals between successive INs, which characterizes IN timing, was not a feature of other syllables that repeat. In fact, for other syllables that repeated, intervals between successive syllables remained the same or became longer. This suggests that the speeding up of intervals between INs could indicate the time to song initiation (Rao et al., 2019). In primates, one aspect of neural preparatory activity is its consistent and strong correlation with the time at which the movement is initiated (Kaufman et al., 2016). Similarly, the speeding up of intervals between INs could reflect the readiness of specific brain regions to initiate song.
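
As a concrete illustration of what "speeding up" means here, the sketch below computes the intervals between successive repeats and the ratio of each interval to the preceding one; ratios below 1 indicate speeding up. This is a minimal sketch and not necessarily the exact measure used in our analyses. It assumes that an interval is the silent gap from the offset of one repeat to the onset of the next, and the function names, the 5% tolerance and the example times are hypothetical.

```python
def interval_ratios(onsets_ms, offsets_ms):
    """Return the gaps between successive repeats and successive gap ratios.

    A gap is taken (by assumption) as onset of repeat i+1 minus offset
    of repeat i. Each ratio is a gap divided by the gap before it.
    """
    if len(onsets_ms) != len(offsets_ms):
        raise ValueError("need one offset per onset")
    gaps = [onsets_ms[i + 1] - offsets_ms[i] for i in range(len(onsets_ms) - 1)]
    ratios = [later / earlier for earlier, later in zip(gaps, gaps[1:])]
    return gaps, ratios


def interval_trend(ratios, tolerance=0.05):
    """Classify a repeat sequence from the mean of its successive gap ratios."""
    if not ratios:
        raise ValueError("need at least three repeats to form one ratio")
    mean_ratio = sum(ratios) / len(ratios)
    if mean_ratio < 1.0 - tolerance:
        return "speeding up"
    if mean_ratio > 1.0 + tolerance:
        return "slowing down"
    return "constant"


# Made-up onset and offset times (ms) for four repeats of one syllable.
gaps, ratios = interval_ratios([0, 200, 350, 470], [60, 260, 410, 530])
print(gaps, interval_trend(ratios))  # gaps of 140, 90 and 60 ms: "speeding up"
```

In these terms, IN repeats would fall in the "speeding up" category, whereas the other repeated syllables described above would be classified as "constant" or "slowing down".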

Which brain regions could be involved in this process? One candidate region is the premotor nucleus HVC and its inputs. As mentioned above, HVC controls the temporal dynamics of syllable production and is necessary for normal song production (Nottebohm et al., 1976; Simpson and Vicario, 1990; Aronov et al., 2008; Long and Fee, 2008). Intracellular recordings have shown that HVC(RA) neurons (HVC neurons projecting to motor nucleus RA) are more depolarised during singing compared to non-singing and that this depolarisation begins during the INs (Long et al., 2010). This depolarisation could potentially come from dopaminergic input from the midbrain dopaminergic nucleus A11. In support of the idea that this dopaminergic input is important for IN progression to song, dopaminergic blockers in HVC make birds sing strings of INs directed towards female birds, without progressing to song (Ben-Tov et al., 2023).

How could this depolarisation of HVC neurons during INs drive IN progression to song? Initiation of song syllables is believed to be driven by thalamic input, from nucleus Uvaeformis (UVa), to HVC at the start of each song syllable (Moll et al., 2023). UVa neurons are also active at the onset of INs, suggesting the possibility that UVa input drives HVC at the start of each IN (Danish et al., 2017). The depolarisation state of HVC could play a role in determining the time when UVa input successfully initiates an IN: the more depolarised the state of HVC, the sooner HVC neurons are activated, and this in turn could lead to a shorter interval between INs. In this model, the speeding up of INs would reflect the gradual reduction in the time taken by HVC neurons to respond to UVa input. This model also predicts that INs would not be necessary when HVC neurons are already depolarised. This is supported by the fact that successive songs within a bout can be initiated without INs (Sossinka and Böhner, 1980; Rajan and Doupe, 2013).
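
The qualitative behaviour of this model can be sketched with a toy calculation. The code below is a deliberately simplified illustration and not a biophysical model: it assumes that a constant UVa-like drive moves HVC linearly from its baseline to a fixed threshold, and that the baseline depolarises by a fixed step with each IN until a "song-ready" level is reached. Every number (membrane potentials, drive rate, step size, fixed delay) and every function name is an arbitrary, hypothetical choice made only to show the direction of the effect.

```python
def response_latency(baseline_mv, threshold_mv=-40.0, drive_mv_per_ms=0.5):
    """Time (ms) for a constant drive to bring HVC from baseline to threshold.

    Linear ramp by assumption; returns 0 if baseline is already at threshold.
    """
    return max(threshold_mv - baseline_mv, 0.0) / drive_mv_per_ms


def simulate_in_intervals(start_mv=-70.0, song_ready_mv=-50.0,
                          step_mv=5.0, fixed_ms=50.0):
    """Intervals (ms) between INs while the HVC baseline depolarises.

    Each pass of the loop stands for one IN: the interval is a fixed
    delay plus the response latency at the current baseline, after which
    the baseline depolarises by step_mv (all values hypothetical).
    """
    intervals = []
    baseline = start_mv
    while baseline < song_ready_mv:
        intervals.append(fixed_ms + response_latency(baseline))
        baseline += step_mv
    return intervals


print(simulate_in_intervals())                # intervals shorten with each IN
print(simulate_in_intervals(start_mv=-50.0))  # already depolarised: no INs
```

In this sketch, intervals shorten with each successive IN because the response latency falls as the baseline approaches threshold, and a bout that starts from an already depolarised state produces no INs, mirroring the two predictions described above.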

Do INs have a preparatory function?

First, our results do not completely support the predictions of the motor preparation hypothesis postulated based on neural activity in primates. This suggests that the parallels between INs reaching a consistent “state” before song initiation and primate preparatory neural activity reaching a consistent neural state before movement initiation are restricted to their correlations with the time to movement initiation. This hypothesis predicts that disruption of INs and IN-related activity in HVC would only delay song initiation without affecting the features of upcoming song. Further studies disrupting IN-related neural activity in HVC could be used to test this.

Second, our results showing the similarities in the acoustic features of INs and other repeated syllables suggest that INs might be a special form of repeating syllables that occur at the beginning of zebra finch song bouts. Other songbird species with more complex song bouts have syllable repeats that occur regularly within the song (Okanoya, 2004; Yarden et al., n.d.). However, two questions remain unanswered, namely, (1) do IN-like repeating syllables occur at the beginning of song bouts of other songbird species, and (2) do the properties of IN repeats share similarities with syllable repeats that are present in the song sequences of other songbird species? Further comparative analysis of song bouts across different songbird species could be used to address these questions.

Overall, our results suggest that IN acoustic structure and repetition share similarities with song syllables, suggesting shared neural control of IN acoustic structure and repeat number. The speeding up of intervals between INs is a unique feature of INs and could reflect different neural mechanisms controlling the timing of INs.

Supplementary Material

Supplement 1

Extended Data Table supporting Figure 2

Statistical test results for data from Fig. 2 to Fig. 8

media-1.xls (176KB, xls)

Acknowledgments

We would like to thank Prakash Raut for bird care. We thank Shikha Kalra, Ananya Kumar, Aditi Agarwal, Vishruta Yawatkar, Harsha K Kumar, Sharvari Tamhankar and Gaurav Isola for labeled songs or songs recorded at different ages, and Harini Suri for sharing head-implanted microphone recordings. We thank Allison Doupe and Michael Brainard for song recordings of birds with repeat syllables. We also thank Mimi Kao, Michael Long and members of the Rajan lab for useful discussions. DR is currently at the Department of Neuroscience, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark.

Funding sources:

This work was supported by a DBT/Wellcome Trust India Alliance Senior Fellowship (IA/S/21/1/505621), a Department of Biotechnology (DBT) Ramalingaswami Fellowship (BT/HRD/35/02/2006), and grants from the Science and Engineering Research Board (EMR/2015/000829, CRG/2021/004690) and the Department of Science and Technology, Cognitive Science Research Initiative (DST/CSRI/2017/163) to RR; a National Institutes of Health grant (R01 MH55987) to Allison Doupe, where RR collected some of the songs that were used in this study; and a graduate student fellowship from IISER Pune, travel support from the Department of Biotechnology, Ministry of Science and Technology-Conference, Travel, Exhibition and Popular Lectures (DBT/CTEP/02/2018 0847433) and the Infosys Foundation Travel Award (IISER-P/InfyFnd/Trv/116) to DR.

Footnotes

Conflict of Interest:

The authors report no conflict of interest.

References

  1. Aronov D, Andalman AS, Fee MS (2008) A specialized forebrain circuit for vocal babbling in the juvenile songbird. Science 320:630–634.
  2. Ben-Tov M, Duarte F, Mooney R (2023) A neural hub for holistic courtship displays. Curr Biol 33:1640–1653.
  3. Bottjer SW, Arnold AP (1984) The role of feedback from the vocal organ. I. Maintenance of stereotypical vocalizations by adult zebra finches. J Neurosci 4:2387–2396.
  4. Brainard MS, Doupe AJ (2001) Postlearning consolidation of birdsong: stabilizing effects of age and anterior forebrain lesions. J Neurosci 21:2501–2517.
  5. Chen JR, Stepanek L, Doupe AJ (2014) Differential contributions of basal ganglia and thalamus to song initiation, tempo, and structure. J Neurophysiol 111:248–257.
  6. Churchland MM, Cunningham JP, Kaufman MT, Ryu SI, Shenoy KV (2010) Cortical preparatory activity: representation of movement or first cog in a dynamical machine? Neuron 68:387–400.
  7. Churchland MM, Santhanam G, Shenoy KV (2006a) Preparatory activity in premotor and motor cortex reflects the speed of the upcoming reach. J Neurophysiol 96:3130–3146.
  8. Churchland MM, Yu BM, Ryu SI, Santhanam G, Shenoy KV (2006b) Neural variability in premotor cortex provides a signature of motor preparation. J Neurosci 26:3697–3712.
  9. Cohen Y, Shen J, Semu D, Leman DP, Liberti III WA, Perkins LN, Liberti DC, Kotton DN, Gardner TJ (2020) Hidden neural states underlie canary song syntax. Nature 582:539–544.
  10. Danish HH, Aronov D, Fee MS (2017) Rhythmic syllable-related activity in a songbird motor thalamic nucleus necessary for learned vocalizations. PLoS One 12:e0169568.
  11. Fee MS, Scharff C (2010) The songbird as a model for the generation and learning of complex sequential behaviors. ILAR J 51:362–377.
  12. Fujimoto H, Hasegawa T, Watanabe D (2011) Neural Coding of Syntactic Structure in Learned Vocalizations in the Songbird. J Neurosci 31:10023–10033.
  13. Glaze CM, Troyer TW (2006) Temporal structure in zebra finch song: implications for motor coding. J Neurosci 26:991–1005.
  14. Glaze CM, Troyer TW (2013) Development of temporal structure in zebra finch song. J Neurophysiol 109:1025–1035.
  15. Guo ZV, Li N, Huber D, Ophir E, Gutnisky D, Ting JT, Feng G, Svoboda K (2014) Flow of cortical activity underlying a tactile decision in mice. Neuron 81:179–194.
  16. Hahnloser RHR, Kozhevnikov AA, Fee MS (2002) An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature 419:65–70.
  17. Immelmann K (1969) Song development in the zebra finch and other estrildid finches. Bird Vocalizations:61–77.
  18. James LS, Sakata JT (2019) Developmental modulation and predictability of age-dependent vocal plasticity in adult zebra finches. Brain Res 1721:146336.
  19. Kalra S, Yawatkar V, James LS, Sakata JT, Rajan R (2021) Introductory gestures before songbird vocal displays are shaped by learning and biological predispositions. Proc Biol Sci 288:20202796.
  20. Kaufman MT, Seely JS, Sussillo D, Ryu SI, Shenoy KV, Churchland MM (2016) The Largest Response Component in the Motor Cortex Reflects Movement Timing but Not Movement Type. eNeuro 3.
  21. Kozhevnikov AA, Fee MS (2007) Singing-related activity of identified HVC neurons in the zebra finch. J Neurophysiol 97:4271–4283.
  22. Li N, Daie K, Svoboda K, Druckmann S (2016) Robust neuronal dynamics in premotor cortex during motor planning. Nature 532:459–464.
  23. Long MA, Fee MS (2008) Using temperature to analyse temporal dynamics in the songbird motor pathway. Nature 456:189–194.
  24. Long MA, Jin DZ, Fee MS (2010) Support for a synaptic chain model of neuronal sequence generation. Nature 468:394–399.
  25. Mandelblat-Cerf Y, Fee MS (2014) An automated procedure for evaluating song imitation. PLoS One 9:e96484.
  26. McCasland JS (1987) Neuronal control of bird song production. J Neurosci 7:23–39.
  27. Moll FW, Kranz D, Corredera Asensio A, Elmaleh M, Ackert-Smith LA, Long MA (2023) Thalamus drives vocal onsets in the zebra finch courtship song. Nature 616:132–136.
  28. Nottebohm F, Stokes TM, Leonard CM (1976) Central control of song in the canary, Serinus canarius. J Comp Neurol 165:457–486.
  29. Okanoya K (2004) The Bengalese finch: a window on the behavioral neurobiology of birdsong syntax. Ann N Y Acad Sci 1016:724–735.
  30. Price PH (1979) Developmental determinants of structure in zebra finch song. J Comp Physiol Psychol 93:260.
  31. Pytte CL, Gerson M, Miller J, Kirn JR (2007) Increasing stereotypy in adult zebra finch song correlates with a declining rate of adult neurogenesis. Dev Neurobiol 67:1699–1720.
  32. Rajan R, Doupe AJ (2013) Behavioral and neural signatures of readiness to initiate a learned motor sequence. Curr Biol 23:87–93.
  33. Rao D, Kojima S, Rajan R (2019) Sensory feedback independent pre-song vocalizations correlate with time to song initiation. J Exp Biol 222.
  34. Riehle A, Requin J (1989) Monkey primary motor and premotor cortex: single-cell activity related to prior information about direction and extent of an intended movement. J Neurophysiol 61:534–549.
  35. Rosenbaum DA (1980) Human movement initiation: Specification of arm, direction, and extent. J Exp Psychol Gen 109:444.
  36. Roy A, Mooney R (2007) Auditory Plasticity in a Basal Ganglia–Forebrain Pathway during Decrystallization of Adult Birdsong. J Neurosci 27:6374–6387.
  37. Shenoy KV, Kaufman MT, Sahani M, Churchland MM (2011) A dynamical systems view of motor preparation: implications for neural prosthetic system design. Prog Brain Res 192:33–58.
  38. Simpson HB, Vicario DS (1990) Brain pathways for learned and unlearned vocalizations differ in zebra finches. J Neurosci 10:1541–1556.
  39. Sossinka R, Böhner J (1980) Song Types in the Zebra Finch Poephila guttata castanotis. Z Tierpsychol 53:123–132.
  40. Suri H, Rajan R (2018) Distance-dependent changes in courtship song amplitude reflect song state changes. bioRxiv:277210.
  41. Svoboda K, Li N (2018) Neural mechanisms of movement planning: motor cortex and beyond. Curr Opin Neurobiol 49:33–41.
  42. Tchernichovski O, Nottebohm F, Ho CE, Pesaran B, Mitra PP (2000) A procedure for an automated measurement of song similarity. Anim Behav 59:1167–1176.
  43. Vicario DS (1991) Contributions of syringeal muscles to respiration and vocalization in the zebra finch. J Neurobiol 22:63–73.
  44. Williams H, McKibben JR (1992) Changes in stereotyped central motor patterns controlling vocalization are induced by peripheral nerve injury. Behav Neural Biol 57:67–78.
  45. Yarden C, Jun S, Dawit S, Daniel PL, William AL 3rd, Derek CL, Darrell NK, Timothy JG (n.d.) Hidden Neural States Underlie Canary Song Syntax. Nature.
  46. Yu AC, Margoliash D (1996) Temporal hierarchical control of singing in birds. Science 273:1871–1875.
  47. Zann RA (1996) Zebra Finch. Oxford.

Articles from bioRxiv are provided here courtesy of Cold Spring Harbor Laboratory Preprints
