Introductory notes before zebra finch song have unique timing properties while sharing acoustic properties with song

Divya Rao; Raghav Rajan

doi:10.1101/2025.05.19.654952

This is a preprint.

It has not yet been peer reviewed by a journal.


[Preprint]. 2025 May 21:2025.05.19.654952. [Version 1] doi: 10.1101/2025.05.19.654952


PMCID: PMC12139792 PMID: 40475594

Abstract

Preparatory neural activity precedes the initiation of simple movements and a key feature of this preparatory activity is its trial-by-trial correlation with features of the upcoming movement. Recent studies in the zebra finch, a songbird with a complex, naturally learned, movement sequence (song), have suggested that the repeats of short introductory notes (INs) at the start of each song bout, reflect motor preparation. However, whether IN properties correlate with upcoming song features remains poorly understood. Here, we addressed this question by recording and analyzing male zebra finch songs over a 3 year period. We found bout-to-bout correlations in the acoustic features of the last IN and the first song syllable. However, similar correlations were present between the first song syllable and the first IN and the first and second song syllable suggesting that INs are also part of the song sequence. Next, we found an age-related increase in the mean IN number before song and song tempo. If INs reflected preparation of song parameters, we expected age-related song changes to be predicted by IN-song correlations at a younger age. We did not find any such correlations. Finally, we compared INs to other repeated syllables within song and outside song bouts and found that the speeding up of intervals between successive INs is unique to INs. Overall our results showing similarities in the acoustic features of INs and song syllables suggest shared neural control of INs and song syllables, while differences in timing suggest different neural mechanisms controlling IN timing.

Significance Statement

Simple movements are believed to be “prepared” in the brain before execution and this preparatory neural activity is correlated with features of the upcoming movement. Recent studies have suggested that the short introductory notes before the complex song sequence of the zebra finch reflect motor preparation. Whether correlations exist for introductory notes and upcoming song remains poorly understood. Here we found bout-to-bout correlations between the acoustic properties of introductory notes and the first song syllable but our analyses suggest that this reflects shared neural control of the acoustics of both introductory notes and song syllables. We also found differences in the timing of introductory notes suggesting different neural mechanisms for controlling introductory note timing.

Introduction

How does the brain initiate a movement? Current research suggests a “preparatory” period before movement initiation when neural activity converges on a consistent initial state (Churchland et al., 2006b, 2010; Shenoy et al., 2011; Li et al., 2016; Svoboda and Li, 2018). Support for this hypothesis comes from delayed reaching tasks where subjects (humans or other animals) are trained to execute simple movements like reaching for an object. At the start of a trial, an instruction stimulus is provided that tells subjects where to move, but subjects are expected to withhold their movements until a “GO” cue is provided. The “GO” cue is presented after a variable delay period from presentation of the instruction stimulus. The reaction time, measured as time between GO cue presentation and movement initiation, is shorter when the delay period is longer suggesting the need for a time to “prepare” the movement (Rosenbaum, 1980; Riehle and Requin, 1989; Churchland et al., 2006b). Premotor neural activity, recorded during this task, shows changes during the delay period (Churchland et al., 2006b; Guo et al., 2014) and an important feature of delay period neural activity is a reduction in variability across trials; more variable activity at the beginning of the delay period to less variable activity just before movement onset (Fig. 1A). Thus, for simple reaching movements, the brain appears to prepare by bringing premotor activity to a consistent initial state from which the correct patterns of movement related activity are produced immediately after the GO cue (Shenoy et al., 2011; Svoboda and Li, 2018).

Fig. 1 — (A) Simple movement in monkeys: movement of finger from a central fixation point to a target as part of delayed reaching tasks. The neural activity on each trial (represented by the black traces), during the preparatory period, converges on a consistent point before movement onset. (B) Zebra finch song is an example of a complex, naturally learned movement. ‘i’s represent Introductory Notes (INs), ‘ab’ represent song sequence. Properties of INs converge on a consistent state before the onset of first song syllable ‘a’. (C) If INs represent preparation for song, properties of INs should be correlated to properties of upcoming song. ‘iiiab’ represent IN-SONG sequence, and size of letters represent features that are correlated within a day. (D) If INs represent preparation for song, age-related changes to INs and SONGs should be predicted by within-day correlations.

What is the nature of motor preparation before more complex, naturally learned, motor sequences? A well-studied example of a complex, naturally learned, motor sequence is the song of the adult male zebra finch, a songbird (Fee and Scharff, 2010). Song is a sequence of sounds interleaved by silent gaps (Fig. 1B, ‘a’ and ‘b’ represent the song sequence). It is part of the courtship ritual of the male zebra finch (Sossinka and Böhner, 1980; Zann, 1996; Fee and Scharff, 2010) and is naturally learned by young birds (Immelmann, 1969; Price, 1979; Zann, 1996). Song bouts typically begin with a variable number of repetitions of a short sound called an introductory note (IN marked as ‘i’ in Fig. 1B) (Price, 1979; Sossinka and Böhner, 1980). As INs repeat, the timing and acoustic features change; intervals between successive INs become shorter and more stereotyped and the acoustic features reach a stereotyped “state” just before the start of the first song (Rajan and Doupe, 2013). This progression is unaffected immediately after deafening or peripheral nerve cuts (Rao et al., 2019). This sensory-feedback independent progression from a variable first IN to a stereotyped last IN is similar to changes in delay-period neural activity suggesting that INs reflect motor preparation for song initiation.

An important feature of neural preparatory activity is its correlation with features of the upcoming movement. In primates, preparatory neural activity is different for different movements; for example, preparatory neural activity occupies different initial states for fast and slow reaching movements. It also correlates with movement parameters on a trial-by-trial basis; within all fast reaching movements, preparatory neural activity is correlated with movement speed on that trial (Churchland et al., 2006a). Whether zebra finch INs also correlate with the upcoming song in a similar fashion remains poorly understood. Here, we addressed this by recording and analyzing songs from multiple adult, male, zebra finches across a 3 year period. Specifically, we first tested for the existence of bout-to-bout correlations between IN properties and song properties (Fig. 1C). Second, previous studies have shown age-related changes in song properties (Brainard and Doupe, 2001; Pytte et al., 2007; Glaze and Troyer, 2013; James and Sakata, 2019). If IN properties are correlated with song properties, one would also expect age-related changes in IN properties. Additionally, bout-to-bout correlations between INs and song properties, at a younger age, should predict age-related changes in IN and song properties (Fig. 1D). We tested for these by analyzing songs from the same birds at multiple different ages.

Materials and Methods

All experiments at [Author University 1] were approved by Institute Animal Ethical Committee in accordance with the guidelines of the Committee for the Control and Supervision of Experiments on Animals [Author Country]. All song recording procedures for songs recorded at [Author University 2] were approved by the [Author University 2] Institutional Animal Care and Use Committee in accordance with NIH guidelines. We recorded songs from a total of 46 adult male birds (age > 90 days post hatch) with some purchased from an outside local vendor (n=19) and some bred at [Author University 1] (n=21) or [Author University 2] (n=6). The age of all purchased birds was assumed to be 60 days post hatch (dph) as they had red beaks (beaks are black for birds < 60 dph). At times when birds were not being recorded, they were housed in large cages with 5–8 other birds in a bird colony maintained with a 14h/10h light/dark cycle. Food and water were provided at all times.

SONG RECORDING

Birds were isolated from the colony and placed in separate cages in a sound attenuation enclosure (NewTech Acoustic Systems, Bangalore) maintained on a 14 h light / 10 h dark cycle. A microphone (AKG Acoustics C417PP) was clipped on the roof of the cage to record song. Songs were recorded in either ‘triggered’ or ‘continuous’ mode. Briefly, in ‘triggered’ mode, periods of recording that crossed a pre-set threshold were saved along with an additional 1–3 seconds of data flanking this period on either side. In ‘continuous’ mode, audio files were saved continuously for the entire recording period. All data were recorded and saved to disk at a 44100 Hz sampling rate using custom-written software (Python or Matlab). The different sets of birds used for different analyses and the overlap between these sets are explained below.

BIRDS USED FOR ANALYSIS OF IN-SONG CORRELATIONS

Twenty birds were recorded on multiple days (median 3 days [range 2–8 days]) in the age range from 89 – 1087 dph. As far as possible, we maintained the same position of the microphone for a given bird on different days of recording. A subset of birds and sessions (2 sessions each from 14/20 birds) overlapped with those analyzed for day-to-day changes in IN properties in a previous study. One bird (1/20) overlapped with the birds used for ts-cut surgery in that same study, but the recording sessions analyzed here are different and are from well before the surgical procedure. A subset of birds (5/20) were recorded earlier at [Author University 2] and have sessions from two nearby days, <5 days apart, included in this analysis. For characterizing IN changes towards song (Fig. 2) and bout-to-bout correlations of INs and song (Table 1 and Fig. 3), songs recorded from one session (<1 yr of age) in each bird were analyzed. Multiple sessions were analyzed for age-related changes in INs and songs (Fig. 4–6). The difference between days of recording spanned a wide range (1–812 days) and the time of recording varied across sessions for the same bird.

Fig. 2 — (A-G) Comparison of properties between First and Last IN position (gray) and the first syllable of the upcoming song (blue). Circles joined by lines represent data from individual birds, boxes represent group data across birds. Column 1 shows mean and column 2 shows CV or Coefficient of Variation for intervals (A), syllable acoustic properties namely duration (B), mean frequency (C), entropy (D), log amplitude (E), pitch goodness (F), frequency modulation (G) respectively. * denotes p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.005, Repeated Measures ANOVA followed by post-hoc Tukey-Kramer test.

TABLE 1.

Correlation between IN and song syllables

		Unmanipulated birds								Surgically manipulated birds
		Suspended microphone (n=20)				Head fixed microphone (n=4)				Ts Cut (n=8)
		IN number	First IN	Last IN	Second Song syllable	IN Number	First IN	Last IN	Second Song syllable	IN Number	First IN	Last IN	Second Song syllable
First Song Syllable	Interval	2/20 [0.027564 – 0.11579]	1/20 [−0.040692 -]	6/20 [0.055634 – 0.20126]	5/15 [0.062064 – 0.69446]	0/4 [-]	1/4 [−0.094595 -]	0/4 [-]	0/3 [-]	2/7 [0.020996–0.084361]	1/7 [−0.060756 -]	3/7 [0.077037 – 0.20297]	5/6 [0.059212 – 0.31928]
	Duration	2/16 [0.055525 – 0.14306]	2/16 [0.050712 – 0.16412]	4/16 [0.07711 – 0.32075]	9/16 [0.049804 – 0.59583]	0/4 [-]	1/4 [−0.10509 -]	0/4 [-]	1/4 [−0.41001 -]	2/8 [0.069903–0.094664]	5/7 [0.019376 – 0.32247]	3/7 [0.11523 – 0.15811]	5/8 [0.066957 – 0.32461]
	Mean Frequency	4/16 [0.037527 – 0.17197]	10/16 [0.07751 – 0.43468]	11/16 [0.097598 – 0.53551]	11/16 [0.086829 – 0.61189]	0/4 [-]	0/4 [-]	2/4 [0.2547 – 0.26094]	3/4 [0.18096 – 0.2585]	2/8 [0.026057–0.035218]	6/7 [0.085925 – 0.17935]	6/7 [0.085449 – 0.32444]	7/8 [0.17162 – 0.59137]
	Entropy	1/16 [−0.065818 -]	6/16 [0.043557 – 0.43134]	7/16 [0.089148 – 0.40611]	7/16 [0.033528 – 0.39943]	0/4 [-]	0/4 [-]	2/4 [0.21301 – 0.23058]	2/4 [0.13476 – 0.37133]	3/8 [0.033457–0.098931]	4/7 [0.075541 – 0.35578]	5/7 [0.074278 – 0.53467]	7/8 [0.077904 – 0.69269]
	Log Amplitude	0/16 [-]	15/16 [0.1022 – 0.71674]	16/16 [0.086365 – 0.80228]	16/16 [0.10257 – 0.90177]	0/4 [-]	1/4 [−0.25684 -]	3/4 [0.095298 – 0.29403]	3/4 [0.10264 – 0.23485]	3/8 [0.070799–0.091607]	7/7 [0.18765 – 0.64247]	7/7 [0.43535 – 0.83514]	8/8 [0.46317 – 0.78988]
	Pitch Goodness	1/16 [−0.13814 -]	4/16 [0.093954 – 0.45981]	7/16 [0.082209 – 0.40086]	7/16 [0.039365 – 0.25201]	0/4 [-]	1/4 [−0.55047 -]	0/4 [-]	0/4 [-]	3/8 [0.022745–0.073854]	6/7 [0.1454 – 0.69199]	7/7 [0.16424–0.82966]	8/8 [0.31519 – 0.88773]
	Frequency Modulation	1/16 [−0.070169 -]	2/16 [0.039416 – 0.079888]	3/16 [0.054115 – 0.075381]	2/16 [0.12817 – 0.14779]	0/4 [-]	0/4 [-]	0/4 [-]	1/4 [−0.14862 -]	2/8 [0.053764–0.067706]	4/7 [0.042636 – 0.16752]	3/7 [0.099547 – 0.36539]	5/8 [0.16071 – 0.22946]


Timing and acoustic properties of first song syllable (see Row Names) were correlated to IN number, or corresponding timing and acoustic properties for first IN position, last IN position or second song syllable (see Column names). Pearson correlation coefficient was calculated after removing outliers for one session in each bird and considered significant if p ≤ 0.05. Each cell indicates the proportion of birds with significant correlations along with the range of R-squared values for significant correlations, in brackets. Highlighted cells indicate properties for which significant correlation coefficients were observed in at least half of the birds. The subsequent columns repeat the analysis for birds with head-implanted microphone and ts-nerve cut birds.

Fig. 3 — (A-C) Example session from one bird that shows significant positive correlations for Log Amplitude of first syllable in the (first) motif along y-axis with (A) First IN position, (B) Last IN position and (C) second syllable respectively along x-axis. Circles represent data from individual trials or bouts. p ≤ 0.05, Pearson correlation coefficient calculated after removing outliers.

Fig. 4 — IN number, timing and acoustic properties (duration, mean frequency, entropy, log amplitude, pitch goodness and frequency modulation for a syllable) at First IN, Last IN and second song syllable positions were correlated to timing and acoustic properties at first song syllable position across bouts. Each circle represents r-values for bout-to-bout correlations in one bird. Lines connecting circles represent data from the same bird. Circles with a ‘*’ represent significant correlations and unfilled circles represent non-significant correlations. Proportion of birds with significant correlations are indicated on top and marked with red if proportions were ≥ 0.5. A property is considered significantly correlated if proportions are ≥ 0.5. Correlation strengths across pairs of groups were compared using the signed-rank test by including only significant correlation values. Correlations to First IN were compared with correlations to last IN, correlations to last IN were compared with correlations to second song syllable. * denotes p ≤ 0.05, ** denotes p ≤ 0.01, Wilcoxon signed-rank test.

Fig. 6 — (A - D) Changes in Motif similarity (A), motif duration (B), IN number (C) and time-to-song (D) are plotted across different age categories. Circles represent change in mean property between pairs of days for individual birds. All pairs of days are the same pairs of days selected for maximum change in IN number in Fig. 5. Square and whiskers represent median and interquartile range for the 3 age groups. * denotes p ≤ 0.05, Kruskal-Wallis ANOVA followed by Tukey-Kramer test.

DATA OVERLAP WITH PREVIOUS STUDIES

Songs of birds from [Author University 2] were recorded as part of a different study characterizing INs. Besides characterizing the properties of INs in every sequence and the associated neural activity during INs, the study also correlated the similarity of pairs of last INs to pairs of first song syllables. However, the direct relationship between individual properties of INs and songs was not compared. Here, we used these songs to directly compare the properties of INs and songs.

The IN-song correlation analysis was repeated for one session of undirected song each from 5 birds recorded earlier with head-implanted microphones for a different study. Head-implanted microphone recordings control for amplitude measurement differences in suspended microphone recordings, that may arise due to changes in relative position of the bird and the microphone across trials. IN-song correlations of acoustic properties with head-fixed microphones were analyzed as a control to rule out the possibility that IN-song correlations were a result of differences in relative position of the bird from the microphone. IN-song correlations have not been analyzed as part of the original study.

IN-song correlation analysis was repeated for one session of undirected song each from 8 birds recorded after ts-nerve (tracheosyringeal nerve) cut surgery as part of an earlier study. This data provided an advantage to look at IN-song acoustic correlations in the absence of properties related to syllable identity as these are lost following ts-nerve cut manipulation. IN-song correlations were not analyzed as part of the original study.

DATA FOR REPEAT SYLLABLES

Songs of 17 birds in the colony were identified that specifically repeated at least one syllable within the motif. One session each from these birds was used to compare the syllable repetition properties among the different types of repeated syllables produced by the bird namely INs, motif syllable repeats and calls. One session from one bird was also analyzed for IN-song correlations. One of the birds overlapped with ts-nerve cut birds, but the songs analyzed were from before surgery. One of the birds recorded in this set was also bred and recorded earlier at [Author University 2].

DATA ANALYSIS

All the analyses were performed using custom-written scripts in MATLAB (www.mathworks.com). Audio files were processed and vocalizations were labeled. Briefly, audio files were segmented into syllables based on a user-defined amplitude threshold; syllables were defined as segments longer than 10 ms, and segments separated by inter-syllable gaps shorter than 5 ms were merged. Labels were assigned to syllables in a semi-automatic manner. Automatic labels were assigned using a modified template-matching procedure (Glaze and Troyer, 2006) or Klustakwik clustering (http://klustakwik.sourceforge.net/) of acoustic features calculated using Sound Analysis Pro (https://soundanalysispro.com/matlab-sat). Labels were then manually checked for all files. Files were split into bouts and all bouts containing songs (motifs) were selected for analysis. Typically these bouts consisted of repeating IN sequences before the first song.
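The segmentation step can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code. The function name, the representation of the audio as a pre-computed amplitude envelope, and the order of operations (merge short gaps first, then discard short segments) are assumptions made for illustration; only the 10 ms and 5 ms values come from the text.

```python
def segment_syllables(amplitude, fs, threshold,
                      min_syllable_ms=10.0, max_gap_ms=5.0):
    """Return (onset, offset) sample indices of syllables in an amplitude envelope.

    amplitude : sequence of envelope values (hypothetical pre-computed input)
    fs        : sampling rate of the envelope in Hz
    threshold : user-defined amplitude threshold
    """
    # 1. Find contiguous runs of samples above the threshold.
    segments = []
    start = None
    for i, value in enumerate(amplitude):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            segments.append([start, i])
            start = None
    if start is not None:
        segments.append([start, len(amplitude)])

    # 2. Merge runs separated by gaps shorter than max_gap_ms.
    max_gap = max_gap_ms * fs / 1000.0
    merged = []
    for seg in segments:
        if merged and seg[0] - merged[-1][1] < max_gap:
            merged[-1][1] = seg[1]
        else:
            merged.append(list(seg))

    # 3. Keep only runs longer than min_syllable_ms.
    min_len = min_syllable_ms * fs / 1000.0
    return [(on, off) for on, off in merged if off - on > min_len]
```

With an envelope sampled at 1 kHz, two 15 ms sounds separated by a 3 ms gap come out as one syllable, and an isolated 5 ms sound is discarded.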

Defining a bout and bout interval criteria

A bout was defined as a period of vocalizations separated by 2 seconds of silence. This criterion was applied to select bouts from recordings in ‘continuous’ mode. However, many audio files recorded in ‘triggered’ mode did not have 2 seconds of silence before the first syllable in the file. This occurred when initial vocalizations in the bout were soft and the set trigger threshold was crossed by a later vocalization. Such audio files had less than 2 seconds of silence before the first syllable in the file. However, we assumed that there was silence before the start of the file as the set trigger-threshold was not crossed and so we included such files in our analysis. The bout criterion for triggered recordings was reduced, only for the beginning of the file, to include enough audio files (>15 bouts) for analysis; the reduced criterion ranged from 500–1500 ms across birds. Within a file, we still considered 2000 ms of silence and we always considered 2000 ms of silence for continuous recordings. For a given bird, we maintained the same bout criterion across sessions. This was important as we observed that mean IN number depended on the bout interval criterion; as shown previously, mean IN number reduced a little when considering a bout criterion of < 2s (Sossinka and Böhner, 1980; Rajan and Doupe, 2013). However, mean IN number between days was not different when the same bout interval criterion was applied.
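The within-file bout criterion can be sketched as follows. This is an illustrative Python sketch rather than the authors' implementation. The function name and the onset/offset list inputs are hypothetical, and the reduced start-of-file criterion for triggered recordings is not modelled.

```python
def split_into_bouts(onsets, offsets, min_silence_s=2.0):
    """Group syllable indices into bouts.

    A new bout starts whenever the silence between the offset of one
    syllable and the onset of the next is at least min_silence_s seconds.
    onsets/offsets are syllable times in seconds, in chronological order.
    """
    bouts = []
    current = []
    for i in range(len(onsets)):
        if current and onsets[i] - offsets[i - 1] >= min_silence_s:
            bouts.append(current)
            current = []
        current.append(i)
    if current:
        bouts.append(current)
    return bouts
```

Changing `min_silence_s` changes how syllables are grouped, which is one way to see why mean IN number can depend on the bout interval criterion.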

Determining minimum number of bouts for IN analysis

The number of INs varied across bouts and the number of bouts sung by birds varied across sessions and birds (median 73.5 song bouts: range 4 – 430). It was important to characterize how the number of bouts affected the estimate of mean IN number for a session, and to determine the minimum number of bouts beyond which the estimate did not change considerably. For all sessions with >100 song bouts (30 sessions), a range of different number of bouts were sub-sampled and the corresponding average standard deviation across 1000 iterations of Monte Carlo simulations was calculated. The minimum number of bouts for IN analysis was identified as the number of bouts beyond which increasing the number of bouts did not change the average standard deviation by more than 0.01. The standard deviation met our criteria by 10 bouts for 18/30 sessions and by 15 bouts for 28/30 sessions. Hence, across all days and birds, only sessions with at least 15 song bouts were selected for IN analysis. This excluded two sessions from a bird (1/20) with < 15 song bouts.
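One possible reading of this sub-sampling procedure is sketched below in Python. The text does not say which statistic the standard deviation was computed on, whether sampling was with or without replacement, or which bout numbers were tested. The sketch assumes the standard deviation of IN number within each sub-sample, sampling without replacement, and a caller-supplied list of bout numbers. All function names are hypothetical.

```python
import random
import statistics


def average_subsample_sd(in_counts, n_bouts, n_iter=1000, rng=None):
    """Mean, over n_iter random sub-samples of n_bouts bouts, of the
    standard deviation of IN number within the sub-sample."""
    if rng is None:
        rng = random.Random(0)
    sds = []
    for _ in range(n_iter):
        sample = rng.sample(in_counts, n_bouts)  # without replacement (assumption)
        sds.append(statistics.stdev(sample))
    return statistics.mean(sds)


def minimum_bouts(in_counts, sizes, tol=0.01, n_iter=1000, seed=0):
    """Smallest tested bout number beyond which every further increase
    changes the average standard deviation by no more than tol."""
    rng = random.Random(seed)
    avg_sd = [average_subsample_sd(in_counts, n, n_iter, rng) for n in sizes]
    for k in range(len(sizes) - 1):
        later_steps = range(k, len(sizes) - 1)
        if all(abs(avg_sd[j + 1] - avg_sd[j]) <= tol for j in later_steps):
            return sizes[k]
    return None  # criterion never met within the tested sizes
```

For a session with more than 100 bouts, a call might look like `minimum_bouts(in_counts, sizes=[5, 10, 15, 20, 25, 30])`, where the list of sizes is again an assumption.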

Calculating IN and song properties

For analyzing IN and song properties across trials, only INs in the beginning of the bout and the first motif, that followed immediately after the INs, were considered, unless otherwise specified. For each song bout, the number of INs preceding the first motif was calculated. The timing and acoustic properties of INs and song were calculated as follows: timing of IN for first position or first interval was the time interval between offset of first IN and onset of second IN. Timing of IN for last position or the last interval was calculated as the time interval between offset of last IN and the onset of the first song syllable. Song timing for the first position in the motif was calculated as the interval between offset of first motif syllable and onset of second motif syllable. Song timing for the second position in the motif was calculated as the interval between offset of second motif syllable and onset of third motif syllable. Similar to timing, the acoustic properties of INs were calculated for first and last IN and for the first and second motif syllable. For comparisons of acoustic properties involving changes in IN from first to last position, birds with two types of INs in the IN sequence were excluded as the syllable identity in these birds would be different for first and last IN. For comparisons of intervals involving first and second song position, birds with only two syllables in the motif were excluded as there were no second song intervals.
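The interval definitions above can be sketched in Python. The bout representation (a list of `(label, onset, offset)` tuples starting at the first IN), the IN label `"i"` and the function name are hypothetical choices for illustration. The sketch also assumes the bout contains at least one song syllable.

```python
def in_and_song_intervals(bout, in_label="i"):
    """Compute IN number and the four intervals defined in the text.

    bout : list of (label, onset_s, offset_s) tuples, starting at the
           first IN and continuing into the first motif (assumed layout).
    """
    labels = [syllable[0] for syllable in bout]
    # The first syllable that is not an IN is taken as the first song syllable.
    first_song = next(k for k, lab in enumerate(labels) if lab != in_label)
    n_ins = first_song

    def gap(k):
        # Silent interval between the offset of syllable k and the onset of k + 1.
        return bout[k + 1][1] - bout[k][2]

    n = len(bout)
    return {
        "in_number": n_ins,
        # offset of first IN to onset of second IN
        "first_in_interval": gap(0) if n_ins >= 2 else None,
        # offset of last IN to onset of first song syllable
        "last_in_interval": gap(first_song - 1) if n_ins >= 1 else None,
        # offset of first to onset of second motif syllable
        "first_song_interval": gap(first_song) if first_song + 1 < n else None,
        # offset of second to onset of third motif syllable
        "second_song_interval": gap(first_song + 1) if first_song + 2 < n else None,
    }
```

A motif with only two syllables yields `None` for the second song interval, matching the exclusion of such birds from comparisons that involve that interval.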

Acoustic properties of syllables

The acoustic properties for IN and song syllables were measured as described by Sound Analysis Pro 2011 (Tchernichovski et al., 2000) (http://soundanalysispro.com/) using SAP matlab code (https://soundanalysispro.com/matlab-sat). These basic features reduce dimensionality of the complex sound spectrogram for analysis and make it easier to understand the changes in different aspects of sound. A brief intuitive understanding of the features is described below:

Log Amplitude: It measures the loudness of a syllable. The intensity or power in audio signal is measured relative to an arbitrary baseline for silence. It is reported in log scale units or dB.
Duration: Time from onset to offset of the syllable. Onset and offset are determined based on an amplitude threshold that distinguishes sound from silence in the audio signal. It is measured in units of seconds or milliseconds.
Mean Frequency: Audio signal can be decomposed into different frequencies present in the signal. The mean frequency is a pitch measure that assesses the center of the distribution of intensity across different frequencies. It is measured in Hz. Since mean frequency is dependent on intensity, it is an amplitude related property.
Pitch goodness: It measures the periodicity of harmonic pitch (frequency stacks) in the syllable. Syllables with harmonics or ‘frequency stacks’ (as observed in a spectrogram) have higher value, and syllables that are noisy or pure tone have lower value.
Wiener entropy or entropy: It measures the noisiness in the syllable. A noisy syllable appears as broadband or white noise in the spectrogram with intensity spread equally across all frequencies. White noise has a value of 1 and pure tone (sound intensity concentrated at one frequency) has a value of 0. However, this is converted to a log scale running from 0 (white noise) to minus infinity (pure tone). Noisier syllables will therefore have entropy closer to 0. Since entropy is dependent on intensity, it is an amplitude related property.
Frequency Modulation: It is an estimate of slope of the frequency trace on the spectrogram, measured in degrees. The steeper the slope, the higher the modulation.
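Two of these features can be made concrete with a short Python sketch. This is not the Sound Analysis Pro code, whose estimators may differ in detail. The sketch uses the textbook definition of Wiener entropy (log of the ratio of geometric to arithmetic mean of the power spectrum) and a simple power-weighted centroid for mean frequency. The function names and the plain-list power spectrum input are assumptions.

```python
import math


def wiener_entropy(power):
    """Log of (geometric mean / arithmetic mean) of a power spectrum.

    Returns 0 for a flat (white-noise-like) spectrum and increasingly
    negative values as power concentrates at one frequency.
    All power values must be strictly positive.
    """
    n = len(power)
    log_geometric_mean = sum(math.log(p) for p in power) / n
    arithmetic_mean = sum(power) / n
    return log_geometric_mean - math.log(arithmetic_mean)


def mean_frequency(freqs, power):
    """Power-weighted centre of the spectrum, in the units of freqs (e.g. Hz)."""
    total = sum(power)
    return sum(f * p for f, p in zip(freqs, power)) / total
```

A flat spectrum gives an entropy of 0, while a spectrum dominated by a single bin gives a large negative value, in line with the log-scale description above.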

Analyzing age-related changes in IN and song properties

Changes in IN and song properties between two different days were measured by subtracting the mean or variability calculated for the earlier of the two days from those measured on the later of the two days. For changes in INs between days, in addition to properties related to IN number, IN timing and IN acoustic properties, the time-to-song was also calculated. Time-to-song was defined as the duration of the IN sequence from onset of the first IN to onset of the first song syllable. To quantify song changes with age, parameters of song that have earlier been reported to change with age were calculated. These included motif duration and song similarity (Brainard and Doupe, 2001; Pytte et al., 2007; Glaze and Troyer, 2013; James and Sakata, 2019). Motif duration was measured as the time from the onset of the first motif syllable to the offset of the last motif syllable. Song similarity was calculated as average similarity index between pairs of motifs from a recording day. Ten motifs of a fixed syllable sequence were randomly selected across all motifs, and each of the ten motifs was compared to the 9 other motifs using a similarity index algorithm (Mandelblat-Cerf and Fee, 2014), followed by taking the mean for all pairs. The similarity index gives a measure of how similar songs on a given day were to each other. Similar to other properties, the change between days was measured by subtracting the similarity index of an earlier day from that of a later day. Other song properties related to song sequence and individual timing and acoustic features of first song syllable were also measured for change between days. To compare song spectral structure changes between pairs of days we measured the average similarity index of 10 randomly selected motifs on the two days.
To compare song temporal structure between days, the temporal pattern of the motif sequence was obtained by taking the amplitude profile and replacing this with ‘1s’ for the duration of syllables and ‘0s’ for the duration of intervals between syllables. Two patterns were compared up to the duration of the shorter sequence using cross correlation. Similar to comparison of spectral structure, 10 randomly selected motifs each from the two days were compared all-to-all and averaged. Sequence Consistency and Entropy were measured as described in an earlier study (James and Sakata, 2019). The analysis was restricted to the first motif sequence in the bout. For each motif sequence one additional syllable at both ends was included for calculating sequence consistency and entropy. Sequence Consistency (SC) measured the consistent transitions between syllables of the motif and was calculated as the ratio of most typical transitions divided by total transitions. Sequence Entropy (SE) measured the variability of syllable transitions, and was calculated from the transition probabilities $(p_{i})$ of all transitions from a syllable using the formula $-\sum_{i} p_{i} \log_{2}(p_{i})$. The higher the value, the more variable the transition. The change between pairs of days was measured by subtracting the values of an earlier day from a later day.
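The two sequence measures can be sketched in Python as follows. The exact procedure of James and Sakata (2019) is not reproduced in the text, so the pooling of transitions across syllables for sequence consistency, the per-syllable reporting of entropy, and the representation of each sequence as a string of labels are assumptions. Function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def transition_counts(sequences):
    """Count transitions a -> b over all label sequences (e.g. 'iabcd')."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return counts


def sequence_consistency(counts):
    """Most typical transitions (one per syllable) divided by all transitions."""
    typical = sum(max(c.values()) for c in counts.values())
    total = sum(sum(c.values()) for c in counts.values())
    return typical / total


def transition_entropy(counter):
    """-sum(p_i * log2(p_i)) over the transitions leaving one syllable."""
    total = sum(counter.values())
    entropy = 0.0
    for n in counter.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy
```

A syllable that is always followed by the same syllable has an entropy of 0. A syllable followed by two different syllables equally often has an entropy of 1 bit.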

Age-related changes were calculated as changes in mean or variability of properties between pairs of days belonging to one of 3 age-groups: (1) pairs of days < 5 days apart as short-term day-to-day change, (2) pairs of days with first day < 1 yr and second day > 1 yr as long-term change in the first year and (3) pairs of days with both days > 1 yr and more than 5 days apart as long-term changes after one year of age. If a bird was recorded for more than one day within an age-group, we compared all pairs of days with each other and for each age-group category, we chose the pair of days with the maximum change in IN number. We reasoned that this would give us the maximum chance of detecting changes, if there were any. The same pairs of days were used to calculate age-related changes in other IN and song properties to understand the extent of age-related changes in other properties when IN number changes were maximum.
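The pair-selection rule can be illustrated with the Python sketch below. Several details are assumptions not stated in the text: one year is taken as 365 days, a pair that is both < 5 days apart and straddles one year is assigned to the short-term group, and "maximum change" is taken as the largest absolute change in mean IN number. The names and the dictionary input are hypothetical.

```python
from itertools import combinations

ONE_YEAR_DPH = 365  # assumption: "1 year" taken as 365 days post hatch


def age_group(age1, age2):
    """Classify a pair of recording ages in dph (age1 < age2)."""
    gap = age2 - age1
    if gap < 5:
        return "short_term"
    if age1 < ONE_YEAR_DPH and age2 > ONE_YEAR_DPH:
        return "first_year"
    if age1 > ONE_YEAR_DPH and age2 > ONE_YEAR_DPH and gap > 5:
        return "after_one_year"
    return None  # pair does not fall into any of the three groups


def max_change_pairs(mean_in_by_age):
    """For one bird, pick per age group the pair of days with the largest
    absolute change in mean IN number.

    mean_in_by_age : dict mapping recording age (dph) to mean IN number.
    Returns dict group -> (earlier_age, later_age, later_minus_earlier).
    """
    best = {}
    for a1, a2 in combinations(sorted(mean_in_by_age), 2):
        group = age_group(a1, a2)
        if group is None:
            continue
        change = mean_in_by_age[a2] - mean_in_by_age[a1]
        if group not in best or abs(change) > abs(best[group][2]):
            best[group] = (a1, a2, change)
    return best
```

The selected pairs can then be reused to compute changes in the other IN and song properties, as described above.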

Comparison of IN properties with motif duration and similarity within a session

To compare IN properties with motif duration and motif similarity within a day, we divided the bouts based on IN number at the start and then calculated the average motif duration and song similarity for bouts with same number of INs. To account for individual variation across birds, the number of INs was normalized to median IN number and motif duration was normalized to the corresponding median motif duration. We then asked whether song properties were different for the different IN numbers by comparing song properties across the different IN numbers. As another measure of number of INs, we also compared song changes with changes in the time-to-song, measured as the time from the start of the first IN to the start of the first motif syllable. For this, we split bouts into two groups based on the median value of time-to-song. The two groups were then compared across birds for differences in mean song properties corresponding to shorter or longer timing of INs. Other song properties chosen based on significant changes with age such as song sequence entropy, mean pitch goodness of first song syllable, and variability of frequency modulation of first song syllable were also compared corresponding to shorter or longer IN song timing. Similar analyses were done for comparing song properties corresponding to trials split based on properties of first or last IN position before song, namely lower and higher IN frequency modulation, less and more variable IN duration, less and more variable pitch goodness.
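The grouping and median-split steps can be sketched in Python. The function names are hypothetical, and assigning bouts that fall exactly on the median to the shorter group is an assumption, since the text does not specify how ties were handled.

```python
import statistics
from collections import defaultdict


def normalize_to_median(values):
    """Express each value relative to the median of the session."""
    med = statistics.median(values)
    return [v / med for v in values]


def mean_by_in_number(in_numbers, song_values):
    """Average a song property over bouts that share the same IN number."""
    groups = defaultdict(list)
    for n, value in zip(in_numbers, song_values):
        groups[n].append(value)
    return {n: statistics.mean(vals) for n, vals in groups.items()}


def median_split_means(time_to_song, song_values):
    """Split bouts at the median time-to-song and return the mean song
    property for the shorter and the longer half."""
    med = statistics.median(time_to_song)
    pairs = list(zip(time_to_song, song_values))
    shorter = [v for t, v in pairs if t <= med]  # ties go to the shorter group
    longer = [v for t, v in pairs if t > med]
    return statistics.mean(shorter), statistics.mean(longer)
```

The per-bird outputs of `median_split_means` form the two bird-matched groups that can then be compared across birds.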

Analysis of repeat syllables

Repeat syllables belonging to either motifs, INs or calls were analyzed for each bird. Call repeats outside of song were considered as these were more common in all the birds we analysed. Similar to criteria used for INs in other datasets, repeat syllables with more than 15 instances were analysed. The number, interval and acoustic properties were analyzed using a similar procedure as that described for INs above. The ratio of intervals was defined as the current interval between successive repeats in the sequence divided by the next interval. Shortening of intervals is denoted by a value of <1. Median ratio of intervals was measured across trials to represent data for a syllable. The acoustic distance to last repeat was defined as an inverse measure of similarity between a repeat syllable and all last repeat syllables. The six acoustic properties defined earlier were used to calculate acoustic distance, namely duration, mean frequency, log amplitude, entropy, pitch goodness and frequency modulation. To calculate acoustic distance, all repeat instances were randomly split in half. The last repeats from one half were chosen as the reference distribution occupying a six-dimensional space formed by the six acoustic properties. The Mahalanobis distance from this distribution to each repeat syllable in the second half was calculated to give the acoustic distance for each of these repeat syllables. Similar to ratio of intervals, ratio of acoustic distance was then calculated as the distance of every repeat divided by distance of following repeat syllable. The median ratio of acoustic distance was measured across trials to represent data for a syllable. A median ratio of <1 denoted convergence towards the last repeat syllable.
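The acoustic-distance calculation can be sketched as follows. This is a minimal illustration assuming a feature matrix with one row per repeat syllable and six columns (duration, mean frequency, log amplitude, entropy, pitch goodness, frequency modulation); the function name, the use of a pseudo-inverse for the covariance matrix and the seeding of the random split are assumptions made for the example.

```python
import numpy as np

def acoustic_distance_to_last(features, is_last, rng=None):
    """Mahalanobis distance of held-out repeats to the last-repeat distribution.

    features: (n, 6) array of acoustic properties, one row per repeat syllable.
    is_last:  boolean array of length n, True where the syllable is a last repeat.
    rng:      seed or numpy Generator controlling the random half split.
    Returns (indices of the held-out half, distance for each held-out syllable).
    """
    rng = np.random.default_rng(rng)
    features = np.asarray(features, dtype=float)
    is_last = np.asarray(is_last, dtype=bool)

    # Randomly split all repeat instances into two halves.
    order = rng.permutation(len(features))
    half = len(features) // 2
    ref_idx, test_idx = order[:half], order[half:]

    # Reference distribution: last repeats from the first half only.
    ref = features[ref_idx][is_last[ref_idx]]
    mu = ref.mean(axis=0)
    # Pseudo-inverse guards against a singular covariance (an assumption here).
    cov_inv = np.linalg.pinv(np.cov(ref, rowvar=False))

    # Distance from the reference distribution to each syllable in the second half.
    diff = features[test_idx] - mu
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return test_idx, np.sqrt(np.maximum(squared, 0.0))

# Synthetic example: 200 syllables, every fourth one marked as a last repeat.
demo_rng = np.random.default_rng(0)
demo_features = demo_rng.normal(size=(200, 6))
demo_is_last = np.arange(200) % 4 == 3
held_out, distances = acoustic_distance_to_last(demo_features, demo_is_last, rng=1)
```

Ratios of successive distances within a trial would then be computed from `distances`, using the held-out indices to recover each syllable's position in its sequence.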

EXPERIMENTAL DESIGN AND STATISTICAL ANALYSIS

The experimental design of all groups tested for differences had comparable sample numbers. All comparisons of independent groups of unequal sample sizes were tested for differences using the non-parametric Kruskal-Wallis test. If the p-value was ≤ 0.05, the Tukey-Kramer post-hoc test was used to identify pairs of groups that were significantly different. These group comparisons included age-related changes in mean IN or song properties, age-related changes in IN number in individual birds, comparison of average song motif duration or song similarity for different IN numbers, and comparison of mean properties of repeat syllables belonging to either motifs, INs or calls. Differences between (more than two) bird-matched groups were tested using repeated-measures one-way ANOVA. If the p-value was ≤ 0.05, the Tukey-Kramer post-hoc test was used to identify pairs of groups that were significantly different. This included comparison of the mean or CV of syllable properties across bouts based on position, i.e. first IN, last IN and first song syllable. Paired-group comparisons were tested for differences using the Wilcoxon signed-rank test, with a p-value criterion of 0.05 for significant differences. These comparisons included significant correlation coefficient estimates for first IN - first song syllable compared with those for last IN - first song syllable, motif duration or similarity comparisons between shorter and longer time-to-song bouts, and properties of repeat syllables at first and last IN positions. All bout-to-bout correlations were measured using Pearson’s correlation coefficient of corresponding IN and song property values across bouts. Prior to measuring correlations, outliers were removed (percentage of data points removed: median 2.19%; range 0–28.99%). Outliers were detected as values beyond three times the median absolute deviation from the median for continuous variables and values beyond three standard deviations from the mean for discrete variables. Correlations were considered significant if p-values were ≤ 0.05. The range of significant correlations was reported across birds, along with the proportion of birds with significant correlations.
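The outlier criteria and the bout-to-bout correlation can be illustrated with the short Python sketch below. The function names are hypothetical, and the sketch assumes an unscaled median absolute deviation and the sample standard deviation; the paper does not specify either choice. It computes only the correlation coefficient, so the p ≤ 0.05 criterion would need a standard significance routine on top.

```python
import numpy as np

def mad_inlier_mask(x, n_mads=3.0):
    """True for values within n_mads median absolute deviations of the median.

    Assumption: the raw (unscaled) MAD is used. Intended for continuous variables.
    """
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return np.abs(x - med) <= n_mads * mad

def sd_inlier_mask(x, n_sds=3.0):
    """True for values within n_sds sample standard deviations of the mean.

    Intended for discrete variables such as IN number.
    """
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) <= n_sds * x.std(ddof=1)

def bout_correlation(in_values, song_values):
    """Pearson r across bouts after dropping bouts flagged in either variable.

    Returns (r, fraction of bouts removed as outliers).
    """
    in_values = np.asarray(in_values, dtype=float)
    song_values = np.asarray(song_values, dtype=float)
    keep = mad_inlier_mask(in_values) & mad_inlier_mask(song_values)
    r = np.corrcoef(in_values[keep], song_values[keep])[0, 1]
    return float(r), float(1.0 - keep.mean())

# Made-up example: ten bouts, one with an extreme IN value.
in_feature = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
song_feature = [2, 4, 6, 8, 10, 12, 14, 16, 18, 0]
r_value, fraction_removed = bout_correlation(in_feature, song_feature)
```

If SciPy is available, `scipy.stats.pearsonr` applied to the retained bouts returns both the coefficient and its p-value.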

DATA AND CODE ACCESSIBILITY

All data and scripts for analysis are available on request from the corresponding author.

Results

Introductory note features change in the direction of upcoming song

Introductory notes (INs) are characterized by three properties: the number of INs before song at the start of each bout, their timing measured by the intervals between successive INs, and their acoustic properties. As shown earlier (Rajan and Doupe, 2013; Rao et al., 2019), both the timing and the acoustic properties of INs changed systematically from the first to the last IN (Fig. 2). Additionally, changes in IN properties were in the direction of changes from INs to song. As INs progressed from the first IN to the last IN to the first song syllable, inter-syllable intervals got shorter (Fig. 2A, left, p < 0.001, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN, post-hoc Tukey-Kramer test) and syllables progressively became longer (Fig. 2B, left, p < 0.001, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and last IN vs. song, p = 0.02 for first IN vs. last IN, post-hoc Tukey-Kramer test), higher in frequency (Fig. 2C, left, p = 0.008, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN and p = 0.003 for last IN vs. song, post-hoc Tukey-Kramer test) and louder (Fig. 2E, left, p = 0.003, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN and p = 0.002 for last IN vs. song, post-hoc Tukey-Kramer test). In addition to changes in the mean, we also found a reduction in variability as INs approached song. Specifically, from the first IN to the last IN to song, inter-syllable intervals (Fig. 2A, right, p = 0.048, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and first IN vs. last IN, post-hoc Tukey-Kramer test), syllable duration (Fig. 2B, right, p = 0.03, Repeated Measures ANOVA, p < 0.001 for last IN vs. song, p = 0.005 for first IN vs. song, post-hoc Tukey-Kramer test), entropy (Fig. 2D, right, p < 0.001, Repeated Measures ANOVA, p < 0.001 for first IN vs. song and for last IN vs. song, post-hoc Tukey-Kramer test) and frequency modulation (Fig. 2G, right, p = 0.002, Repeated Measures ANOVA, p < 0.001 for first IN vs. song, p = 0.04 for first IN vs. last IN and last IN vs. song, post-hoc Tukey-Kramer test) became less variable. The variability of syllable duration (Fig. 2B, right) and of entropy (Fig. 2D, right) was not significantly different between the first IN and the last IN (p > 0.05, Repeated Measures ANOVA). Overall, these results showed that the mean features of INs changed and became more similar to those of the upcoming song as INs progressed from the first to the last IN. In addition, timing variability systematically reduced from first to last IN, while the variability of acoustic features only reduced with the onset of song.

IN number is not correlated with first song syllable features

If INs represent motor preparation for upcoming song, IN properties should be correlated with properties of the first song syllable on a bout-to-bout basis. We first correlated the number of INs with the properties of the first song syllable. In most of the birds, the number of INs (or the “time to song”, measured as the time between the first IN and the first song syllable) was not significantly correlated with either the interval between the first two song syllables or the acoustic properties of the first song syllable (Fig. 4, Table 1, p > 0.05, Pearson’s correlation coefficient). Previous studies have shown that the number of INs (and the “time to song”) is positively correlated with the length of the interval between the first two INs and the acoustic similarity of the first IN to the last IN (Rajan and Doupe, 2013; Rao et al., 2019). Taken together with our results showing the absence of correlations between IN number and song features, this suggests that birds sing different numbers of INs to reach the same last IN state from different initial conditions (different first IN states).

IN timing is not correlated with song syllable timing

We next examined correlations between the timing of INs and the timing of song. Specifically, we separately calculated correlations for first IN timing with song timing and last IN timing with song timing. If INs represent motor preparation, we expected an improvement in the IN-song correlations as IN timings change from first IN to last IN; i.e. we expected weak or no significant correlations between the first IN and song and strong correlations between the last IN and song. Additionally, we expected this pattern to be present in greater than 50% of the birds. Contrary to this expectation, we found significant correlations in timing of last IN and timing of song in only 6/20 birds and these correlations were weak (median - 0.14; range: 0.06 – 0.2; Table 1). Only 1/20 birds had significant correlations between first IN timing and song timing. Overall, a majority of birds did not show significant correlations between IN and song intervals or improvement in correlations with song interval from first to last IN intervals suggesting that changes in IN timing do not represent motor preparation for song syllable timing.

IN acoustic features are correlated with song features

Similar to timing, we examined correlations between the acoustic features of INs and the acoustic features of the first song syllable (see Fig. 3 for an example). Log amplitude and mean frequency of the last IN were correlated with the corresponding features of the first song syllable in a large proportion of birds (Fig. 4, Table 1, p < 0.05, Pearson’s correlation coefficient). Surprisingly, we also found significant correlations between the acoustic features of the first IN and the first song syllable in most of these birds (Fig. 4, Table 1, p < 0.05, Pearson’s correlation coefficient), although these correlations were slightly weaker than those between the last IN and the first song syllable (Fig. 4, p = 0.004 for mean frequency and p = 0.015 for log amplitude, comparisons of r-values). The presence of correlations with the first IN (Fig. 4, Table 1) suggested that these correlations could instead be related to previously described global correlations between syllables within individual bouts (Glaze and Troyer, 2006). In support of this idea, we also found significant correlations between the acoustic features of the first two song syllables (Fig. 4, Table 1, p < 0.05, Pearson’s correlation coefficient). These correlations were similar in strength to the correlations between IN features and song features and were present in a similar proportion of birds (Table 1, Fig. 4, p > 0.05, Wilcoxon signed-rank test). In a majority of birds, significant correlations were absent for other acoustic features related to syllable identity (duration, pitch goodness, frequency modulation), further supporting the idea that syllable feature correlations represent global control of IN and song syllables, rather than INs reflecting motor preparation for the upcoming song.

To rule out the possibility that such correlations arose because the distance from the suspended microphone to the bird varied from bout-to-bout, we also examined similar correlations in a dataset from a different set of birds with head-implanted microphones (Suri and Rajan, 2018). In these birds, the distance between the microphone and the bird was always fixed. We found similar correlations in a similar proportion of head-implanted birds (Table 1) confirming that these correlations were not just due to differences in relative distance from the microphone.

To better understand the origin of these correlations of globally controlled features, we further analysed song recordings from birds with bilateral cuts in the tracheosyringeal nerve, the nerve that carries neural input to the syringeal muscles. In these birds, syllable identities are lost as syllables are reduced to harmonic stacks, but the temporal patterning of song remains similar to pre-nerve cut as this is controlled by respiratory motor neurons (Bottjer and Arnold, 1984; Vicario, 1991; Williams and McKibben, 1992; Roy and Mooney, 2007). Significant correlations in most acoustic features were present in a similar proportion of these birds (Table 1, p < 0.05, Pearson’s correlation co-efficient). These results suggest that bout-to-bout correlations in acoustic features of INs and song syllables are driven by bout-to-bout differences in the control of respiratory pressure.

Overall, these results show the presence of significant correlations in IN acoustic features (first and last IN) with acoustic features of the first song syllable. Additionally, correlations between acoustic features of the first two song syllables suggest that the correlations between IN features and first song syllable features are not a result of motor preparatory function of INs but, instead, reflect the fact that INs are also vocalizations that are part of the song sequence and are controlled by the same neural pathways.

Song and INs show age-related changes

We next examined a second prediction of the motor preparation hypothesis: age-related song changes and their correlation with IN properties. Previous studies have shown age-related changes in song tempo and song stereotypy; songs become faster and more stereotyped as birds get older (Brainard and Doupe, 2001; Pytte et al., 2007; Glaze and Troyer, 2013; James and Sakata, 2019). If INs represent motor preparation, we would predict simultaneous changes in specific IN properties with age. In addition, we would expect age-related changes to be predicted by correlations between these IN properties and song properties at an earlier age. For instance, if IN number increases with age and songs get more stereotyped with age, we would expect higher IN number to be associated with more stereotyped songs, on a bout-to-bout basis, at an earlier age (Fig. 1D).

To test this, we first recorded birds at multiple time-points (median - 3; range - 2–8 time-points per bird) from ~90 days post-hatch to ~3 years of age (Fig. 5A). As compared to the first day of recordings, we found increases in IN number in most birds (Fig. 5A, red circles represent birds with increases over age, black circles represent birds with decreases over age and gray circles represent birds that do not change over age). We divided our recordings into 3 age categories, namely, (1) within 1 year post hatch, (2) 1–2 years post hatch and, (3) > 2 years post hatch. For some of the birds we had multiple recordings within each of these categories, typically within a few days of each other. For each bird we calculated pair-wise differences in mean IN number between the different age categories. Differences in mean IN number were largest when the first recording day was within 1 year post hatch and the second recording day greater than 1 year (Fig. 5B, bigger circles for Age 1 < 365 dph and Age 2 > 365 dph). For each bird, we chose pairs of days between age categories with the largest difference in mean IN number (see Methods for details, Fig. 5B, ‘*’ represents days chosen for further analysis) and then used changes between these pairs of days to further compare age-related changes across birds. We chose this strategy to increase our chances of testing the prediction of the motor preparation hypothesis as we expected greater correlations for these days as mentioned above. All of the analyses described below were carried out with these days.

Mean IN number and the associated “time-to-song” increased in the 1st year post-hatch (Fig. 6C, p = 0.02, Kruskal-Wallis test, p = 0.09 for <1 yr to >1 yr vs. day-day and p = 0.02 for <1 yr to >1 yr vs. >1 yr, post-hoc Tukey-Kramer test; Fig. 6D, p = 0.003, Kruskal-Wallis test, p = 0.04 for day-day vs. <1 yr to >1 yr and p = 0.003 for <1 yr to >1 yr vs. >1 yr, post-hoc Tukey-Kramer test). Song tempo, for the first song motif in the bout (the motif that followed the INs at the beginning), increased significantly, as seen by the significant shortening of first song motif duration (Fig. 6B, p = 0.007, Kruskal-Wallis test, p = 0.005 for day-day vs. <1 yr to >1 yr, post-hoc Tukey-Kramer test). In our birds, song stereotypy did not increase significantly across the 3 age-groups, although song structure itself changed significantly in the first year (Table 2, song motif similarity index and motif temporal similarity between days). Other features of song that changed in the first year were reduced sequence entropy, increased pitch goodness and reduced variability in frequency modulation of the first song syllable. Significant changes were also observed in the acoustic features of INs: increased frequency modulation, decreased CV of duration and of pitch goodness, and increased CV of frequency modulation. We also observed some trends that approached significance, including longer intervals between the first two INs, increased pitch goodness of INs, increased number of motifs in a bout, and reduced duration of the first motif syllable (Table 2, 0.05 < p < 0.1). Overall, these results document age-related changes in IN properties that occur along with age-related song changes.

TABLE 2:

IN and Song changes with age

	Property Measured	Age groups	Individual groups				Statistical Test	P-value	Post-hoc: Tukey-Kramer Test

			N	Group Mean ± SE	Group Median [Range]	Confidence Intervals

IN number	Number	day-day	10	MEAN +/− SE 0.0118+/−0.0216	MEDIAN [RANGE] 0.012616[−0.37937 – 0.29564]	CI [−0.14249,0.16606]	KruskalWallis ANOVA	p value = 0.021175	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −18.847 −8.9455 0.95651 0.086335 {‘day-day’} {‘>1 yr’} −8.1035 1.6 11.304 0.92098 {‘1yr’} {‘>1 yr’} 1.0856 10.545 20.005 0.024384
		<1yr to >1yr	11	MEAN +/− SE 0.561+/−0.0708	MEDIAN [RANGE] 0.45195[−0.92505 – 1.73]	CI [0.038162,1.0839]
		>1 yr	12	MEAN +/− SE −0.0594+/−0.0355	MEDIAN [RANGE] −0.090537[−0.79488 – 0.91923]	CI [−0.32993,0.21121]

IN timing	mean last interval	day-day	10	MEAN +/− SE −1.2846+/−0.528	MEDIAN [RANGE] −0.20482[−10.952 – 6.0383]	CI [−5.0621,2.4928]	KruskalWallis ANOVA	p value = 0.85669
		<1yr to >1yr	10	MEAN +/− SE −6.089+/−1.2398	MEDIAN [RANGE] −3.253[−35.0634 – 5.5633]	CI [−14.9577,2.7797]
		>1 yr	11	MEAN +/− SE −2.405+/−0.3852	MEDIAN [RANGE] −0.24286[−11.1614 – 1.7304]	CI [−5.2512,0.44129]

	mean Ratio of intervals	day-day	10	MEAN +/− SE 0.0028+/−0.0041	MEDIAN [RANGE] −0.0041247[−0.073999 – 0.085084]	CI [−0.026361,0.031913]	KruskalWallis ANOVA	p value = 0.9567
		<1yr to >1yr	10	MEAN +/− SE −0.1424+/−0.0481	MEDIAN [RANGE] −0.034303[−1.4426 – 0.29967]	CI [−0.48622,0.20146]
		>1 yr	11	MEAN +/− SE −0.0054+/−0.0051	MEDIAN [RANGE] −0.0029565[−0.13038 – 0.065233]	CI [−0.043185,0.032319]

	mean first interval	day-day	10	MEAN +/− SE −0.9244+/−2.3378	MEDIAN [RANGE] 2.8502[−30.5386 – 49.3913]	CI [−17.6477,15.7989]	KruskalWallis ANOVA	p value = 0.068496
		<1yr to >1yr	10	MEAN +/− SE 12.0475+/−3.2618	MEDIAN [RANGE] 15.3978[−66.9523 – 50.7163]	CI [−11.2859,35.381]
		>1 yr	11	MEAN +/− SE −15.0694+/−3.2748	MEDIAN [RANGE] −9.6375[−76.4166 – 44.6174]	CI [−39.2702,9.1313]

IN acoustic progression	mean Last IN acoustic distance to last	day-day	9	MEAN +/− SE −0.3475+/−0.0402	MEDIAN [RANGE] −0.38907[−1.0176 – 0.2705]	CI [−0.6257,−0.069276]	KruskalWallis ANOVA	p value = 0.16984
		<1yr to >1yr	7	MEAN +/− SE 0.0377+/−0.0538	MEDIAN [RANGE] 0.1414[−0.60348 – 0.3727]	CI [−0.31055,0.38586]
		>1 yr	9	MEAN +/− SE −0.1991+/−0.0721	MEDIAN [RANGE] −0.12088[−1.6982 – 0.68917]	CI [−0.69764,0.29937]

	mean Ratio of acoustic	day-day	9	MEAN +/− SE 0.019+/−0.0108	MEDIAN [RANGE] −0.0090575[−0.083166 – 0.14485]	CI [−0.055698,0.093642]	KruskalWallis ANOVA	p value = 0.81741
		<1yr to >1yr	7	MEAN +/− SE −0.0051+/−0.0136	MEDIAN [RANGE] 0.014391[−0.17221 – 0.11397]	CI [−0.092935,0.082812]
		>1 yr	9	MEAN +/− SE −0.0003+/−0.008	MEDIAN [RANGE] 0.0065595[−0.11984 – 0.094904]	CI [−0.05576,0.055129]

	mean First IN acoustic distance to Last	day-day	9	MEAN +/− SE −1.0099+/−0.2076	MEDIAN [RANGE] −0.48591[−4.4112 – 0.66241]	CI [−2.4461,0.4262]	KruskalWallis ANOVA	p value = 0.81741
		<1yr to >1yr	7	MEAN +/− SE −0.5839+/−0.4119	MEDIAN [RANGE] −0.16996[−6.7354 – 1.7692]	CI [−3.2506,2.0828]
		>1 yr	9	MEAN +/− SE −0.6614+/−0.244	MEDIAN [RANGE] −0.58999[−3.5397 – 2.3548]	CI [−2.3495,1.0267]

IN temporal progression	mean IN temporal similarity	day-day	10	MEAN +/− SE −0.0415+/−0.0133	MEDIAN [RANGE] −0.016053[−0.34315 – 0.12322]	CI [−0.13698,0.053904]	KruskalWallis ANOVA	p value = 0.5976
		<1yr to >1yr	10	MEAN +/− SE −0.0375+/−0.0116	MEDIAN [RANGE] −0.058391[−0.22948 – 0.15545]	CI [−0.12041,0.045359]
		>1 yr	10	MEAN +/− SE 0.0105+/−0.0111	MEDIAN [RANGE] −0.0099938[−0.12124 – 0.22186]	CI [−0.068633,0.089703]

IN individual acoustic properties	mean Duration	day-day	8	MEAN +/− SE −111.05+/−48.371	MEDIAN [RANGE] 24.4624[−1050.0514 – 134.5232]	CI [−434.5631,212.463]	KruskalWallis ANOVA	p value = 0.95198
		<1yr to >1yr	7	MEAN +/− SE 291.7934+/−108.7078	MEDIAN [RANGE] 22.0911[−562.1608 – 1549.8409]	CI [−411.972,995.5588]
		>1 yr	9	MEAN +/− SE 150.8497+/−55.5404	MEDIAN [RANGE] −31.729[−474.1411 – 1057.3896]	CI [−233.3793,535.0787]

	mean MeanFrequency	day-day	8	MEAN +/− SE 0.0269+/−0.0087	MEDIAN [RANGE] 0.0082835[−0.044376 – 0.15728]	CI [−0.031269,0.085104]	KruskalWallis ANOVA	p value = 0.50265
		<1yr to >1yr	7	MEAN +/− SE 0.2975+/−0.0629	MEDIAN [RANGE] 0.34665[−0.26392 – 1.0618]	CI [−0.10941,0.70441]
		>1 yr	9	MEAN +/− SE 0.2592+/−0.0563	MEDIAN [RANGE] 0.039671[−0.16794 – 1.3985]	CI [−0.13037,0.64879]

	mean Entropy	day-day	8	MEAN +/− SE −0.0315+/−0.2474	MEDIAN [RANGE] 0.18696[−3.3383 – 2.7798]	CI [−1.6862,1.6232]	KruskalWallis ANOVA	p value = 0.46338
		<1yr to >1yr	7	MEAN +/− SE −9.0364+/−1.754	MEDIAN [RANGE] −13.1006[−26.8391 – 5.6277]	CI [−20.3913,2.3186]
		>1 yr	9	MEAN +/− SE −8.8939+/−1.8216	MEDIAN [RANGE] −2.5427[−42.3353 – 9.6014]	CI [−21.4956,3.7078]

	mean LogAmplitude	day-day	8	MEAN +/− SE −4.1258+/−1.4467	MEDIAN [RANGE] −3.4446[−24.2639 – 9.7269]	CI [−13.8016,5.55]	KruskalWallis ANOVA	p value = 0.28124
		<1yr to >1yr	7	MEAN +/− SE 22.8571+/−5.8646	MEDIAN [RANGE] 11.4677[−35.6068 – 84.9099]	CI [−15.1097,60.824]
		>1 yr	9	MEAN +/− SE −5.3387+/−4.7627	MEDIAN [RANGE] 13.6997[−75.7125 – 54.7122]	CI [−38.2874,27.61]

	mean PitchGoodness	day-day	8	MEAN +/− SE 0.7614+/−0.2263	MEDIAN [RANGE] 0.85853[−1.8487 – 3.2726]	CI [−0.75213,2.2748]	KruskalWallis ANOVA	p value = 0.067983
		<1yr to >1yr	7	MEAN +/− SE 3.2852+/−0.7066	MEDIAN [RANGE] 2.8079[−2.8372 – 12.7516]	CI [−1.2896,7.86]
		>1 yr	9	MEAN +/− SE −2.4962+/−0.9237	MEDIAN [RANGE] −1.8307[−18.5602 – 11.9518]	CI [−8.8867,3.8943]

	mean FM	day-day	8	MEAN +/− SE −0.0022+/−0.0007	MEDIAN [RANGE] −0.0023104[−0.013369 – 0.0067817]	CI [−0.0071321,0.0027838]	KruskalWallis ANOVA	p value = 0.039514	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −15.077 −6.5 2.0771 0.17762 {‘day-day’} {‘>1 yr’} −5.6639 2.3889 10.442 0.76625 {‘1yr’} {‘>1 yr’} 0.53715 8.8889 17.241 0.033743
		<1yr to >1yr	7	MEAN +/− SE 0.0079+/−0.0014	MEDIAN [RANGE] 0.0052126[−0.0065648 – 0.022539]	CI [−0.00092705,0.016636]
		>1 yr	9	MEAN +/− SE −0.0094+/−0.0016	MEDIAN [RANGE] −0.0074783[−0.031784 – 0.0063546]	CI [−0.02058,0.0017585]

IN variability: number	Variability/ CV IN Number	day-day	10	MEAN +/− SE −0.0187+/−0.0039	MEDIAN [RANGE] −0.010639[−0.079494 – 0.034191]	CI [−0.046518,0.0092119]	KruskalWallis ANOVA	p value = 0.38459
		<1yr to >1yr	11	MEAN +/− SE −0.0199+/−0.007	MEDIAN [RANGE] −5.1272e-05[−0.17488 – 0.077454]	CI [−0.071309,0.031595]
		>1 yr	12	MEAN +/− SE 0.0199+/−0.0053	MEDIAN [RANGE] 0.024832[−0.081275 – 0.13978]	CI [−0.020633,0.06047]

IN variability: timing	Variability/ CV Last IN Interval	day-day	10	MEAN +/− SE 0.0181+/−0.0039	MEDIAN [RANGE] 0.014022[−0.016373 – 0.12097]	CI [−0.0099727,0.04619]	KruskalWallis ANOVA	p value = 0.43052
		<1yr to >1yr	11	MEAN +/− SE 0.0285+/−0.0222	MEDIAN [RANGE] −0.011954[−0.25775 – 0.68661]	CI [−0.13518,0.19224]
		>1 yr	12	MEAN +/− SE −0.0094+/−0.0026	MEDIAN [RANGE] −0.0081933[−0.068218 – 0.034897]	CI [−0.029477,0.010764]

	Variability/ CV Ratio of intervals	day-day	10	MEAN +/− SE 0.0064+/−0.0164	MEDIAN [RANGE] 0.027374[−0.35553 – 0.25091]	CI [−0.11082,0.12369]	KruskalWallis ANOVA	p value = 0.36706
		<1yr to >1yr	11	MEAN +/− SE 0.2316+/−0.0309	MEDIAN [RANGE] 0.1856[−0.21458 – 0.73979]	CI [0.0034375,0.45981]
		>1 yr	12	MEAN +/− SE 0.0775+/−0.0257	MEDIAN [RANGE] 0.048876[−0.32217 – 0.92068]	CI [−0.1188,0.27386]

	Variability/ CV First IN interval	day-day	10	MEAN +/− SE −0.0137+/−0.0162	MEDIAN [RANGE] −0.016848[−0.3406 – 0.17153]	CI [−0.12935,0.10192]	KruskalWallis ANOVA	p value = 0.18575
		<1yr to >1yr	11	MEAN +/− SE 0.1268+/−0.0193	MEDIAN [RANGE] 0.14451[−0.20024 – 0.46645]	CI [−0.015772,0.26936]
		>1 yr	12	MEAN +/− SE −0.0053+/−0.0185	MEDIAN [RANGE] −0.037041[−0.30882 – 0.43536]	CI [−0.14616,0.13548]

IN variability acoustic progression	Variability/ CV Last IN acoustic distance to last	day-day	10	MEAN +/− SE −0.0918+/−0.0123	MEDIAN [RANGE] −0.080348[−0.41275 – 0.020987]	CI [−0.18001,−0.0035068]	KruskalWallis ANOVA	p value = 0.042309	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −17.002 −7.1 2.802 0.21266 {‘day-day’} {‘>1 yr’} −19.97 −10.267 −0.56316 0.0351 {‘1yr’} {‘>1 yr’} −12.627 −3.1667 6.2932 0.71254
		<1yr to >1yr	11	MEAN +/− SE −0.0158+/−0.0245	MEDIAN [RANGE] −0.017625[−0.64494 – 0.44022]	CI [−0.19671,0.16509]
		>1 yr	12	MEAN +/− SE 0.0521+/−0.0166	MEDIAN [RANGE] 0.063083[−0.44492 – 0.34816]	CI [−0.074458,0.17869]

	Variability/ CV Ratio of acoustic distance to last	day-day	10	MEAN +/− SE −0.058+/−0.0206	MEDIAN [RANGE] −0.049568[−0.36106 – 0.38739]	CI [−0.20515,0.089092]	KruskalWallis ANOVA	p value = 0.69068
		<1yr to >1yr	11	MEAN +/− SE −0.1163+/−0.0364	MEDIAN [RANGE] −0.018213[−1.1743 – 0.32172]	CI [−0.3854,0.15288]
		>1 yr	12	MEAN +/− SE 0.0201+/−0.037	MEDIAN [RANGE] 0.0065369[−1.0488 – 0.85784]	CI [−0.26179,0.30202]

	Variability/ CV First IN acoustic distance to last	day-day	10	MEAN +/− SE −0.0294+/−0.0103	MEDIAN [RANGE] −0.027447[−0.25547 – 0.085092]	CI [−0.10308,0.044208]	KruskalWallis ANOVA	p value = 0.58736
		<1yr to >1yr	11	MEAN +/− SE −0.0643+/−0.0127	MEDIAN [RANGE] −0.080123[−0.2751 – 0.2042]	CI [−0.15855,0.029855]
		>1 yr	12	MEAN +/− SE −0.0119+/−0.0115	MEDIAN [RANGE] −0.01908[−0.27526 – 0.20089]	CI [−0.099899,0.076058]

IN variability temporal progression	Variability/ CV IN temporal similarity	day-day	10	MEAN +/− SE 0.01+/−0.013	MEDIAN [RANGE] 0.014167[−0.25792 – 0.25117]	CI [−0.08284,0.10292]	KruskalWallis ANOVA	p value = 0.2413
		<1yr to >1yr	11	MEAN +/− SE 0.0392+/−0.0203	MEDIAN [RANGE] −0.037849[−0.11084 – 0.59477]	CI [−0.11062,0.18898]
		>1 yr	11	MEAN +/− SE 0.091+/−0.0128	MEDIAN [RANGE] 0.10421[−0.11398 – 0.33483]	CI [−0.0037432,0.18577]

IN variability individual acoustic properties	Variability/ CV Duration	day-day	8	MEAN +/− SE −0.004+/−0.0031	MEDIAN [RANGE] 0.0031015[−0.046894 – 0.028636]	CI [−0.024799,0.016788]	KruskalWallis ANOVA	p value = 0.023576	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −3.4699 5.1071 13.684 0.34321 {‘day-day’} {‘>1 yr’} −12.692 −4.6389 3.4139 0.36744 {‘1yr’} {‘>1 yr’} −18.098 −9.746 −1.3943 0.017162
		<1yr to >1yr	7	MEAN +/− SE −0.0602+/−0.0111	MEDIAN [RANGE] −0.067725[−0.20548 – 0.020437]	CI [−0.1318,0.011329]
		>1 yr	9	MEAN +/− SE 0.0312+/−0.005	MEDIAN [RANGE] 0.024413[−0.012636 – 0.13136]	CI [−0.0031162,0.065454]

	Variability/ CV MeanFrequency	day-day	8	MEAN +/− SE 0.0119+/−0.0016	MEDIAN [RANGE] 0.014156[−0.015905 – 0.026102]	CI [0.0015679,0.022302]	KruskalWallis ANOVA	p value = 0.34252
		<1yr to >1yr	7	MEAN +/− SE 0.0061+/−0.0064	MEDIAN [RANGE] 0.011416[−0.061778 – 0.073862]	CI [−0.035534,0.047768]
		>1 yr	9	MEAN +/− SE −0.0129+/−0.0054	MEDIAN [RANGE] −0.00155[−0.13064 – 0.040464]	CI [−0.050524,0.02463]

	Variability/ CV Entropy	day-day	8	MEAN +/− SE 0.0019+/−0.0017	MEDIAN [RANGE] 0.00057463[−0.016068 – 0.025586]	CI [−0.0096725,0.013444]	KruskalWallis ANOVA	p value = 0.30481
		<1yr to >1yr	7	MEAN +/− SE 0.0684+/−0.0189	MEDIAN [RANGE] 0.036479[−0.068524 – 0.34817]	CI [−0.053753,0.19063]
		>1 yr	9	MEAN +/− SE −0.278+/−0.0914	MEDIAN [RANGE] 0.019074[−2.4435 – 0.11887]	CI [−0.91053,0.3546]

	Variability/ CV LogAmplitude	day-day	8	MEAN +/− SE −0.0025+/−0.0033	MEDIAN [RANGE] −0.0012403[−0.043856 – 0.034072]	CI [−0.024717,0.019762]	KruskalWallis ANOVA	p value = 0.16444
		<1yr to >1yr	7	MEAN +/− SE −0.045+/−0.0113	MEDIAN [RANGE] −0.04986[−0.14515 – 0.078516]	CI [−0.11793,0.02799]
		>1 yr	9	MEAN +/− SE 0.0102+/−0.0039	MEDIAN [RANGE] 0.0027708[−0.026547 – 0.088177]	CI [−0.016941,0.037414]

	Variability/ CV PitchGoodness	day-day	8	MEAN +/− SE −0.0032+/−0.0036	MEDIAN [RANGE] −0.0043208[−0.04264 – 0.054537]	CI [−0.02713,0.020723]	KruskalWallis ANOVA	p value = 0.010684	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −0.95207 7.625 16.202 0.093232 {‘day-day’} {‘>1 yr’} −10.983 −2.9306 5.1222 0.67001 {‘1yr’} {‘>1 yr’} −18.907 −10.556 −2.2038 0.0085829
		<1yr to >1yr	7	MEAN +/− SE −0.0551+/−0.0078	MEDIAN [RANGE] −0.048224[−0.1559 – 0.00080264]	CI [−0.1056,−0.0045073]
		>1 yr	9	MEAN +/− SE 0.0137+/−0.0036	MEDIAN [RANGE] 0.019179[−0.046311 – 0.050554]	CI [−0.010894,0.038251]

	Variability/ CV FM	day-day	8	MEAN +/− SE 0.0976+/−0.0299	MEDIAN [RANGE] 0.063292[−0.27683 – 0.56986]	CI [−0.10227,0.29751]	KruskalWallis ANOVA	p value = 0.024223	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −2.5592 6.0179 14.595 0.22705 {‘day-day’} {‘>1 yr’} −11.733 −3.6806 4.3722 0.532 {‘1yr’} {‘>1 yr’} −18.05 −9.6984 −1.3467 0.017846
		<1yr to >1yr	7	MEAN +/− SE 5.7541+/−2.9229	MEDIAN [RANGE] −0.74238[−6.5352 – 51.8705]	CI [−13.1688,24.677]
		>1 yr	9	MEAN +/− SE 4.9266+/−1.3632	MEDIAN [RANGE] 0.7803[0.4878 – 37.4909]	CI [−4.5038,14.357]

SONG Number of motifs	mean Number of motifs	day-day	10	MEAN +/− SE 0.2026+/−0.1084	MEDIAN [RANGE] −0.15016[−1.1807 – 2.5053]	CI [−0.57253,0.97777]	KruskalWallis ANOVA	p value = 0.071358
		<1yr to >1yr	11	MEAN +/− SE 1.4231+/−0.122	MEDIAN [RANGE] 1.1011[0.53571 – 4.0663]	CI [0.52176,2.3244]
		>1 yr	12	MEAN +/− SE 0.2657+/−0.1103	MEDIAN [RANGE] 0.20406[−2.1523 – 3.2665]	CI [−0.57497,1.1063]

SONG Motif duration	mean Motif duration	day-day	10	MEAN +/− SE 29.228+/−8.2369	MEDIAN [RANGE] 3.9954[−15.9994 – 262.1195]	CI [−29.6952,88.1512]	KruskalWallis ANOVA	p value = 0.0073049	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} 3.1755 12.782 22.388 0.0051686 {‘day-day’} {‘>1 yr’} −1.7336 7.8727 17.479 0.1328 {‘1yr’} {‘>1 yr’} −14.284 −4.9091 4.4657 0.437
		<1yr to >1yr	11	MEAN +/− SE −36.1327+/−3.103	MEDIAN [RANGE] −45.4232[−82.1703 – 8.1005]	CI [−59.0635,−13.2019]
		>1 yr	11	MEAN +/− SE −11.384+/−1.4579	MEDIAN [RANGE] −11.5948[−38.336 – 13.5182]	CI [−22.1579,−0.61]

SONG Motif similarity	mean Similarity Index b/w days	day-day	10	MEAN +/− SE 0.7533+/−0.0229	MEDIAN [RANGE] 0.82157[0.11343 – 0.8877]	CI [0.58926,0.9174]	KruskalWallis ANOVA	p value = 0.0099177	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} 2.7028 12.309 21.915 0.0075326 {‘day-day’} {‘>1 yr’} −4.7518 4.8545 14.461 0.46249 {‘1yr’} {‘>1 yr’} −16.829 −7.4545 1.9203 0.14936
		<1yr to >1yr	11	MEAN +/− SE 0.5192+/−0.0217	MEDIAN [RANGE] 0.51651[0.097511 – 0.81988]	CI [0.35885,0.67963]
		>1 yr	11	MEAN +/− SE 0.7251+/−0.0202	MEDIAN [RANGE] 0.77332[0.068898 – 0.89233]	CI [0.5757,0.87444]

	mean Motif Temporal Similarity b/w days	day-day	10	MEAN +/− SE 0.8508+/−0.0067	MEDIAN [RANGE] 0.8516[0.74702 – 0.94962]	CI [0.80285,0.89867]	KruskalWallis ANOVA	p value = 0.0061719	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} 3.2728 12.5 21.727 0.0042759 {‘day-day’} {‘>1 yr’} −4.0272 5.2 14.427 0.38353 {‘1yr’} {‘>1 yr’} −16.527 −7.3 1.9272 0.15223
		<1yr to >1yr	10	MEAN +/− SE 0.4524+/−0.0375	MEDIAN [RANGE] 0.47334[−0.27436 – 0.85254]	CI [0.18411,0.72078]
		>1 yr	10	MEAN +/− SE 0.7465+/−0.0161	MEDIAN [RANGE] 0.81097[0.42351 – 0.89456]	CI [0.63163,0.86136]

SONG First syllable timing and acoustic properties	mean a-b interval	day-day	10	MEAN +/− SE −0.8401+/−0.5078	MEDIAN [RANGE] 0.18811[−8.8901 – 5.5271]	CI [−4.4727,2.7925]	KruskalWallis ANOVA	p value = 0.23329
		<1yr to >1yr	11	MEAN +/− SE −4.4853+/−0.62	MEDIAN [RANGE] −5.7741[−12.1102 – 5.1788]	CI [−9.0674,0.096841]
		>1 yr	12	MEAN +/− SE −0.9871+/−0.3742	MEDIAN [RANGE] −2.1705[−4.6817 – 11.8113]	CI [−3.8401,1.866]

	mean first syll. duration	day-day	10	MEAN +/− SE 6.4948+/−1.5049	MEDIAN [RANGE] 1.8102[−7.7428 – 46.3523]	CI [−4.2709,17.2605]	KruskalWallis ANOVA	p value = 0.070662
		<1yr to >1yr	11	MEAN +/− SE −5.9773+/−1.0548	MEDIAN [RANGE] −1.9745[−29.2968 – 7.3544]	CI [−13.7721,1.8174]
		>1 yr	11	MEAN +/− SE −2.6005+/−0.3672	MEDIAN [RANGE] −1.6526[−11.5721 – 2.257]	CI [−5.3139,0.11279]

	mean first syll. meanfrequency	day-day	10	MEAN +/− SE −151.733+/−35.1637	MEDIAN [RANGE] 6.7708[−1084.3736 – 59.1939]	CI [−403.2792,99.8132]	KruskalWallis ANOVA	p value = 0.15753
		<1yr to >1yr	11	MEAN +/− SE 391.7428+/−83.5108	157.6735[−632.617 – 2409.3258]	CI [−225.3937,1008.8794]
		>1 yr	11	MEAN +/− SE 201.7046+/−61.525	24.7215[−400.7622 – 2184.6467]	CI [−252.9587,656.3679]

	mean first syll. entropy	day-day	10	MEAN +/− SE −0.0022+/−0.0084	MEDIAN [RANGE] −0.0086113[−0.14802 – 0.14057]	CI [−0.062035,0.057607]	KruskalWallis ANOVA	p value = 0.86606
		<1yr to >1yr	11	MEAN +/− SE 0.1305+/−0.0436	MEDIAN [RANGE] −0.0031437[−0.43748 – 1.1387]	CI [−0.19172,0.45274]
		>1 yr	11	MEAN +/− SE 0.0163+/−0.0157	MEDIAN [RANGE] 0.048293[−0.25637 – 0.36555]	CI [−0.099529,0.13204]

	mean first syll. logamplitude	day-day	10	MEAN +/− SE 1.2641+/−0.3431	MEDIAN [RANGE] 0.13582[−2.2428 – 10.1959]	CI [−1.19,3.7181]	KruskalWallis ANOVA	p value = 0.18769
		<1yr to >1yr	11	MEAN +/− SE −10.3542+/−0.9537	MEDIAN [RANGE] −14.8694[−22.9947 – 5.9059]	CI [−17.4023,−3.3062]
		>1 yr	11	MEAN +/− SE −5.3161+/−1.7469	MEDIAN [RANGE] 1.0588[−43.9232 – 20.7091]	CI [−18.2257,7.5935]

	mean first syll. pitchgoodness	day-day	10	MEAN +/− SE −2.5285+/−0.3072	MEDIAN [RANGE] −1.7103[−6.4773 – 1.8701]	CI [−4.7264,−0.33064]	KruskalWallis ANOVA	p value = 0.00057452	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −24.079 −14.473 −4.8664 0.0012043 {‘day-day’} {‘>1 yr’} −11.715 −2.1091 7.4972 0.86425 {‘1yr’} {‘>1 yr’} 2.9888 12.364 21.738 0.0056618
		<1yr to >1yr	11	MEAN +/− SE 25.7401+/−2.2324	MEDIAN [RANGE] 15.7712[2.3762 – 77.0576]	CI [9.2427,42.2375]
		>1 yr	11	MEAN +/− SE 1.016+/−1.6216	MEDIAN [RANGE] −1.8068[−26.5565 – 37.4611]	CI [−10.9673,12.9993]

	mean first syll. FM	day-day	10	MEAN +/− SE −0.0678+/−0.1779	MEDIAN [RANGE] 0.51513[4.6709 – 1.2936]	CI [−1.3401,1.2046]	KruskalWallis ANOVA	p value = 0.24517
		<1yr to >1yr	11	MEAN +/− SE 2.2977+/−0.3432	MEDIAN [RANGE] 2.4831[−3.4269 – 9.0626]	CI [−0.23826,4.8337]
		>1 yr	11	MEAN +/− SE 0.1021+/−0.354	MEDIAN [RANGE] −0.27577[−6.3593 – 9.0956]	CI [−2.5138,2.718]

SONG Sequence	mean Sequence Linearity	day-day	10	MEAN +/− SE −0.0064+/−0.0014	MEDIAN [RANGE] −0.00466[−0.033023 – 0.010378]	CI [−0.016217,0.003357]	KruskalWallis ANOVA	p value = 0.26738
		<1yr to >1yr	11	MEAN +/− SE 0.0085+/−0.0051	MEDIAN [RANGE] 0.010651[−0.092121 – 0.094655]	CI [−0.029464,0.046449]
		>1 yr	12	MEAN +/− SE 0.0137+/−0.0031	MEDIAN [RANGE] 0.0089315[−0.040662 – 0.10559]	CI [−0.010246,0.037614]

	mean Sequence Consistency	day-day	10	MEAN +/− SE 0.0356+/−0.0077	MEDIAN [RANGE] 4.8284e-05[−0.025719 – 0.18262]	CI [−0.019339,0.090479]	KruskalWallis ANOVA	p value = 0.91088
		<1yr to >1yr	11	MEAN +/− SE 0.0144+/−0.0115	MEDIAN [RANGE] 0.0041893[−0.15141 – 0.25843]	CI [−0.070296,0.099097]
		>1 yr	12	MEAN +/− SE 0.0203+/−0.0078	MEDIAN [RANGE] −0.0034457[−0.077696 – 0.24331]	CI [−0.039332,0.079836]

	mean Sequence Entropy	day-day	10	MEAN +/− SE −0.0441+/−0.0081	MEDIAN [RANGE] −0.040476[−0.16685 – 0.10099]	CI [−0.10234,0.014168]	KruskalWallis ANOVA	p value = 0.024716	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} −4.7565 5.1455 15.047 0.44254 {‘day-day’} {‘>1 yr’} −15.52 −5.8167 3.8868 0.33832 {‘1yr’} {‘>1 yr’} −20.422 −10.962 −1.5023 0.018147
		<1yr to >1yr	11	MEAN +/− SE −0.1381+/−0.0194	MEDIAN [RANGE] −0.17927[−0.49211 – 0.23086]	CI [−0.28177,0.0054986]
		>1 yr	12	MEAN +/− SE 0.0489+/−0.0102	MEDIAN [RANGE] 0.0025373[−0.058314 – 0.3006]	CI [−0.02922,0.127]

SONG variability Number of motifs	variability/ CV Number of motifs	day-day	10	MEAN +/− SE 0.0061+/−0.0044	MEDIAN [RANGE] 0.01163[−0.060488 – 0.10553]	CI [−0.02569,0.037943]	KruskalWallis ANOVA	p value = 0.9075
		<1yr to >1yr	11	MEAN +/− SE −0.0174+/−0.0119	MEDIAN [RANGE] −0.016933[−0.20432 – 0.26592]	CI [−0.10557,0.070809]
		>1 yr	12	MEAN +/− SE −0.0227+/−0.0107	MEDIAN [RANGE] 0.014456[−0.27843 – 0.14339]	CI [−0.10426,0.058791]

SONG variability Motif duration	variability/ CV Motif duration	day-day	10	MEAN +/− SE −0.0002+/−0.0003	MEDIAN [RANGE] 5.2688e-05[−0.0040215 – 0.0045341]	CI [−0.0022209,0.0017378]	KruskalWallis ANOVA	p value = 0.79223
		<1yr to >1yr	11	MEAN +/− SE −0.0004+/−0.0004	MEDIAN [RANGE] −3.4335e-05[−0.0069788 – 0.0052676]	CI [−0.0034333,0.0025572]
		>1 yr	11	MEAN +/− SE −0.0012+/−0.0003	MEDIAN [RANGE] 1.8438e-05[−0.006412 – 0.003475]	CI [−0.0035511,0.0011614]

SONG variability Motif similarity	variability/ CV Similarity Index b/w days	day-day	10	MEAN +/− SE 0.0168+/−0.0036	MEDIAN [RANGE] 0.0061638[−0.021014 – 0.11002]	CI [−0.0089138,0.042558]	KruskalWallis ANOVA	p value = 0.58138
		<1yr to >1yr	11	MEAN +/− SE 0.0233+/−0.0073	MEDIAN [RANGE] −0.013012[−0.072234 – 0.18034]	CI [−0.030843,0.077447]
		>1 yr	11	MEAN +/− SE −0.0001+/−0.0036	MEDIAN [RANGE] 0.004499[−0.066177 – 0.06819]	CI [−0.026985,0.026744]

	variability/ CV Motif Temporal Similarity b/w days	day-day	10	MEAN +/− SE 0.0024+/−0.0033	MEDIAN [RANGE] −0.00087396[−0.041232 – 0.062861]	CI [−0.021115,0.02597]	KruskalWallis ANOVA	p value = 0.94379
		<1yr to >1yr	11	MEAN +/− SE 0.008+/−0.0077	MEDIAN [RANGE] 0.0031895[−0.12646 – 0.15639]	CI [−0.048744,0.064727]
		>1 yr	11	MEAN +/− SE 0.0139+/−0.0054	MEDIAN [RANGE] 0.0061457[−0.064618 – 0.10181]	CI [−0.025661,0.053547]

SONG variability First motif syllable Timing and acoustic properties	variability/ CV a-b interval	day-day	10	MEAN +/− SE 0.0048+/−0.0036	MEDIAN [RANGE] 0.006541[−0.068773 – 0.069437]	CI [−0.020815,0.030502]	KruskalWallis ANOVA	p value = 0.3443
		<1yr to >1yr	11	MEAN +/− SE −0.0309+/−0.0057	MEDIAN [RANGE] −0.0096783[−0.17501 – 0.034413]	CI [−0.073356,0.011538]
		>1 yr	12	MEAN +/− SE −0.0112+/−0.0028	MEDIAN [RANGE] −0.0046618[−0.076364 – 0.040639]	CI [−0.032934,0.010525]

	variability/ CV first syll. duration	day-day	10	MEAN +/− SE −0.0002+/−0.0006	MEDIAN [RANGE] 0.0008949[−0.014397 – 0.008081]	CI [−0.004258,0.0039024]	KruskalWallis ANOVA	p value = 0.35878
		<1yr to >1yr	11	MEAN +/− SE −0.0112+/−0.002	MEDIAN [RANGE] −0.0056469[−0.062141 – 0.014012]	CI [−0.026271,0.0038214]
		>1 yr	11	MEAN +/− SE 0.0008+/−0.0016	MEDIAN [RANGE] −0.0021715[−0.023275 – 0.046759]	CI [−0.010703,0.012235]

	variability/ CV first syll. meanfrequency	day-day	10	MEAN +/− SE −0.0055+/−0.001	MEDIAN [RANGE] −0.0025316[−0.024134 – 0.0090383]	CI [−0.012463,0.0015509]	KruskalWallis ANOVA	p value = 0.9193
		<1yr to >1yr	11	MEAN +/− SE −0.0101+/−0.0029	MEDIAN [RANGE] −0.010687[−0.059918 – 0.057253]	CI [−0.031614,0.011496]
		>1 yr	11	MEAN +/− SE −0.012+/−0.0019	MEDIAN [RANGE] −0.0050096[−0.050071 – 0.0089345]	CI [−0.026197,0.0021871]

	variability/ CV first syll. entropy	day-day	10	MEAN +/− SE −0.0005+/−0.0019	MEDIAN [RANGE] 0.0050826[−0.039327 – 0.021044]	CI [−0.014012,0.013049]	KruskalWallis ANOVA	p value = 0.38537
		<1yr to >1yr	11	MEAN +/− SE −0.0142+/−0.0031	MEDIAN [RANGE] −0.0076413[−0.061403 – 0.037881]	CI [−0.036817,0.0083946]
		>1 yr	11	MEAN +/− SE −0.0013+/−0.0015	MEDIAN [RANGE] −0.0046465[−0.022166 – 0.028778]	CI [−0.012305,0.0096898]

	variability/ CV first syll. logamplitude	day-day	10	MEAN +/− SE 0.049+/−0.0137	MEDIAN [RANGE] 0.0025827[−0.0047802 – 0.43934]	CI [−0.049387,0.14731]	KruskalWallis ANOVA	p value = 0.5123
		<1yr to >1yr	11	MEAN +/− SE 0.0233+/−0.0077	MEDIAN [RANGE] −0.0025583[−0.070594 – 0.16975]	CI [−0.033259,0.079812]
		>1 yr	11	MEAN +/− SE −0.0113+/−0.0084	MEDIAN [RANGE] −0.0010149[−0.16967 – 0.15332]	CI [−0.07347,0.050869]

	variability/ CV first syll. pitchgoodness	day-day	10	MEAN +/− SE 0.0032+/−0.0016	MEDIAN [RANGE] 0.0039579[−0.020149 – 0.029188]	CI [−0.0081618,0.014606]	KruskalWallis ANOVA	p value = 0.32383
		<1yr to >1yr	11	MEAN +/− SE −0.0158+/−0.0036	MEDIAN [RANGE] −0.0098688[−0.089673 – 0.049601]	CI [−0.042089,0.010569]
		>1 yr	11	MEAN +/− SE −0.0024+/−0.0018	MEDIAN [RANGE] −0.0043676[−0.028945 – 0.03971]	CI [−0.015916,0.011124]

	variability/ CV first syll. FM	day-day	10	MEAN +/− SE 0.0096+/−0.0018	MEDIAN [RANGE] 0.0044912[−0.0025808 – 0.057731]	CI [−0.0030461,0.022322]	KruskalWallis ANOVA	p value = 0.0043445	Group A Group B Lower Limit A-B Upper Limit P-value {‘day-day’} {‘1yr’} 1.8119 11.418 21.024 0.014767 {‘day-day’} {‘>1 yr’} 2.5391 12.145 21.752 0.0085546 {‘1yr’} {‘>1 yr’} −8.6475 0.72727 10.102 0.98194
		<1yr to >1yr	11	MEAN +/− SE −0.0212+/−0.0048	MEDIAN [RANGE] −0.0046254[−0.17894 – 0.0063666]	CI [−0.056712,0.014238]
		>1 yr	11	MEAN +/− SE −0.0068+/−0.0006	MEDIAN [RANGE] −0.006761[−0.018602 – 0.0024346]	CI [−0.011559,−0.0021225]

SONG variability Similarity of motifs within bouts	variability/ CV similarity within motif	day-day	10	MEAN +/− SE −2.4078+/−0.7858	MEDIAN [RANGE] 0.038583[−24.7646 – 0.48545]	CI [−8.0287,3.2132]	KruskalWallis ANOVA	p value = 0.48074
		<1yr to >1yr	11	MEAN +/− SE 4.1703+/−1.0823	MEDIAN [RANGE] 0.019058[−5.192 – 36.4043]	CI [−3.8279,12.1685]
		>1 yr	11	MEAN +/− SE 0.041+/−0.1997	MEDIAN [RANGE] −0.21903[−3.0156 – 6.0959]	CI [−1.435,1.517]

Changes in mean or variability of IN properties between pairs of days belonging to 3 age categories: day-day, <1yr to >1yr and > 1 yr. All pairs of days are the same pairs of days selected for maximum change in IN number in Fig. 5. Names of properties that significantly change with age are highlighted in bold.

IN features that change with age do not show bout-to-bout correlations with song features that change with age

If INs represented motor preparation, the correlations between IN features and song features on a bout-to-bout basis should predict long-term changes in both IN and song features (Fig. 1D). For instance, our results demonstrated an age-related speeding up of song (decrease in song motif duration) and an age-related increase in mean IN number. If INs were preparatory, then these changes would be predicted by an already existing negative correlation between the number of INs in a bout and song motif duration. However, we did not find any correlation between song motif duration and either the number of INs in a bout or the time-to-song (Fig. 7A, 7B, p = 0.86, Kruskal-Wallis test and p = 0.88, Wilcoxon signed-rank test, respectively). We also compared first song motif stereotypy between bouts with fewer INs than the median IN number and bouts with more INs than the median IN number. First song motif stereotypy was not significantly different (Fig. 7C, 7D, p = 0.44, Kruskal-Wallis test and p = 0.65, Wilcoxon signed-rank test, respectively).

Fig. 7 — (A, C) Motif length normalized to median length (A) and motif similarity (C) are plotted for different IN numbers relative to the median IN number. Black squares represent mean across birds. (B, D) Motif length normalized to median length (B) and motif similarity (D) are plotted for short and long time-to-song. Squares and whiskers represent the mean and SEM across birds. In all plots, circles joined by lines represent mean values for one session from individual birds and different colours represent different birds. p > 0.05, Kruskal-Wallis ANOVA in (A) and (C); p > 0.05, Wilcoxon signed-rank test in (B) and (D).

Similarly, we did not find any bout-to-bout correlations between other IN features that change with age and song features that change with age (Table 3). Overall, these results demonstrate that song features and IN features change independently with age. The absence of correlations between the features that change does not satisfy the predictions of the motor preparation hypothesis. Rather, these results demonstrate independent age-related changes in both INs and songs, suggesting the possibility that age affects the neural circuits controlling INs and songs.

TABLE 3:

Within-day relationship of IN and Song properties that change with age.

IN Property changing with age	Measured song property	Corresponding IN Property groups	Individual group data				Test	P

			N	Group means	Group range	Confidence Intervals

Mean Time to Song	First song syll. mean pitch goodness	Time to Song <Median	20	MEAN +/− SE 123.5228+/−2.2636	125.5742[50.9307 – 216.3257]	CI [102.335,144.7106]	sign-rank	p value = 0.50159
		Time to Song >Median	20	MEAN +/− SE 122.706+/−2.1943	125.3481[57.3089 – 211.6025]	CI [102.1666,143.2454]

	Song Sequence Entropy	Time to Song <Median	20	MEAN +/− SE 0.1951+/−0.0082	MEDIAN [RANGE] 0.15957[0 – 0.48637]	CI [0.11845,0.27181]	sign-rank	p value = 0.19299
		Time to Song >Median	20	MEAN +/− SE 0.2318+/−0.0071	MEDIAN [RANGE] 0.24343[0 – 0.63016]	CI [0.16515,0.29843]

	First song syll. CV of Freq. modulation	Time to Song <Median	20	MEAN +/− SE 0.0692+/−0.0025	MEDIAN [RANGE] 0.055[0.024748 – 0.26642]	CI [0.045368,0.092963]	sign-rank	p value = 0.97022
		Time to Song >Median	20	MEAN +/− SE 0.0701+/−0.0029	0.056609[0.015444 – 0.30047]	CI [0.042704,0.097529]

Mean IN Frequency modulation	First song syll. mean pitch goodness	IN Freq. Modulation First IN < Median	17	MEAN +/− SE 128.1796+/−2.6248	126.6775[52.6362 – 212.7263]	CI [105.2376,151.1215]	sign-rank	p value = 0.86841
		IN Freq. Modulation First IN > Median	17	MEAN +/− SE 128.131+/−2.5845	124.8967[57.6679 – 214.1074]	CI [105.5405,150.7215]

	First song syll. mean pitch goodness	IN Freq. Modulation LastIN < Median	17	MEAN +/− SE 127.4103+/−2.5747	125.6182[50.6308 – 203.6352]	CI [104.906,149.9146]	sign-rank	p value = 0.61915
		IN Freq. Modulation LastIN > Median	17	MEAN +/− SE 128.8826+/−2.6526	126.0253[60.9129 – 224.1673]	CI [105.6972,152.0679]

	Song Sequence Entropy	IN Freq. Modulation First IN < Median	17	MEAN +/− SE 0.1822+/−0.0068	MEDIAN [RANGE] 0.19398[0 – 0.37821]	CI [0.12307,0.24139]	sign-rank	p value = 0.94604
		IN Freq. Modulation First IN > Median	17	MEAN +/− SE 0.1749+/−0.0073	MEDIAN [RANGE] 0.20596[0 – 0.40732]	CI [0.11102,0.23869]

	Song Sequence Entropy	IN Freq. Modulation LastIN < Median	17	MEAN +/− SE 0.1831+/−0.0069	MEDIAN [RANGE] 0.20961[0 – 0.39832]	CI [0.12301,0.24313]	sign-rank	p value = 0.39099
		IN Freq. Modulation LastIN > Median	17	MEAN +/− SE 0.1745+/−0.0083	MEDIAN [RANGE] 0.15161[0 – 0.48851]	CI [0.10185,0.24723]

	First song syll. CV of Freq. modulation	IN Freq. Modulation First IN < Median	17	MEAN +/− SE 0.0655+/−0.0033	0.053098[0.018979 – 0.27323]	CI [0.036895,0.094017]	sign-rank	p value = 0.16257
		IN Freq. Modulation First IN > Median	17	MEAN +/− SE 0.0711+/−0.0036	0.05341[0.020524 – 0.29628]	CI [0.039201,0.10298]

	First song syll. CV of Freq. modulation	IN Freq. Modulation LastIN < Median	17	MEAN +/− SE 0.0664+/−0.0029	0.053298[0.016997 – 0.2475]	CI [0.040922,0.09195]	sign-rank	p value = 0.52278
		IN Freq. Modulation LastIN > Median	17	MEAN +/− SE 0.068+/−0.0036	0.053593[0.02202 – 0.29521]	CI [0.036477,0.099552]

	Motif duration mean	IN Freq. Modulation First IN < Median	17	MEAN +/− SE 540.1806+/−8.1151	507.0181[347.9723 – 794.4851]	CI [469.2495,611.1117]	sign-rank	p value = 0.22739
		IN Freq. Modulation First IN > Median	17	MEAN +/− SE 539.6074+/−8.0783	511.695[348.1255 – 793.3096]	CI [468.9981,610.2168]

	Motif duration mean	IN Freq. Modulation LastIN < Median	17	MEAN +/− SE 540.404+/−8.1084	513.1207[347.061 – 795.4847]	CI [469.5315,611.2764]	sign-rank	p value = 0.17729
		IN Freq. Modulation LastIN > Median	17	MEAN +/− SE 539.3276+/−8.0792	505.7908[348.9808 – 792.3758]	CI [468.7109,609.9444]

	First song syll. mean pitch goodness	IN Freq. ModulationFirst IN < variable	17	MEAN +/− SE 128.7139+/−2.6547	125.1689[56.1686 – 216.0633]	CI [105.5102,151.9176]	sign-rank	p value = 0.90579
		IN Freq. ModulationFirst IN > variable	17	MEAN +/− SE 127.5747+/−2.5569	126.4439[54.5083 – 211.0079]	CI [105.2262,149.9233]

	First song syll. mean pitch goodness	IN Freq. ModulationLastIN < variable	17	MEAN +/− SE 128.2524+/−2.5679	125.817[58.4789 – 217.5751]	CI [105.8077,150.6971]	sign-rank	p value = 0.43476
		IN Freq. ModulationLastIN > variable	17	MEAN +/− SE 128.0654+/−2.6364	126.3588[52.979 – 209.4515]	CI [105.0218,151.109]

	Song Sequence Entropy	IN Freq. ModulationFirst IN < variable	17	MEAN +/− SE 0.1822+/−0.0068	MEDIAN [RANGE] 0.19398[0 – 0.37821]	CI [0.12307,0.24139]	sign-rank	p value = 0.94604
		IN Freq. ModulationFirst IN > variable	17	MEAN +/− SE 0.1749+/−0.0073	MEDIAN [RANGE] 0.20596[0 – 0.40732]	CI [0.11102,0.23869]

	Song Sequence Entropy	IN Freq. ModulationLastIN < variable	17	MEAN +/− SE 0.1831+/−0.0069	MEDIAN [RANGE] 0.20961[0 – 0.39832]	CI [0.12301,0.24313]	sign-rank	p value = 0.39099
		IN Freq. ModulationLastIN > variable	17	MEAN +/− SE 0.1745+/−0.0083	MEDIAN [RANGE] 0.15161[0 – 0.48851]	CI [0.10185,0.24723]

	First song syll. CV of Freq. modulation	IN Freq. ModulationFirst IN < variable	17	MEAN +/− SE 0.0681+/−0.0033	0.054209[0.011482 – 0.27301]	CI [0.038987,0.097192]	sign-rank	p value = 0.79459
		IN Freq. ModulationFirst IN > variable	17	MEAN +/− SE 0.0688+/−0.0036	0.055328[0.023822 – 0.29855]	CI [0.037104,0.10043]

	First song syll. CV of Freq. modulation	IN Freq. ModulationLastIN < variable	17	MEAN +/− SE 0.0674+/−0.0035	0.052272[0.017337 – 0.28556]	CI [0.037008,0.097777]	sign-rank	p value = 0.35596
		IN Freq. ModulationLastIN > variable	17	MEAN +/− SE 0.0694+/−0.0033	0.057171[0.022597 – 0.27334]	CI [0.040961,0.09793]

	Motif duration mean	IN Freq. ModulationFirst IN < variable	17	MEAN +/− SE 540.1084+/−8.0747	509.8513[348.6979 – 793.6949]	CI [469.5305,610.6863]	sign-rank	p value = 0.17729
		IN Freq. ModulationFirst IN > variable	17	MEAN +/− SE 539.6431+/−8.1125	508.7642[347.3595 – 793.9809]	CI [468.7348,610.5513]

	Motif duration mean	IN Freq. ModulationLastIN < variable	17	MEAN +/− SE 540.1361+/−8.0878	508.9897[348.602 – 793.3415]	CI [469.444,610.8283]	sign-rank	p value = 0.17729
		IN Freq. ModulationLastIN > variable	17	MEAN +/− SE 539.5851+/−8.0994	509.7336[347.3936 – 794.3277]	CI [468.7917,610.3786]

CV IN Pitch Goodness	First song syll. mean pitch goodness	IN Pitch GoodnessFirst IN < variable	17	MEAN +/− SE 127.2711+/−2.5544	128.3044[54.126 – 208.7475]	CI [104.9437,149.5984]	sign-rank	p value = 0.14879
		IN Pitch GoodnessFirst IN > variable	17	MEAN +/− SE 128.9532+/−2.6568	125.7771[56.5934 – 218.5388]	CI [105.7313,152.175]

	First song syll. mean pitch goodness	IN Pitch GoodnessLastIN < variable	17	MEAN +/− SE 128.3615+/−2.5988	127.4243[56.1387 – 211.9064]	CI [105.6465,151.0764]	sign-rank	p value = 0.46311
		IN Pitch GoodnessLastIN > variable	17	MEAN +/− SE 127.8869+/−2.6044	125.9434[54.8024 – 215.287]	CI [105.1226,150.6513]

	Song Sequence Entropy	IN Pitch GoodnessFirst IN < variable	17	MEAN +/− SE 0.1846+/−0.0068	MEDIAN [RANGE] 0.16901[0 – 0.39832]	CI [0.12533,0.24383]	sign-rank	p value = 0.58789
		IN Pitch GoodnessFirst IN > variable	17	MEAN +/− SE 0.1755+/−0.008	MEDIAN [RANGE] 0.18068[0 – 0.39237]	CI [0.1057,0.24528]

	Song Sequence Entropy	IN Pitch GoodnessLastIN < variable	17	MEAN +/− SE 0.1904+/−0.0078	MEDIAN [RANGE] 0.19053[0 – 0.43366]	CI [0.12214,0.25863]	sign-rank	p value = 0.39099
		IN Pitch GoodnessLastIN > variable	17	MEAN +/− SE 0.174+/−0.0067	MEDIAN [RANGE] 0.1985[0 – 0.37024]	CI [0.11542,0.23264]

	First song syll. CV of Freq. modulation	IN Pitch GoodnessFirst IN < variable	17	MEAN +/− SE 0.069+/−0.0036	0.05312[0.019788 – 0.29365]	CI [0.037933,0.10008]	sign-rank	p value = 0.86841
		IN Pitch GoodnessFirst IN > variable	17	MEAN +/− SE 0.0682+/−0.0034	0.052886[0.018332 – 0.27788]	CI [0.038838,0.097469]

	First song syll. CV of Freq. modulation	IN Pitch GoodnessLastIN < variable	17	MEAN +/− SE 0.0661+/−0.0032	0.048979[0.015658 – 0.26418]	CI [0.038071,0.094113]	sign-rank	p value = 0.13592
		IN Pitch GoodnessLastIN > variable	17	MEAN +/− SE 0.071+/−0.0037	MEDIAN [RANGE] 0.0558[0.024032 – 0.30476]	CI [0.038706,0.10325]

	Motif duration mean	IN Pitch GoodnessFirst IN < variable	17	MEAN +/− SE 540.118+/−8.0985	510.4636[347.9831 – 794.9476]	CI [469.3327,610.9033]	sign-rank	p value = 0.19299
		IN Pitch GoodnessFirst IN > variable	17	MEAN +/− SE 539.6061+/−8.0889	508.0754[348.1141 – 792.7513]	CI [468.9045,610.3076]

	Motif duration mean	IN Pitch GoodnessLastIN < variable	17	MEAN +/− SE 539.7045+/−8.0856	510.0202[347.1989 – 795.2668]	CI [469.0315,610.3774]	sign-rank	p value = 0.49246
		IN Pitch GoodnessLastIN > variable	17	MEAN +/− SE 540.0411+/−8.1053	508.5743[348.9418 – 792.4381]	CI [469.1962,610.8859]

CV IN Duration	First song syll. mean pitch goodness	IN DurationFirst IN < variable	17	MEAN +/− SE 128.4482+/−2.6013	125.6802[57.8534 – 207.1889]	CI [105.711,151.1854]	sign-rank	p value = 0.79459
		IN DurationFirst IN > variable	17	MEAN +/− SE 127.874+/−2.6139	125.9007[53.434 – 220.5358]	CI [105.0267,150.7213]

	First song syll. mean pitch goodness	IN DurationLastIN < variable	17	MEAN +/− SE 127.2207+/−2.623	125.0715[53.0123 – 211.6273]	CI [104.2942,150.1472]	sign-rank	p value = 0.02772
		IN DurationLastIN > variable	17	MEAN +/− SE 129.1233+/−2.5822	126.5474[58.4331 – 215.5743]	CI [106.5534,151.6933]

	Song Sequence Entropy	IN DurationFirst IN < variable	17	MEAN +/− SE 0.1909+/−0.0068	MEDIAN [RANGE] 0.21116[0 – 0.39477]	CI [0.13168,0.25002]	sign-rank	p value = 0.16772
		IN DurationFirst IN > variable	17	MEAN +/− SE 0.1687+/−0.0076	MEDIAN [RANGE] 0.18119[0 – 0.39832]	CI [0.10242,0.23496]

	Song Sequence Entropy	IN DurationLastIN < variable	17	MEAN +/− SE 0.1949+/−0.0083	MEDIAN [RANGE] 0.21285[0 – 0.437]	CI [0.12244,0.26727]	sign-rank	p value = 0.14648
		IN DurationLastIN > variable	17	MEAN +/− SE 0.1702+/−0.006	MEDIAN [RANGE] 0.19892[0 – 0.33124]	CI [0.11807,0.22243]

	First song syll. CV of Freq. modulation	IN DurationFirst IN < variable	17	MEAN +/− SE 0.0703+/−0.0036	0.056273[0.021323 – 0.29779]	CI [0.038726,0.10185]	sign-rank	p value = 0.20967
		IN DurationFirst IN > variable	17	MEAN +/− SE 0.0663+/−0.0034	0.048202[0.018392 – 0.07986]	CI [0.036368,0.096161]

	First song syll. CV of Freq. modulation	IN DurationLastIN < variable	17	MEAN +/− SE 0.0715+/−0.004	0.06134[0.021127 – 0.32388]	CI [0.036949,0.10608]	sign-rank	p value = 0.52278
		IN DurationLastIN > variable	17	MEAN +/− SE 0.0651+/−0.0028	0.050541[0.018461 – 0.23581]	CI [0.040397,0.089759]

	Motif duration mean	IN DurationFirst IN < variable	17	MEAN +/− SE 540.0089+/−8.0909	508.0247[348.5905 – 793.3571]	CI [469.2893,610.7285]	sign-rank	p value = 0.83131
		IN DurationFirst IN > variable	17	MEAN +/− SE 539.7405+/−8.0971	510.8192[347.4729 – 794.3124]	CI [468.9674,610.5135]

	Motif duration mean	IN DurationLastIN < variable	17	MEAN +/− SE 539.9145+/−8.1055	510.6374[347.7909 – 794.1965]	CI [469.0675,610.7616]	sign-rank	p value = 0.79459
		IN DurationLastIN > variable	17	MEAN +/− SE 539.8063+/−8.0804	507.8798[348.317 – 793.4885]	CI [469.179,610.4336]

For IN properties (Time to song, frequency modulation, CV of pitch goodness and duration) and song properties (Motif duration, mean pitch goodness of first song syllable, sequence entropy of the motif, CV of frequency modulation of first song syllable) that change with age, changes within-day were compared. The bouts were split based on mean or CV of IN property into two groups and corresponding mean or CV of song property was measured within each bird. The paired-groups were tested for statistical differences using Wilcoxon sign-rank test.
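The within-bird median split described in this caption can be sketched as follows. This is a minimal illustration, not the analysis code used for Table 3: the function name, the example values and the choice to drop bouts lying exactly on the median are all assumptions made here. In practice, the pair of group means from each bird would then be passed to a paired Wilcoxon signed-rank test across birds.

```python
from statistics import mean, median


def median_split_means(in_values, song_values):
    """Split one bird's bouts at the median of an IN property.

    Returns the mean song property for bouts below the median and for
    bouts above the median. Bouts exactly at the median are dropped,
    which is an assumption of this sketch.
    """
    if len(in_values) != len(song_values):
        raise ValueError("need one IN value and one song value per bout")
    cut = median(in_values)
    low = [s for i, s in zip(in_values, song_values) if i < cut]
    high = [s for i, s in zip(in_values, song_values) if i > cut]
    if not low or not high:
        raise ValueError("median split left one group empty")
    return mean(low), mean(high)


# Hypothetical bouts from one bird: time-to-song (ms) and motif duration (ms).
time_to_song = [310.0, 420.0, 505.0, 610.0, 700.0]
motif_duration = [540.0, 538.0, 541.0, 539.0, 542.0]

low_mean, high_mean = median_split_means(time_to_song, motif_duration)
print(low_mean, high_mean)
```

Repeating this for every bird gives one (below-median, above-median) pair per bird, which is the paired structure that the sign-rank test in the table operates on.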

IN acoustic features share similarities with other syllable repeats but speeding up of intervals between INs is unique to IN repeats

Our current results suggest that none of the IN properties, namely, number, timing and acoustic features, represent motor preparation for features of the upcoming first song syllable. Further, IN acoustic features co-vary with song syllable acoustic features. One possible reason for this correlation in acoustic features is that INs, like song syllables, are vocalizations. In fact, a recent study has shown that mean IN number and IN acoustic features are learned by young zebra finches, similar to the learning of song syllables (Kalra et al., 2021). Unlike most song syllables, INs repeat, and this repetition may drive some of the changes in the properties of INs as they progress to song. To examine this possibility, we next asked if the properties of INs are more similar to the properties of song syllables that repeat. Such song syllable repeats are present in a small fraction of zebra finches (Fig. 8A top; example spectrogram with repeat of the song syllable ‘c’). Zebra finches also repeat calls (other non-song vocalizations that can be produced outside of song bouts for communication) outside of their song bouts (Fig. 8A bottom). To compare repeats across these three categories, we chose birds with motif syllable repeats. Neither the mean IN number nor the bout-to-bout variability of IN number differed significantly from the mean number and variability of repeats in the other categories (Fig. 8B, p=0.049 Kruskal-Wallis test followed by post-hoc Tukey-Kramer test, p = 0.31 for IN vs. song and p = 0.67 for IN vs. calls; Fig. 8C, p=0.20, Kruskal-Wallis test).

Fig. 8 — (A) Top - example spectrogram of a bird that produces repeating INs ‘i’ and repeating syllable ‘c’ within the motif. Bottom – example of a bird producing repeats of call syllables ‘z’ outside song. (B-E) Comparison of repeat number (B left), repeat number variability (B right), repeat interval (C), median ratio of repeat interval (D), and median ratio of acoustic features of successive syllables (E) between different types of repeats (motif repeats, IN repeats or call repeats). Circles represent data from individual birds, squares and whiskers represent the mean and SEM across birds. In C, lines join data from the same repeat syllable across different positions. ** denotes p < 0.01, Wilcoxon signed-rank test in (C); * denotes p < 0.05 and *** denotes p < 0.005, Kruskal-Wallis test followed by post-hoc Tukey-Kramer test in (D). Dashed lines in D and E represent a ratio of 1.

Intervals between repeat syllables differed between the three categories of repeats. For all categories of repeats, we compared the mean interval between the first two syllables in the repeat sequence (first interval) with the mean interval between the last syllable of the repeat sequence and the next syllable in the bout (last interval). As shown earlier (Rajan and Doupe, 2013; Rao et al., 2019), for IN repeats, the last interval was significantly shorter than the first interval (p=0.004, Wilcoxon signed-rank test) and the opposite was true for call repeats, with the last interval being significantly longer than the first interval (Fig. 8C, p=0.008, Wilcoxon signed-rank test). For motif syllable repeats, we observed differences depending on the position of the repeat within the song motif; for repeats that were the last syllable of the song motif, the last interval was significantly longer than the first interval (p=0.002, Wilcoxon signed-rank test), while for repeats that occurred at the beginning of the song motif, the last and first intervals were not significantly different (Fig. 8C). Thus, the speeding up of intervals between successive INs was a unique feature of INs (Fig. 8D - ratio of successive intervals is less than 1 only for IN repeats; mean +/− SE - 0.65+/−0.03). The acoustic features of all repeats changed from the first to the last syllable of the repeat, and this change was not significantly different between categories (Fig. 8E, p=0.71), suggesting that changes in acoustic features were a property of syllable repetition. Overall, these results show that bout-to-bout variation in the number of INs and variation in the acoustic properties of successive INs are properties of all syllable repeats. However, the speeding up of intervals between successive INs is not a property of all syllable repeats but is unique to INs at the beginning of the bout.
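The ratio of successive intervals used to summarize the change in intervals can be illustrated with a short sketch. The function name and the interval values below are hypothetical, and the sketch assumes that the intervals of one repeat sequence have already been measured and are listed in order, ending with the interval between the last repeat and the next syllable in the bout. A ratio below 1 means that an interval is shorter than the one before it.

```python
from statistics import median


def successive_interval_ratios(intervals_ms):
    """Return interval[k + 1] / interval[k] for one repeat sequence.

    intervals_ms lists the intervals in the order they occur; at least
    two positive intervals are needed to form one ratio.
    """
    if len(intervals_ms) < 2:
        raise ValueError("need at least two intervals")
    if any(gap <= 0 for gap in intervals_ms):
        raise ValueError("intervals must be positive")
    return [
        intervals_ms[k + 1] / intervals_ms[k]
        for k in range(len(intervals_ms) - 1)
    ]


# Hypothetical intervals (ms) that shorten with each repeat.
shortening_intervals = [140.0, 100.0, 80.0]
ratios = successive_interval_ratios(shortening_intervals)
median_ratio = median(ratios)
print(ratios, median_ratio)
```

For a sequence whose intervals stay constant or lengthen, the same function returns ratios at or above 1, which is the pattern reported here for call repeats and motif syllable repeats.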

Discussion

Here, we tested two predictions of the hypothesis that the variable repeats of INs that precede song represent motor preparation for the upcoming song. First, we examined bout-to-bout correlations between the first song syllable and three properties of INs, namely, (1) number before song, (2) timing of INs and (3) acoustic features of INs. Neither the number nor the timing of INs was correlated with features of the first song syllable. Log amplitude and mean frequency of the last IN were correlated with the corresponding acoustic features of the first song syllable. However, acoustic features of the first song syllable were also correlated with those of the first IN and the second song syllable. This suggested that INs are also vocalizations that are part of the song sequence and that IN-song correlations reflect global bout-to-bout variation in acoustic features. Second, we found an age-related increase in mean IN number along with an age-related increase in the speed of song. Age-related changes in INs and song were independent and were not predicted by pre-existing bout-to-bout IN-song correlations in the same sets of features. Finally, we compared IN repeats with other types of syllable repeats. Just like IN repeats, other syllable repeats also varied in number between bouts and varied in acoustic properties as the repeats progressed. The speeding up of intervals between INs with each successive repeat was unique to IN repeats, while the intervals between other kinds of syllable repeats remained constant or increased with each successive repeat. Overall, these results show that INs and song syllables share similarities in acoustic features that suggest shared neural control, while IN and song syllable timing are different, highlighting possible differences in the neural control of timing.

Song control pathways may control changes in IN acoustic features

Our results show correlations between IN acoustic properties and the corresponding acoustic properties of the first song syllable. While this is expected from the motor preparation hypothesis, we also found correlations between the first IN and the first song syllable and between the first and second song syllables, suggesting the presence of global correlations between syllables within a bout. Similar correlations in the duration of syllables within a song motif have been described earlier (Glaze and Troyer, 2006). Our results show that such correlations extend to INs as well, suggesting that INs are also syllables similar to song syllables. Consistent with this idea, we also found changes in acoustic features for song syllables that repeat within song and non-song vocalizations that repeat outside of song sequences. These results suggest that INs are controlled by the same neural circuitry that controls song syllable production.

What areas of the brain control song syllable and IN production? Song syllable structure and timing are known to be controlled by the song motor pathway consisting of premotor nucleus HVC (used as a proper name) and its projection to the robust nucleus of the arcopallium (RA) (Hahnloser et al., 2002; Long and Fee, 2008; Fee and Scharff, 2010). There is some evidence that INs might also be controlled by the same circuitry, albeit with some differences. HVC neurons and RA neurons show temporally precise bursts of neural activity during song syllables (McCasland, 1987; Yu and Margoliash, 1996; Hahnloser et al., 2002; Kozhevnikov and Fee, 2007; Long et al., 2010) and some of these neurons also show precise bursts of neural activity during INs (Hahnloser et al., 2002; Kozhevnikov and Fee, 2007; Rajan and Doupe, 2013). However, some HVC neurons also show differences in activity depending on the position of INs within the sequence (Rajan and Doupe, 2013), similar to HVC firing in other songbird species with more complex songs with syllable repeats (Fujimoto et al., 2011; Cohen et al., 2020). Cooling of HVC slows down song syllables at multiple timescales and also slows down INs, albeit by 70% of the extent of slowing down of song syllables (Long and Fee, 2008). Finally, complete bilateral lesions of HVC abolish the normal production of song syllables (Nottebohm et al., 1976; Simpson and Vicario, 1990; Aronov et al., 2008; Chen et al., 2014), but whether they entirely abolish INs is unclear. Outside of the song motor pathway, a recent study has also shown that lesions of the midbrain dopaminergic nucleus A11 make birds completely mute for song (Ben-Tov et al., 2023). Given the soft amplitude of initial INs, it is unclear whether these birds produce a few INs that do not progress normally to song. Overall, our results and the results of earlier studies suggest that IN production is partly controlled by the song motor pathway and partly controlled by neural circuits outside of the song motor pathway.

What controls the speeding up of gaps between INs?

Our results show that IN timing was not correlated with the timing of the upcoming song syllables. However, the speeding up of intervals between successive INs, which characterizes IN timing, was not a feature of other syllables that repeat. In fact, for other repeated syllables, intervals between successive syllables remained the same or became longer. This suggests that the speeding up of intervals between INs could indicate the time to song initiation (Rao et al., 2019). In primates, one aspect of neural preparatory activity is its consistent and strong correlation with the time at which the movement is initiated (Kaufman et al., 2016). Similarly, the speeding up of intervals between INs could reflect the readiness of specific brain regions to initiate song.
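
One way to make "speeding up" concrete is to take the ratio of each interval to the one before it, with ratios below 1 indicating shortening intervals. The Python sketch below is illustrative only and is not the authors' code: it assumes the interval is measured as the silent gap from one syllable's offset to the next syllable's onset, and the function names and onset/offset times (in ms) are made up.

```python
def gaps_between(onsets, offsets):
    """Silent gap from each syllable's offset to the next syllable's onset."""
    if len(onsets) != len(offsets):
        raise ValueError("onsets and offsets must have the same length")
    return [onsets[i + 1] - offsets[i] for i in range(len(onsets) - 1)]


def successive_ratios(intervals):
    """Ratio of each interval to the preceding one; below 1 means shortening."""
    return [intervals[i + 1] / intervals[i] for i in range(len(intervals) - 1)]


def speeds_up(intervals):
    """True only if every interval is shorter than the one before it."""
    ratios = successive_ratios(intervals)
    return len(ratios) > 0 and all(r < 1 for r in ratios)


# Hypothetical IN sequence: gaps shrink from one IN to the next.
in_onsets = [0, 200, 360, 490]
in_offsets = [50, 250, 410, 540]

# Hypothetical repeated syllable within song: gaps stay flat, then lengthen.
rep_onsets = [0, 150, 300, 460]
rep_offsets = [60, 210, 360, 520]

in_gaps = gaps_between(in_onsets, in_offsets)      # [150, 110, 80]
rep_gaps = gaps_between(rep_onsets, rep_offsets)   # [90, 90, 100]
```

With these made-up values, `speeds_up(in_gaps)` is true and `speeds_up(rep_gaps)` is false, which mirrors the qualitative contrast described above between INs and other repeated syllables.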

Which brain regions could be involved in this process? One candidate region is the premotor nucleus HVC and its inputs. As mentioned above, HVC controls the temporal dynamics of syllable production and is necessary for normal song production (Nottebohm et al., 1976; Simpson and Vicario, 1990; Aronov et al., 2008; Long and Fee, 2008). Intracellular recordings have shown that HVC_RA neurons (HVC neurons projecting to motor nucleus RA) are more depolarised during singing than during non-singing periods, and that this depolarisation begins during the INs (Long et al., 2010). This depolarisation could potentially come from dopaminergic input from the midbrain dopaminergic nucleus A11. In support of the idea that this dopaminergic input is important for IN progression to song, dopaminergic blockers in HVC make birds sing strings of INs directed towards female birds without progressing to song (Ben-Tov et al., 2023).

How could this depolarisation of HVC neurons during INs drive IN progression to song? Initiation of song syllables is believed to be driven by thalamic input from nucleus Uvaeformis (UVa) to HVC at the start of each song syllable (Moll et al., 2023). UVa neurons are also active at the onset of INs, suggesting the possibility that UVa input drives HVC at the start of each IN (Danish et al., 2017). The depolarisation state of HVC could play a role in determining the time at which UVa input successfully initiates an IN: the more depolarised the state of HVC, the sooner HVC neurons are activated, and this in turn could lead to a shorter interval between INs. In this model, the speeding up of INs would reflect the gradual reduction in the time taken by HVC neurons to respond to UVa input. This model also predicts that INs would not be necessary when HVC neurons are already depolarised. This is supported by the fact that successive songs within a bout can be initiated without INs (Sossinka and Böhner, 1980; Rajan and Doupe, 2013).
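
The qualitative model above can be turned into a toy calculation. The Python sketch below is purely illustrative and rests on assumptions that are not in the text: depolarisation is a single number in arbitrary units, UVa input is treated as a linear ramp towards a fixed threshold, and depolarisation rises by a fixed step after each IN. The function names and parameter values are hypothetical and are not fitted to any data.

```python
def response_latency(depolarisation, threshold=1.0, drive_rate=0.01):
    """Time for a linearly ramping input to lift HVC from its current
    depolarisation level to threshold (arbitrary units throughout)."""
    if drive_rate <= 0:
        raise ValueError("drive_rate must be positive")
    if depolarisation >= threshold:
        return 0.0  # already at threshold: no waiting time
    return (threshold - depolarisation) / drive_rate


def simulate_in_intervals(n_ins, start=0.0, step=0.2, threshold=1.0, drive_rate=0.01):
    """Latency before each of n_ins successive INs, assuming depolarisation
    rises by `step` after every IN and is capped at threshold."""
    latencies = []
    level = start
    for _ in range(n_ins):
        latencies.append(response_latency(level, threshold, drive_rate))
        level = min(level + step, threshold)
    return latencies


# Latencies shrink as the assumed depolarisation level builds up across INs.
latencies = simulate_in_intervals(4)
```

Under these assumptions the latencies decrease monotonically across successive INs, and `response_latency` returns zero once the level reaches threshold, which corresponds to the prediction that no INs are needed when HVC neurons are already depolarised. Any monotonic relationship between depolarisation and latency would give the same qualitative result; the linear ramp is chosen only for simplicity.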

Do INs have a preparatory function?

First, our results do not completely support the predictions of the motor preparation hypothesis, which was postulated based on neural activity in primates. This suggests that the parallels between INs reaching a consistent “state” before song initiation and primate preparatory neural activity reaching a consistent neural state before movement initiation are restricted to their correlations with the time to movement initiation. This interpretation predicts that disruption of INs and IN-related activity in HVC would only delay song initiation without affecting the features of the upcoming song. Further studies disrupting IN-related neural activity in HVC could be used to test this.

Second, our results showing similarities in the acoustic features of INs and other repeated syllables suggest that INs might be a special form of repeating syllable that occurs at the beginning of zebra finch song bouts. Other songbird species with more complex song bouts have syllable repeats that occur regularly within the song (Okanoya, 2004; Yarden et al., n.d.). However, two questions remain unanswered: (1) do IN-like repeating syllables occur at the beginning of song bouts of other songbird species, and (2) do the properties of IN repeats share similarities with the syllable repeats that are present in the song sequences of other songbird species? Further comparative analysis of song bouts across different songbird species could be used to address these questions.

Overall, our results show that IN acoustic structure and repetition share similarities with song syllables, suggesting shared neural control of IN acoustic structure and repeat number. The speeding up of intervals between INs is a unique feature of INs and could reflect different neural mechanisms controlling the timing of INs.

Supplementary Material

Supplement 1

Extended Data Table supporting Figure 2

Statistical test results for data from Fig. 2 to Fig. 8

media-1.xls (176 KB, xls)

Acknowledgments

We would like to thank Prakash Raut for bird care. We thank Shikha Kalra, Ananya Kumar, Aditi Agarwal, Vishruta Yawatkar, Harsha K Kumar, Sharvari Tamhankar and Gaurav Isola for labeled songs or songs recorded at different ages, and Harini Suri for sharing head-implanted microphone recordings. We thank Allison Doupe and Michael Brainard for song recordings of birds with repeat syllables. We also thank Mimi Kao, Michael Long and members of the Rajan lab for useful discussions. DR is currently at the Department of Neuroscience, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen N, Denmark.

Funding sources:

This work was supported by a DBT/Wellcome Trust India Alliance Senior Fellowship (IA/S/21/1/505621), a Department of Biotechnology (DBT) Ramalingaswami Fellowship (BT/HRD/35/02/2006), and grants from the Science and Engineering Research Board (EMR/2015/000829, CRG/2021/004690) and the Department of Science and Technology, Cognitive Science Research Initiative (DST/CSRI/2017/163) to RR; a National Institutes of Health grant (R01 MH55987) to Allison Doupe, in whose lab RR collected some of the songs that were used in this study; and a graduate student fellowship from IISER Pune, travel support from the Department of Biotechnology, Ministry of Science and Technology-Conference, Travel, Exhibition and Popular Lectures (DBT/CTEP/02/2018 0847433), and the Infosys Foundation Travel Award (IISER-P/InfyFnd/Trv/116) to DR.

Footnotes

Conflict of Interest:

The authors report no conflict of interest.

References

Aronov D, Andalman AS, Fee MS (2008) A specialized forebrain circuit for vocal babbling in the juvenile songbird. Science 320:630–634.
Ben-Tov M, Duarte F, Mooney R (2023) A neural hub for holistic courtship displays. Curr Biol 33:1640–1653.
Bottjer SW, Arnold AP (1984) The role of feedback from the vocal organ. I. Maintenance of stereotypical vocalizations by adult zebra finches. J Neurosci 4:2387–2396.
Brainard MS, Doupe AJ (2001) Postlearning consolidation of birdsong: stabilizing effects of age and anterior forebrain lesions. J Neurosci 21:2501–2517.
Chen JR, Stepanek L, Doupe AJ (2014) Differential contributions of basal ganglia and thalamus to song initiation, tempo, and structure. J Neurophysiol 111:248–257.
Churchland MM, Cunningham JP, Kaufman MT, Ryu SI, Shenoy KV (2010) Cortical preparatory activity: representation of movement or first cog in a dynamical machine? Neuron 68:387–400.
Churchland MM, Santhanam G, Shenoy KV (2006a) Preparatory activity in premotor and motor cortex reflects the speed of the upcoming reach. J Neurophysiol 96:3130–3146.
Churchland MM, Yu BM, Ryu SI, Santhanam G, Shenoy KV (2006b) Neural variability in premotor cortex provides a signature of motor preparation. J Neurosci 26:3697–3712.
Cohen Y, Shen J, Semu D, Leman DP, Liberti III WA, Perkins LN, Liberti DC, Kotton DN, Gardner TJ (2020) Hidden neural states underlie canary song syntax. Nature 582:539–544.
Danish HH, Aronov D, Fee MS (2017) Rhythmic syllable-related activity in a songbird motor thalamic nucleus necessary for learned vocalizations. PLoS One 12:e0169568.
Fee MS, Scharff C (2010) The songbird as a model for the generation and learning of complex sequential behaviors. ILAR J 51:362–377.
Fujimoto H, Hasegawa T, Watanabe D (2011) Neural Coding of Syntactic Structure in Learned Vocalizations in the Songbird. J Neurosci 31:10023–10033.
Glaze CM, Troyer TW (2006) Temporal structure in zebra finch song: implications for motor coding. J Neurosci 26:991–1005.
Glaze CM, Troyer TW (2013) Development of temporal structure in zebra finch song. J Neurophysiol 109:1025–1035.
Guo ZV, Li N, Huber D, Ophir E, Gutnisky D, Ting JT, Feng G, Svoboda K (2014) Flow of cortical activity underlying a tactile decision in mice. Neuron 81:179–194.
Hahnloser RHR, Kozhevnikov AA, Fee MS (2002) An ultra-sparse code underlies the generation of neural sequences in a songbird. Nature 419:65–70.
Immelmann K (1969) Song development in the zebra finch and other estrildid finches. Bird Vocalizations:61–77.
James LS, Sakata JT (2019) Developmental modulation and predictability of age-dependent vocal plasticity in adult zebra finches. Brain Res 1721:146336.
Kalra S, Yawatkar V, James LS, Sakata JT, Rajan R (2021) Introductory gestures before songbird vocal displays are shaped by learning and biological predispositions. Proc Biol Sci 288:20202796.
Kaufman MT, Seely JS, Sussillo D, Ryu SI, Shenoy KV, Churchland MM (2016) The Largest Response Component in the Motor Cortex Reflects Movement Timing but Not Movement Type. eNeuro 3.
Kozhevnikov AA, Fee MS (2007) Singing-related activity of identified HVC neurons in the zebra finch. J Neurophysiol 97:4271–4283.
Li N, Daie K, Svoboda K, Druckmann S (2016) Robust neuronal dynamics in premotor cortex during motor planning. Nature 532:459–464.
Long MA, Fee MS (2008) Using temperature to analyse temporal dynamics in the songbird motor pathway. Nature 456:189–194.
Long MA, Jin DZ, Fee MS (2010) Support for a synaptic chain model of neuronal sequence generation. Nature 468:394–399.
Mandelblat-Cerf Y, Fee MS (2014) An automated procedure for evaluating song imitation. PLoS One 9:e96484.
McCasland JS (1987) Neuronal control of bird song production. J Neurosci 7:23–39.
Moll FW, Kranz D, Corredera Asensio A, Elmaleh M, Ackert-Smith LA, Long MA (2023) Thalamus drives vocal onsets in the zebra finch courtship song. Nature 616:132–136.
Nottebohm F, Stokes TM, Leonard CM (1976) Central control of song in the canary, Serinus canarius. J Comp Neurol 165:457–486.
Okanoya K (2004) The Bengalese finch: a window on the behavioral neurobiology of birdsong syntax. Ann N Y Acad Sci 1016:724–735.
Price PH (1979) Developmental determinants of structure in zebra finch song. J Comp Physiol Psychol 93:260.
Pytte CL, Gerson M, Miller J, Kirn JR (2007) Increasing stereotypy in adult zebra finch song correlates with a declining rate of adult neurogenesis. Dev Neurobiol 67:1699–1720.
Rajan R, Doupe AJ (2013) Behavioral and neural signatures of readiness to initiate a learned motor sequence. Curr Biol 23:87–93.
Rao D, Kojima S, Rajan R (2019) Sensory feedback independent pre-song vocalizations correlate with time to song initiation. J Exp Biol 222.
Riehle A, Requin J (1989) Monkey primary motor and premotor cortex: single-cell activity related to prior information about direction and extent of an intended movement. J Neurophysiol 61:534–549.
Rosenbaum DA (1980) Human movement initiation: Specification of arm, direction, and extent. J Exp Psychol Gen 109:444.
Roy A, Mooney R (2007) Auditory Plasticity in a Basal Ganglia–Forebrain Pathway during Decrystallization of Adult Birdsong. J Neurosci 27:6374–6387.
Shenoy KV, Kaufman MT, Sahani M, Churchland MM (2011) A dynamical systems view of motor preparation: implications for neural prosthetic system design. Prog Brain Res 192:33–58.
Simpson HB, Vicario DS (1990) Brain pathways for learned and unlearned vocalizations differ in zebra finches. J Neurosci 10:1541–1556.
Sossinka R, Böhner J (1980) Song Types in the Zebra Finch Poephila guttata castanotis. Z Für Tierpsychol 53:123–132.
Suri H, Rajan R (2018) Distance-dependent changes in courtship song amplitude reflect song state changes. bioRxiv:277210.
Svoboda K, Li N (2018) Neural mechanisms of movement planning: motor cortex and beyond. Curr Opin Neurobiol 49:33–41.
Tchernichovski O, Nottebohm F, Ho CE, Pesaran B, Mitra PP (2000) A procedure for an automated measurement of song similarity. Anim Behav 59:1167–1176.
Vicario DS (1991) Contributions of syringeal muscles to respiration and vocalization in the zebra finch. J Neurobiol 22:63–73.
Williams H, McKibben JR (1992) Changes in stereotyped central motor patterns controlling vocalization are induced by peripheral nerve injury. Behav Neural Biol 57:67–78.
Yarden C, Jun S, Dawit S, Daniel PL, William AL 3rd, Derek CL, Darrell NK, Timothy JG (n.d.) Hidden Neural States Underlie Canary Song Syntax. Nature.
Yu AC, Margoliash D (1996) Temporal hierarchical control of singing in birds. Science 273:1871–1875.
Zann RA (1996) Zebra Finch. Oxford.
