(Fig. 4). a, Song examples throughout song development. Panels: i, subsong (49 dph); ii, emergence of protosyllable α from subsong (60 dph); iii, appearance of bout-onset element ε (63 dph); iv, fusion of ε with the first α to form new syllable β (67 dph); v–vi, acoustic differentiation of β and α, and incorporation with γ into song motif CBA (70, 90 dph); vii, tutor song. b, Schematic of syllable formation (same as Fig. 4a), inferred by tracking the adult syllables C, B and A backward in development. Early on, the protosyllable (labeled α) is produced rhythmically. The first protosyllable in each bout fuses with a brief bout-onset vocal element ε to form a new emerging syllable type β. Both α and β then undergo acoustic differentiation to form adult syllables A and B, respectively. (An additional syllable γ emerges at bout onset to form adult syllable C.) c, Developmental time course of the occurrence probability of different syllable types at bout onsets (mean ± s.e.m.). d, Syllable duration distribution showing three non-overlapping peaks (67 dph). Colored bars indicate the syllable duration ranges used for syllable labeling. This separation of durations allowed automatic determination of syllable identity. e, Pitch goodness trajectories of syllables α (red) and β (blue) at three stages of vocal development (median ± quartiles; n = 100 syllables per day). Black bar: region used to compute data in Fig. 4b. f, Example of a neuron active during both syllables α and β (HVCRA; 69 dph). Note that the activity of this neuron during syllable α was weak and did not quite reach our statistical criterion for being a ‘shared’ neuron.
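
The duration-based labeling described in panel d can be illustrated with a minimal sketch. It is not the authors' code. The range boundaries, the assignment of each range to α, β or γ, and the function names are all hypothetical, chosen only to show how non-overlapping duration ranges could yield an automatic label; syllables falling outside every range are left unlabeled.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical, non-overlapping duration ranges in milliseconds.
# These values and their mapping to syllable types are illustrative
# assumptions, not measurements taken from the figure.
DURATION_RANGES_MS: Dict[str, Tuple[float, float]] = {
    "gamma": (30.0, 60.0),
    "alpha": (80.0, 120.0),
    "beta": (150.0, 220.0),
}


def label_syllable(
    duration_ms: float,
    ranges: Dict[str, Tuple[float, float]] = DURATION_RANGES_MS,
) -> Optional[str]:
    """Return the label whose range contains duration_ms, or None."""
    for label, (low, high) in ranges.items():
        if low <= duration_ms <= high:
            return label
    return None  # duration falls between or outside the ranges


def label_syllables(durations_ms: List[float]) -> List[Optional[str]]:
    """Label every syllable in a list of durations."""
    return [label_syllable(d) for d in durations_ms]
```

For example, `label_syllables([45.0, 100.0, 180.0, 130.0])` assigns the first three durations to the hypothetical γ, α and β ranges and leaves the fourth unlabeled, because 130 ms lies in the gap between two ranges.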