Author manuscript; available in PMC: 2021 Apr 23.
Published in final edited form as: Nature. 2020 Jan 8;577(7791):526–530. doi: 10.1038/s41586-019-1892-x
Nearest neighbors reveal fast and slow components of motor learning
The publisher's version of this article is available at Nature
Abstract
Changes in behavior, due to environmental influences, development, and learning1–5, are commonly quantified based on a few hand-picked, domain-specific, features2–4,6,7 (e.g. the average pitch of acoustic vocalizations3) and assuming discrete classes of behaviors (e.g. distinct vocal syllables)2,3,8–10. Such methods generalize poorly across different behaviors and model systems and may miss important components of change. Here we present a more general account of behavioral change based on nearest-neighbor statistics11–13 and apply it to song development in a songbird, the zebra finch3. First, we introduce “repertoire dating”, whereby each rendition of a behavior (e.g. each vocalization) is assigned a repertoire time, reflecting when similar renditions were typical in the behavioral repertoire. Repertoire time (rT) isolates the components of vocal variability congruent with the long-term changes due to vocal learning and development and stratifies the behavioral repertoire into regressions (rT < true production time, t), anticipations (rT > t), and typical renditions (rT ≈ t). Second, we obtain a holistic, yet low-dimensional14, description of vocal change in terms of a stratified “behavioral trajectory”, revealing multiple, previously unrecognized, components of behavioral change on fast and slow timescales, as well as distinct patterns of overnight consolidation1,2,4,15,16. Diurnal changes in regressions undergo only weak consolidation, whereas anticipations and typical renditions consolidate fully. Because of its generality, our non-parametric description of how behavior evolves relative to itself, rather than relative to a potentially arbitrary, experimenter-defined, goal2,3,15,17 appears well-suited to compare learning and change across behaviors and species18,19, as well as biological and artificial systems5.
Zebra finches acquire complex, stereotyped vocalizations through a months-long process of sensory-motor learning3,20–22. We obtained dense audio recordings between 35 and 123 days post hatch (dph) in 5 male birds (73.4±18.6 consecutive days, mean±STD). Birds were isolated from other males after birth and, on average, live-tutored between dph 46 and 63 (Extended Data Fig. 1a). Band-pass-filtered (0.35–8 kHz) audio recordings were segmented into individual vocal renditions and represented as song spectrogram segments (Fig. 1a; 563,124–1,203,647 renditions per bird). Noise and isolated calls were excluded from the analyses. During development, syllable order, i.e. syntax, and the spectral structure of syllables evolve3. These two aspects of vocal learning may be mediated by largely independent mechanisms with distinct anatomical substrates23. Here we focus on characterizing the development of spectral structure.
Behavioral change in single features
Vocal development is often characterized by considering change in acoustic features such as pitch, frequency modulation3, or entropy variance2,15 (Fig. 1b). Such characterizations readily reveal multiple timescales of behavioral change—individual features can vary consistently within a day; can display overnight discontinuities; and can show drifts over the duration of weeks or months (Fig. 1b,c).
We summarize the relation between change at these different timescales through a consolidation index (Fig. 1c, CI), measuring whether within-day change in a feature (Fig. 1c, span) is maintained or lost overnight (Fig. 1c, shift). Weak consolidation2,15 corresponds to a CI close to -1 (no consolidation: shift = -span), strong consolidation4,16 to a CI close to 0 (perfect consolidation: shift = 0 days), and offline learning4,24,25 to a CI larger than 0. Across 32 commonly used acoustic features, CIs are mostly negative, indicating weak consolidation (Fig. 1d, top, median: -0.67). This finding holds even for random spectral features (Fig. 1d, bottom, median: -0.64) and is consistent with past accounts of song-development in zebra finches2,15.
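The relation between span, shift, and CI can be made concrete with a small sketch. The text above fixes only two anchor points (shift = -span gives CI = -1; shift = 0 gives CI = 0), so the ratio CI = shift / span used here is an assumption consistent with those anchors and not necessarily the exact definition in the Methods. The function name and the feature values are hypothetical.

```python
def consolidation_index(start_day1, end_day1, start_day2):
    """Assumed CI = shift / span for one feature across one night.

    span  = within-day change on day 1 (end minus start)
    shift = overnight change (start of day 2 minus end of day 1)
    """
    span = end_day1 - start_day1
    if span == 0:
        raise ValueError("span is zero; CI is undefined")
    shift = start_day2 - end_day1
    return shift / span


# Hypothetical feature values in arbitrary units.
print(consolidation_index(1.0, 2.0, 1.0))  # gain fully lost overnight: -1.0
print(consolidation_index(1.0, 2.0, 2.0))  # gain fully kept overnight: 0.0
print(consolidation_index(1.0, 2.0, 2.5))  # further overnight gain: 0.5
```

Under this assumed form, a negative CI only says that the feature moved back overnight; it does not by itself say whether the lost within-day change was aligned with slow developmental change.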
Individual features, however, may provide an incomplete account of change in a complex behavior such as song vocalizations. To illustrate this point, we consider three simple scenarios. In the first two (Fig. 1e,f), the change in behavior occurring within any given day largely mirrors, on a faster timescale, the slow change occurring over the course of many days or weeks. In the third scenario (Fig. 1g), within-day change is partly “misaligned” with slow change, i.e. it involves behavioral features that do not consistently change on slower timescales. Within-day change could reflect metabolic, neural, or other changes that are not necessarily congruent with longer-term learning or development; the slow change reflects long-term modifications in behavior typically equated with learning and development. We abstractly refer to these slow components as the direction of slow change (DiSC).
Critically, simulations of these scenarios show that negative CIs for single features can result from very different time courses of development (Fig. 1h, i). Negative CIs occur both when within-day and slow changes are closely aligned but daily gains along the DiSC are mostly lost overnight (Fig. 1f, weak consolidation), as well as when diurnal gains along the DiSC are perfectly consolidated, but within-day change is substantially misaligned with slow change (Fig. 1g). The broad distributions of indices observed during song development (Fig. 1d, top), which also include strongly positive indices, seem more consistent with the misaligned scenario (Fig. 1i, histogram 3).
Nearest-neighbor measures of change
We developed a general characterization of high-dimensional behavioral change, based on nearest-neighbor statistics12,13, that can distinguish between the scenarios in Fig. 1e-g. We initially analyze song-spectrogram segments of fixed duration aligned to syllable onset (Fig. 1a) but later extend our analysis to alternative parameterizations of vocalizations. Vocal renditions are represented as vectors xi ∈ ℝ^d (i indexes renditions), each associated with a production time, ti ∈ ℝ (e.g. the bird’s age when singing xi). The K-neighborhood of rendition xi is given by those K renditions (among the set of all renditions) that are closest to xi based on some metric (e.g. Euclidean distance). For small enough K, different syllable types do not mix within a neighborhood (Extended Data Fig. 1e, Fig. 3a) and neighborhood statistics are largely independent of cluster boundaries, obviating the need for clustering renditions into syllables.
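As a minimal illustration of the K-neighborhood defined above, the following sketch finds the K closest renditions by brute-force Euclidean distance. The toy vectors, the production times, and the choice to exclude a rendition from its own neighborhood are assumptions made for the example; real spectrogram segments are far higher-dimensional and would call for an efficient neighbor search.

```python
import math


def k_neighborhood(renditions, i, k):
    """Indices of the k renditions closest to renditions[i] (Euclidean).

    The rendition itself is excluded (an assumption of this sketch).
    """
    dists = []
    for j, x in enumerate(renditions):
        if j != i:
            dists.append((math.dist(renditions[i], x), j))
    dists.sort()
    return [j for _, j in dists[:k]]


# Hypothetical 2-d "renditions" with their production times (days post hatch).
renditions = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
production_times = [40, 41, 45, 60, 61]

neighbors = k_neighborhood(renditions, 0, 2)
print(neighbors)                                  # [1, 2]
print([production_times[j] for j in neighbors])   # [41, 45]
```

Only the ranking of distances around each rendition matters here, which is why a metric that is meaningful locally is sufficient.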
We visualize all vocalizations produced by a bird throughout development with Barnes-Hut-t-SNE11 (which predominantly preserves local neighborhoods11). Each point in the embedding corresponds to a spectrogram segment xi (Fig. 1a). Different locations correspond to different vocalization types (Fig. 2b, Extended Data Fig. 2a,d). The embedding suggests that vocalizations change from undifferentiated subsong3,21 (Fig. 2a, middle) to clearly differentiated syllables falling into at least four categories (Fig. 2a, syllables a, b, c and introductory note i; same labels as in Fig. 1a). The emergence of clustered syllables from un-clustered subsong is confirmed by standard clustering approaches (Fig. 2g, Extended Data Fig. 1). Notably, the embedding does not preserve all local structure in the data, as nearest neighbors in the embedding space are not necessarily nearest neighbors in the high-dimensional data space (Fig. 2a; black crosses: high-d neighbors). We therefore quantify behavioral change directly on the high-dimensional data by analyzing the composition of high-dimensional neighborhoods12,13 (Extended Data Fig. 2).
For each data point, we refer to the production times of all data points in its K-neighborhood as the “neighborhood production times” (or “neighborhood times”; Fig. 2a, inset). We summarize the neighborhood times of many data points (Fig. 2d) through “pooled neighborhood times” (Fig. 2c) and the “neighborhood mixing-matrix” (Fig. 2e; Extended Data Fig. 3d; Extended Data Fig. 2). Each value in the mixing matrix represents the similarity between behaviors from two production periods. Deviations from zero indicate that behaviors from the corresponding production periods are more (> 0) or less (< 0) similar (i.e. mixed at the level of K-neighborhoods) than expected from a shuffling null-hypothesis.
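One way to picture such a mixing matrix is to count, for each pair of production periods, how often a rendition from one period has a neighbor from the other, and to compare that count with what shuffled period labels would give. The normalization below, (observed - expected) / expected, is an assumption chosen so that values are zero under the null, positive for more mixing, and negative for less; the exact statistic used in the paper is defined in its Methods and may differ. All names and data are hypothetical.

```python
import math
from collections import Counter


def mixing_matrix(points, periods, k):
    """Assumed mixing value: (observed - expected) / expected neighbor counts."""
    n = len(points)
    labels = sorted(set(periods))
    size = Counter(periods)
    observed = Counter()
    for i in range(n):
        # Brute-force K-neighborhood of point i, excluding i itself.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            observed[(periods[i], periods[j])] += 1
    matrix = {}
    for p in labels:
        for q in labels:
            # Under shuffled period labels, each neighbor of a period-p point
            # comes from period q with probability (size[q] - [p == q]) / (n - 1).
            expected = size[p] * k * (size[q] - (p == q)) / (n - 1)
            diff = observed[(p, q)] - expected
            matrix[(p, q)] = diff / expected if expected else 0.0
    return matrix


# Two hypothetical production periods whose behaviors do not overlap at all.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
periods = [1, 1, 1, 2, 2, 2]
m = mixing_matrix(points, periods, k=2)
print(m[(1, 1)] > 0)  # True: same-period renditions mix more than chance
print(m[(1, 2)])      # -1.0: no cross-period neighbors at all
```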
We use multi-dimensional scaling14 to represent the similarity between behaviors from different production times as a “behavioral trajectory” (Fig. 2h). Each point on the trajectory represents the distribution of all vocalizations produced on a given day. Pairwise distances between points represent the dissimilarity between distributions (Extended Data Fig. 2). Here we focus on a 16-day phase of gradual change midway through development (Fig. 2f). During this phase, the behavioral trajectory is structured differently on fast and slow timescales (Extended Data Fig. 3f-h). The 2d-projection of the trajectory that explains maximal variance mainly reflects the direction of slow change (DiSC, Fig. 2h; Fig. 1e-g).
The behavioral trajectory summarizes the progressive differentiation of vocalizations into distinct syllables, as well as simultaneous, continuous, change in many spectral features of individual syllables (Extended Data Fig. 7). Notably, change is characterized through the behavioral trajectory by comparing the bird’s song to itself across time, rather than to a tutor song. Thus the behavioral trajectory may also reflect innate song priors that can result in crystallized song deviating from the tutor song26 and additional change due to other developmental processes27.
Repertoire extent and consolidation
t-SNE suggests that renditions from nearby days overlap considerably, whereby changes occurring within a day partly mimic the slow change across days (Extended Data Fig. 2b,c). We quantify this apparent spread along the DiSC, reflecting different degrees of behavioral “maturity”, through neighborhood times (Fig. 2d). We refer to behavioral renditions that predominantly have neighbors produced in the future as “anticipations” and to renditions that predominantly have neighbors that were produced in the past as “regressions” (Extended Data Fig. 3b). By contrast, renditions that are “typical” for a given developmental stage mostly have neighbors produced on the same or nearby days. We denote the median neighborhood time as the “repertoire time” (rT) of a rendition. The repertoire time effectively places each rendition along the DiSC (Fig. 2d, horizontal axis), i.e. dates it with respect to the progression of vocal development (“repertoire dating”). A broad distribution of repertoire times across all renditions in a day (Fig. 2d) suggests considerable behavioral variability along the DiSC: the most extreme regressions are backdated more than 10 days into the past, and the most extreme anticipations are postdated more than 10 days into the future (Fig. 2d, rT).
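Repertoire dating as described above reduces to taking the median of a rendition's neighborhood times and comparing it with the true production time t. In the sketch below, the median follows the definition in the text, whereas the `tolerance` window that decides what counts as rT ≈ t is a hypothetical parameter introduced only for illustration, as are the example neighborhood times.

```python
from statistics import median


def repertoire_time(neighborhood_times):
    """Repertoire time (rT): median production time of the K neighbors."""
    return median(neighborhood_times)


def classify(rT, t, tolerance=1.0):
    """Label a rendition relative to its production time t (in days).

    `tolerance` is an assumed width for "rT approximately equal to t".
    """
    if rT < t - tolerance:
        return "regression"
    if rT > t + tolerance:
        return "anticipation"
    return "typical"


t = 60  # hypothetical production day (dph) of three renditions
for times in ([48, 50, 52, 59, 61], [59, 60, 60, 61, 62], [60, 68, 70, 72, 75]):
    rT = repertoire_time(times)
    print(rT, classify(rT, t))
# 52 regression
# 60 typical
# 70 anticipation
```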
To quantify behavioral change on the time scale of hours, we subdivide each day into 10 consecutive periods, and compute pooled neighborhood times separately for each period. The percentiles of the pooled neighborhood times chart the evolution of behavior within and across days throughout development (Fig. 3a). Each repertoire dating percentile is akin to a learning curve for a part of the behavioral repertoire (e.g. typical renditions, 50th percentile; extreme anticipations, 95th percentile). The evolution of each percentile captures the progress along the DiSC (Fig. 3a, vertical axis) over time (Fig. 3a, horizontal axis). We validated this characterization of behavioral change on simulated behavior mimicking vocal development (Extended Data Fig. 4a-d).
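The percentile curves can be sketched as follows: renditions of one day are split into consecutive periods, the neighborhood times of all renditions in a period are pooled, and percentiles of the pooled values are read off. Splitting into periods of equal rendition count (rather than equal clock time), the linear-interpolation percentile, and the toy data are assumptions of this example.

```python
def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between sorted values."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def period_percentiles(neighborhood_times, n_periods, qs=(5, 25, 50, 75, 95)):
    """Percentiles of pooled neighborhood times for each period of a day.

    neighborhood_times: one list of neighbor production times per rendition,
    ordered by production time within the day.
    """
    n = len(neighborhood_times)
    if n < n_periods:
        raise ValueError("need at least one rendition per period")
    curves = []
    for p in range(n_periods):
        chunk = neighborhood_times[p * n // n_periods:(p + 1) * n // n_periods]
        pooled = [t for times in chunk for t in times]
        curves.append({q: percentile(pooled, q) for q in qs})
    return curves


# Four hypothetical renditions from one day, split into two periods.
day = [[58, 59, 60], [59, 60, 61], [60, 61, 62], [61, 62, 63]]
curves = period_percentiles(day, n_periods=2)
print(curves[0][50], curves[1][50])  # 59.5 61.5
```

In this toy case the 50th percentile rises from the first to the second period, which corresponds to typical renditions moving forward along the DiSC within the day.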
The repertoire-dating percentiles reveal that typical renditions move gradually along the DiSC throughout the day and changes along the DiSC acquired during the day are, on average, fully consolidated overnight (Fig. 3a,b; red). Anticipations undergo a similar or smaller degree of within-day-change (Fig. 3a, b; 75th and 95th percentiles) whereas regressions move by a larger distance within each day, but this change is only weakly consolidated overnight (Fig. 3a,b; 5th and 25th percentiles; Fig 3e). The most “immature” renditions thus improve markedly throughout a day, more than typical renditions or anticipations, but these improvements are mostly lost overnight. This pattern of change seems characteristic of development, as it is absent in adults (Extended Data Fig. 6).
Movement along the DiSC also occurs on faster timescales than hours, namely within bouts of singing, i.e. groups of vocalizations preceded and followed by a pause (average bout duration: 3.81±0.83s across birds). We subdivide each bout into 10 consecutive periods, compute pooled neighborhood times for each period (over all bouts in a day), and track change through the corresponding percentiles (Fig. 3c,d). Within bouts, large changes along the DiSC occur at the regressive tail of the behavioral repertoire—vocalizations are most regressive at the onset and offset of bouts (Fig. 3c,d; 5th percentile). Similar, albeit weaker, changes occur for typical renditions (Fig. 3c,d; red). The same apparent changes in song maturity are observed when short and long bouts (durations 2.30±0.54s vs. 6.28±1.73s) are considered separately. Song maturity thus decreases at the end of a bout, not after a fixed time into the bout (Extended Data Fig. 5a-c).
Misaligned behavioral components
The repertoire time reveals within-day and within-bout changes that mirror, on a faster timescale, changes occurring also over many days (see Methods). As above (Fig. 1), we refer to such components of change as being aligned with the DiSC, and to components that are not reflected in the repertoire time as being misaligned.
We identify both aligned and misaligned components of change through the “stratified mixing matrix”, which combines a neighborhood mixing-matrix (e.g. Fig. 2f) with repertoire dating. Each day’s behavioral repertoire is binned into 5 consecutive production periods. Within each period, the behavioral repertoire is subdivided into 5 strata, based on repertoire time (Fig. 2e, quintiles). All renditions from a day thus fall into 5 × 5 = 25 bins. The stratified mixing matrix measures similarity between 50 bins combining data from two adjacent days (Fig. 3g). We compare the measured stratified mixing matrix with simulations differing with respect to how within-day change and change across adjacent days align with the DiSC (Fig. 3f; Extended Data Fig. 4e-j). In model 1, development is “1-dimensional”, i.e. aligned with the DiSC (Fig. 3f, top; similar to Fig. 1e). In model 2, within-day change involves a component not aligned with the DiSC (Fig. 3f, middle; similar to Fig. 1g). In model 3, adjacent days are separated not only along the DiSC, but also along a direction orthogonal to both the DiSC and the direction of within-day change (Fig. 3f, bottom, across-day change). Prominent “stripes” along every other diagonal in the measured mixing matrix (Fig. 3g) indicate a larger similarity between renditions from the same day than between renditions from adjacent days, as predicted by model 3, suggesting that several misaligned components contribute to change at fast timescales.
Based on the stratified mixing-matrix, we infer stratified behavioral trajectories. The 2d-projection capturing most of the variance due to strata (Fig. 3h) resembles Fig. 2h and reflects the DiSC. Consistent with repertoire-dating, behavioral change along the DiSC between adjacent days (Fig. 3h, blue vs. red for each stratum) is small compared to the spread of one day’s behavior along the DiSC (e.g. blue points, strata 1-5). For each stratum, however, much of the change occurring within a day is misaligned with the DiSC (Fig. 3i,k; early vs. late separated along orthogonal dimension of within-day change). Yet another misaligned component is necessary to appropriately capture change across adjacent days (Fig. 3j). These properties of aligned and misaligned components are replicated by a linear analysis based on spectral features chosen to capture change at specific timescales (Extended Data Fig. 7, 8) and are robust to how song is parameterized and segmented, and to how nearest neighbors are defined (Extended Data Fig. 9, 10).
Discussion
Our analysis of high-dimensional vocalizations reveals that the developmental trajectory does not merely reflect an underlying one-dimensional process. Single behavioral features in isolation therefore provide an incomplete account of behavioral change during development and learning. The weak consolidation observed here (Fig. 1d) and elsewhere2,15 at the level of single features appears to reflect prominent misaligned components of within-day change rather than weak consolidation along the DiSC (Fig. 1h). Strong overnight consolidation along the DiSC across much of the behavioral repertoire (Fig. 3a,b) seems consistent with consolidation patterns observed for skilled motor learning in humans24,25,28 and of motor adaptation in humans1,19 and birds4.
Our characterization of behavior based on nearest-neighbor statistics can be applied when no accurate parametric model of the behavior is known, as currently is the case for most natural, complex behaviors. The approach is largely complementary to methods that rely on clustering behavior into distinct categories2,3,10,29. Foregoing an explicit clustering of the data can be advantageous for three reasons: assuming the existence of clusters can be an unwarranted approximation30; clustering may impede the characterization of behavior that appears not clustered (such as juvenile zebra finch song; Extended Data Fig. 1); and determining correct cluster boundaries is in general an ill-defined problem30. Importantly, our analyses require only an indicator function that selects nearest neighbors (here based on a “locally meaningful” distance metric), a much weaker requirement than a globally valid distance metric or the existence of a low-dimensional feature space that maps behavioral space11. These properties make repertoire dating applicable to almost any behavior and other high-dimensional datasets, including data characterized by “labels” other than production time. Repertoire dating may thus provide a general account of learning and change amenable to comparisons between different behaviors and model systems, including different species18 and artificial systems5.
We thank Joshua Herbst and Ziqiang Huang for performing the tutoring experiments. We also thank Anja Zai, Simone Surace, Adrian Huber, Ioana Calangiu, and Kevan Martin for discussions of the manuscript.
Funding
This work was supported by grants from the Simons Foundation (SCGB 328189, VM; SCGB 543013, VM) and the Swiss National Science Foundation (SNSF PP00P3_157539, VM; SNSF 31003A_182638, RH).
Footnotes
Ethics Oversight
All experimental procedures were approved by the Veterinary Office of the Canton of Zurich.
Author Contributions
S.K. conceived of the approach. S.K. and V.M. performed analyses. S.K., V.M., and R.H. wrote the paper. R.H. conceived and supervised collection of the behavioral data.
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
1. Brashers-Krug T, Shadmehr R, Bizzi E. Consolidation in human motor memory. Nature. 1996;382:252–255. doi: 10.1038/382252a0.
2. Derégnaucourt S, Mitra PP, Fehér O, Pytte C, Tchernichovski O. How sleep affects the developmental learning of bird song. Nature. 2005;433:710–6. doi: 10.1038/nature03275.
3. Tchernichovski O, Mitra PP, Lints T, Nottebohm F. Dynamics of the vocal imitation process: how a zebra finch learns its song. Science. 2001;291:2564–9. doi: 10.1126/science.1058522.
4. Andalman AS, Fee MS. A basal ganglia-forebrain circuit in the songbird biases motor output to avoid vocal errors. Proc Natl Acad Sci U S A. 2009;106:12518–23. doi: 10.1073/pnas.0903214106.
5. Arulkumaran K, Deisenroth MP, Brundage M, Bharath AA. Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine. 2017. doi: 10.1109/MSP.2017.2743240.
6. Ingram JN, Flanagan JR, Wolpert DM. Context-Dependent Decay of Motor Memories during Skill Acquisition. Curr Biol. 2013;23:1107–1112. doi: 10.1016/j.cub.2013.04.079.
7. Klaus A, et al. The Spatiotemporal Organization of the Striatum Encodes Action Space. Neuron. 2017;95:1171–1180.e7. doi: 10.1016/j.neuron.2017.08.015.
8. Han S, Taralova E, Dupre C, Yuste R. Comprehensive machine learning analysis of Hydra behavior reveals a stable basal behavioral repertoire. Elife. 2018;7. doi: 10.7554/eLife.32605.
9. Egnor SER, Branson K. Computational Analysis of Behavior. Annu Rev Neurosci. 2016;39:217–236. doi: 10.1146/annurev-neuro-070815-013845.
11. van der Maaten L. Accelerating t-SNE using Tree-Based Algorithms. J Mach Learn Res. 2014;15:3221–3245.
12. Chen H, Friedman JJH. A new graph-based two-sample test for multivariate and object data. J Am Stat Assoc. 2016;1459:1–41.
13. Hawks M. Graph-theoretic statistical methods for detecting and localizing distributional change in multivariate data. 2015.
14. Kruskal JB. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika. 1964;29:1–27.
15. Shank SS, Margoliash D. Sleep and sensorimotor integration during early vocal learning in a songbird. Nature. 2009;458:73–7. doi: 10.1038/nature07615.
16. Fenn KM, Nusbaum HC, Margoliash D. Consolidation during sleep of perceptual learning of spoken language. Nature. 2003;425:614–616. doi: 10.1038/nature01951.
17. Tchernichovski O, Nottebohm F, Ho C, Pesaran B, Mitra P. A procedure for an automated measurement of song similarity. Anim Behav. 2000;59:1167–1176. doi: 10.1006/anbe.1999.1416.
18. Anderson DJJ, Perona P. Toward a science of computational ethology. Neuron. 2014;84:18–31.
19. Krakauer JW, Shadmehr R. Consolidation of motor memory. Trends Neurosci. 2006;29:58–64. doi: 10.1016/j.tins.2005.10.003.
20. Brainard MS, Doupe AJ. What songbirds teach us about learning. Nature. 2002;417:351–8. doi: 10.1038/417351a.
22. Dhawale AK, Smith MA, Ölveczky BP. The Role of Variability in Motor Learning. Annu Rev Neurosci. 2017;40:479–498. doi: 10.1146/annurev-neuro-072116-031548.
23. Lipkind D, et al. Songbirds work around computational complexity by learning song vocabulary independently of sequence. Nat Commun. 2017;8:1247. doi: 10.1038/s41467-017-01436-0.
24. Korman M, et al. Daytime sleep condenses the time course of motor memory consolidation. Nat Neurosci. 2007;10:1206–1213. doi: 10.1038/nn1959.
25. Fischer S, Hallschmid M, Elsner AL, Born J. Sleep forms memory for finger skills. Proc Natl Acad Sci U S A. 2002;99:11987–91. doi: 10.1073/pnas.182178199.
26. Fehér O, Wang H, Saar S, Mitra PP, Tchernichovski O. De novo establishment of wild-type song culture in the zebra finch. Nature. 2009;459:564–568. doi: 10.1038/nature07994.
27. Adam I, Elemans CPH. Vocal Motor Performance in Birdsong Requires Brain-Body Interaction. eNeuro. 2019;6. doi: 10.1523/ENEURO.0053-19.2019.
28. Walker MP, Brakefield T, Allan Hobson J, Stickgold R. Dissociable stages of human memory consolidation and reconsolidation. Nature. 2003;425:616–620. doi: 10.1038/nature01930.
29. Vogelstein JT, et al. Discovery of brainwide neural-behavioral maps via multiscale unsupervised structure learning. Science. 2014;344:386–92. doi: 10.1126/science.1250298.
30. Fahad A, et al. A survey of clustering algorithms for big data: Taxonomy and empirical analysis. IEEE Trans Emerg Top Comput. 2014;2:267–279.