2018 Jun 20;4(6):eaaq1084. doi: 10.1126/sciadv.aaq1084

Fig. 1. Outline of bioacoustic methodology.

We present two analytical approaches: supervised and unsupervised classification. Both rely on the same initial statistical characterization of the acoustic data set to identify songbird vocalizations, regardless of species. The supervised approach used a linear classifier, trained on a subset of listener-determined scores (<1% of the data set), to classify every 4-s segment of the data set for the presence or absence of songbird vocalizations. We used the proportion of segments per day containing songbird vocalizations as a relative score, referred to as the VAI. We estimated arrival dates as the first date on which the VAI exceeded 50% of its maximum value. The unsupervised approach used a series of signal processing and machine learning techniques to cluster the acoustic data into potential physical sources (for example, vocalizations, wind, and trucks) without training from listener input. Because the number of physical sources is not known a priori, we initially clustered the data into 100 clusters. We then performed principal components analysis on the histograms of cluster assignments to reduce the data to 20 dimensions. We estimated arrival dates as the optimal segmentation boundary in the principal components, as measured by the fit of Gaussian distributions on either side of the boundary (see the Supplementary Materials).
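
The two arrival-date rules described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy data, and the `min_len` parameter are hypothetical. The segmentation step makes two simplifying assumptions: it works on a single one-dimensional daily series (for example, one principal component) rather than 20 dimensions, and it measures "fit" as the summed Gaussian log-likelihood of the two sides, which may differ from the criterion given in the Supplementary Materials.

```python
import math


def vocal_activity_index(flags_by_day):
    """Proportion of segments per day flagged (1) as containing vocalizations."""
    return [sum(flags) / len(flags) for flags in flags_by_day]


def arrival_by_threshold(vai, fraction=0.5):
    """Index of the first day whose VAI exceeds `fraction` of the maximum VAI."""
    threshold = fraction * max(vai)
    for day, value in enumerate(vai):
        if value > threshold:
            return day
    return None


def gaussian_log_likelihood(values, min_var=1e-6):
    """Log-likelihood of `values` under a Gaussian fitted to those same values.

    The variance is floored at `min_var` (an assumed safeguard) so that a
    constant run of values does not produce a division by zero.
    """
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    var = max(sum_sq / n, min_var)
    return -0.5 * n * math.log(2 * math.pi * var) - sum_sq / (2 * var)


def arrival_by_segmentation(series, min_len=2):
    """Boundary index that maximizes the summed Gaussian fit of both sides.

    Assumption: one boundary, one dimension, at least `min_len` days per side.
    The returned index is the first day of the second segment.
    """
    best_day, best_ll = None, -math.inf
    for boundary in range(min_len, len(series) - min_len + 1):
        ll = (gaussian_log_likelihood(series[:boundary])
              + gaussian_log_likelihood(series[boundary:]))
        if ll > best_ll:
            best_day, best_ll = boundary, ll
    return best_day


# Toy data (hypothetical): per-day presence/absence flags for 4-s segments.
flags_by_day = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 1],
]
vai = vocal_activity_index(flags_by_day)
print(vai, arrival_by_threshold(vai))

# Toy data (hypothetical): one principal-component score per day.
component = [0.1, -0.1, 0.0, 0.2, 3.1, 2.9, 3.0, 3.2]
print(arrival_by_segmentation(component))
```

In the supervised sketch the flags stand in for the output of the linear classifier. In the unsupervised sketch the series stands in for the principal components derived from daily histograms of cluster assignments.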