This is a preprint.

It has not yet been peer reviewed by a journal.

The National Library of Medicine is running a pilot to include preprints that result from research funded by NIH in PMC and PubMed.

[Preprint]. 2023 Sep 7:arXiv:2309.04044v1. [Version 1]

Capturing continuous, long timescale behavioral changes in Drosophila melanogaster postural data

Grace C McKenzie-Smith 1, Scott W Wolf 2, Julien F Ayroles 2,3, Joshua W Shaevitz 1,2
PMCID: PMC10508836  PMID: 37731659

Abstract

Animal behavior spans many timescales, from short, seconds-scale actions to circadian rhythms over many hours to life-long changes during aging. Most quantitative behavior studies have focused on short-timescale behaviors such as locomotion and grooming. Analysis of these data suggests there exists a hierarchy of timescales; however, the limited duration of these experiments prevents the investigation of the full temporal structure. To access longer timescales of behavior, we continuously recorded individual Drosophila melanogaster at 100 frames per second for up to 7 days at a time in featureless arenas on sucrose-agarose media. We use the deep learning framework SLEAP to produce a full-body postural data set for 47 individuals resulting in nearly 2 billion pose instances. We identify stereotyped behaviors such as grooming, proboscis extension, and locomotion and use the resulting ethograms to explore how the flies’ behavior varies across time of day and days in the experiment. We find distinct circadian patterns in all of our stereotyped behavior and also see changes in behavior over the course of the experiment as the flies weaken and die.

Keywords: behavioral tracking, pose estimation, circadian rhythms, aging

Introduction

Uncovering the temporal structure of behavior has long been a topic of theoretical interest and experimental challenge [1–4]. Animals carry out sequences of behaviors on many timescales, from the short timescales of the individual movements required for grooming, eating, and social communication to the longer timescales of hunger, arousal, circadian cycles, mating seasons, and the aging process. The specifics of these behavior sequences determine much of what we can characterize about an animal, such as its health, reproductive fitness, and that idiosyncrasy of action that we might call ‘personality.’ These behavior sequences also give us indirect ways to assess the internal processes of the animal, such as neural activity, gene expression, and other internal states like hunger or fatigue. Finding general principles that govern the order of behaviors would be an exciting step forward in understanding how animals interact with the world around them and how internal factors may shape that interaction. This course of study requires data that covers the many timescales over which animal behavior varies.

Historically, taking long-timescale data covering days or weeks of an animal’s life has required balancing continuity, throughput, and dimensionality. In Drosophila melanogaster, simple experimental setups, such as beam-break assays, allow for continuous monitoring of activity levels over days [5, 6], but fail to capture the high-resolution data necessary for modern techniques of behavior analysis such as MotionMapper [7], B-SOiD [8], VAME [9], or Keypoint-MoSeq [10]. On the other hand, the acquisition of high-resolution data has been restricted to short timescales by the computational resources required to store and process the extremely large imaging data, imposing an upper limit on the order of an hour. When studying fine-grained behavioral variation at longer timescales, previous work utilized short recordings taken from different individuals with ages distributed across the lifespan of the animal [11].

Here, we leverage recent computational advances to record a high-resolution continuous data set of D. melanogaster behavior spanning 4–8 days. We recorded 47 freely moving D. melanogaster using constant IR illumination and an IR-sensitive camera at a frame rate of 100 Hz, with a 12-hour visible-light day/night cycle. We tracked 14 body parts from each fly using SLEAP [12] and utilized MotionMapper to characterize stereotyped behavioral states, such as grooming, locomoting, and feeding. Using techniques of compositional data analysis [13], we characterize the dynamics of this behavioral repertoire across time of day and over the days of the experiment. We find distinct circadian patterns in all measured behaviors, including grooming, proboscis extension, and locomotion speed. We see an overall decline in circadianicity, the difference in behavior between day and night hours, across days in the experiment as flies weaken and die, and see general declines in feeding and locomotion speed as the fraction of time spent in an idle state increases. Overall, we find that our data captures both expected and novel patterns of D. melanogaster behavior across multiple 24-hour periods. We also provide this data to the broader community as a resource to study D. melanogaster behavior as it evolves along timescales beyond the scope of previous research.

Results and discussion

We designed a recording apparatus to allow for continuous capture of D. melanogaster behavior over the course of days (see Methods for details and Figure S1A). D. melanogaster were constantly illuminated from above with IR light, to which they have minimal visual sensitivity [14], while LED panels provided a 12-hour visible-light day/night cycle with the same on/off times under which the animals were raised. We made arenas by layering pieces of transparent laser-cut acrylic to create cylindrical chambers in which flies lived and behaved over the course of our experiments (Figure S1B). We limited the arenas to 1.5 mm in height to prevent flying and to decrease the incidence of wall walking and ceiling walking, which lower tracking quality. We provided the flies with a base gel layer of sucrose-agarose, which permitted survival of up to 7 days while preventing the significant fungal growth observed when yeast extract was included. We recorded four freely behaving D. melanogaster in individual chambers per camera at 100 Hz with a resolution of 28.25 pixels/mm (Figure 1A). This is sufficient to resolve relevant features of the D. melanogaster body such as the tarsi (leg tips) and proboscis (Figure 1B).

Figure 1.

Experimental schematic showing tracking, lifespans, and behavioral segmentations across timescales. A Image showing the experimental arena as viewed from below. The behavior of 4 D. melanogaster is captured simultaneously while giving each fly enough room to freely carry out all behaviors except flight. B Magnified view of a single individual showing tracks for each node of the SLEAP skeleton. Each color denotes a node and circle sizes increase with time. C Survival curve of the 47 flies included in the experiment. Death occurs on average after ~119 hours, or almost 5 full days into the experiment. D Ethogram and egocentrized traces for each tarsus and a raster denoting proboscis visibility. E Barplot showing the geometric means of stereotyped behavior components across all flies and all complete 24h periods. F Barplot showing the geometric means of stereotyped behavior components across all flies and all hours grouped by experimental day.

For each experiment, we imaged male isoKH11 D. melanogaster from two days post-eclosion (emergence from the pupa as an adult insect) until death, yielding 4–8 days of continuous recording with half the flies dying by Day 5 (Figure 1C), for a total of 5,584 fly-hours. Note that this lifespan is shorter than conventional assays due to the nutrient-limited sucrose-based food source we used to avoid fungal growth [15].
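
As a rough consistency check, the totals quoted in the text can be multiplied out. The sketch below uses only figures stated above (5,584 fly-hours, 100 frames per second, 47 flies); the variable names are illustrative, and it assumes one pose instance per recorded frame.

```python
# Back-of-the-envelope check of the data set size, using figures quoted in the text.
FLY_HOURS = 5584        # total recording time summed over all flies
FRAME_RATE_HZ = 100     # frames per second
N_FLIES = 47            # individuals in the data set

# Assumption: one pose instance per recorded frame.
total_frames = FLY_HOURS * 3600 * FRAME_RATE_HZ
mean_hours_per_fly = FLY_HOURS / N_FLIES

print(f"{total_frames:,} frames")             # 2,010,240,000 frames
print(f"{mean_hours_per_fly:.1f} h per fly")  # 118.8 h per fly
```

Under that assumption the frame count is on the order of the roughly 2 billion pose instances reported in the abstract, and the mean recording time per fly is close to the ~119-hour average time to death given in Figure 1C.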

In the natural world, daytime conditions increase lighting and temperature, but in the lab, the D. melanogaster circadian cycle can be maintained by these factors in isolation, with a day/night lighting cycle under constant temperature or a temperature cycle under constant lighting conditions [16–18]. Our experiments have both a change in light intensity and temperature conditions between day and night, with daytime temperature levels varying between different experiments (~28–29 °C for experiments 1–2, and ~30–31 °C for experiments 3–4) and nighttime temperatures settling to ~27 °C for all experiments (see Methods and Figure S9). We provide temperature and humidity recordings with our dataset.

To extract postural information from our data, we used SLEAP, a deep-learning-based framework that can infer animal posture based on user-generated training data [12] (see Methods for details). We tracked a 14-point skeleton comprising the head, eyes, thorax, abdomen, wing tips, tarsi, and proboscis of each individual (Figure 1B). While our mean localization error was less than 0.1 mm (Figure S2), the quality of the tracks decreased when animals walked along the edges of the arenas. Accordingly, we built a classifier to identify the time points when flies walked on the edges (Figure S3), and excluded these time points from the portions of our analysis reliant on accurate tracking of any body part but the thorax.

To quantify discrete, stereotyped behaviors, we modified the MotionMapper pipeline [7] to parallelize more steps and optimize use for postural data instead of raw images (Figure S4). We used the Lomb-Scargle periodogram rather than a continuous Morlet wavelet transform to generate power spectra for each body part, as this algorithm does not require interpolation of missing data. As a first step, we classified all time points with a total power of less than 0.5012 mm², summed over all tracked positions, as ‘idle’, i.e. times where the flies are not moving at all. We exclude all time points classified as idle and all non-idle edge time points from the spectral analysis that follows. ‘Idle’ and ‘non-idle edge’ then become their own behavioral categories.
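
To illustrate why a Lomb-Scargle periodogram tolerates missing samples, the following minimal sketch implements the classical Lomb-Scargle formula in pure Python and applies it to a synthetic trace with a gap. It is not the modified MotionMapper pipeline: the function name, the frequency grid, the power normalization, and the synthetic 5 Hz signal are all assumptions made for illustration.

```python
import math

def lomb_scargle(times, values, freqs_hz):
    """Classical Lomb-Scargle power at each frequency for unevenly sampled data."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    powers = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f
        # The offset tau makes the sine and cosine terms orthogonal on these samples.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        yc = sum(y * c for y, c in zip(centered, cos_terms))
        ys = sum(y * s for y, s in zip(centered, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        powers.append(0.5 * (yc * yc / cc + ys * ys / ss))
    return powers

# Synthetic example (hypothetical data): a 5 Hz oscillation sampled at 100 Hz,
# with frames 60-89 missing, as might happen when a body part is occluded.
frames = [k for k in range(200) if not 60 <= k < 90]
times = [k / 100.0 for k in frames]
values = [math.sin(2 * math.pi * 5.0 * t) for t in times]
freqs = [float(f) for f in range(1, 21)]

power = lomb_scargle(times, values, freqs)
peak_freq = freqs[power.index(max(power))]
print(peak_freq)
```

Because the periodogram is evaluated directly at the observed sample times, the gap is simply skipped rather than interpolated, and the spectral peak still falls at the oscillation frequency.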

Our total amount of data is too large to allow for direct classification of behaviors from all time points. Instead, we generated a set of 141 one-hour videos sampled evenly across flies, time of day, and day of experiment. From this subset of the videos, we selected 64,014 time points representative of the full suite of observed dynamics via an importance-sampling algorithm [7], and embedded the power spectra from these points into two dimensions using the UMAP algorithm.

We then embedded all time points from the 141-hour subset and found well-separated peaks of high density using an adaptive threshold (Figure S5). We assigned behavior labels to these regions by looking at randomly selected clips from time points where the flies’ dynamics fell within a given region’s boundaries for a reasonable length of time. We grouped together regions of similar dynamics, and identified seven well-defined behaviors: idle, proboscis extension, fore grooming (of the eyes or forelegs), hind grooming (of the abdomen or hindlegs), wing grooming, altered locomotion (often involving slipping or limping), and locomotion. The idle behavior state includes all points assigned as idle using the total power cutoff as well as several regions of the spectral embedding that contained idle behaviors with single-limb tracking errors. In addition to these well-defined behaviors, ~15% of all time points represent unstereotyped dynamics, where the fly is either on the edge and non-idle, or where its dynamics fall outside the boundaries of the identified peaks of stereotyped behaviors. We exclude these time points from later analyses. Finally, we projected the full data set into the two-dimensional space and used the behavioral boundaries from the training set to classify each time point as one of the six non-idle stereotyped behaviors, idle, non-idle edge, or unstereotyped.

We used 5 frames (1/20 of a second) as a minimum bout length for each stereotyped behavior, and forward-filled each fly’s behavior sequence with this bout length, assigning any bout of 4 frames or fewer to the previous region of long duration. The resulting ethograms permit analysis of patterns in locomotion, feeding, and grooming (Figure 1D). Because our data is continuous over multiple 24-hour periods, we can look at how behavior varies with time of day and across days of the experiment.
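
The minimum-bout-length rule can be sketched as a forward fill over runs of identical labels. This is one plausible reading of the procedure, not the authors' code: the function name is hypothetical, and leaving short runs untouched when no long bout precedes them is an assumption.

```python
from itertools import groupby

MIN_BOUT_FRAMES = 5  # 1/20 s at 100 frames per second, as stated in the text

def enforce_min_bout(labels, min_len=MIN_BOUT_FRAMES):
    """Reassign runs shorter than min_len to the most recent long-enough bout."""
    smoothed = []
    last_long = None  # label of the most recent bout that met the minimum length
    for label, run in groupby(labels):
        n = sum(1 for _ in run)
        if n >= min_len:
            last_long = label
        # Assumption: short runs before any long bout exists are left unchanged.
        fill = label if (n >= min_len or last_long is None) else last_long
        smoothed.extend([fill] * n)
    return smoothed

raw = ["idle"] * 6 + ["groom"] * 2 + ["idle"] * 3 + ["walk"] * 5
print(enforce_min_bout(raw))
```

In this toy sequence the 2-frame "groom" run and the 3-frame "idle" run are both absorbed into the preceding 6-frame idle bout, while the 5-frame "walk" bout is kept.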

Our data is closed (i.e. the fraction of time spent in all behaviors must add up to one) requiring us to use methods of compositional data analysis to examine changes across flies, hours, and days [13, 19]. Averages of closed data are best calculated as geometric means, which we denote ‘behavior components’. To discuss circadian behavioral effects, we use Zeitgeber time (ZT), where time is measured from the onset of a periodic stimulus rather than from midnight on a clock, to capture the cyclic nature of circadian effects. For this set of experiments, ZT = 0h corresponds to the visible lights coming on, and ZT = 12h corresponds to lights turning off.
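
As a small illustration of averaging closed data, the sketch below computes a component-wise geometric mean of several compositions and re-closes it so the parts again sum to one. The function names and example fractions are hypothetical, the re-closing step is an assumption about how the behavior components are normalized, and the sketch assumes no component is exactly zero.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]

def geometric_mean_composition(compositions):
    """Component-wise geometric mean of compositions, re-closed to sum to one.

    Assumes every part is strictly positive (zeros need separate handling).
    """
    n = len(compositions)
    k = len(compositions[0])
    gm = [math.exp(sum(math.log(c[j]) for c in compositions) / n) for j in range(k)]
    return closure(gm)

# Hypothetical fractions of time in three behaviors for two fly-hours.
hours = [[0.50, 0.25, 0.25],
         [0.25, 0.50, 0.25]]
center = geometric_mean_composition(hours)
print([round(p, 4) for p in center])
```

Unlike an arithmetic mean, this average respects the relative (ratio) structure of the parts, which is the scale on which compositional methods operate.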

Looking across all fly hours and all days, we see a distinct circadian pattern of behavior with higher levels of idle during the dark hours, and more locomotion and grooming during the light hours (Figure 1E). The first hour after the lights turn on is particularly distinct, with comparatively high levels of locomotion and grooming. The locomotion and grooming behavior components increase in the hours leading up to lights on and lights off, indicating anticipation of the change in lighting condition. Over the course of the experiment, the flies’ behaviors start changing significantly after Day 3 (Figure 1F). Time spent in idle increases over Days 4 through 6 as flies begin dying on the nutritionally incomplete food used for this experiment.

To examine overall behavior variation across hours of the day, we carried out a principal components analysis of the compositional data [20, 21] using the compositions package in R [22]. We used the isometric log-ratio transformed behavior compositions to carry out robust principal components analysis using the Minimum Covariance Determinant (MCD) method, and then backtransformed the result into centered log-ratio loadings. The first three principal components (PCs) explain ~85% of the variance across all fly hours (Figure S6). As can be seen in the biplot of the first two PCs (Figure 2A), PC1 largely weights the locomotion behaviors, locomotion and altered locomotion, against idle and proboscis extension. This PC describes the main differences between day and night, with positive projection averages during the day, corresponding to more locomotion/grooming, and negative values at night when the animals are idle (Figure 2B). The average projection along PC1 begins to increase before the lights turn on, indicating that the animals anticipate the rise of the sun. Peak amplitude along this PC occurs just after the lights turn on, potentially indicating a slow morning transition from nighttime behaviors to daytime activity. The level of this projection stays roughly constant throughout the day, but then increases and peaks just before the lights turn off at 12h ZT. This is followed by a slow decline in the amplitude after dark until reaching a steady night level.
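
The log-ratio transforms named above can be written out in a few lines. The sketch below is not the R compositions package and does not perform the robust MCD-based PCA; it only shows a centered log-ratio (clr) transform and one standard choice of isometric log-ratio (ilr) coordinates, pivot coordinates. The function names and the example composition are illustrative.

```python
import math

def clr(x):
    """Centered log-ratio: log of each part minus the mean log over all parts."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def ilr_pivot(x):
    """Isometric log-ratio in pivot coordinates (one standard orthonormal basis)."""
    logs = [math.log(v) for v in x]
    d = len(x)
    coords = []
    for i in range(d - 1):
        rest = logs[i + 1:]
        scale = math.sqrt(len(rest) / (len(rest) + 1))
        # Log-ratio of part i to the geometric mean of the remaining parts.
        coords.append(scale * (logs[i] - sum(rest) / len(rest)))
    return coords

x = [0.6, 0.3, 0.1]  # hypothetical three-part behavior composition
norm = lambda v: math.sqrt(sum(c * c for c in v))
print(round(norm(clr(x)), 6), round(norm(ilr_pivot(x)), 6))
```

The two printed norms agree, which is the isometry property: ilr coordinates have one fewer dimension than clr but preserve distances, so standard multivariate methods such as PCA can be applied to them.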

Figure 2.

Principal component analysis of stereotyped behavioral components of all flies across all experimental hours. A Biplot showing the projections of individual fly hours and loadings of each stereotyped behavioral component. Dark gray dots show timepoints from when lights are off and light gray dots show timepoints when lights are on. B Projection of PC1 against time of day for all complete 24h periods of all flies. C Projection of PC2 against time of day for all complete 24h periods of all flies. D Circadianicity vs day of experiment, measured as the average projection onto PC1 during night hours minus the average projection onto PC1 during day hours.

Previous behavioral studies of the D. melanogaster circadian cycle have used relatively coarse metrics, such as the activity counts generated by Drosophila Activity Monitors [5]. These studies have shown that D. melanogaster have peaks of locomotion activity around their subjective morning and evening, with the increase in activity slightly anticipating the actual change in lighting conditions [23, 24]. Our high-resolution behavioral data and the projection along PC1 recapitulate these general trends, but show quantitative differences when the lights change. In particular, we see a gradual change in amplitude after lights turn off that lasts several hours, whereas this previous work sees a more abrupt cessation of locomotion at this time.

The second principal component weights the three grooming behaviors (fore, hind, and wing) against the locomotion behaviors and proboscis extension. The average projection onto PC2 has a distinct peak during the hour just after lights turn on, separating this unique part of the circadian pattern from the more general day vs. night changes in behavior picked up by PC1 (Figure 2C). PC3 largely separates the first two experiments (begun 02/17/2022 and 03/13/2022) from the second set of experiments (begun 03/26/2022 and 04/18/2022) (Figure S7). The second set of experiments took place at higher temperatures (Figure S9). The difference in the projections of each fly-hour along PC3 between the two sets of experiments is lowest on Day 1, and increases over experimental days.

Since the amplitude along PC1 largely follows the day-night cycle and describes the circadian change in behaviors, we use the difference between the average value of PC1 during light and dark hours to define a ‘circadianicity’ value for each fly day. We find that circadianicity decreases steadily over days in the experiment (Figure 2D). Previous studies have found that the sleep/wake cycles of behavior in D. melanogaster weaken as they age [25]. While the flies in our experiment were all comparatively young (all died before 10 full days, whereas life expectancy is 2–3 months under ideal conditions), they were living in very harsh conditions of relatively high temperature, low humidity, and poor nutrient availability. The gradual weakening over the course of the experiment is in some ways similar to accelerated aging, and the steady decline in circadianicity over 6 days is similar to the decline in the strength of the sleep/wake cycle seen in experiments over 60 days [25].
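
The circadianicity measure reduces to a difference of two means. The sketch below assumes one PC1 projection per Zeitgeber hour for a single fly day, with lights on for ZT 0–11 and off for ZT 12–23; the function name and the sign convention (light minus dark) are assumptions for illustration.

```python
from statistics import mean

LIGHT_HOURS = range(0, 12)   # ZT 0-11, visible lights on
DARK_HOURS = range(12, 24)   # ZT 12-23, visible lights off

def circadianicity(pc1_by_zt_hour):
    """Mean PC1 projection in light hours minus mean projection in dark hours.

    pc1_by_zt_hour: sequence of 24 values, one per Zeitgeber hour of a fly day.
    Sign convention (light minus dark) is an assumption made for this sketch.
    """
    light = mean(pc1_by_zt_hour[h] for h in LIGHT_HOURS)
    dark = mean(pc1_by_zt_hour[h] for h in DARK_HOURS)
    return light - dark

# Hypothetical fly days: a strongly rhythmic one and a flat one.
rhythmic_day = [1.0] * 12 + [-1.0] * 12
flat_day = [0.2] * 24
print(circadianicity(rhythmic_day), circadianicity(flat_day))
```

A fly day whose behavior differs strongly between light and dark hours gives a large magnitude, while a day with no day/night contrast gives a value of zero.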

We leveraged our high-resolution behavior data to carry out an in-depth analysis of D. melanogaster circadian patterns of behavior, focusing our analysis on the first day of the experiment when circadianicity was strongest (Figure 2D). Flies were reared from embryos to two-day-old adults with the same light/dark cycle time and phase as the experiments. Even before eclosing, D. melanogaster exhibit circadian patterns of certain behaviors, such as larval negative phototaxis [28] or eclosion [29], so it is unsurprising that even 2–3 day old adults already have a strong circadian pattern.

The geometric means of behavior components across all flies versus ZT for Day 1 show the expected pattern of increased idle during the night and increased locomotion during the day (Figure 3A). The hour just after lights on remains the most distinct, with a very low idle behavior component. After lights off, the flies take ~2 hours to settle into their characteristic high idle, low locomotion night state.

Figure 3.

Circadian patterns of behavior on experimental Day 1. A Barplot showing the geometric means of stereotyped behavioral components of the first experimental day across all flies. B Ternary plot showing the geometric means of condensed behavioral components across all flies for each circadian hour of Day 1. Directions along PC1 (dashed) and PC2 (solid) are also shown, as calculated by perturbing the geometric mean of the displayed data points [26]. The ternary plot was generated using the Ternary Plots package in MATLAB [27]. C Grooming enrichment with respect to the geometric mean of the condensed grooming behavioral component of the first experimental day for each fly with bootstrapped confidence intervals. D Mean proboscis bout length by hour of the first experimental day. The shaded region is the standard error. E Mean locomotion speed (mm/s) during stereotyped locomotion state by hour of the first experimental day. The shaded region is the standard error.

The temporal changes we observe in the two locomotion behaviors (altered locomotion and locomotion) are similar, as are the changes in the different grooming behaviors (fore, hind, and wing grooming). Using these strong correlations, we condensed our seven stereotyped behavior components into three categories, grouping together the grooming behaviors, the locomotion behaviors, and idle and proboscis extension. This allowed us to plot the average behavior composition for each circadian hour in a ternary plot to visualize differences in overall behavior across circadian time and along the previously identified PCs (Figure 3B). The day and night hours cluster together and largely lie along PC1 as expected. The two hours just after lights off fall between these clusters as the flies transition into their night state of behavior. The hour just after lights on is an outlier, falling well off the line of variance explained by PC1, with higher proportions of grooming and locomotion behaviors compared to all other circadian hours. This hour lies in the direction of increasing PC2, which explains ~17% of the variance in the data. This, combined with the peak in the projection of behavior components along PC2 during the hour after lights on (Figure 2C) indicates that this hour is a unique time point in the circadian cycle of behavior.
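
One simple way to condense a seven-part composition into the three categories described above is amalgamation, i.e. summing the fractions within each group. Whether the condensed components were formed by summation is an assumption of this sketch, and the example fractions are hypothetical.

```python
# Grouping taken from the text; combining parts by summation is an assumption.
GROUPS = {
    "grooming": ("fore grooming", "hind grooming", "wing grooming"),
    "locomotion": ("altered locomotion", "locomotion"),
    "idle + proboscis": ("idle", "proboscis extension"),
}

def condense(composition):
    """Amalgamate a seven-part behavior composition into three parts."""
    return {group: sum(composition[part] for part in parts)
            for group, parts in GROUPS.items()}

# Hypothetical fractions of time for one circadian hour (they sum to one).
hour = {
    "idle": 0.50, "proboscis extension": 0.01,
    "fore grooming": 0.10, "hind grooming": 0.05, "wing grooming": 0.04,
    "altered locomotion": 0.10, "locomotion": 0.20,
}
condensed = condense(hour)
print({k: round(v, 2) for k, v in condensed.items()})
```

A three-part composition of this kind still sums to one, which is what allows each circadian hour to be drawn as a single point in a ternary plot.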

To further investigate the circadian pattern of grooming, we looked at the enrichment of grooming behaviors at each circadian hour compared to the geometric mean of the grooming behavior component across all hours (Figure 3C). It has been shown that spontaneous grooming is under circadian control, but no clear pattern of when grooming happens throughout the day has been identified [30]. We find that grooming behaviors peak in the hour after lights on, contributing to the uniqueness of that time point, in agreement with our analysis of PC2. This temporary spike in grooming behavior may come from a need to refresh the various sensory appendages that lie along the body after a prolonged nighttime period without grooming.

Grooming remains enriched during the day, although this enrichment decreases after the early morning hours. Of the specifically identified grooming states, flies spend the most time grooming their fore limbs and eyes, with a lower proportion of time spent in hind grooming and wing grooming. This follows the flies’ hierarchy of grooming motor programs, where fore grooming is prioritized, followed by abdomen grooming, which is captured in our hind grooming state, and finally wing grooming [31].

We also looked at daily eating patterns, using proboscis visibility as a proxy for feeding as proboscis extension is well correlated with food intake [32]. Previous studies report peak feeding activity centered around lights on and lights off in the mornings and evenings, with more feeding concentrated in the evening [33, 34]. Proboscis extension comprises a very small fraction of our data, less than 1% of the overall behaviors across all time points. Because it is such a small component, using compositional data analysis techniques is challenging, as many true zeros exist in the proboscis data. To get a better sense of the circadian nature of proboscis extension (and feeding), we instead look at the average duration of proboscis extension bouts over the course of the day (Figure 3D). We find that flies typically leave their proboscis extended for about 3 seconds during night bouts, and about 2 seconds during day bouts. By this measure we do not see notable peaks of morning and evening feeding, but instead a more general trend of more time spent feeding at night, and less during the day.
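
Bout durations can be read off an ethogram by run-length encoding the per-frame labels. The sketch below assumes a list of behavior labels at 100 frames per second; the function name and the toy label sequence are hypothetical.

```python
from itertools import groupby

FRAME_RATE_HZ = 100  # recording frame rate stated in the text

def bout_durations_s(labels, behavior, frame_rate_hz=FRAME_RATE_HZ):
    """Duration in seconds of each uninterrupted run of `behavior` in `labels`."""
    return [sum(1 for _ in run) / frame_rate_hz
            for label, run in groupby(labels)
            if label == behavior]

# Toy ethogram: two proboscis extension bouts separated by a short idle period.
ethogram = (["idle"] * 10 + ["proboscis extension"] * 300
            + ["idle"] * 5 + ["proboscis extension"] * 200)
durations = bout_durations_s(ethogram, "proboscis extension")
print(durations, sum(durations) / len(durations))
```

Grouping such durations by Zeitgeber hour and averaging would give a curve of mean bout length against time of day.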

Our observed trend of the locomotion behavior component with the time of day differs from results from previous studies using activity counts to measure overall movement levels. While there is a slight increase in the locomotion behavior component in anticipation of lights on in our study, it is less dramatic than the increase in activity counts observed in previous work [23], and we see no peak in locomotion around lights off compared to the locomotion behavior component throughout the day hours. However, the circadian pattern of locomotion speed (the mean speed of flies only when they are in the ‘locomotion’ state, calculated with the mean on a 5 frame rolling window) has peaks around each change in lighting conditions, along with anticipatory increases, particularly for lights on (Figure 3E). In Drosophila Activity Monitors, activity counts are recorded each time a fly crosses an infrared beam [5]. These counts could increase due to a combination of increased movement time and increased movement speed. Our results indicate that it is an increase in movement speed, rather than time spent moving, that is responsible for the larger activity count peaks around lights on and lights off. The increase in locomotion speed before lights on, and the gradual falling off after lights off, indicates that flies are modulating their movement speed partially due to internal cues, rather than only as a startle response or some other reaction to lights on.
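
The speed measure can be sketched as a frame-to-frame displacement converted to mm/s and smoothed with a 5-frame rolling mean. The sketch assumes thorax positions in millimetres sampled at 100 Hz; the function names and the synthetic track are illustrative, and restricting the result to frames labeled as locomotion is left out for brevity.

```python
import math

FRAME_RATE_HZ = 100  # recording frame rate stated in the text

def frame_speeds(positions_mm, frame_rate_hz=FRAME_RATE_HZ):
    """Instantaneous speed (mm/s) between consecutive (x, y) positions in mm."""
    return [math.hypot(x1 - x0, y1 - y0) * frame_rate_hz
            for (x0, y0), (x1, y1) in zip(positions_mm, positions_mm[1:])]

def rolling_mean(values, window=5):
    """Mean over each full window of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Synthetic track: a fly moving 0.1 mm per frame along x, i.e. 10 mm/s.
track = [(0.1 * k, 0.0) for k in range(10)]
smoothed = rolling_mean(frame_speeds(track))
print([round(v, 3) for v in smoothed])
```

Separating speed from the fraction of time spent in the locomotion state is what allows the two possible contributions to beam-break activity counts to be distinguished.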

In addition to circadian patterns of behavior, the flies’ behavior changed across experimental days as they weakened and died. Because of the nutritionally incomplete food and the relatively high temperature and low humidity, flies in our experiment all died within 8 days. The behavioral composition remained relatively constant across the first 3 days of the experiment, but starting at Day 4 the idle component began to increase (Figure 1F). This is similar to the increase in the proportion of time male flies spend idle near the end of their lives in a more natural aging paradigm [11]. Flies also show a reduction in the propensity to spend more time near the edge of the arena rather than the center after the first 3 days of the experiment (Figure S8A). The wall following behavior of D. melanogaster likely arises from boundary exploration, possibly as a means of seeking escape from a given enclosure [35]. Over the course of the experiment, flies decrease wall following behavior as habituation to an unchanging environment decreases exploration activities [36]. However, the edge preference is difficult to disentangle from the differences in the fraction of time spent in other stereotyped behaviors at different radii (Figure S8B). Flies spend an increased fraction of time in locomotion near the edge of the arena and an increased fraction of idle near the center of the arena, and these differences may drive the observed changes in edge preference.

Since the hour after the lights turn on is such a unique time point in the circadian pattern of behavior, we were curious how the behavior components during that hour change over the days of the experiment. The geometric means of the relative proportions of the grooming behaviors, idle behaviors, and locomotion behaviors across all surviving flies in the hour after dawn remain similar for the first 3 days of the experiment, lying in a cluster offset from PC1 in the ternary plot (Figure 4A). Starting at Day 4, however, the hour after dawn components begin falling onto PC1, and are much more similar to other circadian time points. This behavioral composition moves towards lower values of PC1 with age and becomes more similar to the nighttime composition. Thus, as the flies in our experiment weaken and die, not only do their day and night behavior patterns begin to look more similar, they also lose the distinctive behavioral character of the hour after dawn.

Figure 4.

Day-wise behavioral changes throughout the experiment. A Ternary plot showing the geometric mean of the condensed stereotyped behavioral components of the first hour after lights on across all surviving flies for each complete 24h period. Directions of PC1 (dashed) and PC2 (solid) are also shown, as calculated based on perturbation of the geometric means of all circadian hours from Day 1. B Mean proboscis bout length by day of experiment. The shaded region is the standard error. C Mean locomotion speed (mm/s) during the stereotyped locomotion state by day of experiment. The shaded region is the standard error.

We also asked how feeding and locomotion change with age in our experiments. We find that proboscis bout duration decreased steadily through Day 3 and then plateaued (Figure 4B). It has been reported that flies eat more as they age [37], but the limited food source and harsh environmental conditions may change this trend for the flies in our experiment. In contrast, the average locomotion speed remained steady through Day 3 and then began decreasing with age (Figure 4C). The combination of steady locomotion speed and no increase in the fraction of time spent locomoting means that overall ‘locomotion activity’, comparable to traditional activity counts, does not appreciably change over the first 3 days of the experiment, after which there is a decline. Previous studies have shown that male flies in a natural aging context have an increase in locomotion activity during early life, before a decrease leading up to death [11, 38]. We do not see this increase at young age in our experiment; however, lifelong locomotion patterns are genotype-dependent [39], so results from the isoKH11 flies used here may not be directly comparable to these previous studies.

Conclusion

We report the first measurements of high resolution D. melanogaster behavior recorded over many days with high temporal bandwidth. By leveraging recent advances in GPU-based video processing and postural inference, we captured the behavior of freely moving D. melanogaster over the course of multiple days, encompassing the behavioral effects of circadian rhythms, starvation, aging, and habituation at continuous high resolution. Our data recapitulates many previously described trends in D. melanogaster circadian and aging/dying patterns of behavior. We also leveraged high resolution postural data in combination with fine-grained ethograms to characterize changes in proboscis extension bout duration and locomotion speed across time of day and over the days of our experiment. With compositional data analysis techniques, we identified the hour after lights on as a uniquely distinctive time point in the circadian pattern of behavior.

Our data addresses several limitations of the high-quality ethological data currently available. Previous work on the temporal structure of behavior has found correlations extending beyond the length of the available data, typically 30–60 minutes [40, 41]. The data presented here extends these time scales by more than two orders of magnitude. This data set is also the first to continuously capture high-dimensional, high-resolution behavioral data across a circadian cycle, allowing us to investigate how changes in internal state related to time of day affect behavior. By recording when flies feed (as measured by proboscis extension), this data may also provide new insights into the effects of hunger and satiety. We provide both high-resolution recordings and our postural tracking output to facilitate further data analysis. The analyses presented here leverage only a fraction of the resolution and dimensionality provided by our data, and we hope this 100-fold increase in the amount of high-quality ethological data available will give rise to yet more tools and techniques. Finally, aging in our experiments was significantly accelerated due to nutrient limitation. Future work with new kinds of arenas and food sources may extend this type of high-resolution behavioral recording to cover the full natural lifespan of a fly.

Methods

Fly rearing

To control for possible genetic effects, we used the highly inbred wild-type isoKH11 strain. isoKH11 flies were raised on standard cornmeal media (see github.com/shaevitz-lab/long-timescale-analysis for complete recipe) at 25°C and 60% humidity under a 12-hour light/dark cycle with visible light of ~1300 lux. Before each experiment, we performed egg lays and, on eclosion, flipped flies into new vials. We allowed the flies to age for two days, yielding 2–3 day-old flies, which we anesthetized using CO2. We then distributed males to arenas to be imaged.

Media

During experiments, flies were allowed to feed ad lib from a pad of optically clear media (10% sucrose, 1.5% agarose). We were not able to include a protein source, such as yeast extract, as this led to high levels of fungal growth within 1–2 days that obscured imaging.

Arena

We constructed experimental arenas out of laser-cut acrylic using acrylic cement (McMaster 7517A4) to adhere layers together (Figure S1B). The bottom layer of each arena consisted of a 3mm layer of food (described above). Each individual fly was able to freely move about within a 25mm diameter cylinder of height 1.5mm. Because these arenas have straight walls, flies are able to walk along the sides, which can cause limb occlusions that pose difficulties to downstream postural tracking. To address this, we used a low arena height that impedes flies from easily maneuvering off the base layer. We also coated the top and walls with Sigmacote (Sigma-Aldrich SL2), which discourages flies from walking on the ceiling of the arena but does not fully restrict them from walking on the edges of the arenas.

Imaging and illumination

The arenas are lit from above using 880 nm IR LED pads (Advanced Illumination BL040801–880-IC). Below each arena, we placed high-resolution, high frame-rate cameras (FLIR BFS-U3–32S4M-C) paired with 880 nm band-pass filters (ThorLabs FB880–70) (Figure S1A). This combination provides bright, uniform lighting across the arenas, permitting extremely short exposure times that reduce motion blur. Illuminating from above and recording from below also eliminates condensation in the arenas. We found that the ideal balance between contrast and motion blur was at 1 ms exposure time. In addition, we used a pair of visible light LED panels at the top of the tent enclosing the experimental setup to provide a 12-h visible light (~6500 lux) and 12-h darkness (<6 lux) cycle, matching the timing of the light/dark cycle under which experimental flies were reared. This visible light cycle did not appreciably affect the IR imaging.

Temperature and humidity

We recorded temperature and humidity within the imaging enclosure throughout the trials (Supplemental Figure 2) with an Extech RHT10 datalogger. As temperature and humidity have known effects on fly behavior [42, 43], these data are provided with the behavioral data set so that they may be taken into account (Figure S9). The environmental controls of the room in which our experiments were housed cycle on and off, leading to ~1°C temperature fluctuations with a period of ~1 hour.

Acquisition software

We used a modified version of campy (github.com/Wolfffff/campy), forked from github.com/ksseverson57/campy, which was developed by Kyle Severson. We altered the package to suit our specific use case, including chunking videos and adjusting the exception handling. Campy pipes frames from FLIR’s Spinnaker SDK (PySpin) to FFmpeg. The flexibility of FFmpeg allows us to drastically reduce the file size of our videos by utilizing hardware-based compression. Specifically, we use Nvidia NVENC (hevc_nvenc) paired with the segment_time flag to produce hour-long chunks. This increased compression makes it feasible to perform high-throughput recordings of 8 flies simultaneously on a single computer. To facilitate ease of use in analysis and distribution, we merge these videos into long videos; however, because loading tens of millions of frames and instances can cause IO issues, we use hour-long segments for training.
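As an illustration of the encoder settings described above, the following minimal sketch assembles an FFmpeg argument list of the kind that could be handed to a subprocess fed with raw frames. It is not the campy source: the function name `build_ffmpeg_args`, the frame size, and the output file pattern are hypothetical placeholders, and only the hevc_nvenc encoder and the hour-long segment length come from the text.

```python
def build_ffmpeg_args(width, height, fps, out_pattern, segment_seconds=3600):
    """Return an FFmpeg argument list that reads raw grayscale frames from stdin.

    Hypothetical helper for illustration; campy's real invocation may differ.
    """
    return [
        "ffmpeg",
        "-f", "rawvideo",                        # input is headerless raw frames
        "-pix_fmt", "gray",                      # 8-bit monochrome, one byte per pixel
        "-s", f"{width}x{height}",               # frame size (assumed, not from the text)
        "-r", str(fps),                          # input frame rate
        "-i", "-",                               # read frames from the pipe on stdin
        "-c:v", "hevc_nvenc",                    # Nvidia NVENC hardware HEVC encoder
        "-f", "segment",                         # write consecutive output files
        "-segment_time", str(segment_seconds),   # 3600 s gives hour-long chunks
        "-reset_timestamps", "1",                # each chunk starts at t = 0
        out_pattern,                             # e.g. "fly_%03d.mp4" (placeholder)
    ]


if __name__ == "__main__":
    print(" ".join(build_ffmpeg_args(2048, 1536, 100, "fly_%03d.mp4")))
```

The list could be passed to `subprocess.Popen(..., stdin=subprocess.PIPE)`, with each camera frame written to the pipe as raw bytes.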

The machines used for recording were running Windows 10 with 64GB RAM, Intel(R) Core(TM) i7–8700K CPU processor, and either Nvidia Quadro RTX 4000, Quadro P2000, or GeForce RTX 2080 GPU.

SLEAP tracking

After imaging, SLEAP [12] was used to estimate the pose of each individual and maintain identity across videos. We used a 14-node skeleton: head, eyes (eyeL, eyeR), proboscis, thorax, abdomen, fore legs (forelegL, forelegR), mid legs (midlegL, midlegR), and the hind legs (hindlegL, hindlegR). We labeled 1930 individuals across 482 frames. 434 frames (1738 instances) were used for training, with 48 frames (192 instances) reserved for validation. We trained a U-Net-based model with a receptive field size of 76 pixels (2.6 mm) on Nvidia A100 GPUs. The complete hyperparameter set is provided along with the model. We include some training data from recordings that were excluded from the final data set due to early truncation but that have identical frame rates and resolution. To handle the more than 500 million frame dataset, we use SLURM to distribute our inference across 30 Nvidia P100 GPUs, each running at approximately 20 fps, yielding approximately 600 fps in total, or 6x the recording frame rate. After inferring locations with identity, we merged the resulting .slp files together and ran SLEAP’s identity tracking script to preserve identity over time. For convenience of analysis and storage, we convert each .slp file to HDF5. Since individuals are in separate chambers, we can validate these identity tracks by the amount of time spent in each quadrant of the arena. The pipeline for sectioning, merging, and tracking can be found on the associated GitHub repository.
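The throughput figures quoted above follow from simple arithmetic, which the short sketch below reproduces. The function names are hypothetical, and the 500 million frame count is used only as the round lower bound given in the text, so the resulting wall-clock estimate is indicative rather than a reported result.

```python
RECORD_FPS = 100    # camera frame rate
N_GPUS = 30         # P100 GPUs used for inference
FPS_PER_GPU = 20    # approximate per-GPU inference rate


def aggregate_fps(n_gpus, fps_per_gpu):
    """Total inference rate when videos are processed in parallel."""
    return n_gpus * fps_per_gpu


def real_time_factor(total_fps, record_fps):
    """How many times faster than the recording rate the tracking runs."""
    return total_fps / record_fps


def wall_clock_days(n_frames, total_fps):
    """Days needed to process n_frames at the aggregate rate."""
    return n_frames / total_fps / 86400


if __name__ == "__main__":
    total = aggregate_fps(N_GPUS, FPS_PER_GPU)
    print("aggregate fps:", total)
    print("speed relative to recording:", real_time_factor(total, RECORD_FPS))
    print("days for 500 million frames:", wall_clock_days(500_000_000, total))
```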

Edge detection

While flies spend the majority of their time on the flat bottoms of the arenas, there is a small proportion of time (~5%) when they are oriented sideways with respect to the cameras with their tarsi on the walls of the arenas. In this position the legs are often occluded and difficult to identify, leading to SLEAP tracking errors. To flag time points when the flies are on the edge and tracking fidelity is compromised, we used the MATLAB Classification Learner App to train an SVM to identify whether flies are on or off the edge based on the all-by-all distances between tracked body coordinates (excluding the proboscis), the speed of each body coordinate, and the distance between each body coordinate and the edge of the arena. We used 2788 training points, equally split between on-edge and off-edge instances and sampled evenly across all experimental flies. Our final model accurately labeled 95% of held-out validation points (Figure S3).
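To make the three feature groups concrete, the sketch below builds one feature vector from the tracked coordinates of two consecutive frames. It is an illustrative assumption rather than the MATLAB code used for the classifier: the function name, the dictionary layout, the node ordering, and the arena center argument are hypothetical, while the 12.5 mm radius and 100 fps frame interval follow from the arena and camera descriptions.

```python
import math
from itertools import combinations

ARENA_RADIUS_MM = 12.5   # 25 mm diameter arena
FRAME_DT_S = 0.01        # 100 frames per second


def edge_features(prev_pts, pts, center=(0.0, 0.0),
                  radius=ARENA_RADIUS_MM, dt=FRAME_DT_S):
    """Feature vector for one frame (hypothetical layout).

    prev_pts, pts: dicts mapping node name -> (x, y) in mm for the previous
    and current frame. The proboscis is excluded, as in the text.
    """
    names = sorted(n for n in pts if n != "proboscis")
    # All-by-all distances between body coordinates
    pairwise = [math.dist(pts[a], pts[b]) for a, b in combinations(names, 2)]
    # Speed of each body coordinate between consecutive frames
    speeds = [math.dist(pts[n], prev_pts[n]) / dt for n in names]
    # Distance from each body coordinate to the arena wall
    to_edge = [radius - math.dist(pts[n], center) for n in names]
    return pairwise + speeds + to_edge


if __name__ == "__main__":
    now = {"head": (1.0, 0.0), "thorax": (0.0, 0.0),
           "abdomen": (-1.0, 0.0), "proboscis": (1.2, 0.0)}
    before = {k: (x - 0.01, y) for k, (x, y) in now.items()}
    print(edge_features(before, now))
```

A vector like this, computed per frame, would be the input to whichever classifier is trained on the labeled on-edge and off-edge examples.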

Unsupervised behavioral classification

To identify stereotyped behaviors from body-part dynamics, we adapted the previously described MotionMapper pipeline [7] for our data (Figure S4). We first partially filled in missing data, interpolating all missing data for head and thorax points using a Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) to allow for subsequent egocentrizing. For all other nodes, we performed PCHIP interpolation with a limit of filling 5 consecutive missing values. For the proboscis node, we replaced all missing values with the location of the head, representing a retracted proboscis. We then applied to all nodes a median filter with a window size of 5 and a Gaussian filter with standard deviation 1 and window size 5. Following this, we egocentrized the data by shifting all individuals so that the thorax is at (0, 0) and rotating each node location so that the thorax-head connection falls along the positive x-axis. After this, we calculated the Lomb-Scargle periodogram on rolling windows for each coordinate of each node. We chose the Lomb-Scargle periodogram because it accommodates unevenly sampled data and therefore avoids the need for fully interpolated data. Further, by adjusting the window size based on our frequency of interest, we are able to capture behaviors across timescales, similar to the envelope size in continuous wavelet transforms.
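Two of the steps above, egocentrizing and estimating spectral power from samples with gaps, can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions, not the pipeline code: the function names and the dictionary layout are hypothetical, and the periodogram is the classical Lomb-Scargle form evaluated at a single frequency rather than the rolling-window version used in the analysis.

```python
import math


def egocentrize(pts):
    """Translate so the thorax is at (0, 0) and rotate the head onto the +x axis.

    pts: dict mapping node name -> (x, y); must contain "thorax" and "head".
    """
    tx, ty = pts["thorax"]
    hx, hy = pts["head"]
    theta = math.atan2(hy - ty, hx - tx)       # heading of the thorax-head axis
    c, s = math.cos(theta), math.sin(theta)
    out = {}
    for name, (x, y) in pts.items():
        dx, dy = x - tx, y - ty
        out[name] = (c * dx + s * dy, -s * dx + c * dy)   # rotate by -theta
    return out


def lomb_scargle_power(t, y, freq_hz):
    """Classical Lomb-Scargle power at one frequency for unevenly sampled data."""
    w = 2.0 * math.pi * freq_hz
    mean = sum(y) / len(y)
    yc = [v - mean for v in y]                 # remove the mean before fitting
    # Time offset that makes the sine and cosine terms orthogonal
    tau = math.atan2(sum(math.sin(2 * w * ti) for ti in t),
                     sum(math.cos(2 * w * ti) for ti in t)) / (2 * w)
    cos_t = [math.cos(w * (ti - tau)) for ti in t]
    sin_t = [math.sin(w * (ti - tau)) for ti in t]
    cc = sum(v * v for v in cos_t)
    ss = sum(v * v for v in sin_t)
    yc_cos = sum(a * b for a, b in zip(yc, cos_t))
    yc_sin = sum(a * b for a, b in zip(yc, sin_t))
    power = 0.0
    if cc > 0:
        power += yc_cos ** 2 / cc
    if ss > 0:
        power += yc_sin ** 2 / ss
    return 0.5 * power


if __name__ == "__main__":
    # One second at 100 fps with some frames missing, 10 Hz oscillation
    t = [i / 100 for i in range(100) if i % 7 != 3]
    y = [math.sin(2 * math.pi * 10 * ti) for ti in t]
    for f in (3.0, 10.0, 25.0):
        print(f, lomb_scargle_power(t, y, f))
```

Because the periodogram takes explicit sample times, frames that remain missing after the limited interpolation can simply be dropped from `t` and `y` instead of being filled.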

We compiled a representative subsample of our data by selecting 141 fly-hours evenly across flies and time of day. Because flies are dying throughout the course of the experiment, our sample set is slightly skewed towards earlier days to maintain even sampling across flies. We filtered training points from this subsample of data by removing time points where the flies were on the edge. We also removed time points we classified as idle, where the total amplitude of the wavelets was less than 0.5012 mm², a threshold we empirically determined to separate the majority of idle instances where the fly was largely motionless. From these, we sampled 36,000 time points from each fly-hour, or all remaining unfiltered time points when fewer were available. From each of these groups, we importance sampled 454 time points for a total of 64,014 training points.
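The filtering and capping steps above can be sketched as follows. This is a hedged illustration only: the function name and input layout are hypothetical, uniform random sampling stands in for the first-stage draw, and the subsequent importance sampling step is not implemented here.

```python
import random

IDLE_THRESHOLD = 0.5012     # total amplitude below this is treated as idle
MAX_PER_FLY_HOUR = 36000    # cap on candidate time points per fly-hour
N_FLY_HOURS = 141
N_IMPORTANCE = 454          # importance-sampled points kept per fly-hour


def candidate_indices(amplitude, on_edge, rng,
                      threshold=IDLE_THRESHOLD, cap=MAX_PER_FLY_HOUR):
    """Indices of non-edge, non-idle time points, capped at `cap` per fly-hour.

    amplitude: total spectral amplitude per time point
    on_edge:   booleans from the edge detector
    rng:       a random.Random instance, passed in for reproducibility
    """
    keep = [i for i, (a, e) in enumerate(zip(amplitude, on_edge))
            if not e and a >= threshold]
    if len(keep) <= cap:
        return keep                      # fewer than the cap: use them all
    return sorted(rng.sample(keep, cap))


if __name__ == "__main__":
    rng = random.Random(0)
    amp = [0.1, 0.6, 0.7, 0.9]
    edge = [False, False, True, False]
    print(candidate_indices(amp, edge, rng))
    print("training set size:", N_FLY_HOURS * N_IMPORTANCE)
```

The final line reproduces the training set size stated in the text, 141 fly-hours times 454 points each.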

We embedded our importance-sampled training set into two dimensions using UMAP and used this map for behavioral segmentation. We found that UMAP resulted in superior separation into unique clusters for the total training set when compared with t-SNE. We used kernel density estimation to create a 2D probability distribution of our training points. To identify distinct peaks in the density of training points we eliminated points of extreme low density and utilized adaptive thresholding on the resulting distribution. We adjusted parameters by eye to achieve distinct clusters for obviously separate peaks of density while aiming to avoid oversegmentation.

In order to assign specific discrete behaviors to each region of stereotyped power spectra we randomly selected clips from our sample set (141 fly hours) corresponding to each region. We imposed a minimum duration based on the dwell time distribution for each region to avoid very short bouts where behaviors might be difficult to identify. We identified six well-defined stereotyped behaviors (proboscis extension, fore grooming, hind grooming, wing grooming, altered locomotion, and locomotion) as well as many clusters that corresponded to idle behaviors with single-joint SLEAP tracking errors.

We then embedded our entire dataset into the same two dimensional space. Using the boundaries defined on the training set we assigned all time points to one of our six well-defined stereotyped behaviors, idle, edge (as called by our edge detector), or unstereotyped. With this method, only ~15% of our data is classified as unstereotyped behavior.

Dwell times within these behavior states can vary from single frames to many hundreds of frames. To identify a reasonable minimum bout length, we fit two geometric distributions to the total dwell time histogram. We selected 5 frames (~1/20 of a second) as a minimum bout duration, as this excludes ~95% of bouts from the distribution dominated by shorter bouts and only ~14% of bouts from the distribution of longer bouts, which we take to include legitimate behavior bouts. We forward-filled ethograms with this bout duration, assigning any bout of 4 frames or fewer to the most recent preceding behavior whose bout met the minimum duration.
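The forward-filling rule can be sketched with a run-length pass over the ethogram. This is a minimal illustration, not the repository code: the function name is hypothetical, and the handling of short bouts that occur before any long bout (left unchanged here) is an assumption, since the text does not specify it.

```python
from itertools import groupby

MIN_BOUT_FRAMES = 5   # ~1/20 of a second at 100 fps


def forward_fill_short_bouts(labels, min_len=MIN_BOUT_FRAMES):
    """Replace bouts shorter than min_len with the last behavior of long duration.

    labels: per-frame behavior labels (any hashable values).
    Short bouts at the very start have no previous long bout and are kept as-is
    (an assumption made for this sketch).
    """
    out = []
    last_long = None
    for label, run in groupby(labels):
        n = len(list(run))                 # length of this bout in frames
        if n >= min_len:
            last_long = label
            out.extend([label] * n)
        elif last_long is None:
            out.extend([label] * n)
        else:
            out.extend([last_long] * n)    # absorb the short bout
    return out


if __name__ == "__main__":
    etho = ["loc"] * 6 + ["groom"] * 2 + ["idle"] * 5 + ["loc"] * 4
    print(forward_fill_short_bouts(etho))
```

In the usage example, the 2-frame grooming bout is absorbed into the preceding locomotion bout, and the final 4-frame locomotion bout is absorbed into the preceding idle bout.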

Significance Statement.

Animal behaviors exist on many timescales, ranging from the milliseconds required for speaking individual words to the years of behavioral shifts due to aging. Investigating the temporal structure of behaviors at longer timescales is challenging, and requires continuous, high resolution data taken over days. Here we present a data set of continuously captured high resolution Drosophila melanogaster behavior recorded over 4–7 days. Our continuous, high resolution data allows us to describe patterns in fine-grained behaviors such as locomotion speed, proboscis extension, and grooming across minutes, hours, and days. With this data we reveal detailed circadian cycles of behavior and trends of behavior over the lifetime of the fly.

Acknowledgements

The authors acknowledge the Aspen Center for Physics where this work was first conceptualized, Gordon Berman, Ugne Klibaite, and Greg Stephens for inspirational discussion, and Diogo Melo for insightful comments on how to speed up our processing pipeline. This work was supported in part by the NSF through the Center for the Physics of Biological Function (PHY-1734030). SWW is supported by the NSF Graduate Research Fellowship Program (DGE-2039656). GCM-S is supported by the Paul F. Glenn Laboratories For Aging Research at Princeton University. JFA is funded by grants from the NIH: National Institute of Environmental Health Sciences (R01-ES029929) and National Institute of General Medical Sciences (R35GM124881). We also acknowledge that the work reported in this paper was substantially performed using the Princeton Research Computing resources at Princeton University, which is a consortium of groups led by the Princeton Institute for Computational Science and Engineering (PICSciE) and the Office of Information Technology’s Research Computing group.

Footnotes

Code availability

The source code for the data analysis is publicly available on GitHub (github.com/shaevitz-lab/long-timescale-analysis). The repository includes the scripts used in this paper along with other pragmatic tools and examples.

The modified version of MotionMapperPy [7] we use can be found at https://github.com/Wolfffff/motionmapperpy and is included as a git submodule in the primary repository.

Competing interests

The authors declare no competing interests.

Data availability

The data repository associated with this paper can be found at doi.org/10.34770/1sab-8845. For each individual, we provide a single HDF5 file that includes datasets for the tracked body parts, stereotyped behaviors, on/off edge classification, temperature and humidity data, along with experimental metadata such as start date and time and lights on and off times. Videos cropped to contain individual flies are also provided. The original uncropped videos and the full postural tracking data, as .slp files with prediction scores for each body part of each individual, are available upon request.

References

  • [1]. Dawkins Richard. The selfish gene. New York: Oxford University Press, 1976.
  • [2]. Brown André E X and de Bivort Benjamin. “Ethology as a physical science”. In: Nat. Phys. 14.7 (Apr. 2018), pp. 653–657.
  • [3]. Berman Gordon J. “Measuring behavior across scales”. In: BMC Biol. 16.1 (Feb. 2018), p. 23.
  • [4]. Bialek William. “On the dimensionality of behavior”. In: Proc. Natl. Acad. Sci. U. S. A. 119.18 (May 2022), e2021860119.
  • [5]. Pfeiffenberger Cory et al. “Locomotor activity level monitoring using the Drosophila Activity Monitoring (DAM) System”. In: Cold Spring Harb. Protoc. 2010.11 (Nov. 2010), db.prot5518.
  • [6]. Harbison Susan T et al. “Selection for long and short sleep duration in Drosophila melanogaster reveals the complex genetic network underlying natural variation in sleep”. In: PLoS Genet. 13.12 (Dec. 2017), e1007098.
  • [7]. Berman Gordon J et al. “Mapping the stereotyped behaviour of freely moving fruit flies”. In: J. R. Soc. Interface 11.99 (Oct. 2014).
  • [8]. Hsu Alexander I and Yttri Eric A. “B-SOiD, an open-source unsupervised algorithm for identification and fast prediction of behaviors”. In: Nature Communications 12.1 (2021), p. 5188.
  • [9]. Luxem Kevin et al. “Identifying behavioral structure from deep variational embeddings of animal motion”. In: Communications Biology 5.1 (2022), p. 1267.
  • [10]. Weinreb Caleb et al. “Keypoint-MoSeq: parsing behavior by linking point tracking to pose dynamics”. In: BioRxiv (2023), pp. 2023–03.
  • [11]. Overman Katherine E et al. “Measuring the repertoire of age-related behavioral changes in Drosophila melanogaster”. In: PLoS Comput. Biol. 18.2 (Feb. 2022), e1009867.
  • [12]. Pereira Talmo D et al. “SLEAP: A deep learning system for multi-animal pose tracking”. In: Nat. Methods 19.4 (Apr. 2022), pp. 486–495.
  • [13]. Aitchison J. The statistical analysis of Compositional Data. Chapman and Hall, 1986.
  • [14]. Hernández de Salomon C and Spatz H-C. “Colour vision in Drosophila melanogaster: Wavelength discrimination”. In: J. Comp. Physiol. 150.1 (Mar. 1983), pp. 31–37.
  • [15]. Hassett Charles C. “The utilization of sugars and other substances by Drosophila”. In: The Biological Bulletin 95.1 (1948), pp. 114–123.
  • [16]. Glaser Franz T and Stanewsky Ralf. “Temperature synchronization of the Drosophila circadian clock”. In: Current Biology 15.15 (2005), pp. 1352–1363.
  • [17]. Zhang Yong et al. “Light and temperature control the contribution of specific DN1 neurons to Drosophila circadian behavior”. In: Current Biology 20.7 (2010), pp. 600–605.
  • [18]. Vanin Stefano et al. “Unexpected features of Drosophila circadian behavioural rhythms under natural conditions”. In: Nature 484.7394 (2012), pp. 371–375.
  • [19]. Jensen Greg. matlab-compositional-analysis. Matlab package version 0.0.1. 2014. URL: https://github.com/belarius/matlab-compositional-analysis.
  • [20]. Aitchison John and Greenacre Michael. “Biplots of compositional data”. In: Journal of the Royal Statistical Society Series C: Applied Statistics 51.4 (2002), pp. 375–392.
  • [21]. Filzmoser Peter, Hron Karel, and Reimann Clemens. “Principal component analysis for compositional data with outliers”. In: Environmetrics: The Official Journal of the International Environmetrics Society 20.6 (2009), pp. 621–632.
  • [22]. van den Boogaart K. Gerald and Tolosana-Delgado R. “‘compositions’: A unified R package to analyze compositional data”. In: Computers & Geosciences 34.4 (Apr. 1, 2008), pp. 320–338. ISSN: 0098–3004.
  • [23]. Chiu Joanna C et al. “Assaying locomotor activity to study circadian rhythms and sleep parameters in Drosophila”. In: J. Vis. Exp. 43 (Sept. 2010).
  • [24]. Dubowy Christine and Sehgal Amita. “Circadian Rhythms and Sleep in Drosophila melanogaster”. In: Genetics 205.4 (Apr. 2017), pp. 1373–1397.
  • [25]. Koh Kyunghee et al. “A Drosophila model for age-associated changes in sleep: wake cycles”. In: Proceedings of the National Academy of Sciences 103.37 (2006), pp. 13843–13847.
  • [26]. Billheimer Dean, Guttorp Peter, and Fagan William F. “Statistical interpretation of species composition”. In: Journal of the American Statistical Association 96.456 (2001), pp. 1205–1214.
  • [27]. Theune Ulrich. Ternary Plots. MATLAB Central File Exchange. Retrieved August 5, 2023. 2005. URL: https://www.mathworks.com/matlabcentral/fileexchange/7210-ternary-plots.
  • [28]. Mazzoni Esteban O, Desplan Claude, and Blau Justin. “Circadian pacemaker neurons transmit and modulate visual information to control a rapid behavioral response”. In: Neuron 45.2 (2005), pp. 293–300.
  • [29]. Pittendrigh Colin S. “On temperature independence in the clock system controlling emergence time in Drosophila”. In: Proceedings of the National Academy of Sciences 40.10 (1954), pp. 1018–1029.
  • [30]. Qiao Bing et al. “Automated analysis of long-term grooming behavior in Drosophila using a k-nearest neighbors classifier”. In: Elife 7 (2018), e34497.
  • [31]. Seeds Andrew M et al. “A suppression hierarchy among competing motor programs drives sequential grooming in Drosophila”. In: Elife 3 (2014), e02951.
  • [32]. Wong Richard et al. “Quantification of food intake in Drosophila”. In: PLoS One 4.6 (2009), e6063.
  • [33]. Xu Kanyan, Zheng Xiangzhong, and Sehgal Amita. “Regulation of feeding and metabolism by neuronal and peripheral clocks in Drosophila”. In: Cell Metab. 8.4 (Oct. 2008), pp. 289–300.
  • [34]. Ro Jennifer, Harvanek Zachary M, and Pletcher Scott D. “FLIC: high-throughput, continuous analysis of feeding behaviors in Drosophila”. In: PLoS One 9.6 (2014), e101107.
  • [35]. Soibam Benjamin et al. “Modeling Drosophila positional preferences in open field arenas with directional persistence and wall attraction”. In: PLoS One 7.10 (Oct. 2012), e46570.
  • [36]. Liu Lingzhi, Davis Ronald L, and Roman Gregg. “Exploratory activity in Drosophila requires the kurtz nonvisual arrestin”. In: Genetics 175.3 (Mar. 2007), pp. 1197–1212.
  • [37]. Driver CJI and Lamb Marion J. “Metabolic changes in ageing Drosophila melanogaster”. In: Experimental Gerontology 15.3 (1980), pp. 167–175.
  • [38]. Le Bourg E. “The rate of living theory. Spontaneous locomotor activity, aging and longevity in Drosophila melanogaster”. In: Exp. Gerontol. 22.5 (1987), pp. 359–369.
  • [39]. Fernandez JR et al. “Differences in locomotor activity across the lifespan of Drosophila melanogaster”. In: Experimental Gerontology 34.5 (1999), pp. 621–631.
  • [40]. Berman Gordon J, Bialek William, and Shaevitz Joshua W. “Predictability and hierarchy in Drosophila behavior”. In: Proc. Natl. Acad. Sci. U. S. A. 113.42 (Oct. 2016), pp. 11943–11948.
  • [41]. Bialek William and Shaevitz Joshua W. “Long time scales, individual differences, and scale invariance in animal behavior”. In: (Apr. 2023). arXiv: 2304.09608 [q-bio.NC].
  • [42]. Sayeed Omer and Benzer Seymour. “Behavioral genetics of thermosensation and hygrosensation in Drosophila”. In: Proceedings of the National Academy of Sciences 93.12 (1996), pp. 6079–6084.
  • [43]. Soto-Padilla Andrea et al. “Thermosensory perception regulates speed of movement in response to temperature changes in Drosophila melanogaster”. In: J. Exp. Biol. 221.Pt 10 (May 2018).
