eLife. 2025 Jul 28;13:RP99767. doi: 10.7554/eLife.99767

Optimal information gain at the onset of habituation to repeated stimuli

Giorgio Nicoletti 1,2,3, Matteo Bruzzone 4,5,6, Samir Suweis 3,5, Marco dal Maschio 4,5, Daniel Maria Busiello 3,7
Editors: Arvind Murugan8, Aleksandra M Walczak9
PMCID: PMC12303571  PMID: 40720501

Abstract

Biological and living systems process information across spatiotemporal scales, exhibiting the hallmark ability to constantly modulate their behavior in response to ever-changing and complex environments. In the presence of repeated stimuli, a distinctive response is the progressive reduction of the activity at both sensory and molecular levels, known as habituation. In this work, we solve a minimal microscopic model devoid of biological details, where habituation to an external signal is driven by negative feedback provided by a slow storage mechanism. We show that our model recapitulates the main features of habituation, such as spontaneous recovery, potentiation, subliminal accumulation, and input sensitivity. Crucially, our approach enables a complete characterization of the stochastic dynamics, allowing us to compute how much information the system encodes on the input signal. We find that an intermediate level of habituation is associated with a steep increase in information. In particular, we are able to characterize this region of maximal information gain in terms of an optimal trade-off between information and energy consumption. We test our dynamical predictions against experimentally recorded neural responses in a zebrafish larva subjected to repeated looming stimulations, showing that our model captures the main components of the observed neural habituation. Our work takes a fundamental step towards uncovering the functional mechanisms that shape habituation in biological systems from an information-theoretic and thermodynamic perspective.

Research organism: Zebrafish

Introduction

Sensing mechanisms in biological systems span a wide range of temporal and spatial scales, from cellular to multi-cellular level, forming the basis for decision-making and the optimization of limited resources (Tkačik and Bialek, 2016; Azeloglu and Iyengar, 2015; Gnesotto et al., 2018; Whiteley et al., 2017; Perkins and Swain, 2009). Emergent macroscopic phenomena such as adaptation and habituation reflect the ability of living systems to effectively process the information they collect from their noisy environment (Nemenman, 2012; Nakajima, 2015; Koshland et al., 1982). Prominent examples include the modulation of flagellar motion operated by bacteria according to changes in the local nutrient concentration (Tu et al., 2008; Tu, 2008; Mattingly et al., 2021), the regulation of immune responses through feedback mechanisms (Cheong et al., 2011; Wajant et al., 2003), the progressive reduction of neural activity in response to repeated looming stimulation (Marquez-Legorreta et al., 2022; Fotowat and Engert, 2023), and the maintenance of high sensitivity in varying environments for olfactory or visual sensing in mammalian neurons (Lan et al., 2012; Menini, 1999; Kohn, 2007; Lesica et al., 2007; Benucci et al., 2013).

In the last decade, advances in experimental techniques fostered the quest for the core biochemical mechanisms governing information processing. Simultaneous recordings of hundreds of biological signals made it possible to infer distinctive features directly from data (Schneidman et al., 2006; Tkačik et al., 2014; Kurtz et al., 2015; Tunstrøm et al., 2013). However, many of these approaches fall short of describing the connection between observed behaviors and underlying microscopic drivers (Nicoletti and Busiello, 2021; Nicoletti and Busiello, 2022a; De Smet and Marchal, 2010; Nicoletti et al., 2022b). To fill this gap, several works focused on the architecture of specific signaling networks, from tumor necrosis factor (Cheong et al., 2011; Wajant et al., 2003) to chemotaxis (Tu et al., 2008; Celani et al., 2011), highlighting the essential structural ingredients for their efficient functioning. An observation shared by most of these studies is the key role of a negative feedback mechanism to induce emergent adaptive responses (Kollmann et al., 2005; De Ronde et al., 2010; Selimkhanov et al., 2014; Barkai and Leibler, 1997). Moreover, any information-processing system, biological or not, must obey information-thermodynamic laws that prescribe the necessity of a storage mechanism (Parrondo et al., 2015). This is an unavoidable feature highlighted in numerous chemical signaling networks (Tu et al., 2008; Kollmann et al., 2005) and biochemical realizations of Maxwell Demons (Flatt et al., 2023; Bilancioni et al., 2023). As the storage of information during processing generally requires energy (Bennett, 1982; Sagawa and Ueda, 2009), sensing mechanisms have to take place out of equilibrium (Gnesotto et al., 2018; Hartich et al., 2015; Skoge et al., 2013; Lestas et al., 2010). Recently, the discovery of memory molecules (Coultrap and Bayer, 2012; Frankland and Josselyn, 2016; Lisman et al., 2002) hinted at the possibility that storing mechanisms might be instantiated directly at the molecular scale. Overall, negative feedback, storage, and out-of-equilibrium conditions seem to be necessary requirements for a system to process environmental information and act accordingly. To quantify the performance of a biological information-processing system, theoretical developments made substantial progress in highlighting thermodynamic limitations and advantages (Sartori et al., 2014; Barato et al., 2014; Lan et al., 2012), taking a step towards linking information and dissipation from a molecular perspective (Ouldridge et al., 2017; Flatt et al., 2023; Penocchio et al., 2022).

Here, we consider an archetypal yet minimal model for sensing that is inspired by biological networks (Lan et al., 2012; Tadres et al., 2022; Ma et al., 2009) and encapsulates all these key ingredients, that is, negative feedback, storage, and energy dissipation, and study its response to repeated stimuli. Indeed, in the presence of dynamic environments, it is common for a biological system to keep encountering the same stimulus. Under these conditions, a progressive decay in the amplitude of the response is often observed, both at sensory and molecular levels. In general terms, such adaptive behavior is usually named habituation and is a common phenomenon recorded in various systems, from biochemical networks (Rahi et al., 2017; Tadres et al., 2022; Jalaal et al., 2020) to populations of neurons (Malmierca et al., 2014; Shew et al., 2015; Marquez-Legorreta et al., 2022; Fotowat and Engert, 2023). In particular, habituation characterizes many neuronal circuits along the sensory-motor processing pathways of most living organisms, whether invertebrates or vertebrates (Malmierca et al., 2014; Shew et al., 2015), where inhibitory feedback mechanisms are believed to modulate the stimulus weight (Lamiré et al., 2022; Fotowat and Engert, 2023; Barzon et al., 2025). Most importantly, the first complete characterization of habituating phenomena dates back to 1966, when the defining hallmarks of habituation in vertebrate animals were identified (Thompson and Spencer, 1966). Despite its widespread occurrence across remarkably different scales, the connection between habituation in the animal kingdom and brainless molecular systems has only recently attracted considerable attention. A limited number of dynamical models have been proposed to explore the similarities and differences between the manifestations of these two fundamentally distinct phenomena (Eckert et al., 2024; Smart et al., 2024). However, dynamical characterizations of habituation still lack a clear identification of the functional role of habituation in regulating information flow, optimal processing, and sensitivity calibration (Benda, 2021), and in controlling behavior and prediction during complex tasks (Bueti et al., 2010; Sederberg et al., 2018; Palmer et al., 2015).

In this work, we explicitly compute the information shared between readout molecules and the external stimulus over time. We find that the information gain peaks at intermediate levels of habituation, uncovering that optimal processing performances are necessarily tangled with maximal activity reduction. This region of optimal information gain can be retrieved by simultaneously minimizing dissipation and maximizing information in the presence of a prolonged stimulation, hinting at an a priori optimality condition for the operations of biological systems. Our results unveil the role of habituation in enhancing processing abilities and open the way to understanding the emergence of basic learning mechanisms in simple molecular scenarios.

Results

Archetypal model for sensing in biological systems

Several minimal models for adaptation are composed of three building blocks (Ma et al., 2009; Tadres et al., 2022; Tu et al., 2008; Celani et al., 2011; Rahi et al., 2017): one responsible for buffering the input signal; one representing the output; and one usually reminiscent of an internal memory. Here, we start with an analogous archetypal architecture. The three building blocks (or units) are represented by a receptor (R), and readout (U) and storage (S) populations.

To introduce our model in general terms, we consider a time-varying environment H, representing an external signal characterized by a probability pH(h,t) of being equal to h at time t. This input signal is read by the receptor unit R. The receptor can be either active (A), taking the value r=1, or passive (P), r=0, with these two states separated by an energetic barrier ΔE. The transitions between passive and active states can happen through two different pathways, a ‘sensing’ reaction path (superscript H) that is stimulated by the external signal h, and an ‘internal’ path (superscript I) that mediates the effect of the negative feedback from the storage unit (see Figure 1a). We further assume, for simplicity, that the rates follow an effective Arrhenius’ law:

$$\Gamma^{(H)}_{P\to A} = e^{\beta(h-\Delta E)}\,\Gamma^{(H)}_R \qquad \Gamma^{(H)}_{A\to P} = \Gamma^{(H)}_R$$
$$\Gamma^{(I)}_{P\to A} = e^{-\beta\Delta E}\,\Gamma^{(I)}_R \qquad \Gamma^{(I)}_{A\to P} = \Gamma^{(I)}_R\, e^{\beta\kappa\sigma s/N_S} \qquad (1)$$

Figure 1. Sketch of the model architecture and biological examples at different scales.

Figure 1.

(a) A receptor R transitions between an active (A) and passive (P) state along two pathways, one used for sensing (red) and affected by the environment h, and the other (blue) modified by the energy of storage molecules, σs, tuned by inhibition strength κ and storage capacity N_S. Here, β = (k_BT)^{-1} encodes the inverse temperature. An active receptor increases the response of a readout population U (orange), which in turn stimulates the production of storage units S (green) that provide negative feedback to the receptor. (b) In the chemical network underlying chemotactic response, we can identify a similar architecture. The input ligand binds to membrane receptors, decreasing kinase activity and producing phosphate groups whose concentration regulates the receptor methylation level. (c) Similarly, in olfactory sensing, odorant binding induces the activation of adenylyl cyclase (AC). AC stimulates a calcium flux, eventually producing phosphorylated calmodulin kinase II (CaMKII), which phosphorylates and deactivates AC. (d) In neural response, multiple mechanisms take place at different scales. In zebrafish larvae, visual stimulation is projected along the visual stream from the retina to the cortex, a coarse-grained realization of the R-U dynamics. Neural habituation emerges upon repeated stimulation, as measured by calcium fluorescence signals (dF/F0) and by the corresponding two-dimensional PCA of the activity profiles.

where the input is modeled as an additional thermodynamic driving βh, and Γ_R^(H) = gΓ_R^(I) = 1/τ_R sets the timescale of the receptor. In particular, g represents the ratio between the timescales of the two pathways, and the inverse temperature β = (k_BT)^{-1} encodes the role of the thermal noise, as lower values of β are associated with faster reactions.

The negative feedback depends on the energy provided by the storage, σs, where s is the number of active storage molecules. The parameter κ represents the strength of the inhibition, and N_S is the storage capacity. For ease of interpretation, we assume that the activation rate of the receptor due to a reference signal H_ref is balanced by the deactivation rate provided by the feedback when, on average, a fraction α = ⟨S⟩/N_S of the storage population is active:

$$\log\frac{\Gamma^{(H)}_{P\to A}}{\Gamma^{(I)}_{A\to P}} = \beta g\left(H_{\mathrm{ref}} - \kappa\sigma\alpha\right) = 0 \;\;\Longrightarrow\;\; \kappa = \frac{H_{\mathrm{ref}}}{\alpha\sigma}. \qquad (2)$$

This condition sets the inhibition strength by choosing the inhibiting fraction α. At this stage, the reference signal represents the typical environmental stimulus to which the system is exposed. This choice rationalizes the physical meaning of the model parameters, but it does not alter the phenomenology of the system. Crucially, the presence of two different transition pathways, motivated by molecular considerations and pivotal in many energy-consuming biochemical systems (De Los Rios and Barducci, 2014; Astumian, 2019; Flatt et al., 2023), creates an internal non-equilibrium cycle in receptor dynamics. Without the storage population, the internal pathway would not be present and the receptor would satisfy an effective detailed balance.
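To make the receptor block concrete, the rates of Equation 1 and the inhibition strength fixed by Equation 2 translate directly into code. Below is a minimal Python sketch; all numerical values are illustrative placeholders, not the parameters used in the figures.

```python
import numpy as np

# Illustrative placeholder parameters (not the values used in the figures)
beta, dE, g = 3.0, 1.0, 1.0        # inverse temperature, barrier, pathway timescale ratio
sigma, N_S, alpha = 0.6, 30, 2/3   # storage cost, storage capacity, inhibiting fraction
H_ref = 10.0
kappa = H_ref / (alpha * sigma)    # inhibition strength fixed by Equation 2

tau_R = 1.0
G_R_H = 1.0 / tau_R                # sensing pathway: Gamma_R^(H) = g * Gamma_R^(I) = 1/tau_R
G_R_I = G_R_H / g                  # internal pathway

def receptor_rates(h, s):
    """Transition rates of Equation 1 for signal h and s active storage units."""
    PA_H = np.exp(beta * (h - dE)) * G_R_H                 # P -> A, sensing pathway
    AP_H = G_R_H                                           # A -> P, sensing pathway
    PA_I = np.exp(-beta * dE) * G_R_I                      # P -> A, internal pathway
    AP_I = G_R_I * np.exp(beta * kappa * sigma * s / N_S)  # A -> P, storage feedback
    return PA_H, AP_H, PA_I, AP_I
```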

Whenever active, the receptor drives the production of readout population U, which represents the direct response of the system to environmental signals. As such, it is the observable characterizing habituation (see Figure 1a). We model its dynamics with a controlled stochastic birth-and-death process (Yan et al., 2019; Hilfinger et al., 2016; Nicoletti and Busiello, 2024a):

$$\varnothing \xrightarrow{\;\Gamma_{u\to u+1}(r)\;} U \qquad U \xrightarrow{\;\Gamma_{u+1\to u}\;} \varnothing$$
$$\Gamma_{u\to u+1} = e^{-\beta(V-cr)}\,\Gamma_U^0 \qquad \Gamma_{u+1\to u} = (u+1)\,\Gamma_U^0 \qquad (3)$$

where u denotes the number of molecules, Γ_U^0 = 1/τ_U sets the timescale of readout production, and V is the energy needed to produce a readout unit. When the receptor is active, r=1, this energetic cost is reduced by an effective additional driving βc. Active receptors transduce the environmental energy into an active pumping in the readout unit, allowing the readout population to encode information on the external signal.

Finally, readout units stimulate the production of the storage population S. Its number of molecules s follows again a controlled birth-and-death process:

$$\Gamma_{s\to s+1}(u) = u\, e^{-\beta\sigma}\,\Gamma_S^0 \qquad \Gamma_{s+1\to s} = (s+1)\,\Gamma_S^0 \qquad (4)$$

where σ is the energetic cost of a storage molecule and Γ_S^0 = 1/τ_S sets the timescale. For simplicity, we assume that readout molecules can catalytically activate storage molecules from a passive pool (see Figure 1a). Storage units are responsible for encoding the response, playing the role of a finite-time memory.
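Because all three units are jump processes, the joint dynamics of Equations 1, 3, and 4 can be sampled exactly with a standard Gillespie algorithm. The sketch below is self-contained but simplified: g = 1, all parameter values are illustrative placeholders, and the exponentially distributed signal is crudely resampled at every reaction event.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative placeholders; note tau_U << tau_R << tau_S as assumed in the text
beta, dE, sigma, N_S, alpha, H_ref = 0.5, 1.0, 0.6, 30, 2/3, 10.0
kappa = H_ref / (alpha * sigma)
V, c = 0.0, 4.0                           # readout cost and receptor driving (hypothetical)
tau_U, tau_R, tau_S = 0.05, 0.5, 5.0

def propensities(r, u, s, h):
    """Total rates of Equations 1, 3, 4 (receptor pathways lumped per direction, g = 1)."""
    return np.array([
        (1 - r) * (np.exp(beta*(h - dE)) + np.exp(-beta*dE)) / tau_R,  # P -> A
        r * (1.0 + np.exp(beta*kappa*sigma*s/N_S)) / tau_R,            # A -> P
        np.exp(-beta*(V - c*r)) / tau_U,                               # u -> u+1
        u / tau_U,                                                     # u -> u-1
        u * np.exp(-beta*sigma) / tau_S if s < N_S else 0.0,           # s -> s+1 (reflecting)
        s / tau_S,                                                     # s -> s-1
    ])

updates = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]    # (dr, du, ds)

def gillespie(T, H_mean, r=0, u=0, s=0):
    """Exact stochastic simulation with a time-dependent mean signal H_mean(t)."""
    t, traj = 0.0, []
    while t < T:
        h = rng.exponential(H_mean(t))     # fresh exponential draw: uncorrelated signal
        a = propensities(r, u, s, h)
        t += rng.exponential(1.0 / a.sum())
        dr, du, ds = updates[rng.choice(6, p=a / a.sum())]
        r, u, s = r + dr, u + du, s + ds
        traj.append((t, r, u, s))
    return np.array(traj)

# Square-wave protocol: stimulus H_max = 10 for 100 time units, pause H_min = 0.1 for 100
traj = gillespie(1000.0, lambda t: 10.0 if (t % 200.0) < 100.0 else 0.1)
```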

Our architecture, being devoid of specific biological details, can be adapted to describe systems operating at very different scales (Figure 1b–d). However, we emphasize that the proposed model is intentionally oversimplified compared to realistic biochemical or neural systems, yet it contains the minimal ingredients for habituation to emerge naturally. As such, the examples shown in Figure 1b–d are meant solely to illustrate the core architecture. In particular, while receptors can be readily identified, the role of readout is played by photo-receptors or calcium concentration for olfactory or visual sensing mechanisms (Menini, 1999; Kohn, 2007; Lesica et al., 2007; Benucci et al., 2013; Benda, 2021; Marquez-Legorreta et al., 2022; Fotowat and Engert, 2023), while storage may represent different molecular mechanisms at a coarse-grained level as, for example, memory molecules sensitive to calcium activity (Coultrap and Bayer, 2012), synaptic depotentiation, and neural populations that regulate neuronal response (Marquez-Legorreta et al., 2022; Fotowat and Engert, 2023).

As a final remark, we expect from previous studies (Nicoletti and Busiello, 2024a) that the presence of multiple timescales in the system will be fundamental in shaping information between the different components. Thus, we employ the biologically plausible assumption that U undergoes the fastest evolution, while S and H are the slowest degrees of freedom (Celani et al., 2011; Ngampruetikorn et al., 2020). We have that τ_U ≪ τ_R ≪ τ_S ∼ τ_H, where τ_H is the timescale of the environment.

The hallmarks of habituation

Habituation occurs when, upon repeated presentation of the same stimulus, a progressive decrease of some response parameter to an asymptotic level is observed (Thompson and Spencer, 1966; Eckert et al., 2024). In our model, the response of the system is represented by the average number of active readout units, ⟨U⟩(t). This behavior resembles recent observations on habituation under analogous external conditions in various experimental systems (Rahi et al., 2017; Jalaal et al., 2020; Tadres et al., 2022; Marquez-Legorreta et al., 2022; Fotowat and Engert, 2023). However, habituation in its strict sense is not sufficient to encompass the diverse array of emergent features recorded in biological systems. In fact, several other hallmarks are closely associated with habituating behavior (Thompson and Spencer, 1966; Smart et al., 2024; Eckert et al., 2024):

  1. Potentiation of habituation — After a train of stimulations and a subsequent short pause, the response decrement becomes more rapid and/or more pronounced.

  2. Spontaneous recovery — If, after response decrement, the stimulus is suppressed for a sufficiently long time, the response recovers at least partially at subsequent stimulations.

  3. Subliminal accumulation — The effect of stimulation may continue to accumulate after the habituation level is reached, thus delaying the onset of spontaneous recovery.

  4. Intensity sensitivity — Other conditions being fixed, the less intense the stimulus, the more rapid and/or pronounced the response decrease.

  5. Frequency sensitivity — Other conditions being fixed, more frequent stimulation results in a more rapid and/or more pronounced response decrease.

These hallmarks were originally proposed from observations of vertebrate animals, but they are not the sole properties characterizing the most general definition of habituation. However, the list above encompasses the features that can be obtained from a single stimulation, as in our case, and without any ambiguity in the interpretation (for a detailed discussion, we refer to Thompson and Spencer, 1966; Eckert et al., 2024).

To explore the ability of the proposed archetypal model to capture the aforementioned hallmarks, we consider the simple case of an exponential input distribution, p_H(h,t) ∝ exp[−h/H(t)], with uncorrelated signals, that is ⟨h(t)h(t′)⟩ = H(t)H(t′). The time-dependent average H periodically switches between two values, H_min and H_max, corresponding to a (non-zero) background signal and a (strong) stimulation of the receptor, respectively. The system dynamics is governed by four different operators, Ŵ_X, with X = R, U, S, H, one for each unit and one for the environment. The resulting master equation is:

$$\partial_t P = \left[\frac{\hat W_R(s,h)}{\tau_R} + \frac{\hat W_U(r)}{\tau_U} + \frac{\hat W_S(u)}{\tau_S} + \frac{\hat W_H}{\tau_H}\right] P, \qquad (5)$$

where P denotes, in general, the joint propagator P(u,r,s,h,t|u_0,r_0,s_0,h_0,t_0), with u_0, r_0, s_0, and h_0 the initial conditions at time t_0. By taking advantage of the timescale separation, we can write an exact self-consistent solution to Equation 5 at all times t (see Materials and methods and Supplementary Information).

In Figure 2a, we show that the system exhibits habituation in its strict sense. Here, for simplicity, we consider a train of signals arriving at times t_1, …, t_N, each lasting a time T_s, with equal pauses of duration ΔT between them. We define the time to habituate, t^(hab), as the first time at which the relative change of our observable, ⟨U⟩(t), is less than 0.5%, in analogy to Eckert et al., 2024. Clearly, t^(hab) is associated with a number of stimuli necessary to habituate, n^(hab), i.e.,

$$\frac{\langle U \rangle(t_{n^{(\mathrm{hab})}-1}) - \langle U \rangle(t_{n^{(\mathrm{hab})}} \equiv t^{(\mathrm{hab})})}{\langle U \rangle(t_{n^{(\mathrm{hab})}})} \leq 0.005 \qquad (6)$$

Figure 2. Hallmarks of habituation.

Figure 2.

(a) An external signal switches between two values, H_min = 0.1 (background) and H_max = H_ref = 10 (stimulus). The inter-stimuli interval is ΔT = 100 (a.u.) and the duration of each stimulus is T_s = 100 (a.u.). The average readout population (black) follows the stimulation, increasing when the stimulus is presented. The response decreases upon repeated stimulation, signaling the presence of habituation. Conversely, the average storage population (gray) increases over time. The black dashed line represents the time to habituate t^(hab) (Equation 6). (b) If the stimulus is paused and presented again after a short time, the system habituates more rapidly, that is, the number of stimulations to habituate n^(hab) is reduced. (c) After waiting a sufficiently long time, the response can be fully recovered. (d) If the stimulation continues beyond habituation, the time to recover the response t^(recovery) (Equation 7) increases by an amount δt (in red). (e) The relative decrement of the average readout with respect to the initial response, ⟨U⟩^(in), shows that habituation becomes less and less pronounced as we increase H_max. (f) As expected, the initial response increases with H_max. (g) The relative difference between ⟨U⟩(t^(hab)) and ⟨U⟩^(in), ΔU, decreases with the stimulus strength. (h) By changing ΔT and keeping the stimulus duration T_s fixed, we observe that more pronounced and more rapid response decrements are associated with more frequent stimulation. Parameters are reported in the Methods, and these hallmarks are qualitatively independent of their specific choice.

Our results do not qualitatively change when choosing a different threshold. Hallmark 1, potentiation of habituation, corresponds to a reduction of n(hab) after one series of stimulation and recovery. This implies a more rapid decrement in the response and a shorter time to achieve habituation, as we show in Figure 2b. Analogously, hallmark 2 is presented in Figure 2c, where we show that by suppressing the stimulus for a sufficiently long amount of time, the response spontaneously recovers to the pre-habituation level. Furthermore, by stimulating the system beyond t(hab), we also observe an increase in the amount of time to achieve complete recovery (hallmark 3). We define this recovery period t(recovery) as the first time required to have a response with a relative strength not greater than 1% with respect to the one at the first stimulus, that is

$$\frac{\langle U \rangle(t_1) - \langle U \rangle(t^{(\mathrm{recovery})})}{\langle U \rangle(t_1)} \leq 0.01. \qquad (7)$$

In Figure 2d, we show that the recovery period increases by 5% as a consequence of this subliminal accumulation.
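The thresholds of Equations 6 and 7 are simple to implement once the per-stimulus responses have been extracted from the dynamics. A minimal sketch, with function and variable names of our own choosing:

```python
def stimuli_to_habituate(peaks, thr=0.005):
    """First stimulus index n at which the relative response decrement
    between consecutive stimuli falls below thr (Equation 6).
    peaks[n] is the response <U>(t_n) to the (n+1)-th stimulus."""
    for n in range(1, len(peaks)):
        if (peaks[n - 1] - peaks[n]) / peaks[n] <= thr:
            return n + 1          # stimuli counted from 1
    return None                   # no habituation within the train

def has_recovered(first_peak, probe_peak, thr=0.01):
    """Recovery criterion of Equation 7: the probe response after a pause
    is within thr of the response to the very first stimulus."""
    return abs(first_peak - probe_peak) / first_peak <= thr
```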

Within the same setting, in Figure 2e–g we applied stimuli of different strengths H_max to study the sensitivity to input intensity (hallmark 4). When normalized by the initial response ⟨U⟩^(in) ≡ ⟨U⟩(t_1), less intense stimuli result in stronger response decrements (see Figure 2e). At the same time, as expected, the absolute value of the initial response increases instead (see Figure 2f). Hallmark 4 is clearly captured by Figure 2g, where we quantify the decrease of the normalized total habituation level, ΔU = ⟨U⟩(t^(hab)) − ⟨U⟩^(in), when exposed to increasing H_max. The last feature (hallmark 5) is reported in Figure 2h, where we keep the duration of the stimulus T_s fixed while changing the inter-stimuli interval ΔT. By showing the responses up to the habituation time, we clearly notice that more frequent stimulation is associated with a more rapid and more pronounced response decrement.

Summarizing, despite its simplicity and lack of biological details, our model encompasses the minimal ingredients to capture the main hallmarks defining habituation.

Information from habituation

In our architecture, habituation emerges due to the increase in the storage population, which provides increasing negative feedback to the receptor and thus lowers the number of active readout units ⟨U⟩(t). Crucially, by solving the master equation in Equation 5, we can also study the evolution of the full probability distribution p_{U,S,H}(t). This approach allows us to quantify how the system encodes information on the environment H through its readout population and how it changes during habituation. To this end, we introduce the mutual information between U and H at time t (see Materials and methods):

$$I_{U,H}(t) = \mathcal{H}[p_U](t) - \int_0^\infty dh\, p_H(h,t)\, \mathcal{H}[p_{U|H}](t) \qquad (8)$$

where 𝓗[p](t) is the Shannon entropy of the probability distribution p, and p_{U|H} denotes the conditional probability distribution of U given H. The mutual information measures information in terms of statistical dependencies, that is, of how factorizable the joint probability distribution p_{U,H} is. It vanishes if and only if U and H are independent. Notably, the mutual information coincides with the entropy reduction of the readout distribution:

$$k_B I_{U,H} = -k_B\left(\mathcal{H}[p_{U|H}] - \mathcal{H}[p_U]\right) = -\Delta S_U \qquad (9)$$

where ΔSU is the change in entropy of the readout population due to repeated measurements of the signal (Parrondo et al., 2015).
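Numerically, Equation 8 reduces to a double sum once the signal is discretized on a grid. A minimal sketch, assuming the joint distribution p_{U,H} has been assembled as an array:

```python
import numpy as np

def mutual_information(p_uh):
    """I(U;H) in nats from a joint array p_uh[u, h] on a discretized signal
    grid (Equation 8 with the integral over h replaced by a sum)."""
    p_uh = p_uh / p_uh.sum()                 # normalize the joint distribution
    p_u = p_uh.sum(axis=1, keepdims=True)    # marginal over the signal
    p_h = p_uh.sum(axis=0, keepdims=True)    # marginal over the readout
    mask = p_uh > 0                          # avoid log(0) on empty bins
    return float((p_uh[mask] * np.log(p_uh[mask] / (p_u * p_h)[mask])).sum())
```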

As in the previous section, we consider a switching signal with H_max = H_ref, the typical environmental stimulus strength. In Figure 3a–b, we plot the mutual information at the first signal, I_{U,H}^(in), and when the system has habituated, I_{U,H}^(hab), as a function of β and σ. Crucially, we find that there exist parameters for which I_{U,H}^(hab) is larger than I_{U,H}^(in). This result suggests that the information on H encoded by U in the habituated system is larger than the initial one. We can quantify this effect by introducing the mutual information gain

$$\Delta I_{U,H} = I^{(\mathrm{hab})}_{U,H} - I^{(\mathrm{in})}_{U,H}. \qquad (10)$$

Figure 3. Information and thermodynamics of the model during repeated external stimulation, as a function of the inverse temperature β and the energetic cost of storage σ.

Figure 3.

(a–b) The mutual information between readout population and external signal at the first stimulus, IU,H(in), is typically lower than the one when the system has habituated, IU,H(hab). (c) The change in the mutual information, ΔIU,H, displays a peak in a region of the (β,σ) space, where the system exhibits optimal information gain during habituation. (d) This region corresponds to intermediate habituation strength, as measured by ΔU. (e) The corresponding increase in the feedback information ΔIf indicates that storage is fostering the gain in ΔIU,H. (f) Habituation promotes a decrease of the internal energy flux ΔJint, suggesting a synergistic energetic advantage of habituation. (g–h) From the dynamical point of view, in the region of maximal information gain (β=3, σ=0.6) the average number of readout units, U, decreases over time, while the average storage population, S, increases. (i–j) Similarly, both the information encoded on H by the readout, IU,H, and the feedback information, ΔIf, increase upon repeated stimulations. (k) The absolute value of the internal energy flux, |Jint|, decreases upon stimulations, while increasing for repeated pauses when the system moves downhill in energy. Model parameters are as specified in the Methods, Hmin=0.1, and Hmax=Href=10.

In Figure 3c, we show that ΔIU,H displays a peak in an intermediate region of the (β,σ) plane. In this region, the corresponding habituation strength

$$\Delta U = \langle U \rangle^{(\mathrm{hab})} - \langle U \rangle^{(\mathrm{in})} \qquad (11)$$

attains intermediate values, suggesting that too strong habituation can be detrimental (Figure 3d). This behavior is tightly related to the presence of the storage S, which acts as an information reservoir for the system. To rationalize this feature, we introduce the feedback information

$$\Delta I_f = I_{(U,S),H} - I_{U,H} > 0 \qquad (12)$$

quantifying how much the simultaneous knowledge of U and S increases the information compared to U alone. Indeed, the change in feedback information after habituation, ΔΔI_f = ΔI_f^(hab) − ΔI_f^(in), peaks in the same region as ΔI_{U,H} (Figure 3e).

For small σ, we find that ΔΔI_f may become negative, indicating that too strong a storage production may ultimately impede the information-theoretic performance of the system. Moreover, producing storage molecules requires energy. We can compute the internal energy flux associated with the storage of information through S as

$$J_{\mathrm{int}} = \sigma \sum_{u,s}\left[\Gamma_{s\to s+1}\, p_{U,S}(u,s,t) - \Gamma_{s+1\to s}\, p_{U,S}(u,s+1,t)\right], \qquad (13)$$

which is the total energy flux needed to produce the internal populations (U and S), since U always reaches equilibrium, being the fastest species at play. Its change during habituation is defined as ΔJ_int = J_int^(hab) − J_int^(in). In Figure 3f, we show that ΔJ_int is typically smaller than zero, hinting at a synergistic thermodynamic advantage of habituation.
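In practice, Equation 13 is a net probability current along the storage direction, weighted by σ. A minimal sketch, assuming the joint distribution p_{U,S} is available as an array:

```python
import numpy as np

def internal_energy_flux(p_us, beta, sigma, G_S0=1.0):
    """Internal energy flux of Equation 13 from p_us[u, s] and the storage
    rates of Equation 4 (reflecting boundary at the largest s)."""
    n_u, n_s = p_us.shape
    u = np.arange(n_u)[:, None]
    s = np.arange(n_s)[None, :]
    birth = u * np.exp(-beta * sigma) * G_S0 * p_us   # probability flux of s -> s+1
    death = s * G_S0 * p_us                           # probability flux of s -> s-1
    # net current from s to s+1, summed over u and over s = 0 .. n_s - 2
    J = (birth[:, :-1] - death[:, 1:]).sum()
    return sigma * J
```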

In Figure 3g–k, we show the evolution of the system for values of (β,σ) that lie in the region of maximal information gain. The readout activity decreases in time (Figure 3g), due to the habituation driven by the increase of S (Figure 3h). In this region, both IU,H and ΔIf increase over time (Figure 3i–j). We note that the increase in IU,H is concomitant to a reduction of the population that is encoding the signal. Although this may seem surprising, we stress that the mean of U is not directly related to the factorizability of the joint distribution pU,H. Finally, in Figure 3k, we show that the absolute value of the internal energy flux |Jint| in the presence of the stimulus sharply decreases as well, while increasing during its pauses (the value of Jint is negative in the presence of the background signal since the system is moving downhill in energy). This behavior is due to the interplay between storage and readout populations during habituation and signals the fact that the system requires progressively less energy to respond as time passes, while also moving less downhill in energy when the stimulus is paused. This observation suggests that the regime of maximal information gain supports habituation with a concurrent energetic advantage.

The onset of habituation and its functional role

As habituation, information, and their energetic cost appear to be tightly related, we now investigate whether the region of maximal information gain can be retrieved by means of an a priori optimization principle. To do so, we first focus on the case of a constant environment. We assume that the system can tune its internal parameters to optimally respond to the statistics of a prolonged external signal. Thus, we consider a fixed input statistics given by p_H^st(h) ∝ exp[−h/H^st], with H^st the average signal strength.

When the system reaches its steady state, we compute the information that the readout has on the signal, I_{U,H}^st (Figure 4a), and the total energy consumption. To this end, we must take into account two terms. First, the energy flux in Equation 13 represents the rate of change in energy due to the driven storage production. The energy consumption associated with this process per unit energy is E_int^st = τ_S J_int^st/σ. Second, the inhibition pathway is also driving the receptor out of equilibrium, leading to a dissipation per unit temperature given by

$$\delta Q_R = \log\left(\frac{\Gamma^{(H)}_{P\to A}\,\Gamma^{(I)}_{A\to P}}{\Gamma^{(H)}_{A\to P}\,\Gamma^{(I)}_{P\to A}}\right) = \beta\left(H^{\mathrm{st}} + \kappa\sigma\frac{\langle S \rangle}{N_S}\right). \qquad (14)$$

Figure 4. Optimality at the onset of habituation and dependence on the external signal strength.

Figure 4.

(a) Contour plots in the (β,σ) plane of the stationary mutual information I_{U,H}^st and the total dissipation of the system per unit energy, δQ_R^st + E_int^st, in the presence of a constant signal H = H_ref = 10. For a given value of β, the system can tune σ to the Pareto front (black line) to simultaneously minimize energy consumption and maximize information. Below the front, the system exploits the available energy suboptimally, while the region above the front is physically inaccessible. (b) In the presence of a dynamical input switching between H_min = 0.1 and H_max = H_ref, the parameters defining the optimal front capture the region of maximal information gain corresponding to the onset of habituation, where ΔU starts to be significantly smaller than zero. The gray area enclosed by the dashed vertical lines indicates the location of the Pareto front for values of β ∈ [3, 3.5]. (c) The Pareto front depends on the strength of the external signal H_max. In particular, for H_max < H_ref, at fixed β a larger storage cost σ is needed. Conversely, for H_max > H_ref, an optimal system can harvest more information by producing more storage, thus exhibiting a smaller σ. (d) If we allow the system to adapt its inhibition strength κ to the stimulus (Equation 16), the Pareto fronts for different external signals collapse into a single optimal curve. Model parameters are specified in the Materials and methods.

We plot the total energy consumption per unit energy E_tot^st = δQ_R^st + E_int^st in Figure 4a. In order to understand how the system may achieve large values of mutual information while minimizing its intrinsic dissipation, we can maximize the Pareto functional (Seoane and Solé, 2015; Nicoletti and Busiello, 2024b):

$$\mathcal{L}(\beta,\sigma) = \gamma\, I_{U,H}^{\mathrm{st}}(\beta,\sigma) - (1-\gamma)\, E_{\mathrm{tot}}^{\mathrm{st}}(\beta,\sigma) \qquad (15)$$

where γ ∈ [0,1] sets the strategy implemented by the system. If γ ≪ 1, the system prioritizes minimizing dissipation, whereas if γ ≈ 1 it acts to preferentially maximize information. The set of (β,σ) that maximize Equation 15 defines a Pareto optimal front in the (E_tot^st, I_{U,H}^st) space (Figure 4a). At fixed energy consumption, this front represents the maximum information between the readout and the external input that can be achieved. The region below the front is therefore suboptimal. Instead, the points above the front are inaccessible, as higher values of I_{U,H}^st cannot be attained without increasing E_tot^st. We note that, since β usually cannot be directly controlled by the system, the Pareto front indicates the optimal σ to which the system tunes at fixed β (see Materials and methods and Appendices for details).
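The front itself can be traced by sweeping γ and maximizing Equation 15 on a grid. A schematic sketch follows, in which random arrays stand in for the model's stationary information and energy-consumption surfaces; each value of γ selects one point, and collecting them over γ ∈ [0,1] traces the whole front.

```python
import numpy as np

betas = np.linspace(1.0, 5.0, 60)
sigmas = np.linspace(0.1, 2.0, 60)
# Placeholders: in practice these are I_{U,H}^st and E_tot^st = deltaQ_R^st + E_int^st
# evaluated from the stationary solution at every (beta, sigma) grid point.
I_st = np.random.rand(betas.size, sigmas.size)
E_st = np.random.rand(betas.size, sigmas.size)

front = []
for gamma in np.linspace(0.0, 1.0, 101):
    L = gamma * I_st - (1.0 - gamma) * E_st        # Pareto functional, Equation 15
    i, j = np.unravel_index(np.argmax(L), L.shape)
    front.append((betas[i], sigmas[j], E_st[i, j], I_st[i, j]))
front = np.array(front)   # optimal (beta, sigma, E, I) for each strategy gamma
```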

We now consider once more a system receiving a dynamically switching signal with Hmax=Hst. We first focus on the case Href=Hst, with Href the reference signal appearing in Equation 2. Remarkably, we find that the Pareto optimal front in the (β,σ) plane qualitatively corresponds to the region of maximal information gain, as we show in Figure 4b. This implies that a system that has tuned its internal parameters to respond to a constant signal also learns how to respond optimally to the time-varying input of the same strength, in terms of information gain. Since the region identified by the front leads to intermediate values of ΔU, it corresponds to the ‘onset of habituation’, where the system decreases its response enough to reduce the energy dissipation while storing information to increase IU,H. Heuristically, the onset of habituation emerges spontaneously when the system attempts to activate its receptor as little as possible, while producing the minimum amount of storage molecules retaining enough information about the external environment.

In Figure 4c, we then study what happens to the optimal front if H_max is larger or smaller than the reference signal. We find that, at low H_max, the Pareto front moves in such a way that a larger storage cost σ is needed at fixed β. This is expected since, at lower signal strengths, it is harder for the system to distinguish the input from the background thermal noise. Conversely, when H_max > H_ref, an optimal system needs to reduce σ to produce more storage and harvest information. Importantly, we find that if H_max remains close to H_ref, the optimal front remains close to the onset of habituation and thus lies within the region of maximal information gain.

However, we can achieve a collapse of the optimal front if we allow the system to tune the inhibition strength κ to the value of the external signal, that is

$$\kappa(H_{\max}) = \frac{H_{\max}}{\alpha\sigma}. \qquad (16)$$

In this way, a stronger input will correspond to a larger κ, and thus a stronger inhibition. In Figure 4d, we show that the Pareto fronts obtained with this choice collapse into a single curve. Crucially, this front still corresponds to the region of maximal information gain, although the specific values of ΔI_{U,H} naturally depend on H_max (see Supplementary Information). Thus, in this scenario, a system that is capable of adapting the negative feedback to its environment is also able to always tune itself to the onset of habituation, where its responses are optimal from an information-theoretic perspective, at different values of the external stimulus and without tinkering with the energy cost σ.

The role of information storage

The presence of a storage mechanism is fundamental in our model. Furthermore, its role in mediating the negative feedback is suggested by several experimental and theoretical observations (Celani et al., 2011; Tu et al., 2008; Kollmann et al., 2005; Barkai and Leibler, 1997; De Ronde et al., 2010; Selimkhanov et al., 2014). Whenever the storage is eliminated from our model, habituation cannot take place, highlighting its key role in driving the observed dynamics (see Supplementary Information).

In Figure 5a, we show that the degree of habituation, ΔU, and the change in the storage population, ΔS, are deeply related to one another. The more S relaxes between two consecutive signals, the less the readout population reduces its activity. This ascribes to the storage population the role of an effective memory and highlights its dynamical importance for habituation. Moreover, the dependence of the storage dynamics on the interval between consecutive signals, ΔT, influences information gain as well. Indeed, increasing ΔT, we observe a decrease of the mutual information (Figure 5b) on the next stimulus. In the Supplementary Information, we further analyze the impact of different signal and pause durations.

Figure 5. The role of memory in shaping habituation.

Figure 5.

(a) The system response depends on the waiting time ΔT between two external signals. As ΔT increases, the storage decays, and thus memory is lost (green). Consequently, the habituation of the readout population decreases (yellow). (b) As a consequence, the information I_{U,H} that the system has on the signal H when the new stimulus arrives decays as well. Model parameters for this figure are β = 2.5, σ = 0.5 in energy units, and as specified in the Materials and methods.

We remark here that the proposed model is fully Markovian in its microscopic components, and the memory that governs readout habituation spontaneously emerges from the interplay among the internal timescales. In particular, recent works have highlighted that the storage needs to evolve on a slower timescale, comparable to that of the external input, in order to generate information in the receptor and in the readout (Nicoletti and Busiello, 2024a). To strengthen our conclusions, we remark that an instantaneous negative feedback implemented directly by U (bypassing the storage mechanism) would lead to no time-dependent modulations of the readout and thus no habituation (see Supplementary Information). Similarly, a readout population evolving on a timescale comparable to that of the signal cannot effectively mediate the negative feedback on the receptor, since its population increase would not lead to habituation (see Supplementary Information). Thus, negative feedback has to be implemented by a separate degree of freedom evolving on a timescale that is slow and comparable to that of the external signal.

Minimal features of neural habituation

In neural systems, habituation is typically measured as a progressive reduction of the stimulus-driven neuronal firing rate (Malmierca et al., 2014; Shew et al., 2015; Benda, 2021; Marquez-Legorreta et al., 2022; Fotowat and Engert, 2023). To test whether our minimal model can capture typical neural habituation dynamics, we measured the response of zebrafish larvae to repeated looming stimulations via volumetric multiphoton imaging (Bruzzone et al., 2021). From a whole-brain recording of 55,000 neurons, we extracted a subpopulation of 2400 neurons in the optic tectum whose temporal activity profile is most correlated with the stimulation protocol (see Materials and methods).

Our model can be extended to qualitatively reproduce some features of the progressive decrease in neuronal response amplitudes. We identify a single readout unit with a subpopulation of binary neurons. Then, a fraction of neurons is randomly turned on each time the corresponding readout unit is activated (see Materials and methods). We tune the model parameters so that the total number of active neurons at the first stimulus is comparable to the experimental one. Moreover, we set the pause and signal durations in line with the typical timescales of the looming stimulation. We choose the model parameters β and σ in such a way that the system operates close to the peak of information gain, with an activity decrease over time that is comparable to the activity decrease in experimental data (see Supplementary Information). In this way, we can focus on the effects of storage and feedback mechanisms without modeling further biological details.

The patterns of the model-generated activity are remarkably similar to the experimental ones (see Figure 6a). We performed a two-dimensional embedding of the neural activity profiles of all recorded neurons via PCA (explained variance 70%) and plotted the temporal evolution in this low-dimensional space (Figure 6b). This procedure reveals that the first principal component (PC) accounts for the evoked neural response, while the second PC mostly reflects the habituation dynamics. We perform the same analysis on data generated from the model as explained above. As we see in Figure 6c, the second PC encodes habituation, as in experimental data, although the neural response in the first PC is replaced by the switching on/off dynamics of the input. This shows that our model is able to capture the main features of the observed neural habituation, without the need for biological details.

Figure 6. Habituation in zebrafish larvae.

Figure 6.

(a) Normalized neural activity profile in a zebrafish larva in response to the repeated presentation of visual (looming) stimulation, and comparison with the fraction of active neurons n_act = N_act/N in our model with stochastic neural activation (see Methods). Stimuli are indicated with colored dots from blue to red as time increases. (b) PCA of experimental data reveals that habituation is captured mostly by the second principal component, while features of the evoked neural response are captured by the first one. Different colors indicate responses to different stimuli. (c) PCA of simulated neural activations. Although we cannot capture the dynamics of the evoked neural response with a switching input, the core features of habituation are correctly captured along the second principal component. Model parameters are β = 4.5, σ = 0.15 in energy units, and as in the Materials and methods, so that the system is tuned to the onset of habituation.

Discussion

In this work, we studied a minimal architecture that serves as a microscopic and archetypal description of sensing processes across biological scales. Informed by theoretical and experimental observations, we focused on three fundamental mechanisms: a receptor, a readout population, and a storage mechanism that drives negative feedback. Despite its simplicity, we have shown that our model robustly reproduces the hallmarks associated with habituation in the presence of a single type of repeated stimulation, a widespread phenomenon in both biochemical and neural systems. By quantifying the mutual information between the external signal and readout population, we identified a regime of optimal information gain during habituation. Remarkably, the system can spontaneously tune to this region of parameters if it enforces an information-dissipation trade-off. In particular, optimal systems lie at the onset of habituation, characterized by intermediate levels of activity reduction, as both too-strong and too-weak negative feedback are detrimental to information gain. Finally, we found that, by allowing for a storage inhibition strength that can adapt to the environmental signal, this optimality is input-independent and requires no further adjustment of other internal model parameters. Our results suggest that the functional advantages of the onset of habituation are rooted in the interplay between energy dissipation and information gain, and its general features are tightly linked to the internal mechanisms to store information.

Although minimal, our model can capture basic features of neural habituation, where it is generally accepted that inhibitory feedback mechanisms modulate the stimulus weight (Lamiré et al., 2022). Remarkably, recent works reported the existence of a separate inhibitory neuronal population whose activity increases during habituation (Fotowat and Engert, 2023). Our model suggests that this population might play the role of a storage mechanism, allowing the system to habituate to repeated signals. However, in neural systems, a prominent role in encoding both short- and long-term information is also played by synaptic plasticity (Abbott and Nelson, 2000; Martin et al., 2000) as well as by memory molecules (Coultrap and Bayer, 2012; Frankland and Josselyn, 2016; Lisman et al., 2002), at a biochemical level. A comprehensive analysis of how information is encoded and retrieved will most likely require all these mechanisms at once. Including an explicit connectivity structure with synaptic updates in our model may help in this direction, at the price of analytical tractability. Furthermore, future works may be able to compare our theoretical predictions with experiments in which the modulation of frequency (Fotowat and Engert, 2023) and intensity of stimulation trigger the observed hallmarks. In this way, we could elucidate the roles and features of internal processes characterizing the system under investigation, along with its information-theoretic performance. Overall, the present results hint at the fact that our minimal architecture may provide crucial insights into the functional advantages of habituation in a wide range of biological systems.

Extensions of these ideas are manifold. The definition of a habituated system relies, in this work as well as in other studies (Eckert et al., 2024), on the definition of a response threshold. However, some of the hallmarks might disappear when habituation is defined as a phenomenon appearing in a time-periodic steady state. To overcome this issue, it may be necessary to extend the model to more realistic molecular schemes encompassing the presence of additional storage mechanisms. More generally, understanding the information-theoretic performance of real-world biochemical networks exhibiting habituation remains a fascinating perspective to explore. Upon these premises, the possibility of inferring the underlying biochemical structure from observed behaviors is an intriguing direction (Rahi et al., 2017). Furthermore, since we focused on repetitions of statistically identical signals, it will be fundamental to characterize the system’s response to diverse environments (Hidalgo et al., 2014). To this end, incorporating multiple receptors or storage populations may be needed to harvest information in complex conditions. In such scenarios, correlations between external signals may help reduce the encoding effort as, intuitively, S is acting as an information reservoir for the system. Moreover, such stored information could be used to make predictions on future stimuli and behavior (Bueti et al., 2010; Sederberg et al., 2018; Palmer et al., 2015). Indeed, living systems do not passively read external signals but often act upon the environment. We believe that both storage mechanisms and their associated negative feedback will remain core modeling ingredients.

Our work paves the way to understanding how information is encoded and guides learning, predictions, and decision-making, a paramount question in many fields. On the one hand, it encapsulates key ingredients to support habituation while still being minimal enough to allow for analytical treatment. On the other hand, it may help the experimental quest for signatures of these physical ingredients in a variety of systems. Ultimately, our results show how habituation – a ubiquitous phenomenon taking place at strikingly different biological scales – may stem from an information-based advantage, shedding light on the optimization principle underlying its emergence and relevance for any biological system.

Materials and methods

Model parameters

In this section, we briefly recall the free parameters of the model and the values we use in numerical simulations, unless otherwise specified. In particular, the energetic barrier (V − cr) fixes the average values of the readout population in both the passive and active state, namely ⟨U⟩_P = e^{−βV} and ⟨U⟩_A = e^{−β(V−c)} (see Equation 3). Thus, we can fix ⟨U⟩_P and ⟨U⟩_A in lieu of V and c. Similarly, as in Equation 2, we can set the inhibiting storage fraction α to fix κ. At any rate, we remark that the emerging features of the model are qualitatively independent of the specific choice of these parameters. Furthermore, we typically consider the average of the exponentially distributed signal to be H_max = 10 and H_min = 0.1 (see Supplementary Information for details). Overall, we are left with β and σ as free parameters. β quantifies the amount of thermal noise in the system, and at small β the thermal activation of the receptor hinders the effect of the signal and makes the system almost unable to process information. Conversely, if β is high, the system must overcome large thermal inertia, increasing the dissipative cost. In this regime of weak thermal noise, we expect that, given a sufficient amount of energy, the system can effectively process information. In Table 1, we summarize the specific parameter values we used throughout the main text. Other values to explore the robustness of the model are discussed in the Supplementary Information.

Table 1. Summary of the model parameters and the values used for numerical simulations, unless otherwise specified.

The parameters β and σ qualitatively determine the behavior of the model and are varied throughout the main text.

Parameter	Description	Value
M_S	Maximum number of storage units	30
ΔE	Receptor energetic barrier	1
⟨U⟩_P	Average readout with passive receptor	150
⟨U⟩_A	Average readout with active receptor	M_S
Γ_S^0	Inverse timescale of the storage	1
g	Receptor’s pathways timescale ratio	1
α	Inhibiting storage fraction	2/3
H_ref	Reference signal	10
β	Inverse temperature	–
σ	Storage energy cost	–
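As an illustration of how the derived quantities follow from these choices, one can invert the relations above. The sketch below uses illustrative target means (with ⟨U⟩_A > ⟨U⟩_P, since an active receptor boosts the readout) rather than the table values.

```python
import numpy as np

beta = 3.0                      # free parameter (illustrative)
U_P, U_A = 1.0, 150.0           # hypothetical target mean readouts, passive/active

# Invert <U>_P = exp(-beta V) and <U>_A = exp(-beta (V - c))  (Equation 3)
V = -np.log(U_P) / beta
c = np.log(U_A / U_P) / beta

# Inhibition strength from the balance condition (Equation 2)
H_ref, alpha, sigma = 10.0, 2/3, 0.6
kappa = H_ref / (alpha * sigma)
```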

Timescale separation

We solve our system in a timescale separation framework (Busiello et al., 2020; Bo and Celani, 2017; Nicoletti and Busiello, 2024a), where the storage evolves on a timescale that is much slower than all the other internal ones, that is

τ_U ≪ τ_R ≪ τ_S ∼ τ_H.

The fact that τ_S is the slowest internal timescale at play is crucial to making the storage act as an information reservoir. This assumption is also compatible with biological examples. The main difficulty arises from the presence of the feedback: the signal influences the receptor and thus the readout population, which in turn impacts the storage population and finally changes the deactivation rate of the receptor; schematically, H → R → U → S → R, but the causal order does not reflect the temporal one.

We start with the master equation for the propagator P(u,r,s,h,t|u0,r0,s0,h0,t0),

$$\partial_t P = \left[\frac{\hat W_U(r)}{\tau_U} + \frac{\hat W_R(s,h)}{\tau_R} + \frac{\hat W_S(u)}{\tau_S} + \frac{\hat W_H}{\tau_H}\right] P.$$

We rescale time by τ_S and introduce two small parameters to control the timescale separation analysis, ε = τ_U/τ_R and δ = τ_R/τ_H. Since τ_S/τ_H = O(1), we set it to 1 without loss of generality. We then write P = P^(0) + εP^(1) and expand the master equation to find P^(0) = p_{U|R}^st(u|r) Π, with Ŵ_U p_{U|R}^st = 0. We obtain that Π obeys the following equation:

$$\partial_t \Pi = \left[\delta^{-1}\,\hat W_R(s,h) + \hat W_S(u) + \hat W_H\right]\Pi.$$

Yet again, Π = Π^(0) + δΠ^(1) allows us to write Π^(0) = p_{R|S,H}^st(r|s,h) F(s,h,t|s_0,h_0,t_0) at order O(δ^{-1}), where Ŵ_R p_{R|S,H}^st = 0. Expanding first in ε and then in δ sets a hierarchy among timescales. Crucially, due to the feedback present in the system, we cannot solve the next order explicitly to find F. Indeed, after a marginalization over r, we find ∂_t F = [Ŵ_H + Ŵ_S(ū(s,h))] F at order O(1), where ū(s,h) = Σ_{u,r} u p_{U|R}^st(u|r) p_{R|S,H}^st(r|s,h). Hence, the evolution operator for F depends manifestly on s, and the equation cannot be self-consistently solved. To tackle the problem, we first discretize time, considering a small interval, that is, t = t_0 + Δt with Δt ∼ τ_U, and thus ū(s,h) ≈ u_0. We thus find F(s,h,t|s_0,h_0,t_0) = P(s,t|s_0,t_0) P_H(h,t|h_0,t_0) in the domain t ∈ [t_0, t_0 + Δt], since H evolves independently from the system (see also Supplementary Information for analytical steps).

Iterating the procedure for multiple time steps, we end up with a recursive equation for the joint probability pU,R,S,H(u,r,s,h,t0+Δt). We are interested in the following marginalization

$$p_{U,S}(u,s,t+\Delta t) = \left[\sum_{r=0}^{1}\int_0^\infty dh\; p_{U|R}^{\mathrm{st}}(u|r)\, p_{R|S,H}^{\mathrm{st}}(r|s,h)\, p_H(h,t+\Delta t)\right] \sum_{s'=0}^{N_S}\sum_{u'=0}^{\infty} P(s',t \to s,t+\Delta t|u')\, p_{U,S}(u',s',t)$$

where P(s′,t → s,t+Δt|u′) is the propagator of the storage at fixed readout. This is the Chapman-Kolmogorov equation in the timescale separation approximation. Notice that this solution requires the knowledge of p_{U,S} at the previous time step, and it has to be solved iteratively.
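Schematically, each iteration of this Chapman-Kolmogorov recursion first propagates the storage at fixed readout and then dresses the result with the fast conditional steady states. A minimal Python sketch of one such step follows; the array names and shapes are ours, and the conditional distributions and the short-time propagator are assumed to be precomputed elsewhere.

```python
import numpy as np

def step(p_us, p_h, dh, pU_given_R, pR_given_SH, prop_S):
    """One timescale-separated update p_{U,S}(t) -> p_{U,S}(t + dt).

    p_us[u, s]           : joint readout-storage distribution at time t
    p_h[k]               : signal density p_H(h_k, t + dt) on a grid of spacing dh
    pU_given_R[u, r]     : stationary readout distribution given the receptor state
    pR_given_SH[r, s, k] : stationary receptor distribution given storage and signal
    prop_S[s0, s, u]     : storage propagator P(s0, t -> s, t + dt | u)
    """
    n_u, n_s = p_us.shape
    # 1) propagate the storage at fixed readout, summing over previous (u', s')
    p_s = np.zeros(n_s)
    for u in range(n_u):
        for s0 in range(n_s):
            p_s += prop_S[s0, :, u] * p_us[u, s0]
    # 2) dress the slow marginal with the fast conditional steady states
    p_new = np.zeros_like(p_us)
    for s in range(n_s):
        for r in (0, 1):
            p_r = (pR_given_SH[r, s, :] * p_h).sum() * dh  # integral over the signal
            p_new[:, s] += pU_given_R[:, r] * p_r * p_s[s]
    return p_new
```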

Explicit solution for the storage propagator

To find a numerical solution to our system, we first need to compute the propagator P(s_0,t_0 → s,t). Formally, we have to solve the master equation

$$\partial_t P(s_0 \to s|u_0) = \Gamma_S^0\left[e^{-\beta\sigma}\, u_0\, P(s_0 \to s-1) + (s+1)\, P(s_0 \to s+1) - \left(s + e^{-\beta\sigma}\, u_0\right) P(s_0 \to s)\right]$$

where we used the shorthand notation P(s_0 → s) = P(s_0,t_0 → s,t). Since our formula has to be iterated for small timesteps, that is, t − t_0 = Δt ≪ 1, we can write the propagator as follows

$$P(s_0,t_0 \to s,t_0+\Delta t|u_0) = p_{S|U}^{\mathrm{st}} + \sum_\nu w_\nu\, a^{(\nu)}\, e^{\lambda_\nu \Delta t}$$

where w_ν and λ_ν are, respectively, the eigenvectors and eigenvalues of the transition matrix Ŵ_S(u_0),

$$(\hat W_S(u_0))_{ij} = \begin{cases} e^{-\beta\sigma}\, u_0 & \text{if } i = j+1 \\ j & \text{if } i = j-1 \\ -\left(j + e^{-\beta\sigma}\, u_0\right) & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$

and the coefficients a(ν) are such that

$$P(s_0,t_0 \to s,t_0|u_0) = p_{S|U}^{\mathrm{st}} + \sum_\nu w_\nu\, a^{(\nu)} = \delta_{s,s_0}.$$

Since the eigenvalues and eigenvectors of Ŵ_S(u_0) might be computationally expensive to find, we employ another simplification. As Δt → 0, we can restrict the matrix to jumps to the n-th nearest neighbors of the initial state s_0, assuming that all other states are left unchanged in small time intervals. We take n = 2 and check the accuracy of this approximation against the full simulation for a limited number of timesteps.
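As a cross-check of this truncation, for a small storage space the short-time propagator can also be obtained directly by exponentiating the generator. The sketch below does so with scipy's matrix exponential in place of the truncated eigendecomposition described above.

```python
import numpy as np
from scipy.linalg import expm

def storage_propagator(u0, N_S, beta, sigma, dt, G_S0=1.0):
    """P(s0 -> s, dt) at fixed readout u0, from the birth-death generator
    of Equation 4 with a reflecting boundary at s = N_S."""
    W = np.zeros((N_S + 1, N_S + 1))
    birth = u0 * np.exp(-beta * sigma) * G_S0
    for s in range(N_S + 1):
        if s < N_S:
            W[s + 1, s] += birth          # s -> s+1
            W[s, s] -= birth
        if s > 0:
            W[s - 1, s] += s * G_S0       # s -> s-1
            W[s, s] -= s * G_S0
    return expm(W * dt)                   # column s0 is the distribution over final s
```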

Mean-field relations

We note that U and S satisfy the following mean-field relationship:

$$\frac{\langle U \rangle - \langle U \rangle_{r=1}}{\langle U \rangle_{r=1} - \langle U \rangle_{r=0}} = f_0\!\left(\frac{\langle S \rangle}{N_S}\right), \qquad (17)$$

where f0(x) is an analytical function of its argument (see Supplementary Information). This relation clearly states that only the fraction of active storage units is relevant to determining the habituation dynamics.

Mutual information

Once we have p_U(u,t) (obtained by marginalizing p_{U,S} over s) for a given p_H(h,t), we can compute the mutual information

$$I_{U,H}(t) = \mathcal{H}[p_U](t) - \int_0^\infty dh\, p_H(h,t)\, \mathcal{H}[p_{U|H}](t)$$

where 𝓗 is the Shannon entropy. For the sake of simplicity, we consider that the external signal follows an exponential distribution p_H(h,t) = λ(t) e^{−λ(t)h}. Notice that, in order to determine such a quantity, we need the conditional probability p_{U|H}(u,t|h). In the Supplementary Information, we show how all the necessary joint and conditional probability distributions can be computed from the dynamical evolution derived above.

We also highlight here that the timescale separation implies I_{S,H} = 0, since

$$p_{S|H}(s,t|h) = \sum_u p_{U,S|H}(u,s,t|h) = p_S(s,t) \sum_{u,r} p_{U|R}^{\mathrm{st}}(u|r)\, p_{R|S,H}^{\mathrm{st}}(r|s,h) = p_S(s,t).$$

Although it may seem surprising, this is a direct consequence of the fact that S is only influenced by H through the stationary state of U. Crucially, the presence of the feedback is still fundamental in promoting habituation. Indeed, we can always write the mutual information between the signal H and both the readout U and the storage S together as I_{(U,S),H} = ΔI_f + I_{U,H}, where ΔI_f = I_{(U,S),H} − I_{U,H} = I_{(U,H),S} − I_{U,S}. Since ΔI_f > 0 (by standard information-theoretic inequalities), the storage is increasing the information that the two populations together have on the external signal. Overall, although S and H are independent in this limit, the feedback is paramount in shaping how the system responds to the external signal and stores information about it.

Pareto optimization

We perform a Pareto optimization at stationarity in the presence of a prolonged stimulation. We seek the optimal values of (β,σ) by maximizing the functional in Equation 15 of the main text. Hence, we maximize the information between the readout and the signal while simultaneously minimizing the dissipation of the receptor, induced by both the signal and the feedback process, and the dissipation associated with storage production, as discussed in the main text. The dissipative contributions have been computed per unit energy to be comparable with the mutual information. In the Supplementary Information, we detail the derivation of the Pareto front and investigate the robustness of this optimization strategy.

Recording of whole brain neuronal activity in zebrafish larvae

Acquisitions of the zebrafish brain activity were carried out in one Elavl3:H2B-GCaMP6s larva at 5 days post fertilization, raised at 28 °C on a 12 hr light/12 hr dark cycle, according to the approval by the Ethical Committee of the University of Padua (61/2020 dal Maschio). The subject was embedded in 2% agarose gel and brain activity was recorded using a multiphoton system with a custom 3D volumetric acquisition module. Data were acquired at 30 frames per second, covering an effective field of view of about 450 × 900 μm with a resolution of 512 × 1024 pixels. The volumetric module acquires a volume of about 180–200 μm in thickness, encompassing 30 planes separated by about 7 μm, at a rate of 1 volume per second, sufficient to track the slow dynamics associated with the fluorescence-based activity reporter GCaMP6s. Visual stimulation was presented in the form of a looming stimulus with 150 s intervals, centered with the fish eye (see Supplementary Information). Neuron identification and anatomical registrations were performed as described in Bruzzone et al., 2021.

Data analysis

The acquired temporal series were first processed using an automatic pipeline, including motion artifact correction, temporal filtering with a 3 s rectangular window, and automatic segmentation. The obtained dataset was manually curated to resolve segmentation errors or to integrate cells not detected automatically. We fit the activity profiles of about 55,000 cells with a linear regression model using a set of basis functions representing the expected responses to each stimulation event. These basis functions were obtained by convolving the exponentially decaying kernel of the GCaMP signal lifetime with square waveforms marking the presentation of the corresponding visual stimulus. The resulting score coefficients of the fit were used to extract the cells whose score fell within the top 5% of the distribution, resulting in a population of 2400 neurons whose temporal activity profiles correlate most with the stimulation protocol. The resulting fluorescence signals F^{(i)} were processed by removing a moving baseline to account for baseline drifting and fast oscillatory noise (Jia et al., 2011). See the Supplementary Information.
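As an illustration of this regression step, a hedged Python sketch follows; the stimulus onsets, duration, and the GCaMP kernel timescale `tau_gcamp` are placeholder values, and the use of scikit-learn reflects the library named in Appendix 4, not a verbatim reproduction of the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Regressors: one per stimulus event, a square waveform convolved with an
# exponentially decaying GCaMP kernel. Onsets, duration, and tau_gcamp are
# illustrative placeholders.
fs = 1.0                                  # volume rate, Hz
t = np.arange(0.0, 1500.0, 1.0 / fs)      # 10 stimuli, 150 s apart
onsets = np.arange(10) * 150.0
stim_dur, tau_gcamp = 8.3, 3.5            # s (tau_gcamp is an assumed value)

kernel = np.exp(-np.arange(0.0, 20.0, 1.0 / fs) / tau_gcamp)
X = np.stack(
    [np.convolve(((t >= t0) & (t < t0 + stim_dur)).astype(float), kernel)[: len(t)]
     for t0 in onsets],
    axis=1,
)  # design matrix, one column per stimulation event

def score_cell(y):
    """Fit one cell's activity trace; scores are coefficients over the fit MSE."""
    reg = LinearRegression().fit(X, y)
    mse = np.mean((reg.predict(X) - y) ** 2)
    return reg.coef_ / max(mse, 1e-12)
```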

Model for neural activity

Here, we describe how our framework is modified to mimic neural activity. Each readout unit, u, is interpreted as a population of N neurons, i.e., a region dedicated to the sensing of a specific input. When a readout population is activated at time t, each of its N neurons fires with probability p. We set N = 20 to match the number of observed neurons in data and simulations, while p = 0.5 only controls the dispersal of the points in Figure 6c and thus does not alter the main message. The dynamics of each readout unit follows our dynamical model. Due to habituation, some of the readout units activated by the first stimulus will not be activated by subsequent stimuli. Although the evoked neural response cannot be fully captured by this extremely simple model, its archetypal ingredients (dissipation, storage, and feedback) are informative enough to reproduce the low-dimensional habituation dynamics found in the experimental data.
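A minimal sketch of this readout-to-neurons mapping, with hypothetical counts of active units chosen only to illustrate the habituation-induced decrease:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 20, 0.5  # neurons per readout unit and firing probability, as in the text

def sample_spikes(n_active):
    """Binary spike pattern of N neurons for each of n_active readout units."""
    return rng.random((n_active, N)) < p

# Hypothetical habituation sequence: fewer units respond to later stimuli.
for k, n_active in enumerate([120, 90, 60, 45]):
    spikes = sample_spikes(n_active)
    print(f"stimulus {k + 1}: {spikes.sum()} spikes from {n_active} active units")
```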

Acknowledgements

GN, SS, and DMB acknowledge Amos Maritan for fruitful discussions. DMB thanks Paolo De Los Rios for insightful comments. GN and DMB acknowledge the Max Planck Institute for the Physics of Complex Systems for hosting GN during several stages of this work. SS acknowledges #NEXTGENERATIONEU (NGEU) and funding by the Ministry of Universities and Research (MUR), National Recovery and Resilience Plan (NRRP), project MNESYS (PE0000006) – A Multiscale integrated approach to the study of the nervous system in health and disease (DN 1553 11.10.2022). GN acknowledges funding provided by the Swiss National Science Foundation through its Grant CRSII5_186422. DMB is funded by the STARS@UNIPD grant with the project “ActiveInfo”.

Appendix 1

Detailed solution of the master equation

Consider the transition rates introduced in the main text:

\Gamma^{(H)}_{P\to A} = e^{\beta(h-\Delta E)}\Gamma_H^0, \quad \Gamma^{(H)}_{A\to P} = \Gamma_H^0, \quad \Gamma^{(I)}_{P\to A} = e^{-\beta\Delta E}\Gamma_I^0, \quad \Gamma^{(I)}_{A\to P} = \Gamma_I^0 \, e^{\beta\kappa\sigma s/N_S},
\Gamma_{u\to u+1} = e^{\beta(V-cr)}\Gamma_U^0, \quad \Gamma_{u+1\to u} = (u+1)\,\Gamma_U^0, \quad \Gamma_{s\to s+1} = e^{-\beta\sigma} u \,\Gamma_S^0, \quad \Gamma_{s+1\to s} = (s+1)\,\Gamma_S^0.

We set a reflective boundary for the storage at s = N_S, corresponding to the maximum number of storage molecules in the system. Moreover, for the sake of simplicity, we take Γ_I^0 = Γ_H^0 ≡ Γ_R^0. Retracing the steps of the Materials and methods, the master equation governing the evolution of the propagator of all variables, P(u,r,s,h,t|u_0,r_0,s_0,h_0,t_0), is:

\partial_t P = \left[\frac{\hat{W}_U(r)}{\tau_U} + \frac{\hat{W}_R(s,h)}{\tau_R} + \frac{\hat{W}_S(u)}{\tau_S} + \frac{\hat{W}_H}{\tau_H}\right] P. (A1-1)

We solve this equation by employing a timescale separation, i.e., τ_U ≪ τ_R ≪ τ_S ∼ τ_H, where τ_X = 1/Γ_X^0 for X = U, R, S and τ_H is the typical timescale of the signal dynamics. Motivated by several biological examples, we assume that the readout population undergoes the fastest dynamics, while the storage and signal evolutions are the slowest. Defining ϵ = τ_U/τ_R and δ = τ_R/τ_H, and setting τ_S/τ_H = 1 without loss of generality, we have:

\partial_t P = \left[\epsilon^{-1}\delta^{-1}\hat{W}_U(r) + \delta^{-1}\hat{W}_R(s,h) + \hat{W}_S(u) + \frac{\tau_S}{\tau_H}\hat{W}_H\right] P. (A1-2)

We propose a solution of the form P = P^{(0)} + ϵP^{(1)}. Inserting this expression into the equation above and solving order by order in ϵ, at order ϵ^{-1} we have:

P^{(0)} = p^{st}_{U|R}(u|r)\,\Pi(r,s,h,t|r_0,s_0,h_0,t_0) (A1-3)

where p^{st}_{U|R} solves the master equation for the readout evolution at fixed r:

0 = (u+1)\, p^{st}_{U|R}(u+1|r) + \alpha(r)\, p^{st}_{U|R}(u-1|r) - \left[u + \alpha(r)\right] p^{st}_{U|R}(u|r) (A1-4)

with α(r) = e^{β(V-cr)}. Hence,

p^{st}_{U|R}(u|r) = e^{-\alpha(r)}\,\frac{\alpha(r)^u}{u!}. (A1-5)
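As a quick consistency check, the following sketch (our own, with arbitrary parameter values) verifies numerically that the Poisson distribution of Equation A1-5 annihilates the birth-and-death balance of Equation A1-4:

```python
import numpy as np
from scipy.stats import poisson

beta, V, c, r = 1.0, 1.0, 2.0, 1
alpha = np.exp(beta * (V - c * r))  # birth rate alpha(r); the death rate is u

u = np.arange(60)
p = poisson.pmf(u, alpha)  # Equation A1-5

# residual of Equation A1-4: 0 = (u+1) p(u+1) + alpha p(u-1) - (u + alpha) p(u)
residual = (u[1:-1] + 1) * p[2:] + alpha * p[:-2] - (u[1:-1] + alpha) * p[1:-1]
print(np.abs(residual).max())  # zero to machine precision
```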

At order ϵ^0, we find the equation for Π, also reported in the Materials and methods:

\partial_t \Pi(r,s,h,t|r_0,s_0,h_0,t_0) = \left[\delta^{-1}\hat{W}_R(s,h) + \hat{W}_S(u) + \hat{W}_H\right]\Pi(r,s,h,t|r_0,s_0,h_0,t_0). (A1-6)

To solve this equation, we propose a solution of the form Π = Π^{(0)} + δΠ^{(1)}. Hence, again, at order δ^{-1}, we have that Π^{(0)} = p^{st}_{R|S,H}(r|s,h)\, F(s,h,t|s_0,h_0,t_0), where p^{st}_{R|S,H} satisfies the steady-state equation for the fastest remaining degree of freedom, with all the others fixed. In this case, it is just the solution of the rate equation for the receptor:

p^{st}_{R|S,H}(r=1|s,h) = \frac{\Gamma^{eff}_{P\to A}}{\Gamma^{eff}_{P\to A} + \Gamma^{eff}_{A\to P}}, \qquad p^{st}_{R|S,H}(r=0|s,h) = 1 - p^{st}_{R|S,H}(r=1|s,h) (A1-7)

where Γ^{eff}_{P→A} = Γ^{(I)}_{P→A} + Γ^{(H)}_{P→A}, and similarly for the reverse reaction. At order δ^0, we have an equation for F:

\partial_t F(s,h,t|s_0,h_0,t_0) = \sum_{r,u} p^{st}_{U|R}(u|r)\left[\hat{W}_S(u) + \hat{W}_H\right]\left[p^{st}_{R|S,H}(r|s,h)\, F(s,h,t|s_0,h_0,t_0)\right] (A1-8)

As already explained in the Materials and methods, due to the feedback, this equation cannot be solved explicitly. Indeed, the operator governing the evolution of F is:

\hat{W}_{eff} = \hat{W}_H + \sum_u p^{st}_{U|S,H}(u|s,h)\,\hat{W}_S(u) = \hat{W}_H + \hat{W}_S\Big(\sum_u u\, p^{st}_{U|S,H}(u|s,h)\Big) = \hat{W}_H + \hat{W}_S(\bar{u}(s,h)) (A1-9)

with p^{st}_{U|S,H}(u|s,h) = \sum_r p^{st}_{U|R}(u|r)\, p^{st}_{R|S,H}(r|s,h), where we used the linearity of \hat{W}_S(u) in u. In order to solve this equation, we shall assume that \bar{u}(s,h) = u_0, bearing in mind that this approximation holds only if t is small enough, that is, t = t_0 + Δt with Δt ∼ τ_U. Therefore, for a small interval, we have:

\partial_t F(s,h,t_0+\Delta t|s_0,h_0,t_0) = \left[\hat{W}_S(u_0) + \hat{W}_H\right] F(s,h,t|s_0,h_0,t_0) (A1-10)

Overall, we end up with the following joint probability of the model at time t_0 + Δt:

p_{U,R,S,H}(u,r,s,h,t_0+\Delta t) = \sum_{u_0,s_0} p^{st}_{U|R}(u|r)\, p^{st}_{R|S,H}(r|s,h)\, P(s,t_0+\Delta t|s_0,u_0,t_0) \int dh_0\, P_H(h,t_0+\Delta t|h_0,t_0)\, p_{U,S,H}(u_0,s_0,h_0,t_0) = \sum_{u_0,s_0} p^{st}_{U|R}(u|r)\, p^{st}_{R|S,H}(r|s,h)\, P(s,t_0+\Delta t|s_0,u_0,t_0)\, p_{U,S}(u_0,s_0,t_0)\, p_H(h,t_0+\Delta t) (A1-11)

where \int dh_0\, P_H(h,t_0+\Delta t|h_0,t_0)\, p_{U,S,H}(u_0,s_0,h_0,t_0) = p_{U,S}(u_0,s_0,t_0)\, p_H(h,t_0+\Delta t), since H at time t_0+Δt is independent of S and U. When propagating the evolution through intervals of duration Δt, we also assume that H evolves independently, since it is an external variable, while still affecting the evolution of the other degrees of freedom. This structure is reflected in the equation above. For simplicity, we prescribe p_H(h,t) to be an exponential distribution, p_H(h,t) = λ(t)e^{-λ(t)h}, and solve Equation A1-11 iteratively from t_0 to a given T in steps of duration Δt, as indicated above. This iterative solution arises from the timescale separation because of the cyclic feedback structure: {S,H} → R → U → S. This solution corresponds explicitly to

p_{U,S}(u,s,t+\Delta t) = \sum_{r=0}^{1}\int_0^\infty dh\, p^{st}_{U|R}(u|r)\, p^{st}_{R|S,H}(r|h,s)\, p_H(h,t+\Delta t) \sum_{s'=0}^{N_S}\sum_{u'=0}^{\infty} P(s,t+\Delta t|s',t;u')\, p_{U,S}(u',s',t) (A1-12)

where P(s,t+Δt|s',t;u') is the propagator of the storage at fixed readout. This is the Chapman-Kolmogorov equation in the timescale-separation approximation. Notice that this solution requires the knowledge of p_{U,S} at the previous time step, so it has to be iterated. Both p_U and p_S can be obtained by an immediate marginalization.

As detailed in the Materials and methods, the propagator P(s,t|s_0,t_0), when restricted to small time intervals, can be obtained by solving the birth-and-death process for storage molecules at fixed readout, limiting the state space to the n nearest neighbors (we checked that our results are robust upon increasing n for the selected simulation time step).
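To make the iteration concrete, below is a schematic Python implementation of one step of Equation A1-12 under our reconstruction of the transition rates; the parameter values, the readout truncation N_U, and the use of a full matrix exponential in place of the n-nearest-neighbor truncation are all simplifying assumptions of this sketch, not the authors' code.

```python
import numpy as np
from scipy.linalg import expm
from scipy.stats import poisson

beta, sigma, V, c, dE, kappa = 1.0, 0.5, 1.0, 2.0, 1.0, 1.0  # illustrative
NS, NU, dt = 30, 60, 0.1
h_grid = np.linspace(1e-3, 50.0, 400)  # discretized signal values

def p_R_active(h, s):
    """Stationary receptor activation at fixed (s, h), Equation A1-7."""
    act = np.exp(beta * (h - dE)) + np.exp(-beta * dE)
    deact = 1.0 + np.exp(beta * kappa * sigma * s / NS)
    return act / (act + deact)

def storage_propagator(u):
    """Propagator over dt of the storage birth-and-death process at fixed u."""
    W = np.zeros((NS + 1, NS + 1))
    for s in range(NS):
        W[s + 1, s] = np.exp(-beta * sigma) * u  # production, linear in u
        W[s, s + 1] = s + 1                      # degradation
    W -= np.diag(W.sum(axis=0))                  # columns sum to zero
    return expm(W * dt)

def step(p_us, lam):
    """One update of p_{U,S}(u, s) under an exponential signal with rate lam."""
    p_h = lam * np.exp(-lam * h_grid)
    p_h /= p_h.sum()
    u_vals = np.arange(NU)
    # slow part: propagate the storage at fixed u', then marginalize (u', s')
    weight = np.zeros(NS + 1)
    for u in range(NU):
        weight += storage_propagator(u) @ p_us[u]
    # fast part: readout slaved to the receptor, receptor slaved to (s, h)
    p_new = np.zeros_like(p_us)
    for s in range(NS + 1):
        pa = (p_R_active(h_grid, s) * p_h).sum()
        for r, w in ((1, pa), (0, 1.0 - pa)):
            alpha = np.exp(beta * (V - c * r))
            p_new[:, s] += w * poisson.pmf(u_vals, alpha) * weight[s]
    return p_new / p_new.sum()
```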

Information-theoretic quantities

By direct marginalization of Equation A1-12, we obtain the evolution of p_U(u,t) and p_S(s,t) for a given p_H(h,t). Hence, we can compute the mutual information as follows:

I_{U,H}(t) = H[p_U](t) - \int_0^{\infty} dh\, p_H(h,t)\, H[p_{U|H}](t) = -\frac{\Delta S_U}{k_B} (A1-13)

where H[p_X] is the Shannon entropy of X, and ΔS_U is the reduction in the entropy of U due to repeated measurements (see main text). Notice that, in order to determine this quantity, we need the conditional probability p_{U|H}(u,t|h). This distribution represents the probability that, at a given time, the system attains a value u in the presence of a given signal h. In order to compute it, we can write

p_{U|H}(u,t+\Delta t|h) = \sum_{s=0}^{N_S}\sum_{r=0}^{1} p^{st}_{U|R}(u|r)\, p^{st}_{R|S,H}(r|h,s)\, p_S(s,t+\Delta t) (A1-14)

by definition. The only dependence on h enters p^{st}_{R|S,H} through the e^{βh} factor in the rates.

Analogously, all the other mutual information terms can be obtained. As we showed in the Materials and methods, although I_{S,H} = 0 due to the timescale separation, the presence of the feedback is still fundamental to effectively process information about the signal. This effect can be quantified through the feedback information ΔI_f = I_{(U,S),H} - I_{U,H} > 0, as it captures how much the knowledge of S and U together helps to encode information about the signal with respect to U alone. In terms of system entropy, we equivalently have:

k_B \Delta I_f = -\Delta S_{U,S} + \Delta S_U > 0 (A1-15)

which highlights how much the feedback through S reduces the entropy of the system due to repeated measurements. In practice, in order to evaluate I_{(U,S),H}, we exploit the following equality:

I_{(U,S),H} = H[p_{U,S}](t) - \int_0^{\infty} dh\, p_H(h,t)\, H[p_{U,S|H}](t) (A1-16)

for which we need p_{U,S|H}. It can be obtained by noting that

p_{U,S}(u,s,t) = p_{U|S}(u,t|s)\, p_S(s,t) = \int dh \sum_r p^{st}_{U|R}(u|r)\, p^{st}_{R|S,H}(r|s,h)\, p_S(s,t)\, p_H(h,t) (A1-17)

from which we immediately see that

p_{U,S|H}(u,s,t|h) = \sum_{r=0}^{1} p^{st}_{U|R}(u|r)\, p^{st}_{R|S,H}(r|h,s)\, p_S(s,t) (A1-18)

which can be easily computed at any given time t.
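Numerically, once p_{U|H} and p_H are tabulated on a grid of signal values, the mutual information of Equation A1-13 reduces to sums; below is a sketch with a toy, hypothetical conditional distribution (the h-dependent readout rate is assumed, for illustration only):

```python
import numpy as np
from scipy.stats import poisson

def shannon(p, axis=None):
    p = np.clip(p, 1e-300, None)
    return -(p * np.log(p)).sum(axis=axis)  # in nats

def mutual_information(p_u_given_h, p_h):
    """I_{U,H} from a (NU, NH) conditional table and a signal distribution."""
    p_u = p_u_given_h @ p_h                              # marginal over h
    return shannon(p_u) - (p_h * shannon(p_u_given_h, axis=0)).sum()

# toy conditional distribution: Poisson readout with an h-dependent rate,
# under an exponential signal distribution discretized on a grid
h = np.linspace(0.01, 20.0, 200)
lam = 0.1
p_h = lam * np.exp(-lam * h)
p_h /= p_h.sum()
u = np.arange(50)[:, None]
alpha = 1.0 + 5.0 * h / (1.0 + h)        # assumed h-dependent readout rate
p_u_given_h = poisson.pmf(u, alpha[None, :])
print(mutual_information(p_u_given_h, p_h))
```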

Mean-field relation between average readout and storage

Fixing all model parameters, the average values of the storage, ⟨S⟩, and the readout, ⟨U⟩, are numerically determined by iteratively solving the system, as shown above. However, an analytical relation between these two quantities can be found starting from the definition of ⟨U⟩:

\langle U\rangle = \sum_{u,s} u\, p^{st}_{U,S}(u,s) = \sum_{u,s} u\, p^{st}_{U|S}(u|s)\, p^{st}_S(s) = \sum_{u,s,r} u\, p^{st}_{U|R}(u|r)\, p^{st}_{R|S}(r|s)\, p^{st}_S(s) (A1-19)

Then, inserting the expression for the stationary probability that we know analytically:

\langle U\rangle = \sum_s \Big(\sum_u u\, p^{st}_{U|R}(u|r=0)\, p^{st}_{R|S}(r=0|s) + \sum_u u\, p^{st}_{U|R}(u|r=1)\, p^{st}_{R|S}(r=1|s)\Big)\, p^{st}_S(s) (A1-20)

where p^{st}_{R|S} = \int dh\, p^{st}_{R|H,S}\, p_H \equiv f_R(\rho_S) has a complicated expression involving the hypergeometric function {}_2F_1 in terms of the model parameters and only the fraction of active storage, ρ_S = s/N_S (the explicit derivation of this formula is not shown here). Then, we have:

\langle U\rangle = \sum_s \left(e^{\beta V} f_0(\rho_S) + e^{\beta(V-c)}\left(1 - f_0(\rho_S)\right)\right) p^{st}_S(s) (A1-21)

Since we do not have an analytical expression for p^{st}_S(s), we employ the mean-field approximation, reducing all correlation functions to products of averages:

\langle U\rangle = e^{\beta(V-c)} + e^{\beta V} f_0(\bar\rho_S)\left(1 - e^{-\beta c}\right) (A1-22)

where ρ̄_S = ⟨S⟩/N_S. This clearly shows that, given a set of model parameters, ⟨U⟩ and the average fraction of storage molecules, ρ̄_S, are related. In particular, introducing the change of parameters presented in the Materials and methods, we have the following collapse:

\frac{\langle U\rangle - \langle U\rangle_A}{\langle U\rangle_P - \langle U\rangle_A} = f_0(\bar\rho_S) (A1-23)

where ⟨U⟩_A and ⟨U⟩_P are, respectively, the average of U at fixed r = 1 (active receptor) and r = 0 (passive receptor). It is also possible to perform an expansion of f_0, which numerically turns out to be very precise:

\frac{\langle U\rangle - \langle U\rangle_A}{\langle U\rangle_P - \langle U\rangle_A} = a_{-1}(\lambda_H,\beta,g)\, z + a_0(\lambda_H,\beta) + a_1(\lambda_H,\beta,g)\, z^{1-\lambda_H/\beta} + a_2(\lambda_H,\beta)\, z^{-\lambda_H/\beta} (A1-24)

where z = e^{\beta\Delta E}\left(1 + g\, e^{\beta\bar\rho_S/\alpha}/\lambda_H\right). Since all these relations depend only on the average fraction of storage molecules, it is natural to ask what happens when N_S → N_S' = nN_S. Fixing all the remaining parameters, both ⟨U⟩ and ρ̄_S will change, while still satisfying the mutual relation presented above. Let us consider, for N_S', the stationary solution that has the same fraction of S, i.e., (ρ̄_S)_{N_S'} = (ρ̄_S)_{N_S}. As a consequence of the scaling relation, ⟨U⟩_{N_S'} ≠ ⟨U⟩_{N_S}. Considering ⟨U⟩_P ≈ 0 in both settings, we can ask what is the factor γ such that γ(⟨U⟩_A)_{N_S'} = (⟨U⟩_A)_{N_S}. Since u enters only linearly in the dynamics of the storage, and the mutual relation depends only on the fraction of active S, we guess that γ = 1/n, as numerically observed. As stated in the main text, we can finally conclude that the storage fraction is the most relevant quantity in our model to determine the effect of the feedback and characterize the dynamical evolution. This observation makes our conclusions more robust, as they do not depend on the specific choice of the storage reservoir, since there always exists a scaling relation connecting ⟨U⟩ and ρ̄_S. As such, changing the values of the model parameters we fixed will only affect the number of active molecules without modifying the main results presented in this work.

Appendix 1—table 1. Summary of the model parameters and the values used for numerical simulations, unless otherwise specified.

The parameters β and σ qualitatively determine the behavior of the model and are varied.

Model parameter  Description  Typical value
N_S  Maximum number of storage units  30
ΔE  Receptor energetic barrier  1
⟨U⟩_P  Average readout with passive receptor  150
⟨U⟩_A  Average readout with active receptor  0.5
Γ_S^0  Inverse timescale of the storage  1
g  Receptor's pathways timescale ratio  1
α  Inhibiting storage fraction  2/3
H_ref  Reference signal  10
H_max  Average signal strength  10
H_min  Average background strength  0.1
ΔT  Duration of the pause between two signals  100
T_on  Duration of a signal  100
Δt  Timestep used in simulations  5.1
β  Inverse temperature  -
σ  Storage energy cost  -

Appendix 2

Model features and robustness of optimality

In this section, we show how different choices of model parameters and the external signal features impact the results presented in the main text. In Appendix 1—table 1 we summarize for convenience the parameters of the model. We recall that, for analytical ease, we take the environment to be an exponentially distributed signal,

p_H(h,t) = \lambda(t)\, e^{-h\lambda(t)} (A2-1)

where λ is its inverse characteristic scale. In particular, we describe the case in which no signal is present by setting λ to be large, so that the typical realizations of H are too small to activate the receptors. On the other hand, when λ is small, the values of h appearing in the rates of the model are large enough to activate the receptor and thus allow the system to sense the signal. In the dynamical case, we take λ(t) to be a square wave, so that ⟨H⟩ = 1/λ alternates between two values, H_max (the input signal) and H_min (the background). We denote with T_on the duration of the H_max phase, and with ΔT that of H_min, that is, the pause between two subsequent signals. In practice, this mimics an on-off dynamics, where the stochastic signal is present when its average is H_max.
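A short sketch of this stimulation protocol, using the values of Appendix 1—table 1:

```python
import numpy as np

H_max, H_min, T_on, Delta_T = 10.0, 0.1, 100.0, 100.0  # Appendix 1--table 1

def mean_signal(t):
    """<H>(t): the signal is on during the first T_on of each period."""
    return H_max if t % (T_on + Delta_T) < T_on else H_min

rng = np.random.default_rng(1)

def sample_h(t):
    """One exponentially distributed realization of H at time t."""
    return rng.exponential(mean_signal(t))
```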

Effects of the external signal strength and thermal noise level

In Appendix 2—figure 1a, we study the behavior of the model in the presence of a static exponential signal with average ⟨H⟩. We focus on the case of low σ, so that the production of storage is favored. As ⟨H⟩ decreases, I_{U,H} decreases as well. Hence, as expected, the information acquired through sensing depends on the strength of the external signal, which coincides with the energy input driving receptor activation. However, the system does not display emergent information dynamics, memory, and habituation for all parameters. In Appendix 2—figure 1b, we see that, when the temperature is low but σ is high, the system does not show habituation and ΔI_{U,H} = 0. On the other hand, when thermal noise dominates (Appendix 2—figure 1c), even when the external signal is small the system produces a large readout population due to random thermal activation. As a consequence, these random activations mask the signal-driven ones, so the system does not effectively sense the external signal even when present, and I_{U,H} is always small. It is important to remember here that, as we see in the main text, I_{U,H} is not monotonic as a function of β at fixed σ. This is due to the fact that low temperatures typically favor sensing and habituation, but they also intrinsically suppress readout production. Thus, at high β, σ needs to be small to effectively store information, since thermal noise is negligible. Vice versa, a small σ is detrimental at high temperatures, since the system then produces storage as a consequence of thermal noise. This complex interplay is captured by the Pareto optimization, which gives us an effective relation between β and σ to maximize storage while minimizing dissipation.

Stationary behavior of the model with a constant signal

In this section, we detail the behavior of the model when exposed to a static signal. As in the main text, we take

p_H(h) = \lambda_{st}\, e^{-h\lambda_{st}} (A2-2)

with ⟨H⟩ = 1/λ_{st} = H_{st}.

We first consider the case where the system does not adapt its inhibition strength κ; that is, we set

\kappa = \frac{H_{ref}}{\alpha\sigma} (A2-3)

where H_ref is the reference signal, and α the fraction of the storage population needed to inhibit the receptor on average (see Appendix 1—table 1). In Appendix 2—figure 2, we plot, as a function of β and σ, the stationary average readout population ⟨U⟩^st, the average storage population ⟨S⟩^st, the mutual information between the readout and the signal I^st_{U,H}, and the total energy consumption δQ^st_R + E^st_int, where

E^{st}_{int} = \tau_S J^{st}_{int}/\sigma = \tau_S \sum_{u,s}\left[\Gamma_{s\to s+1}\, p^{st}_{U,S}(u,s) - \Gamma_{s+1\to s}\, p^{st}_{U,S}(u,s+1)\right] (A2-4)

with τ_S = 1/Γ_S^0. As shown in the main text, however, we can achieve a collapse of the Pareto fronts at different external signals if we allow the system to tune the inhibition strength as

\kappa(\langle H\rangle) = \frac{\langle H\rangle}{\alpha\sigma} (A2-5)

so that a stronger input corresponds to a larger κ, and thus a stronger inhibition. In Appendix 2—figure 3, we show the behavior of the same stationary quantities in this case, for a large range of H_st.

Static and dynamical optimality

We now study the dynamical behavior of the model under a repeated external signal, for different values of H_max. In particular, given an observable O, we define its change under a repeated signal, ΔO, as the difference between the maximal response to the signal after several repetitions, once the system has habituated, and the maximal response to the first signal. In Appendix 2—figure 4 we plot, as a function of β and σ, the mutual information gain ΔI_{U,H}, the feedback information gain ΔΔI_f, the habituation strength Δ⟨U⟩, and the change in the internal energy flux ΔJ_int, when, as before, κ is fixed by a reference signal H_ref. As in the main text, we see in particular that Δ⟨U⟩ is maximal in the region where the change in the mutual information ΔI_{U,H} and the feedback information ΔΔI_f are both small, suggesting that a strong habituation fueled by a large number of storage molecules with low energy cost is ultimately detrimental for information processing. Furthermore, in this region, the change in the internal energy flux, ΔJ_int, is large. For completeness, in Appendix 2—figure 5 we plot all relevant dynamical quantities at different signal strengths H_max in the case of a κ fixed by a reference signal (Equation A2-3), whereas in Appendix 2—figure 6 we focus on an adaptive κ (Equation A2-5).

Interplay between information storage and signal duration

In the main text and so far, we have always considered the case T_on = ΔT. We now study the effect of the signal duration and the pause length on sensing (Appendix 2—figure 7). If the system only receives short signals between long pauses, the slow storage build-up does not reach a high fraction of active molecules. As a consequence, the negative feedback on the receptor is less effective and habituation is suppressed (Appendix 2—figure 7a). Therefore, the peak of ΔI_{U,H} in the (β, σ) plane takes place below the optimal curve, as σ needs to be smaller than in the static case to boost storage production during the brief periods in which the signal is present. On the other hand, in Appendix 2—figure 7b we consider the case of a long signal with short pauses. In this scenario, the slow dynamical evolution of the storage can reach a large number of molecules even at larger values of σ, thus moving the optimal dynamical region slightly above the Pareto-like curve. The case of a short signal is comparable to the durations of the looming stimulations in the experimental setting, which can be used to tune the parameters of the model to the peak of information gain.

Appendix 2—figure 1. Effects of the external signal strength and thermal noise level on sensing.


(a) At fixed σ = 0.1 and constant ⟨H⟩, the system captures less information as ⟨H⟩ decreases, and it needs to operate at high β to sense the signal. In particular, as β increases, I_{U,H} becomes larger. (b) In the dynamical case, outside the optimal curve (black dashed line), at high β and high σ, storage is not produced and no negative feedback is present. The system does not display habituation, and I_{U,H} is smaller than on the optimal curve. (c) In the opposite regime, at low β and σ, the system is dominated by thermal noise. As a consequence, the average readout ⟨U⟩ is high even when the external signal is not present (⟨H⟩ = H_min = 0.1), and the system captures only a small amount of information I_{U,H}, which is masked by thermal activation. Simulation parameters are as in Appendix 1—table 1.

Appendix 2—figure 2. Behavior of the stationary average readout population ⟨U⟩^st, the average storage population ⟨S⟩^st, the mutual information between the readout and the signal I^st_{U,H}, and the total energy consumption δQ^st_R + E^st_int, as a function of β and σ, in the presence of a static signal with average H_st.


The value of κ is fixed by a reference signal as in Equation A2-3. The dashed black line indicates the corresponding Pareto front. Simulation parameters are as in Appendix 1—table 1.

Appendix 2—figure 3. Behavior of the stationary average readout population ⟨U⟩^st, the average storage population ⟨S⟩^st, the mutual information between the readout and the signal I^st_{U,H}, and the total energy consumption δQ^st_R + E^st_int, as a function of β and σ, in the presence of a static signal with average H_st.


The value of κ is tuned so that it follows the average value of the external signal, Equation A2-5. The dashed black line indicates the corresponding Pareto front. Simulation parameters are as in Appendix 1—table 1.

Appendix 2—figure 4. Dynamical optimality under a repeated external signal.


(a) Schematic definition of how we study the dynamical evolution of relevant observables, comparing the maximal response to the first signal with that to a signal after the system has habituated. (b) Behavior of the increase in readout information, ΔI_{U,H}, in feedback information, ΔΔI_f, in the average readout population, Δ⟨U⟩, and in the internal energy flux, ΔJ_int. The value of κ is fixed by a reference signal as in Equation A2-3. The dashed black line indicates the corresponding Pareto front. Simulation parameters are as in Appendix 1—table 1.

Appendix 2—figure 5. Behavior of the change in average readout population Δ⟨U⟩, the readout information gain ΔI_{U,H}, the change in internal energy flux ΔJ_int, the feedback information gain ΔΔI_f, and the final readout information after habituation I^{(hab)}_{U,H}, as a function of β and σ, in the presence of a switching signal with average ⟨H⟩_max.


The value of κ is fixed by a reference signal as in Equation A2-3. The dashed black line indicates the corresponding Pareto front. Simulation parameters are as in Appendix 1—table 1.

Appendix 2—figure 6. Behavior of the change in average readout population Δ⟨U⟩, the readout information gain ΔI_{U,H}, the change in internal energy flux ΔJ_int, the feedback information gain ΔΔI_f, and the final readout information after habituation I^{(hab)}_{U,H}, as a function of β and σ, in the presence of a switching signal with average ⟨H⟩_max.


The value of κ is tuned so that it follows the average value of the external signal as in Equation A2-5. The dashed black line indicates the corresponding Pareto front. Simulation parameters are as in Appendix 1—table 1.

Appendix 2—figure 7. Effect of the signal duration on habituation.


(a) If the system only receives the signal for a short time (T_on = 50Δt < ΔT = 200Δt), it does not have enough time to reach a high level of storage molecules. As a consequence, both Δ⟨U⟩ and ΔI_{U,H} are smaller, and thus habituation is less effective. (b) If the system receives long signals with brief pauses (T_on = 200Δt > ΔT = 50Δt), instead, the habituation mechanism promotes information storage and thus a reduction in the readout activity. The dashed black line indicates the corresponding Pareto front. Simulation parameters are as in Appendix 1—table 1.

Appendix 3

The necessity of storage

Here, we discuss in detail the necessity of a slow storage implementing the negative feedback to obtain habituation. We first investigate the possibility that the negative feedback, necessary for any kind of habituating behavior, is implemented directly through the readout population, which undergoes a fast dynamics. We analytically show that this limit leads to the absence of habituation, hinting at the necessity of a slow dynamical feedback in the system (see 'Dynamical feedback cannot be implemented by a fast readout' below). Then, we study the scenario in which U applies the feedback, bypassing the storage S, but acts as a slow variable. Solving the master equation through our iterative numerical method, we show that habituation disappears in this case as well (see 'Effective dynamical feedback requires an additional population' below). These results suggest not only that the feedback must be applied by a slow variable, but also that such a slow variable must have a role different from the readout population, in line with recent observations in neural systems (Fotowat and Engert, 2023). The model proposed in the main text is indeed minimal in this respect, besides being compatible with biological examples.

Dynamical feedback cannot be implemented by a fast readout

If the storage is directly implemented by the readout population, the transition rates get modified as follows:

\Gamma^{(H)}_{P\to A} = e^{\beta(h-\Delta E)}\Gamma_R^0, \quad \Gamma^{(H)}_{A\to P} = \Gamma_R^0, \quad \Gamma^{(C)}_{P\to A} = e^{-\beta\Delta E}\Gamma_R^0, \quad \Gamma^{(C)}_{A\to P} = e^{\beta\theta u}\Gamma_R^0, \quad \Gamma_{u\to u+1} = e^{\beta(V-cr)}\Gamma_U^0, \quad \Gamma_{u+1\to u} = (u+1)\,\Gamma_U^0 (A3-1)

At this level, θ is a free parameter playing the same role as κσ/N_S in the complete model with the storage. We start again from the master equation for the propagator P(u,r,h,t|u_0,r_0,h_0,t_0):

\partial_t P = \left[\frac{\hat{W}_U(r)}{\tau_U} + \frac{\hat{W}_R(u,h)}{\tau_R} + \frac{\hat{W}_H}{\tau_H}\right] P, (A3-2)

where τ_U ≪ τ_R ≪ τ_H, since we are assuming, as before, that U is the fastest variable. Here, ϵ = τ_U/τ_R and δ = τ_R/τ_H. Notice that now Ŵ_R depends also on u. We can solve the system again by resorting to a timescale separation and rescaling time by the slowest timescale, τ_H. We have:

\partial_t P = \left[\epsilon^{-1}\delta^{-1}\hat{W}_U(r) + \delta^{-1}\hat{W}_R(u,h) + \hat{W}_H\right] P. (A3-3)

We now expand the propagator at first order in ϵ, P = P^{(0)} + ϵP^{(1)}. Then, the order ϵ^{-1} of the master equation gives, as above, P^{(0)} = p^{st}_{U|R}(u|r)\,\Pi(r,h,t|r_0,h_0,t_0). At order ϵ^0, we obtain Equation A3-4:

\partial_t \Pi(r,h,t|r_0,h_0,t_0) = \left[\delta^{-1}\sum_u \hat{W}_R(u,h)\, p^{st}_{U|R}(u|r) + \hat{W}_H\right]\Pi(r,h,t|r_0,h_0,t_0). (A3-4)

To solve this, we expand the propagator as Π = Π^{(0)} + δΠ^{(1)} and, at order δ^{-1}, we obtain:

\left(\sum_u \hat{W}_R(u,h)\, p^{st}_{U|R}(u|r)\right)\Pi^{(0)} = 0 (A3-5)

This is a 2×2 effective matrix acting on Π^{(0)}, where the only rate affected by u is Γ^{(C)}_{A→P}, which multiplies the active state, r = 1. Averaging this rate over the Poisson distribution p^{st}_{U|R}(u|r=1) yields the factor Θ below. The equation can be solved analytically, and its solution, Equation A3-6, is:

\Pi^{(0)} = \rho^{st}_{R|H}(r|h)\, f(h,t|h_0,t_0), \qquad \rho^{st}_{R|H}(r=0|h) = \frac{e^{\beta\Delta E}(1+\Theta)}{e^{\beta h} + 1 + e^{\beta\Delta E}(1+\Theta)} (A3-6)

with log Θ = e^{β(V-c)}(e^{βθ} - 1). Clearly, ρ^{st}_{R|H}(r|h) does not depend on u, since we summed over the fast variable. Continuing the computation, at order δ^0, we obtain:

\partial_t f(h,t|h_0,t_0) = \hat{W}_H\, f(h,t|h_0,t_0) (A3-7)

so that the full propagator is:

P^{(0)}(u,r,h,t|u_0,r_0,h_0,t_0) = p^{st}_{U|R}(u|r)\,\rho^{st}_{R|H}(r|h)\, P_H(h,t|h_0,t_0) (A3-8)

From this expression, we can find the joint probability distribution, following the same steps as before:

p_{U,R,H}(u,r,h,t) = p^{st}_{U|R}(u|r)\,\rho^{st}_{R|H}(r|h)\, p_H(h,t) (A3-9)

As expected, since U relaxes instantaneously, the feedback is instantaneous as well. As a consequence, the time-dependent behavior of the system is solely driven by the external signal H, with a fixed amplitude that accounts for the effect of the feedback only on average. This means that there is no dynamical reduction of activity and, as such, no habituation in this scenario. This was to be expected, since all variables are faster than the external signal and, as a consequence, the feedback cannot unfold over time. The first conclusion is that the variable implementing the feedback has to evolve together with H.
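As a sanity check on Equation A3-6, the following sketch (our own, with arbitrary parameter values) compares the closed form of ρ^{st}_{R|H}(r=0|h) against a direct Monte Carlo average of the u-dependent deactivation rate over the Poisson distribution of the fast readout:

```python
import numpy as np

rng = np.random.default_rng(2)
beta, dE, V, c, theta, h = 1.0, 1.0, 1.0, 2.0, 0.1, 2.0  # illustrative values

alpha = np.exp(beta * (V - c))                        # Poisson mean at r = 1
Theta = np.exp(alpha * (np.exp(beta * theta) - 1.0))  # <exp(beta*theta*u)>

# closed form of Equation A3-6
p0_exact = np.exp(beta * dE) * (1 + Theta) / (
    np.exp(beta * h) + 1 + np.exp(beta * dE) * (1 + Theta))

# Monte Carlo: average the u-dependent deactivation rate over p_{U|R}(u|r=1)
u = rng.poisson(alpha, size=200_000)
act = np.exp(beta * (h - dE)) + np.exp(-beta * dE)
deact = 1.0 + np.exp(beta * theta * u).mean()
p0_mc = deact / (act + deact)
print(p0_exact, p0_mc)  # agree within sampling error
```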

Effective dynamical feedback requires an additional population

We now assume that the feedback is again implemented by U, but that U acts as a slow variable. Formally, we take τ_R ≪ τ_U ∼ τ_H. Rescaling time by the slowest timescale, τ_H (the same works for τ_U), we have:

\partial_t P = \left[\frac{\tau_H}{\tau_U}\hat{W}_U(r) + \epsilon^{-1}\hat{W}_R(u,h) + \hat{W}_H\right] P (A3-10)

with ϵ = τ_R/τ_H. We now expand the propagator at first order in ϵ, P = P^{(0)} + ϵP^{(1)}. Then, the order ϵ^{-1} of the master equation is simply Ŵ_R P^{(0)} = 0, whose solution gives P^{(0)} = p^{st}_{R|U,H}(r|u,h)\,\Pi(u,h,t|u_0,h_0,t_0). At order ϵ^0:

\partial_t \Pi(u,h,t|u_0,h_0,t_0) = \left[\frac{\tau_H}{\tau_U}\sum_r \hat{W}_U(r)\, p^{st}_{R|U,H}(r|u,h) + \hat{W}_H\right]\Pi(u,h,t|u_0,h_0,t_0). (A3-11)

The only dependence on r in W^U(r) is through the production rate of U. Indeed, the effective transition matrix governing the birth-and-death process of readout molecules is characterized by:

\Gamma^{eff}_{u\to u+1} = e^{\beta V}\left(e^{-\beta c}\, p^{st}_{R|U,H}(r=1|u,h) + p^{st}_{R|U,H}(r=0|u,h)\right)\Gamma_U^0 (A3-12)

This rate depends on h, which evolves in time. Therefore, we should in principle scan all possible (infinitely many) values that h takes and build an infinite-dimensional transition matrix. In order to solve the system, imagine that we are looking at the interval [t_0, t_0+Δt]. Then, we can employ the following approximation if Δt ≪ τ_H:

\Gamma^{eff}_{u\to u+1}(h) \approx \Gamma^{eff}_{u\to u+1}(h_0) (A3-13)

Using this simplification, we need to solve the following equation:

\partial_t \Pi(u,h,t_0+\Delta t|u_0,h_0,t_0) = \left[\frac{\tau_H}{\tau_U}\hat{W}^{eff}_U(u,h_0) + \hat{W}_H\right]\Pi(u,h,t_0+\Delta t|u_0,h_0,t_0). (A3-14)

The explicit solution in the interval t ∈ [t_0, t_0+Δt] can be found to be:

\Pi(u,h,t_0+\Delta t|u_0,h_0,t_0) = P^{eff}_U(u,t_0+\Delta t|u_0,h_0,t_0)\, P_H(h,t_0+\Delta t|h_0,t_0) (A3-15)

with P^{eff}_U a propagator. The full propagator at time t_0+Δt is then:

P(u,r,h,t_0+\Delta t|u_0,r_0,h_0,t_0) = p^{st}_{R|U,H}(r|u,h)\, P^{eff}_U(u,t_0+\Delta t|u_0,h_0,t_0)\, P_H(h,t_0+\Delta t|h_0,t_0) (A3-16)

Integrating over the initial conditions, we finally obtain:

p_{U,R,H}(u,r,h,t_0+\Delta t) = \sum_{u_0} p^{st}_{R|U,H}(r|u,h) \int dh_0\, P^{eff}_U(u,t_0+\Delta t|u_0,h_0,t_0)\, P_H(h,t_0+\Delta t|h_0,t_0)\, p_{U,H}(u_0,h_0,t_0) (A3-17)

To numerically integrate this equation, we make two approximations. The first is that we solve the dynamics in all intervals in which the signal does not evolve, where P_H is a delta function peaked at the initial condition. For the time points at which the signal changes, this amounts to considering the signal at the previous instant, a good approximation as long as Δt ≪ τ_H, particularly when the time dependence of the signal is a square wave, as in our case.

The second approximation concerns the computation of the propagator P^{eff}_U. As explained in the Materials and methods of the main text, we restrict the computation to transitions between n nearest neighbors in the U space. In the case of transitions only among next-nearest neighbors, we have the following dynamics:

\partial_t P(u,t|u_0,h) = \hat{W}^{nn} P(u,t|u_0,h) (A3-18)

with the transition matrix:

W^{nn}_{12} = \hat{W}_{u_0\to u_0-1} = \Gamma_U^0\, u_0, \qquad W^{nn}_{13} = \hat{W}_{u_0+1\to u_0-1} = 0, \qquad W^{nn}_{21} = \hat{W}_{u_0-1\to u_0} = \Gamma^{eff}_{u_0-1\to u_0}, \qquad W^{nn}_{23} = \hat{W}_{u_0+1\to u_0} = \Gamma_U^0\,(u_0+1), \qquad W^{nn}_{31} = \hat{W}_{u_0-1\to u_0+1} = 0, \qquad W^{nn}_{32} = \hat{W}_{u_0\to u_0+1} = \Gamma^{eff}_{u_0\to u_0+1},

while the diagonal is fixed to satisfy conservation of normalization, as usual. The solution is:

P(u,\Delta t|u_0,h) = p^{st}_{U|H} + \sum_\nu w_\nu\, a^{(\nu)} e^{\lambda_\nu \Delta t} (A3-19)

where w_ν and λ_ν are, respectively, the eigenvectors and eigenvalues of the transition matrix W^{nn}. The coefficients a^{(ν)} are fixed by the condition at time t_0:

P_{U|H}(u,0|u_0,h) = p^{st}_{U|H} + \sum_\nu w_\nu\, a^{(\nu)} = \delta_{u,u_0} (A3-20)

where δ_{u,u_0} is the Kronecker delta. To evaluate the information content of this model, we also need:

p_U(u,t_0+\Delta t) = \sum_{u_0} p_U(u_0,t_0)\int dh\, P^{eff}_U(u,t_0+\Delta t|u_0,h,t_0)\, p_H(h,t_0+\Delta t), \qquad p_{U|H}(u,t_0+\Delta t|h) = \sum_{u_0} P^{eff}_U(u,t_0+\Delta t|u_0,h,t_0)\, p_U(u_0,t_0) (A3-21)
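The spectral construction of Equations A3-18 to A3-20 can be written compactly as follows; the effective rates passed in are illustrative stand-ins for Γ^{eff}, and the stationary mode is folded into the eigenvalue sum (its zero eigenvalue reproduces p^{st}_{U|H}):

```python
import numpy as np

def propagator(u0, g_up_prev, g_up, dt, G0U=1.0):
    """P(u, t0+dt | u0) on the states (u0-1, u0, u0+1); W[i, j]: rate j -> i."""
    W = np.array([[0.0,       G0U * u0, 0.0             ],
                  [g_up_prev, 0.0,      G0U * (u0 + 1.0)],
                  [0.0,       g_up,     0.0             ]])
    W -= np.diag(W.sum(axis=0))          # diagonal from normalization
    lam, w = np.linalg.eig(W)
    a = np.linalg.solve(w, np.array([0.0, 1.0, 0.0]))  # delta initial condition
    return ((w * np.exp(lam * dt)) @ a).real

print(propagator(u0=5, g_up_prev=2.0, g_up=2.0, dt=0.5))
```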

In Appendix 3—figure 1, we show that, in this model, U does not display habituation. Rather, it increases upon repeated stimuli, acting as the storage does in the model of the main text. On the other hand, the probability of the receptor being active does habituate. This suggests that habituation can only occur in fast variables modulated by slow ones.

It is straightforward to understand intuitively why a direct feedback from U, with this population undergoing a slow dynamics, cannot lead to habituation. Indeed, at a fixed distribution of the external signal, the stationary solution of U already takes into account the effect of the negative feedback. Hence, if the system starts with a very low readout population (no signal), the dynamics induced by a switching signal can only bring U to its steady state, with intervals in which the population grows and intervals in which it decreases. Loosely speaking, the dynamics of U becomes similar to that of the storage in the complete model, since it is effectively playing the same role, storing information, in this simplified context.

Appendix 3—figure 1. Dynamics of a system where U evolves on the same timescale as H and directly implements a negative feedback on the receptor.


In this model, ⟨U⟩ (in red) increases upon repeated stimulation rather than decreasing, responding to changes in ⟨H⟩ (in gray) as the storage of the full model does. On the other hand, the probability of the receptor being active, p_R(r=1) (black), shows signs of habituation.

Appendix 4

Experimental setup

Acquisitions of the zebrafish brain activity were carried out in Elavl3:H2BGCaMP6s larvae at 5 days post fertilization, raised at 28 °C on a 12 hr light/12 hr dark cycle, according to the approval by the Ethical Committee of the University of Padua (61/2020 dal Maschio). Larvae were embedded in 2% agarose gel and their brain activity was recorded using a multiphoton system with a custom 3D volumetric acquisition module. Briefly, the imaging path is based on an 8 kHz galvo-resonant commercial 2P design (Bergamo I Series, Thorlabs, Newton, NJ, United States) coupled to a Ti:Sapphire source (Chameleon Ultra II, Coherent) tuned to 920 nm for imaging GCaMP6 signals and modulated by a Pockels cell (Conoptics). The fluorescence collection path includes a 705 nm long-pass main dichroic and a 495 nm long-pass dichroic mirror transmitting the fluorescence light toward a GaAsP PMT detector (H7422PA-40, Hamamatsu) equipped with an EM525/50 emission filter. Data were acquired at 30 frames per second, using a water-dipping Nikon CFI75 LWD 16X W objective covering an effective field of view of about 450×900 μm with a resolution of 512×1024 pixels. The volumetric module is based on an electrically tunable lens (Optotune) moving continuously according to a saw-tooth waveform synchronized with the frame acquisition trigger. An entire volume of about 180–200 μm in thickness, encompassing 30 planes separated by about 7 μm, is acquired at a rate of 1 volume per second, sufficient to track the relatively slow dynamics associated with the fluorescence-based activity reporter GCaMP6s.

As for the visual stimulation, looming stimuli were generated using Stytra and presented monocularly on a 50×50 mm screen using a DLP4500 projector by Texas Instruments. The dark looming dot was presented 10 times with a 150 s interval, centered on the fish eye and with an l/v parameter of 8.3 s, reaching at the end of the stimulation a visual angle of 79.4°, corresponding to an angular expansion rate of 9.5°/s. The acquired temporal series were first processed using an automatic pipeline, including motion artifact correction, temporal filtering with a 3-s-long rectangular window, and automatic segmentation using Suite2P. Then, the obtained dataset was manually curated to resolve segmentation errors or to integrate cells not detected automatically. We fit the activity profiles of about 52,000 cells with a linear regression model (scikit-learn Python library) using a set of basis functions representing the expected responses to each of the stimulation events, obtained by convolving an exponentially decaying kernel of the GCaMP signal lifetime with square waveforms with nonzero amplitude only during the presentation of the corresponding visual stimulus. The resulting coefficients were divided by the mean squared error of the fit to obtain a set of scores. The cells whose score fell within the top 5% of the distribution were considered for the dimensionality reduction analysis.

The resulting fluorescence signals F^{(i)}, for i = 1, …, N_cells, were processed by removing a moving baseline to account for baseline drifting and fast oscillatory noise (Jia et al., 2011). Briefly, for each time point t, we selected a window [t-τ_2, t] and evaluated the minimum smoothed fluorescence,

F_0^{(i)}(t) = \min_{u\in[t-\tau_2,\, t]}\left[\frac{1}{\tau_1}\int_{u-\tau_1/2}^{u+\tau_1/2} F^{(i)}(s)\, ds\right]. (A4-1)

Then, the relative change in fluorescence signal,

R^{(i)}(t) = \frac{F^{(i)}(t) - F_0^{(i)}(t)}{F_0^{(i)}(t)} (A4-2)

is smoothed with an exponential moving average. Thus, the neural activity profile for the i-th cell that we use in the main text is given by

x^{(i)}(t) = \frac{\int_0^t R^{(i)}(t-\tau)\, w(\tau)\, d\tau}{\int_0^t w(\tau)\, d\tau}, \qquad w(\tau) = \exp\left[-\frac{\tau}{\tau_0}\right]. (A4-3)

In accordance with the previous literature (Jia et al., 2011), we set τ_0 = 0.2 s, τ_1 = 0.75 s, and τ_2 = 3 s. The qualitative nature of the low-dimensional activity in the PCA space is not altered by other sensible choices of these parameters.
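For reference, here is a sketch of this baseline-removal pipeline on a uniformly sampled trace; the sampling rate `fs` is an assumption (volumes are acquired at 1 Hz in the experiment), and the discretization of the windows is ours:

```python
import numpy as np

def dff(F, fs=1.0, tau0=0.2, tau1=0.75, tau2=3.0):
    """Equations A4-1 to A4-3 on a uniformly sampled fluorescence trace F."""
    n = len(F)
    w1 = max(1, int(round(tau1 * fs)))
    w2 = max(1, int(round(tau2 * fs)))
    smooth = np.convolve(F, np.ones(w1) / w1, mode="same")   # tau1 running mean
    F0 = np.array([smooth[max(0, i - w2):i + 1].min() for i in range(n)])
    R = (F - F0) / np.maximum(F0, 1e-12)                     # Equation A4-2
    t = np.arange(n) / fs
    x = np.empty(n)
    for i in range(n):                                       # causal EMA, A4-3
        w = np.exp(-(t[i] - t[:i + 1]) / tau0)
        x[i] = (R[:i + 1] * w).sum() / w.sum()
    return x
```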

Funding Statement

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Open access funding provided by Max Planck Society.

Contributor Information

Daniel Maria Busiello, Email: busiello@pks.mpg.de.

Arvind Murugan, University of Chicago, United States.

Aleksandra M Walczak, CNRS, France.

Funding Information

This paper was supported by the following grants:

  • Ministero dell'Università e della Ricerca MNESYS PE0000006 to Samir Suweis.

  • Swiss National Science Foundation CRSII5_186422 to Giorgio Nicoletti.

  • Max Planck Institute for the Physics of Complex Systems to Daniel Maria Busiello.

  • University of Padova ActiveInfo to Daniel Maria Busiello.

  • Ministero dell'Università e della Ricerca #NEXTGENERATIONEU (NGEU) to Samir Suweis.

Additional information

Competing interests

No competing interests declared.

Author contributions

Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Data curation, Methodology, Writing – original draft, Writing – review and editing.

Data curation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Data curation, Methodology, Writing – original draft, Writing – review and editing.

Conceptualization, Formal analysis, Funding acquisition, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing.

Ethics

Acquisitions of the zebrafish brain activity were carried out in one Elavl3:H2BGCaMP6s larva at 5 days post fertilization, raised at 28 °C on a 12 h light/12 h dark cycle, according to the approval by the Ethical Committee of the University of Padua (61/2020 dal Maschio).

Additional files

MDAR checklist

Data availability

The data to produce Figure 6 have been deposited on Zenodo and are accessible through the following link: https://doi.org/10.5281/zenodo.15683642.

The following dataset was generated:

Matteo B, Marco DM, Giorgio N. 2025. Habituation during visual stimulation in zebrafish brain activity. Zenodo.

References

  1. Abbott LF, Nelson SB. Synaptic plasticity: taming the beast. Nature Neuroscience. 2000;3 Suppl:1178–1183. doi: 10.1038/81453.
  2. Astumian RD. Kinetic asymmetry allows macromolecular catalysts to drive an information ratchet. Nature Communications. 2019;10:3837. doi: 10.1038/s41467-019-11402-7.
  3. Azeloglu EU, Iyengar R. Signaling networks: information flow, computation, and decision making. Cold Spring Harbor Perspectives in Biology. 2015;7:a005934. doi: 10.1101/cshperspect.a005934.
  4. Barato AC, Hartich D, Seifert U. Efficiency of cellular information processing. New Journal of Physics. 2014;16:103024. doi: 10.1088/1367-2630/16/10/103024.
  5. Barkai N, Leibler S. Robustness in simple biochemical networks. Nature. 1997;387:913–917. doi: 10.1038/43199.
  6. Barzon G, Busiello DM, Nicoletti G. Excitation-inhibition balance controls information encoding in neural populations. Physical Review Letters. 2025;134:068403. doi: 10.1103/PhysRevLett.134.068403.
  7. Benda J. Neural adaptation. Current Biology. 2021;31:R110–R116. doi: 10.1016/j.cub.2020.11.054.
  8. Bennett CH. The thermodynamics of computation—a review. International Journal of Theoretical Physics. 1982;21:905–940. doi: 10.1007/BF02084158.
  9. Benucci A, Saleem AB, Carandini M. Adaptation maintains population homeostasis in primary visual cortex. Nature Neuroscience. 2013;16:724–729. doi: 10.1038/nn.3382.
  10. Bilancioni M, Esposito M, Freitas N. A chemical reaction network implementation of a Maxwell demon. The Journal of Chemical Physics. 2023;159:204103. doi: 10.1063/5.0173889.
  11. Bo S, Celani A. Multiple-scale stochastic processes: Decimation, averaging and beyond. Physics Reports. 2017;670:1–59. doi: 10.1016/j.physrep.2016.12.003.
  12. Bruzzone M, Chiarello E, Albanesi M, Miletto Petrazzini ME, Megighian A, Lodovichi C, Dal Maschio M. Whole brain functional recordings at cellular resolution in zebrafish larvae with 3D scanning multiphoton microscopy. Scientific Reports. 2021;11:11048. doi: 10.1038/s41598-021-90335-y.
  13. Bueti D, Bahrami B, Walsh V, Rees G. Encoding of temporal probabilities in the human brain. The Journal of Neuroscience. 2010;30:4343–4352. doi: 10.1523/JNEUROSCI.2254-09.2010.
  14. Busiello DM, Gupta D, Maritan A. Coarse-grained entropy production with multiple reservoirs: Unraveling the role of time scales and detailed balance in biology-inspired systems. Physical Review Research. 2020;2:043257. doi: 10.1103/PhysRevResearch.2.043257.
  15. Celani A, Shimizu TS, Vergassola M. Molecular and functional aspects of bacterial chemotaxis. Journal of Statistical Physics. 2011;144:219–240. doi: 10.1007/s10955-011-0251-6.
  16. Cheong R, Rhee A, Wang CJ, Nemenman I, Levchenko A. Information transduction capacity of noisy biochemical signaling networks. Science. 2011;334:354–358. doi: 10.1126/science.1204553.
  17. Coultrap SJ, Bayer KU. CaMKII regulation in information processing and storage. Trends in Neurosciences. 2012;35:607–618. doi: 10.1016/j.tins.2012.05.003.
  18. De Los Rios P, Barducci A. Hsp70 chaperones are non-equilibrium machines that achieve ultra-affinity by energy consumption. eLife. 2014;3:e02218. doi: 10.7554/eLife.02218.
  19. De Ronde WH, Tostevin F, Ten Wolde PR. Effect of feedback on the fidelity of information transmission of time-varying signals. Physical Review E. 2010;82:031914. doi: 10.1103/PhysRevE.82.031914.
  20. De Smet R, Marchal K. Advantages and limitations of current network inference methods. Nature Reviews Microbiology. 2010;8:717–729. doi: 10.1038/nrmicro2419.
  21. Eckert L, Vidal-Saez MS, Zhao Z. Biochemically plausible models of habituation for single-cell learning. Current Biology. 2024.
  22. Flatt S, Busiello DM, Zamuner S, De Los Rios P. ABC transporters are billion-year-old Maxwell Demons. Communications Physics. 2023;6:205. doi: 10.1038/s42005-023-01320-y.
  23. Fotowat H, Engert F. Neural circuits underlying habituation of visually evoked escape behaviors in larval zebrafish. eLife. 2023;12:e82916. doi: 10.7554/eLife.82916.
  24. Frankland PW, Josselyn SA. In search of the memory molecule. Nature. 2016;535:41–42. doi: 10.1038/nature18903.
  25. Gnesotto FS, Mura F, Gladrow J, Broedersz CP. Broken detailed balance and non-equilibrium dynamics in living systems: a review. Reports on Progress in Physics. 2018;81:066601. doi: 10.1088/1361-6633/aab3ed.
  26. Hartich D, Barato AC, Seifert U. Nonequilibrium sensing and its analogy to kinetic proofreading. New Journal of Physics. 2015;17:055026. doi: 10.1088/1367-2630/17/5/055026.
  27. Hidalgo J, Grilli J, Suweis S, Muñoz MA, Banavar JR, Maritan A. Information-based fitness and the emergence of criticality in living systems. PNAS. 2014;111:10095–10100. doi: 10.1073/pnas.1319166111.
  28. Hilfinger A, Norman TM, Vinnicombe G, Paulsson J. Constraints on fluctuations in sparsely characterized biological systems. Physical Review Letters. 2016;116:058101. doi: 10.1103/PhysRevLett.116.058101.
  29. Jalaal M, Schramma N, Dode A, de Maleprade H, Raufaste C, Goldstein RE. Stress-induced dinoflagellate bioluminescence at the single cell level. Physical Review Letters. 2020;125:028102. doi: 10.1103/PhysRevLett.125.028102.
  30. Jia H, Rochefort NL, Chen X, Konnerth A. In vivo two-photon imaging of sensory-evoked dendritic calcium signals in cortical neurons. Nature Protocols. 2011;6:28–35. doi: 10.1038/nprot.2010.169.
  31. Kohn A. Visual adaptation: physiology, mechanisms, and functional benefits. Journal of Neurophysiology. 2007;97:3155–3164. doi: 10.1152/jn.00086.2007.
  32. Kollmann M, Løvdok L, Bartholomé K, Timmer J, Sourjik V. Design principles of a bacterial signalling network. Nature. 2005;438:504–507. doi: 10.1038/nature04228.
  33. Koshland DE, Goldbeter A, Stock JB. Amplification and adaptation in regulatory and sensory systems. Science. 1982;217:220–225. doi: 10.1126/science.7089556.
  34. Kurtz ZD, Müller CL, Miraldi ER, Littman DR, Blaser MJ, Bonneau RA. Sparse and compositionally robust inference of microbial ecological networks. PLOS Computational Biology. 2015;11:e1004226. doi: 10.1371/journal.pcbi.1004226.
  35. Lamiré L-A, Haesemeyer M, Engert F, Granato M, Randlett O. Inhibition drives habituation of a larval zebrafish visual response. bioRxiv. 2022. doi: 10.1101/2022.06.17.496451.
  36. Lan G, Sartori P, Neumann S, Sourjik V, Tu Y. The energy-speed-accuracy tradeoff in sensory adaptation. Nature Physics. 2012;8:422–428. doi: 10.1038/nphys2276.
  37. Lesica NA, Jin J, Weng C, Yeh CI, Butts DA, Stanley GB, Alonso JM. Adaptation to stimulus contrast and correlations during natural visual stimulation. Neuron. 2007;55:479–491. doi: 10.1016/j.neuron.2007.07.013.
  38. Lestas I, Vinnicombe G, Paulsson J. Fundamental limits on the suppression of molecular fluctuations. Nature. 2010;467:174–178. doi: 10.1038/nature09333.
  39. Lisman J, Schulman H, Cline H. The molecular basis of CaMKII function in synaptic and behavioural memory. Nature Reviews Neuroscience. 2002;3:175–190. doi: 10.1038/nrn753.
  40. Ma W, Trusina A, El-Samad H, Lim WA, Tang C. Defining network topologies that can achieve biochemical adaptation. Cell. 2009;138:760–773. doi: 10.1016/j.cell.2009.06.013.
  41. Malmierca MS, Sanchez-Vives MV, Escera C, Bendixen A. Neuronal adaptation, novelty detection and regularity encoding in audition. Frontiers in Systems Neuroscience. 2014;8:111. doi: 10.3389/fnsys.2014.00111.
  42. Marquez-Legorreta E, Constantin L, Piber M, Favre-Bulle IA, Taylor MA, Blevins AS, Giacomotto J, Bassett DS, Vanwalleghem GC, Scott EK. Brain-wide visual habituation networks in wild type and fmr1 zebrafish. Nature Communications. 2022;13:895. doi: 10.1038/s41467-022-28299-4.
  43. Martin SJ, Grimwood PD, Morris RGM. Synaptic plasticity and memory: an evaluation of the hypothesis. Annual Review of Neuroscience. 2000;23:649–711. doi: 10.1146/annurev.neuro.23.1.649.
  44. Mattingly HH, Kamino K, Machta BB, Emonet T. Escherichia coli chemotaxis is information limited. Nature Physics. 2021;17:1426–1431. doi: 10.1038/s41567-021-01380-3.
  45. Menini A. Calcium signalling and regulation in olfactory neurons. Current Opinion in Neurobiology. 1999;9:419–426. doi: 10.1016/S0959-4388(99)80063-4.
  46. Nakajima T. Biologically inspired information theory: Adaptation through construction of external reality models by living systems. Progress in Biophysics and Molecular Biology. 2015;119:634–648. doi: 10.1016/j.pbiomolbio.2015.07.008.
  47. Nemenman I. Information Theory and Adaptation. arXiv. 2012. https://arxiv.org/abs/1011.5466
  48. Ngampruetikorn V, Schwab DJ, Stephens GJ. Energy consumption and cooperation for optimal sensing. Nature Communications. 2020;11:975. doi: 10.1038/s41467-020-14806-y.
  49. Nicoletti G, Busiello DM. Mutual information disentangles interactions from changing environments. Physical Review Letters. 2021;127:228301. doi: 10.1103/PhysRevLett.127.228301.
  50. Nicoletti G, Busiello DM. Mutual information in changing environments: Nonlinear interactions, out-of-equilibrium systems, and continuously varying diffusivities. Physical Review E. 2022a;106:014153. doi: 10.1103/PhysRevE.106.014153.
  51. Nicoletti G, Maritan A, Busiello DM. Information-driven transitions in projections of underdamped dynamics. Physical Review E. 2022b;106:014118. doi: 10.1103/PhysRevE.106.014118.
  52. Nicoletti G, Busiello DM. Information propagation in multilayer systems with higher-order interactions across timescales. Physical Review X. 2024a;14:021007. doi: 10.1103/PhysRevX.14.021007.
  53. Nicoletti G, Busiello DM. Tuning transduction from hidden observables to optimize information harvesting. Physical Review Letters. 2024b;133:158401. doi: 10.1103/PhysRevLett.133.158401.
  54. Ouldridge TE, Govern CC, ten Wolde PR. Thermodynamics of computational copying in biochemical systems. Physical Review X. 2017;7:021004. doi: 10.1103/PhysRevX.7.021004.
  55. Palmer SE, Marre O, Berry MJ, Bialek W. Predictive information in a sensory population. PNAS. 2015;112:6908–6913. doi: 10.1073/pnas.1506855112.
  56. Parrondo JMR, Horowitz JM, Sagawa T. Thermodynamics of information. Nature Physics. 2015;11:131–139. doi: 10.1038/nphys3230.
  57. Penocchio E, Avanzini F, Esposito M. Information thermodynamics for deterministic chemical reaction networks. The Journal of Chemical Physics. 2022;157:034110. doi: 10.1063/5.0094849.
  58. Perkins TJ, Swain PS. Strategies for cellular decision-making. Molecular Systems Biology. 2009;5:326. doi: 10.1038/msb.2009.83.
  59. Rahi SJ, Larsch J, Pecani K, Katsov AY, Mansouri N, Tsaneva-Atanasova K, Sontag ED, Cross FR. Oscillatory stimuli differentiate adapting circuit topologies. Nature Methods. 2017;14:1010–1016. doi: 10.1038/nmeth.4408.
  60. Sagawa T, Ueda M. Minimal energy cost for thermodynamic information processing: measurement and information erasure. Physical Review Letters. 2009;102:250602. doi: 10.1103/PhysRevLett.102.250602.
  61. Sartori P, Granger L, Lee CF, Horowitz JM. Thermodynamic costs of information processing in sensory adaptation. PLOS Computational Biology. 2014;10:e1003974. doi: 10.1371/journal.pcbi.1003974.
  62. Schneidman E, Berry MJ, Segev R, Bialek W. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature. 2006;440:1007–1012. doi: 10.1038/nature04701.
  63. Sederberg AJ, MacLean JN, Palmer SE. Learning to make external sensory stimulus predictions using internal correlations in populations of neurons. PNAS. 2018;115:1105–1110. doi: 10.1073/pnas.1710779115.
  64. Selimkhanov J, Taylor B, Yao J, Pilko A, Albeck J, Hoffmann A, Tsimring L, Wollman R. Systems biology: Accurate information transmission through dynamic biochemical signaling networks. Science. 2014;346:1370–1373. doi: 10.1126/science.1254933.
  65. Seoane LF, Solé R. Phase transitions in Pareto optimal complex networks. Physical Review E. 2015;92:032807. doi: 10.1103/PhysRevE.92.032807.
  66. Shew WL, Clawson WP, Pobst J, Karimipanah Y, Wright NC, Wessel R. Adaptation to sensory input tunes visual cortex to criticality. Nature Physics. 2015;11:659–663. doi: 10.1038/nphys3370.
  67. Skoge M, Naqvi S, Meir Y, Wingreen NS. Chemical sensing by nonequilibrium cooperative receptors. Physical Review Letters. 2013;110:248102. doi: 10.1103/PhysRevLett.110.248102.
  68. Smart M, Shvartsman SY, Mönnigmann M. Minimal motifs for habituating systems. PNAS. 2024;121:e2409330121. doi: 10.1073/pnas.2409330121.
  69. Tadres D, Wong PH, To T, Moehlis J, Louis M. Depolarization block in olfactory sensory neurons expands the dimensionality of odor encoding. Science Advances. 2022;8:eade7209. doi: 10.1126/sciadv.ade7209.
  70. Thompson RF, Spencer WA. Habituation: a model phenomenon for the study of neuronal substrates of behavior. Psychological Review. 1966;73:16–43. doi: 10.1037/h0022681.
  71. Tkačik G, Marre O, Amodei D, Schneidman E, Bialek W, Berry MJ. Searching for collective behavior in a large network of sensory neurons. PLOS Computational Biology. 2014;10:e1003408. doi: 10.1371/journal.pcbi.1003408.
  72. Tkačik G, Bialek W. Information processing in living systems. Annual Review of Condensed Matter Physics. 2016;7:89–117. doi: 10.1146/annurev-conmatphys-031214-014803.
  73. Tu Y. The nonequilibrium mechanism for ultrasensitivity in a biological switch: sensing by Maxwell's demons. PNAS. 2008;105:11737–11741. doi: 10.1073/pnas.0804641105.
  74. Tu Y, Shimizu TS, Berg HC. Modeling the chemotactic response of Escherichia coli to time-varying stimuli. PNAS. 2008;105:14855–14860. doi: 10.1073/pnas.0807569105.
  75. Tunstrøm K, Katz Y, Ioannou CC, Huepe C, Lutz MJ, Couzin ID. Collective states, multistability and transitional behavior in schooling fish. PLOS Computational Biology. 2013;9:e1002915. doi: 10.1371/journal.pcbi.1002915.
  76. Wajant H, Pfizenmaier K, Scheurich P. Tumor necrosis factor signaling. Cell Death & Differentiation. 2003;10:45–65. doi: 10.1038/sj.cdd.4401189.
  77. Whiteley M, Diggle SP, Greenberg EP. Progress in and promise of bacterial quorum sensing research. Nature. 2017;551:313–320. doi: 10.1038/nature24624.
  78. Yan J, Hilfinger A, Vinnicombe G, Paulsson J. Kinetic uncertainty relations for the control of stochastic reaction networks. Physical Review Letters. 2019;123:108101. doi: 10.1103/PhysRevLett.123.108101.

eLife Assessment

Arvind Murugan 1

This manuscript presents a valuable minimal model of habituation which is quantified by information theoretic measures. The results here could be of use in interpreting habituation behavior in a range of biological systems. The evidence presented is solid, and uses simulations of the minimal model to recapitulate several hallmarks of habituation from a simple model.

Reviewer #2 (Public review):

Anonymous

In this study, the authors aim to investigate habituation, the phenomenon of progressive reduction in activity following repeated stimuli, in the context of its information-theoretic advantage. To this end, they consider a highly simplified three-species reaction network where habituation is encoded by a slow memory variable that suppresses the receptor and therefore the readout activity. Using analytical and numerical methods, they show that in their model the information gain, the difference in the mutual information between signal and readout after versus before habituation, is maximal for intermediate habituation strength. Furthermore, they demonstrate that the Pareto front corresponding to an optimization strategy that maximizes the mutual information between signal and readout in the steady state and minimizes dissipation in the system also exhibits a similar intermediate habituation strength. Finally, they briefly compare predictions of their model to whole-brain recordings of zebrafish larvae under visual stimulation.
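(To make this three-species architecture concrete, the following rough numerical sketch simulates a receptor-readout-storage motif with slow negative feedback; the functional forms, rates, and parameter values are illustrative assumptions, not the model solved in the paper.)

    import numpy as np

    dt, T = 0.01, 200.0
    t = np.arange(0.0, T, dt)
    # Pulsed external input: strong during 2-unit stimuli repeated every 20 units
    H = np.where((t % 20.0) < 2.0, 1.0, 0.05)

    kappa = 2.0                            # storage-receptor inhibition strength
    tau_R, tau_U, tau_S = 0.1, 0.1, 20.0   # fast receptor/readout, slow storage

    R, U, S = (np.zeros_like(t) for _ in range(3))
    for i in range(1, len(t)):
        drive = H[i] / (1.0 + kappa * S[i-1])           # memory suppresses the receptor
        R[i] = R[i-1] + dt * (drive - R[i-1]) / tau_R   # receptor tracks the inhibited input
        U[i] = U[i-1] + dt * (R[i-1] - U[i-1]) / tau_U  # fast readout tracks the receptor
        S[i] = S[i-1] + dt * (U[i-1] - S[i-1]) / tau_S  # slow storage integrates the readout

    # Peak readout per stimulus decreases over successive pulses: habituation
    peaks = [U[(t >= 20.0 * k) & (t < 20.0 * k + 2.0)].max() for k in range(10)]
    print(np.round(peaks, 3))

In this toy dynamics, the storage variable slowly accumulates during stimulation and divisively suppresses the receptor drive, so the peak readout decreases over successive pulses.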

The authors' simplified model serves as a good starting point for understanding habituation in different biological contexts, as the model is simple enough to allow for some analytic understanding but at the same time exhibits most of the basic properties of habituation in sensory systems. Furthermore, the authors' finding of maximal information gain for intermediate habituation strength via an optimization principle is, in general, interesting. However, the following points remain unclear:

(1) How general is their finding that the optimal Pareto front coincides with the region of maximal information gain? For instance, what happens if the signal H_st (H_max) isn't very strong? Does it matter that, in this case, H_st only has a minor influence on δQ_R? In the binary switching case, what happens if H_max is rather different from H_st (and not just 20% off)? Or in a case where the adapted value corresponds to the average of H_max and H_min?

(2) The comparison to experimental data isn't very convincing. For instance, is PCA performed simultaneously on both the experimental data set and on the model or separately? What are the units of the PCs in Fig. 6(b,c)? Given that the model parameters are chosen so that the activity decrease in the model is similar to the one in the data (i.e., that they show similar habituation in terms of the readout), isn't it expected that the dynamics in the PC1/2 space look very similar?

Reviewer #3 (Public review):

Anonymous

The authors use a generic model framework to study the emergence of habituation and its functional role from information-theoretic and energetic perspectives. Their model features a receptor, readout molecules, and a storage unit, and as such, can be applied to a wide range of biological systems. Through theoretical studies, the authors find that habituation (reduction in average activity) upon exposure to repeated stimuli should occur at intermediate degrees to achieve maximal information gain. Parameter regimes that enable these properties also result in low dissipation, suggesting that intermediate habituation is advantageous both energetically and for the purpose of retaining information about the environment.

A major strength of the work is the generality of the studied model. The presence of three units (receptor, readout, storage) operating at different time scales and executing negative feedback can be found in many domains of biology, with representative examples well discussed by the authors (e.g. Figure 1b). A key takeaway demonstrated by the authors that has wide relevance is that large information gain and large habituation cannot be attained simultaneously. When energetic considerations are accounted for, large information gain and intermediate habituation appear to be the favorable combination.

Comments on the revision:

The authors have adequately addressed the points I raised during the initial review. The text has been clarified at multiple instances, and the treatment of energy expenditure is now more rigorous. The manuscript is much improved both in terms of readability and scientific content.

eLife. 2025 Jul 28;13:RP99767. doi: 10.7554/eLife.99767.3.sa3

Author response

Giorgio Nicoletti 1, Matteo Bruzzone 2, Samir Suweis 3, Marco dal Maschio 4, Daniel Maria Busiello 5

The following is the authors’ response to the original reviews

Reviewer #1 (Public Review):

Summary:

The manuscript by Nicoletti et al. presents a minimal model of habituation, a basic form of non-associative learning, addressing from both dynamical and information-theoretic perspectives how habituation can be realized. The authors identify that negative feedback provided by a slow storage mechanism is sufficient to explain habituation.

Strengths:

The authors combine the identification of the dynamical mechanism with information-theoretic measures to determine the onset of habituation and provide a description of how the system can gain maximum information about the environment.

We thank the reviewer for highlighting the strength of our work and for their comments, which we believe have been instrumental in significantly improving our work and its scope. Below, we address all their concerns.

Weaknesses:

I have several main concerns/questions about the proposed model for habituation and its plausibility. In general, habituation does not only refer to a decrease in the responsiveness upon repeated stimulation but as Thompson and Spencer discussed in Psych. Rev. 73, 16-43 (1966), there are 10 main characteristics of habituation, including (i) spontaneous recovery when the stimulus is withheld after response decrement; dependence on the frequency of stimulation such that (ii) more frequent stimulation results in more rapid and/or more pronounced response decrement and more rapid spontaneous recovery; (iii) within a stimulus modality, the less intense the stimulus, the more rapid and/or more pronounced the behavioral response decrement; (iv) the effects of repeated stimulation may continue to accumulate even after the response has reached an asymptotic level (which may or may not be zero, or no response). This effect of stimulation beyond asymptotic levels can alter subsequent behavior, for example, by delaying the onset of spontaneous recovery.

These are only a subset of the conditions that have been experimentally observed and therefore a mechanistic model of habituation, in my understanding, should capture the majority of these features and/or discuss the absence of such features from the proposed model.

We are really grateful to the reviewer for pointing out these aspects of habituation that we overlooked in the previous version of our manuscript. Indeed, our model is able to capture most of these 10 observed behaviors, specifically: (1) habituation; (2) spontaneous recovery; (3) potentiation of habituation; (4) frequency sensitivity; (5) intensity sensitivity; (6) subliminal accumulation. Here, we are following the same terminology employed in Eckert et al., Current Biology 34, 5646–5658 (2024), the paper highlighted by the reviewer. We have dedicated a section of the revised version of the manuscript to these hallmarks, substantiating the validity of our framework as a minimal model of habituation. We remark that these are the sole hallmarks that can be discussed by considering a single external stimulus and that can be identified without ambiguity in a biochemical context. This observation is again in line with Eckert et al., Current Biology 34, 5646–5658 (2024).

In the revised version, we employ the same strategy as the aforementioned work to determine when the system can be considered “habituated”. Indeed, we introduce a response threshold that is now discussed in the manuscript. We also included a note in the discussions stating that, since any biochemical model will eventually reach a steady state, subliminal accumulation, for example, can only be seen with the use of a threshold. The introduction of different storage mechanisms, ideally more detailed at the molecular level, can shed light on this conceptual gap. This is an interesting direction of research.
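As an illustration of such a threshold criterion, a minimal sketch follows: the system is declared habituated at the first stimulus whose peak response falls below a fixed fraction of the initial peak. The function name, the 50% default, and the example peak sequence are hypothetical choices, not the values used in the manuscript.

    def habituation_onset(peaks, threshold=0.5):
        """Index of the first stimulus whose peak response drops below
        `threshold` times the initial response; None if never reached."""
        baseline = peaks[0]
        for n, p in enumerate(peaks):
            if p < threshold * baseline:
                return n
        return None

    # Hypothetical decaying peak sequence: habituated at the fifth stimulus
    print(habituation_onset([1.0, 0.8, 0.62, 0.51, 0.45, 0.42]))  # -> 4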

Furthermore, the habituated response in steady-state is approximately 20% less than the initial response, which seems to be achieved already after 3-4 pulses; the subsequent change in response amplitude seems to be negligible, although the authors state "after a large number of inputs, the system reaches a time-periodic steady-state". How do the authors justify these minimal decreases in the response amplitude? Does this come from the model parametrization, and is there a parameter range where more pronounced habituation responses can be observed?

The reviewer is correct, but this is solely a consequence of the specific set of parameters we selected; we made this choice for visualization purposes in the previous version. In the revised version, in the section discussing the hallmarks of habituation, we also show other parameter choices for which the response decrement is more pronounced. Moreover, we remark that the contour plot of Δ⟨U⟩ clearly shows that the decrement can far exceed the 20% threshold presented in the previous version.

In the revised version, also in light of the works highlighted by the reviewer, we decided to move the focus of the manuscript to the information-theoretic advantage of habituation. As such, we modified several parts of the main text. Also, in the region of optimal information gain, habituation is at an intermediate level. For this reason, we decided to keep the same parameter choice as the previous version in Figure 2.

We stated that the time-periodic steady-state is reached “after a large number of stimuli” from a mathematical perspective. However, by using a habituation threshold, as done in Eckert et al., Current Biology 34, 5646–5658 (2024), we can state that the system is habituated after a few stimuli for each set of parameters. This aspect is highlighted in the revised version of the manuscript (see also the point above).

The same is true for the information content (Figure 2f) - already at the first pulse, I_{U,H} ≈ 0.7, and it only negligibly increases afterwards. In my understanding, during learning, the mutual information between the input and the internal state increases over time, and from this the system extracts predictions about its responses. In the model presented by the authors, it seems the system already carries information about the environment which hardly changes with repeated stimulus presentation. The complexity of the signal is also limited, and it is very hard to clarify from the presented results whether the proposed model can actually explain basic features of habituation, as mentioned above.

As for the response decrement of the readout, we can certainly choose a set of parameters for which the information gain is higher. In the revised version, we also report the information at the first stimulation and when the system is habituated to give a better idea of the range of these quantities. At any rate, as the referee correctly points out, it is difficult to give an intuitive interpretation of the information in our minimal model.

It is also important to remark that, since the readout population and the receptor both undergo fast dynamics (with appropriate timescales as discussed in the text), we are not observing the transient gain of information associated with the first stimulus. As such, the mutual information presents a discontinuous behavior that resembles the dynamics of the readout, thereby starting at a non-zero value already at the first stimulus.
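As an illustration of how such a quantity can be estimated numerically, the following is a generic plug-in (histogram) estimator of the mutual information between paired input and readout samples; it is a crude stand-in for the analytical characterization used in the paper, and the example data are hypothetical:

    import numpy as np

    def mutual_information(x, y, bins=16):
        # Plug-in estimate of I(X;Y) in bits from paired samples,
        # via a joint histogram; for illustration only.
        pxy, _, _ = np.histogram2d(x, y, bins=bins)
        pxy = pxy / pxy.sum()
        px = pxy.sum(axis=1, keepdims=True)   # marginal of x
        py = pxy.sum(axis=0, keepdims=True)   # marginal of y
        mask = pxy > 0
        return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())

    # Hypothetical example: a binary input and a noisy readout tracking it
    rng = np.random.default_rng(0)
    h = rng.choice([0.05, 1.0], size=20_000)
    u = h / 1.3 + 0.1 * rng.normal(size=h.size)
    print(round(mutual_information(h, u), 3))  # close to 1 bit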

Additionally, there have been two recent models on habituation and I strongly suggest that the authors discuss their work in relation to recent works (bioRxiv 2024.08.04.606534; arXiv:2407.18204).

We thank the reviewer for pointing out these relevant references. In the revised version, we highlighted that we discuss the information-theoretic aspects of habituation, while the aforementioned references focus on the dynamics of this phenomenon.

Reviewer #1 (Recommendations for the authors):

I would also like to note here the simplification of the proposed biological model - in particular, that the receptor can be in an active/passive state, as well as proposing the NF-κB signaling module as a possible molecular realization. Generally, a large number of cell surface receptors, including RTKs or GPCRs, have much more complex dynamics, including autocatalytic activation that generally leads to bistability, and NF-κB has been demonstrated to have oscillatory, even chaotic, dynamics (works of Savas Tay, Mogens Jensen, and others). Considering this, the authors should at least discuss under which conditions TNF-α signaling could potentially serve as a molecular realisation for habituation.

We thank the reviewer for bringing this to our attention. In the previous version, we reported the TNF signaling network only to show a similar coarse-grained modular structure. However, following a suggestion of reviewer #2, we decided to change Figure 1 to include a simplified molecular scheme of chemotaxis rather than TNF signaling, to avoid any source of confusion about this issue.

Also, a minor point: Figures 2d-e are cited before 2a-c.

We apologize for the oversight. The structure of the Figures and their order is now significantly different, and they are now cited in the correct order.

Reviewer #2 (Public review):

In this study, the authors aim to investigate habituation, the phenomenon of progressive reduction in activity following repeated stimuli, in the context of its information-theoretic advantage. To this end, they consider a highly simplified three-species reaction network where habituation is encoded by a slow memory variable that suppresses the receptor and therefore the readout activity. Using analytical and numerical methods, they show that in their model the information gain, the difference in the mutual information between signal and readout after versus before habituation, is maximal for intermediate habituation strength. Furthermore, they demonstrate that the Pareto front corresponding to an optimization strategy that maximizes the mutual information between signal and readout in the steady state and minimizes some form of dissipation also exhibits a similar intermediate habituation strength. Finally, they briefly compare predictions of their model to whole-brain recordings of zebrafish larvae under visual stimulation.

The authors' simplified model might serve as a solid starting point for understanding habituation in different biological contexts, as the model is simple enough to allow for some analytic understanding but at the same time exhibits all the basic properties of habituation in sensory systems. Furthermore, the authors' finding of maximal information gain for intermediate habituation strength via an optimization principle is, in general, interesting. However, the following points remain unclear or are weakly explained:

We thank the reviewer for deeming our work interesting and for considering it a solid starting point for understanding habituation in biological systems.

(1) It is unclear what the finding of maximal information gain for intermediate habituation strength means for biological systems. Why is information gain as defined in the paper a relevant quantity for an organism/cell? For instance, why is a system with low mutual information after the first stimulus and intermediate mutual information after habituation better than one with consistently intermediate mutual information? Or, in other words, couldn't the system try to maximize the mutual information acquired over the whole time series, e.g., the time-series mutual information between the stimulus and readout?

This is a delicate aspect to discuss, and we thank the referee for the comment. In the revised version, we report the information gain together with the initial and final information, highlighting that both the gain and the final information are higher in regions where habituation is present. They have qualitatively similar behavior and highlight a clear information-theoretic advantage of this dynamical phenomenon. An important point is that, to determine the optimal Pareto front, we consider a prolonged stimulus and its associated steady-state information. Therefore, from the optimization point of view, there is no notion of “information gain” or “final information”, which are intrinsically dynamical quantities. As a result, the fact that the optimal curve lies in the region of maximal information gain is not expected a priori and hints at the potentially crucial role of this feature. In the revised version, we elucidate this aspect with several additional analyses.

We would like to add that, from a naive perspective, while the first stimulation will necessarily trigger a certain (non-zero) mutual information, multiple observations of the same stimulus must be reflected in accumulated information, which in turn drives the onset of the observed dynamical behaviors, such as habituation.

(2) The model is very similar to (or a simplification of previous models) for adaptation in living systems, e.g., for adaptation in chemotaxis via activity-dependent methylation and demethylation. This should be made clearer.

We apologize for having missed this point. Our choice was motivated by the fact that we wanted to avoid confusion between the usual definition of (perfect) adaptation and habituation. However, we believe this risk of confusion no longer applies to the revised manuscript, and we now include chemotaxis as an example in Figure 1.

(3) It remains unclear why this optimization principle is the most relevant one. While it makes sense to maximize the mutual information between stimulus and readout, there are various choices for what kind of dissipation is minimized. Why was δQ_R chosen and not, for instance, Σ̇_int or the sum of both? How would the results change in that case? And how different are the results if the mutual information is not calculated for the strong stimulation input statistics but for the background one?

We thank the reviewer for the suggestion. We agree that, a priori, there is no reason to choose δQ_R rather than a function of the internal energy flux J_int (which, in the revised version, we use in place of Σ̇_int following the suggestion of reviewer #3). The rationale was to minimize δQ_R since this dissipation is unavoidable and stems from the presence of the storage inhibiting the receptor through the internal pathway. Indeed, considering the existence of two different pathways implementing sensing and feedback, the presence of any input will result in a dissipation produced by the receptor. This energy consumption is reflected in δQ_R.

In the revised version, we now include two energy contributions in the optimization principle (see Eq. (14) of the revised manuscript): δQ_R and E_int, the energy consumption associated with the driven storage production. All Figures have been updated accordingly. The results remain similar, as δQ_R still represents the main contribution, especially at high β.

Furthermore, in the revised version, we include examples of the Pareto optimization for different values of the input strength. As detailed both in the main text and the Supplementary Information, changing the value of ⟨H⟩ moves the Pareto frontier in the (β, σ) space, since the signal needs to be strong enough for the system to distinguish it from the intrinsic thermal noise (controlled by β). We also show that if the system is able to tune the inhibition strength κ, the Pareto frontiers at different ⟨H⟩ collapse onto a single curve. This shows that, although the values of, e.g., the mutual information depend on ⟨H⟩, the qualitative behavior of the system in this regime is effectively independent of it. We also added more details about this in the Supplementary Information.
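The Pareto construction itself is standard and independent of the model details: among candidate parameter points, each scored by a mutual information to maximize and a dissipation to minimize, the frontier consists of the non-dominated points. A generic sketch with placeholder objective values (not the model's):

    import numpy as np

    def pareto_front(info, cost):
        # Indices of non-dominated points when maximizing `info` and
        # minimizing `cost`; assumes distinct objective values.
        idx = np.argsort(-info)            # best information first
        front, best_cost = [], np.inf
        for i in idx:
            if cost[i] < best_cost:        # cheaper than all better-informed points
                front.append(i)
                best_cost = cost[i]
        return np.array(front)

    # Placeholder scores standing in for a scan over (β, σ)
    rng = np.random.default_rng(1)
    info, cost = rng.uniform(size=500), rng.uniform(size=500)
    print(pareto_front(info, cost)[:5])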

(4) The comparison to the experimental data is not too strong of an argument in favor of the model. Is the agreement between the model and the experimental data surprising? What other behavior in the PCA space could one have expected in the data? Shouldn't the 1st PC mostly reflect the "features", by construction, and other variability should be due to progressively reduced activity levels?

The agreement between data and model is not surprising - we agree on this - since the data exhibit habituation. However, we believe that the fact that our minimal model is able to capture the features of a complex neural system just by looking at the PCs, without any explicit biological details, is non-trivial. We also stress that the 1st PC only reflects the feature that captures most of the variance of the data and, as such, it is difficult to have a priori expectations on what it should represent. In the case of the data generated from the model, most of the variance of the activity comes from the switching signal, and similar considerations can be made for the looming stimulations in the data. We updated the manuscript to clarify this point.

Reviewer #2 (Recommendations for the authors):

(1) The abstract makes it sound like a new finding is that habituation is due to a slow, negative feedback mechanism. But, as mentioned in the introduction, this is a well-known fact.

We agree with the reviewer. We have revised the abstract.

(2) Figure 2c: Why does the range of ΔΔI_f include negative values if the corresponding region is shaded (right-tilted stripes)?

The negative values in the range are those attained in the shaded region with right-tilted stripes. We decided to include them in the colorbar for clarity, since ΔΔI_f is also plotted in the region where it attains negative values.

(3) What does the Pareto front look like if the optimization is done for input statistics given by ⟨H⟩_min?

In the revised version, we include examples of the Pareto optimization for different values of input strength. As detailed both in the main text and the Supplementary Information, changing the value of ⟨H⟩ moves the Pareto frontier in the (\beta, \sigma) space, since the strength of the signal is crucial for the system to discriminate input and thermal noise (see also the answers above).

In particular, in Figure 4 we explicitly compare the results of the Pareto optimization (which is done with a static input of a given statistics) with the dynamics of the model for different values of ⟨H⟩ in two scenarios, i.e., adaptive and non-adaptive inhibition strength (see answers above for details).

We also remark that ⟨H⟩_min represents the background signal that the system is not trying to capture, which is why we never used it for optimization.

(4) From the main text, it is rather difficult to understand how the comparison to the experimental data was performed. How was the PCA done exactly? What are the "features" of the evoked neural response?

The PCA on the data is performed starting from the single-neuron calcium dynamics. To perform a fair comparison, we reconstruct a similar but extremely simplified dynamics using our model, as explained in Methods, and perform the PCA on the analogous simulated data. We added a comment on this in the revised version. While these components capture most of the variance in the data, their specific interpretation is usually out of reach, and we believe that it lies beyond the scope of this theoretical work. We also remark that the model does not contain all these biological details - a strong aspect in our opinion - and, as such, it cannot capture specific biological features.
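For reference, the projection step itself is standard SVD-based PCA. A bare-bones sketch, assuming an activity matrix X of shape (time × units) built from either recorded calcium traces or simulated readouts; the array shapes are illustrative:

    import numpy as np

    def pca_project(X, n_components=2):
        # Center the data and project onto the leading principal
        # components obtained from the SVD of the centered matrix.
        Xc = X - X.mean(axis=0)
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        scores = Xc @ Vt[:n_components].T   # PC1, PC2 time courses
        return scores, Vt[:n_components]

    # Hypothetical usage on (time x units) traces
    X = np.random.default_rng(2).normal(size=(1000, 50))
    scores, comps = pca_project(X)
    print(scores.shape, comps.shape)  # (1000, 2) (2, 50)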

Reviewer #3 (Public review):

The authors use a generic model framework to study the emergence of habituation and its functional role from information-theoretic and energetic perspectives. Their model features a receptor, readout molecules, and a storage unit, and as such, can be applied to a wide range of biological systems. Through theoretical studies, the authors find that habituation (reduction in average activity) upon exposure to repeated stimuli should occur at intermediate degrees to achieve maximal information gain. Parameter regimes that enable these properties also result in low dissipation, suggesting that intermediate habituation is advantageous both energetically and for the purpose of retaining information about the environment.

A major strength of the work is the generality of the studied model. The presence of three units (receptor, readout, storage) operating at different time scales and executing negative feedback can be found in many domains of biology, with representative examples well discussed by the authors (e.g. Figure 1b). A key takeaway demonstrated by the authors that has wide relevance is that large information gain and large habituation cannot be attained simultaneously. When energetic considerations are accounted for, large information gain and intermediate habituation appear to be a favorable combination.

We thank the reviewer for this positive assessment of our work and its generality.

While the generic approach of coarse-graining most biological detail is appealing and the results are of broad relevance, some aspects of the conducted studies, the problem setup, and the writing lack clarity and should be addressed:

(1) The abstract can be further sharpened. Specifically, the "functional role" mentioned at the end can be made more explicit, as it was done in the second-to-last paragraph of the Introduction section ("its functional advantages in terms of information gain and energy dissipation"). In addition, the abstract mentions the testing against experimental measurements of neural responses but does not specify the main takeaways. I suggest the authors briefly describe the main conclusions of their experimental study in the abstract.

We thank the reviewer for raising this point. In the revised version, we have changed the abstract to reflect the reviewer’s points and the new structure and results of the manuscript.

(2) Several clarifications are needed on the treatment of energy dissipation.

- When substituting the rates in Eq. (1) into the definition of δQ_R above Eq. (10), "σ" does not appear on the right-hand side. Does this mean that one of the rates in the lower pathway must include σ in its definition? Please clarify.

We apologize to the reviewer for this typo. Indeed, σ sets the energy scale of the feedback and, as such, it appears in the energetic driving given by the feedback on the receptor, i.e., in Eq. (1) together with κ. This typo has been corrected in the revised manuscript, and all subsequent equations are consistent.

- I understand that the production of storage molecules has an associated cost σ and hence contributes to dissipation. The dependence of receptor dissipation on ⟨H⟩, however, is not fully clear. If the environment were static and the memory block was absent, the term with ⟨H⟩ would still contribute to dissipation. What would be the nature of this dissipation?

In the spirit of building a paradigmatic minimal model with a thermodynamic meaning, we considered H to act as an external thermodynamic driving. Since this driving acts on a different pathway with respect to the one affected by the storage, the receptor is driven out of equilibrium by its presence.

By eliminating the memory block, we would necessarily also eliminate the pathway associated with the storage effect (the “internal pathway” in the manuscript), since its existence is solely due to the presence of a storage population. Therefore, in this case, the receptor would be a 2-state, 1-pathway system and, as such, it would always satisfy an effective detailed balance. As a consequence, the definition of δQ_R reported in the manuscript would no longer hold, and the receptor would not exhibit any dissipation. Thus, in a static environment and without a memory block, no receptor dissipation would be present. We would also like to stress that our choice to model two different pathways has been motivated by the observation that, in several biochemical and biological examples, the negative feedback acts along a different pathway. We made some changes to the model description in the revised version, and we hope that this aspect has been clarified.

- Similarly, in Eq. (9) the authors use the ratio of the rates Γ_{s → s+1} and Γ_{s+1 → s} in their expression for internal dissipation. The first rate corresponds to the synthesis reaction of memory molecules, while the second corresponds to a degradation reaction. Since the second reaction is not the microscopic reverse of the first, what would be the physical interpretation of the log of their ratio? Since the authors already use σ as the energy cost per storage unit, why not use σ times the rate of producing S as a metric for the dissipation rate?

We agree with the referee that the reverse reaction we considered is not the microscopic reverse of the storage production. In the case of a fast readout population, we employed a coarse-grained view to compute this entropy production. To be more precise, we gladly welcomed the referee’s suggestion in the revised version and modified the manuscript accordingly. As suggested, we now employ the energy flux associated with the storage production to estimate the internal dissipation (see new Fig. 3).
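Concretely, following the referee's suggestion, one natural reading of this estimate is (a sketch in the notation used here; the exact expression in the revised manuscript may differ):

    J_int(t) = σ · Γ_{s → s+1}(t),

i.e., the energy cost per storage unit times the instantaneous synthesis flux of storage molecules, so that the internal energy consumption over a window [0, T] is E_int = ∫₀ᵀ J_int(t) dt.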

In the revised version, we also use this quantity in the optimization procedure in combination with δQ_R (see new Fig. 4) to have a complete characterization of the system’s energy consumption. The conclusions are qualitatively identical to before, but we believe that they are now more solid from a theoretical perspective. For this important advance in the robustness and quality of our work, we are profoundly grateful to the referee.

(3) Impact of the pre-stimulus state. The plots in Figure 2 suggest that the environment was static before the application of repeated stimuli. Can the authors comment on the impact of the pre-stimulus state on the degree of habituation and its optimality properties? Specifically, would the conclusions stay the same if the prior environment had stochastic but aperiodic dynamics?

The initial stimulus is indeed stochastic, with an average that is constant in time, and mimics the background (small) signal. We apply the (strong) stimulation once the system has already reached a stationary state with respect to the background. As can be appreciated in Fig. 2 of the revised version, the model response depends on the pre-stimulus level, since it sets the storage concentration before the stimulation arrives and, as such, the subsequent habituation dynamics. This dependence is important from a dynamical perspective. The information-theoretic picture has been developed, as said above, by letting the system relax before the first stimulus. This eliminates this arbitrary dependence and provides a clearer idea of the functional advantages of habituation. Moreover, the optimization procedure is performed in a completely different setting, with no pre-stimulus at all, since we only have one prolonged stimulation. We hope that the revised version is clearer on all these points.

(4) Clarification about the memory requirement for habituation. Figure 4 and the associated section argue for the essential role that the storage mechanism plays in habituation. Indeed, Figure 4a shows that the degree of habituation decreases with decreasing memory. The graph also shows that in the limit of vanishingly small Δ⟨S⟩, the system can still exhibit a finite degree of habituation. Can the authors explain this limiting behavior; specifically, why does habituation not vanish in the limit Δ⟨S⟩ → 0?

We apologize for the lack of clarity and we thank the reviewer for spotting this issue. In Figure 4 (now Figure 5 in the revised manuscript) Δ⟨S⟩ is not exactly zero, but equal to 0.15% at the final point. It appeared as 0% in the plot due to an unwanted rounding in the plotting function that we missed. This has been fixed in the revised version, thank you.

Reviewer #3 (Recommendations for the authors):

(1) Page 2 | "Figure 1b-e" should be "Figure 1b-d" since there is no panel (e) in Figure 1.

(2) Figure 1a | In the top schematic, the symbol "k" is used, while in the rest of the text, the proportionality constant is denoted by κ.

We thank the reviewer for pointing this out. Figure 1 has been revised and the panels are now consistent. The proportionality constant (the inhibition strength) has also been fixed.

(3) Figure 1a | I find the upper part of the schematic for Storage hard to perceive. I understand the lower part stands for the degradation reaction for storage molecules. The upper part stands for the synthesis reaction catalyzed by the readout population. I think the bolded upper arrow would explain it sufficiently well; the left/right arrows, together with the crossed green circle make that part of the figure confusing. Consider simplifying.

We decided to remove the left/right arrows, as suggested by the reviewer, as we agree that they were unnecessarily complicating the schematic. We hope that the revised version will be easier to understand.

(4) Page 3 | It would be helpful to state what the temporal statistics of the input signal p_H(h,t) are, i.e., ⟨h(t)h(t′)⟩. Looking at the example trajectory in Figure 1a, consecutive signal values do not seem correlated.

We agree with the reviewer that this is an important detail and worth mentioning. We now explicitly state that consecutive values are not correlated, for simplicity.
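Explicitly, for an input with uncorrelated consecutive values, the connected correlation reduces to (a sketch in the notation above, with δ_{t,t′} a Kronecker delta between distinct stimulus draws):

    ⟨h(t) h(t′)⟩ − ⟨h(t)⟩⟨h(t′)⟩ = Var[h] δ_{t,t′}.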

(5) Figure 2 | I believe the label "EXTERNAL INPUT" refers to the *average* external input, not one specific realization (similar to panels (d) and (e) that report on average metrics). I suggest you indicate this in the label, or, what may be even better, add one particular realization of the stochastic input to the same graph.

We thank the reviewer for spotting this. We now write that what we show is the average external signal. We prefer this solution rather than showing a realization of the stochastic input, since it is more consistent with the rest of the plots, where we always show average quantities. We also note that Figure 2 is now Figure 3 in the revised manuscript.

(6) Figure 2d | The expression of Δ⟨U⟩ is the negative of the definition in Eq. (5). It should be corrected.

In the revised version, both the definitions in Figure 2 (now Figure 3) and in the text (now Eq. (11)) are consistent.

(7) Figure 3(d-e) caption | "where ⟨U⟩ starts to be significantly smaller than zero." There, it should be Δ⟨U⟩ instead of ⟨U⟩.

Thanks again, we corrected this typo.

Associated Data


    Data Citations

    1. Matteo B, Marco DM, Giorgio N. 2025. Habituation during visual stimulation in zebrafish brain activity. Zenodo. https://doi.org/10.5281/zenodo.15683642

    Supplementary Materials

    MDAR checklist

    Data Availability Statement

    The data to produce Figure 6 have been deposited on Zenodo and are accessible through the following link: https://doi.org/10.5281/zenodo.15683642.

    The following dataset was generated:

    Matteo B, Marco DM, Giorgio N. 2025. Habituation during visual stimulation in zebrafish brain activity. Zenodo.

