(A) History dependence R is estimated from the observed joint statistics of current spiking in a small time bin [t, t + Δt) (dark grey) and the embedded past, i.e. a binary sequence representing past spiking in a past window [t − T, t). We systematically vary the number of bins d and the bin sizes for a fixed past range T. Bin sizes scale exponentially with the bin index, with a scaling exponent κ, to reduce the resolution for spikes farther into the past. (B) The joint statistics of current and past spiking are obtained by shifting the past range in steps of Δt and counting the resulting binary sequences. (C) Finding a good choice of embedding parameters (e.g. embedding dimension d) is challenging: When d is chosen too small, the true history dependence R(T) (dashed line) is not captured appropriately (insufficient embedding) and is underestimated by the estimates (blue solid line). When d is chosen too large, estimates are severely biased, and R(T, d), as well as R(T), is overestimated (biased regime). Past-embedding optimization finds the optimal embedding parameter d* that maximizes the estimated history dependence subject to regularization. This yields a best estimate of R(T) (blue diamond). (D) Estimation of history dependence R(T) as a function of past range T. For each past range T, embedding parameters d and κ are optimized to yield an embedding-optimized estimate of R(T). From these estimates, we obtain estimates of the information timescale τR and the total history dependence Rtot (vertical and horizontal dashed lines). To compute the estimate of Rtot, we average the embedding-optimized estimates over an interval [TD, Tmax] in which they reach a plateau (vertical blue bars, see Materials and methods). For large past ranges T, estimates may decrease because reliable estimation requires past embeddings with reduced temporal resolution.
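
The embedding and counting steps described in (A) and (B) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `past_bin_edges` and `count_joint`, the explicit `duration` argument, and the base-10 form of the exponential bin scaling are assumptions made for illustration, since the caption only states that bin sizes scale exponentially with bin index and κ.

```python
from collections import Counter

import numpy as np


def past_bin_edges(T, d, kappa):
    """Bin edges, expressed as time lags before t, for d bins covering range T.

    Assumption (not stated in the caption): bin j, with j = 0 closest to the
    present, has width tau_0 * 10**(j * kappa), and tau_0 is set so that the
    widths sum to T. kappa = 0 gives d equal bins.
    """
    widths = 10.0 ** (kappa * np.arange(d))
    widths *= T / widths.sum()
    return np.concatenate(([0.0], np.cumsum(widths)))


def count_joint(spike_times, duration, T, d, kappa, dt):
    """Count (past word, current spiking) pairs while sliding t in steps of dt."""
    s = np.sort(np.asarray(spike_times, dtype=float))
    edges = past_bin_edges(T, d, kappa)
    # Number of positions of t for which [t - T, t + dt) lies in the recording.
    n_steps = int(np.floor((duration - T) / dt + 1e-9))
    counts = Counter()
    for k in range(n_steps):
        t = T + k * dt
        # Current spiking: 1 if any spike falls in [t, t + dt).
        lo = np.searchsorted(s, t, side="left")
        hi = np.searchsorted(s, t + dt, side="left")
        current = int(hi > lo)
        # Past spikes in [t - T, t), converted to lags in (0, T].
        first = np.searchsorted(s, t - T, side="left")
        lags = t - s[first:lo]
        # Bin j holds lags in (edges[j], edges[j + 1]]; clip guards rounding.
        bins = np.clip(np.searchsorted(edges, lags, side="left") - 1, 0, d - 1)
        word = np.zeros(d, dtype=int)
        word[bins] = 1
        counts[(tuple(int(b) for b in word), current)] += 1
    return counts
```

For example, `count_joint(spike_times, duration=60.0, T=1.0, d=5, kappa=0.5, dt=0.005)` would return a `Counter` keyed by `(past_word, current)` pairs. Such counts are the raw material from which an estimator of R could be computed for each choice of d and κ, which is what the optimization in (C) compares across embeddings.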