Ecology. 2022 Jun 9;103(8):e3718. doi: 10.1002/ecy.3718

A real‐time data assimilative forecasting system for animal tracking

Marine Randon 1, Michael Dowd 2, Ruth Joy 1,3
PMCID: PMC9541799  PMID: 35405019

Abstract

Monitoring technologies now provide real‐time animal location information, which opens up the possibility of developing forecasting systems to fuse these data with movement models to predict future trajectories. State‐space modeling approaches are well established for retrospective location estimation and behavioral inference through state and parameter estimation. Here we use a state‐space model within a comprehensive data assimilative framework for probabilistic animal movement forecasting. Real‐time location information is combined with stochastic movement model predictions to provide forecasts of future animal locations and trajectories, as well as estimates of key behavioral parameters. Implementation uses ensemble‐based sequential Monte Carlo methods (a particle filter). We first apply the framework to an idealized case using a nondimensional animal movement model based on a continuous‐time random walk process. A set of numerical forecasting experiments demonstrates the workflow and key features, such as the online estimation of behavioral parameters using state augmentation, the use of potential functions for habitat preference, and the role of observation error and sampling frequency on forecast skill. For a realistic demonstration, we adapt the framework to short‐term forecasting of the endangered southern resident killer whale (SRKW) in the Salish Sea using visual sighting information wherein the potential function reflects historical habitat utilization of SRKW. We successfully estimate whale locations up to 2.5 h in advance with a moderate prediction error (<5 km), providing reasonable lead‐in time to mitigate vessel–whale interactions. It is argued that this forecasting framework can be used to synthesize diverse data types and improve animal movement models and behavioral understanding and has the potential to lead to important advances in movement ecology.

Keywords: animal movement, continuous‐time correlated random walk, data assimilation, ecological forecasting, particle filter, potential function, southern resident killer whale, state augmentation, state‐space models, trajectory prediction, whale collision avoidance

INTRODUCTION

Understanding ecological processes relies on our ability to make predictions and confront them with observations to refine hypotheses and theories. This is also the essence of the emerging field of ecological forecasting, which has arisen due to the many new data types becoming available. Ecological forecasting differs from standard statistical projection methods by its iterative nature and its reliance on dynamic models. The central idea is to generate forecasts of a future ecological state using dynamic models of ecological processes, compare the predictions to observations, and then refine hypotheses and models to improve predictive skill (Dietze et al., 2018). The focus on forecasting shifts the emphasis to the iterative refinement of ecological dynamic models, as well as to identifying key observational needs, thereby driving understanding and advancement of the ecological sciences.

Dynamical ecological forecasting is distinct from forecasting via statistical prediction. The former is based on using mechanistic or process‐based models to project an ecological system forward in time, whereas the latter is based on using the established statistical models together with forecasts of their key environmental predictors. For instance, correlative species distributions project future animal distributions according to forecasts of their environmental drivers (e.g., Barlow & Torres, 2021; Breece et al., 2021). This type of model is being used increasingly for managing human–wildlife conflicts in real time for the purpose of limiting the probability of encounter. However, statistical predictions have a limited ability to incorporate ecological processes and dynamics (Yates et al., 2018), and they rely upon existing conditions, which may or may not hold in the future. Hence, transitioning from empirical to dynamical models may lead to better ecological forecasting (Payne et al., 2017). Dynamics‐based ecological forecasting relies on accurate initial conditions and a useful mathematical description of processes that can project the ecological system state into the future. Contrary to correlation‐based forecasts, dynamics‐based forecasts are process‐based and can actively learn from real‐time observations when embedded in a data assimilative framework (Kitagawa, 1998). They can thus adapt to changing environmental conditions and structural ecological changes. Dynamics‐based forecasting has not been extensively applied in ecology (Dowd et al., 2014; Payne et al., 2017), especially in the field of animal movement; doing so constitutes the purpose of this paper.

New technologies for animal tracking (e.g., satellite tags, acoustic and electromagnetic detection) and communication networks (e.g., reporting apps) yield real‐time information that has improved our understanding of movement ecology (Wall et al., 2014; Williams et al., 2020). Retrospective analyses of such tracking data have led to the development of sophisticated fit‐for‐purpose statistical approaches, usually based on state‐space models (SSMs) (Hooten et al., 2017; Patterson et al., 2017). SSMs combine a statistical model of observations (i.e., a measurement model) with a dynamic process model (i.e., a movement model). The central goal is to estimate the system state (i.e., the unobserved animal locations) (Auger‐Méthé et al., 2020), but SSMs can also be used to determine system parameters linked to behavioral dynamics (Dowd & Joy, 2011; Kitagawa, 1998).

SSMs can ingest and synthesize various sources of location information (e.g., tags, telemetry, visual or acoustic detections) (Patterson et al., 2017) and make use of increasingly sophisticated movement models (McClintock et al., 2017; Michelot et al., 2021). Consequently, SSMs are well adapted for ecological forecasting (Dietze et al., 2018; Dowd et al., 2014). Forecasting shifts the emphasis to predictive skill, which is distinct from retrospective model fitting that focuses on location in‐filling and estimation of behavioral parameters and states. Forecasting strongly depends on having good movement models, which in turn requires an understanding of ecological processes. A prediction system enhances this by placing an emphasis on refining the model structure, estimating its parameters, incorporating environmental features, and allowing the model to adaptively learn from tracking data (Dietze et al., 2018; Payne et al., 2017).

Practical goals for studying the real‐time location of animals and forecasting their future trajectories and locations include management and conservation objectives, especially for at‐risk species. Forecasting systems may facilitate proactive management and increase the efficiency of mitigation measures by limiting the probability of human–wildlife conflicts in time and space (e.g., animal–vehicle or animal–vessel collisions, animal incursions into sensitive areas) (Gervaise et al., 2021; Wall et al., 2014). This study proposes a general framework for an animal forecasting system that provides real‐time fusion of location data with a movement model to yield probabilistic forecasts of animal location and key behavioral parameters. This iterative forecasting system uses state‐space models and ensemble methods. It follows the data assimilation (DA) cycle, alternating between a prediction step (i.e., forecast) using a process model, followed by an observation update (i.e., nowcast) using real‐time observations (Dowd et al., 2014). An idealized nondimensional example highlights the major features of the forecasting system, including the use of a stochastic movement model, real‐time DA, state augmentation to estimate behavioral parameters, potential functions to incorporate habitat preference, and the evaluation of forecast skill. A realistic demonstration is then undertaken for short‐term prediction for the endangered population of southern resident killer whales (SRKWs), Orcinus orca, in the Salish Sea off southern British Columbia and northern Washington state, with the aim of mitigating disturbance from commercial shipping traffic (McWhinnie et al., 2021).

METHODS

General framework

State‐space model

SSMs are a general framework that couples a process model to a measurement model:

$$x_t = d(x_{t-1}, \theta_t, Z_t) + w_t, \tag{1}$$

$$y_t = h(x_t) + \varepsilon_t, \tag{2}$$

where $x_t$ is the state of the system (e.g., the animal location), and $y_t$ represents observations (e.g., error‐prone location measurements) at time $t$. The process Equation (1) represents the dynamics (e.g., an animal movement model), where $x_t$ depends on its value at the previous time, $x_{t-1}$, a set of parameters, $\theta_t$, and a set of covariates, $Z_t$. Note that parameters and covariates may or may not be time dependent. The stochastic error or forcing, $w_t$, is assumed to be additive, but it could be multiplicative. The functional form of the model is embodied in the $d(\cdot)$ operator and reflects time‐dependent ecological dynamics. The measurement Equation (2) relates observations $y_t$ to the state $x_t$ through the measurement operator $h(\cdot)$. Direct observation of the state implies that $h(\cdot)$ is the identity operator. The observation error term is given by $\varepsilon_t$. The goal of the basic state‐space model is online (real‐time) estimation of the state, $x_t$, using observations, $y_t$, for $t = 1, \ldots, T$, with all other quantities known or specified. Parameters of the system can also be estimated online by the technique of state augmentation (see State augmentation).

Data assimilation

The aim of our prediction system is to provide online estimates of the current location of an animal (a nowcast) and short‐term predictions of future locations (a forecast). Sequential state estimation follows the DA cycle (Dowd et al., 2014). Figure 1 shows a schematic of this procedure. It describes the transition of the system from one time to the next (with the understanding that this is part of a continuously operating real‐time sequential estimation). We assume that the probabilistic location nowcast is available at time $t-1$ and given by $[x_{t-1} \mid y_{1:t-1}]$, where $[\cdot]$ designates a probability density function and $y_{1:t-1}$ are the location observations from time 1 to time $t-1$ inclusive. A one‐step‐ahead forecast is undertaken to transition the system from time $t-1$ to time $t$, yielding the animal location forecast $[x_t \mid y_{1:t-1}]$. This is done by applying the movement model given by Equation (1) using the nowcast as the initial condition. Note that n‐step‐ahead forecasts can also be produced to yield future predictions of animal locations on longer time horizons (Figure 1). Next, location observations, $y_t$, may become available at time $t$. If so, the assimilation step statistically blends location forecasts with the new observations, yielding the nowcast at time $t$, or $[x_t \mid y_{1:t}]$. This probabilistic observational update is based on Bayesian principles, treating the forecast as a prior and using the likelihood of the new observation. The procedure can continue indefinitely through time, cycling between movement forecasts and assimilation steps. It is initialized at time 0 with a location density $[x_0]$. In practice, prediction and observation updates are carried out in an ensemble framework wherein samples (or particles) are used to represent the target nowcast and forecast densities. Specifically, forecasting (one‐step or n‐steps ahead) is based on ensemble prediction using Equation (1), and assimilation is carried out with a particle filter (see Particle filter).
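The n‐step‐ahead forecast described above amounts to pushing every ensemble member through the movement model repeatedly, with no observation update in between. The following Python sketch illustrates this under stated assumptions: the one‐dimensional random walk in `step` is only a toy stand‐in for Equation (1), and the function names, ensemble size, and noise levels are hypothetical choices for illustration, not taken from the authors' R prototype.

```python
import random
import statistics

def step(x, rng, sigma_w=1.0):
    """Toy stand-in for Equation (1): a 1-D random walk (assumed for illustration)."""
    return x + rng.gauss(0.0, sigma_w)

def n_step_forecast(nowcast_ensemble, n, rng):
    """Propagate every ensemble member n steps with no observation update.

    Returns a list of n ensembles, one per forecast horizon 1..n.
    """
    ensemble = list(nowcast_ensemble)
    horizons = []
    for _ in range(n):
        ensemble = [step(x, rng) for x in ensemble]
        horizons.append(ensemble)
    return horizons

rng = random.Random(1)
nowcast = [rng.gauss(0.0, 0.1) for _ in range(500)]   # a tight nowcast ensemble
forecasts = n_step_forecast(nowcast, n=10, rng=rng)
spreads = [statistics.pstdev(e) for e in forecasts]   # ensemble spread per horizon
```

In this toy setting the ensemble spread grows with the forecast horizon, which is the qualitative behavior one would expect when no new observations constrain the prediction.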

FIGURE 1. Schematic of DA cycle used for animal prediction system. It shows a single‐stage transition of this probabilistic system from time $t-1$ to $t$ and how it toggles between movement model forecasts and particle‐filter‐based assimilation of incoming observations (see Data assimilation for further details). Light and dark blue dots represent ensemble members (particles) at the nowcast and forecast steps, respectively. Red dots are the location observations, and red circles correspond to measurement errors. Assimilation and forward model prediction are symbolized by A and M, respectively.

Particle filter

The particle filter is a sampling‐based solution algorithm for sequential DA. The DA cycle is divided into two steps: (i) forecasting and (ii) observation update. Suppose we are at time $t-1$ and have a sample from the nowcast distribution, $[x_{t-1} \mid y_{1:t-1}]$. We designate this sample of size $N$ as $\{x_{t-1|t-1}^{(i)}\}_{i=1}^{N}$, where $i$ identifies a sample member, or particle. The standard particle filter algorithm (sequential importance resampling) (Gordon et al., 1993) proceeds as follows:

  1. Prediction: Apply the movement model given by Equation (1) for one‐step‐ahead prediction to each ensemble member of the nowcast $\{x_{t-1|t-1}^{(i)}\}_{i=1}^{N}$:

$$x_{t|t-1}^{(i)} = d\left(x_{t-1|t-1}^{(i)}, \theta_t, Z_t\right) + w_t^{(i)}, \quad \text{for } i = 1, \ldots, N, \tag{3}$$

with $w_t^{(i)}$ an independent realization of the system noise. This yields the forecast ensemble $\{x_{t|t-1}^{(i)}\}_{i=1}^{N}$, which is a draw from $[x_t \mid y_{1:t-1}]$.

  2. Observation update: Carry out weighted resampling of the forecast ensemble $\{x_{t|t-1}^{(i)}\}_{i=1}^{N}$ using observation $y_t$ at time $t$. The weights are based on the likelihood $[y_t \mid x_t]$ determined from Equation (2) and computed as

$$W_t^{(i)} \propto p\left(y_t \mid x_{t|t-1}^{(i)}\right), \quad \text{for } i = 1, \ldots, N, \tag{4}$$

where $W_t^{(i)}$ is the weight given to the $i$th particle. The weights are normalized so they sum to one. A weighted bootstrap (resampling with replacement) of $\{x_{t|t-1}^{(i)}\}_{i=1}^{N}$ is carried out to yield the nowcast ensemble $\{x_{t|t}^{(i)}\}_{i=1}^{N}$ at time $t$, which is a draw from $[x_t \mid y_{1:t}]$.

This single‐stage recursive transition of the system from time $t-1$ to $t$ is carried on sequentially through time by predicting forward with the model and assimilating new observations. An initial condition for the state, $[x_0]$, must be specified as an initial ensemble $\{x_0^{(i)}\}_{i=1}^{N}$.
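The two steps of the sequential importance resampling algorithm can be sketched compactly. The Python example below is a minimal illustration, not the authors' implementation: it assumes a two‐dimensional random walk as a placeholder for the movement operator d(∙), a Gaussian observation error with h(∙) as the identity, and arbitrary values for the noise levels and ensemble size. All function names are hypothetical.

```python
import math
import random

def predict(particles, rng, sigma_w):
    """Prediction step (Equation 3): move every particle with the process model.

    A 2-D random walk is a placeholder for the movement operator d(.).
    """
    return [(x + rng.gauss(0.0, sigma_w), y + rng.gauss(0.0, sigma_w))
            for x, y in particles]

def update(particles, obs, rng, sigma_eps):
    """Observation update (Equation 4): Gaussian likelihood weights, then a
    weighted bootstrap (resampling with replacement)."""
    log_w = [-((obs[0] - x) ** 2 + (obs[1] - y) ** 2) / (2.0 * sigma_eps ** 2)
             for x, y in particles]
    top = max(log_w)                       # subtract the max to avoid underflow
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    w = [wi / total for wi in w]           # normalize so the weights sum to one
    return rng.choices(particles, weights=w, k=len(particles))

def median_position(particles):
    """Component-wise ensemble median (upper median for even sizes)."""
    xs = sorted(p[0] for p in particles)
    ys = sorted(p[1] for p in particles)
    mid = len(particles) // 2
    return xs[mid], ys[mid]

def particle_filter(observations, n_particles, rng, sigma_w=0.3, sigma_eps=0.2):
    # Initial ensemble: a draw from an assumed initial density [x_0].
    particles = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
                 for _ in range(n_particles)]
    nowcasts = []
    for obs in observations:
        particles = predict(particles, rng, sigma_w)
        particles = update(particles, obs, rng, sigma_eps)
        nowcasts.append(median_position(particles))
    return nowcasts

rng = random.Random(42)
truth = [(0.1 * t, 0.05 * t) for t in range(1, 21)]          # synthetic true track
obs = [(x + rng.gauss(0.0, 0.2), y + rng.gauss(0.0, 0.2)) for x, y in truth]
nowcasts = particle_filter(obs, n_particles=500, rng=rng)
```

Working with log weights and subtracting the maximum before exponentiating is a common numerical safeguard; it leaves the normalized weights unchanged.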

State augmentation

In the context of animal movement, estimation of key parameters may be important for representing underlying ecological processes and improving prediction. State augmentation appends such parameters to the original state vector so that the augmented state is $\tilde{x}_t = (x_t, \theta_t)^T$ and, therefore, includes both the geographical location and parameters of interest. This allows for simultaneous estimates of states and time‐varying parameters. Specifically, following Kitagawa (1998), the original process model given by Equation (1) is transformed to the augmented one,

$$\begin{pmatrix} x_t \\ \theta_t \end{pmatrix} = \begin{pmatrix} d(x_{t-1}, \theta_t, Z_t) \\ \theta_{t-1} \end{pmatrix} + \begin{pmatrix} w_t \\ \nu_t \end{pmatrix}, \tag{5}$$

where the parameter $\theta_t$ varies as a random walk with a disturbance term $\nu_t$. The augmented measurement equation is a trivial alteration of Equation (2) to reflect the fact that the state, but not the parameters, is observable. Most importantly, since the augmented state‐space model has the same general form as the usual state‐space model given by Equations (1) and (2), it can be estimated using standard sequential Monte Carlo methods, such as the particle filter. More sophisticated algorithms based on state augmentation are available if static parameter estimation is the goal (Ionides et al., 2011).
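To make the augmentation concrete, the sketch below carries a parameter alongside the state in each particle, perturbs the parameter with a random walk, and lets the resampling step select parameter values that explain the observations. It is an illustrative Python sketch under several assumptions that are not from the source: a scalar first‐order autoregressive process stands in for d(∙), the parameter is clipped to [0, 1], and the noise levels, ensemble size, and function names are hypothetical.

```python
import math
import random

def augmented_predict(particles, rng, sigma_w=0.5, sigma_nu=0.02):
    """Equation (5): the parameter follows a random walk, and the state is
    propagated with the perturbed parameter. Particles are (x, phi) pairs."""
    out = []
    for x, phi in particles:
        # Random-walk step for the parameter; clipping to [0, 1] is an assumption.
        phi_new = min(1.0, max(0.0, phi + rng.gauss(0.0, sigma_nu)))
        x_new = phi_new * x + rng.gauss(0.0, sigma_w)   # toy AR(1) form of d(.)
        out.append((x_new, phi_new))
    return out

def augmented_update(particles, y, rng, sigma_eps=0.2):
    """Only x is observed; phi is learned indirectly through the resampling."""
    log_w = [-((y - x) ** 2) / (2.0 * sigma_eps ** 2) for x, _ in particles]
    top = max(log_w)
    w = [math.exp(lw - top) for lw in log_w]   # relative weights are sufficient
    return rng.choices(particles, weights=w, k=len(particles))

# Synthetic data from an AR(1) process with a fixed persistence of 0.9.
rng = random.Random(7)
x, ys = 0.0, []
for _ in range(200):
    x = 0.9 * x + rng.gauss(0.0, 0.5)
    ys.append(x + rng.gauss(0.0, 0.2))

# Initial ensemble: state from N(0, 1), parameter uniform on [0, 1].
particles = [(rng.gauss(0.0, 1.0), rng.random()) for _ in range(1000)]
phi_median = []
for y in ys:
    particles = augmented_update(augmented_predict(particles, rng), y, rng)
    phi_median.append(sorted(p[1] for p in particles)[len(particles) // 2])
```

The size of the parameter disturbance (`sigma_nu` here) controls how quickly the filter can track changes in the parameter, at the cost of noisier estimates.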

Idealized case

The aim of the idealized simulation experiments is twofold: (i) to demonstrate the general workflow and implementation of real‐time DA in the context of animal movement forecasting and (ii) to provide a concrete illustration of key features such as state augmentation (for estimating behavioral parameters) and potential functions (for incorporating the environment through habitat preference), as well as forecast skill assessment.

Movement model

We choose a specific animal movement model corresponding to the general process model given by Equation (1). This takes the form of a continuous‐time correlated random walk, a reasonably sophisticated and well‐used stochastic model in the random walk family (Johnson et al., 2008). This movement model also forms the basis for the application found in the section Application: southern resident killer whales, and so it additionally serves to introduce its major features. At its core is an Ornstein–Uhlenbeck process for animal velocity (Russell et al., 2018):

$$dV_t = \frac{1}{\tau}\left(\mu_t - V_t\right)dt + \sigma\, dW_t, \tag{6}$$

where $V_t$ is the velocity at time $t$, $\tau$ is a memory time scale parameter, $\mu_t$ is a time‐dependent drift term, and $\sigma$ is the scale factor for the Wiener process $W_t$. The application is two‐dimensional (2D) and defined in the horizontal plane. For implementation, we numerically integrate Equation (6) using the Euler–Maruyama approximation method (Kloeden & Platen, 2013), which yields the stochastic difference equation

$$V_t = (1 - \phi_t)\,\mu_t + \phi_t V_{t-\Delta} + w_t, \tag{7}$$

where $\Delta$ is the time step, $\phi_t = 1 - \Delta/\tau$ is a time‐varying velocity persistence, and $w_t \sim N(0, \sigma_w^2 I)$ is bivariate white noise forcing, with $\sigma_w = \sigma\sqrt{\Delta}$ and $I$ the identity matrix. It is straightforward to use this discrete‐time model to generate realizations $v_t$ from the probabilistic velocity process $V_t$ for any arbitrarily small $\Delta$. As a general rule, with shorter time steps, the approximation Equation (7) of Equation (6) is more accurate and the trajectory smoother and more continuous. To obtain the horizontal animal position $x_t$, we integrate the velocity $v_t$ by summing its increments:

$$x_t = x_{t-n\Delta} + \Delta \sum_{i=1}^{n} V_{t-i\Delta}, \tag{8}$$

where $x_{t-n\Delta}$ is the animal position at a time $n\Delta$ before the present time $t$. The time history of $v_t$ is obtained from Equation (7).
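Equations (7) and (8) translate directly into a short simulation loop. The Python sketch below is illustrative only: it holds the drift and the persistence constant over the track for simplicity (in the paper both vary in time), and the function name, arguments, and parameter values are assumptions made for this example.

```python
import math
import random

def simulate_ctcrw(n_steps, delta, tau, sigma,
                   mu=(0.0, 0.0), x0=(0.0, 0.0), v0=(0.0, 0.0), seed=0):
    """Simulate a 2-D track from the discretized velocity process.

    Velocity follows Equation (7); position is advanced one step at a time
    with Equation (8) using n = 1. Drift mu is held constant (an assumption).
    """
    rng = random.Random(seed)
    phi = 1.0 - delta / tau              # velocity persistence
    sigma_w = sigma * math.sqrt(delta)   # std. dev. of the discrete-time forcing
    x, v = list(x0), list(v0)
    track = [tuple(x)]
    for _ in range(n_steps):
        for k in range(2):
            # Equation (8), n = 1: position uses the velocity of the previous step.
            x[k] += delta * v[k]
            # Equation (7): weighted mix of drift and previous velocity, plus noise.
            v[k] = (1.0 - phi) * mu[k] + phi * v[k] + rng.gauss(0.0, sigma_w)
        track.append(tuple(x))
    return track

track = simulate_ctcrw(n_steps=100, delta=0.1, tau=1.0, sigma=0.5)
```

With `sigma=0` and the initial velocity equal to the drift, the velocity stays constant and the track is a straight line, which gives a simple check of the discretization.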

The time‐dependent drift term $\mu_t$ in Equation (7) is an externally imposed velocity perturbation, or forcing, due to exogenous environmental conditions. The drift term, $\mu_t$, is determined using a potential field approach (Brillinger et al., 2012). Its value depends on an animal's current location such that the local gradient of the potential function influences the magnitude and directional tendency of the animal's movement (Russell et al., 2018). This potential function is a mixture of a Gaussian and a parabola (Figure 2a,e). It is isotropic, resembles a Gaussian near the origin, and decreases as a parabola far from the origin. The drift term, $\mu_t$, is proportional to the local gradient of the potential function at the animal's current location, and thus movement is steered toward higher values of the potential function. Here, we interpret the potential function as a habitat preference, or resource selection.
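As a concrete illustration of a potential‐based drift, the Python sketch below uses one possible Gaussian‐plus‐parabola potential, U(x, y) = a exp(−r²/(2L²)) − b r² with r² = x² + y². The source does not give the exact functional form or coefficients, so this form, the parameters `a`, `L`, `b`, and the proportionality constant `gain` are all assumptions chosen only to reproduce the qualitative description (Gaussian near the origin, parabolic decrease far from it).

```python
import math

def potential(x, y, a=1.0, L=1.0, b=0.05):
    """Assumed isotropic potential: Gaussian bump minus a shallow parabola."""
    r2 = x * x + y * y
    return a * math.exp(-r2 / (2.0 * L * L)) - b * r2

def drift(x, y, a=1.0, L=1.0, b=0.05, gain=1.0):
    """Drift proportional to the analytic gradient of the assumed potential.

    dU/dx = x * g and dU/dy = y * g, with g defined below; g is always
    negative, so the drift points toward the origin (higher potential).
    """
    r2 = x * x + y * y
    g = -(a / (L * L)) * math.exp(-r2 / (2.0 * L * L)) - 2.0 * b
    return (gain * g * x, gain * g * y)

mu_far = drift(2.0, 0.0)     # animal east of the origin: drift points west
mu_origin = drift(0.0, 0.0)  # no drift at the maximum of the potential
```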

FIGURE 2. Idealized example. (a, c, e, g) Observed (red dots) and predicted animal locations (blue dots are the ensemble median; light blue dots show full ensemble). The true animal track is shown (black line) along with initial position (black cross). In panels (a) and (e), the gray scale represents potential field reflecting animal's preferred habitat. Panels (a) and (c) present high‐quality location data ($\sigma_\varepsilon = 0.1$, observations every time step), as opposed to panels (e) and (g), which present lower‐quality location data ($\sigma_\varepsilon = 0.2$, observations every second time step). (b, d, f, h) Time‐varying estimates of velocity persistence parameter $\phi_t$. The black lines show the true persistence velocity used for computing the true track (a sine wave) and estimation results (ensemble, light blue dots; ensemble median, blue dots; fitted smooth curve, blue lines). Note that spatial coordinates and time vectors are nondimensional.

The term $\phi_t$ in Equation (7) is a behavioral parameter that encapsulates the tendency of an animal to move in the same direction, in other words, its autocorrelation properties (Russell et al., 2018). Note that $\phi_t$ in Equation (7) also acts as a weighting factor and so ranges from 0 to 1. As $\phi_t \to 0$, the drift term $\mu_t$ dominates, with animal behavior resembling foraging (tortuous paths); when $\phi_t \to 1$, the velocity process tends to a first‐order autoregressive process, and the behavior resembles transiting (directed paths). This behavioral parameter has two notable features: (i) $\phi_t$ is a time‐dependent parameter, so it allows for a continuum of behavioral states ranging from, say, foraging to transiting, and (ii) it is estimated, along with the animal position, using a state augmented particle filter (sections “State augmentation” and “Particle filter”). Hence, online parameter estimation uses information contained in the recent history of observed location data.

The idealized movement model is further transformed to be scale independent. To do this, the model given by Equations (7) and (8) is rendered nondimensional using the following quantities. The characteristic length scale is assumed to be $2L$, or two standard deviations of the Gaussian that (partly) defines the potential function. The velocity scale used is $\sigma_w$, the standard deviation of the velocity forcing. These together imply a characteristic time scale of $2L/\sigma_w$ (length divided by velocity), which can roughly be interpreted as the time it takes an animal to traverse the Gaussian part of the potential field. Using nondimensional quantities makes the application scale‐free and, thus, applicable to organisms from viruses to whales.

Numerical experiments

In this section, idealized scenarios are presented to illustrate the implementation, features, and properties of the forecasting system. The idea is to vary the accuracy (observation error) and availability (sampling rate) of the animal location data, as well as to illustrate the use of habitat preference through potential functions. These simulation experiments are based on realizations of the movement model, described in the previous section titled Movement model, which provides the known true positions $x_t$. The true time‐varying velocity persistence behavioral parameter, $\phi_t$, followed a sinusoid (Figure 2b,d,f,h). Synthetic observations $y_t$ are created by adding an observation error, $\varepsilon_t \sim N(0, \sigma_\varepsilon^2 I)$, to the simulated true positions $x_t$ following Equation (2), with $h(\cdot)$ being the identity operator. Simulated tracks are selected that show a clear overlap between the foraging behavior (i.e., small velocity persistence) and the highest values of the potential field to mimic a habitat preference corresponding to a foraging or resting area.

Four simulation experiments, or scenarios, are considered: (1) low observation error, high sampling rate, drift; (2) low observation error, high sampling rate, no drift; (3) high observation error, low sampling rate, drift; and (4) high observation error, low sampling rate, no drift. Two realizations are used to generate the true positions: one for the drift case (i.e., using a potential function), and one for the no‐drift case. We run nowcast and forecast scenarios for each experiment and estimate the location state, along with time‐varying parameter $\phi_t$. Our particle filter algorithm uses ensembles to yield probabilistic estimates of locations and time‐varying parameters. For our idealized case, we use $N = 100$ particles, or ensemble members. A prototype R code (R version 4.0.3) of the forecasting system is available from Figshare: https://doi.org/10.6084/m9.figshare.17046026.v2.

The central metric for nowcast and forecast skill is the root mean square error (RMSE). It measures the discrepancy between the observed and predicted animal position as $e_t = y_t - \hat{x}_t$, where $y_t$ represents the observed location at time $t$ and $\hat{x}_t$ is the predicted position taken to be the median of the nowcast or forecast ensembles for time $t$. Then $\mathrm{RMSE} = \sqrt{\frac{1}{q}\sum_{i=1}^{q} \lVert e_i \rVert^2}$, with $\lVert \cdot \rVert$ the vector norm and $q$ the number of observations. For each of the scenarios, we computed the RMSE for the nowcast location estimate and for different forecast horizons, or n‐step‐ahead forecasts, with $n = 1, \ldots, 30$ time units. Furthermore, to understand whether forecast skill depends on the behavioral mode of an animal (i.e., foraging implied as $\phi_t \to 0$, or transiting as $\phi_t \to 1$), we assess the relationship between the behavioral parameter $\phi_t$, estimated via state augmentation, and the n‐step‐ahead forecast RMSE using Simulation 1 (i.e., low observation error, high sampling rate, drift).
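The RMSE defined above can be computed in a few lines. The Python sketch below is illustrative: positions are represented as (x, y) tuples, and the ensemble median is taken component‐wise, which is one common convention and an assumption here since the source does not specify how the two‐dimensional median is formed. Function names are hypothetical.

```python
import math
import statistics

def ensemble_median(ensemble):
    """Component-wise median of a 2-D ensemble (an assumed convention)."""
    return (statistics.median(p[0] for p in ensemble),
            statistics.median(p[1] for p in ensemble))

def rmse(observed, predicted):
    """Root mean square of the Euclidean distances between paired positions."""
    q = len(observed)
    squared = [(yo[0] - xp[0]) ** 2 + (yo[1] - xp[1]) ** 2
               for yo, xp in zip(observed, predicted)]
    return math.sqrt(sum(squared) / q)

# Two observations; the prediction misses the second one by a distance of 5.
error = rmse([(0, 0), (3, 4)], [(0, 0), (0, 0)])
centre = ensemble_median([(0, 0), (1, 5), (2, 1)])
```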

Application: southern resident killer whales

We illustrate and adapt the general framework to the specific problem of nowcasting and forecasting endangered SRKW pod trajectories in the Salish Sea. This population (clan) is composed of three stable matriarchal social groups, termed J, K, and L pods, each having a tendency to move as a coherent group. Hence, our application is designed to track pods, not individuals. We focus on J pod, the most observed pod in the Salish Sea during summer (Olson et al., 2018). The idealized movement model and DA system outlined in the previous section provide the basis for the SRKW application, and we outline the specific implementation details in what follows.

Movement model

The whale movement model follows the model formulation described previously in Idealized case: Movement model , but it is dimensional. For our particular application, we fix the model time step Δ = 5 min to provide for accurate numerical implementation. This is also taken to be the time scale for the DA cycle, meaning model output can match the times of available SRKW observations. The movement model is implemented for the spatial domain associated with the Salish Sea (a portion of which is shown in Figure 4). The main modification is to incorporate SRKW avoidance of land and shallow waters <5 m. This is done in the movement model forecast step wherein ensemble members that fall on land or shallow water are removed.
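The removal of ensemble members that land on shore or in shallow water can be sketched as a simple filter applied after each forecast step. The Python example below is a hypothetical illustration: `depth_m` is a toy stand‐in for a real bathymetry lookup, and the `refill` step, which resamples surviving members to restore the ensemble size, is an assumption that the source does not describe.

```python
import random

def depth_m(x, y):
    """Hypothetical bathymetry lookup: water depth in metres (<= 0 means land).

    A toy seabed that deepens eastward; a real system would query gridded
    bathymetry for the study area.
    """
    return 2.0 * x

def remove_grounded(ensemble, min_depth=5.0):
    """Drop ensemble members located on land or in water shallower than min_depth."""
    return [p for p in ensemble if depth_m(*p) >= min_depth]

def refill(ensemble, size, rng):
    """Assumed extra step: resample survivors so the ensemble keeps its size."""
    if not ensemble:
        raise ValueError("all ensemble members were removed")
    missing = max(0, size - len(ensemble))
    return ensemble + rng.choices(ensemble, k=missing)

forecast = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
kept = remove_grounded(forecast)                 # depths 0, 2, 6, 20 m
restored = refill(kept, len(forecast), random.Random(0))
```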

FIGURE 4. Assimilation experiment. (a) Visual observations of SRKW J pod on 18 August 2016 (white symbols with red outline) and predicted whale locations (blue dots, ensemble median). Solid red symbols represent starting (10:34 AM) and ending (4:00 PM) observations for day. Letters A to N designate the chronology of these observations, with A being the first observation. The gray scale represents the whale intensity field of J pod in August, expressed in log scale (from Watson et al. [2019]). (b, c) UTM easting and northing coordinates of whale locations, including ensembles (gray dots) and their median (blue dots). Symbols denote visual sighting location observations following panel (a).

The two main control parameters in the movement model, the persistence, $\phi_t$, and the drift term, $\mu_t$, allow whale trajectories to mimic different movement behaviors (e.g., transiting, resting, foraging, attraction to preferred habitat). To specify the value of $\phi_t$, we make it part of the online estimation procedure for whale location by using a state augmented particle filter (sections “Particle filter” and “State augmentation”). The drift term, $\mu_t$, on the other hand, is designed to take account of SRKW historical habitat usage in the Salish Sea. Watson et al. (2019) developed the framework of a spatiotemporal point process model to create time‐indexed spatial whale intensity fields (maps) for each of pods J, K, and L. In this study, these whale intensity maps are created at a monthly resolution and define the potential functions for each month of sighting data, e.g., $U_m(x)$, $m = 1, \ldots, M$. The drift term is then defined for month $m$ as the gradient of $U_m$ at the current whale location, that is, $\mu_t = \nabla U_m$. This gradient acts as a force that determines the drift direction and magnitude, attracting trajectories toward areas of highest historical whale intensity. The drift term thus adds realism to simulated whale trajectories in the absence of direct location observations.
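When the potential function is available only as a gridded map, its gradient at a given location can be approximated with finite differences. The Python sketch below illustrates one way to do this; it is not the authors' implementation. The grid layout (rows along northing, columns along easting), the nearest‐cell lookup, and all function names are assumptions made for this example.

```python
def grid_gradient(U, row, col, dx, dy):
    """Finite-difference gradient of a gridded field U[row][col].

    Uses central differences in the interior and one-sided differences at
    the edges. Columns are assumed to run along x, rows along y, and the
    grid must have at least two rows and two columns.
    """
    n_rows, n_cols = len(U), len(U[0])
    r0, r1 = max(row - 1, 0), min(row + 1, n_rows - 1)
    c0, c1 = max(col - 1, 0), min(col + 1, n_cols - 1)
    dU_dx = (U[row][c1] - U[row][c0]) / ((c1 - c0) * dx)
    dU_dy = (U[r1][col] - U[r0][col]) / ((r1 - r0) * dy)
    return dU_dx, dU_dy

def drift_at(U, x, y, x_origin, y_origin, dx, dy):
    """Gradient of the gridded field at the cell nearest to position (x, y)."""
    col = int(round((x - x_origin) / dx))
    row = int(round((y - y_origin) / dy))
    col = min(max(col, 0), len(U[0]) - 1)   # clamp to the grid
    row = min(max(row, 0), len(U) - 1)
    return grid_gradient(U, row, col, dx, dy)

# Toy field U = 2x + 3y sampled on a 4 x 5 grid, so the gradient is (2, 3).
dx, dy = 0.5, 2.0
U = [[2.0 * (c * dx) + 3.0 * (r * dy) for c in range(5)] for r in range(4)]
mu = drift_at(U, 1.1, 3.9, 0.0, 0.0, dx, dy)
```

A smoother alternative would interpolate the gradient between cells, at the cost of slightly more bookkeeping.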

Observations

Observations of SRKW locations for this application are based on visual sighting data from the OrcaMaster database (Olson et al., 2018). These are available at irregular time intervals when SRKWs are present in the Salish Sea and take the form of real‐time opportunistic SRKW locations (to pod level, from a reporting app) during daylight hours. Like the idealized example, SRKW observations $y_t$ are assumed to have an additive error $\varepsilon_t \sim N(0, \sigma_\varepsilon^2 I)$. For simplicity, we fix $\sigma_\varepsilon = 1$ km for SRKW observations. In reality, the error may be considerably more complex due to, for example, mismatches between the sighting and reporting times leading to location and timing errors, weather conditions, and observer effects. Another important issue is that SRKW pods may split or disperse, meaning our visual detections may not necessarily reflect the core distribution of the pod. For our demonstration of real‐time DA and probabilistic prediction, we selected a single 5.5 h track of J pod from 18 August 2016. The track consists of 14 observations between 10:34 AM and 4:00 PM (Pacific Daylight Time), going southward (Figure 4a). Time intervals between observations are irregular and range from 5 to 90 min.

Numerical experiments

A primary goal of this particular application is to produce a short‐term SRKW location forecast on a time scale of hours to aid in the mitigation of ship collision risk or acoustic disturbance. This time scale is considered useful by marine operations because pilots have sufficient warning to alter the pathways and speed of incoming vessels. We carry out two experiments: an assimilation experiment and a forecast experiment. For the assimilation experiment, we used all location observations from 10:34 AM to 4:00 PM ($n = 14$) in the DA cycle to sequentially provide online probabilistic location estimates. The forecast experiment aimed at assessing the capability of the prediction system over time horizons of interest (a few hours). We assimilated the visual observations from 10:34 AM to 12:25 PM, that is, the first $n = 6$ observations. We then forecast the pod locations up to 3.5 h ahead starting from 12:30 PM (corresponding to the current time) out to 4:00 PM (i.e., future times). These were then compared to the observations from 12:55 to 4:00 PM that were artificially removed from the system (not assimilated) for the purposes of validation. We computed the direct position errors (DPEs) as the discrepancy between the observations and predicted locations, $e_t = y_t - \hat{x}_t$. We represented the forecast probability density function using a kernel density estimate (KDE) of the ensemble.
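A kernel density estimate turns the forecast ensemble into a smooth probability surface, and the direct position error summarizes how far a prediction falls from a withheld observation. The Python sketch below is a minimal illustration under assumptions not stated in the source: an isotropic Gaussian kernel with a fixed, user‐chosen bandwidth, and the DPE reported as the Euclidean length of the error vector. Function names and the example values are hypothetical.

```python
import math

def kde_2d(ensemble, x, y, bandwidth):
    """Gaussian kernel density estimate of a 2-D ensemble, evaluated at (x, y).

    Uses an isotropic kernel with a fixed bandwidth (an assumed choice).
    """
    n = len(ensemble)
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * n)
    return norm * sum(
        math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2.0 * bandwidth ** 2))
        for px, py in ensemble)

def direct_position_error(observed, predicted):
    """Euclidean distance between an observed and a predicted position."""
    return math.hypot(observed[0] - predicted[0], observed[1] - predicted[1])

ensemble = [(0.0, 0.0), (0.1, 0.0), (-0.1, 0.1)]      # toy forecast ensemble
density_near = kde_2d(ensemble, 0.0, 0.0, bandwidth=0.5)
density_far = kde_2d(ensemble, 5.0, 5.0, bandwidth=0.5)
dpe = direct_position_error((3.0, 4.0), (0.0, 0.0))
```

Evaluating `kde_2d` on a regular grid of points would give a map of forecast probability density that can be contoured or shaded.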

RESULTS

Idealized example

Simulation experiments

The four simulation experiments that made use of DA for nowcasting are presented in Figure 2, as detailed in Methods: Idealized case: Numerical experiments. The true animal reference track starts at the top right of the 2D domain (the “X” in Figure 2a,c,e,g). The animal first moves toward the origin, consistent with the potential field and its velocity persistence. When near the origin, this section of the track corresponds to low values of $\phi_t$ where the gradient of the potential field is small, and mimics a foraging behavior (Figure 2b,d,f,h). Finally, the animal moves toward the bottom left of the plane as $\phi_t$ increases toward 1 and velocity persistence dominates, that is, a transiting mode. At the end of the track, the animal then loops back toward the origin as $\phi_t$ decreases and habitat preference asserts itself.

The first two simulation experiments assimilated accurate and regular location data close to the true track (Figure 2a,c). These highly informative data led to predicted nowcast animal positions very close to the observations, and the spread of the ensemble was quite limited. No major difference was found in terms of predicted positions between the first two simulations and the predicted positions were very similar to the true track. The last two simulation experiments ingested less accurate and more irregular data spread around the true track (Figure 2e,g). These less informative data led to predicted positions that deviated noticeably from the observations, especially around the origin. There are small differences in the two cases since the movement model only used habitat preference in one case. Ensembles generally spread more widely around the median owing to the less informative location data.

The behavioral parameter, velocity persistence $\phi_t$, was estimated along with the location via the state augmentation procedure (Figure 2b,d,f,h). In Simulation 1, accurate and regular location data and the use of habitat preference allow for reliable estimation of the temporal pattern of the true $\phi_t$ (Figure 2b). The ensemble spread was generally large and bigger when $\phi_t$ decreased. The time lag between the true and estimated $\phi_t$ was small, meaning that the DA was able to quickly learn the proper value for the behavioral parameter from the accurate and frequently available location data. In Simulation 2, despite the location data being the same, the lag was much larger. This suggests that the use of habitat preference in Simulation 1 improved online behavioral parameter estimates (this makes sense since the data were generated assuming habitat preference). In Simulations 3 and 4, recovery of $\phi_t$ also indicated that the accuracy and regularity of location data, as well as the use of habitat preference, influenced the quality of the online behavioral parameter that could be estimated.

Forecast skill

For all simulation experiments, RMSE increased with forecast horizon, as expected (Figure 3a). Forecasts based on regular, lower error observations (Simulations 1 and 2) performed better than those based on irregular, higher error observations (Simulations 3 and 4) up to forecast time horizons of 12 time units. Therefore, accurate and regular location data provided better short‐term forecasts of animal locations. However, beyond a forecast horizon of 12 units, the forecast error of the simulations with a potential field (intensity map) increased rapidly and exceeded that of Simulations 2 and 4 without a potential field. The use of habitat preference in the movement model steered the animal toward the origin (toward the higher potential function intensity). However, the second part of the observed track corresponded to the animal moving away from the center (i.e., going against the potential function). This contradictory feature of this particular realization of the observations thus produced higher long‐term forecast error. Finally, the forecast error was higher when the animal was in a transiting mode (ϕ t → 1) (Figure 3b), particularly when the forecast time horizon increased.
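As an illustration of how an n‐step ahead RMSE curve of this kind can be computed, the sketch below takes forecast and true positions indexed by forecast origin and horizon. The array layout, the function name `rmse_by_horizon`, and the use of Euclidean position error are assumptions made for the example.

```python
import numpy as np

def rmse_by_horizon(forecast, truth):
    """RMSE of position per forecast horizon.

    forecast, truth: arrays of shape (n_origins, n_horizons, 2).
    """
    sq_dist = np.sum((forecast - truth) ** 2, axis=-1)  # squared Euclidean error
    return np.sqrt(np.mean(sq_dist, axis=0))            # average over forecast origins

# Toy check: errors of (3, 4) at horizon 1 and (6, 8) at horizon 2
truth = np.zeros((3, 2, 2))
forecast = np.zeros((3, 2, 2))
forecast[:, 0, :] = [3.0, 4.0]
forecast[:, 1, :] = [6.0, 8.0]
rmse = rmse_by_horizon(forecast, truth)
```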

FIGURE 3.

(a) An n‐step ahead error forecast (root mean square error [RMSE]) of idealized example. (b) Heatmap representing relationship between behavioral parameter ϕ t and n‐step ahead error forecast (RMSE) for Simulation 1 (low observation error σε=0.1, observations every time step, and drift term)

Southern resident killer whale application

Assimilation experiment

Figure 4 shows the results for the online estimation of J pod location nowcasts. The visual observations occurred at irregular time intervals over the course of the day and were, on occasion, clustered closely together in space and time, consistent with the observation error. The movement of J pod was generally to the south and veering eastward and covered about 50 km. There were variations in both speed and directional persistence, including a brief doubling back to the north off the west coast of San Juan Island between 12:25 and 12:55 PM (Figure 4a). The historical whale occupancy for August, which acted as the potential function, was highest off the southwest coast of San Juan Island indicating preferred habitat. Predicted whale locations conformed well to the observed locations at the observation times, as expected given that these nowcast estimates were sequentially corrected to be near the visual sightings as they became available.

Figure 4b,c provides a detailed view of how the DA cycle operates and performs by showing the individual components of the nowcast location state in terms of the easting (longitudinal) and northing (latitudinal) coordinates. The full ensemble that represents the estimated whale location is shown along with its median value, that is, the most likely whale location. The prediction–correction aspect of the DA procedure is evident. To specifically illustrate this key feature, consider how the system reacts to the observed temporary northward reversal in whale direction seen at 12:55 PM (G). Immediately prior to this, a forecast was made after assimilating the last observation at 12:25 PM (F). The increasing uncertainty, or spread, in the forecast ensemble is clear, reaching its maximum at 12:50 PM, just prior to the next observation. With the DA system then receiving the observation at 12:55 PM (G), the particles nearest the observation were resampled, leading to a reduction in the ensemble spread and a median estimate closer to the observation.
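The correction step described above can be sketched as a bootstrap particle filter update: forecast particles are weighted by how well they match the new observation and then resampled. The isotropic Gaussian observation error, the multinomial resampling scheme, and all names and numbers below are assumptions for illustration rather than the authors' implementation.

```python
import numpy as np

def assimilate(particles, obs, obs_sd, rng):
    """Weight forecast particles by an assumed Gaussian likelihood, then resample."""
    sq_dist = np.sum((particles - obs) ** 2, axis=1)
    log_w = -0.5 * sq_dist / obs_sd**2
    w = np.exp(log_w - log_w.max())          # subtract the max for numerical stability
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx]                    # particles near the observation are duplicated

rng = np.random.default_rng(1)
forecast = rng.normal(loc=[0.0, 0.0], scale=5.0, size=(2000, 2))  # wide forecast ensemble
obs = np.array([3.0, -2.0])                                       # new sighting
nowcast = assimilate(forecast, obs, obs_sd=1.0, rng=rng)
```

After the update, the nowcast ensemble is narrower than the forecast ensemble and its median lies closer to the sighting, which mirrors the behavior seen between 12:50 and 12:55 PM.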

Overall, the movement was biased eastward and southward owing to the drift term μ t that remained mostly positive in both the longitudinal and latitudinal directions (Appendix S1: Figure S1c). More precisely, from C to K, the movement was forced eastward with a higher drift term in the longitudinal direction (Appendix S1: Figure S1c). After K, the movement was slightly biased southward and eastward corresponding to the pod entering an area of high whale intensity southwest of San Juan Island (Figure 4a; Appendix S1: Figure S1a).

Along with the whale location, the velocity persistence parameter, ϕ t , was also estimated online using state augmentation. Its value remained <0.5 for most of the time series (Appendix S1: Figure S1b), which indicated an intermediate behavior state between transiting (ϕ t tending toward 1) and resting (ϕ t tending toward 0), consistent with the general features of the observed trajectory, which showed systematic north–south movement with some reversals. Two periods do show ϕ t  > 0.5, which may correspond to exploratory behaviors (e.g., the northward reversal from F to G, Figure 4c).

Forecast experiment

The forecast experiment demonstrated short‐term predictions on the time scale of hours. Here, the first six observations (A–F) until 12:25 PM were assimilated, with the associated whale location nowcasts mimicking the assimilation experiment (Figures 4a and 5a). After the final observation F was assimilated, the longitudinal drift decreased 68%, whereas the latitudinal drift increased 77% until observation N (Appendix S1: Figure S1a,f). Both drift components remained positive throughout the time series, which indicated slow directional movement toward the south and east, with J pod predicted to remain southwest of San Juan Island. The directed movement was in general smaller than in the assimilation experiment, and largely due to the whale intensity field (Appendix S1: Figure S1f) since no observations were assimilated to draw the whales southward. Therefore, in the absence of new observations, the pod had weak movement directionality. This might have contributed to an increase in the forecast error over the experiment's time horizon.

FIGURE 5.

Forecast experiment. (a) This follows Figure 4 except solid orange symbols represent first data point used for assimilation, and open symbols represent location observations used for n‐step ahead forecast validation (see Data assimilation for details). (b) Direct position error of forecast (km) against time. Symbols show discrepancy of ensemble median and observed location, with range being 5th and 95th percentiles of position error associated with full ensemble. (c–f) Kernel density estimates (KDE) of forecast probability density function (PDF) of whale locations up to 3.5 h ahead shown together with future observation

Forecast errors were quantified with the DPE metric (Figure 5b). The main noteworthy feature is the growing error as the forecast time horizon increases. The median DPE remained <5 km for a forecast out to 2.5 h and exceeded 10 km for the final 3.5‐h forecast in comparison to observations at these time horizons. The other key metric of forecast skill is the forecast uncertainty, here quantified by the range associated with each DPE. This range is the spread of the forecast ensemble about the future observation, or the 90% outer credible interval (i.e., the most likely region for whales). This interval increases with the forecast time horizon and shows a forecast uncertainty of about 10 km for 1‐h forecasts, 15 km for 1.5‐ to 2.5‐h forecasts, and 30 km for a 3.5‐h forecast. Figure 5c–f shows the kernel‐density estimated forecast probability density functions (PDFs) for selected observations. In general, the forecast PDF increases its spatial extent with larger forecast time horizons. It is also clear that the forecast PDF has a shape that is distinctly non‐Gaussian. For forecast horizons of 0.5, 1.5, and 2.5 h, the forecast PDF overlaps with the corresponding observations (G, J, M), but for the 3.5‐h forecast it does not overlap with the final observation N.
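The DPE and its range, as described in the caption of Figure 5b, can be sketched as follows: the DPE is the distance between the ensemble median and the future observation, and the range is given by the 5th and 95th percentiles of the position error over all ensemble members. The function name and the assumption of projected coordinates in kilometers are illustrative.

```python
import numpy as np

def direct_position_error(ensemble, obs):
    """Return (DPE of the ensemble median, 5th percentile, 95th percentile) in input units."""
    median_pos = np.median(ensemble, axis=0)
    dpe = np.linalg.norm(median_pos - obs)               # error of the median forecast
    member_err = np.linalg.norm(ensemble - obs, axis=1)  # error of every ensemble member
    lo, hi = np.percentile(member_err, [5, 95])
    return dpe, lo, hi

# Toy ensemble of three members (km, projected coordinates assumed)
ensemble = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dpe, lo, hi = direct_position_error(ensemble, obs=np.array([0.0, 0.0]))
```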

DISCUSSION

In this paper, we developed a statistical framework for a real‐time forecasting system for animal movement for the purpose of advancing understanding of movement ecology through adaptive learning and providing a flexible framework that could potentially be operationalized to facilitate management and conservation of animal populations. A nondimensional idealized case for generic animal movement demonstrated the key features of the system. We then presented a prototypical, but realistic, prediction system for endangered SRKWs.

Improving animal movement models is key for ecological forecasting, since forecast skill rests on a dynamic model's efficacy (Dietze et al., 2018; Payne et al., 2017). We used a stochastic movement model that is extremely flexible with respect to the animal trajectories it can generate, a salient feature of most stochastic movement models. Hence, we further constrained the movement model by integrating preferred habitat through a drift term computed as the local gradient of a potential function. The behavioral persistence parameter ϕ t acts as a weighting factor for the past velocity and the current drift component. Adaptive online learning of the persistence parameter was implemented as part of a forecasting system using state augmentation (Kitagawa, 1998). The dynamical interplay between the drift and velocity meant that high values of the persistence parameter (ϕ t → 1) could reduce the influence of the drift term, which in the absence of location observations can lead to weak directionality and, in some cases, low predictive skill. The idealized example showed that the forecast error was typically higher when the persistence was larger, that is, when an animal was transiting. Low predictive skill also arose when the observations were inconsistent with the underlying preferred habitat. Improvement will rely on better estimates of the positions and persistence parameter through the refinement of the movement model structure and the integration of high‐quality (small observation error and high sampling rate) location observations along with animal pathway information and environmental drivers.
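The interplay between persistence and drift can be illustrated with a small discrete‐time sketch. The update v_t = ϕ v_{t−1} + (1 − ϕ) μ + noise, the Gaussian‐shaped intensity surface centered on the origin, and the `gain` and `scale` values are all assumptions chosen for illustration; they are one plausible reading of the weighting described above, not the model equations of the paper.

```python
import numpy as np

def intensity_gradient(x, scale=10.0):
    """Gradient of a hypothetical Gaussian-shaped intensity surface peaked at the origin."""
    return -x / scale**2 * np.exp(-0.5 * np.sum(x**2) / scale**2)

def step(x, v, phi, sigma, dt, rng, gain=50.0):
    """One movement step: phi weights the past velocity against the habitat drift."""
    mu = gain * intensity_gradient(x)                    # drift points up the intensity gradient
    v_new = phi * v + (1.0 - phi) * mu + sigma * rng.standard_normal(2)
    return x + v_new * dt, v_new

rng = np.random.default_rng(0)
x0, v0 = np.array([5.0, 0.0]), np.array([1.0, 2.0])
_, v_transit = step(x0, v0, phi=1.0, sigma=0.0, dt=1.0, rng=rng)  # drift ignored
_, v_drift = step(x0, v0, phi=0.0, sigma=0.0, dt=1.0, rng=rng)    # drift only
```

With ϕ = 1 the velocity is simply carried forward and the habitat drift has no effect, whereas with ϕ = 0 the animal moves straight up the intensity gradient, which is the mechanism behind the weak directionality noted for high persistence.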

Ecological understanding comes from a forecasting system as a byproduct of online learning, both in terms of adaptively estimating informative movement model parameters and ultimately in the iterative refinement of the movement models themselves. Forecasting emphasizes optimizing predictive skill, rather than the traditional metrics for retrospective studies (goodness of fit, cross‐validation skill). Forecasting skill is the key metric that underlies location estimates and parameter and model refinement. It depends on both the accuracy of the initial condition (nowcast) and how well the movement model represents the actual ecological dynamics. We used a basic RMSE skill metric and the DPE but recognize that other skill metrics are possible (e.g., bias, mean absolute error, threat score, Brier skill score) (Hamill & Juras, 2006), as well as comparison to basic persistence forecasts. Many of these metrics would, however, need to be adapted for probabilistic cases.

Our SRKW demonstration system shows clearly how a general ecological forecasting framework can be adapted to a particular setting with the aim of achieving conservation goals through the fusion of movement models and observations. We made use of 1 day of visual sightings compiled to pod level, which would be, in practice, available at irregular intervals in real time. These opportunistic data are limited by viewing conditions (e.g., daylight, sea state), and in the near future we plan to integrate SRKW detections from passive acoustic monitoring. The statistical character of this data type is very different from the visual detection data. However, our system is flexible and can assimilate multiple complex data types through suitable specification of the measurement model. Future challenges to be addressed with respect to the SRKW measurement model include recording errors, the consequences of pod splitting, and detection false positives.

The use of habitat preference and the avoidance of shallow waters in our SRKW application allowed the model to bias the whale movement in the correct direction and provide moderate forecast error (<5 km) up to 2.5 h. Considering that a container vessel transiting the Salish Sea moves at a median speed of 18 knots (Joy et al., 2019), a vessel that would reach the pod in 2.5 h would currently be 83 km from the median position of the whale pod, a distance well outside the envelope of error; hence, our forecasting system provides reasonable lead‐in time to mitigate vessel–whale interactions. With additional efforts to incorporate real‐time location observations along with model refinements to improve directionality, this could become an operational tool for managing SRKW in the Salish Sea. Towards this end, future information that we are considering for our target SRKW system includes pathway information derived from visual sightings (Olson et al., 2018), prey fields (Kent et al., 2020), and habitat use models (Abrahms et al., 2019). Another promising direction is to couple the system to other environmental or biological forecasts (Payne et al., 2017), such as, for the Salish Sea, an already existing oceanographic forecasting system (e.g., Olson et al., 2020). While our proof‐of‐concept system provides the basis for an SRKW forecasting system, operationalizing it is not trivial: quality controlled, real‐time data feeds are required; ensemble DA must be robust (e.g., to particle collapse); and movement models need to be further refined to incorporate environmental information.
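The 83 km figure follows from converting the vessel speed to kilometers per hour and multiplying by the 2.5‐h forecast lead time. A short check, with variable names chosen for illustration:

```python
KM_PER_NAUTICAL_MILE = 1.852   # exact by international definition

speed_knots = 18.0             # median container vessel speed (Joy et al., 2019)
lead_time_h = 2.5              # forecast horizon with median DPE below 5 km

distance_km = speed_knots * KM_PER_NAUTICAL_MILE * lead_time_h
print(f"{distance_km:.1f} km")  # 83.3 km
```

The vessel distance is thus more than an order of magnitude larger than the forecast error at the same lead time.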

In summary, our forecasting system for animal tracking provides a synthesis tool for assimilation of the real‐time location information into a movement model, and makes use of potential functions and online parameter learning. It can assimilate any direct (e.g., visual) or indirect (e.g., passive acoustic detections) location observations and can handle multiple location observations at the same time (i.e., within the DA window), or data available irregularly in time. We have extended the ensemble approach to a coherent group (whale pod) but it could integrate any level of aggregation, that is, multiple individuals each represented by ensembles and interacting with one another (Russell et al., 2017). Our system is thus flexible enough to be adapted to any data types, movement models, animal species and environmental conditions. Importantly, the forecasting framework provides a step towards making more effective use of data streams on animal movement with a goal of forecasting and proactive management of animal populations, especially in the context of human–wildlife conflicts. Pursuing real‐time animal movement prediction will drive advances in ecology by encouraging practitioners to confront their bio‐logging data with model predictions and so drive the iterative refinement of movement models, ultimately leading to improved understanding of ecological processes. The approach could even be used for retrospective studies and so provides a complementary way to better interpret and understand the ecological implications of movement data. We anticipate that the continued improvement of such a system will provide for ecological hypothesis testing and the refinement of predictive movement models to drive future insights into animal ecology.

CONFLICT OF INTEREST

The authors declare no conflict of interest.

Supporting information

Appendix S1

ACKNOWLEDGMENTS

We thank The Whale Museum for providing sightings data and A. Harris and E. Cummings for preparing the database. We thank J. Watson for sharing the SRKW intensity fields. We acknowledge H. Yurk of Fisheries & Oceans Canada (DFO) for championing our ideas. The project was funded by DFO as part of the Whale Detection and Collision Avoidance program.

Randon, Marine, Michael Dowd, and Ruth Joy. 2022. “A Real‐Time Data Assimilative Forecasting System for Animal Tracking.” Ecology 103(8): e3718. 10.1002/ecy.3718

Handling Editor: Brett T. McClintock

Funding information: Department of Fisheries and Oceans Canada

Footnotes

1

In this section, for clarity, we present state estimation but note that time‐varying parameters may also be simultaneously estimated using an augmented state following the same procedure (see State augmentation). In addition, we suppress the explicit dependence on covariates and static parameters for notational simplicity.

DATA AVAILABILITY STATEMENT

R code is available on figshare with DOI https://doi.org/10.6084/m9.figshare.17046026.v2. The data associated with the code were simulated inside the R script.

REFERENCES

  1. Abrahms, B., Welch H., Brodie S., Jacox M. G., Becker E. A., Bograd S. J., Irvine L. M., Palacios D. M., Mate B. R., and Hazen E. L.. 2019. “Dynamic Ensemble Models to Predict Distributions and Anthropogenic Risk Exposure for Highly Mobile Species.” Diversity and Distributions 25(8): 1182–93.
  2. Auger‐Méthé, M., Newman K., Cole D., Empacher F., Gryba R., King A. A., Leos‐Barajas V., et al. 2020. “An Introduction to State‐Space Modeling of Ecological Time Series.” arXiv preprint arXiv:2002.02001.
  3. Barlow, D. R., and Torres L. G.. 2021. “Planning Ahead: Dynamic Models Forecast Blue Whale Distribution with Applications for Spatial Management.” Journal of Applied Ecology 58(11): 2493–504.
  4. Breece, M. W., Oliver M. J., Fox D. A., Hale E. A., Haulsee D. E., Shatley M., Bograd S. J., Hazen E. L., and Welch H.. 2021. “A Satellite‐Based Mobile Warning System to Reduce Interactions with an Endangered Species.” Ecological Applications 31(6): e02358.
  5. Brillinger, D. R., Preisler H. K., Ager A. A., and Kie J.. 2012. “The Use of Potential Functions in Modelling Animal Movement.” In Selected Works of David Brillinger, edited by Guttorp P. and Brillinger D., 385–409. New York, NY: Springer.
  6. Dietze, M. C., Fox A., Beck‐Johnson L. M., Betancourt J. L., Hooten M. B., Jarnevich C. S., Keitt T. H., et al. 2018. “Iterative near‐Term Ecological Forecasting: Needs, Opportunities, and Challenges.” Proceedings of the National Academy of Sciences 115(7): 1424–32.
  7. Dowd, M., Jones E., and Parslow J.. 2014. “A Statistical Overview and Perspectives on Data Assimilation for Marine Biogeochemical Models.” Environmetrics 25(4): 203–13.
  8. Dowd, M., and Joy R.. 2011. “Estimating Behavioral Parameters in Animal Movement Models Using a State‐Augmented Particle Filter.” Ecology 92(3): 568–75.
  9. Gervaise, C., Simard Y., Aulanier F., and Roy N.. 2021. “Optimizing Passive Acoustic Systems for Marine Mammal Detection and Localization: Application to Real‐Time Monitoring North Atlantic Right Whales in Gulf of St. Lawrence.” Applied Acoustics 178: 107949.
  10. Gordon, N. J., Salmond D. J., and Smith A. F.. 1993. “Novel Approach to Nonlinear/Non‐Gaussian Bayesian State Estimation.” IEE Proceedings F‐Radar and Signal Processing 140(2): 107–13.
  11. Hamill, T. M., and Juras J.. 2006. “Measuring Forecast Skill: Is it Real Skill or Is it the Varying Climatology?” Quarterly Journal of the Royal Meteorological Society 132(621C): 2905–23.
  12. Hooten, M. B., Johnson D. S., McClintock B. T., and Morales J. M.. 2017. Animal Movement: Statistical Models for Telemetry Data. Boca Raton, FL: CRC Press.
  13. Ionides, E. L., Bhadra A., Atchadé Y., King A., et al. 2011. “Iterated Filtering.” The Annals of Statistics 39(3): 1776–802.
  14. Johnson, D. S., London J. M., Lea M.‐A., and Durban J. W.. 2008. “Continuous‐Time Correlated Random Walk Model for Animal Telemetry Data.” Ecology 89(5): 1208–15.
  15. Joy, R., Tollit D., Wood J., MacGillivray A., Li Z., Trounce K., and Robinson O.. 2019. “Potential Benefits of Vessel Slowdowns on Endangered Southern Resident Killer Whales.” Frontiers in Marine Science 6: 344.
  16. Kent, C. S., Bouchet P., Wellard R., Parnum I., Fouda L., and Erbe C.. 2020. “Seasonal Productivity Drives Aggregations of Killer Whales and Other Cetaceans over Submarine Canyons of the Bremer Sub‐Basin, South‐Western Australia.” Australian Mammalogy 43(2): 168–78.
  17. Kitagawa, G. 1998. “A Self‐Organizing State‐Space Model.” Journal of the American Statistical Association 93(443): 1203–15.
  18. Kloeden, P. E., and Platen E.. 2013. Numerical Solution of Stochastic Differential Equations. Berlin: Springer Science & Business Media.
  19. McClintock, B. T., London J. M., Cameron M. F., and Boveng P. L.. 2017. “Bridging the Gaps in Animal Movement: Hidden Behaviors and Ecological Relationships Revealed by Integrated Data Streams.” Ecosphere 8(3): e01751.
  20. McWhinnie, L. H., O'Hara P. D., Hilliard C., Le Baron N., Smallshaw L., Pelot R., and Canessa R.. 2021. “Assessing Vessel Traffic in the Salish Sea Using Satellite AIS: An Important Contribution for Planning, Management and Conservation in Southern Resident Killer Whale Critical Habitat.” Ocean & Coastal Management 200: 105479.
  21. Michelot, T., Glennie R., Harris C., and Thomas L.. 2021. “Varying‐Coefficient Stochastic Differential Equations with Applications in Ecology.” Journal of Agricultural, Biological and Environmental Statistics 26(3): 1–18.
  22. Olson, E. M., Allen S. E., Do V., Dunphy M., and Ianson D.. 2020. “Assessment of Nutrient Supply by a Tidal Jet in the Northern Strait of Georgia Based on a Biogeochemical Model.” Journal of Geophysical Research: Oceans 25(8): e2019JC015766.
  23. Olson, J. K., Wood J., Osborne R. W., Barrett‐Lennard L., and Larson S.. 2018. “Sightings of Southern Resident Killer Whales in the Salish Sea 1976–2014: The Importance of a Long‐Term Opportunistic Dataset.” Endangered Species Research 37: 105–18.
  24. Patterson, T. A., Parton A., Langrock R., Blackwell P. G., Thomas L., and King R.. 2017. “Statistical Modelling of Individual Animal Movement: An Overview of Key Methods and a Discussion of Practical Challenges.” AStA Advances in Statistical Analysis 101(4): 399–438.
  25. Payne, M. R., Hobday A. J., MacKenzie B. R., Tommasi D., Dempsey D. P., Fässler S. M., Haynie A. C., et al. 2017. “Lessons from the First Generation of Marine Ecological Forecast Products.” Frontiers in Marine Science 4: 289.
  26. Russell, J. C., Hanks E. M., Haran M., Hughes D., et al. 2018. “A Spatially Varying Stochastic Differential Equation Model for Animal Movement.” Annals of Applied Statistics 12(2): 1312–31.
  27. Russell, J. C., Hanks E. M., Modlmeier A. P., and Hughes D. P.. 2017. “Modeling Collective Animal Movement through Interactions in Behavioral States.” Journal of Agricultural, Biological and Environmental Statistics 22(3): 313–34.
  28. Wall, J., Wittemyer G., Klinkenberg B., and Douglas‐Hamilton I.. 2014. “Novel Opportunities for Wildlife Conservation and Research with Real‐Time Monitoring.” Ecological Applications 24(4): 593–601.
  29. Watson, J., Joy R., Tollit D., Thornton S. J., and Auger‐Méthé M.. 2019. “Estimating Animal Utilization Distributions from Multiple Data Types: A Joint Spatiotemporal Point Process Framework.” The Annals of Applied Statistics 15(4): 1872–96.
  30. Williams, H. J., Taylor L. A., Benhamou S., Bijleveld A. I., Clay T. A., de Grissac S., Demšar U., et al. 2020. “Optimizing the Use of Biologgers for Movement Ecology Research.” Journal of Animal Ecology 89(1): 186–206.
  31. Yates, K. L., Bouchet P. J., Caley M. J., Mengersen K., Randin C. F., Parnell S., Fielding A. H., et al. 2018. “Outstanding Challenges in the Transferability of Ecological Models.” Trends in Ecology & Evolution 33(10): 790–802.
